# A to Z of Excel Functions: the NORMINV Function

16 January 2023

*Welcome back to our regular A to Z of Excel Functions blog. Today we look at the NORMINV function.*


**The NORMINV function**

Imagine I toss an unbiased coin; half of the time it will come down heads, half tails:

It is not the most exciting chart I have ever constructed, but it’s a start.

If I toss two coins, I get four possibilities: two Heads, a Head and a Tail, a Tail and a Head, and two Tails.

In summary, I should get two heads a quarter of the time, one head half of the time and no heads a quarter of the time. Note that (1/4) + (1/2) + (1/4) = 1. These fractions are the probabilities of the events occurring and the sum of all possible outcomes must always add up to 1.

The story is similar if we consider, say, 16 coin tosses:

Again, if you were to add up all of the individual probabilities, they would total to 1. Notice that in symmetrical distributions (such as this one) it is common for the most likely event (here, eight heads) to be the event at the midpoint.

Of course, why should we stop at 16 coin tosses?

All of these charts represent **probability distributions**, *i.e.* they display how the probabilities of certain events occurring are distributed. If we can formulate a probability distribution, we can estimate the likelihood of a particular event occurring (*e.g.* the probability of precisely 47 heads from 100 coin tosses is 0.0666; the probability of less than or equal to 25 heads occurring in 100 coin tosses is 2.82 x 10^{-7}).
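These two probabilities can also be checked with a few lines of code. Here is a quick sketch in Python (purely as a cross-check — the helper name `binom_prob` is mine, not part of the workbook), computing the Binomial probabilities directly:

```python
from math import comb

def binom_prob(n, k, p=0.5):
    """Probability of exactly k heads in n tosses of a coin
    that lands heads with probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Probability of precisely 47 heads in 100 tosses (approx. 0.0666)
p_47 = binom_prob(100, 47)

# Probability of 25 or fewer heads in 100 tosses (approx. 2.82 x 10^-7)
p_le_25 = sum(binom_prob(100, k) for k in range(26))
```

This is exact arithmetic over all outcomes, which is feasible here precisely because we are summing probabilities rather than enumerating every one of the 2^100 individual permutations.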

Now, I would like to ask the reader to verify this last chart. Assuming you can toss 100 coins, count the number of heads and record the outcome at one coin toss per second, it shouldn't take you more than 4.0 x 10^{22} **centuries** to generate every permutation. Even if we were to simulate this experiment using a computer programme capable of generating many calculations a second, it would not be possible. For example, in February 2012, the *Japan Times* announced a new computer that could compute 10,000,000,000,000,000 calculations per second. If we could use this computer, it would only take us a mere 401,969 years to perform this computation. Sorry, but I can't afford the electricity bill.

Let’s put this all into perspective. All I am talking about here is considering
100 coin tosses. If only business were
that *simple*. Potential outcomes for a business would be
much more complex. Clearly, if we want
to consider all possible outcomes, we can only do this using some sampling
technique based on understanding the underlying probability distributions.

*Probability Distributions*

If I plotted charts for 1,000 or 10,000 coin tosses similar to the above, I would generate similarly shaped distributions. This classic distribution, which only allows for two outcomes, is known as the **Binomial distribution** and is regularly used in probabilistic analysis.

The 100 coin toss chart shows that the average (or '**expected**' or '**mean**') number of heads here is 50. This can be calculated using a weighted average in the usual way. The 'spread' of heads is clearly quite narrow (tapering off very sharply at less than 40 heads or greater than 60). This spread is measured by statisticians using a measure called **standard deviation**, which is defined as the square root of the average value of the square of the difference between each possible outcome and the mean, *i.e.*

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \mu\right)^2}$$

where:

**σ** = standard deviation

**N** = total number of possible outcomes

**Σ** = summation

**x_{i}** = each outcome event (from the first, **x_{1}**, to the last, **x_{N}**)

**μ** = mean or average
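As a quick illustration of this definition (a sketch in Python — the function name `std_dev` and the sample data are mine, not from the file), the formula above can be computed directly:

```python
import math

def std_dev(outcomes):
    """Population standard deviation: the square root of the average
    squared difference between each outcome and the mean."""
    mu = sum(outcomes) / len(outcomes)
    return math.sqrt(sum((x - mu) ** 2 for x in outcomes) / len(outcomes))

std_dev([2, 4, 4, 4, 5, 5, 7, 9])  # → 2.0 (mean is 5, average squared deviation is 4)
```

Note this is the *population* form (dividing by N), which is what Excel's **STDEV.P** computes; **STDEV.S**, used later in this article for sampled historical data, divides by N − 1 instead.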

The Binomial distribution is not the most common
distribution used in probability analysis: that honour belongs to the **Gaussian** or **Normal distribution**:

The Central Limit Theorem states that the sampling distribution of the sample means approaches a normal distribution as the sample size gets larger — no matter what the shape of the population distribution. This is why this distribution is so important in probability and statistics.

Generated by a complex mathematical formula, this
distribution is defined by specifying the **mean **and **standard deviation** (see
above). The Normal distribution’s
population is spread as follows:

*i.e.* 68% of the
population is within one standard deviation of the mean, 95% within two
standard deviations and 99.7% within three standard deviations.
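These percentages can be verified from the cumulative Normal distribution. A short cross-check in Python (illustrative only, using the standard library's `statistics.NormalDist`):

```python
from statistics import NormalDist

z = NormalDist()  # standard Normal: mean 0, standard deviation 1

def within(k):
    """Proportion of the population within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

round(within(1), 3)  # → 0.683
round(within(2), 3)  # → 0.954
round(within(3), 3)  # → 0.997
```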

The formula for the Normal distribution is given by

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

However, sometimes you may wish to determine **x **for a given probability, known as the
inverse function. Excel has the function **NORMINV** to calculate the inverse of
the normal cumulative distribution for the specified mean and standard
deviation.

Its syntax is:

**NORMINV(probability, mean, standard_deviation)**

**NORMINV** has the
following arguments:

- **probability**: this is required and represents a probability corresponding to the normal distribution
- **mean**: this is also required. This is the arithmetic mean of the distribution
- **standard_deviation**: this last argument is also required. This is the standard deviation of the distribution.

It should be further noted that:

- since Excel 2007, Microsoft has updated many of its statistical functions. This is an older function which has been superseded by **NORM.INV**. If you are new to both of these functions, I would suggest using **NORM.INV** as Microsoft has advised it may not support the original function in future versions of Excel
- if any argument is nonnumeric, **NORMINV** returns the *#VALUE!* error value
- if **probability** ≤ 0 or **probability** ≥ 1, **NORMINV** returns the *#NUM!* error value
- if **standard_deviation** ≤ 0, **NORMINV** returns the *#NUM!* error value
- if **mean** = 0 and **standard_deviation** = 1, **NORMINV** uses the standard normal distribution, *cf.* **NORMSINV**
- given a value for **probability**, **NORMINV** seeks that value **x** such that **NORMDIST(x, mean, standard_deviation, TRUE) = probability**. Thus, precision of **NORMINV** depends upon the precision of **NORMDIST**.
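To see the inverse relationship in action outside Excel, here is an illustrative Python equivalent (the helper name `norminv` and its error handling are mine, mimicking the *#NUM!* rules above):

```python
from statistics import NormalDist

def norminv(probability, mean, standard_deviation):
    """Equivalent of Excel's NORMINV: returns the value x such that the
    cumulative Normal probability up to x equals `probability`."""
    if not 0 < probability < 1:
        raise ValueError("#NUM! - probability must be strictly between 0 and 1")
    if standard_deviation <= 0:
        raise ValueError("#NUM! - standard_deviation must be positive")
    return NormalDist(mean, standard_deviation).inv_cdf(probability)

norminv(0.5, 50, 5)    # → 50.0 (the median of a symmetric distribution is its mean)
norminv(0.975, 0, 1)   # ≈ 1.96 (the familiar two-tailed 95% cut-off)
```

Like Excel, `inv_cdf` inverts the cumulative distribution numerically, so its precision is bounded by the precision of the underlying cumulative function.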

Please see my example below:

*Extending the Idea to Simulations Analysis*

For any given distribution, we cannot model every possible combination / permutation of outcomes. The aim is to analyse a representative sample based on a known, or assumed, probability distribution.

There are various ways to sample, with the most popular approach being the “Monte Carlo” method, which involves picking data randomly (*i.e.* using no stratification or bias). Excel’s **RAND()** function picks a number between 0 and 1 randomly, which is very useful as cumulative probabilities can only range between 0 and 1.

**NORMINV(probability, mean, standard_deviation)** returns the value **x** such that the probability of a Normal random variable with the specified **mean** and **standard_deviation** being less than or equal to **x** equals the cumulative probability specified (**probability**). In essence, this is the inverse function of **NORMDIST(x, mean, standard_deviation, TRUE)**.

Therefore, we can get Excel to pick a random number between zero (0) and one (1), and for a given mean and standard deviation, generate a particular outcome appropriate to the probability distribution specified, which can then be used in the model as in the following illustration:

**=NORMINV(RAND(),Mean_Sales,SD_Sales)**

The mean and standard deviation are easy to calculate –
simply list all of your historical data and use the Excel functions **AVERAGE** for mean and **STDEV.S** for the standard deviation.

Here, three variables, **Sales**, **Foreign Exchange** and **Gross Margin**, all employ the **NORMINV** function to generate the assumptions needed to calculate the Gross Profit. We can run the simulation a given number of times by running a simple one-dimensional Data Table.

The actual approach is a little crafty though:

Since the variables use the **RAND** function to generate random numbers, each time the end user
presses **ENTER** or **F9**, the variables will recalculate (this quality
is known as ‘volatility’). I have created
a Data Table (**ALT + D + T**) to create multiple trials (the headings are
actually the outputs required using clever number formatting to disguise what
they are). Once dynamic arrays become
Generally Available, this technique will become even simpler.

Since each Data Table entry causes the model to recalculate
for each column input, the values will change automatically. On this basis, note that the column input
cell in the example above refers to **E30** (the cell highlighted in yellow) which is unused by any formula on the
spreadsheet.

The example in the attached Excel file has 1,000 rows (*i.e.* 1,000 simulations). Since the variables have been generated randomly, this is a simple Monte Carlo simulation – no fancy software or macros required!
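For readers who want to replicate the idea outside Excel, here is a minimal sketch of the same Monte Carlo loop in Python. All of the figures below (the means, standard deviations and the gross profit formula) are invented placeholders, not the values in the attached file:

```python
import random
from statistics import mean, stdev

random.seed(42)  # reproducible runs; omit for fresh randomness each time

# Hypothetical assumptions (placeholders, not the workbook's values)
MEAN_SALES, SD_SALES = 1_000.0, 100.0
MEAN_FX, SD_FX = 1.50, 0.05
MEAN_MARGIN, SD_MARGIN = 0.30, 0.03

def trial():
    """One simulation row: each input follows the
    NORMINV(RAND(), mean, standard_deviation) pattern."""
    sales = random.gauss(MEAN_SALES, SD_SALES)
    fx = random.gauss(MEAN_FX, SD_FX)
    margin = random.gauss(MEAN_MARGIN, SD_MARGIN)
    return sales * fx * margin  # illustrative gross profit formula

results = [trial() for _ in range(1_000)]  # 1,000 rows, as in the file
print(round(mean(results), 1), round(stdev(results), 1))
```

Here `random.gauss(mu, sigma)` plays the role of **NORMINV(RAND(), mean, standard_deviation)**: both draw a uniform random probability and map it through the inverse Normal distribution, so the collected `results` can be summarised and charted just like the Data Table outputs.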

It only requires a quick collation step to summarise the outputs graphically:

Please refer to __the file__ for more details.

It should be noted that there are three key issues to consider:

- not all variables are Normally distributed. Consequently, using the **NORMINV** function may be inappropriate in some instances
- the above example has assumed all variables are __independent__. If there are interrelationships or correlations between inputs, this simple approach would need to be revised accordingly
- working with probabilities is notoriously counter-intuitive. Take care with interpreting results and always remember that the results only represent a sample of possible outcomes (the sample size may be too small for extrapolation to the entire population). If in doubt, consult an expert (hi, we are over here…).


*We’ll continue our A
to Z of Excel Functions soon. Keep checking
back – there’s a new blog post every business day.*