# A to Z of Excel Functions: the PEARSON Function

10 July 2023

*Welcome
back to our regular A to Z of Excel Functions blog. Today we look at the PEARSON function. *

** **

**The
PEARSON function**

Imagine you are attempting to understand the relationship between two variables, which are assumed to exhibit a linear relationship:

The closer the data points are to the line of best
fit on a scatter graph, the stronger the correlation. This fit may be measured numerically by a
correlation coefficient. There are
several coefficients that may be used, but the one we shall consider here is
the Pearson's Product Moment Correlation Coefficient (bizarrely denoted **r**),
which measures the strength of the linear correlation between two variables.

Pearson's
product moment correlation coefficient (sometimes known as PPMCC or PCC) is a
measure of the linear relationship between two variables that have been
measured on interval or ratio scales. It
can only be used to measure the relationship between two variables which are
both normally distributed. It is usually
denoted by **r**, and it can only take values between -1 and +1. Its value may be interpreted as follows:

To calculate Pearson's Product
Moment Correlation Coefficient, **r**:

- Plot the scatter diagram for your data; you have to do this first to detect any outliers. If you do not exclude these outliers in your calculation, the correlation coefficient will be misleading

- By being able to see the distribution of your data, you will get a good idea of the strength of correlation of your data before you calculate the correlation coefficient

- Next, check that your data meets all the calculation criteria. The variables need to be:
**measured on an interval / ratio scale**(like weight in kilograms or distance in kilometres)**Normally distributed:**you can check this by looking at a boxplot of your data. If the boxplot is approximately symmetric, it is likely that the data will be Normally distributed**linearly correlated:**look at a significance test of the null and alternative hypotheses.- Finally, calculate the correlation coefficient using the following formula:

where **xi** and **yi** are the **i ^{th} **pair of array
points and

**x̄**and

**ȳ**are the means of the two arrays.

Alternatively, you can calculate this metric using
the **PEARSON **function in Excel. It
returns the Pearson product moment correlation coefficient, **r**, a
dimensionless index that ranges from -1.0 to 1.0 inclusive and reflects the
extent of a linear relationship between two data sets, as explained above.

The **PEARSON **function has the following
syntax:

**=PEARSON(array1,
array2)**

It contains two arguments:

**array1**which is required and represents a set of independent values**array2**which is also required and represents a set of dependent values.

It should be noted that:

- the arguments must be either numbers or names, array constants or references that contain numbers

- if an array or reference argument contains text, logical values or empty cells, those values are ignored; however, cells with the value zero [0] are included

- if
**array1**and**array2**are empty or have a different number of data points,**PEARSON**returns the*#N/A*error value.

Please see my example below:

*We’ll continue our A to Z of Excel Functions soon. Keep checking back – there’s a new blog post every business day.*

*A full page of the function articles can be found here. *