Calculating the Pearson correlation (r) using Excel

Cell Preparation

Here’s an image of the cells prepared for calculating r using Excel.

Overview of the prepared cells for Pearson correlation (r) calculation using Excel.

I hypothesized that we want to calculate the r using two variables (X and Y) as shown below:

X: 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10.

Y: 10, 9, 8, 7, 6, 5, 4, 3, 2, and 1.

Assuming that you filled in the cells for X and Y’s Z-score calculation, an additional column we need to calculate r is Zx*ZY:

Zx*ZY column for r calculation (column N)

At the bottom of the Zx*ZY column, let’s add two more cells for the Sum and r, respectively.

You don’t have to use the ∑ notation. You can simply have SUM instead of ∑ .

  • If you want to insert the ∑ notation, then, click the Insert option from the main menu ribbon, and at the right end, click the Symbol option (see below). Then, you can look up the ∑ symbol or use the character code of 2211 to find it faster.
Inserting the summation symbol (∑).

Pearson Correlation (r) Calculation Using Excel

Here’s the formula for r:

    \[r = \frac{\sum(z_x*z_y)}{N-1}\]

As you may have noticed, we have prepared all the ingredients of the r formula. That is, we have the numerator Zx and Zy as well as the denominator N-1. Specifically, column E and K shows the Z-scores of X and Y, respectively. Therefore, to calculate the r, we just need to multiply the Zx and Zy across the X-Y pairs, add them up (in Cell N12), and divide the sum of the multiplied Z-scores by N-1. See below for the detailed instructions.

1. Multiplication of Zx and Zy (Zx * Zy) in column N

The first X score is 1 and the first Y score is 10. Their Z scores are -1.486 and 1.486, respectively. The multiplication of these two Z-scores should come in the first cell of the Zx * Zy column (column N). You can calculate this by using the command: =E2*K2 as shown below.

Multiplication of Z-score pairs for the first case.

If you hit enter, you would get -2.209 in the first cell of Zx * Zy column.

You need to repeat this process for the remaining celss, which you can do easily by doing automatic fill-in. Specifically, choose the first cell of the Zx * Zy column, and move the cursor at the lower right corner of the cell.

Move the cursor at the lower right corner of the cell with the Zx * Zy. Then, the cursor will become a fat crosshair.

Then, the cursor will be a fat crosshair. Then, left-click and drag the cursor until row 11, which has the last score of the variables.

Automatic filling of the Zx * Zy.

2. Sum of the Zx * Zy (the numerator of the r)

To insert the sum of Zx * Zy, you would want to use the AutoSum function. Specifically, select the cell next to the Sum notation (∑), which is N12 cell, and click the AutoSum function as shown below:

Using the AutoSum function to calculate the sum of Zx * Zy.

2. Calculating the Pearson Correlation (r)

Finally, by dividing the sum Zx * Zy by N-1, we can calculate the r.

r calculation by dividing the sum of Zx * Zy by N-1.

If you hit enter, you can get the r of -1, which should make sense considering that X and Y have the exact same scores that were listed backword:

X: 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10.

Y: 10, 9, 8, 7, 6, 5, 4, 3, 2, and 1.

That is, as X increases by 1, Y decreases by 1 systematically, resulting in a perfectly negative correlation, -1.

A perfect negative correlation, -1.

Note, the built-in command of Excel for correlation, =correl(), would give you the exact same correlation of -1 as shown below (cell N13).

Scatter Plot Insertion

Finally, if you want to create a scatter plot, then, you would first want to select the raw scores of X and Y simultaneously. *You can do so by selecting X scores first, pressing and holding the control key, and selecting the Y scores.

Creating a scatter plot using Excel.

If you hit enter, you would get a scatter plot as below:

Scatter Plot of X and Y.

Leave a Comment