The Complete Guide to Understanding Pearson's Correlation

Statistics is used in a variety of fields in our daily lives. Businesses now require statistics to understand their customers better. It is a type of mathematical analysis that employs quantified models, representations, and summaries based on data derived from experiments and real-world studies. In statistics, there are several theories and formulas that you need to be aware of.

This tutorial covers what you need to know about Pearson's correlation coefficient. You will begin with an introduction to statistics and correlation, and then look at the origins of Pearson's correlation coefficient and how it is calculated. You will also see how to calculate Pearson's correlation coefficient using Microsoft Excel.

What Is Statistics?

Statistics is the art and science of gathering, organizing, analyzing, and drawing conclusions from data. Without an understanding of how statistics works, people cannot make informed judgments about the wide range of data-based claims they encounter in daily life.

Psychology, business, physical and social sciences, humanities, government, and manufacturing are just a few fields that use statistics. Statistics is used in business to help people make better decisions.

There are two types of statistics: Descriptive statistics and Inferential statistics. 

What Is Correlation?

Correlation is a statistic that measures the relationship between two variables. The correlation coefficient is a numerical representation of the strength and direction of the relationship between the two variables. 

The correlation coefficient's numerical values range from -1.0 to +1.0.

A negative correlation coefficient indicates that as one variable increases, the other tends to decrease, whereas a positive correlation coefficient indicates that the two variables tend to increase or decrease together.

A coefficient of -1.0 is a perfect negative correlation: the two variables move in exactly opposite directions along a straight line. A coefficient of +1.0 is a perfect positive correlation, and a coefficient of zero indicates that there is no linear relationship. The absolute value of the correlation coefficient indicates the strength of the relationship: the closer it is to 1, the stronger the linear relationship.
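To make the roles of the sign and the absolute value concrete, here is a minimal Python sketch. The helper name `describe_correlation` is hypothetical and chosen only for illustration; it simply separates a coefficient into its direction (the sign) and its strength (the absolute value).

```python
def describe_correlation(r):
    """Split a correlation coefficient into direction and strength.

    Hypothetical helper for illustration; not part of any library.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    # The strength depends only on the magnitude, not on the sign.
    return direction, abs(r)


for r in (0.5, -0.8, 0.0):
    direction, strength = describe_correlation(r)
    print(f"r = {r:+.1f}: direction = {direction}, strength = {strength:.1f}")
```

Note that -0.8 describes a stronger relationship than +0.5, even though it is the smaller number, because strength is judged by the absolute value.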

Pearson’s Correlation Coefficient

In statistics, the Pearson correlation coefficient, also known as Pearson's r, measures the strength and direction of the linear relationship between two variables.

Pearson's Correlation Coefficient is named after Karl Pearson. 

Example: A child's height will (in most cases) continue to increase with age until they reach a certain age, so height and age are positively correlated over that period. Of course, various other factors, such as genes, location, diet, and lifestyle, also influence their development.

Pearson’s Correlation Coefficient Formula

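$$
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}
$$

  • $r$ = Coefficient of correlation
  • $\bar{x}$ (xbar) = Mean of the x-variable
  • $\bar{y}$ (ybar) = Mean of the y-variable
  • $x_i$, $y_i$ = Individual sample values of the variables x and y
  • $n$ = Number of paired samples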
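The formula can also be followed step by step in code. The sketch below is one straightforward way to compute Pearson's r in plain Python; the function name `pearson_r` and the sample values are assumptions made for illustration, not data from this tutorial.

```python
import math


def pearson_r(xs, ys):
    """Compute Pearson's r for two equal-length sequences of numbers."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same number of samples")
    n = len(xs)
    x_bar = sum(xs) / n  # mean of the x-variable
    y_bar = sum(ys) / n  # mean of the y-variable
    # Numerator: sum of the products of the paired deviations from the means.
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the two sums of squares.
    sum_sq_x = sum((x - x_bar) ** 2 for x in xs)
    sum_sq_y = sum((y - y_bar) ** 2 for y in ys)
    return numerator / math.sqrt(sum_sq_x * sum_sq_y)


# Illustrative values only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 4))  # 0.7746
```

Working through the same numbers by hand: the means are 3 and 4, the numerator is 6, and the sums of squares are 10 and 6, so r = 6 / √60 ≈ 0.7746.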

How to Compute the Pearson Correlation Coefficient in Excel?

Now that you have a basic understanding of Pearson's correlation, you will see how to calculate the correlation between two variables using Microsoft Excel. The example uses a data set of test scores and hours spent studying.

  • Step 1: On the Data tab, in the Analysis group, click Data Analysis. If Data Analysis is not shown, load the Analysis ToolPak add-in first.
  • Step 2: Select Correlation and click OK.
  • Step 3: Select the input range (the two columns of data) and the output range.
  • Step 4: Click OK. Excel writes the correlation results to the output range.

The resulting coefficient of 0.980 shows that there is a strong positive correlation between the two variables.

Now, see how you can find the correlation using the CORREL function.

  • Step 1: Go to the Formulas tab > More Functions > Statistical > CORREL.
  • Step 2: Select the test scores as Array1 and the hours spent studying as Array2.

The resulting correlation value is 0.9804, which agrees with the 0.980 reported by the Data Analysis tool.

Conclusion

In this tutorial, you learned about correlation, a statistical measure of the relationship between two variables, and about Pearson's correlation coefficient. The coefficient always lies between -1 and +1. Its magnitude indicates the strength of the linear relationship, while its sign indicates the direction.

About the Author

Aryan Gupta

Aryan is a tech enthusiast who likes to stay updated about trending technologies of today. He is passionate about all things technology, a keen researcher, and writes to inspire. Aside from technology, he is an active football player and a keen enthusiast of the game.
