Statistics is used across many fields and in daily life. Businesses rely on it to understand their customers better. It is a branch of mathematical analysis that builds quantified models, representations, and summaries from data gathered in experiments and real-world studies. It also comes with several theories and formulas worth knowing.
This tutorial covers Pearson's correlation coefficient. You will begin with an introduction to statistics and correlation, then look at the origins of Pearson's correlation coefficient and how it is calculated. You will also see how to calculate it using Microsoft Excel.
What Is Statistics?
Statistics is the art and science of gathering, organizing, analyzing, and drawing conclusions from data. Without understanding how statistics works, people cannot make informed judgments about many of the things they encounter in daily life.
Psychology, business, physical and social sciences, humanities, government, and manufacturing are just a few fields that use statistics. Statistics is used in business to help people make better decisions.
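There are two types of statistics: descriptive statistics, which summarize the data at hand, and inferential statistics, which use a sample to draw conclusions about a larger population.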
What Is Correlation?
Correlation is a statistic that measures the relationship between two variables. The correlation coefficient is a numerical representation of the strength and direction of the relationship between the two variables.
The correlation coefficient's numerical values range from -1.0 to +1.0.
A negative correlation coefficient indicates that when one variable increases, the other tends to decrease, whereas a positive correlation coefficient indicates that both variables tend to move in the same direction.
A perfect negative correlation (-1.0) indicates that the two variables move in exactly opposite directions along a straight line, a perfect positive correlation (+1.0) indicates that they move together along a straight line, and a zero correlation indicates that there is no linear relationship. The absolute value of the correlation coefficient gives the strength of the relationship: the closer it is to 1, the stronger the linear relationship between the two variables.
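The three cases described above can be illustrated with a short Python sketch. The helper `pearson_r` and the sample values are assumptions made up for this illustration; they are not part of the tutorial's dataset.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r for two equal-length sequences of numbers (illustrative helper)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of paired deviations from the means
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

x = [-2, -1, 0, 1, 2]
rising = [2 * v + 1 for v in x]    # straight line going up
falling = [-2 * v + 1 for v in x]  # straight line going down
curved = [v ** 2 for v in x]       # depends on x, but not linearly

r_rising = pearson_r(x, rising)
r_falling = pearson_r(x, falling)
r_curved = pearson_r(x, curved)

print("rising: ", r_rising)
print("falling:", r_falling)
print("curved: ", r_curved)
```

The two straight lines give +1 and -1, while the curved data gives 0 even though y is fully determined by x. This is why a zero coefficient should be read as "no linear relationship" and not as "no relationship at all".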
Pearson’s Correlation Coefficient
In statistics, the Pearson correlation coefficient, also known as Pearson's r, measures the strength and direction of the linear relationship between two variables.
Pearson's Correlation Coefficient is named after Karl Pearson.
Example: A child's height will (in most cases) continue to increase until they reach a certain age, so over that period height and age are positively correlated. Of course, other factors such as genes, location, diet, and lifestyle also influence their development.
Pearson’s Correlation Coefficient Formula
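$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
Where:
- $r$ = coefficient of correlation
- $\bar{x}$ = mean of the x-variable
- $\bar{y}$ = mean of the y-variable
- $x_i$, $y_i$ = the i-th sample values of variables x and y
- $n$ = number of paired samples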
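To make the symbols above concrete, here is a minimal Python sketch that computes r step by step from the standard definition: the sum of the products of paired deviations from the means, divided by the square root of the product of the summed squared deviations. The `hours` and `scores` values are hypothetical and are not the tutorial's spreadsheet data.

```python
import math

# Hypothetical sample data (not the tutorial's dataset)
hours = [1, 2, 3, 4, 5]        # x: hours spent studying
scores = [52, 58, 65, 70, 80]  # y: test scores

n = len(hours)
x_bar = sum(hours) / n   # mean of x
y_bar = sum(scores) / n  # mean of y

# Numerator: sum of products of paired deviations from the means
numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, scores))

# Denominator: square roots of the summed squared deviations, multiplied
sum_sq_x = sum((x - x_bar) ** 2 for x in hours)
sum_sq_y = sum((y - y_bar) ** 2 for y in scores)
denominator = math.sqrt(sum_sq_x) * math.sqrt(sum_sq_y)

r = numerator / denominator

print(f"x_bar = {x_bar}, y_bar = {y_bar}")
print(f"numerator = {numerator}, denominator = {denominator:.4f}")
print(f"r = {r:.4f}")
```

For this made-up data the coefficient is close to +1, which matches the intuition that more study hours go with higher scores.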
How to Compute the Pearson Correlation Coefficient in Excel?
Now that you have a basic understanding of Pearson's correlation, you will see how to calculate the correlation between two variables using Microsoft Excel. The example uses data on test scores and hours spent studying.
- Step 1: On the Data tab, in the Analysis group, click Data Analysis. If the button is not there, enable the Analysis ToolPak add-in first.
- Step 2: Select Correlation and click OK.
- Step 3: Select the input range and the output range.
- Step 4: Click OK. You will get the correlation data.
The result, 0.980, shows a strong positive correlation between the two variables.
Now, see how you can find the correlation using the CORREL function.
- Step 1: Go to the Formulas tab > More Functions > Statistical > CORREL.
- Step 2: Select the test scores as Array1 and the hours spent studying as Array2.
You will get a correlation value of 0.9804.
In this tutorial, you learned about Pearson's correlation coefficient and about correlation, a statistical measure of the relationship between two variables. A correlation coefficient always lies between -1 and +1: the magnitude indicates the relationship's strength, while the sign indicates its direction.