In today’s data-driven world, decisions are based on data all the time. Hypothesis testing plays a crucial role in that process, whether in business decisions, the health sector, academia, or quality improvement. Without hypotheses and hypothesis tests, you risk drawing the wrong conclusions and making bad decisions. In this tutorial, you will look at hypothesis testing in statistics.
What Is Hypothesis Testing in Statistics?
Hypothesis testing is a type of statistical analysis in which you put your assumptions about a population parameter to the test. It is used to judge whether sample data provide enough evidence for a claim about the population, such as a relationship between two statistical variables.
Let's discuss a few real-life examples of statistical hypotheses:
 A teacher assumes that 60% of his college's students come from lower-middle-class families.
 A doctor believes that 3D (Diet, Dose, and Discipline) is 90% effective for diabetic patients.
Now that you know what hypothesis testing is, look at why it matters in data analysis.
Importance of Hypothesis Testing in Data Analysis
Here is what makes hypothesis testing so important in data analysis and why it is key to making better decisions:

Avoiding Misleading Conclusions (Type I and Type II Errors)
One of the biggest benefits of hypothesis testing is that it helps you avoid jumping to the wrong conclusions. For instance, a Type I error could occur if a company launches a new product because test data suggested it would be a hit when the apparent effect was only chance. A Type II error might happen when a company overlooks a potentially successful product because its testing wasn’t thorough enough to detect the effect. By setting an appropriate significance level and carefully calculating the p-value, hypothesis testing helps control the chances of these errors, leading to more reliable results.

Making Smarter Choices
Hypothesis testing is key to making smarter, evidence-based decisions. Let’s say a city planner wants to determine if building a new park will increase community engagement. By testing the hypothesis using data from similar projects, they can make an informed choice. Similarly, a teacher might use hypothesis testing to see if a new teaching method actually improves student performance. It’s about taking the guesswork out of decisions and relying on solid evidence instead.

Optimizing Business Tactics
In business, hypothesis testing is invaluable for testing new ideas and strategies before fully committing to them. For example, an e-commerce company might want to test whether offering free shipping increases sales. By using hypothesis testing, they can compare sales data from customers who received free shipping offers and those who didn’t. This allows them to base their business decisions on data, not hunches, reducing the risk of costly mistakes.
Hypothesis Testing Formula
Z = ( x̅ – μ0 ) / (σ /√n)
 Here, x̅ is the sample mean,
 μ0 is the hypothesized population mean (the value stated in the null hypothesis),
 σ is the population standard deviation,
 n is the sample size.
How Does Hypothesis Testing Work?
An analyst performs hypothesis testing on a statistical sample to present evidence of the plausibility of the null hypothesis. Measurements and analyses are conducted on a random sample of the population to test a theory. Analysts use a random population sample to test two hypotheses: the null and alternative hypotheses.
The null hypothesis is typically an equality hypothesis between population parameters; for example, a null hypothesis may claim that the population mean return equals zero. The alternate hypothesis is essentially the inverse of the null hypothesis (e.g., the population mean return is not equal to zero). As a result, they are mutually exclusive: exactly one of the two is always correct.
Null Hypothesis and Alternative Hypothesis
The Null Hypothesis is the default assumption that there is no effect or difference, for example that an event will not occur or that a claimed value holds. It is retained unless the data provide enough evidence to reject it.
H0 is the symbol for it, and it is pronounced H-naught.
The Alternate Hypothesis is the logical opposite of the null hypothesis. The acceptance of the alternative hypothesis follows the rejection of the null hypothesis. H1 is the symbol for it.
Let's understand this with an example.
A sanitizer manufacturer claims that its product kills 95 percent of germs on average.
To put this company's claim to the test, create a null and alternate hypothesis.
H0 (Null Hypothesis): Average = 95%.
H1 (Alternative Hypothesis): Average < 95%.
Another straightforward example to understand this concept is determining whether or not a coin is fair and balanced. The null hypothesis states that the probability of heads is equal to the probability of tails. In contrast, the alternate hypothesis states that the probabilities of heads and tails are different.
Hypothesis Testing Calculation With Examples
Let's consider a hypothesis test for the average height of women in the United States. Suppose our null hypothesis is that the average height is 5'4" (64 inches). We gather a sample of 100 women and determine that their average height is 5'5" (65 inches). The population standard deviation is 2 inches.
To calculate the z-score, we would use the following formula:
z = ( x̅ – μ0 ) / (σ /√n)
z = (65" – 64") / (2" / √100)
z = 1 / 0.2
z = 5
We reject the null hypothesis because a z-score of 5 lies far beyond the two-tailed critical value of 1.96 at the 5% significance level, and conclude that there is evidence to suggest that the average height of women in the US is greater than 5'4".
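The calculation above can be reproduced in a few lines of Python. This is a minimal sketch using only the standard library; the function name `one_sample_z` is an illustrative choice, the heights are converted to inches, and the two-tailed p-value is an addition that the worked example does not compute.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(sample_mean, mu0, sigma, n):
    """Return the z statistic and two-tailed p-value for a one-sample z-test."""
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / standard_error
    # Probability of a statistic at least this extreme in either tail.
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed


# Heights converted to inches: 5'5" = 65, 5'4" = 64.
z, p_value = one_sample_z(sample_mean=65, mu0=64, sigma=2, n=100)
print(f"z = {z:.2f}, p-value = {p_value:.2e}")
```

With a standard error of 2 / √100 = 0.2 inches, a one-inch difference is five standard errors away from the hypothesized mean.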
Steps in Hypothesis Testing
Hypothesis testing is a statistical method to determine if there is enough evidence in a sample of data to infer that a certain condition is true for the entire population. Here’s a breakdown of the typical steps involved in hypothesis testing:
Formulate Hypotheses
 Null Hypothesis (H0): This hypothesis states that there is no effect or difference, and it is the hypothesis you attempt to reject with your test.
 Alternative Hypothesis (H1 or Ha): This hypothesis is what you might believe to be true or hope to prove true. It is usually considered the opposite of the null hypothesis.
Choose the Significance Level (α)
The significance level, often denoted by alpha (α), is the probability of rejecting the null hypothesis when it is true. Common choices for α are 0.05 (5%), 0.01 (1%), and 0.10 (10%).
Select the Appropriate Test
Choose a statistical test based on the type of data and the hypothesis. Common tests include t-tests, chi-square tests, ANOVA, and regression analysis. The selection depends on data type, distribution, sample size, and whether the hypothesis is one-tailed or two-tailed.
Collect Data
Gather the data that will be analyzed in the test. To infer conclusions accurately, this data should be representative of the population.
Calculate the Test Statistic
Based on the collected data and the chosen test, calculate a test statistic that reflects how much the observed data deviates from the null hypothesis.
Determine the p-value
The p-value is the probability of observing test results at least as extreme as the results observed, assuming the null hypothesis is correct. It helps determine the strength of the evidence against the null hypothesis.
Make a Decision
Compare the p-value to the chosen significance level:
 If the p-value ≤ α: Reject the null hypothesis, suggesting sufficient evidence in the data supports the alternative hypothesis.
 If the p-value > α: Do not reject the null hypothesis, suggesting insufficient evidence to support the alternative hypothesis.
Report the Results
Present the findings from the hypothesis test, including the test statistic, p-value, and the conclusion about the hypotheses.
Perform Post-hoc Analysis (if necessary)
Depending on the results and the study design, further analysis may be needed to explore the data more deeply or to address multiple comparisons if several hypotheses were tested simultaneously.
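The steps above can be sketched end to end for the simplest case, a two-tailed one-sample z-test. Everything specific here is an assumption made for illustration: the function name `z_test_decision`, the fill-weight data, the hypothesized mean of 500 g, and the known population standard deviation of 3 g.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test_decision(sample, mu0, sigma, alpha=0.05):
    """Run a two-tailed one-sample z-test and return (z, p_value, decision)."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))      # test statistic
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))      # two-tailed p-value
    decision = "reject H0" if p_value <= alpha else "fail to reject H0"
    return z, p_value, decision


# Hypothetical data: fill weights (grams) from a machine claimed to average 500 g.
sample = [498, 502, 497, 501, 499, 496, 503, 498, 497, 499]
z, p_value, decision = z_test_decision(sample, mu0=500, sigma=3, alpha=0.05)
print(f"z = {z:.3f}, p-value = {p_value:.3f}, decision: {decision}")
```

For these made-up numbers the sample mean is 499 g, which is about one standard error below 500 g, so the test does not reject the null hypothesis at the 5% level.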
Types of Hypothesis Testing
1. Z Test
To determine whether a discovery or relationship is statistically significant, hypothesis testing can use a z-test. It usually checks whether two means are the same (the null hypothesis). A z-test is appropriate only when the population standard deviation is known and the sample size is 30 data points or more.
2. T Test
A t-test is a statistical test used to compare the means of two groups. It is frequently used in hypothesis testing to determine whether two groups differ or whether a procedure or treatment affects the population of interest.
3. Chi-Square
You use a chi-square test to check whether your data are as predicted. To determine how well the observed results fit the expected results, the chi-square test analyzes the differences between observed and expected counts of categorical variables from a random sample. The test's fundamental premise is that the observed values in your data are compared to the values that would be expected if the null hypothesis were true.
4. ANOVA
ANOVA, or Analysis of Variance, is a statistical method used to compare the means of three or more groups. It’s particularly useful when you want to see if there are significant differences between multiple groups. For instance, in business, a company might use ANOVA to analyze whether three different stores are performing differently in terms of sales. It’s also widely used in fields like medical research and social sciences, where comparing group differences can provide valuable insights.
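As one concrete instance of the tests listed above, here is a minimal sketch of a chi-square goodness-of-fit test for the fair-coin question. It is deliberately limited to two categories (one degree of freedom), because in that case the p-value can be computed with the standard library alone. The function name and the 60/40 flip counts are hypothetical.

```python
from math import erfc, sqrt


def chi_square_gof_two_categories(observed, expected):
    """Chi-square goodness-of-fit for exactly two categories (1 degree of freedom)."""
    if len(observed) != 2 or len(expected) != 2:
        raise ValueError("this sketch only handles two categories")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With 1 degree of freedom, P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical data: 100 flips gave 60 heads and 40 tails; a fair coin expects 50/50.
statistic, p_value = chi_square_gof_two_categories([60, 40], [50, 50])
print(f"chi-square = {statistic:.2f}, p-value = {p_value:.4f}")
```

For more categories, or for t-tests and ANOVA, a statistics library would normally be used instead of hand-written formulas.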
Hypothesis Testing and Confidence Intervals
Both confidence intervals and hypothesis tests are inferential techniques that depend on approximating the sampling distribution. Confidence intervals use data from a sample to estimate a population parameter. Hypothesis tests use data from a sample to examine a given hypothesis, so a hypothesized parameter value is needed to conduct one.
Bootstrap distributions and randomization distributions are created using comparable simulation techniques. The observed sample statistic is the focal point of a bootstrap distribution, whereas the null hypothesis value is the focal point of a randomization distribution.
A confidence interval contains a range of plausible estimates of the population parameter. This tutorial considers only two-tailed confidence intervals, which are directly connected to two-tailed hypothesis tests: the two typically lead to the same conclusion. In other words, a hypothesis test at the 0.05 level will virtually always fail to reject the null hypothesis if the 95% confidence interval contains the hypothesized value, and will nearly always reject the null hypothesis if the 95% confidence interval does not contain it.
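The link between a 95% confidence interval and a two-tailed test at the 0.05 level can be checked numerically. The sketch below assumes a z-interval with a known population standard deviation; the function name and the numbers (sample mean 65, σ = 2, n = 100, and two candidate hypothesized means) are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def z_interval_and_test(sample_mean, sigma, n, mu0, alpha=0.05):
    """Return the (1 - alpha) z-interval and whether the two-tailed test rejects H0."""
    standard_error = sigma / sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    lower = sample_mean - z_crit * standard_error
    upper = sample_mean + z_crit * standard_error
    z = (sample_mean - mu0) / standard_error
    rejects = abs(z) > z_crit
    return (lower, upper), rejects


# Two hypothesized means: one outside the interval, one inside it.
for mu0 in (64.0, 64.8):
    (lower, upper), rejects = z_interval_and_test(65, 2, 100, mu0)
    inside = lower <= mu0 <= upper
    print(f"mu0 = {mu0}: CI = ({lower:.2f}, {upper:.2f}), inside = {inside}, rejects = {rejects}")
```

In this setup the test rejects exactly when the hypothesized mean falls outside the interval, because both use the same standard error and the same critical value.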
Simple and Composite Hypothesis Testing
Depending on the population distribution, you can classify the statistical hypothesis into two types.
Simple Hypothesis: A simple hypothesis specifies an exact value for the parameter.
Composite Hypothesis: A composite hypothesis specifies a range of values.
Example:
A company is claiming that their average sales for this quarter are 1000 units. This is an example of a simple hypothesis.
Suppose the company claims that the sales are in the range of 900 to 1000 units. Then this is a case of a composite hypothesis.
One-Tailed and Two-Tailed Hypothesis Testing
The one-tailed test, also called a directional test, uses a critical region on one side of the sampling distribution. If the test statistic falls into that region, the null hypothesis is rejected in favor of the alternate hypothesis.
In a one-tailed test, the critical region is one-sided, meaning the test checks whether the parameter is either greater than or less than a specific value, but not both.
In a two-tailed test, the critical region is two-sided: the test checks whether the test statistic is greater than or less than a range of values.
If the test statistic falls outside this range, the null hypothesis is rejected in favor of the alternate hypothesis.
Right Tailed Hypothesis Testing
If the greater-than sign (>) appears in your alternative hypothesis, you are using a right-tailed test, also known as an upper-tailed test. In other words, the critical region lies in the right tail of the distribution. For instance, you can compare battery life before and after a change in production. If you want to know whether the battery life is longer than the original (let's say 90 hours), your hypothesis statements can be the following:
 The null hypothesis is H0: μ ≤ 90 (battery life is unchanged or shorter).
 The alternative hypothesis is H1: μ > 90 (battery life has increased).
The crucial point in this situation is that the alternate hypothesis (H1), not the null hypothesis, decides whether you get a right-tailed test.
Left Tailed Hypothesis Testing
Alternative hypotheses that assert the true value of a parameter is lower than the value in the null hypothesis are tested with a left-tailed test; they are indicated by the less-than sign "<".
Example:
Suppose H0: mean = 50 and H1: mean not equal to 50
According to H1, the mean can be greater than or less than 50. This is an example of a two-tailed test.
In a similar manner, if H0: mean >= 50, then H1: mean < 50
Here H1 states that the mean is less than 50. This is a one-tailed (left-tailed) test.
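The difference between left-tailed, right-tailed, and two-tailed tests shows up in how the p-value is computed from the same test statistic. The following sketch assumes a z statistic; the function name `z_p_value` and the value z = 1.8 are illustrative.

```python
from statistics import NormalDist


def z_p_value(z, tail):
    """Return the p-value of a z statistic for a 'left', 'right', or 'two' tailed test."""
    cdf = NormalDist().cdf
    if tail == "left":       # H1: parameter < hypothesized value
        return cdf(z)
    if tail == "right":      # H1: parameter > hypothesized value
        return 1 - cdf(z)
    if tail == "two":        # H1: parameter != hypothesized value
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")


z = 1.8  # hypothetical test statistic
for tail in ("left", "right", "two"):
    print(tail, round(z_p_value(z, tail), 4))
```

At a 5% significance level, this statistic is significant in a right-tailed test but not in a two-tailed test, which is why the tail must be chosen from the alternative hypothesis before looking at the data.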
Type 1 and Type 2 Error
A hypothesis test can result in two types of errors.
Type 1 Error: A Type I error occurs when the sample results lead you to reject the null hypothesis even though it is true.
Type 2 Error: A Type II error occurs when the null hypothesis is not rejected even though it is false.
Example:
Suppose a teacher evaluates the examination paper to decide whether a student passes or fails.
H0: Student has passed
H1: Student has failed
Type I error will be the teacher failing the student [rejects H0] although the student scored the passing marks [H0 was true].
Type II error will be the case where the teacher passes the student [do not reject H0] although the student did not score the passing marks [H1 is true].
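A small simulation can make the Type I error rate tangible. The sketch below repeatedly draws samples from a population where the null hypothesis is true and counts how often a two-tailed z-test at α = 0.05 rejects it anyway. The function name, the sample size of 30, the 4000 trials, and the fixed seed are all assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def type_i_error_rate(trials=4000, n=30, mu0=0.0, sigma=1.0, alpha=0.05, seed=0):
    """Estimate how often a two-tailed z-test rejects H0 when H0 is actually true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # The sample comes from the null distribution, so every rejection is a Type I error.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (mean(sample) - mu0) / (sigma / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

The estimated rate should land close to the chosen significance level, which is what α is designed to control.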
Practice Problems on Hypothesis Testing
Here are the practice problems on hypothesis testing that will help you understand how to apply these concepts in real-world scenarios:
Question 1
A telecom service provider claims that customers spend an average of ₹400 per month, with a standard deviation of ₹25. However, a random sample of 50 customer bills shows a mean of ₹250 and a standard deviation of ₹15. Does this sample data support the service provider’s claim?
Solution: Let’s break this down:
 Null Hypothesis (H0): The average amount spent per month is ₹400.
 Alternate Hypothesis (H1): The average amount spent per month is not ₹400.
Given:
 Population Standard Deviation (σ): ₹25
 Sample Size (n): 50
 Sample Mean (x̄): ₹250
1. Calculate the z-value:
z = (x̄ − μ0) / (σ / √n)
z = (250 − 400) / (25 / √50) ≈ −42.43
2. Compare with critical z-values: For a 5% significance level (two-tailed test), the critical z-values are −1.96 and +1.96. Since −42.43 is far outside this range, we reject the null hypothesis. The sample data suggests that the average amount spent is significantly different from ₹400.
Question 2
Out of 850 customers, 400 made online grocery purchases. Can we conclude that more than 50% of customers are moving towards online grocery shopping?
Solution: Here’s how to approach it:
 Sample proportion of customers who shopped online (p): 400 / 850 ≈ 0.47
 Null Hypothesis (H0): The proportion of online shoppers is 50% or more.
 Alternate Hypothesis (H1): The proportion of online shoppers is less than 50%.
Given:
 Sample Size (n): 850
 Significance Level (α): 5%
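1. Calculate the z-value:
z = (p − P) / √(P(1 − P) / n)
z = (0.4706 − 0.50) / √(0.5 × 0.5 / 850) ≈ −1.71
Here p is the sample proportion (400 / 850 ≈ 0.4706 before rounding) and P = 0.50 is the proportion stated in the null hypothesis.
2. Compare with the critical z-value: For a 5% significance level (one-tailed test), the critical z-value is −1.645. Since −1.71 is less than −1.645, we reject the null hypothesis. This means the data does not support the idea that most customers are moving towards online grocery shopping.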
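Question 2 can be checked with a short one-proportion z-test. This is a sketch using only the standard library; the function name is illustrative. Because it keeps the unrounded sample proportion, its z value can differ slightly from a hand calculation that first rounds the proportion to 0.47.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(successes, n, p0):
    """Return (p_hat, z, left-tailed p-value) for H0: p >= p0 versus H1: p < p0."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)   # uses p0 because H0 is assumed true
    z = (p_hat - p0) / standard_error
    p_value = NormalDist().cdf(z)              # area in the left tail
    return p_hat, z, p_value


p_hat, z, p_value = one_proportion_z(successes=400, n=850, p0=0.5)
print(f"p_hat = {p_hat:.4f}, z = {z:.2f}, p-value = {p_value:.4f}")
```

A p-value below 0.05 leads to the same decision as comparing z with the left-tailed critical value.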
Question 3
In a study of code quality, Team A has 250 errors in 1000 lines of code, and Team B has 300 errors in 800 lines of code. Can we say Team B performs worse than Team A?
Solution: Let’s analyze it:
 Proportion of errors for Team A (pA): 250 / 1000 = 0.25
 Proportion of errors for Team B (pB): 300 / 800 = 0.375
 Null Hypothesis (H0): Team B’s error rate is less than or equal to Team A’s.
 Alternate Hypothesis (H1): Team B’s error rate is greater than Team A’s.
Given:
 Sample Size for Team A (nA): 1000
 Sample Size for Team B (nB): 800
 Significance Level (α): 5%
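1. Calculate the z-value:
p = (nA·pA + nB·pB) / (nA + nB)
p = (1000 × 0.25 + 800 × 0.375) / (1000 + 800) ≈ 0.306
z = (pA − pB) / √(p(1 − p)(1/nA + 1/nB))
z = (0.25 − 0.375) / √(0.306 × (1 − 0.306) × (1/1000 + 1/800)) ≈ −5.72
2. Compare with the critical z-value: Because z is computed as pA − pB, the alternative hypothesis (pB > pA) corresponds to large negative values of z, so for a 5% significance level (one-tailed test) the critical z-value is −1.645. Since −5.72 is far less than −1.645, we reject the null hypothesis. The data indicates that Team B’s error rate is significantly higher than Team A’s, that is, Team B performs worse.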
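Question 3 is a two-proportion z-test with a pooled proportion, which can be sketched as follows. The function name is illustrative, and the statistic is defined as pA − pB to match the calculation above, so evidence that Team B is worse appears in the left tail.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x_a, n_a, x_b, n_b):
    """Return (pooled proportion, z, p-value) for H0: pB <= pA versus H1: pB > pA."""
    p_a, p_b = x_a / n_a, x_b / n_b
    pooled = (x_a + x_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error     # negative when Team B's rate is higher
    p_value = NormalDist().cdf(z)        # left tail, because z is defined as pA - pB
    return pooled, z, p_value


pooled, z, p_value = two_proportion_z(x_a=250, n_a=1000, x_b=300, n_b=800)
print(f"pooled = {pooled:.4f}, z = {z:.2f}, p-value = {p_value:.2e}")
```

Defining z as pB − pA instead would flip its sign and move the rejection region to the right tail; the decision would be the same.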
Applications of Hypothesis Testing
Apart from the practice problems, let's look at the real-world applications of hypothesis testing across various fields:

Medicine and Healthcare
In medicine, hypothesis testing plays a pivotal role in assessing the success of new treatments. For example, researchers may want to find out if a new exercise regimen improves heart health. By comparing data from patients who followed the program to those who didn’t, they can determine if the exercise significantly improves health outcomes. Such rigorous testing allows medical professionals to rely on proven methods rather than assumptions.

Quality Control and Manufacturing
In manufacturing, ensuring product quality is vital, and hypothesis testing helps maintain those standards. Suppose a beverage company introduces a new bottling process and wants to verify if it reduces contamination. By analyzing samples from the new and old processes, hypothesis testing can reveal whether the new method reduces the risk of contamination. This allows manufacturers to implement improvements that enhance product safety and quality confidently.

Education and Learning
In education and learning, hypothesis testing is a tool to evaluate the impact of innovative teaching techniques. Imagine a situation where teachers introduce project-based learning to boost critical thinking skills. By comparing the performance of students who engaged in project-based learning with those in traditional settings, educators can test their hypothesis. The results can help educators make informed choices about adopting new teaching strategies.

Environmental Science
Hypothesis testing is essential in environmental science for evaluating the effectiveness of conservation measures. For example, scientists might explore whether a new water management strategy improves river health. By collecting and comparing data on water quality before and after the implementation of the strategy, they can determine whether the intervention leads to positive changes. Such findings are crucial for guiding environmental decisions that have long-term impacts.

Marketing and Advertising
In marketing, businesses use hypothesis testing to refine their approaches. For instance, a clothing brand might test if offering limited-time discounts increases customer loyalty. By running campaigns with and without the discount and analyzing the outcomes, they can assess if the strategy boosts customer retention. Data-driven insights from hypothesis testing enable companies to design marketing strategies that resonate with their audience and drive growth.
Limitations of Hypothesis Testing
Hypothesis testing has some limitations that researchers should be aware of:
 It cannot prove or establish the truth: Hypothesis testing provides evidence to support or reject a hypothesis, but it cannot confirm the absolute truth of the research question.
 Results are sample-specific: Hypothesis testing is based on analyzing a sample from a population, and the conclusions drawn are specific to that particular sample.
 Possible errors: During hypothesis testing, there is a chance of committing type I error (rejecting a true null hypothesis) or type II error (failing to reject a false null hypothesis).
 Assumptions and requirements: Different tests have specific assumptions and requirements that must be met to accurately interpret results.
Conclusion
After reading this tutorial, you should have a much better understanding of hypothesis testing, one of the most important concepts in the field of data science. The majority of hypotheses are based on speculation about observed behavior, natural phenomena, or established theories.
FAQs
1. What is hypothesis testing in statistics with example?
Hypothesis testing is a statistical method used to determine if there is enough evidence in a sample data to draw conclusions about a population. It involves formulating two competing hypotheses, the null hypothesis (H0) and the alternative hypothesis (Ha), and then collecting data to assess the evidence. An example: testing if a new drug improves patient recovery (Ha) compared to the standard treatment (H0) based on collected patient data.
2. What is H0 and H1 in statistics?
In statistics, H0 and H1 represent the null and alternative hypotheses. The null hypothesis, H0, is the default assumption that no effect or difference exists between groups or conditions. The alternative hypothesis, H1, is the competing claim suggesting an effect or a difference. Statistical tests determine whether to reject the null hypothesis in favor of the alternative hypothesis based on the data.
3. What is a simple hypothesis with an example?
In statistics, a simple hypothesis specifies an exact value for a population parameter, for example, "the average sales for this quarter are 1000 units." In research design, the term is also used for a specific statement predicting a single relationship between two variables, for example, "Increased sunlight exposure increases the growth rate of sunflowers." Here, the hypothesis suggests a direct relationship between the amount of sunlight (independent variable) and the growth rate of sunflowers (dependent variable), with no additional variables considered.
4. What are the 3 major types of hypothesis?
The three major types of hypotheses are:
 Null Hypothesis (H0): Represents the default assumption, stating that there is no significant effect or relationship in the data.
 Alternative Hypothesis (Ha): Contradicts the null hypothesis and proposes a specific effect or relationship that researchers want to investigate.
 Non-directional Hypothesis: An alternative hypothesis that doesn't specify the direction of the effect, leaving it open for both positive and negative possibilities.
5. What software tools can assist with hypothesis testing?
Several software tools offering distinct features can help with hypothesis testing. R and RStudio are popular for their advanced statistical capabilities. The Python ecosystem, including libraries like SciPy and Statsmodels, also supports hypothesis testing. SAS and SPSS are well-established tools for comprehensive statistical analysis. For basic testing, Excel offers simple built-in functions.
6. How do I interpret the results of a hypothesis test?
Interpreting hypothesis test results involves comparing the p-value to the significance level (alpha). If the p-value is less than or equal to alpha, you can reject the null hypothesis, indicating statistical significance. This suggests that an effect at least as large as the one observed would be unlikely if the null hypothesis were true. If the p-value is greater than alpha, you fail to reject the null hypothesis, which means the data do not provide sufficient evidence of an effect.
7. Why is sample size important in hypothesis testing?
Sample size is crucial in hypothesis testing as it affects the test’s power. A larger sample size increases the likelihood of detecting a true effect, reducing the risk of Type II errors. Conversely, a small sample may lack the statistical power needed to identify differences, potentially leading to inaccurate conclusions.
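The effect of sample size on power can be computed directly for a two-tailed one-sample z-test. The sketch below is an illustration under stated assumptions: a known population standard deviation, a hypothetical true effect of 0.5 units with σ = 2, and the function name `z_test_power`.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-tailed one-sample z-test when the true mean is mu0 + effect."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    shift = effect * sqrt(n) / sigma   # mean of the z statistic under the alternative
    # Probability that z lands in either rejection region.
    return (1 - norm.cdf(z_crit - shift)) + norm.cdf(-z_crit - shift)


# Hypothetical scenario: true effect of 0.5 units, sigma = 2.
for n in (10, 50, 100, 200):
    print(n, round(z_test_power(effect=0.5, sigma=2, n=n), 3))
```

Power rises as n grows, and one minus the power is the probability of a Type II error for that effect size.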
8. Can hypothesis testing be used for non-numerical data?
Yes, hypothesis testing can be applied to non-numerical data through non-parametric tests. These tests are ideal when data doesn't meet parametric assumptions or when dealing with categorical data. Non-parametric tests, like the chi-square or Mann-Whitney U test, provide robust methods for analyzing non-numerical data and drawing meaningful conclusions.
9. How do I choose the proper hypothesis test?
Selecting the right hypothesis test depends on several factors: the objective of your analysis, the type of data (numerical or categorical), and the sample size. Consider whether you're comparing means, proportions, or associations, and whether your data follows a normal distribution. The correct choice ensures accurate results tailored to your research question.