75+ Data Analyst Interview Questions With Answers
TL;DR: This 2026 guide covers 75+ data analyst interview questions with clear, job-ready answers across SQL, statistics, Excel, Python, Tableau, and case rounds. Use it to master the most-asked questions and practice the exact skills interviewers test.

Introduction

Data analyst interviews have become more competitive as companies hire for decision-ready analytics. In the United States, the U.S. Bureau of Labor Statistics projects 34% growth in data analyst roles, signaling sustained demand for analytics capabilities across teams, not just in niche roles.

You’ll see this demand show up in today’s hiring trends. Analytics roles are common across finance, e-commerce, healthcare, SaaS, and operations, with a growing mix of remote and hybrid work.

On LinkedIn, current United States listings show 115,000+ Data Analyst jobs, split across on-site, hybrid, and remote; a useful snapshot of how widely the role appears and how work models vary.

Data Analyst Interview Scenario

Data analyst interviews typically test SQL, Excel, statistics, dashboards (Tableau/Power BI), and Python, plus your ability to clean data, validate results, and explain insights. Expect technical questions, workflow/process questions, and a few case-style prompts about how you’d solve real business problems.

Top 15 Most Asked Data Analyst Interview Questions

1. What does a data analyst do?

A data analyst turns raw data into insights that drive business decisions using SQL, Excel, dashboards, and clear storytelling.

2. Why is exploratory data analysis (EDA) important?

EDA helps you spot data quality issues early, understand distributions and segments, identify anomalies, and avoid misleading conclusions before reporting or modeling.

3. What is data wrangling?

Data wrangling is the method of cleaning, structuring, and enriching raw data to make it ready for analysis.

4. Difference between WHERE and HAVING in SQL.

  • WHERE filters rows before aggregation
  • HAVING filters grouped results after aggregation

WHERE VS HAVING

5. Difference between INNER JOIN and LEFT JOIN.

INNER Join keeps only matches; LEFT keeps all rows from the left table and matches from the right (else NULL).

6. What is a window function?

The window function performs calculations across related rows without collapsing results (e.g., ranking with ROW_NUMBER).

7. What is CTE?

A CTE (WITH ...) is a named temporary result set that improves readability and debugging for multi-step logic.

8. How do you handle missing values?

Measure missingness, then drop, impute (median/mean), model-impute, or use multiple imputation based on bias risk to handle missing values.

9. What is a normal distribution?

A normal distribution is a bell-shaped distribution where mean=median=mode and most values fall within 1–3 standard deviations (68–95–99.7).

10. Difference between Correlation and causation.

Correlation shows association; causation requires testing (ideally with experiments) to rule out confounders.

11. What are Type I and Type II errors?

  • Type I is a false positive error
  • Type II is a false negative error

12. How do you treat outliers?

Validate the cause, then remove errors, cap values, transform (e.g., log), or use robust statistics, depending on the context.

13. Difference between VLOOKUP and XLOOKUP.

  • VLOOKUP only searches left-to-right and relies on a fixed column index
  • XLOOKUP can look left or right, defaults  to an exact match, and won’t break if columns are inserted

14. Difference between Extract and Live in Tableau.

  • An extract is a fast snapshot that you refresh
  • A live query connects to the source in real time for always-current data

15. How do you explain insights to stakeholders?

Share the takeaway, the impact, the key driver, and the recommended next step in plain language.

Your Data Analytics Career is Around The Corner!

Data Analyst CourseExplore Program
Your Data Analytics Career is Around The Corner!

General Data Analyst Interview Questions

This section covers high-level questions about the data analyst role, core concepts, and how you approach problem-solving. These Q&As are great for early-round interviews where clarity and fundamentals matter most.

1. What’s the difference between data mining and data profiling?

  • Data mining: discovers patterns/relationships (segments, anomalies, predictions).
  • Data profiling: evaluates quality and structure (nulls, uniqueness, validity, distributions).

2. What common problems do data analysts face during analysis?

  • messy data (missing, duplicates, inconsistent formats)
  • unclear metric definitions across teams
  • incorrect joins or grain mismatches that inflate totals
  • access/privacy/compliance constraints
  • stakeholder misalignment and shifting scope
  • speed vs accuracy tradeoffs

3. Which tools have you used for analysis and presentation?

  • SQL for querying and aggregation
  • Excel/Sheets for quick analysis, QA, pivots
  • Python/R for cleaning, automation, stats
  • BI tools for dashboards (Tableau/Power BI)
  • Slides for story & recommendation

4. What are the best practices for data cleaning?

  • profile first (nulls, duplicates, ranges, distributions)
  • standardize formats (dates, units, categories)
  • validate business rules (e.g., order_date ≤ ship_date)
  • handle missing values intentionally (drop/impute/flag)
  • investigate outliers before removing
  • document changes for reproducibility

5. What are the steps in an analytics project?

  • Define the problem
  • Collect and Validate data
  • Run EDA
  • Analyze and Visualize
  • Recommend actions and Document results

Analytics Project Workflow

Image Representation: 8-step analytics project workflow

6. Explain descriptive vs predictive vs prescriptive analytics.

  • Descriptive: what happened (KPIs, dashboards)
  • Predictive: what might happen (forecasts, probability models)
  • Prescriptive: what to do (recommendations, optimization)

7. What sampling techniques do analysts use?

  • simple random
  • systematic (every nth record)
  • stratified (by segment, sample within each)
  • cluster (sample groups)
  • purposive/judgmental (use cautiously; bias risk)

8. What are your strengths and weaknesses as a data analyst?

  • Strength: I’m strong in SQL debugging and metric definitions.
  • Weakness & fix: I used to over-polish dashboards; now I timebox and validate the decision first.

9. What are ethical considerations in data analysis?

  • privacy & security
  • informed consent/purpose limitation
  • bias and fairness
  • transparency of assumptions
  • data integrity and accuracy
  • accountability for downstream impact

10. Univariate vs bivariate vs multivariate analysis?

  • Univariate: one variable (distribution)
  • Bivariate: relationship between two variables
  • Multivariate: 3+ variables together (interactions)

11. What data visualization tools are commonly used?

Tableau, Power BI, Excel charts, Python libraries (Matplotlib/Plotly), Looker/Looker Studio, and others.

Want to Become a Data Analyst? Learn From Experts!

Data Analyst CourseExplore Program
Want to Become a Data Analyst? Learn From Experts!

SQL Interview Questions for Data Analysts

SQL-based data analyst interview questions focus on the SQL patterns most commonly tested, such as sets, subqueries/CTEs, ranking, and join-based aggregation. These Q&As reflect real reporting and dashboard queries analysts write daily.

12. Can you use a SELECT alias in the WHERE clause?

Usually, no; WHERE is evaluated before SELECT.

Incorrect:

SELECT salary * 12 AS annual_salary
FROM employees
WHERE annual_salary > 100000;

Correct:

SELECT salary * 12 AS annual_salary
FROM employees
WHERE salary * 12 > 100000;

13. Compare UNION vs INTERSECT vs EXCEPT?

  • UNION: combines results (removes duplicates); UNION ALL keeps duplicates
  • INTERSECT: returns common rows
  • EXCEPT: returns rows in the first query not in the second (MINUS in some DBs)

14. What is a correlated vs. non-correlated subquery?

  • Non-correlated subquery: runs once and feeds the outer query.
  • Correlated subquery: runs per row (more expensive); use carefully.

15. ROW_NUMBER vs RANK vs DENSE_RANK: how are they different?

  • ROW_NUMBER: unique numbering even with ties (1,2,3…)
  • RANK: ties share rank; skips numbers (1,1,3…)
  • DENSE_RANK: ties share rank; no gaps (1,1,2…)

16. How do you find the Nth highest value in SQL?

Use DENSE_RANK() (distinct ranks) or ROW_NUMBER() (row-based ranking) with an ORDER BY to find the Nth highest value in SQL.

Row Number VS Dense Rank

Image Representation: Difference between ROW_NUMBER and DENSE_RANK for nth highest SQL queries.

17. Write a code: Join and aggregate example: products with total units sold > 1.5 million.

SELECT p.ProductID,
p.ProductName,
SUM(s.UnitSold) AS total_units_sold
FROM product p
JOIN sales_order_detail s
ON p.ProductID = s.ProductID
GROUP BY p.ProductID, p.ProductName
HAVING SUM(s.UnitSold) > 1500000;

18. What is a stored procedure?

A stored procedure is a saved, reusable set of SQL statements (often parameterized) used for repeatable tasks, transformations, and controlled access.

19. Write a stored procedure to print even numbers between two inputs.

Logic:
accept start_num, end_num
loop from start to end
output values where n % 2 = 0

(Exact syntax depends on SQL Server/MySQL/Postgres.)

Learn 17+ in-demand data analysis skills and tools, including Tableau, Excel, SQL, Python, R, PowerBI and more, through 10+ course-end projects with this Data Analyst Course.

Statistics Interview Questions for Data Analysts

This set of data analyst interview questions and answers tests your ability to reason with data using time series, regression, hypotheses, and experimentation.

20. What is time series analysis?

Time series analysis examines data collected over time (trend, seasonality, autocorrelation). Use time-based splits (not random) and consider lag/rolling features.

21. Overfitting vs underfitting: what’s the difference?

  • Overfitting: great on training data, poor on test data (learns noise)
  • Underfitting: poor on both (too simple to capture patterns)

Underfitting vs Good fit vs Overfitting

Image Representation: Underfitting vs good fit vs overfitting illustrated with model curves.

22. Null hypothesis vs alternative hypothesis.

  • H₀ (null): no effect/relationship exists
  • H₁ (alternative): an effect/relationship exists

Example: H₀ “no conversion difference,” H₁ “conversion differs.”

23. Compare variance and covariance.

  • Variance: spread of one variable around its mean
  • Covariance: how two variables move together (direction, not standardized)

Variance and covariance are used in correlation, regression, PCA, and risk/variability reasoning.

24. What is regression, and when would you use it?

Regression is a statistical method that models the relationship between one or more input variables and an outcome variable to explain drivers or make predictions.

You use it to quantify impact (e.g., how price affects sales), control for confounders, forecast a metric, or estimate probabilities.

  • Use linear regression for continuous outcomes (revenue, demand)
  • Use logistic regression for probability outcomes (conversion yes/no)

Use it to quantify drivers, control for factors, and build interpretable baselines.

25. What is A/B testing, and how do you interpret results?

A/B testing is a randomized experiment that compares the control and the variant.

Interpret using lift, confidence interval/p-value, sample size, test duration, practical impact, and guardrails (quality/revenue) to avoid winning the wrong way.

Excel Interview Questions for Data Analysts

This section covers Excel skills used for quick analysis, QA checks, pivots, and repeatable reporting workflows. Q&As include the kinds of functions, logic, and pivot thinking interviewers expect.

26. In Excel, a numeric value is treated as text if it precedes what?

In Excel, a numeric value is treated as text when it’s preceded by an apostrophe (').

Example: typing '123 stores 123 as text (often left-aligned and not used in math until converted).

27. COUNT vs COUNTA vs COUNTBLANK vs COUNTIF.

  • COUNT: numeric cells
  • COUNTA: non-empty cells
  • COUNTBLANK: blank cells
  • COUNTIF: count by one condition

28. How do you create a dropdown list?

To create a dropdown list in Excel:

  • Select the cell(s)
  • Go to Data → Data Validation
  • In Allow, choose List
  • In Source, select the range (or type values separated by commas)
  • Click OK

29. Can PivotTables use dynamic ranges?

Yes. The best method is to use an Excel Table as the source (it auto-expands). Named ranges work too; OFFSET is volatile.

30. Which function returns the day of the week?

Use WEEKDAY().

Example: =WEEKDAY(A1, 2) returns 1–7 for Mon–Sun

31. How does AND() work?

Returns TRUE only if all conditions are TRUE: AND(B2>45, C2="Yes")

32. TODAY() vs NOW().

  • TODAY() returns the current date
  • NOW() returns the current date and time

33. SUMIFS scenario: total quantity where Sales Rep starts with “A” and Cost > 10

Use SUMIFS with criteria "A*" and ">10":

SUMIFS(QuantityRange, SalesRepRange, "A*", CostRange, ">10")

34. PivotTable: sales by Sales Rep and Item as % of total.

To show sales by Sales Rep and Item as % of Grand Total in an Excel PivotTable:

  • Select your data → Insert → PivotTable
  • Drag Sales Rep and Item to Rows
  • Drag Sales Total to Values (make sure it’s Sum)
  • Right-click any value in Sum of Sales Total → Show Values As → % of Grand Total

Pivot Table Setup

Image Representation: How to show PivotTable values as a percentage of the total in Excel.

35. INDEX-MATCH vs VLOOKUP: Why do many analysts prefer INDEX-MATCH?

INDEX-MATCH is more flexible (left/right), more robust when tables change, and supports stronger matching patterns than a hard-coded column index.

36. What is Power Query, and why is it useful?

Power Query imports and transforms data with repeatable steps (clean, split, merge, and append) and supports refreshable reporting workflows.

Python Interview Questions for Data Analysts

Python-specific data analyst interview questions center on NumPy & Pandas tasks such as indexing, filtering, grouping, merging, handling missing values, and working with dates.

37. What is the correct syntax for reshape() in NumPy?

np.reshape(arr, newshape)
# or
arr.reshape(newshape)

38. What are the different ways to create a DataFrame in Pandas?

  • From a dictionary

import pandas as pd
df = pd.DataFrame({"Name": ["A", "B"], "Age": [25, 30]})
  • From a list of dictionaries

df = pd.DataFrame([{"Name": "A", "Age": 25}, {"Name": "B", "Age": 30}])
  • From files (CSV/Excel)

df = pd.read_csv("emp.csv")
# df = pd.read_excel("emp.xlsx")
  • From NumPy Arrays

import numpy as np
arr = np.array([[1, 2], [3, 4]])
df = pd.DataFrame(arr, columns=["A", "B"])

39. Create a DataFrame from emp.csv and show the head and summary.

import pandas as pd
df = pd.read_csv("emp.csv")
print(df.head())
print(df.describe())
print(df.info())

40. How to select Department and Age columns from a DataFrame?

To select the Department and Age columns from a Pandas DataFrame:

df[["Department", "Age"]]

41. How to extract 8 from a 2D NumPy array using indexing?

import numpy 
as npnum = np.array([[1,2,3],[4,5,6],[7,8,9]])
num[2, 1]

42. From [0–9], return [1,3,5,7,9].

arr = np.array([0,1,2,3,4,5,6,7,8,9])
arr[1::2]
# or
arr[arr % 2 == 1]

43. How to stack arrays a and b horizontally?

np.hstack((a, b))
# or for 2D
np.concatenate((a, b), axis=1)

44. How to add a column to a DataFrame?

To add a new column to a Pandas DataFrame:

df["Address"] = ["NY", "CA", "TX"]

45. How to print four random integers between 1 and 15?

import numpy as np
np.random.randint(1, 16, size=4)

46. Find unique values per column and subset Age < 35 and Height > 6.

df.nunique()
df["Department"].unique()
df[(df["Age"] < 35) & (df["Height"] > 6)]

47. Plot a sine wave.

import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 2*np.pi, 200)
y = np.sin(x)
plt.plot(x, y)
plt.show()

48. Company with the highest average sales and transposed summary stats.

avg_sales = df.groupby("Company")["Sales"].mean()
top_company = avg_sales.idxmax()
df["Sales"].describe().to_frame().T

49. How to merge two DataFrames in Pandas?

import pandas as pd
df_merged = pd.merge(df1, df2, on="EmployeeID", how="left")

50. How to handle missing values in Pandas?

df.isna().sum()
df["Age"] = df["Age"].fillna(df["Age"].median())

Missing Values Workflow

Image Representation: Pandas missing values workflow with detect, decide, and fix steps.

51. How to convert and filter dates in Pandas?

df["OrderDate"] = pd.to_datetime(df["OrderDate"])
df[(df["OrderDate"] >= "2026-01-01") & (df["OrderDate"] < "2026-02-01")]
df["month"] = df["OrderDate"].dt.month

Tableau Interview Questions for Data Analysts

These data analyst interview questions evaluate how you connect and model data, build accurate KPIs, and choose the right visual form. They also test performance choices and dashboard-building fundamentals.

52. Joining vs blending in Tableau: what’s the difference?

  • Join: combines tables at the row level (like SQL) using common fields.
  • Blend: combines multiple sources at the viz layer (primary/secondary).

Use joins/relationships when possible; blend when sources can’t be joined cleanly.

53. What is LOD in Tableau?

LOD (Level of Detail) expressions control how metrics are computed regardless of view dimensions.

54. What joins does Tableau support?

Tableau supports

  • Inner Join
  • Left Join
  • Right Join
  • Full Outer Join

55. What is a Gantt chart?

A Gantt chart shows duration/progress over time using bars along a time axis (useful for timelines and SLAs).

56. Superstore: Analyze Sales, Profit, Quantity by Category/Subcategory.

  • Category & Subcategory → Rows
  • Sales → Columns
  • Profit → Color
  • Quantity → Label

Sort by Sales descending.

57. Create a dual-axis chart for Sales and Profit over time.

To create a dual-axis chart for Sales and Profit over time in Tableau:

  • Drag Order Date to Columns (set to Year or Month)
  • Drag Sales to Rows
  • Drag Profit to Rows (next to Sales), then right-click Profit → Dual Axis
  • Right-click an axis → Synchronize Axis
  • In Marks, set Sales = Bar and Profit = Line (adjust size/colors as needed)

58. How to build a state-wise Sales and Profit map?

To build a State-wise Sales and Profit map in Tableau:

  • Drag State to the view (or double-click it to create a map)
  • Set Marks to Map (or Symbol Map)
  • Drag Sales to Size
  • Drag Profit to Color
  • Add labels/tooltips (e.g., State, Sales, Profit) and adjust the color scale for readability

59. Treemap vs Heatmap: when do you use each?

  • Treemap: part-to-whole in nested rectangles
  • Heatmap: patterns across two dimensions using color intensity

60. Top 5 and bottom 5 customers by profit (sets).

Create a Top N set and a Bottom N set on Customer Name by SUM(Profit), combine the sets, and filter the combined set.

Join The Ranks of Top-Notch Data Analysts!

Data Analyst CourseExplore Program
Join The Ranks of Top-Notch Data Analysts!

Case an Behavioral Data Analyst Interview Questions

This section helps you structure strong answers for real-world scenarios and stakeholder conversations. These Q&As show how you think, validate, and influence decisions with data.

61. Walk me through how you’d analyze a drop in conversions.

Confirm definitions → segment (device/channel/geo/cohort) → validate tracking/data freshness → isolate funnel step issues → map to releases/campaigns → recommend actions with expected impact.

62. How do you explain a technical finding to a non-technical stakeholder?

State the takeaway, quantify impact, explain the “why” simply, and give a recommended next step with confidence level and assumptions.

63. What do you do when the data looks wrong?

Check source freshness, compare to baselines, validate joins and grain, audit nulls/outliers, and document what changed before reporting results.

64. How do you choose the right dashboard metric?

Choose metrics tied to decisions, clearly defined, stable over time, segmentable for drivers, and paired with guardrails to prevent misleading wins.

65. Tell me about a time you influenced a decision with data.

Use this structure: Context → Analysis → Insight → Action → Impact. Briefly explain the problem, what you analyzed (metrics/segments), the recommendation you made, and the measurable outcome (lift, savings, risk reduced).

Data Analyst Interview Prep Radar

1. SQL (Core Screen)

  • Joins & aggregation (GROUP BY, HAVING)
  • Window functions (ROW_NUMBER, DENSE_RANK)
  • CTEs & subqueries (clean, readable queries)

2. Statistics (Decision Quality)

  • Hypothesis testing (Type I/II, significance)
  • Outliers & missing data (treatments & tradeoffs)
  • Regression & A/B testing basics (drivers, lift)

3. Excel (Fast Analysis)

  • PivotTables (% of total, sorting)
  • Lookup logic (XLOOKUP/INDEX-MATCH)
  • Data validation & cleaning (dropdowns, formats)

4. Python (Real-World Workflows)

  • Pandas filtering & grouping (boolean masks, groupby)
  • Merge & missing values (merge, isna, fillna)
  • Dates & quick plots (to_datetime, dt, Matplotlib)

5. Tableau (Dashboard Skills)

  • Join vs blend & data modeling choices
  • LOD expressions (FIXED/INCLUDE/EXCLUDE)
  • Dual-axis & maps (Sales vs Profit over time, state views)

About the Author

Kshitij ChoughuleKshitij Choughule

Kshitij is a data analytics professional passionate about turning numbers into business stories. He enjoys working on websites, CRM, and revenue analytics to improve lead conversion and marketing ROI. In his writing, he shares practical tips on SQL, dashboards, KPIs, and data-driven decision making.

View More
  • Acknowledgement
  • PMP, PMI, PMBOK, CAPM, PgMP, PfMP, ACP, PBA, RMP, SP, OPM3 and the PMI ATP seal are the registered marks of the Project Management Institute, Inc.
  • *All trademarks are the property of their respective owners and their inclusion does not imply endorsement or affiliation.
  • Career Impact Results vary based on experience and numerous factors.