TL;DR: This guide breaks down the top SQL interview questions for 2026, covering everything from basic queries to advanced joins. It’s the perfect cheat sheet for data analysts, engineers, and freshers preparing for technical rounds.

Introduction

SQL is used to communicate with relational databases. It proposes a standardized way to interact with databases, allowing users to perform various operations on the data, including retrieval, insertion, updation, and deletion.

SQL dominates the tech world and serves as the backbone for nearly every modern data stack. Stack Overflow’s 2025 Developer survey reveals that 59% of developers rely on it daily, while salaries for senior data roles have climbed past $124,000.

We compiled this list of SQL interview questions as landing these highly competitive jobs requires both strong theory and rigorous practice. Whether you want a role at a startup or an MNC, these problems mirror the actual challenges hiring managers throw at you.

A. SQL Interview Questions for Freshers

1. What are the different types of SQL commands?

SQL commands are grouped into categories:

  • DDL (Data Definition Language): CREATE, ALTER, DROP, TRUNCATE
  • DML (Data Manipulation Language): INSERT, UPDATE, DELETE
  • DQL (Data Query Language): SELECT
  • DCL (Data Control Language): GRANT, REVOKE
  • TCL (Transaction Control Language): COMMIT, ROLLBACK, SAVEPOINT

2. What are the different types of SQL commands?

SQL commands are grouped into categories:

  • DDL (Data Definition Language): CREATE, ALTER, DROP, TRUNCATE
  • DML (Data Manipulation Language): INSERT, UPDATE, DELETE
  • DQL (Data Query Language): SELECT
  • DCL (Data Control Language): GRANT, REVOKE
  • TCL (Transaction Control Language): COMMIT, ROLLBACK, SAVEPOINT

3. What is the difference between Database, DBMS, and RDBMS?

  • Database: Organized collection of data
  • DBMS: Software to manage databases (e.g., MS Access)
  • RDBMS: A DBMS using relational tables with rows and columns, supporting relationships (e.g., MySQL, PostgreSQL, SQL Server)

4. What is a Primary Key?

A column or set of columns that uniquely identify each row. A primary key cannot contain duplicates or NULLs.

5. What is a foreign key?

Foreign key is a field in one table referencing the primary key in another. It establishes a relationship between the two tables, ensuring data consistency and enabling data retrieval across tables.

6. What is a Unique Key?

Often referred to as a unique constraint, a unique key guarantees that every value in a column (or a combination of columns) remains distinct and cannot be replicated within a table. In contrast to a primary key, a table has the flexibility to incorporate multiple unique keys.

7. What is NULL in SQL?

NULL represents missing or unknown data. It is not zero or an empty string. To check NULLs, use IS NULL or IS NOT NULL.

8. What is the difference between CHAR and VARCHAR?

  • CHAR(n): Fixed length; pads unused space with blanks
  • VARCHAR(n): Variable length; stores only entered characters

9. What are Data Types in SQL?

Common types include:

  • Numeric → INT, DECIMAL, FLOAT
  • String → CHAR, VARCHAR, TEXT
  • Date/Time → DATE, TIME, TIMESTAMP
  • Binary → BLOB

10. What is the difference between Primary Key and Unique Key?

  • Primary Key: One per table, cannot be NULL
  • Unique Key: Multiple allowed, NULLs permitted (depends on DB engine)

11. What are Default and Check Constraints?

  • DEFAULT: Assigns a default value if none is provided
  • CHECK: Ensures values meet a condition
CREATE TABLE employees (
id INT PRIMARY KEY,
salary DECIMAL DEFAULT 30000,
age INT CHECK (age >= 18)
);

12. What is a Schema?

A schema is a logical container for database objects such as tables, views, and procedures. It organizes and groups related objects.

13. What is a Candidate Key?

A candidate key is a column (or set) that can uniquely identify rows. One candidate key becomes the primary key, while others remain alternatives.

14. What is a Composite Key?

A key formed by combining two or more columns to uniquely identify a row.

15. What is the difference between Super Key, Candidate Key, and Primary Key?

  • Super Key: Any column set that uniquely identifies rows
  • Candidate Key: Minimal super key (no redundant attributes)
  • Primary Key: Chosen candidate key used for row identification

16. What is a Cartesian Join?

Also called a cross join, it returns the Cartesian product of two tables.

SELECT * FROM employees, departments;

17. What is the difference between SQL, MySQL, and SQL Server?

  • SQL: The language for managing relational data
  • MySQL: An open-source RDBMS using SQL
  • SQL Server: Microsoft’s RDBMS product

18. What is the difference between DELETE and TRUNCATE?

Aspect

DELETE

TRUNCATE

WHERE

Can filter rows

Cannot filter

Logging

Row-by-row

Minimal

Speed

Slower

Faster

Rollback

Supported

Limited by DB

Identity reset

No

Often resets


19. What is the difference between SQL and NoSQL?

  • SQL: Relational, structured, ACID-compliant
  • NoSQL: Non-relational, schema-free, designed for scalability

20. What is the logical order of execution in SQL?

SQL doesn’t run in the same order you write it. It’s logically processed like this:

  • FROM/JOIN (build the dataset)
  • WHERE (filter rows)
  • GROUP BY (create groups)
  • HAVING (filter groups)
  • SELECT (pick columns, compute expressions)
  • ORDER BY (sort the final result)
  • LIMIT/TOP (return only some rows)

This explains why aliases usually don’t work in WHERE and why HAVING is for aggregated filters.

In the following section, we take a look at the common intermediate SQL interview questions and answers so that you'll know what to expect from your interviewer.

B. SQL Interview Questions for Intermediate

21. What are SQL Constraints?

Rules applied to maintain data integrity. Common ones include PRIMARY KEY, FOREIGN KEY, UNIQUE, CHECK, and NOT NULL.

22. What is Normalization?

Normalization reduces redundancy.

  • 1NF: Atomic values
  • 2NF: No partial dependency
  • 3NF: No transitive dependency
  • BCNF: Every determinant is a candidate key

23. What is the difference between WHERE and HAVING?

Feature

WHERE

HAVING

Applies to

Rows

Groups

Aggregates

Not allowed

Allowed

Execution

Before grouping

After grouping

24. What are Indexes?

Indexes speed up lookups. Types include clustered, non-clustered, composite, unique, and covering indexes.

25. What is GROUP BY?

GROUP BY groups rows with same values, used with aggregates.

26. What’s the difference between COUNT(*), COUNT(1), and COUNT(column)?

  • COUNT(*) counts rows. It doesn’t care about NULLs in columns
  • COUNT(column) counts only rows where that column is not NULL
  • COUNT(1) is typically optimized to behave like row-counting in most engines, but it’s not semantically better than COUNT(*)

If an interviewer asks this, they’re usually checking one thing: whether you understand that COUNT(column) ignores NULLs.

27. What is a View?

A view is a saved query result presented as a virtual table.

Did You Know? 🔍

The job outlook for web developers and designers will grow by 8% between 2023 and 2033.🚀 (Source: U.S. Bureau of Labor Statistics)

28. What are Common Table Expressions (CTEs)?

A CTE is a temporary result set defined with WITH used in queries.

WITH dept_avg AS (
SELECT department_id, AVG(salary) AS avg_sal
FROM employees
GROUP BY department_id
)
SELECT * FROM employees
JOIN dept_avg d ON e.department_id = d.department_id
WHERE e.salary > d.avg_sal;

29. What is a Recursive CTE?

A recursive CTE references itself, often used for hierarchical data.

30. What are Wildcards in SQL?

Wildcards match patterns in strings:

  • % → matches zero or more characters
  • _ → matches exactly one character

31. How do you do pattern matching in SQL, and what are the common pitfalls?

The most common way is LIKE, where % matches any number of characters and _ matches a single character.

-- starts with 'Mon'
SELECT * FROM users
WHERE name LIKE 'Mon%';

-- contains 'sql'
SELECT * FROM articles
WHERE title LIKE '%sql%';

-- exactly 5 characters
SELECT * FROM codes
WHERE code LIKE '_____';

A few things interviewers actually care about:

  • Case sensitivity depends on the database or collation. In PostgreSQL, you’ll often see ILIKE for case-insensitive matches
  • Leading wildcards (%sql) are slow at scale because indexes usually cannot help. Prefix searches (sql%) are much friendlier
  • If you need to match a literal % or _, you’ll typically use an ESCAPE clause (syntax varies slightly across databases)

32. Difference between IN and EXISTS

  • IN: Compares a value against a list or subquery
  • EXISTS: Checks if subquery returns any rows

33. Difference between View and Materialized View

  • View: Virtual, recalculated every time
  • Materialized View: Stores results physically, refreshed periodically

34. What is a Temporary Table?

A table created for intermediate storage, often session-specific.

35. What is a Subquery?

A query nested inside another. Can be simple or correlated.

36. What is a Trigger?

Code that runs automatically on events like insert, update, or delete.

37. Why is NOT IN (subquery) dangerous with NULLs, and what’s safer?

NOT IN is risky with NULLs because SQL uses 3-valued logic of TRUE/FALSE/UNKNOWN

If your subquery returns even one NULL, this can happen:

WHERE customer_id NOT IN (SELECT customer_id FROM orders)

If that inner list contains NULL, the comparison becomes UNKNOWN for many rows, and you get no rows.

Safer patterns to prefer:

  • NOT EXISTS (this is the best default)
  • LEFT JOIN … IS NULL (this one works, but you have to be careful with join conditions)

38. What is a Stored Procedure?

A precompiled set of SQL statements stored in the database.

39. In a LEFT JOIN, what’s the difference between filtering in ON vs WHERE?

Filtering in WHERE can turn a LEFT JOIN into an INNER JOIN by removing NULL-extended rows.

This is one of the most common quiet traps in interviews, and you should be prepared for it. So, when asked, you can say:

  • Filters in the ON clause decide which rows match during the join
  • Filtering in WHERE can turn a LEFT JOIN into an INNER JOIN by removing NULL-extended rows

Example:

-- Keeps customers even if they have no paid orders
SELECT c.id
FROM customers c
LEFT JOIN orders o
  ON o.customer_id = c.id
 AND o.status = 'paid';

But this version turns into an inner join:

SELECT c.id
FROM customers c
LEFT JOIN orders o
  ON o.customer_id = c.id
WHERE o.status = 'paid';  -- removes NULL matches

40. What is a Function?

A reusable program that returns a single value or a table.

41. What is an anti-join, and how do you write it?

An anti-join returns rows in A that have no match in B.

NOT EXISTS (You can say, this is my default recommendation):

SELECT c.id, c.name
FROM customers c
WHERE NOT EXISTS (
  SELECT 1
  FROM orders o
  WHERE o.customer_id = c.id
);

LEFT JOIN … IS NULL (this is also common):

SELECT c.id, c.name
FROM customers c
LEFT JOIN orders o
  ON o.customer_id = c.id
WHERE o.customer_id IS NULL;

Interview tip: If you mention NOT EXISTS is NULL-safe and is preferred for correctness, you usually score points.

42. What is the difference between UNION and UNION ALL?

  • UNION: Combines results, removes duplicates
  • UNION ALL: Combines results, keeps duplicates

43. How do you avoid double counting after joins?

Most wrong answers in SQL interviews come down to one thing: joining a one-to-many table and then aggregating the results. This kind of join multiplies rows, inflating our sums and counts.

Fixes you should mention:

  • Pre-aggregate the “many” side first using CTE/subquery, then join
  • Or aggregate on distinct keys intentionally (but do it on purpose, not as a band-aid)

Example pattern:

WITH orders_by_customer AS (
  SELECT customer_id, SUM(amount) AS total_spend
  FROM orders
  GROUP BY customer_id
)
SELECT c.id, c.name, o.total_spend
FROM customers c
LEFT JOIN orders_by_customer o
  ON o.customer_id = c.id;

44. Difference between RANK(), DENSE_RANK(), and ROW_NUMBER()

  • RANK(): Skips ranks for ties
  • DENSE_RANK(): No gaps in ranking
  • ROW_NUMBER(): Assigns unique numbers

45. How do you get the latest record per user (and handle ties)?

Interviewers ask this to see if you can return one row per user without accidentally dropping data or picking a random row when timestamps match.

The safest approach is to rank each user’s rows by recency and then keep only the top-ranked row.

Here’s the standard window-function solution:

SELECT *
FROM (
  SELECT
    u.*,
    ROW_NUMBER() OVER (
      PARTITION BY user_id
      ORDER BY updated_at DESC, id DESC
    ) AS rn
  FROM user_events u
) t
WHERE rn = 1;

That second sort key (id DESC) is what makes it tie-safe, so the result is deterministic even when two rows share the same timestamp.

Interview tip: Candidates get a common follow-up scenario — If you’re using MAX(updated_at) and joining back to the table, you can accidentally return multiple rows per user when there are ties.

The right answer: Using ROW_NUMBER() with a tie-breaker avoids that problem.

46. What are ACID Properties?

Atomicity, Consistency, Isolation, Durability are key properties to ensure reliable transactions.

47. ROW_NUMBER vs RANK vs DENSE_RANK — when do you use each?

  • Use ROW_NUMBER when you need exactly N rows per group
  • Use RANK when ties should be included (so you may get more than N rows)

Example (when ties should skip ranks):

SELECT *
FROM (
  SELECT
    e.*,
    RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rnk
  FROM employees e
) t
WHERE rnk <= 3;

Use DENSE_RANK when ties shouldn’t create gaps. This aligns with common definitions of window functions.

C. SQL Interview Questions for Experienced

48. Explain different isolation levels in SQL

Isolation levels define the visibility of data changes that one transaction makes to other concurrent transactions. There are four commonly used isolation levels in SQL:

  • READ UNCOMMITTED: At this isolation level, transactions can read changes made by other transactions even if those changes have not been committed. While this provides the highest concurrency level, it also introduces the risk of encountering dirty reads.
  • READ COMMITTED: In this level, transactions can only read committed data, avoiding dirty reads. However, it may still suffer from non-repeatable reads and phantom reads.
  • REPEATABLE READ: Transactions at this level ensure that any data read during the transaction remains unchanged throughout the transaction's lifetime. It prevents non-repeatable reads but may still allow phantom reads.
  • SERIALIZABLE: This represents the utmost isolation level, guaranteeing absolute isolation between transactions. While it eradicates all concurrency problems, locking mechanisms may reduce its efficiency.

49. How do you generate a date spine (calendar) and fill in missing days with zero?

Interviewers ask this because real datasets almost always have gaps, and if you only group by dates that appear in your events table, you’ll silently drop zero-activity days.

To fix this, create a list of dates (calendar table/CTE), LEFT JOIN events onto it, then COALESCE(count, 0).

To showcase your expertise, you can explain this with an example in PostgreSQL version (which is common in interviews)

WITH date_spine AS (
  SELECT generate_series(
    DATE '2026-01-01',
    DATE '2026-01-31',
    INTERVAL '1 day'
  )::date AS dt
),
daily_events AS (
  SELECT
    CAST(event_time AS date) AS dt,
    COUNT(*) AS event_count
  FROM events
  WHERE event_time >= TIMESTAMP '2026-01-01'
    AND event_time <  TIMESTAMP '2026-02-01'
  GROUP BY 1
)
SELECT
  s.dt,
  COALESCE(d.event_count, 0) AS event_count
FROM date_spine s
LEFT JOIN daily_events d
  ON d.dt = s.dt
ORDER BY s.dt;

50. How does a clustered index work, and how is it different from a non-clustered index?

A clustered index defines the actual storage order of rows within a table, allowing for only one clustered index per table and directly influencing the on-disk data organization. Conversely, a non-clustered index does not impact the physical arrangement of data and can coexist with multiple indexes within the same table.

  • Clustered Index: When you create a clustered index on a table, the table's rows are physically rearranged to match the order of the indexed column(s). This makes range queries efficient but may slow down insert/update operations.
  • Non-clustered Index: Non-clustered indexes are separate data structures that store a copy of a portion of the table's data and point to the actual data rows. They improve read performance but come with some overhead during data modification.

51. What are the most common NULL gotchas in SQL?

Interviewers love this question because NULL trips people up, as it doesn’t mean empty or zero. It means 'unknown/missing,' and SQL treats it differently from normal values. If you understand these few rules, you’ll avoid a lot of subtle bugs, and you’ll sound sharp in your interviews.

1) NULL = NULL is not true

In SQL, comparisons involving NULL don’t return TRUE or FALSE. They return UNKNOWN.

That’s why this doesn’t work.

SELECT *
FROM users
WHERE phone = NULL;   -- wrong

The correct way is:

SELECT *
FROM users
WHERE phone IS NULL;  -- right​

Same for “not null”:

WHERE phone IS NOT NULL;

2) NULL values don’t behave the way you expect in filters

A classic one: WHERE col <> 'x' does not include NULLs because NULL is not equal to x — it’s unknown.

So if you want everything except x, including missing values, you need to be explicit about it.

WHERE col <> 'x' OR col IS NULL;

3) COUNT(*) vs COUNT(column) gives different answers

  • COUNT(*) counts rows, even if a column is NULL
  • COUNT(column) counts only rows where that column is not NULL

For example, if a table has 100 rows but 20 rows have email = NULL:

  • COUNT(*) returns 100
  • COUNT(email) returns 80

That difference is intentional, and it’s a common interview check.

4) Aggregates ignore NULLs (but your math still needs guarding)

Most aggregates (SUM, AVG, etc.) skip NULL values. That can be fine, but you still need to handle situations like divide-by-zero or missing denominators.

Common pattern:

SELECT
  SUM(revenue) / NULLIF(COUNT(order_id), 0) AS avg_revenue_per_order
FROM orders;

5) NOT IN can break when NULLs are present

This one catches even experienced people.

If the subquery returns a NULL, NOT IN can behave unexpectedly and return no rows.

Risky:

WHERE customer_id NOT IN (SELECT customer_id FROM orders);

Safer:

WHERE NOT EXISTS (
  SELECT 1
  FROM orders o
  WHERE o.customer_id = customers.customer_id
);

52. What is a join explosion, and how do you prevent it?

A join explosion happens when your join keys aren’t unique and you accidentally turn your result into a many-to-many relationship, multiplying rows.

Here’s how to prevent it:

  • Know the grain of each table before joining
  • Deduplicate one side (ROW_NUMBER) or pre-aggregate
  • Validate row counts before and after joins during analysis

53. What are CTEs (Common table expressions)?

Common Table Expressions (CTEs) serve as momentary result sets that you can mention within SQL statements, typically found within SELECT, INSERT, UPDATE, or DELETE operations. They're established using the `WITH` keyword and are instrumental in streamlining intricate queries by dividing them into more digestible components.

54. Explain the MERGE statement

The SQL MERGE statement executes insertions, updates, or deletions on a target table, guided by the outcomes of a source table or query. It consolidates the functionalities of several individual statements (INSERT, UPDATE, DELETE) into one comprehensive statement, rendering it particularly valuable for achieving data synchronization between tables.

55. How do you use a window function in SQL?

Window functions are employed to perform computations on a group of table rows associated with the current row. They enable the generation of result sets containing aggregated data while retaining the distinct details of each row. Typical window functions encompass ROW_NUMBER(), RANK(), DENSE_RANK(), and SUM() OVER().

56. What is a pivot table, and how do you create one in SQL?

A pivot table is a technique for rotating or transposing rows into columns to better analyze and summarize data. You can create pivot tables in SQL using the `PIVOT` operator to convert row-based data into a column-based format.

57. Explain the concept of table partitioning

Partitioning a table involves breaking down a sizable table into smaller, more easily handled segments known as partitions. This method can enhance SQL query efficiency by permitting the Server to focus solely on pertinent partitions while executing queries. Typically, partitioning uses a column characterized by a high cardinality, such as date or region.

58. What is the use of the explain plan?

The EXPLAIN plan is a valuable feature in numerous relational database management systems. This tool offers a comprehensive view of the database engine's strategy for executing a query, encompassing details such as the selected execution plan, join techniques, index utilization, and projected costs. Database administrators (DBAs) and developers rely on EXPLAIN plans to enhance the performance of their queries.

Gain the confidence to ace your next interview and master real-world skills. Join the SQL Certification Course today! ✍️

59. What are indexed views?

Indexed or materialized views are precomputed result sets stored as physical tables in the database. They improve query performance by allowing the database engine to access pre-aggregated or pre-joined data directly from the indexed view, reducing the need for complex query processing.

60. What is a deadlock?

A deadlock is when two transactions block each other forever.

Simple example:

  • Transaction A locks row 1, then tries to lock row 2
  • Transaction B locks row 2, then tries to lock row 1

Most databases detect this and kill one transaction.

How to reduce deadlocks:

  • Keep transactions short
  • Lock rows in a consistent order
  • Index the columns you filter on (so you lock fewer rows)

61. What’s sargability, and why can functions kill index usage?

A condition is “sargable” if the database can use an index efficiently (usually an index seek). Wrapping an indexed column in a function often prevents index use, forcing scans.

This is often not sargable (a common mistake):

WHERE YEAR(order_date) = 2026

The database has to evaluate YEAR(order_date) for many rows before it can decide whether a row matches. It can’t easily jump to the 2026 section of the index since the filter is applied after the function runs.

Sargable (better):

WHERE order_date >= '2026-01-01'
  AND order_date <  '2027-01-01'

This aligns with how indexes are ordered, so the database can typically jump to the first matching date and stop at the end of the range.

A few other sargability examples interviewers love:

1) Leading wildcard in LIKE

WHERE name LIKE '%son'

This often forces scans.

Prefix searches tend to be index-friendly:

WHERE name LIKE 'son%'

2) Calculations on the column

WHERE price * 1.18 > 1000

Better:

WHERE price > 1000 / 1.18

62. What are the best practices for securing a SQL database?

Securing a SQL database involves implementing access controls, encryption, auditing, and regular security assessments. Best practices include using strong authentication, limiting permissions, and keeping database systems and software up to date.

63. In a composite index, why does the order of columns matter?

A composite index is stored in a sorted order based on the columns you define, from left to right. That left-to-right order decides which queries can use the index efficiently.

For example, with an index on (customer_id, order_date):

  • A query that filters on customer_id can usually use the index well, since customer_id is the leading column
  • A query that filters on both customer_id and order_date can also work well, since it can first narrow down to a single customer, then apply the date range
  • A query that filters only on order_date typically can’t use this index effectively, since the index isn’t primarily organized by date

Example queries:

-- Good fit for index (customer_id, order_date)
SELECT *
FROM orders
WHERE customer_id = 101;
-- Also a good fit (uses customer_id, then date range within that customer)
SELECT *
FROM orders
WHERE customer_id = 101
  AND order_date >= '2026-01-01'
  AND order_date <  '2026-02-01';
-- Often a poor fit for this index (no leading customer_id filter)
SELECT *
FROM orders
WHERE order_date >= '2026-01-01'
  AND order_date <  '2026-02-01';

How to choose the order (what interviewers want to hear):

  • Put the column you most often filter on first, allowing the index to narrow results early
  • If you commonly filter first by an exact match and then by a range, a pattern like (exact_match_column, range_column) is usually a strong default
  • Index design is workload-driven; the right order depends on your most frequent and most expensive queries, not generic rules

64. Explain the concept of database replication

Replication is copying and synchronizing data from one database to another. It ensures data availability, load balancing, and disaster recovery. Common replication types include snapshot, transactional, and merge replication.

65. What do you look for first in an EXPLAIN plan?

While answering this question, keep it simple and practical:

  • Are we doing a scan when we expected a seek?
  • Are joins happening on indexed keys, or are we hashing huge tables?
  • Are filters applied early, or is the database carrying too many rows forward?

Bonus point: mention that EXPLAIN is guidance, not gospel. Real performance still needs measurement.

66. How do you monitor SQL Server performance?

Monitoring SQL Server performance involves tracking key performance metrics, setting up alerts for critical events, and analyzing performance bottlenecks. Tools like SQL Server Profiler and Performance Monitor are commonly used.

67. When would you use a subquery vs a CTE vs a temporary table?

You can say, I think of it in terms of readability, reuse, and control.

  • Subquery: Best for small, one-off logic where you don’t need to reuse the result. It keeps the query compact, especially when the transformation is simple.
  • CTE: Ideal to use when the query has multiple steps, and you want it to read like a sequence. CTEs are great for breaking complex logic into named chunks, and they’re the go-to option for recursion.
  • Temporary table: Best when you need to reuse the same intermediate dataset multiple times, when the intermediate result is large, and you want more control (e.g., adding an index or inspecting it during debugging).

Interview-friendly way to answer this question: If it’s small and used once, I’ll use a subquery. If it’s a multi-step logic that needs clarity, I’ll use a CTE. If I’m reusing the same intermediate result or performance tuning needs more control, I’ll use a temp table.

68. What is a database warehouse?

A database warehouse is a centralized repository that stores data from various sources for analytical and reporting purposes. It is optimized for querying and analysis and often contains historical data.

69. How do you handle incremental loads in SQL (only new or changed rows)?

The standard pattern is a watermark followed by an upsert.

  1. Pick a watermark column like updated_at (or an ever-increasing ID)
  2. Load only rows after the last successful watermark
  3. Upsert into the target: update existing keys and insert new ones
  4. Advance the watermark only after a successful run, so you don’t skip data

A simple example of the extraction filter looks like:

SELECT *
FROM source_table
WHERE updated_at > :last_successful_watermark;

Interview tip — There are two things that interviewers expect you to call out:

  • Late-arriving updates happen. A common fix is to reprocess a small overlap window (e.g., the last 1–2 days) and deduplicate on the target key
  • Upsert needs a stable key. If you don’t have a reliable primary key, you’ll need a dedupe strategy first

70. How do you deduplicate records in SQL and keep the latest row per key?

This is a classic “real data” question. Here’s how you answer it.

The most reliable approach is to rank rows per key by recency, then keep rank 1. It avoids the common mistake of DISTINCT hiding duplicates without actually selecting the correct record.

SELECT *
FROM (
  SELECT
    t.*,
    ROW_NUMBER() OVER (
      PARTITION BY user_id
      ORDER BY updated_at DESC, id DESC
    ) AS rn
  FROM raw_events t
) x
WHERE rn = 1;

Why this works well in interviews:

  • It guarantees one row per key (user_id here)
  • It stays deterministic with a tie-breaker (id DESC), so you don’t get random picks when timestamps match
  • It scales to real-world cleanup tasks: latest status per user, latest address per customer, “latest device per account,” etc.

71. How do you manage database concurrency?

Database concurrency involves a database system's capability to manage multiple concurrent transactions while upholding data integrity. To achieve this, various techniques such as locking mechanisms, optimistic concurrency control, and isolation levels are employed to oversee and regulate database concurrency.

72. How do you sessionize events in SQL using a 30-minute inactivity rule?

Sessionization is the process of grouping a user’s events into sessions, where a new session starts after a gap of more than 30 minutes between events.

In your interview, you can talk about the usual pattern, which is:

  1. Sort events per user
  2. Use LAG() to look at the previous event time
  3. Flag the start of a new session when the gap exceeds 30 minutes
  4. Use a running sum of that flag to assign a session number/ID

Here’s a clean PostgreSQL-style version for you to showcase your knowledge and depth in SQL

WITH ordered AS (
  SELECT
    user_id,
    event_time,
    CASE
      WHEN LAG(event_time) OVER (PARTITION BY user_id ORDER BY event_time) IS NULL THEN 1
      WHEN event_time - LAG(event_time) OVER (PARTITION BY user_id ORDER BY event_time) > INTERVAL '30 minutes' THEN 1
      ELSE 0
    END AS new_session
  FROM events
),
sessions AS (
  SELECT
    *,
    SUM(new_session) OVER (PARTITION BY user_id ORDER BY event_time) AS session_id
  FROM ordered
)
SELECT
  user_id,
  session_id,
  MIN(event_time) AS session_start,
  MAX(event_time) AS session_end
FROM sessions
GROUP BY 1, 2
ORDER BY 1, 2;

Interview tip — There is a common catch worth calling out:

If you filter rows after the join/CTE in a way that removes earlier events, your session boundaries can shift. Session logic needs the full, ordered sequence for each user.

What interviewers are checking:

  • You know how to compare each row with the previous row using LAG()
  • You can turn a new session flag into a session identifier using a running sum
  • You’re thinking per user (PARTITION BY user_id), not across the full dataset

73. How do you implement high availability in SQL databases?

High availability in SQL databases ensures that the database remains accessible and operational despite failures. Techniques like clustering, replication, and failover mechanisms help achieve high availability.

74. How do you find users who were active for 4 consecutive days in SQL?

The key is to first normalize activity to one row per user per day, then detect streaks. A standard trick is to:

  1. Deduplicate to daily activity (user_id, date)
  2. Assign a row number by date per user
  3. Use the fact that, for consecutive dates, date - row_number stays constant
  4. Group by that constant to get streak lengths

PostgreSQL-style example:

WITH days AS (
  SELECT DISTINCT
    user_id,
    CAST(event_time AS date) AS dt
  FROM events
),
ranked AS (
  SELECT
    user_id,
    dt,
    ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY dt) AS rn
  FROM days
),
grouped AS (
  SELECT
    user_id,
    dt,
    (dt - rn * INTERVAL '1 day') AS grp
  FROM ranked
)
SELECT
  user_id
FROM grouped
GROUP BY user_id, grp
HAVING COUNT(*) >= 4
ORDER BY user_id;

Interview tip — Common follow-up question you should be ready for:

Can you return the actual date range of the streak?

The right answer: Yes. Select MIN(dt) and MAX(dt) for each (user_id, grp) group and filter where COUNT(*) >= 4

What interviewers are examining:

  • You can dedupe to days first; multiple events in a day shouldn’t inflate a streak
  • You know at least one reliable streak pattern (row_number group key is a classic)
  • You can explain why the grouping works, not just paste the query

75. What is SQL injection, and how do you prevent it?

SQL injection is a security flaw where attacker-controlled input is treated as part of a SQL query. It usually happens when an application builds SQL by concatenating strings, so the input changes the query's meaning rather than being treated as data.

Prevention — what interviewers expect you to say:

  • Use parameterized queries (prepared statements). This is the single biggest fix, as the database treats user input as a value rather than executable SQL
  • Avoid dynamic SQL unless you truly need it. If you do need it for dynamic sorting/filtering, whitelist allowed column names and sort directions instead of passing raw input through
  • Validate and sanitize inputs at the application layer. Validation doesn’t replace parameterization, but it reduces risk and catches obvious abuse early
  • Apply least privilege. The database user used by the app should have only the permissions it needs. If an injection happens, limited permissions reduce damage

Interview tip — If your interviewer is looking for a short and crisp answer, this is how you quickly summarize your answer and say,

“I prevent SQL injection by defaulting to prepared statements, avoiding string-built queries, whitelisting anything dynamic, and keeping database permissions tight.”

D. Query-based SQL Interview Questions and Answers

76. Find the Second Highest Salary

SELECT MAX(salary)
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);

Explanation:

This is a classic interview question. The inner query identifies the maximum salary, while the outer query selects the maximum salary that is lower than that, giving the second highest.

77. Find the Nth Highest Salary

SELECT DISTINCT salary
FROM employees e1
WHERE N-1 = (
SELECT COUNT(DISTINCT salary)
FROM employees e2
WHERE e2.salary > e1.salary
);

Explanation:

This query counts how many distinct salaries are higher than the current row’s salary. If that number equals N-1, the row corresponds to the Nth highest salary.

78. Employees Without Managers

SELECT *
FROM employees
WHERE manager_id IS NULL;

Explanation:

Some employees, like CEOs or department heads, may not report to anyone. This query returns all rows where manager_id is NULL.

79. Count Employees by Job Title

SELECT job_title, COUNT(*) AS total
FROM employees
GROUP BY job_title;

Explanation:

GROUP BY clusters employees by job title, and COUNT(*) counts rows in each group. The result shows how many employees hold each role.

80. Find Duplicate Salaries

SELECT salary, COUNT(*)
FROM employees
GROUP BY salary
HAVING COUNT(*) > 1;

Explanation:

This groups records by salary and filters with HAVING to return only salaries that appear more than once. It’s useful for identifying duplicates in a dataset.

81. Customers With No Orders

SELECT c.id, c.name
FROM customers c
WHERE NOT EXISTS (  SELECT 1 FROM orders o WHERE o.customer_id = c.id
);

Explanation:

The NOT EXISTS clause checks for the absence of related records. This query lists customers who have never placed an order.

82. How do you write a SQL query to find employees hired in the last 30 days?

The clean approach is to filter hire_date using a rolling 30-day window anchored to “today.”

Since the exact date function changes by database, here are the common interview-proof versions that you can prepare for:

PostgreSQL

SELECT *
FROM employees
WHERE hire_date >= CURRENT_DATE - INTERVAL '30 days';

MySQL

SELECT *
FROM employees
WHERE hire_date >= CURDATE() - INTERVAL 30 DAY;

SQL Server

SELECT *
FROM employees
WHERE hire_date >= DATEADD(day, -30, CAST(GETDATE() AS date));

What interviewers are quietly examining:

  • You’re using a dynamic rolling window (not hardcoding dates)
  • You know that different SQL dialects express date arithmetic differently
  • You’re filtering on the date column directly, which stays index-friendly in most setups

83. Running Total of Salaries

SELECT employee_id, salary,
       SUM(salary) OVER (ORDER BY employee_id) AS running_total
FROM employees;

Explanation:

The window function SUM() OVER creates a cumulative total of salaries ordered by employee ID. This is often used in reporting and analytics.

84. Top 3 Earners Per Department

SELECT *
FROM (
  SELECT e.*, RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rnk
  FROM employees e
) t
WHERE rnk <= 3;

Explanation:

RANK() OVER (PARTITION BY ...) assigns ranks within each department. Filtering where rnk <= 3 returns the top three earners in every department.

85. How do you find customers who placed orders this year but not last year in SQL?

The most reliable pattern is EXISTS + NOT EXISTS with clean date ranges. It avoids the common NOT IN + NULL trap and keeps the logic readable.

PostgreSQL

SELECT DISTINCT c.id, c.name
FROM customers c
WHERE EXISTS (
  SELECT 1
  FROM orders o
  WHERE o.customer_id = c.id
    AND o.order_date >= DATE_TRUNC('year', CURRENT_DATE)
    AND o.order_date <  DATE_TRUNC('year', CURRENT_DATE) + INTERVAL '1 year'
)
AND NOT EXISTS (
  SELECT 1
  FROM orders o
  WHERE o.customer_id = c.id
    AND o.order_date >= DATE_TRUNC('year', CURRENT_DATE) - INTERVAL '1 year'
    AND o.order_date <  DATE_TRUNC('year', CURRENT_DATE)
);

MySQL

SELECT DISTINCT c.id, c.name
FROM customers c
WHERE EXISTS (
  SELECT 1
  FROM orders o
  WHERE o.customer_id = c.id
    AND o.order_date >= DATE_FORMAT(CURDATE(), '%Y-01-01')
    AND o.order_date <  DATE_FORMAT(DATE_ADD(CURDATE(), INTERVAL 1 YEAR), '%Y-01-01')
)
AND NOT EXISTS (
  SELECT 1
  FROM orders o
  WHERE o.customer_id = c.id
    AND o.order_date >= DATE_FORMAT(DATE_SUB(CURDATE(), INTERVAL 1 YEAR), '%Y-01-01')
    AND o.order_date <  DATE_FORMAT(CURDATE(), '%Y-01-01')
);

SQL Server

SELECT DISTINCT c.id, c.name
FROM customers c
WHERE EXISTS (
  SELECT 1
  FROM orders o
  WHERE o.customer_id = c.id
    AND o.order_date >= DATEFROMPARTS(YEAR(GETDATE()), 1, 1)
    AND o.order_date <  DATEFROMPARTS(YEAR(GETDATE()) + 1, 1, 1)
)
AND NOT EXISTS (
  SELECT 1
  FROM orders o
  WHERE o.customer_id = c.id
    AND o.order_date >= DATEFROMPARTS(YEAR(GETDATE()) - 1, 1, 1)
    AND o.order_date <  DATEFROMPARTS(YEAR(GETDATE()), 1, 1)
);

86. What is the difference between CAST and CONVERT functions in SQL?

In SQL job interviews, you're often asked about data type conversion, and that’s where CAST() and CONVERT() come in. Both are used to change a value from one data type to another, like turning a string into a date or a number. 

The key difference is that CAST() follows the SQL standard, making it more portable across different databases. CONVERT() is specific to SQL Server and provides more formatting options, especially useful when handling date and time formats. 

During SQL interview preparation, it’s smart to know when to use each depending on the system you're working with.

87. How do you design a database schema for a large-scale application?

The first step is understanding the application requirements clearly. Then, normalize the data to remove redundancy, define clear relationships using foreign keys, and create indexes for faster lookups. 

Also, consider how the system will handle growth, through partitioning, sharding, or caching strategies. A good schema is not just about structure; it’s about designing with performance, scalability, and clarity in mind.

88. How do you implement Data Security and Encryption in SQL?

Data protection often comes up during SQL job interviews, and it's an essential part of working with sensitive databases. 

To implement security, use techniques like hashing passwords with HASHBYTES() or encrypting fields using ENCRYPTBYKEY() in SQL Server. Add role-based access controls so only authorized users can access or modify confidential information. 

Don’t forget to encrypt connections using SSL/TLS. Understanding how to secure data, both in transit and at rest, is a key part of SQL interview preparation, especially for roles involving large-scale applications or regulated industries.

Did You Know? 🔍

Professionals with advanced SQL skills, such as database administrators and database architects, receive a median annual pay of $117,450, and the job growth outlook is 8% in the coming years. (Source: U.S. Bureau of Labor Statistics)

SQL Interview Questions at a Glance

Interviewers usually assess how well you understand SQL, from simple commands to how you apply it in real situations. Let’s look at some of the most common SQL interview questions based on different experience levels:

A. SQL Interview Questions for Freshers

  • General SQL Concepts

As a beginner, the interviewer is likely to ask you to elucidate on what the abbreviation SQL stands for and how it can be applied to daily data management. They may ask SQL interview questions about databases, data types that might be stored in a database, and whether you've used any tools or engines in your coursework or job.

  • SQL Commands and Usage

You should be comfortable with basic commands like SELECT, INSERT, UPDATE and DELETE. They may give you a couple of tasks to do, say, filter data with WHERE and order the data with ORDER BY. These types of basic SQL interview questions would be included to ensure you have a good sense of how SQL behaves.

  • Keys and Rules in Tables

You should be prepared to explain what the term primary key, unique key or foreign key is. These terms are important for keeping the data structured the right way. They may also ask how constraints like NOT NULL or UNIQUE assist in keeping data clean.

  • Joins and Handling Missing Data

You’ll likely get questions on combining data from different tables using joins like INNER JOIN or LEFT JOIN. Interviewers also ask how you’d deal with missing values in the data, so knowing how NULL works is useful.

B. SQL Interview Questions for Intermediate

  • Working with Groups and Totals

You will likely be presented with questions about calculating totals or averages using SUM, AVG or COUNT. Employers are assessing if you know how to group data using GROUP BY, which is common in reports.

  • Joins and Inside Queries

You may be asked questions about joining tables together and using queries inside other queries (subqueries); all are common methods of data analysis in projects, so the interviewers are looking for your ability to easily write those queries easily.

  • Combining Results

You might be asked to join results using UNION or find differences using other methods. These types of SQL interview questions and answers show how you handle more than one SQL query result at a time.

  • Improving Speed

Some questions may touch on how to make queries run faster. You don’t need to go too deep, but having a basic idea of things like indexes and keeping queries clean can help you stand out.

C. SQL Interview Questions for Experienced

In a senior-level technical SQL interview, you can expect questions that test your knowledge of SQL syntax, database design, query optimization, and problem-solving abilities using SQL. Other technical SQL interview question topics include:

  • Enhancing Queries

If you've been working with SQL for any length of time, you can count on being asked about improving and speeding up your queries. SQL interview questions posed to inexperienced candidates will be more focused on how they deal with the slowdown of very large databases while keeping their processes sped up and efficient.

  • Managing Data Changes

In addition, employers would like to see how you handle situations requiring more than one user to manage the same data. You might be asked broader committed questions like how you save a change, when you can apply a rollback, or how you address public safety with data revisions.

  • Using Stored Procedures

Interviews might also involve complying with, or making use of, stored procedures and triggers which are good solutions for saving reusable SQL processes and automating some of the processes. Generally, interviewers want to learn if you utilize either in your practice.

  • Solving Business Problems

When it comes to SQL interview questions for professionals/individuals with 5 years of experience, questions are less about definitions and more about scenarios in your past. So be prepared to talk about how you might have resolved a data problem, fixed or enhanced a report you previously created, or how you might have informed the decision-making process at a team level with SQL data.

Conclusion

Interviewers aren’t trying to see if you can recite definitions. Real interviews push past basic syntax to test how you handle messy, unstructured data. In a recent discussion on r/SQL, candidates noted that interviewers now demand solutions for complex window functions and "gaps and islands" logic. Based on what keeps showing up in real interview debriefs, prioritize:

  • Self-joins for hierarchical relationships (for example, employee → manager)
  • Time-bounded aggregations (for example, top customers/orders within a range)
  • CTE + window function combos (for example, changes over time, rankings, running totals)
  • Streak logic (for example, users active X consecutive days)

If you want guided practice from basics to advanced concepts, you can enroll in Simplilearn’s SQL Certification Course to build strong fundamentals and prepare confidently for interviews.

FAQs

1. How can I start learning SQL?

Join Simplilearn’s SQL Certification Course. It covers basics, offers hands-on practice, and is great for beginners.

2. How should I prepare for an SQL interview?

Practice real queries, revise key concepts like SELECT and JOINs, and review common SQL interview questions and answers.

3. What are the most common mistakes candidates make in SQL interviews?

Some frequent mistakes include:

  • Using SELECT* instead of fetching required columns
  • Forgetting how to handle NULL values correctly (IS NULL vs = NULL)
  • Mixing up WHERE and HAVINGS with aggregates
  • Writing queries without considering indexes or performance
  • Overlooking transaction controls like COMMIT and ROLLBACK

Avoiding these shows that you not only know SQL syntax but also follow real-world best practices.

Data Science & Business Analytics Courses Duration and Fees

Data Science & Business Analytics programs typically range from a few weeks to several months, with fees varying based on program and institution.

Program NameDurationFees
Data Strategy for Leaders

Cohort Starts: 26 Feb, 2026

14 weeks$3,200
Oxford Programme in AI and Business Analytics

Cohort Starts: 19 Mar, 2026

12 weeks$3,359
Data Analyst Course11 months$1,449
Data Science Course11 months$1,449