Aggregate functions in SQL perform some calculations on more than one value to return a single value. There are many aggregate functions in SQL, including average, count, sum, min, and max. All aggregate functions ignore NULL values while calculating, except the Count function.

What is an Aggregate Function in SQL?

An aggregate function in SQL returns one value after calculating multiple values of a column. We often use aggregate functions with the GROUP BY and HAVING clauses of the SELECT statement.

There are five types of SQL aggregate functions:

  • Count()
  • Sum()
  • Avg()
  • Min()
  • Max()

Column References

In SQL, aggregate functions are crucial in calculating a set of values and returning a single value. These functions are particularly useful when dealing with large datasets. When using aggregate functions in SQL, it is important to understand column references. A column reference is a name containing the data you want to aggregate. To use an aggregate function with a column reference, specify the column's name in the function's parentheses. 

For example, to find the average salary of employees in a table called "employees", you would use the AVG function with the column reference "salary" like this: 

SELECT AVG (salary)

FROM employees; 

Using column aliases instead of column references is also possible for a more readable code. However, understanding column references is essential when working with SQL aggregate functions.

Obtain Your SQL Certification

SQL Certification CourseExplore Program
Obtain Your SQL Certification

Why Use Aggregate Functions?

It is also important to understand why to use aggregate functions after understanding what an aggregate function is in SQL. Aggregate functions are the most essential ingredient of any database management system. They help us perform many calculations over massive data sets faster and more efficiently. For example, these functions create statements of statistics, drive financial analysis, and maintain inventory levels. Furthermore, aggregate functions can be applied to further our understanding of the data at hand.

Knowing the average price of all products stored in our warehouse or the total sales amount over a period is pretty easy. However, aggregate functions would have to be replaced by checking every data point on our own, which is time-consuming and full of potential failures.

Aggregate functions in SQL are important in general to anyone working in large amounts of data, trying to seek valuable insight.

1. COUNT() Function

The COUNT() function returns the number of rows in a database table.

Syntax:

COUNT(*)  
or  
COUNT( [ALL|DISTINCT] expression )  

Example:

We will use the ‘products’ table from the sample database for our demonstration.

dataset-sql-aggregate-function

The following SQL statement fetches the number of products in the table.

count-ex-1

This will produce the following result.

output-count-ex-1

The below-given command will display those product ids where the unit price is greater than 4.

count-ex-2

This will display the following result.

output-count-ex-2

Let's look at how we can use GROUP BY and HAVING functions with the COUNT function.

Consider the following dataset:

group-dataset

The SQL command given below will list the number of customers in each city.

count-groupby

This will produce the following results:

output-count-groupby

Also Read: The Ultimate Guide on SQL Basics

Accelerate your career as a skilled MEAN Stack Developer by enrolling in a unique Full Stack Web Developer - MERN Stack Master's program. Get complete development and testing knowledge on the latest technologies by opting for the MERN Stack Developer Course. Enroll TODAY!

2. SUM() Function

The SUM() function returns the total sum of a numeric column.

Syntax:

SUM()  
or  
SUM( [ALL|DISTINCT] expression )  

Example:

The following SQL statement finds the sum of the "unit price" fields in the "products" table:

sum-sql-aggregate-function


This will produce the following result.

Let’s look at how we can use GROUP BY and HAVING functions with the SUM function.

Consider the following dataset:

group-dataset

The SQL command below will list the number of customers in each city, having a sum of points greater than 3000.

sum-groupby

This will produce the following result:

output-sum-groupby

3. AVG() Function

The AVG() function calculates the average of a set of values.

Syntax:

AVG()  
or  
AVG( [ALL|DISTINCT] expression ) 

Example:

The following SQL command calculates the average quantity in stock.

avg-aggregate-functions

This will produce the following result.

output-avg-1

4. MIN() Function

The MIN() aggregate function returns the lowest value (minimum) in a set of non-NULL values.

Syntax:

MIN()  
or  
MIN( [ALL|DISTINCT] expression )  

Example:

min-sql-aggregate-function

The above code will give us the minimum quantity in stock in the products table.

output-min

Also Read: How to Aggregate Data Using Group By in SQL?

5. MAX() Function

The MAX() aggregate function returns the highest value (maximum) in a set of non-NULL values.

Syntax:

AVG()  
or
AVG( [ALL|DISTINCT] expression ) 

Example:

The code depicted below will give us the maximum quantity in stock in the products table.

max-aggregate-functions

This will produce the following result.

output-max-1

Discover SQL and Database Management

SQL Certification CourseExplore Program
Discover SQL and Database Management

SQL Server Aggregate Function Syntax

Aggregate functions in SQL Server calculate a set of values and return a single value. These functions are commonly used in SQL queries to summarize data and provide valuable insights. The syntax for using aggregate functions in SQL Server is straightforward.

The syntax for an aggregate function in SQL Server is as follows:

SELECT aggregate_function(column_name)
FROM table_name
[WHERE condition];

These functions are helpful when working with large data sets, as they can help simplify and speed up the analysis process. SUM, COUNT, AVG, and MAX are commonly used aggregate functions. 

So, understanding the syntax of SQL Server aggregate functions is essential for anyone working with databases and looking to analyze data efficiently.

APPROX_COUNT_DISTINCT

The aggregate function in SQL, APPROX_COUNT_DISTINCT  is a resourceful tool that approximates the number of different values present in any column of a table. This function is particularly useful when the count of distinct values is not accurate, and only an approximate estimate is required. It's a great resource when dealing with large data, where calculating the count accurately may be pretty time-consuming and resource-intensive.

The APPROX_COUNT_DISTINCT function, while not providing a fully accurate count of distinct values, is usually precise enough for most practical purposes. This practicality is a key function feature, reinforcing its value in everyday SQL tasks.

Overall, APPROX_COUNT_DISTINCT is a useful tool in an SQL developer's toolkit. It provides a quick and efficient way to estimate the number of distinct values in a column, making you feel resourceful in your data analysis tasks.

Syntax:

APPROX_COUNT_DISTINCT ( expression )

AVG

AVG is an aggregate in SQL that returns an average value over a set of numerical values in a table or even a column. This function is very important in analytical tasks that require getting the mean value for a given data set. The AVG function can always be used along with other aggregate functions of SQL, like COUNT, SUM, MAX, and MIN.

One of the most significant pros of using AVG is the ability to trace outliers in the dataset, which are values much larger or smaller than the average. After these outliers have been flagged, the data analyst has an idea of how the data is distributed, and he can make better decisions based on the obtained insights.

So, AVG is a powerful SQL function that can perform a wide range of data analysis tasks. It is a helpful tool when working with large datasets in a database management system.

Syntax:

AVG ([ALL | DISTINCT] expression )  
[OVER ([partition_by_clause] order_by_clause ) ]

CHECKSUM_AGG

CHECKSUM_AGG: This is an aggregate function in SQL that produces a hash value for a given dataset. The function takes one column or expression and returns a single checksum value to represent the data in that column or expression. It produces an integer, which could be used to compare two sets of data for equality or to detect changes in the data.

CHECKSUM_AGG is often used in data warehousing and other applications where data integrity is essential. It is a powerful tool for detecting data changes and ensuring that the data in a database is accurate and up-to-date. 

Syntax:

CHECKSUM_AGG ( [ ALL | DISTINCT ] expression )

COUNT_BIG

COUNT_BIG, an SQL function, aggregates the number of rows in a table. It differs from the COUNT function in that it returns the big integer data type. This function is particularly useful for counting huge datasets where the number of lines exceeds the maximum value for an integer data type. COUNT_BIG can be used with other SQL functions to implement complex queries and analyses. The beauty of COUNT_BIG lies in its simple syntax, making it easily applicable in SQL statements and ensuring a smooth user experience. 

Syntax:

COUNT_BIG ( { [ [ ALL | DISTINCT ] expression ] | * } )  

GROUPING

The GROUP BY function is one of the most commonly used aggregate functions in SQL. The GROUP BY function allows you to group rows of data based on one or more columns and then perform aggregate calculations for each group. For example, you could use the GROUP BY function to group sales data by month or region and then calculate the total sales for each group.

The GROUP BY function is commonly used with other aggregate functions, such as COUNT, SUM, AVG, and MAX/MIN. Using the GROUP BY function, you can quickly analyze large data sets and summarize the results meaningfully.

Syntax:

GROUPING ( <column_expression> )

GROUPING_ID

The GROUPING_ID function enables SQL aggregates to identify a row's grouping level in a SELECT statement. Depending on the grouping level, the function returns an integer uniquely identifying each row. The values it returns are based on the columns used in the GROUP BY clause of the particular SELECT statement. If a row does not belong to any group, the function returns 0; for rows belonging to any group, it returns a non-zero value.

The GROUPING_ID function is a tool that clarifies complex aggregations over large data sets. It simplifies data grouping in a most useful way for the analysis at hand, providing a clear indication of which grouping level each row falls into. This clarity is a significant benefit for data analysts and business intelligence professionals, giving them a more confident understanding of their data and analysis.

Syntax:

Become a Full Stack Developer in 6 Months!

Full Stack Developer - MERN StackExplore Program
Become a Full Stack Developer in 6 Months!
GROUPING_ID ( <column_expression>[,...n ] )

STDEV

STDEV (Standard Deviation) is a crucial SQL aggregate function used to measure the variation or dispersion in a data set. It calculates the square root of the variance and is a valuable tool for analyzing data trends. It can also help identify outliers in a dataset. The STDEV function is commonly used in statistical analysis, data mining, and data science. By understanding how to use the STDEV function, you can gain valuable insights into your data and make more informed decisions.

Syntax:

STDEV ( [ ALL | DISTINCT ] expression )  
Get noticed by top hiring companies through our JobAssist program. Get complete job assistance post the Full Stack Web Developer - MERN Stack Developer Course and unleash the endless possibilities. Enroll TODAY!

STDEVP

STDEVP, a crucial aggregate function in SQL, plays a significant role in statistical analysis. It calculates the population standard deviation, a key concept that measures the amount of variation or dispersion in a dataset representing an entire population. The function, which accepts numeric values and returns a single value, is instrumental in understanding the spread of data in a population.

This function, STDEVP, is not just a theoretical concept. It's a practical tool that is extensively used in finance, engineering, and scientific research applications. However, it's important to note that STDEVP differs from STDEV, as it computes the population standard deviation, not the sample standard deviation.

Overall, STDEVP is such a useful tool in SQL that would be of great importance to analysts and developers seeking valuable insights into their datasets.

Syntax:

STDEVP ( [ ALL | DISTINCT ] expression )  

STRING_AGG 

STRING_AGG, a powerful aggregate function in SQL, simplifies the task of concatenating strings from multiple rows into a single string. It's a handy tool when you need to group data and present it in a readable format. The beauty of the STRING_AGG function lies in its ease of use in most modern SQL database management systems, including Microsoft SQL Server, PostgreSQL, and MySQL. With this function, you can effortlessly combine values from multiple rows into a single column, separated by a specified delimiter. This feature makes it a breeze to create reports, summaries, and other data visualizations that require data aggregation. 

Overall, STRING_AGG is an invaluable tool for any SQL developer or database administrator looking to streamline their data management and reporting processes.

Syntax:

STRING_AGG (expression, separator ) [ <order_clause> ]<order_clause>::=   WITHIN GROUP (ORDER BY <order_by_expression_list> [ ASC | DESC ] )

VAR

The VAR function is a powerful function that can be used in conjunction with other SQL functions to derive insights from large data sets. You provide it with the column or an expression as an argument containing the values that you are interested in calculating variance with. A result of this VAR function is a decimal value representing the variance of a dataset.

Syntax:

VAR ( [ ALL | DISTINCT ] expression ) 

VARP

VARP, or Variance Population, is an aggregate function in SQL that calculates the variance of provided values. The function becomes useful when you want to discover the spread of a population. To build more sophisticated calculations, you can use the process with other SQL aggregate functions like SUM, AVG, and COUNT.

VARP is a potent tool in data analysis and often comes up in finance and statistics when dealing with vast volumes of data. However, it must be noted that it is pretty different from VAR, the sample variance. Whereas VARP calculates the variance of the whole population, VAR only makes computations concerning the sample from the population. 

Syntax:

VARP ( [ ALL | DISTINCT ] expression )  

RANGE 

The RANGE function is one of the crucial aggregate functions in SQL. It calculates the range of a set of values, which is the difference between the set's most prominent and smallest values. 

For example, if a set of values contains 5, 10, 15, 20, and 25, the range would be 20 (25-5). The RANGE function is commonly used in statistical analysis to measure data spread.

The RANGE function is a powerful tool in SQL that allows you to analyze your data statistically. By calculating the range of your data, you can gain insights into its spread and distribution.

NANMEAN

NANMEAN, an aggregate function in SQL, computes the mean of a set of numbers, omitting the NULL values. Its role in creating a more realistic representation by excluding the NULL values from the calculation reassures the accuracy of your analysis. This function is particularly useful in handling big data files with missing or incomplete data.

NANMEAN, an aggregate SQL function, returns a single value that represents the different values in the dataset. Its usefulness in deriving insights from complex datasets empowers your data analysis. This function is a valuable tool in your data analysis toolkit.

MEDIAN

The median is a statistical measure representing a dataset's median value. In SQL, the median can be calculated using the median() function, an aggregate function that returns the median value of a group of values. 

The syntax for the median() function in SQL is as follows:

SELECT MEDIAN(column_name)
FROM table_name

Here, the column_name represents the column's name that contains the values for which the median needs to be calculated, and table_name represents the name of the table that contains the data. 

The median() function can be combined with other aggregate functions like COUNT, SUM, AVG, etc., to perform more complex calculations on the data. Overall, the median() function in SQL is a powerful tool for data analysis and can provide valuable insights into data distribution in a dataset.

MODE 

One of the most used aggregate functions in SQL is the MODE function. The MODE function returns the value that comes up the most in a set of values. This is very helpful, specifically when dealing with large data sets that need to find the most common values. The syntax for this function is straightforward. It takes only one parameter, the column's name with the values, to analyze.

The syntax for the MODE function is as follows: 

MODE(column_name). 

Overall, the MODE function is a powerful tool for data analysis in SQL and can be used in various contexts to identify patterns and trends.

Start Your Journey to Data Mastery

SQL Certification CourseExplore Program
Start Your Journey to Data Mastery

Conclusion

The aggregate function in SQL is very powerful in the database. It serves the same purpose as their equivalents in MS Excel. In this article, we have seen several examples of aggregate functions in SQL. We hope you will now be clear about what is aggregate function in SQL.

If you wish to learn more about SQL, check out our SQL Certification Course.

Our SQL certification course is led by industry experts who will guide you through the basics and provide practical training to work with SQL databases. By the end of the course, you will have a work-ready understanding of SQL, ready to apply in your own applications.

FAQs

1. Is Max an aggregate function?

Yes, MAX is an aggregate function in SQL. It returns the maximum value in a set of values. You can use MAX to return the highest value in a column.

2. What is the aggregate function in SQL GROUP BY?

An aggregate function in SQL adds together multiple values and returns a single result. The aggregate function will carry out this calculation for each group of data when used with the GROUP BY clause. The GROUP BY statement is commonly used with aggregate functions ( MAX() , MIN() , SUM() , AVG(), COUNT() )

3. Can aggregate functions be used in the WHERE clause?

No, aggregate functions cannot be used in a WHERE clause. They are allowed in the SELECT list and in the HAVING clause.

4. What is aggregation in a query?

Aggregation in a query refers to the process of performing calculations on multiple rows of data to produce a single summary value. Common aggregation functions include SUM(), AVG(), COUNT(), MIN(), and MAX(). These functions are often used with the GROUP BY clause to group rows that have the same values in specified columns and then apply the aggregation to each group.

5. What is the difference between COUNT(*) and COUNT(column_name) in SQL?

The difference between COUNT(*) and COUNT(column_name) in SQL lies in how they count rows:

  • COUNT(*): Counts the total number of rows in a table, including those with NULL values. It doesn't take into account any specific column and gives the count of all rows.
  • COUNT(column_name): Counts the number of non-NULL values in a specific column. If a row has a NULL value in that column, it won't be included in the count.

6. Can we use aggregate functions in the WHERE clause?

No, aggregate functions cannot be used directly in the ‘WHERE’ clause. The ‘WHERE’ clause is used to filter rows before any aggregation takes place. If you want to filter based on the result of an aggregate function, you should use the ‘HAVING’ clause, which is designed for this purpose. The ‘HAVING’ clause filters groups created by the ‘GROUP BY’ clause based on aggregate function results.

Our Software Development Courses Duration And Fees

Software Development Course typically range from a few weeks to several months, with fees varying based on program and institution.

Program NameDurationFees
Caltech Coding Bootcamp

Cohort Starts: 7 Oct, 2024

6 Months$ 8,000
Full Stack Java Developer

Cohort Starts: 18 Sep, 2024

7 months$ 1,449
Full Stack Developer - MERN Stack

Cohort Starts: 18 Sep, 2024

6 Months$ 1,449
Automation Test Engineer

Cohort Starts: 7 Oct, 2024

8 months$ 1,499