An aggregate function in SQL performs a calculation on multiple values and returns a single value. SQL provides many aggregate functions that include avg, count, sum, min, max, etc. An aggregate function ignores NULL values when it performs the calculation, except for the count function. 

What is an Aggregate Function in SQL?

An aggregate function in SQL returns one value after calculating multiple values of a column. We often use aggregate functions with the GROUP BY and HAVING clauses of the SELECT statement.

There are 5 types of SQL aggregate functions:

  • Count()
  • Sum()
  • Avg()
  • Min()
  • Max()

Column References

In SQL, aggregate functions are used to calculate a set of values and return a single value. When using aggregate functions in SQL, it is crucial to understand column references. A column reference is a name containing the data you want to aggregate. To use an aggregate function with a column reference, specify the column's name in the function's parentheses. 

For example, to find the average salary of employees in a table called "employees", you would use the AVG function with the column reference "salary" like this: 

SELECT AVG (salary)

FROM employees; 

Using column aliases instead of column references is also possible for a more readable code. However, understanding column references is essential when working with SQL aggregate functions.

Want a Top Software Development Job? Start Here!

Full Stack Developer - MERN StackExplore Program
Want a Top Software Development Job? Start Here!

Why Use Aggregate Functions?

Aggregate functions are a vital component of database management systems. They allow us to perform calculations on large data sets quickly and efficiently. For example, these functions generate statistical reports, perform financial analysis, and manage inventory levels. 

In addition, we can better understand the data we are working with by using aggregate functions. 

For example, we can easily calculate the average price of all products in our inventory or find the total sales for a particular time. Without aggregate functions, we would need to manually sort through each data point, which would be time-consuming and error-prone. 

Overall, aggregate functions are essential for anyone working with large amounts of data and seeking to gain valuable insights from it.

COUNT() Function

The COUNT() function returns the number of rows in a database table.

Syntax:

COUNT(*)  

or  

COUNT( [ALL|DISTINCT] expression )  

Example:

We will use the ‘products’ table from the sample database for our demonstration.

dataset-sql-aggregate-function

The following SQL statement fetches the number of products in the table.

count-ex-1

This will produce the following result.

output-count-ex-1

The below-given command will display those product ids where the unit price is greater than 4.

count-ex-2

This will display the following result.

output-count-ex-2

Let's look at how we can use GROUP BY and HAVING functions with the COUNT function.

Consider the following dataset:

group-dataset

The SQL command given below will list the number of customers in each city.

count-groupby

This will produce the following results:

output-count-groupby

Also Read: The Ultimate Guide on SQL Basics

Want a Top Software Development Job? Start Here!

Full Stack Developer - MERN StackExplore Program
Want a Top Software Development Job? Start Here!

SUM() Function

The SUM() function returns the total sum of a numeric column.

Syntax:

SUM()  

or  

SUM( [ALL|DISTINCT] expression )  

Example:

The following SQL statement finds the sum of the "unit price" fields in the "products" table:

sum-sql-aggregate-function


This will produce the following result.

Let’s look at how we can use GROUP BY and HAVING functions with the SUM function.

Consider the following dataset:

group-dataset

The SQL command below will list the number of customers in each city, having a sum of points greater than 3000.

sum-groupby

This will produce the following result:

output-sum-groupby

AVG() Function

The AVG() function calculates the average of a set of values.

Syntax:

AVG()  

or  

AVG( [ALL|DISTINCT] expression )  

Example:

The following SQL command calculates the average quantity in stock.

avg-aggregate-functions

This will produce the following result.

output-avg-1

MIN() Function

The MIN() aggregate function returns the lowest value (minimum) in a set of non-NULL values.

Syntax:

MIN()  

or  

MIN( [ALL|DISTINCT] expression )  

Example:

min-sql-aggregate-function

The above code will give us the minimum quantity in stock in the products table.

output-min

Also Read: How to Aggregate Data Using Group By in SQL?

MAX() Function

The MAX() aggregate function returns the highest value (maximum) in a set of non-NULL values.

Syntax:

AVG()  

or  

AVG( [ALL|DISTINCT] expression )  

Example:

The code depicted below will give us the maximum quantity in stock in the products table.

max-aggregate-functions

This will produce the following result.

output-max-1

Preparing Your Blockchain Career for 2024

Free Webinar | 5 Dec, Tuesday | 9 PM ISTRegister Now
Preparing Your Blockchain Career for 2024

SQL Server Aggregate Function Syntax

SQL Server aggregate functions are used to calculate a set of values and return a single value. These functions are commonly used in SQL queries to summarize data and provide valuable insights. The syntax for using aggregate functions in the SQL Server is straightforward.

The syntax for an aggregate function in the SQL Server is as follows:

SELECT aggregate_function(column_name)

FROM table_name

[WHERE condition];

These functions are helpful when working with large data sets, as they can help simplify and speed up the analysis process. SUM, COUNT, AVG, and MAX are commonly used aggregate functions. 

So, understanding the syntax of SQL Server aggregate functions is essential for anyone working with databases and looking to analyze data efficiently.

APPROX_COUNT_DISTINCT

The SQL aggregate function APPROX_COUNT_DISTINCT estimates the number of distinct values present in a particular column of a table. This function is beneficial in cases where the exact count of distinct values is not required and an approximate estimate is sufficient. In addition, it is conducive when working with large datasets where calculating the accurate count can be time-consuming and resource-intensive. 

The APPROX_COUNT_DISTINCT function uses statistical algorithms to estimate the number of distinct values. As a result, the estimate provided by this function may be partially accurate, but it is typically within an acceptable range for most use cases. 

Overall, the APPROX_COUNT_DISTINCT function is a valuable addition to any SQL developer's toolkit, offering a convenient way to estimate the number of distinct values in a column quickly and efficiently.

Syntax:

APPROX_COUNT_DISTINCT ( expression )

AVG

AVG is an SQL aggregate function used to compute the average value of a set of numerical values in a table or column. This function is particularly useful in data analysis tasks where one needs to determine the mean value of a particular dataset. The AVG function can be used alongside other SQL aggregate functions such as COUNT, SUM, MAX and MIN. 

One of the key benefits of using AVG is that it can help identify outliers in a dataset, which are values significantly higher or lower than the average. By identifying these outliers, data analysts can gain valuable insights into data distribution and make more informed decisions based on the insights obtained. 

So, AVG is a powerful SQL function that can perform a wide range of data analysis tasks, making it an essential tool for anyone working with large datasets in a database management system.

Syntax:

AVG ([ALL | DISTINCT] expression )  

[OVER ([partition_by_clause] order_by_clause ) ]

CHECKSUM_AGG

CHECKSUM_AGG is an SQL aggregate function that generates a hash value for a given data set. This function takes a column or expression as input and returns a single checksum value representing the data in that column or expression. The checksum value is an integer that can be used to compare two sets of data for equality or to detect changes in the data. 

CHECKSUM_AGG is often used in data warehousing and other applications where data integrity is essential. It is a powerful tool for detecting data changes and ensuring that the data in a database is accurate and up-to-date. 

Syntax:

CHECKSUM_AGG ( [ ALL | DISTINCT ] expression )

COUNT_BIG

COUNT_BIG is an SQL aggregate function used to count the number of rows in a table. This function is similar to the COUNT function but returns a bigint data type instead of an int one. The COUNT_BIG function is helpful when counting large datasets where the number of rows exceeds the maximum value of an int data type. This function can be used with other SQL functions to perform complex queries and analyses. The syntax for COUNT_BIG is straightforward, making it easy to incorporate into SQL statements. 

Syntax:

COUNT_BIG ( { [ [ ALL | DISTINCT ] expression ] | * } )  

GROUPING

One of the most commonly used aggregate functions in SQL is the GROUP BY function. GROUP BY function allows you to group rows of data based on one or more columns and then perform aggregate calculations on each group. For example, you could use the GROUP BY function to group sales data by month or region and then calculate the total sales for each group.

The GROUP BY function is commonly used with other aggregate functions, such as COUNT, SUM, AVG, and MAX/MIN. Using the GROUP BY function, you can quickly analyze large data sets and summarize the results meaningfully.

Syntax:

GROUPING ( <column_expression> )

GROUPING_ID

GROUPING_ID is an SQL aggregate function that identifies a row's grouping level in a SELECT statement. It returns a unique integer value that represents the row's grouping level. The value returned by GROUPING_ID is based on the grouping columns used in the GROUP BY clause of the SELECT statement. The function returns 0 for rows not part of any group and a non-zero value for rows that are part of a group. 

GROUPING_ID is a powerful function that can be used to perform complex aggregations on large sets of data. It allows users to easily group data in a way that makes sense for their analysis and to identify the grouping level for each row. This function is handy for data analysts and business intelligence professionals who perform complex calculations on large datasets. 

Syntax:

GROUPING_ID ( <column_expression>[,...n ] )

Want a Top Software Development Job? Start Here!

Full Stack Developer - MERN StackExplore Program
Want a Top Software Development Job? Start Here!

STDEV

STDEV (Standard Deviation) is a crucial SQL aggregate function used to measure the variation or dispersion in a set of data. It calculates the square root of the variance and is a valuable tool for analyzing data trends. And it can also help in identifying outliers in a dataset. The STDEV function is commonly used in statistical analysis, data mining, and data science. By understanding how to use the STDEV function, you can gain valuable insights into your data and make more informed decisions.

Syntax:

STDEV ( [ ALL | DISTINCT ] expression )  

STDEVP

STDEVP is an aggregate function in SQL that calculates the population standard deviation of a given set of values. This function is often used in statistical analysis to determine a data set's degree of variation or dispersion. The STDEVP function takes numeric values as input and returns a single value representing the entire population's standard deviation. 

This function is helpful in many applications, such as finance, engineering, and scientific research. It is important to note that the STDEVP function differs from the STDEV function in that STDEV calculates the sample standard deviation, while STDEVP calculates the population standard deviation. 

Overall, STDEVP is a powerful tool in SQL that can help analysts and developers gain valuable insights into their data sets.

Syntax:

STDEVP ( [ ALL | DISTINCT ] expression )  

STRING_AGG 

STRING_AGG is a powerful SQL aggregate function that concatenates strings from multiple rows into a single string. This function is beneficial when you must group data and display it in a readable format. The STRING_AGG function is available in most modern SQL database management systems, including Microsoft SQL Server, PostgreSQL, and MySQL. With this function, you can easily combine values from multiple rows into a single column, separated by a specified delimiter. This ability makes it easy to create reports, summaries, and other data visualizations that require combining data from multiple rows into a single string. 

Overall, STRING_AGG is an invaluable tool for any SQL developer or database administrator looking to streamline their data management and reporting processes.

Syntax:

STRING_AGG (expression, separator ) [ <order_clause> ]

<order_clause>::=   

WITHIN GROUP (ORDER BY <order_by_expression_list> [ ASC | DESC ] )

VAR

The VAR function is a powerful tool for data analysis, and it can be used in combination with other SQL functions to gain insights into large datasets. To use the VAR function in SQL, you provide the column or expression that contains the values you want to calculate the variance for. The result of the VAR function is a decimal value that represents the variance of the dataset. 

Syntax:

VAR ( [ ALL | DISTINCT ] expression ) 

VARP

VARP, or Variance Population, is an SQL aggregate function used to calculate the variance of a given set of values. The function is helpful when you want to analyze the data spread in a population. You can also use the process in combination with other SQL aggregate functions, such as SUM, AVG, and COUNT, to perform more complex calculations. 

VARP is a powerful tool for data analysis and is commonly used in finance, statistics, and other fields where large amounts of data need to be analyzed. However, it is essential to note that VARP differs from VAR, which is the sample variance. VARP calculates the variance of the entire population, while VAR calculates the variance of a population sample. 

Syntax:

VARP ( [ ALL | DISTINCT ] expression )  

RANGE 

The RANGE function is one of the crucial aggregate functions in SQL. And the RANGE function calculates the range of a set of values. The range is the difference between the set's most prominent and smallest values. 

For example, if a set of values contains 5, 10, 15, 20, and 25, the range would be 20 (25-5). The RANGE function is commonly used in statistical analysis to measure data spread.

So, the RANGE function is a powerful tool in SQL that allows you to analyze your data statistically. By calculating the range of your data, you can gain insights into the spread and distribution of your data.

NANMEAN

NANMEAN is an SQL aggregate function used to compute the mean of a set of numbers. The function calculates the average values, excluding any NULL values in the data set. The NANMEAN function is beneficial when dealing with large data sets containing missing or incomplete data. By excluding the NULL values from the calculation, the function provides a more accurate representation of the data. 

The output of the NANMEAN function is a single value that represents the average of the non-null values in the data set. This function is widely used in data analysis and is a valuable tool for generating insights into complex data sets.

MEDIAN

The median is a statistical measure representing a dataset's median value. In SQL, the median can be calculated using the median() function, an aggregate function that returns the median value of a group of values. 

The syntax for the median() function in SQL is as follows:

SELECT MEDIAN(column_name)

FROM table_name

Here, the column_name represents the column's name that contains the values for which the median needs to be calculated, and table_name represents the name of the table that contains the data. 

The median() function can be combined with other aggregate functions like COUNT, SUM, AVG, etc., to perform more complex calculations on the data. Overall, the median() function in SQL is a powerful tool for data analysis and can provide valuable insights into data distribution in a dataset.

Want a Top Software Development Job? Start Here!

Full Stack Developer - MERN StackExplore Program
Want a Top Software Development Job? Start Here!

MODE 

One of the most commonly used aggregate functions is the MODE function. The MODE function returns the most frequently occurring value in a set of values. This function is handy when working with large datasets where it is necessary to identify the most common values. The syntax for the MODE function is quite simple. The function takes a single parameter, the column's name containing the values to be analyzed. 

The syntax for the MODE function is as follows: 

MODE(column_name). 

Overall, the MODE function is a powerful tool for data analysis in SQL and can be used in various contexts to identify patterns and trends.

Conclusion

The aggregate function in SQL is very powerful in the database. It serves the same purpose as their equivalents in MS Excel. In this article, we have seen several examples of aggregate functions.

If you wish to learn more about SQL, then check out our Post Graduate Program in Full Stack Web Development.

This SQL certification course helps you get the basics right with the practical training you need to work with SQL databases and use them in the applications you use or create. On completing the course, you will have a work-ready understanding of how to structure your database correctly, create error-free and efficient SQL statements and clauses, and design and manage your SQL databases for scalable growth.

If you have any questions, please feel free to ask them in our comments section, and our experts will promptly answer them for you. 

FAQs

1. Is Max an aggregate function?

Yes, MAX is an aggregate function in SQL. It returns the maximum value in a set of values. You can use MAX to return the highest value in a column.

2. What is the aggregate function in SQL GROUP BY?

An aggregate function in SQL adds together multiple values and returns a single result. The aggregate function will carry out this calculation for each group of data when used with the GROUP BY clause. The GROUP BY statement is commonly used with aggregate functions ( MAX() , MIN() , SUM() , AVG(), COUNT() )

3. Can aggregate functions be used in the WHERE clause?

No, aggregate functions cannot be used in a WHERE clause. They are allowed in the SELECT list and in the HAVING clause.

Our Software Development Courses Duration And Fees

Software Development Course typically range from a few weeks to several months, with fees varying based on program and institution.

Program NameDurationFees
Caltech Coding Bootcamp

Cohort Starts: 15 Apr, 2024

6 Months$ 8,000
Full Stack Java Developer

Cohort Starts: 2 Apr, 2024

6 Months$ 1,449
Automation Test Engineer

Cohort Starts: 3 Apr, 2024

11 Months$ 1,499
Full Stack Developer - MERN Stack

Cohort Starts: 3 Apr, 2024

6 Months$ 1,449