datalitico.com

Grouping data using the GROUP BY clause

In SQL, the GROUP BY clause is used to group rows based on one or more columns. It allows you to divide the result set into groups and apply aggregate functions to each group independently. The GROUP BY clause is often used in combination with aggregate functions to perform calculations on grouped data.

Basic Syntax

The basic syntax of a SELECT statement with the GROUP BY clause is as follows:

SELECT column1, column2, ..., aggregate_function(column)
FROM table_name
GROUP BY column1, column2, ...;
  • column1, column2, …: Specifies the columns to retrieve from the table.
  • aggregate_function(column): Represents an aggregate function applied to a specific column.
  • table_name: Specifies the table from which to retrieve the data.
  • GROUP BY: Indicates that the result set should be grouped based on the specified columns.

Grouping by Single Column

To group the result set by a single column, simply specify the column name after the GROUP BY keyword.

SELECT department_id, COUNT(*) AS total_employees
FROM employees
GROUP BY department_id;

This query calculates the count of employees in each department and groups them by department_id.

Grouping by Multiple Columns

To group the result set by multiple columns, specify the columns in the desired order after the GROUP BY keyword.

SELECT department_id, gender, COUNT(*) AS total_employees
FROM employees
GROUP BY department_id, gender;

The query above retrieves the department ID and gender from the “employees” table and calculates the count of employees in each department and by gender.

Applying Aggregate Functions

Aggregate functions such as COUNT, SUM, AVG, MAX, and MIN can be used in conjunction with the GROUP BY clause to perform calculations on each group.

SELECT department_id, AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id;

This shows average salary by department.

By using the GROUP BY clause, you can group rows based on specific columns and apply aggregate functions to each group. This allows you to analyze data at a higher level of granularity and derive meaningful insights. The GROUP BY clause is a powerful tool for summarizing and analyzing data in SQL queries.

Scroll to Top