By: Tim Wiseman | Updated: 2016-08-23 | Comments (1) | Related: 1 | 2 | 3 | 4 | > Functions System
Problem
How do I find some of the basic descriptive statistics of a column such as the maximum, minimum, and average values in SQL Server with T-SQL code? Can you provide some examples?
Solution
SQL Server Max, Min and Avg Functions
SQL Server provides several built in functions to help get some basic data about a column of data.
- Max returns the maximum value of the column. It does this using the collating sequence so it can work on character and datetime columns in addition to numeric ones.
- Min is the inverse. It returns the smallest value of the column and also works with several different data types.
- Avg returns the average or arithmetic mean of the values. It adds all non-null values in the column and then divides them by the number of values added to get the mean. Since it is adding and dividing, AVG requires the column or expression to have a numeric value, and the return type from AVG is determined by the type of the expression.
SQL Server Max, Min and Avg Function Examples
Examples can make these functions much easier to understand. So, let's start with an arbitrary table of some significant constants.
create table constants ( ConstName varchar(50) primary key, PrimaryField varchar(50), ApproxValue decimal(20, 10) ) insert into dbo.constants (ConstName, PrimaryField, ApproxValue) values ('Pi', 'Geometry', '3.1415926535'), ('Euler''s Number e', 'Probability Theory', '2.71828'), ('Square Root of 2', 'Geometry', '1.41421'), ('Golden Ratio', 'Geometry', '1.618'), ('Twin Prime Constant', 'Number Theory', '0.66016'), ('Conway''s Constant', 'Number Theory', '1.30357')
In the most basic usage, we could get the max, min, and average values of these constants. Like this:
select max(ApproxValue) as MaxValue, avg(ApproxValue) as AvgValue, min(ApproxValue) as MinValue from dbo.constants
But this gives us the values for the entire table. If we want to get the values for a particular field of study, we would need to use a group by clause. Like:
select PrimaryField, max(ApproxValue) as MaxValue, avg(ApproxValue) as AvgValue, min(ApproxValue) as MinValue from dbo.constants group by PrimaryField
The Entire Row with SQL Server Min and Max Functions
When we looked at it with a group by clause, we got the descriptive values for the entire table, but it did not give us the row those values came from. So we did not have the name of the constants that had those maximum and minimum values. We can approach this by joining against a subquery. For instance:
select c1.PrimaryField, c1.ConstName, c1.ApproxValue from dbo.constants c1 join ( select PrimaryField, max(ApproxValue) ApproxValue from dbo.constants group by PrimaryField) c2 on c1.PrimaryField = c2.PrimaryField and c1.ApproxValue = c2.ApproxValue
For each field of study this gives us the constant that has the greatest approximate value. In this example, there are no duplicated approximate values, but if the greatest approximate value for a field did appear in more than one row it would be returned twice. We could get the smallest values just by changing the max to min in the code above. This would not work with average of course since the actual average value does not exist in the table.
If we want to return the maximum and minimum constants and their names for a field in the same row, we need to be slightly more sophisticated, but we can use a common table expression or CTE to make it more readable.
; with MaxCTE as ( select c1.PrimaryField, c1.ConstName, c1.ApproxValue from dbo.constants c1 join ( select PrimaryField, max(ApproxValue) ApproxValue from dbo.constants group by PrimaryField) c2 on c1.PrimaryField = c2.PrimaryField and c1.ApproxValue = c2.ApproxValue ), minCTE as ( select c1.PrimaryField, c1.ConstName, c1.ApproxValue from dbo.constants c1 join ( select PrimaryField, min(ApproxValue) ApproxValue from dbo.constants group by PrimaryField) c2 on c1.PrimaryField = c2.PrimaryField and c1.ApproxValue = c2.ApproxValue ) select distinct c.PrimaryField, maxCTE.ConstName as HighValueConstant, maxCTE.ApproxValue as HighValue, minCTE.ConstName as LowValueConstant, minCTE.ApproxValue as LowValue from dbo.constants c join MaxCTE on c.PrimaryField = MaxCTE.PrimaryField join MinCTE on c.PrimaryField = minCte.PrimaryField and MaxCTE.PrimaryField = minCTE.PrimaryField;
This will repeat rows if either a high value or a low value in field is duplicated, but that does not appear here.
Max, Min and Avg SQL Server Functions with Partitioning
Max, min, and avg all support partitioning. If we wanted to get the relevant values for each field with partitioning, using an over clause it would look like:
select distinct PrimaryField, max(ApproxValue) over (partition by PrimaryField) as HighestConstant, min(ApproxValue) over (partition by PrimaryField) as LowestConstant, avg(ApproxValue) over (partition by PrimaryField) as AvgValue from dbo.constants
Next Steps
- Count is another aggregate function that can help describe the data in a table.
- Dalls Snider provides an overview of a related but more advanced technique for min-max column normalization.
- Along with the average, the mode is another significant part of descriptive statistics. Koen Verbeeck provides a method to calculate the mode.
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2016-08-23