By: Aaron Bertrand | Updated: 2023-06-01 | Comments (87) | Related: 1 | 2 | 3 | 4 | 5 | > Dates
Problem
A calendar table can be immensely useful, particularly for reporting purposes, and for determining things like business days between two dates. I often see people struggling with manually populating a calendar or date dimension table; usually there are lots of loops and iterative code constructs being used. In this tip I will show you how to build and use a calendar table using a set-based solution that is powerful and easily customizable.
Solution
I build calendar tables all the time, for a variety of business applications, and have come up with a few ways to handle certain details. Sharing them here will hopefully prevent you from re-inventing any wheels when populating your own tables.
One of the biggest objections I hear to calendar tables is that people don't want to create a table. I can't stress enough how cheap a table can be in terms of size and memory usage, especially as underlying storage continues to be larger and faster, compared to using all kinds of functions to determine date-related information in every single query. Twenty or thirty years of dates stored in a table takes a few MBs at most, even less with compression, and if you use them often enough, they'll always be in memory.
I also always explicitly set things like DATEFORMAT
, DATEFIRST
, and LANGUAGE
to avoid ambiguity, default to U.S. English for week starts and for month and day names, and assume that
quarters for the fiscal year align with the calendar year. You may need to change
some of these specifics depending on your display language, your fiscal year, and other
factors.
This is a one-time population, so I'm not worried about speed, even though this specific CTE approach is no slouch. I like to materialize all of the columns to disk, rather than rely on computed columns, since the table becomes read-only after initial population. So I'm going to do a lot of those calculations during the initial series of CTEs. To start, I'll show the output of each CTE one at a time.
You can change some of these details to experiment on your own. In this example, I'm going to populate the date dimension table with data spanning 30 years, starting from 2010-01-01.
First, we have a recursive CTE that returns a sequence representing the number of days between our start date (2010-01-01) and 30 years later less a day (2039-12-31):
-- prevent set or regional settings from interfering with -- interpretation of dates / literals SET DATEFIRST 7, -- 1 = Monday, 7 = Sunday DATEFORMAT mdy, LANGUAGE US_ENGLISH; -- assume the above is here in all subsequent code blocks. DECLARE @StartDate date = '20100101'; DECLARE @CutoffDate date = DATEADD(DAY, -1, DATEADD(YEAR, 30, @StartDate)); ;WITH seq(n) AS ( SELECT 0 UNION ALL SELECT n + 1 FROM seq WHERE n < DATEDIFF(DAY, @StartDate, @CutoffDate) ) SELECT n FROM seq ORDER BY n OPTION (MAXRECURSION 0);
This returns the following list of numbers:
In SQL Server 2022 or Azure SQL Database, that initial CTE could be vastly simplified by using the new GENERATE_SERIES function, which would also eliminate any need for MAXRECURSION in subsequent queries:
DECLARE @StartDate date = '20100101', @years int = 30; ;WITH seq(n) AS ( SELECT n = value FROM GENERATE_SERIES(0, DATEDIFF(DAY, @StartDate, DATEADD(YEAR, @years, @StartDate))-1) ) SELECT n FROM seq ORDER BY n;
Next, we can add a second CTE that translates those numbers into all the dates in our range:
DECLARE @StartDate date = '20100101'; DECLARE @CutoffDate date = DATEADD(DAY, -1, DATEADD(YEAR, 30, @StartDate)); ;WITH seq(n) AS ( SELECT 0 UNION ALL SELECT n + 1 FROM seq WHERE n < DATEDIFF(DAY, @StartDate, @CutoffDate) ), d(d) AS ( SELECT DATEADD(DAY, n, @StartDate) FROM seq ) SELECT d FROM d ORDER BY d OPTION (MAXRECURSION 0);
Which returns the following range of dates:
Now, we can start extending those dates with information commonly vital to calendar tables / date dimensions. Many are bits of information you can extract from the date, but it's more convenient to have them readily available in a view or table than it is to have every query calculate them inline. I'm working a little backward here, but I'm going to create an intermediate CTE to extract exactly once some computations I'll later have to make multiple times. This query:
DECLARE @StartDate date = '20100101'; DECLARE @CutoffDate date = DATEADD(DAY, -1, DATEADD(YEAR, 30, @StartDate)); ;WITH seq(n) AS ( SELECT 0 UNION ALL SELECT n + 1 FROM seq WHERE n < DATEDIFF(DAY, @StartDate, @CutoffDate) ), d(d) AS ( SELECT DATEADD(DAY, n, @StartDate) FROM seq ), src AS ( SELECT TheDate = CONVERT(date, d), TheDay = DATEPART(DAY, d), TheDayName = DATENAME(WEEKDAY, d), TheWeek = DATEPART(WEEK, d), TheISOWeek = DATEPART(ISO_WEEK, d), TheDayOfWeek = DATEPART(WEEKDAY, d), TheMonth = DATEPART(MONTH, d), TheMonthName = DATENAME(MONTH, d), TheQuarter = DATEPART(Quarter, d), TheYear = DATEPART(YEAR, d), TheFirstOfMonth = DATEFROMPARTS(YEAR(d), MONTH(d), 1), TheLastOfYear = DATEFROMPARTS(YEAR(d), 12, 31), TheDayOfYear = DATEPART(DAYOFYEAR, d) FROM d ) SELECT * FROM src ORDER BY TheDate OPTION (MAXRECURSION 0);
Yields this data:
If you wanted your fiscal year aligned differently, you could change the year and quarter calculations,
or add additional columns. Let's say your fiscal year starts October 1st, then depending on whether that's
9 months late or 3 months early, you could just substitute d
for a DATEADD
expression:
;WITH q AS (SELECT d FROM ( VALUES('20200101'), ('20200401'), ('20200701'), ('20201001') ) AS d(d)) SELECT d, StandardQuarter = DATEPART(QUARTER, d), LateFiscalQuarter = DATEPART(QUARTER, DATEADD(MONTH, -9, d)), LateFiscalQuarterYear = YEAR(DATEADD(MONTH, -9, d)), EarlyFiscalQuarter = DATEPART(QUARTER, DATEADD(MONTH, 3, d)), EarlyFiscalQuarterYear = YEAR(DATEADD(MONTH, 3, d)) FROM q;
Whatever my source data is, I can build on those parts and get much more detail about each date:
DECLARE @StartDate date = '20100101'; DECLARE @CutoffDate date = DATEADD(DAY, -1, DATEADD(YEAR, 30, @StartDate)); ;WITH seq(n) AS ( SELECT 0 UNION ALL SELECT n + 1 FROM seq WHERE n < DATEDIFF(DAY, @StartDate, @CutoffDate) ), d(d) AS ( SELECT DATEADD(DAY, n, @StartDate) FROM seq ), src AS ( SELECT TheDate = CONVERT(date, d), TheDay = DATEPART(DAY, d), TheDayName = DATENAME(WEEKDAY, d), TheWeek = DATEPART(WEEK, d), TheISOWeek = DATEPART(ISO_WEEK, d), TheDayOfWeek = DATEPART(WEEKDAY, d), TheMonth = DATEPART(MONTH, d), TheMonthName = DATENAME(MONTH, d), TheQuarter = DATEPART(Quarter, d), TheYear = DATEPART(YEAR, d), TheFirstOfMonth = DATEFROMPARTS(YEAR(d), MONTH(d), 1), TheLastOfYear = DATEFROMPARTS(YEAR(d), 12, 31), TheDayOfYear = DATEPART(DAYOFYEAR, d) FROM d ), dim AS ( SELECT TheDate, TheDay, TheDaySuffix = CONVERT(char(2), CASE WHEN TheDay / 10 = 1 THEN 'th' ELSE CASE RIGHT(TheDay, 1) WHEN '1' THEN 'st' WHEN '2' THEN 'nd' WHEN '3' THEN 'rd' ELSE 'th' END END), TheDayName, TheDayOfWeek, TheDayOfWeekInMonth = CONVERT(tinyint, ROW_NUMBER() OVER (PARTITION BY TheFirstOfMonth, TheDayOfWeek ORDER BY TheDate)), TheDayOfYear, IsWeekend = CASE WHEN TheDayOfWeek IN (CASE @@DATEFIRST WHEN 1 THEN 6 WHEN 7 THEN 1 END,7) THEN 1 ELSE 0 END, TheWeek, TheISOweek, TheFirstOfWeek = DATEADD(DAY, 1 - TheDayOfWeek, TheDate), TheLastOfWeek = DATEADD(DAY, 6, DATEADD(DAY, 1 - TheDayOfWeek, TheDate)), TheWeekOfMonth = CONVERT(tinyint, DENSE_RANK() OVER (PARTITION BY TheYear, TheMonth ORDER BY TheWeek)), TheMonth, TheMonthName, TheFirstOfMonth, TheLastOfMonth = MAX(TheDate) OVER (PARTITION BY TheYear, TheMonth), TheFirstOfNextMonth = DATEADD(MONTH, 1, TheFirstOfMonth), TheLastOfNextMonth = DATEADD(DAY, -1, DATEADD(MONTH, 2, TheFirstOfMonth)), TheQuarter, TheFirstOfQuarter = MIN(TheDate) OVER (PARTITION BY TheYear, TheQuarter), TheLastOfQuarter = MAX(TheDate) OVER (PARTITION BY TheYear, TheQuarter), TheYear, TheISOYear = TheYear - CASE WHEN TheMonth = 1 AND TheISOWeek > 51 THEN 1 WHEN TheMonth = 12 AND TheISOWeek = 1 THEN -1 ELSE 0 END, TheFirstOfYear = DATEFROMPARTS(TheYear, 1, 1), TheLastOfYear, IsLeapYear = CONVERT(bit, CASE WHEN (TheYear % 400 = 0) OR (TheYear % 4 = 0 AND TheYear % 100 <> 0) THEN 1 ELSE 0 END), Has53Weeks = CASE WHEN DATEPART(WEEK, TheLastOfYear) = 53 THEN 1 ELSE 0 END, Has53ISOWeeks = CASE WHEN DATEPART(ISO_WEEK, TheLastOfYear) = 53 THEN 1 ELSE 0 END, MMYYYY = CONVERT(char(2), CONVERT(char(8), TheDate, 101)) + CONVERT(char(4), TheYear), Style101 = CONVERT(char(10), TheDate, 101), Style103 = CONVERT(char(10), TheDate, 103), Style112 = CONVERT(char(8), TheDate, 112), Style120 = CONVERT(char(10), TheDate, 120) FROM src ) SELECT * FROM dim ORDER BY TheDate OPTION (MAXRECURSION 0);
This adds supplemental information about any given date, such as the first of period / last of period the date falls within, whether it is a leap year, a few popular string formats, and some specific ISO 8601 specifics (I'll talk more about those in another tip). You may only want some of these columns, and you may want others, too. When you're happy with the output, you can change this line:
SELECT * FROM dim
To this:
SELECT * INTO dbo.DateDimension FROM dim
Then you can add a clustered primary key (and any other indexes you want to have handy):
CREATE UNIQUE CLUSTERED INDEX PK_DateDimension ON dbo.DateDimension(TheDate);
To give an idea of how much space this table really takes, even with all those columns that you probably don't need,
the max is about 2MB with a regular clustered index defined on the TheDate
column, all the way down to 500KB
for a clustered columnstore index compressed with COLUMNSTORE_ARCHIVE
(not necessarily something you should do,
depending on the workload that will work against this table, but since it is effectively read only, the DML overhead isn't
really a consideration):
Next, we need to talk about holidays, one of the primary seasons you need to use a calendar table instead
of relying on built-in date/time functions. In the original version of this tip, I added an IsHoliday
column, but as a comment rightly pointed out, this set is probably best held in a separate table:
CREATE TABLE dbo.HolidayDimension ( TheDate date NOT NULL, HolidayText nvarchar(255) NOT NULL, CONSTRAINT FK_DateDimension FOREIGN KEY(TheDate) REFERENCES dbo.DateDimension(TheDate) ); CREATE CLUSTERED INDEX CIX_HolidayDimension ON dbo.HolidayDimension(TheDate); GO
This allows you to have more than one holiday for any given date, and in fact allows for multiple entire calendars each with their own set of holidays (imagine an additional column specifying the CalendarID).
Populating the holiday dimension table can be complex. Since I am in the United States, I'm going to deal with statutory holidays here; of course, if you live in another country, you'll need to use different logic. You'll also need to add your own company's holidays manually, but hopefully if you have things that are deterministic, like bank holidays, Boxing Day, or the third Monday of July is your annual off-site arm-wrestling tournament, you should be able to do most of that without much work by following the same sort of pattern I use below. You may also have to add some logic if your company observes weekend holidays on the previous or following weekday, which gets even more complex if those happen to collide with other company- or industry-specific non-business days. We can add most of the traditional holidays with a single pass and rather simple criteria:
;WITH x AS ( SELECT TheDate, TheFirstOfYear, TheDayOfWeekInMonth, TheMonth, TheDayName, TheDay, TheLastDayOfWeekInMonth = ROW_NUMBER() OVER ( PARTITION BY TheFirstOfMonth, TheDayOfWeek ORDER BY TheDate DESC ) FROM dbo.DateDimension ), s AS ( SELECT TheDate, HolidayText = CASE WHEN (TheDate = TheFirstOfYear) THEN 'New Year''s Day' WHEN (TheDayOfWeekInMonth = 3 AND TheMonth = 1 AND TheDayName = 'Monday') THEN 'Martin Luther King Day' -- (3rd Monday in January) WHEN (TheDayOfWeekInMonth = 3 AND TheMonth = 2 AND TheDayName = 'Monday') THEN 'President''s Day' -- (3rd Monday in February) WHEN (TheLastDayOfWeekInMonth = 1 AND TheMonth = 5 AND TheDayName = 'Monday') THEN 'Memorial Day' -- (last Monday in May) WHEN (TheMonth = 7 AND TheDay = 4) THEN 'Independence Day' -- (July 4th) WHEN (TheDayOfWeekInMonth = 1 AND TheMonth = 9 AND TheDayName = 'Monday') THEN 'Labour Day' -- (first Monday in September) WHEN (TheDayOfWeekInMonth = 2 AND TheMonth = 10 AND TheDayName = 'Monday') THEN 'Columbus Day' -- Columbus Day (second Monday in October) WHEN (TheMonth = 11 AND TheDay = 11) THEN 'Veterans'' Day' -- (November 11th) WHEN (TheDayOfWeekInMonth = 4 AND TheMonth = 11 AND TheDayName = 'Thursday') THEN 'Thanksgiving Day' -- (Thanksgiving Day ()fourth Thursday in November) WHEN (TheMonth = 12 AND TheDay = 25) THEN 'Christmas Day' END FROM x WHERE (TheDate = TheFirstOfYear) OR (TheDayOfWeekInMonth = 3 AND TheMonth = 1 AND TheDayName = 'Monday') OR (TheDayOfWeekInMonth = 3 AND TheMonth = 2 AND TheDayName = 'Monday') OR (TheLastDayOfWeekInMonth = 1 AND TheMonth = 5 AND TheDayName = 'Monday') OR (TheMonth = 7 AND TheDay = 4) OR (TheDayOfWeekInMonth = 1 AND TheMonth = 9 AND TheDayName = 'Monday') OR (TheDayOfWeekInMonth = 2 AND TheMonth = 10 AND TheDayName = 'Monday') OR (TheMonth = 11 AND TheDay = 11) OR (TheDayOfWeekInMonth = 4 AND TheMonth = 11 AND TheDayName = 'Thursday') OR (TheMonth = 12 AND TheDay = 25) ) INSERT dbo.HolidayDimension(TheDate, HolidayText) SELECT TheDate, HolidayText FROM s UNION ALL SELECT DATEADD(DAY, 1, TheDate), 'Black Friday' FROM s WHERE HolidayText = 'Thanksgiving Day' ORDER BY TheDate;
Black Friday is a little trickier, because it's the Friday after the fourth Thursday
in November. Usually that makes it the fourth Friday, but several times a century it
is actually the fifth Friday, so the UNION ALL
above just grabs the day
after each Thanksgiving Day.
And then there's Easter. This has always been a complicated problem; the rules for calculating the exact date are so convoluted, I suspect most people can only mark those dates where they have physical calendars they can look at to confirm. If your company doesn't recognize Easter, you can skip ahead; if it does, you can use the following function, which will return the Easter holiday dates for any given year:
CREATE FUNCTION dbo.GetEasterHolidays(@TheYear INT) RETURNS TABLE WITH SCHEMABINDING AS RETURN ( WITH x AS ( SELECT TheDate = DATEFROMPARTS(@TheYear, [Month], [Day]) FROM (SELECT [Month], [Day] = DaysToSunday + 28 - (31 * ([Month] / 4)) FROM (SELECT [Month] = 3 + (DaysToSunday + 40) / 44, DaysToSunday FROM (SELECT DaysToSunday = paschal - ((@TheYear + (@TheYear / 4) + paschal - 13) % 7) FROM (SELECT paschal = epact - (epact / 28) FROM (SELECT epact = (24 + 19 * (@TheYear % 19)) % 30) AS epact) AS paschal) AS dts) AS m) AS d ) SELECT TheDate, HolidayText = 'Easter Sunday' FROM x UNION ALL SELECT DATEADD(DAY, -2, TheDate), 'Good Friday' FROM x UNION ALL SELECT DATEADD(DAY, 1, TheDate), 'Easter Monday' FROM x ); GO
(You can adjust the function easily, depending on whether they recognize just Easter Sunday or also Good Friday and/or Easter Monday. There is also another tip here that will show you how to determine the date for Mardi Gras, given the date for Easter.)
Now, to use that function to add the Easter holidays to the HolidayDimension
table:
INSERT dbo.HolidayDimension(TheDate, HolidayText) SELECT d.TheDate, h.HolidayText FROM dbo.DateDimension AS d CROSS APPLY dbo.GetEasterHolidays(d.TheYear) AS h WHERE d.TheDate = h.TheDate;
Finally, you can create a view that bridges these two tables (or multiple views):
CREATE VIEW dbo.TheCalendar AS SELECT d.TheDate, d.TheDay, d.TheDaySuffix, d.TheDayName, d.TheDayOfWeek, d.TheDayOfWeekInMonth, d.TheDayOfYear, d.IsWeekend, d.TheWeek, d.TheISOweek, d.TheFirstOfWeek, d.TheLastOfWeek, d.TheWeekOfMonth, d.TheMonth, d.TheMonthName, d.TheFirstOfMonth, d.TheLastOfMonth, d.TheFirstOfNextMonth, d.TheLastOfNextMonth, d.TheQuarter, d.TheFirstOfQuarter, d.TheLastOfQuarter, d.TheYear, d.TheISOYear, d.TheFirstOfYear, d.TheLastOfYear, d.IsLeapYear, d.Has53Weeks, d.Has53ISOWeeks, d.MMYYYY, d.Style101, d.Style103, d.Style112, d.Style120, IsHoliday = CASE WHEN h.TheDate IS NOT NULL THEN 1 ELSE 0 END, h.HolidayText FROM dbo.DateDimension AS d LEFT OUTER JOIN dbo.HolidayDimension AS h ON d.TheDate = h.TheDate;
And now you have a functional calendar view you can use for all of your reporting or business needs.
Summary
Creating a dimension or calendar table for business dates and fiscal periods might seem intimidating at first, but once you have a solid methodology in line, it can be very worthwhile. There are many ways to do this; some will subscribe to the idea that many of these date-related facts can be derived at query time, or at least be non-persisted computed columns. You will have to decide if the values are calculated often enough to justify the additional space on disk and in the buffer pool.
To further help performance, you could put the calendar table into its own filegroup (or its own database), and mark it as read-only after initial population. This won't force the table to stay in memory all the time (remember DBCC PINTABLE?), but that will happen naturally if the table is queried enough anyway. What it could potentially help with is reducing other types of contention.
Next Steps
- Build a persisted calendar table to help with reporting queries, business
logic, and gathering additional facts about given dates.
- Using a calendar table in SQL Server - Part 1
- Calendar Table in SQL Server to Identify Date Gaps and Islands
- Create a Calendar Table in SQL Server to Optimize Scheduling Problems
- SQL Calendar Table - Use a Table and View to Simplify Generation
- Simplify Date Period Calculations in SQL Server
- SQL Date Calculation Simplifications in SQL Server
- See these related tips and other resources:
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2023-06-01