By: Tim Smith | Updated: 2016-12-01
Problem
We've noticed that holidays and weekends impact our collection of metrics across our systems. Should we track these, or are there easy ways to filter them out?
Solution
Holidays, weekends, and seasonal cycles can affect metric evaluation, and filtering or separating them may help establish stronger baselines for analysis. This applies to architecture that has downtime or is used in cycles, such as the common Monday through Friday cycle. In this tip, we'll look at some approaches to handling these dates that we can combine in our metric analysis, or use for other application purposes where the time cycle affects us.
One approach to tracking weekends, holidays, and possible business downtime days (if applicable) is to use a date table, where we exclude dates through a join to the table. In the below code example, I create this join table with only the two columns that will be used, one for the join (JoinDateValue) and one for the filter (WeekendHoliday). In some cases, we might find value in adding more columns, such as separate flags for whether a date is a weekend, a holiday, or another business cyclical date.
/*
---- Table for JOIN
CREATE TABLE tblDateJoin(
    DateID SMALLINT IDENTITY(1,1),
    JoinDateValue DATE,
    WeekendHoliday BIT
)
*/

---- DATEPART(DW,...) depends on the session's DATEFIRST setting; with the us_english default of 7, Sunday = 1 and Saturday = 7
DECLARE @stYear DATE = '2016-01-01'
DECLARE @enYear DATE = '2016-12-31'

WHILE @stYear <= @enYear
BEGIN
    ---- Sometimes a business may have a different cycle, so this can be changed to fit the cycle
    IF (DATEPART(DW,@stYear) IN (1,7))
    BEGIN
        INSERT INTO tblDateJoin (JoinDateValue,WeekendHoliday) VALUES (@stYear,1)
    END
    ELSE
    BEGIN
        INSERT INTO tblDateJoin (JoinDateValue,WeekendHoliday) VALUES (@stYear,0)
    END

    SET @stYear = DATEADD(DD,1,@stYear)
END

SELECT * FROM tblDateJoin

---- For holidays, we would execute the below - example with New Year's Day:
UPDATE tblDateJoin
SET WeekendHoliday = 1
WHERE JoinDateValue = '2016-01-01'
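As a rough sketch of the "more columns" variation mentioned earlier, the table could carry one flag per reason a date is excluded and be filled in a single set-based statement instead of a loop. The table name tblDateJoinDetail and the IsWeekend and IsHoliday columns are assumptions made for illustration; they are not part of the original design. The weekend test counts days from 1900-01-01 (a Monday), so it should not depend on the session's DATEFIRST setting.

```sql
-- Hypothetical variant of tblDateJoin with separate weekend and holiday flags.
CREATE TABLE tblDateJoinDetail (
    JoinDateValue DATE NOT NULL PRIMARY KEY,
    IsWeekend     BIT  NOT NULL,
    IsHoliday     BIT  NOT NULL DEFAULT 0
);

DECLARE @stYear DATE = '2016-01-01';
DECLARE @enYear DATE = '2016-12-31';

-- Build one row per day from a numbers list instead of a WHILE loop.
-- sys.all_objects is used only as a convenient source of enough rows.
WITH Numbers AS (
    SELECT TOP (DATEDIFF(DAY, @stYear, @enYear) + 1)
           CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS INT) AS n
    FROM sys.all_objects
)
INSERT INTO tblDateJoinDetail (JoinDateValue, IsWeekend)
SELECT DATEADD(DAY, n, @stYear),
       -- Days since Monday 1900-01-01, modulo 7: 5 = Saturday, 6 = Sunday.
       CASE WHEN DATEDIFF(DAY, '19000101', DATEADD(DAY, n, @stYear)) % 7 IN (5, 6)
            THEN 1 ELSE 0 END
FROM Numbers;

-- Holidays are still flagged individually, as with New Year's Day here.
UPDATE tblDateJoinDetail
SET IsHoliday = 1
WHERE JoinDateValue = '2016-01-01';
```

With separate flags, a report could exclude only holidays, only weekends, or both, depending on what the analysis needs.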
A contrived join example with this table:
SELECT t.*
FROM tblMetricData t ---- In this example, tblMetricData's column MetricDate is a DATE field
    INNER JOIN tblDateJoin tt ON t.MetricDate = tt.JoinDateValue
WHERE tt.WeekendHoliday = 0
We should note here that not all businesses (or even states and provinces) honor the same holidays and business cycles, and sometimes even weekends differ; in the case of holidays and business cycles, I've seen these vary significantly. The key is that when we perform the JOIN to this table, we avoid using statements like NOT IN ('2016-01-01'), which might create significant costs if we have multiple years of dates to list. Another benefit of a join table like this is that we can replicate or bulk copy it to other servers for the same level of analysis: one table is the source of truth and all other servers use a copy of it. Because businesses differ from each other, it might be just as convenient to exclude the dates within the query itself when the query must be manually constructed each time and only a few dates are excluded, like a once-a-year report running in December where only December 25th is supposed to be excluded.
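For the once-a-year December case, excluding the date directly in the query might look like the following minimal sketch. It reuses tblMetricData and MetricDate from the join example; the 2016 date range is an assumption chosen to match the other examples.

```sql
-- One-off December report: a single inline exclusion instead of a join.
SELECT t.*
FROM tblMetricData t
WHERE t.MetricDate >= '2016-12-01'
  AND t.MetricDate <  '2017-01-01'
  AND t.MetricDate <> '2016-12-25';  -- the only excluded date
```

Once the list of excluded dates grows past a handful, or the same list is needed in several queries, the join table is likely the easier option to maintain.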
If you consider excluding dates, be careful about excluding them from every report in certain environments. These dates can still provide useful information, for example if we determine that usage has increased during these periods, or if we're using these periods to identify and maintain administrative task schedules.
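One way to check whether the excluded dates still carry useful information is to keep them and compare the two groups side by side. The sketch below only counts rows, since the article does not name a value column in tblMetricData; an average over an actual metric column could be added in the same way.

```sql
-- Compare collection volume on business days (0) versus weekends/holidays (1).
SELECT tt.WeekendHoliday,
       COUNT(*)                                      AS MetricRows,
       COUNT(DISTINCT t.MetricDate)                  AS DaysWithData,
       COUNT(*) * 1.0 / COUNT(DISTINCT t.MetricDate) AS AvgRowsPerDay
FROM tblMetricData t
    INNER JOIN tblDateJoin tt ON t.MetricDate = tt.JoinDateValue
GROUP BY tt.WeekendHoliday;
```

If the weekend and holiday group shows activity close to the business-day group, that may be a sign these dates should stay in the baseline.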
If we're in an environment where we know that we don't need these metrics, or where storage space is heavily restricted, we can apply the filter at the data retrieval level. For example, if we're using PowerShell to retrieve metrics, we can filter at the PowerShell level. In the below example, I exclude Sundays and Mondays within an if statement, where we could apply further logic depending on needs:
<# PS Version 4.0 #>

### We can filter on the script/application level
if ((Get-Date).DayOfWeek | Where-Object {$_ -notin ("Sunday","Monday")})
{
    Write-Host "Do something"
}

### Let's test this with other dates:
$testOne = ((Get-Date).AddDays(-4).DayOfWeek)
$testTwo = ((Get-Date).AddDays(1).DayOfWeek)

if ($testOne | Where-Object {$_ -notin ("Sunday","Monday")})
{
    Write-Host "($testOne) Do something"
}

if ($testTwo | Where-Object {$_ -notin ("Sunday","Monday")})
{
    Write-Host "($testTwo) Do something"
}
From here, we could skip retaining the information, log it elsewhere, or even log a different set of information relative to the cycle we're trying to filter. If we're completely excluding data and we're using a tool like Task Scheduler, SQL Server Agent, or the .NET Quartz library, we could just have the schedule exclude Sundays and Mondays from its run times.
I've seen rare cases where the business is off during certain periods and this is the intended behavior, like a cloud-hosted site being turned off for a period each day. In these rare cases, the monitoring matches the on and off cycle of the business, so no metric collection occurs during the off period.
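For SQL Server Agent, a schedule that skips Sundays and Mondays could be sketched with the msdb procedures sp_add_schedule and sp_attach_schedule. The schedule name, the job name, and the 06:00 start time are assumptions for illustration, and the job is assumed to exist already.

```sql
USE msdb;
GO

-- Weekly schedule that runs Tuesday through Saturday at 06:00.
-- For weekly schedules, @freq_interval is a bitmask:
-- Sunday = 1, Monday = 2, Tuesday = 4, Wednesday = 8,
-- Thursday = 16, Friday = 32, Saturday = 64.
-- Tuesday through Saturday = 4 + 8 + 16 + 32 + 64 = 124.
EXEC dbo.sp_add_schedule
    @schedule_name = N'CollectMetrics_TueToSat',  -- hypothetical schedule name
    @freq_type = 8,                               -- weekly
    @freq_interval = 124,                         -- every day except Sunday and Monday
    @freq_recurrence_factor = 1,                  -- every week
    @freq_subday_type = 1,                        -- once, at the start time
    @active_start_time = 060000;                  -- HHMMSS

-- Attach the schedule to an existing job.
EXEC dbo.sp_attach_schedule
    @job_name = N'Collect Metrics',               -- hypothetical existing job
    @schedule_name = N'CollectMetrics_TueToSat';
```

Holidays do not fit a weekly bitmask, so a job scheduled this way would still need a date table check or a manual adjustment for those dates.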
Next Steps
- I recommend collecting a baseline of everything for at least three to four months before determining what to exclude. This does vary by environment, but it's a good starting practice (see last example as an exception).
- Consider the costs and benefits of filtering while storing everything versus storing only what's needed.
About the author
This author pledges the content of this article is based on professional experience and not AI generated.