SQL Servers Lag and Lead Functions to Help Identify Date Differences

By:   |   Updated: 2015-01-19   |   Comments (4)   |   Related: > Functions System


Problem

Analytics in the Health Care industry requires locating previous visits or claims that indicate if a patient/member is being readmitted to a hospital or rehab facility. Basic T-SQL is very difficult to use in order compute these rates when not using temporary tables. The T-SQL statement does become hard to follow if you are not an expert in set based theory nor is it very readable.  Are there any simple answers to find previous records for readmissions in our example?

Solution

SQL Server added the LAG and LEAD functions in T-SQL to help find a previous record related to the key fields. The addition of these functions help with readability and simplicity. No longer do you need to use temporary tables although using staging tables for further calculation could be required.

The following is a view of the Claims structure. Our date columns used will be StartDate and EndDate.

Claim Header Diagram

Scripts

Here are the CREATE TABLE and INSERT statements for this example

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO

CREATE TABLE [dbo].[ClaimHeader](
 [ClaimNumber] [varchar](20) NOT NULL,
 [PatID] [varchar](50) NOT NULL,
 [ClaimStartDate] [date] NOT NULL,
 [ClaimEndDate] [date] NOT NULL
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO

INSERT INTO dbo.ClaimHeader VALUES ('123','AB0123','2014-01-21','2014-01-28')'
INSERT INTO dbo.ClaimHeader VALUES ('124','AB0123','2014-03-01','2014-03-09')'
INSERT INTO dbo.ClaimHeader VALUES ('125','AB0123','2014-03-28','2014-03-31')'
INSERT INTO dbo.ClaimHeader VALUES ('126','AB0123','2014-03-31','2014-04-10')'

A Look at the Data

We are going to use the ClaimEndDate and ClaimStartDate to find our readmissions. Here is a view of some of the data and what we are trying to accomplish.

Claim Header Data

Using the SQL Server LAG Function to get Previous End Date

If you look closely, you will see that the first claims started on 1/21/14 and ended 1/28/14. The next claim starts on 3/1/14. This indicates that the second admission did not happen within 30 days of the previous claim. Thus, it is not a readmission within 30 days.

But, the third claim started on 3/28/14 and the previous (2nd claim) ended 3/9/14 which is within 30 days of each other.

Let’s use the LAG functions to get the days between claims. The first parameter in the function is the date we are using in the current row - ClaimEndDate. The second parameter (i.e. 1) is the previous occurrence in the set returned with Null being used for the last parameter for the return value if nothing is found by the LAG function.

-- DateDiff( day, Previous End Date, Current Start Date) AS DiffDays
/*
LAG (scalar_expression [,offset] [,default])
    OVER ( [ partition_by_clause ] order_by_clause )
*/

SELECT ClaimNumber, PatID AS PatientID,
    ClaimStartDate, ClaimEndDate,
 LAG( ClaimEndDate, 1, Null) OVER (PARTITION BY  PatID
     ORDER BY ClaimStartDate, ClaimEndDate) AS PreviousClaimEndDate
  FROM dbo.ClaimHeader

OVER divides the data into partitions (PARTITIONED BY - PatID) containing a rowset from the FROM (ClaimHeader) clause using ORDER BY (StartDate and EndDate) to sort the set. This is how we get claims for a Patient ordered by ClaimEndDate ascending.

T-SQL with LAG Function

This use of the LAG function gives us the Previous ClaimEndDate and with this we can calculate the day difference between Start and End.

-- DateDiff( day, Previous End Date, Current Start Date) AS DiffDays

SELECT *, DATEDIFF ( day, PreviousClaimEndDate, ClaimStartDate) AS DiffDays
  FROM (
    SELECT ClaimNumber, PatID AS PatientID,
        ClaimStartDate, ClaimEndDate,
     LAG( ClaimEndDate, 1, Null) OVER (PARTITION BY  PatID
         ORDER BY ClaimStartDate, ClaimEndDate) AS PreviousClaimEndDate
      FROM dbo.ClaimHeader
 ) LagSelect


Using DateDiff

Conclusion

From this output, we can see the first Claim does not have a previous claim, so it is not a re-admit within 30 days. The second claim can be excluded from readmit count because the DiffDays is 32 days.

Now, the third claim is within 19 days of the previous, so it will be included in the 30 readmit count, but 4th claim has a DiffDays of 0. This was interesting because it would seem the patient went home and was re-admitted the same day of discharge. The Metrics team took note of this and examined the data more closely and was able to find other columns in the claim that indicated a transfer to a different facility, thus excluding it from the count.

So, the LAG (and possibly the LEAD) functions can really help in looking at before and after records to compute date differences from a single T-SQL statement that is more readable and logical than previous methods for gathering this information.

Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Thomas LeBlanc Thomas LeBlanc is a Sr DBA and MCITP 05/08 DBA & 08 BI has spoken at the PASS Summit 2011/12, SQL Rally & many SQL Saturday’s.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2015-01-19

Comments For This Article




Thursday, August 15, 2024 - 10:13:56 AM - LEE Back To Top (92455)
I have a question about this as the information is great. Is there any way to restart the lag after a date change.

Wednesday, November 20, 2019 - 8:49:45 AM - Sushil Kushwaha Back To Top (83144)

i have data in table as below

ID                Start_Date                      End_Date                        Duration_Seconds

1                 2019-01-01 09:00:00      2019-01-01 09:01:00       60

2                 2019-01-02 09:00:00      2019-01-02 09:01:00       60

3                 2019-01-03 09:00:00      null                                 60

4                 null                                null                                 60

i want to update the end date based on duration, also i want to update the start date as previous end date where the start date is null.

The table contains thousands of records.

Can you help me with the query without using loop.

Thanks


Saturday, September 17, 2016 - 7:32:06 AM - swarup Back To Top (43348)

 

 .hi i am swaroop...................

 i am having a doubt  on date difference

 in two datatables only one datatable contains date column  but with a same person with multiple transactions in different dates....

my question is here how to calculate difference in dates regarding to present date............

 please help me


Saturday, July 18, 2015 - 10:10:40 PM - DANIEL Back To Top (38248)

Hi

 

thank you for your posting.

Is it possible to run these 3 SQL in one query?

 

 

Regards

 

 















get free sql tips
agree to terms