By: Aaron Bertrand | Updated: 2019-10-28 | Comments (2) | Related: > TSQL
Problem
A user recently asked about an interesting scenario involving tables with foreign key relationships. They wanted to generate the DELETE statements that would allow them to manually delete from child tables first (perhaps in stages), based on criteria they define, and tell them – before performing the deletes – how many rows the operation would affect in each table. They wanted output like this:
DELETE dbo.ChildTable1 WHERE ParentID < <some constant>; -- This would delete 47 row(s). DELETE dbo.ChildTable2 WHERE ParentID < <some constant>; -- This would delete 14 row(s).
...
DELETE dbo.ParentTable WHERE ID < <some constant>; -- This would delete 11 row(s).
Solution
The catalog views, like sys.columns and sys.foreign_key_columns, have all the information we need to approach this problem.
Let’s create a simple schema; one parent table and two child tables:
CREATE TABLE dbo.Personnel
(
ID int PRIMARY KEY
); CREATE TABLE dbo.Addresses
(
ID int PRIMARY KEY,
PersonnelID int FOREIGN KEY REFERENCES dbo.Personnel(ID)
); CREATE TABLE dbo.Referrals
(
ID int PRIMARY KEY,
ReferrerID int FOREIGN KEY REFERENCES dbo.Personnel(ID),
ReferreeID int FOREIGN KEY REFERENCES dbo.Personnel(ID)
);
Notice that the second child table has two references back to the parent table.
Let’s insert some rows:
INSERT dbo.Personnel(ID) VALUES(1),(2),(3),(4),(5),(6); INSERT dbo.Addresses(ID, PersonnelID) VALUES(1,1),(2,2),(3,3),(6,6); INSERT dbo.Referrals VALUES(1,1,2),(2,4,3),(3,5,4),(4,5,2),(5,6,6);
Now, let’s say the user wants to delete all the rows from these tables, where the Personnel ID < 4. We can generate the desired output using a query against the catalog views, with the table and column name, which we can pass in as parameters to a stored procedure.
CREATE PROCEDURE dbo.GenerateCascadingDeletes
@ParentTable nvarchar(512),
@ParentColumn sysname,
@DeleteCriteria nvarchar(255)
AS
BEGIN SET NOCOUNT ON; DECLARE @sql nvarchar(max) = N'',
@src nvarchar(max) = N'SELECT ''DELETE $t$ WHERE $c$ $clause$;'' UNION ALL SELECT
''-- This would delete '' + (SELECT RTRIM(COUNT(*)) FROM $t$ WHERE $c$ $clause$)
+ '' rows.'';'; -- generate deletes for child tables: SELECT @sql = @sql
+ REPLACE(REPLACE(REPLACE(@src,N'$t$',t),N'$c$',c),N'$clause$',@DeleteCriteria)
FROM
(
SELECT t = QUOTENAME(OBJECT_SCHEMA_NAME(pt.parent_object_id))
+ '.' + QUOTENAME(OBJECT_NAME(pt.parent_object_id)),
c = QUOTENAME(pc.name)
FROM sys.foreign_key_columns AS pt
INNER JOIN sys.columns AS pc
ON pt.parent_object_id = pc.[object_id]
AND pt.parent_column_id = pc.column_id
INNER JOIN sys.columns AS rc
ON pt.referenced_column_id = rc.column_id
AND pt.referenced_object_id = rc.[object_id]
WHERE pt.referenced_object_id = OBJECT_ID(@ParentTable)
AND rc.name = @ParentColumn
) AS x; -- final delete of parent table: SELECT @sql += REPLACE(REPLACE(REPLACE(@src,N'$t$',@ParentTable),
N'$c$',@ParentColumn),N'$clause$',@DeleteCriteria); EXEC sys.sp_executesql @sql; END
GO
Here is a sample run of the stored procedure.
exec dbo.GenerateCascadingDeletes 'Personnel', 'ID', '< 4'
Output from the above query:
-------------------------------------------------- DELETE [dbo].[Addresses] WHERE [PersonnelID] < 4; -- This would delete 3 rows. ------------------------------------------------- DELETE [dbo].[Referrals] WHERE [ReferrerID] < 4; -- This would delete 1 rows. ------------------------------------------------- DELETE [dbo].[Referrals] WHERE [ReferreeID] < 4; -- This would delete 3 rows. --------------------------------------- DELETE dbo.Personnel WHERE ID < 4; -- This would delete 3 rows.
Notice how the counts are potentially misleading for the Referrals table; the first delete operation would delete one row based on the ReferrerID column, but the second operation would only be able to delete two rows based on the ReferreeID column (since one was already deleted).
That output could then be used to decide in which order to delete from the tables, based on importance, size, or other criteria. Some deletes could even be moved to a chunking process to minimize the impact to the transaction log. Once all child tables have been deleted, you can move on to the last statement, which deletes from the parent.
Note that this is a simple scenario and not a complete solution. Caveats include:
- Multiple key columns are not supported (though it is possible to adjust this script to handle that).
- Grandchildren are not supported (this would require recursion in some way and additional steps).
- Circular references are not supported (this would have to be handled manually).
- This is not protected from SQL Injection, but it could be enhanced to be more secure.
Summary
The catalog views provide all the information you need to automate activities like deleting from child tables before deleting from the parent. This solution has its limitations in more complex scenarios, but it can be a good starting point for the more typical use case of a simple parent-child relationship.
Next Steps
Read on for related tips and other resources:
- List Dependencies for SQL Server Foreign Keys
- Drop and Re-Create All Foreign Key Constraints in SQL Server
- Disable, enable, drop and recreate SQL Server Foreign Keys
- Script to identify all non-indexed foreign keys in a SQL Server database
- Use DELETE CASCADE Option to Handle Child Key Removal in Foreign Key Relationships
- Using DELETE CASCADE Option for Foreign Keys
- Protecting Yourself from SQL Injection in SQL Server - Part 1
- Protecting Yourself from SQL Injection in SQL Server - Part 2
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2019-10-28