Identifying PII Data to Lock Down in SQL Server (Part 1 of 2)

By:   |   Updated: 2011-06-23   |   Comments (1)   |   Related: 1 | 2 | > Encryption


Problem

We're pretty confident that we have locked down and encrypted our financial data, but a lot of our customer's PII (Personally Identifiable Information) data is still held in unencrypted form. This data is able to be selected directly by read only business users on many of our downstream reporting, datawarehouse and standby servers. The rise of identity theft makes protecting this data imperative. DBAs are the custodians of this information and must protect it like we protect our own personal information. Recent publicity over the theft of Sony PSN data underscores both the economic and ethical importance of protecting personal data.

Solution

In Part 1 of this series I will provide a quick way to identify potential PII information in your databases. In Part 2, I will present a method to lock down access to your PII data while still retaining user access to other non PII fields.

Personally Identifiable Information is defined as information that can be used to distinguish or trace a person's identity. Examples include social security number, medical records, or information that, when combined or used with other identifying information, is linked or linkable to a specific individual.

If we can identify all the columns in our databases that hold PII data then we can plan a suitable lockdown strategy.

Get the PII columns

The simple script below will iterate through each database and use wildcards to search for column names that could be PII data. We are specifically concerned with first name, last name, addresses, gender, SSN, email and other common PII sounding fields.

The script below can be edited to suit your environment and the nature of the PII data that you retain.

-- script examines all columns in all DBs for PII sounding names
EXEC sp_msforeachDB 'USE [?]
SELECT db_name() as [DB], 
object_name(o.object_id) AS [Object], 
c.name AS [Column]
FROM sys.columns c join sys.objects o ON c.object_id = o.object_id 
WHERE o.type IN (''U'',''V'') /* tables and views */
AND (c.name like ''%mail%''
OR c.name like ''%first%name%''
OR c.name like ''%last%name%''
OR c.name like ''%birth%''
OR c.name like ''%sex%''
OR c.name like ''%address%''
OR c.name like ''%phone%''
OR c.name like ''%social%''
OR c.name like ''%ssn%''
OR c.name like ''%gender%'')
AND db_name() NOT IN (''msdb'',''tempdb'',''master'')'

Here is sample output from the AdventureWorksDW2008 database.

a simple script to identify pii on sql server

As can be seen from the results we have identified potential columns which contain PII data and should be locked down from users running SELECT queries.

Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Alan Cranfield Alan Cranfield is a versatile SQL Server DBA with over 10 years experience managing critical systems in large distributed environments.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2011-06-23

Comments For This Article




Tuesday, April 24, 2018 - 12:35:16 AM - sagar goswami Back To Top (75765)

 

Hi, I have a database name ExpertZone in which I  want to search PII.I am not able to understand how to edit this script.Please help me.

 















get free sql tips
agree to terms