By: Alex Tocitu | Updated: 2011-02-09 | Comments (10) | Related: 1 | 2 | > Functions User Defined UDF
Problem
You need to provide data validation at the server level for complex strings like phone numbers, email addresses, etc. You may also need to do data cleanup / standardization before moving it from source to target. Although SQL Server provides a fair number of string functions, the code developed with these built-in functions can become complex and hard to maintain or reuse.
Solution
.NET Base Class Library provides the System.Text.RegularExpressions namespace which contains classes with access to the .NET Framework regular expressions engine. Using these classes you can use .NET languages in addition to the T-SQL programming language to create database objects like stored procedures and functions to operate on strings to retrieve and update data in SQL Server databases.
For example, a regular expression can be applied to a text field, to search for a first pattern, then modify the text field to conform to a second pattern. Regular expressions use a language of their own, specifically designed for string processing, with some of the most popular listed below:
- Internet Email: \w+([-+.']\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*
- Internet URL: http(s)?://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?
- US Phone Number: ((\(\d{3}\) ?)|(\d{3}-))?\d{3}-\d{4}
- US Social Security Number: \d{3}-\d{2}-\d{4}
- US ZIP Code: \d{5}(-\d{4})?
- URI: \b(\S+)://([^:]+)(?::(\S+))?\b
Describing how these regular expressions are constructed and work is beyond the scope of this article.
In the tip presented here we will show the code necessary to develop a SQL CLR C# function that validates an input string, and in the case of finding a match, outputs a transformed string. (To learn more about creating a CLR function, refer to this tip.) To be more concrete, the function takes an input string believed to contain a character representation of a phone number; if the assumption is true, the function will output a string in a compact form, containing only the digits.
Although pretty straightforward, the code sample below has comments to improve clarity.
This function takes three parameters:
- InputString
- MatchPattern
- ReplacementPattern
using Microsoft.SqlServer.Server; using System.Data.SqlTypes; // namespace to work with regular expressions using System.Text.RegularExpressions; public class cls_RegularExpressions { [SqlFunction] public static SqlString ReplaceMatch( SqlString InputString, SqlString MatchPattern, SqlString ReplacementPattern) { try { // input parameters must not be NULL if (!InputString.IsNull && !MatchPattern.IsNull && !ReplacementPattern.IsNull) { // check for first pattern match if (Regex.IsMatch(InputString.Value, MatchPattern.Value)) // match found, replace using second pattern and return result return Regex.Replace(InputString.Value, MatchPattern.Value, ReplacementPattern.Value); else // match not found, return NULL return SqlString.Null; } else // if any input paramater is NULL, return NULL return SqlString.Null; } catch { // on any error, return NULL return SqlString.Null; } } };
A sample SQL Server run is shown below, where we pass in the phone number, the match pattern and the replacement pattern. The first two examples pass the conversion, but the third example fails because of two "))" after the area code.
select dbo.ReplaceMatch ('(129).673-4192', '^\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})$', '$1$2$3') go; returned value: 1296734192 -------------------------- select dbo.ReplaceMatch ('(129.673-4192', '^\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})$', '$1$2$3') go; returned value: 1296734192 -------------------------- select dbo.ReplaceMatch ('(129)).673-4192', '^\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})$', '$1$2$3') go; returned value: NULL --------------------
The advantages of the ReplaceMatch function are straightforward as it can be used inline of SQL statements and with a whole spectrum of regular expressions within your SQL Server (including Express/Compact) environment. You could also create a second function that will only check for a match and return a boolean value, without doing any pattern replacement.
Regular expressions provide a very powerful text-processing technology that every Developer/DBA should be at least familiar with.
Our tip here just attempts to get you started writing your own SQL CLR objects embedding this technology.
Next Steps
- Compile, deploy, and use the ReplaceMatch function; enhance it to suit your needs.
- Explore other uses of the ReplaceMatch function with different regular expressions.
- Include using regular expressions and CLR in your SQL Server development toolkit.
- Check other SQL CLR tips on this site.
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2011-02-09