Create a SQL Server InitCap Function

By:   |   Updated: 2016-09-26   |   Comments (2)   |   Related: > Functions User Defined UDF


Problem

You want to create a SQL Server T-SQL function similar to Oracle's INITCAP PL/SQL function. The Oracle INITCAP function returns a string with the first letter of each word in upper-case and all the other letters in lower-case. The words are assumed to be delimited by white space or characters that are not alphanumeric. You want to achieve this goal without writing complex T-SQL code and use SQL Server's integration with .NET in order to use the .NET rich string function support. How can this be accomplished using T-SQL?

Solution

The solution involves three major steps:

  • The first step is creating a static C# function residing in a .NET class as shown in Step 1 below. We will create a short C# function in a .cs file that uses the ToTitleCase C# string method that has a similar functionality to INITCAP.
  • The second step is to create a DLL (Dynamic Link Library) file from the .cs file as shown in Step 2.  We will use the csc.exe C# compiler that exists in the .NET framework on our server. We will use the /target:library switch in order to create a DLL file.
  • The third step is a series of T-SQL statements all done in the SQL Server Management Studio environment as shown in Step 3.  We enable the CLR integration on our server, then we create an assembly object that connects to the DLL file that was built in Step 2 and finally we create a T-SQL function that is an external wrapper for the C# method inside the assembly object. Note that the nvarchar datatype is the SQL Server matching data type to the C# string type.

When all of three steps are done, the TSQL INITCAP function can be used.

Step 1

Here is the C# code for the function.  Save this to file C:\StringUtils.cs.

using System;
using System.Globalization;
public class StringUtils
    {
      public static string Initcap(string strText)
      { 
       return new CultureInfo("en").TextInfo.ToTitleCase(strText.ToLower()); 
      }
    }

Step 2

Here is how to create a DLL file using the csc.exe C# compiler utility. The location of the csc.exe file will be in the Microsoft.NET framework folder such as: C:\Windows\Microsoft.NET\Framework64\v4.0.30319.

Execute the following lines in a command shell (cmd.exe) window.  You will need to find the correct path on your server for the csc.exe file.  Also, see the notes at the bottom of this tip about adding this to the environment variables.

cd C:\
C:\Windows\Microsoft.NET\Framework64\v4.0.30319\csc.exe /target:library StringUtils.cs

Step 3

Here is the T-SQL code.  Execute the following in a SQL Server Management Studio query window.

use master
GO

-- enable CLR 
exec sp_configure 'show advanced options',1
GO
reconfigure with override
GO
exec sp_configure 'clr enabled' , 1
GO
reconfigure with override
GO

-- create assembly and function in SQL Server
CREATE ASSEMBLY StringUtils from 'c:\StringUtils.dll' WITH PERMISSION_SET = SAFE;
GO

CREATE FUNCTION Initcap( @s nvarchar(2000)) 
RETURNS nvarchar(2000)
AS EXTERNAL NAME StringUtils.StringUtils.Initcap;   
GO 

Example Using New InitCap Function

select dbo.Initcap ('eLI LEiba knows SQL')

The result is:

Eli Leiba Knows Sql

Further Notes

  1. The function was tested with SQL Server 2012 and SQL Server 2014 Developer editions.
  2. Add the path of the CSC.EXE compiler to the PATH environment variable. This will help making a .dll file from anywhere on the server.
Next Steps

Check out these related tips:



sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Eli Leiba Eli Leiba is a senior application DBA, a teacher and a senior database consultant with 19 years of RDBMS experience.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2016-09-26

Comments For This Article




Monday, September 26, 2016 - 12:53:54 PM - Jeff Moden Back To Top (43420)

First, thank you for writing this article.  Anyone who takes the time to do such a thing is Aces in my book.

Let's take a look at the official documentation for the TextInfo.ToTitleCase method, which can be found at the following URL...

https://msdn.microsoft.com/en-us/library/system.globalization.textinfo.totitlecase(v=vs.110).aspx?cs-save-lang=1&cs-lang=csharp#code-snippet-1

... where it states...

___________________________________________________________________________

"As illustrated above, the ToTitleCase method provides an arbitrary casing behavior which is not necessarily linguistically correct. A linguistically correct solution would require additional rules, and the current algorithm is somewhat simpler and faster. We reserve the right to make this API slower in the future.

The current implementation of the ToTitleCase method yields an output string that is the same length as the input string. However, this behavior is not guaranteed and could change in a future implementation."
___________________________________________________________________________

Since it's actually not a correct implementation of true, well formed "Title Case", Microsoft reserves the right to change it at any time as clearly stated above.  Such a change would make it slower and Microsoft also states that above.  The really bad part is that such changes would be made silently during any automatic update and could be missed during manual updates unless you're one of those good folks that reads all release information prior to doing an update.

Because of that declared instability, there's no way that I'd use such a function even in C# never mind in T-SQL.

Further, there's been no performance test in this thread and, especially as of late, that's a serious concern for me.  Many people don't realize that the relatively new FORMAT() function (2012) in T-SQL is a built in CLR call to a simlarly name .NET fuction nor do the realize that it's also 44 times slower than any equivalent CONVERT, supposedly code portability be damned.

One of the things on my "To Do" list is to create a high performance scalar-return iTVF (Inline Table Valued Function) to replace the fully documented T-SQL scalar function that appears below.  In the meantime, note that the scalar function below is anything but complicated.  Also, no sure if this forum preserves leading spaces so it might mess up the nicely indented code that it is in real life.  We're about to find out.

Also note that it was written specifically for "English" (or what us Americans call "English") but the code is documented well enough so that I believe that it could easily be modified to work in most languages, symbolic languages like some of the oriental languages and Hindii being the most obvious exceptions.

Here's the T-SQL Scalar Function... full credit to George Mastros for the intial concept...

 

CREATE FUNCTION dbo.InitialCap(@String VARCHAR(8000))
/***************************************************************************************************
 Purpose:
 Capitalize any lower case alpha character which follows any non alpha character or single quote.

 Revision History:
 Rev 00 - 24 Feb 2010 - George Mastros - Initial concept
 http://blogs.lessthandot.com/index.php/DataMgmt/DBProgramming/sql-server-proper-case-function

 Rev 01 - 25 Sep 2010 - Jeff Moden 
        - Redaction for personal use and added documentation.
        - Slight speed enhancement by adding additional COLLATE clauses that shouldn't have mattered
        - and the reduction of multiple SET statements to just 2 SELECT statements with only one
        - actually appearing in the loop, which means only 1 PATINDEX is now in the loop (DRY Code).
        - Add no-cap single-quote by single-quote to the filter.
***************************************************************************************************/
RETURNS VARCHAR(8000)
     AS
  BEGIN 
----------------------------------------------------------------------------------------------------
DECLARE @Position INT
;
--===== Update the first character no matter what and then find the next postion that we 
     -- need to update.  The collation here is essential to making this so simple.
     -- A-z is equivalent to the slower A-Z
 SELECT @String   = STUFF(LOWER(@String),1,1,UPPER(LEFT(@String,1))) COLLATE Latin1_General_Bin,
        @Position = PATINDEX('%[^A-Za-z''][a-z]%',@String COLLATE Latin1_General_Bin)
;
--===== Do the same thing over and over until we run out of places to capitalize.
     -- Note the reason for the speed here is that ONLY places that need capitalization
     -- are even considered for @Position using the speed of PATINDEX. 
  WHILE @Position > 0
 SELECT @String   = STUFF(@String,@Position,2,UPPER(SUBSTRING(@String,@Position,2))) COLLATE Latin1_General_Bin,
        @Position = PATINDEX('%[^A-Za-z''][a-z]%',@String COLLATE Latin1_General_Bin)
;
----------------------------------------------------------------------------------------------------
 RETURN @String;
    END
;
 

 

 

 

 

 


Monday, September 26, 2016 - 6:57:04 AM - Rakesh Joshi Back To Top (43416)

 

 I also test it in 2008 R2 its working awesome with 3.5 instead of 4.0 or 4.5 .. Great work.. 















get free sql tips
agree to terms