By: Tibor Nagy | Updated: 2010-12-09 | Comments (6) | Related: > Testing
Problem
If you develop new functionalities for your SQL database then probably you already encountered the typical problem of testing large scale databases. I have to run a series of performance and functional tests on a database with a few million rows, but I do not have the necessary test data. I heard about some excellent commercial tools but they are expensive and my company cannot afford them. How can I generate the test data on my own and populate it to the tables on the test database?
Solution
I will show you some tricks to generate and multiply rows for the test database. Each data type needs a different approach as you can see below.
Numeric data
First of all you will definitely need to generate some numeric data. The hands on idea would be to use the RAND() function. Unfortunately RAND is invoked once per query, not once per row therefore it returns the same value during the course of your query. So I suggest using other methods unless you would like to use a different seed value for each and every call which can be very painful for generating mass volume of data.
If you are looking for a real random value, you will get much better results with generating a NEWID() and calculating the checksum. The following expression sets INTVALUE1 field to a random integer value:
UPDATE TESTTABLE SET INTVALUE1=CHECKSUM(NEWID())
Further modifying this expression you can generate decimal values and using the ABS() function you can eliminate the negative numbers:
UPDATE TESTTABLE SET DECVALUE1=ABS(CHECKSUM(NEWID()))/100.0
Fixed range data sets
We can use the above method with some changes to generate values within a fixed data range. For example when you need to generate a Boolean value then you can choose from the data sets (0;1) or (Y;N) etc. You can use the parity of the random integer to convert it into Boolean data. Beware; you cannot use case function to sort NEWID results since NEWID is invoked for every CASE statement. The following expression generates random yes or no values for the column BOOLVALUE1:
--Generate random 0 or 1 value for every row UPDATE TESTTABLE SET BOOLVALUE1 = ABS(CHECKSUM(NEWID()))%2
--Translate the values to Yes or No UPDATE TESTTABLE SET BOOLVALUE1 = 'N' WHERE BOOLVALUE1='0' UPDATE TESTTABLE SET BOOLVALUE1 = 'Y' WHERE BOOLVALUE1='1'
Text data
Text data requires special attention if you would like to have a database with some reasonable content. You have to build an initial dictionary and then use it to breed more rows. For example it is common to use names in various database fields. The following example shows you how to create a table containing 100 different names in a few seconds.
--Create table for first names CREATE TABLE [NAMES1] (FIRST_NAME [varchar](20))
--Create table for family names CREATE TABLE [NAMES2] (FAMILY_NAME [varchar](20))
--Fill first names INSERT INTO NAMES1 VALUES ('John') INSERT INTO NAMES1 VALUES ('Jack') INSERT INTO NAMES1 VALUES ('Jill') INSERT INTO NAMES1 VALUES ('Bill') INSERT INTO NAMES1 VALUES ('Mary') INSERT INTO NAMES1 VALUES ('Kate') INSERT INTO NAMES1 VALUES ('Kevin') INSERT INTO NAMES1 VALUES ('Matt') INSERT INTO NAMES1 VALUES ('Rachel') INSERT INTO NAMES1 VALUES ('Tom')
--Fill family names INSERT INTO NAMES2 VALUES ('Smith') INSERT INTO NAMES2 VALUES ('Morgan') INSERT INTO NAMES2 VALUES ('Simpson') INSERT INTO NAMES2 VALUES ('Walker') INSERT INTO NAMES2 VALUES ('Bauer') INSERT INTO NAMES2 VALUES ('Taylor') INSERT INTO NAMES2 VALUES ('Morris') INSERT INTO NAMES2 VALUES ('Elliott') INSERT INTO NAMES2 VALUES ('Clark') INSERT INTO NAMES2 VALUES ('Rock')
--Generate 10x10=100 different names SELECT * INTO TESTTABLE FROM NAMES1 CROSS JOIN NAMES2
You can take the above examples to quickly generate more and more different database records from a few row sample tables. Cross joins and self joins can easily create large datasets so be cautious when using very big tables. For example cross joining two tables with one thousand records generates results of one million records.
Next Steps
- Build supporting tables to generate the text values
- Create stored procedures to fill test database
- Check out these related resources:
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2010-12-09