By: Michelle Gutzait | Updated: 2009-12-08 | Comments (2) | Related: 1 | 2 | 3 | > Linked Servers
Problem
You have a foreign database, different than SQL Server. This database can be Oracle, DB2, Progress DB, Sybase or any other database platform that can be connected to by using ODBC or OLE DB providers. You need to import data from many tables that reside in the remote database into your SQL Server database. The list of tables is dynamic (tables may be added or removed from the list or from the remote/destination databases) and the schema could also change. It is possible that the tables on the SQL Server side also include additional column, such as the datetime when the row was imported or a timestamp.
In this article, I show how this can be done by using Linked Servers. Note that it is also possible to be implemented via an SSIS package, but it's much more complicated. An example of such an implementation can be found here.
Solution
There are three possible requirements for the implementation of this import process:
- Drop and recreate the tables in the destination database (SQL Server) based on the schema in the source database.
- Truncate destination tables (SQL Server) and INSERT the data from the source database based on the schema in the destination database.
- INSERT data into destination database (SQL Server) using a WHERE clause in the source database. This will append data to the existing tables, without cleaning up the tables before (inserting deltas).
I will provide scripts to implement each of the above options. The script will be running from SQL Server (which in our case is the destination database).
Solution components:
The solution requires the following:
- A table in the local SQL Server database that will contain the list of the tables that need to be imported. There is also a flag column called IsEnabled, so that tables can be enabled/disabled from the list without being removed:
CREATE TABLE tblListOfTables (TableName VARCHAR(200) NOT NULL PRIMARY KEY, IsEnabled bit DEFAULT 1 )
- A working Linked Server to the source (external) database. Learn how to create the Linked Server here. In this article I use the names: MyLinkedServer.MyDB.MySchema for the Linked Server name, remote database name and remote schema name. You will need to change this to match your linked server objects. Also, I am using a fixed schema, so you can adjust this script to be more dynamic if needed.
- In my solution, the local imported tables in SQL Server have the same name and contain column names that are identical to the names in the source table. If needed, you can extend the tblListOfTables table to include different table names or different column names, but this does complicate the solution.
Before you get started make sure you replace this code MyLinkedServer.MyDB.MySchema in the scripts below with your linked server name, source database name and source schema name.
Solution #1
Drop and recreate the tables in the destination database (SQL Server) based on the schema in the source database.
This will create an exact duplicate of the table and the data in the destination database. If the table exists it will first be dropped.
DECLARE @Loop INT, @cmd VARCHAR(MAX), @rc INT, @TB VARCHAR(130) -- I am creating a new temporary table with a sequential ID (and no gaps), -- so I can loop by it: SELECT ROWID = IDENTITY(INT,1,1), TableName INTO #Tables FROM tblListOfTables WHERE IsEnabled = 1 ORDER BY TableName SET @rc = @@ROWCOUNT SET @Loop = 1 -- Looping on table names, dropping and recreating each: WHILE @Loop <= @rc BEGIN SELECT @TB = TableName FROM #Tables WHERE ROWID = @Loop SET @cmd = '' SELECT @cmd = 'IF EXISTS (SELECT 1 FROM sys.objects WHERE Type = ''U''' + 'AND name = ''' + @TB + ''') DROP TABLE [' + @TB + '];' + 'SELECT * INTO [' + @TB + '] ' + 'FROM MyLinkedServer.MyDB.MySchema.[' + @TB + ']' EXEC (@cmd) SET @Loop = @Loop + 1 END DROP TABLE #Tables GO
Solution #2
Truncate destination tables (SQL Server) and INSERT the data from the source database based on the schema in the destination database.
Assumptions: table and column names are identical in source and destination tables. Destination tables already exist and may have additional columns that will be excluded from the insert (optional). The insert will be created on the fly based on the schema of the destination tables.
In the following example, the destination table includes the additional column shown below that does not exist in the source and should not be imported:
RowImported datetime NOT NULL default getdate()
DECLARE @Loop INT, @cmd VARCHAR(MAX), @rc INT, @TB VARCHAR(130) -- I am creating a new temporary table with a sequential ID (and no gaps), -- so I can loop by it: SELECT ROWID = IDENTITY(INT,1,1), TableName INTO #Tables FROM tblListOfTables WHERE IsEnabled = 1 ORDER BY TableName SET @rc = @@ROWCOUNT SET @Loop = 1 -- Looping on table names, truncating and inserting data to each: WHILE @Loop <= @rc BEGIN SELECT @TB = TableName FROM #Tables WHERE ROWID = @Loop SET @cmd = '' SELECT @cmd = @cmd + ',[' + name + ']' FROM sys.columns WHERE OBJECT_NAME(OBJECT_ID) = @TB AND name NOT IN ('RowImported') -- Excluding the additional column -- (remove this line if not required -- or add more columns if necessary) SELECT @cmd = 'TRUNCATE TABLE [' + @TB+ '];' + ' INSERT INTO [' + @TB + '] (' + SUBSTRING(@cmd,2,LEN(@cmd)) + ')' + ' SELECT ' + SUBSTRING(@cmd,2,LEN(@cmd)) + ' FROM MyLinkedServer.MyDB.MySchema.[' + @TB + ']' EXEC (@cmd) SET @Loop = @Loop + 1 END DROP TABLE #Tables GO
Solution #3
INSERT data into destination database (SQL Server) using a WHERE clause in the source database. This will append data to the existing tables, without cleaning up the tables before (inserting deltas).
Assumption: table and column names are identical in source and destination tables. Destination tables already exist and may have additional columns that will be excluded from the insert (optional).
In the following example, destination table includes the additional column:
RowImported datetime NOT NULL default getdate()
In this example, the delta will be fetched based on the last SeqNumber column in the source table. (Note that if you use date datatypes in the WHERE clause, you will have to adjust your query to the date format in the source database, according to the format of the specific platform):
In this example you will need to replace SeqNumber and @SeqNo with the appropriate columns from your tables. Since this is hard coded for all tables you could also make this more dynamic by adding this logic to the control table.
DECLARE @Loop INT, @cmd VARCHAR(MAX), @rc INT, @TB VARCHAR(130) -- I am creating a new temporary table with a sequential ID (and no gaps), -- so I can loop by it: SELECT ROWID = IDENTITY(INT,1,1), TableName INTO #Tables FROM tblListOfTables WHERE IsEnabled = 1 ORDER BY TableName SET @rc = @@ROWCOUNT SET @Loop = 1 WHILE @Loop <= @rc BEGIN SELECT @TB = TableName FROM #Tables WHERE ROWID = @Loop SET @cmd = '' SELECT @cmd = @cmd + ',[' + name + ']' FROM sys.columns WHERE OBJECT_NAME(OBJECT_ID) = @TB AND name NOT IN ('RowImported') -- Excluding the additional column -- (remove this line if not required -- or add more columns if necessary) -- First step would be to read the last fetched Sequential Number from which -- remote data will be fetched: SELECT @cmd = 'DECLARE @SeqNo int; SELECT @SeqNo = max(SeqNumber) from [' + @TB+ '];' + 'INSERT INTO [' + @TB + '] (' + SUBSTRING(@cmd,2,LEN(@cmd)) + ')' + ' SELECT ' + SUBSTRING(@cmd,2,LEN(@cmd)) + ' FROM MyLinkedServer.MyDB.MySchema.[' + @TB + '] WHERE SeqNumber > @SeqNo' EXEC (@cmd) SET @Loop = @Loop + 1 END DROP TABLE #Tables GO
Summary
In this article I provided a simple solution to import data dynamically from a remote external database via a Linked Server.
You can expand your solution taking into consideration the following:
- Encapsulate everything in a transaction, if it is required.
- This is not recommended because it involves MSDTC transactions. The best approach would be to import everything into a staging database in SQL Server (with no transaction) and then use a local transaction to update the SQL Server Production tables.
- If the number of tables to be imported is big, consider running a few processes in parallel. You can encapsulate the above logic in an SSIS package or a program to split the load between a few processes that will run in parallel.
Next Steps
- Learn how to Create and Destroy Linked Server on demand.
- Modify the above T-SQL by replacing the object names with the names in your environment.
- Schedule the above T-SQL via a SQL Server Agent job according to your needs.
- Implement a Log table in which you can capture failure/success of each import as well as the time each import took. Possibly, implement Try/catch blocks to assist with the error capturing.
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2009-12-08