Access Amazon Redshift data from SQL Server via a Linked Server

By: Douglas Correa | Updated: 2023-05-19 | Comments (7) | Related: > Linked Servers

Problem

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. There are often times when you need to access Redshift data from SQL Server. You could export and import data, but another option is to create a linked server from SQL Server to Redshift. In this tip we walk through how this can be done for AWS Redshift from SQL Server.

Solution

In this tip, we will show how SQL Server can access Redshift data via a linked server.

Install Amazon Redshift ODBC Driver

You can get a copy of the Amazon Redshift ODBC Driver 32-bit or 64-bit to install.

The steps to complete the installation is very simple, following these steps for the installation.

Configure Amazon Redshift ODBC Driver

Once the driver has been installed you will see the driver in the ODBC Data Source Administrator.

After you download and install the ODBC driver, you need to add a data source name (DSN) entry to the client machine. SQL client tools uses this data source to connect to the Amazon Redshift database.

Create a new data source like the example below.

Select the System DSN tab if you want to configure the driver for all users on the computer or the User DSN tab if you want to configure the driver for your user account only.

Specify the endpoint for your Amazon Redshift cluster. You can find this information in the Amazon Redshift console on the cluster’s details page. Type the port number that the database uses. By default, Amazon Redshift uses port 5439, but you should use the port that the cluster was configured to use when it was launched.

To test, click the Test button. If the client computer can connect to the Amazon Redshift database, you will see the following message: Connection successful.

How to create a SQL Server Linked Server to Amazon Redshift

In SQL Server Management Studio, open Object Explorer, expand Server Objects, right-click Linked Servers, and then click New Linked Server.
On the General Page, type the name of the instance of SQL Server that you area linking to.
Specify an OLE DB server type other than SQL Server.
Select an OLE DB data source from the list box (in this case ODBC Drivers).
Type the product name and the data source will be the ODBC Data Source we created before.

The first RPC setting is mainly for a legacy feature called Remote Server. The RPC OUT setting is very pertinent to linked servers on SQL Server. Think about an RPC (Remote Procedure Call) as being a stored procedure being run remotely from server 1 to linked server 2.

Also, you can create a linked server with these T-SQL commands:

USE [master]
GO
EXEC master.dbo.sp_addlinkedserver @server = N'REDSHIFT'
, @srvproduct=N'Redshift'
, @provider=N'MSDASQL'
, @datasrc=N'demo'

EXEC master.dbo.sp_addlinkedsrvlogin @rmtsrvname=N'REDSHIFT'
,@useself=N'False'
,@locallogin=NULL
,@rmtuser=NULL
,@rmtpassword=NULL
GO

Access Amazon Redshift data from SQL Server

After you connect to the linked server, you will be able to see the remote data in the SSMS Object Explorer.

Query Amazon Redshift data from SQL Server

I created an example below to show how to join and merge data with the Redshift environment, load data on SQL Server or insert data into Redshift.

Amazon Redshift is based on PostgreSQL 8.0.2. Amazon Redshift and PostgreSQL have a number of very important differences that you must be aware of as you design and develop your data warehouse application. You can see more examples here how to create tables on Redshift.

USE demo
GO

CREATE TABLE tb01 (
id INT IDENTITY
, first_name VARCHAR(50)
, last_name VARCHAR(50)
, age SMALLINT
)
GO

TRUNCATE TABLE dbo.tb01
GO

INSERT INTO dbo.tb01 (first_name, last_name, age) VALUES ('douglas', 'correa', 36)
GO

EXEC('create table demodb.public.tbdemo(
id integer not null
,name varchar(100)
,birth timestamp ) ') AT DEMO 

EXEC('insert into demodb.public.tbdemo values(1, ''douglas correa'', getdate() )') AT DEMO
EXEC('insert into demodb.public.tbdemo values(2, ''renato silvestre'', getdate() )') AT DEMO
GO

EXEC('delete from demodb.public.tbdemo') AT DEMO
GO

SELECT T.id, T.first_name, T.last_name, T.age
FROM OPENQUERY(DEMO,'select id, name, birth from demodb.public.tbdemo') AS OQ
INNER JOIN dbo.tb01 AS T ON OQ.id = T.id
GO

MERGE dbo.tb01 AS target
USING (SELECT * 
FROM OPENQUERY(DEMO,'select id, name, birth from demodb.public.tbdemo') AS OQ
) AS source (id, name, birth) 
ON target.id = source.id
WHEN MATCHED THEN
UPDATE SET target.age = DATEDIFF(yyyy, GETDATE(), source.birth)
WHEN NOT MATCHED THEN
INSERT (first_name, age) 
VALUES (SUBSTRING(source.name, 0, CHARINDEX(' ',source.name)), DATEDIFF(yyyy, GETDATE(), source.birth))
OUTPUT Inserted.id, Inserted.first_name, Inserted.age;
GO

Running the script above, we can see the output in the screenshots below:

The MERGE results:

SELECT with INNER JOIN results:

The results of these operations depend on type conversion rules and data type compatibility. In general, data types that fall into the same type category (such as different numeric data types) are compatible and can be implicitly converted.

If you compare numeric values with character strings, the numeric values are converted to character strings. When converting DATE or TIMESTAMP to TIMESTAMPTZ, DATE or TIMESTAMP are assumed to use the current session time zone.

Amazon Redshift ODBC Driver Performance

The Connector/ODBC driver has been optimized to provide very fast performance. If you experience problems with the performance of Connector/ODBC, or notice a large amount of disk activity for simple queries, there are a number of aspects to check:

Ensure that ODBC Tracing is not enabled. With tracing is enabled, a lot of information is recorded in the tracing file by the ODBC Manager.
You can check, and disable, tracing within Windows using the Tracing panel of the ODBC Data Source Administrator.

Next Steps

More about option parameters here.
Connector/ODBC errors FAQ.

About the author

Douglas Correa is a database professional, focused on tuning, high-availability and infrastructure.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips

Article Last Updated: 2023-05-19

Tuesday, October 10, 2023 - 3:52:37 AM - jd	Back To Top (91642)
same issue w/ michael and Yan, is there any solution on this? Cannot initialize the data source object of OLE DB provider "MSDASQL" for linked server "REDSHIFTTEST". OLE DB provider "MSDASQL" for linked server "REDSHIFTTEST" returned message "[Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified". (.Net SqlClient Data Provider)

Friday, May 19, 2023 - 11:26:59 AM - Greg Robidoux	Back To Top (91212)
Thank you. The links have been updated.

Friday, May 19, 2023 - 10:33:56 AM - murali	Back To Top (91211)
The Redshift drivers links are broken

Wednesday, February 12, 2020 - 12:28:59 PM - Nazareth Berlanga	Back To Top (84382)
Works like a charm! Thanks!