Access Amazon Redshift data from SQL Server via a Linked Server

By:   |   Updated: 2023-05-19   |   Comments (7)   |   Related: > Linked Servers


Problem

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. There are often times when you need to access Redshift data from SQL Server. You could export and import data, but another option is to create a linked server from SQL Server to Redshift. In this tip we walk through how this can be done for AWS Redshift from SQL Server.

Solution

In this tip, we will show how SQL Server can access Redshift data via a linked server.

Install Amazon Redshift ODBC Driver

You can get a copy of the Amazon Redshift ODBC Driver 32-bit or 64-bit to install.

The steps to complete the installation is very simple, following these steps for the installation.

Amazon Redshift install
Amazon Redshift install

Configure Amazon Redshift ODBC Driver

Once the driver has been installed you will see the driver in the ODBC Data Source Administrator.

ODBC Data Source Administrator

After you download and install the ODBC driver, you need to add a data source name (DSN) entry to the client machine. SQL client tools uses this data source to connect to the Amazon Redshift database.

Create a new data source like the example below.

ODBC Data Source Administrator

Select the System DSN tab if you want to configure the driver for all users on the computer or the User DSN tab if you want to configure the driver for your user account only.

Amazon Redshift ODBC Driver DSN setup

Specify the endpoint for your Amazon Redshift cluster. You can find this information in the Amazon Redshift console on the cluster’s details page. Type the port number that the database uses. By default, Amazon Redshift uses port 5439, but you should use the port that the cluster was configured to use when it was launched.

To test, click the Test button. If the client computer can connect to the Amazon Redshift database, you will see the following message: Connection successful.

How to create a SQL Server Linked Server to Amazon Redshift

  1. In SQL Server Management Studio, open Object Explorer, expand Server Objects, right-click Linked Servers, and then click New Linked Server.
  2. On the General Page, type the name of the instance of SQL Server that you area linking to.
  3. Specify an OLE DB server type other than SQL Server.
  4. Select an OLE DB data source from the list box (in this case ODBC Drivers).
  5. Type the product name and the data source will be the ODBC Data Source we created before.
sql server new linked server

The first RPC setting is mainly for a legacy feature called Remote Server. The RPC OUT setting is very pertinent to linked servers on SQL Server. Think about an RPC (Remote Procedure Call) as being a stored procedure being run remotely from server 1 to linked server 2.

Also, you can create a linked server with these T-SQL commands:

USE [master]
GO
EXEC master.dbo.sp_addlinkedserver @server = N'REDSHIFT'
, @srvproduct=N'Redshift'
, @provider=N'MSDASQL'
, @datasrc=N'demo'

EXEC master.dbo.sp_addlinkedsrvlogin @rmtsrvname=N'REDSHIFT'
,@useself=N'False'
,@locallogin=NULL
,@rmtuser=NULL
,@rmtpassword=NULL
GO

Access Amazon Redshift data from SQL Server

After you connect to the linked server, you will be able to see the remote data in the SSMS Object Explorer.

sql server ssms object explorer

Query Amazon Redshift data from SQL Server

ssms sql query and results

I created an example below to show how to join and merge data with the Redshift environment, load data on SQL Server or insert data into Redshift.

Amazon Redshift is based on PostgreSQL 8.0.2. Amazon Redshift and PostgreSQL have a number of very important differences that you must be aware of as you design and develop your data warehouse application. You can see more examples here how to create tables on Redshift.

USE demo
GO

CREATE TABLE tb01 (
id INT IDENTITY
, first_name VARCHAR(50)
, last_name VARCHAR(50)
, age SMALLINT
)
GO

TRUNCATE TABLE dbo.tb01
GO

INSERT INTO dbo.tb01 (first_name, last_name, age) VALUES ('douglas', 'correa', 36)
GO

EXEC('create table demodb.public.tbdemo(
id integer not null
,name varchar(100)
,birth timestamp ) ') AT DEMO 

EXEC('insert into demodb.public.tbdemo values(1, ''douglas correa'', getdate() )') AT DEMO
EXEC('insert into demodb.public.tbdemo values(2, ''renato silvestre'', getdate() )') AT DEMO
GO

EXEC('delete from demodb.public.tbdemo') AT DEMO
GO

SELECT T.id, T.first_name, T.last_name, T.age
FROM OPENQUERY(DEMO,'select id, name, birth from demodb.public.tbdemo') AS OQ
INNER JOIN dbo.tb01 AS T ON OQ.id = T.id
GO

MERGE dbo.tb01 AS target
USING (SELECT * 
FROM OPENQUERY(DEMO,'select id, name, birth from demodb.public.tbdemo') AS OQ
) AS source (id, name, birth) 
ON target.id = source.id
WHEN MATCHED THEN
UPDATE SET target.age = DATEDIFF(yyyy, GETDATE(), source.birth)
WHEN NOT MATCHED THEN
INSERT (first_name, age) 
VALUES (SUBSTRING(source.name, 0, CHARINDEX(' ',source.name)), DATEDIFF(yyyy, GETDATE(), source.birth))
OUTPUT Inserted.id, Inserted.first_name, Inserted.age;
GO

Running the script above, we can see the output in the screenshots below:

ssms sql query and results

The MERGE results:

ssms sql query and results

SELECT with INNER JOIN results:

ssms sql query and results

The results of these operations depend on type conversion rules and data type compatibility. In general, data types that fall into the same type category (such as different numeric data types) are compatible and can be implicitly converted.

If you compare numeric values with character strings, the numeric values are converted to character strings. When converting DATE or TIMESTAMP to TIMESTAMPTZ, DATE or TIMESTAMP are assumed to use the current session time zone.

Amazon Redshift ODBC Driver Performance

The Connector/ODBC driver has been optimized to provide very fast performance. If you experience problems with the performance of Connector/ODBC, or notice a large amount of disk activity for simple queries, there are a number of aspects to check:

  • Ensure that ODBC Tracing is not enabled. With tracing is enabled, a lot of information is recorded in the tracing file by the ODBC Manager.
  • You can check, and disable, tracing within Windows using the Tracing panel of the ODBC Data Source Administrator.
Next Steps
  • More about option parameters here.
  • Connector/ODBC errors FAQ.


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Douglas Correa Douglas Correa is a database professional, focused on tuning, high-availability and infrastructure.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2023-05-19

Comments For This Article




Tuesday, October 10, 2023 - 3:52:37 AM - jd Back To Top (91642)
same issue w/ michael and Yan, is there any solution on this?
Cannot initialize the data source object of OLE DB provider "MSDASQL" for linked server "REDSHIFTTEST".
OLE DB provider "MSDASQL" for linked server "REDSHIFTTEST" returned message "[Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified". (.Net SqlClient Data Provider)

Friday, May 19, 2023 - 11:26:59 AM - Greg Robidoux Back To Top (91212)
Thank you.

The links have been updated.

Friday, May 19, 2023 - 10:33:56 AM - murali Back To Top (91211)
The Redshift drivers links are broken

Wednesday, February 12, 2020 - 12:28:59 PM - Nazareth Berlanga Back To Top (84382)

Works like a charm! Thanks!


Wednesday, April 17, 2019 - 7:57:26 PM - Michael Back To Top (79589)

I'm having the identical issue as Yan.  The ODBC System DSN test works, but when I attempt to test the connection of the linked server itself in SSMS I get this message:

Cannot initialize the data source object of OLE DB provider "MSDASQL" for linked server "REDSHIFT".

OLE DB provider "MSDASQL" for linked server "REDSHIFT" returned message "[Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified". (Microsoft SQL Server, Error: 7303)

Yan, did you ever get past this issue?


Friday, February 15, 2019 - 11:37:33 AM - Douglas Correa Back To Top (79044)

Hi yan,

The only thing I'm thinking now is if you used User DSN and not system DSN or your data source is different perhaps

Regards,

Doug 


Friday, February 15, 2019 - 11:03:39 AM - yan fei Back To Top (79042)

 hi doug,

I am trying to follow this post to set up my linked server from sql server 2016 to redshift - the ODBC DSN works, but when testing the linked server, always got error message below:

Cannot initilize the data source object of OLEDB provider "MSDASQL" for linked server "REDSHIF"

OLE DB provider "MSDASQL" for linked server "REDSHIFT" returned message "[Microsoft][ODBC Driver Manager] Data Source name not found and no default driver specified" (microsoft SQL Server, Error: 7303) 

I double check my config following your post, don't seem to see anything wrong, and again DSN works when i tested, and I used the correct DSN name - any idea? 















get free sql tips
agree to terms