Getting SQL Server Metadata With Python

By: Tim Wiseman | Comments (3) | Related: > Functions System

Problem

Metadata is invaluable when working with SQL Server, so how can I bring SQL Server metadata into a Python based Program?

Solution

Metadata is data about data. In the context of Microsoft SQL Server it includes topics such as the names of the tables in a database, the columns in a table, and the number of rows in a table. Much of this metadata can be acquired in a user friendly fashion through the GUI provided by SQL Server Management Studio (SSMS) when you are working in an interactive fashion with the server. Assuming you have the proper permissions, the Object Explorer in SSMS is quite helpful for getting the list of databases, tables, columns, and other objects inside the database.

However, the user friendly GUI of SSMS will not help when you are dealing with the databases programmatically. Microsoft SQL Server makes it relatively simple to query much of its metadata in SQL Server 2005 and later through the System Catalog Views and the Information Schema Views These can be accessed from within virtually any environment that allows queries to be sent directly to the server. But, especially for simple and commonly accessed metadata, there are often ways to access it that are built into the programming language and may be easier to manipulate from within that language or the libraries used with the language.

For Python in particular, there is more than one library that can be used to access Microsoft SQL Server. For instance, the QtSQL library integrates well with the rest of the PyQt tool set and there are several implementations of ODBC for Python. The rest of this tip will focus on pyodbc, but most of the libraries will offer somewhat similar options and many libraries used with languages other than Python will also have similar options since many of them use ODBC. For that matter, and for that reason, many libraries for other languages that use ODBC will offer similar options.

The main object for gathering metadata, and doing most other tasks, through pyodbc is the cursor. This exposes methods such as tables, columns, statistics, rowIdColumns, primaryKeys, foreignKeys, procedures, getTypeInfo. The information which is available from these methods can always be acquired through T-SQL commands. As you can see from a Profiler Trace, running cursor.tables() in Python executes sys.sp_tables on the SQL Server.

exec [sys].sp_tables NULL,N'%',NULL,NULL,@fUsePattern=1

Although all the metadata that can be gained from these methods could also be obtained by sending the T-SQL to the server, there are some advantages to using the methods. For one thing, it can help decouple that part of the code from the underlying database by relying on the library and ODBC to handle the implementation details. Although I tend to favor SQL Server for most projects, there are times when it makes sense to use other types of databases or other ODBC accessible data sources as well. For instance, we could connect to an Excel file through Python and get a list of its sheet names this way:

import pyodbc

connStr = ('Driver={Microsoft Excel Driver (*.xls)};FIRSTROWHASNAMES=1;' +
            'READONLY=FALSE;CREATE_DB=Book1.xls; DBQ= Book1.xls')

#Need autocommit for excel
conn = pyodbc.connect(connStr, autocommit = True) 
curs = conn.cursor()

for x in curs.tables():
    print x

But if I try to send a command for sp_tables directly to Excel the way I could if I was connected to SQL Server like :

for x in curs.execute('exec sp_tables'):
    print x

Then I get an error message reading "pyodbc.ProgrammingError: ('42S02', "[42S02] [Microsoft][ODBC Excel Driver] The Microsoft Jet database engine could not find the object 'sp_tables'. Make sure the object exists and that you spell its name and the path name correctly. (-1305) (SQLExecDirectW)")". Of course, using ODBC is rarely an ideal way to access an Excel file. In Python, I frequently use XLRD and XLWT. It does still demonstrate the versatility of using the ODBC calls rather than acquiring the metadata through direct queries. While it would be uncommon to write a program that would use a SQL Server or an Excel file interchangeably, it is not uncommon to start with a single user version of a program with a smaller database such as Access and then migrate to the greater power of SQL Server as the project gets more complicated. Making use of the more general ODBC functions rather than implementation specific executions helps make those migrations simpler.

Looking at the tables method specifically, it will probably return more data than we will want. Just like sp_tables, it returns information on every table like an object in the database including system tables, temporary tables, aliases, synonyms, and views. If we want to narrow that down to standard tables, we can set the table type to 'TABLE', like: curs.tables(tableType = 'TABLE'). You can similarly narrow the list by the table name or schema and it supports the standard wildcards of '_' or '%'.

Much like the tables method, the columns method applies to the entire database the cursor is connected to. Behind the scenes, it executes sp_columns_100. It can be narrowed down to the table in question by providing the table parameter: curs.columns(table='IntTbl'). If however you want the columns of the current result set that the cursor is holding, those can be accessed through the description property. The description property returns a tuple of tuples which contain the (name, type_code, display_size, internal_size, precision, scale, null_ok) for each column used in that result set.

Naturally, these methods can be used together to get more refined data. A short script to find all the tables without primary keys could look like:

import pyodbc #to connect to SQL Server

sqlConnStr = ('DRIVER={SQL Server Native Client 11.0};Server=YourServer;Database=Test;'+
            'Trusted_Connection=YES') 
            
conn = pyodbc.connect(sqlConnStr)
curs = conn.cursor()

def findTablesWithNoPk(curs):
    """Takes an active cursor as an input and returns a list
    of the names of all tables with no Primary key"""
    noPkTbls = []    
    #using fetchall to free up the cursor for queries in the loop
    #if the cursor will not be used in the loop, it is often better
    #to iterate over the rows instead.
    for row in curs.tables(tableType = 'TABLE').fetchall():
        #fetchone will return None, which is treated as false, if there are no rows.
        #Cannot use rowcount since that often returns -1 for performance reasons until
        #a row has been fetched.
        if not curs.primaryKeys(row.table_name, row.table_cat, row.table_schem).fetchone():
            noPkTbls.append(row.table_schem + '.' + row.table_name)
    return noPkTbls
    
print findTablesWithNoPk(curs)

This gives results very similar to the pure T-SQL query that Aaron Bertrand provided in his recent article Finding SQL Server tables with (or without) a certain property. His query is somewhat more efficient and requires fewer calls to the server. This procedure is somewhat more pythonic and does not rely on any specific facts about SQL Server, using the ODBC methods instead. Also, since this function is calling sp_tables behind the scenes instead of sys.tables it may return results for sys.trace_xe_action_map and sys.trace_xe_event_map on certain versions of SQL Server.

The metadata which is made available through ODBC allows you to swiftly bring in data about the database in a way that does not rely on specific database features and can be quickly accessed form within the programming language.

Next Steps

The following tips and other resources talk more about using Python with SQL Server:

Python for the SQL Server DBA
Plotting SQL Server Data for Data Visualization
The pyodbc documentation on the cursor is invaluable when working with pyodbc.

The following articles talk more about getting metadata from SQL Server through other means:
- Aaron Bertand's Over 40 queries to find SQL Server tables with or without a certain property
- And also Aaron's Finding SQL Server views with (or without) a certain property
- Arshad Ali's Understanding Catalog Views in SQL Server 2005 and 2008

About the author

Tim Wiseman is a SQL Server DBA and Python Developer in Las Vegas.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips

Thursday, February 15, 2024 - 7:38:23 AM - Joel	Back To Top (91949)
@Cheri: The invalid syntax is because of one of the differences between Python 2 and 3. For Python 3 you want print (findTablesWithNoPk(curs)) @Tim: Time to update this for Python 3.

Monday, August 23, 2021 - 12:10:13 PM - Tim Wiseman	Back To Top (89158)
What version of Python are you using? When I wrote this 2.x was by far the predominate version of python, but now 3.x is much more common. The syntax for several commands, prominently including print, changed between 2.x and 3.x.

Friday, August 20, 2021 - 4:16:04 PM - Cherie	Back To Top (89156)
Hey I run in to a invalid syntax error with the print statement to print findTablesWithNoPk(curs). Do you know why?

Getting SQL Server Metadata With Python

Problem

Solution

Next Steps

About the author

Comments For This Article