By: Levi Masonde | Updated: 2023-03-23 | Comments | Related: > Python
Problem
There are so many APIs available to get various types of information. One goal a lot of developers have is how to load data into SQL Server by using an API. In this article, we look at an API from RapidAPI that allows us to retrieve flight information which we will use to load into a SQL Server table.
Solution
You can use one of RapidAPI's API to get global airport information and use Python to store the data in your SQL Server database for later use to analyze or just read the data from a table. The API we will look at has a lot of different information about flights and you can use this data to see which airports to avoid, flight delays, or just general flight information.
In this tutorial, we will look at how to retrieve data for flight delays from different airports globally. The delays are calculated relative to the time specified for the request and the median of the delays, this will help see which airports are more reliable than other airports.
Prerequisites
- SQL Server instance, you can create one following this guide
- Python, Version 3.11.0
- Visual Studio Code, Version 1.72.1
- Windows 10 PC or Windows Server 2019/2022
- RapidAPI account, you can use a free subscription
Creating SQL Server Table
Before getting the data, create a SQL Server table to store the data.
We will use SQLCMD, but you can use SSMS or any other method to create a database and table.
Open a Windows command prompt or use the Python Console Terminal and enter the following using your server and credentials.
sqlcmd -S PC_NAME\SQL_INSTANCE -E
The following code will create a database named AirportDelaysDB.
CREATE DATABASE AirportDelaysDB; GO
This code changes the context to the newly created database and creates a table with just the ID column. The other columns will be created in the Python script below.
USE AirportDelaysDB; GO CREATE TABLE AirportDelaysTable(ID int NOT NULL PRIMARY KEY); GO
Here is the console sequence.
Great, now you have a table to write your flight delay data.
Creating SQL engine for Python
Now that you have created a database and a table, you need to create a connection string between your SQL instance and Python code.
Use the pip command pip install sqlalchemy
to install sqlalchemy
to your environment. SQLAlchemy creates
an engine for Python to execute SQL statements. Also, urllib comes pre-installed
with Python, so there's no need to install it.
Create a Python file named FlightDbConn.py and add the following code updating the information for your environment:
import sqlalchemy as sa from sqlalchemy import create_engine import urllib
conn = urllib.parse.quote_plus( 'Data Source Name=KNITNETDS;' 'Driver={SQL Server};' 'Server=PC_NAME\SQL_INSTANCE;' 'Database=AirportDelaysDB;' 'Trusted_connection=yes;' ) try: FlightEngine = create_engine('mssql+pyodbc:///?odbc_connect={}'.format(conn)) print("Passed") except: print("failed!")
Getting the Flight Data
We will use the AeroDataBox API to get Flight delays data per airport. Make sure to login and get your API keys. At the moment AeroDataBox API is provided exclusively via the RapidAPI – a platform where you can find and connect to thousands of various APIs with a single API key. You can test API for free with the Basic plan.
This tutorial uses ICAO codes instead of airport names. The ICAO airport code or location indicator is a four-letter code designating airports around the world. These codes, are defined by the International Civil Aviation Organization.
The airport delays API endpoints allow answering questions, like "How many flights are delayed or canceled right now or in past?" or "What are the delay statistics and delay index at an airport now or back then?" These endpoints are designed to get a statistical insight into the overall "smoothness" of flight operations. Information returned from the endpoint includes departures and arrivals, a median delay of recent flight movements, number of canceled flights, flight delay index, and validity periods of the batch.
To pull data, use the pip command pip install
pandas
. Pandas will be used to manage the data structure.
Create a file named FlightAPI.py and add the following. You will need to update with your keys.
import requests import pandas as pd from FlightDbConn import FlightEngine delay_url = "https://aerodatabox.p.rapidapi.com/airports/delays/2023-03-06T00:18" querystring = {"withAircraftImage":"false","withLocation":"false"} headers = { "X-RapidAPI-Key": "046c3b4981msha4789c0d3484c16p1bc8d3jsna2f831b3710c", "X-RapidAPI-Host": "aerodatabox.p.rapidapi.com" } delay_response = requests.request("GET", delay_url, headers=headers, params=querystring) delay_response_Json = delay_response.json() Icao_array = [] initialTime_array = [] finalTime_array = [] departureDelayTotal_array = [] arrivalDelayTotal_array = [] cancelled_array = [] for i in delay_response_Json: Icao = i['airportIcao'] Icao_array.append(Icao) initialTime = i['fromUtc'] initialTime_array.append(initialTime) finalTime = i['toUtc'] finalTime_array.append(finalTime) departureDelay = i['departuresDelayInformation']['numTotal'] departureDelayTotal_array.append(departureDelay) arrivalDelays = i['arrivalsDelayInformation']['numTotal'] arrivalDelayTotal_array.append(arrivalDelays) cancelled = i['arrivalsDelayInformation']['numCancelled'] cancelled_array.append(cancelled) delay_df = pd.DataFrame({"Icao":Icao_array,"FromTime":initialTime_array,"ToTime":finalTime_array,"DepartureTimeDelay":departureDelayTotal_array,"ArrivalTimeDelay":arrivalDelayTotal_array,"Cancelled":cancelled_array}) print(delay_df)
The script above fetches data from the API, defines variables and stores the data in a pandas dataframe. The output shows the arrival and departure airport delays and number of canceled flights per airport. Keep in mind, this output is in the time frame shown.
Now we connect the pandas dataframe to the SQL database table created above. Modify the FlightAPI.py file as follows:
import requests import pandas as pd from FlightDbConn import FlightEngine delay_url = "https://aerodatabox.p.rapidapi.com/airports/delays/2023-03-06T00:18" querystring = {"withAircraftImage":"false","withLocation":"false"} headers = { "X-RapidAPI-Key": "046c3b4981msha4789c0d3484c16p1bc8d3jsna2f831b3710c", "X-RapidAPI-Host": "aerodatabox.p.rapidapi.com" } delay_response = requests.request("GET", delay_url, headers=headers, params=querystring) delay_response_Json = delay_response.json() Icao_array = [] initialTime_array = [] finalTime_array = [] departureDelayTotal_array = [] arrivalDelayTotal_array = [] cancelled_array = [] for i in delay_response_Json: Icao = i['airportIcao'] Icao_array.append(Icao) initialTime = i['fromUtc'] initialTime_array.append(initialTime) finalTime = i['toUtc'] finalTime_array.append(finalTime) departureDelay = i['departuresDelayInformation']['numTotal'] departureDelayTotal_array.append(departureDelay) arrivalDelays = i['arrivalsDelayInformation']['numTotal'] arrivalDelayTotal_array.append(arrivalDelays) cancelled = i['arrivalsDelayInformation']['numCancelled'] cancelled_array.append(cancelled) delay_df = pd.DataFrame({"Icao":Icao_array,"FromTime":initialTime_array,"ToTime":finalTime_array,"DepartureTimeDelay":departureDelayTotal_array,"ArrivalTimeDelay":arrivalDelayTotal_array,"Cancelled":cancelled_array}) print(delay_df) try: delay_df.to_sql('AirportDelaysTable',con=FlightEngine,if_exists='replace',index=True) print("data added to the database") except Exception as e: print("Failed!") print(e)
The script above will write the API data to the SQL Server table which you can now query.
Conclusion
Great! Now you can utilize this API's flight data to capture and analyze the data. Check out the API to see all of the flight and airport related data you can retrieve.
Next Steps
Check out these related articles:
- Read API Data with Power BI using Power Query
- Extracting API Data Using PowerShell and Loading into SQL Server
- Twitter API to Execute Queries and Analyze Data in SQL Server
- Load API Data to SQL Server Using Python and Generate Report with Power BI
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2023-03-23