By: John Miner | Updated: 2023-05-18 | Comments (2) | Related: > Python
Problem
Today, the three prominent cloud vendors support REST API frameworks to create, update, and delete various services. What is a REST API? It is an acronym that stands for Representational State Transfer Application Programming Interface. It is a software architecture style in which information is transferred from the client to the server via HTTP. The typical methods of GET, PUT, and DELETE are coded to take parameters such as header and/or body during the web call. The response indicates whether the action was successful or an error has occurred. The JavaScript Object Notation (JSON) is an open standard file format commonly used by web applications to transmit data. The most basic idea of the format is that all data must be stored as key value pairs. Of course, simple data types can be combined to create objects, and arrays of objects can also exist. Last but not least, the Python programming language is very popular. It has libraries that support both JSON and HTTP.
How can we use Python to store data in a JSON format? How can we deploy infrastructure in the Azure cloud using the REST API?
Solution
The Azure REST API is a published interface from Microsoft that constantly changes. Therefore, it is important to specify the API version when making a call. The Python language can be used to store data locally in JSON file format and deploy Azure infrastructure via web calls.
Business Problem
Our manager asked us to investigate how to store data in a JSON format using Python. This is a precursor to saving our company's transaction information in a data lake. Additionally, the manager wants to know if deploying and managing services using the Azure REST API is possible. Here is a list of tasks we need to investigate and solve:
Task Id | Description |
---|---|
1 | Explore the loads method. |
2 | Review data types supported by JSON. |
3 | Explore the dumps method. |
4 | Obtain bearer tokens. |
5 | Create a resource group, storage account, and storage container. |
6 | Upload a file to blob storage. |
7 | Remove resource group. |
At the end of the research, we will understand how to manage JSON files and make REST API calls.
JSON Data - Simple Data Types
This section will explore the different data types supported by the JSON file format. The JSON library supports a loads method (decoder) and a dump method (encoder). To start our exploration, we need both the Python interpreter and an interactive development environment.
The Anaconda distribution is one of my favorites. Download and install the software. The image below shows the start menu containing Jupyter Notebooks and the Spyder IDE. Both use the Python interpreter to execute your code.
A string value is one of the core data types. The variable s will contain the string representation of the JSON, and the variable o will contain the object it is decoded to. Execute the program below to create your first key value pair. The language we are talking about today is Python.
# # Test 1 - define a string # # include library import json # spacing spacer = " " * 4 # define string s = r'{ "language": "python"}' print("\nstring example:") print(spacer + s) # cast to object o = json.loads(s) # data type print(spacer + str(type(o))) # retrieve value print(spacer + o['language'])
The image below shows the complete program loaded in the Spyder editor. To the left, the code sits. To the right, we have the variables declared during execution at the top and the trace of the execution and output at the bottom. Since this information takes up the whole screen, the remaining screenshots presented in this tip will focus on the variable and output panes. Note: Internally, Python considers the decoded JSON string a dictionary object.
A numeric value is one of the core data types. Again, the variable s will contain the string representation of the JSON, and the variable o will contain the object it is decoded to. Execute the program below to create your second key value pair. The version of the Python language is 3.9.13. I used padding to store the software version as a decimal number of 3.009013. Please look at the documentation for more information, but JSON supports both whole and decimal numbers.
# # Test 2 - define a number # # include library import json # spacing spacer = " " * 4 # define string s = r'{ "version": 3.009013}' print("\nnumeric example:") print(spacer + s) # cast to object o = json.loads(s) # data type print(spacer + str(type(o))) # retrieve value print(spacer + str(o['version']))
The image below shows the output for executing the code.
The last core data type is the boolean value. The code below saves a key/value pair that indicates that the Python language is popular.
# # Test 3 - define a boolean # # include library import json # spacing spacer = " " * 4 # define string s = r'{ "popular": true }' print("\nboolean example:") print(spacer + s) # cast to object o = json.loads(s) # data type print(spacer + str(type(o))) # retrieve value print(spacer + str(o['popular']))
The image below shows that the JSON library is still decoding the data as a dictionary.
This section explored the string, numeric, and boolean data types core to the JSON format. In the next section, we will discuss combining these data types to make arrays and/or objects.
JSON Data - Complex Data Types
The next evolutionary step in creating a JSON document is combining the core data types into more complex ones. The code below creates an array of numbers representing the start of the Fibonacci sequence.
# # Test 4 - define an array # # include library import json # spacing spacer = " " * 4 # define string s = r'{ "fibinocci": [1, 1, 2, 3, 5, 8] }' print("\narray example:") print(spacer + s) # cast to object o = json.loads(s) # data type print(spacer + str(type(o))) # retrieve value print(spacer + str(o['fibinocci'][5]))
Arrays are zero-based in the Python language. Since the JSON string gets encoded as a dictionary, we need to select the property to retrieve the array. Next, we choose the nth index to retrieve the value we want. We want the 6th element in this case, so we use five as the index number. The image below shows the results of executing the program.
In real life, data is much more complicated. For instance, how can we represent a programming language's name and the year it was created? That idea is an object in JSON. We need to use an array of objects. Since the string representation is quite large, I will use the triple quote syntax to define the JSON string over multiple lines. If you are curious, I found the data from programming languages on this web page. There are quite a few languages I have not tried yet!
# # Test 5 - array of objects # # include library import json # define a json string s = """ [ { "name": "FORTRAN", "year": 1957 }, { "name": "COBOL", "year": 1959 }, { "name": "BASIC", "year": 1964 }, { "name": "PASCAL", "year": 1970 }, { "name": "C", "year": 1972 }, { "name": "SQL", "year": 1978 }, { "name": "C++", "year": 1980 }, { "name": "ADA", "year": 1983 }, { "name": "PERL", "year": 1987 }, { "name": "PYTHON", "year": 1990 }, { "name": "R", "year": 1993 }, { "name": "JAVA", "year": 1995 }, { "name": "C#", "year": 2001 }, { "name": "SCALA", "year": 2003 }, { "name": "POWERSHELL", "year": 2006 } ] """ # spacing spacer = " " * 4 # define string print("\nobjects example:") # cast to object o = json.loads(s) # how many elements print(spacer + "number of objects = " + str(len(o))) # data type print(spacer + str(type(o))) # retrieve value print(spacer + str(o[9]['name']))
The output of executing the above Python code is shown below. First, there are 15 languages that I have used frequently over my career. Second, the string presentation of this data has 536 characters. Finally, the decoder noticed the array and stored the data as a list of dictionaries.
So far, we have been decoding JSON strings into Python objects. The last experiment is to encode a Python object as a JSON string. To do this, I found an article on famous computer scientists, located the top three people, and collected their overall ranking, first name, last name, date born, date died, and contribution.
# # Test 6 - encode python dictionary # # include library import json # empty list l = [] # create dictionary p = {} p['rank'] = 1 p['fname'] = 'Alan' p['lname'] = 'Turing' p['born'] = 1912 p['died'] = 1954 p['contribution'] = 'turing machine' # add to list l.append(p) # create dictionary p = {} p['rank'] = 2 p['fname'] = 'Timothy' p['lname'] = 'Berners-Lee' p['born'] = 1955 p['died'] = None p['contribution'] = 'world wide web' # add to list l.append(p) # create dictionary p = {} p['rank'] = 3 p['fname'] = 'Charles' p['lname'] = 'Babage' p['born'] = 1791 p['died'] = 1871 p['contribution'] = 'mechanical computer' # add to list l.append(p) # define string s = json.dumps(l) # spacing spacer = " " * 4 print("\nencode example:") print(spacer + s)
We will use the dumps method to encode the object as a string. Inside Python, the variable is a list of dictionaries. The encoded variable seen below is a string with a length of 346 characters.
In short, we now know how to decode both simple and complex JSON strings as Python objects using the loads method. The opposite operation, the dumps method, can encode complex Python objects into JSON strings.
Access Token
It is best to use a service principal to make calls to the Azure REST API. However, the service principal must be granted permissions at the subscription, active directory, and/or service level to complete a task successfully. We also do not want to log into Azure for each API call. Therefore, we want to get an access token that can be used for a given service.
This section will discuss obtaining an access token using the OAuth 2.0 method. The image below is from the Microsoft document and shows that the first three steps can be executed manually once.
The image below shows my service principal called dlpsvcprn01. There is a secret associated with the service principal, but I will not share it here!
We can grant the service principal access to the APIs. The image below shows that Azure Storage and Microsoft Graph have been delegated.
At the subscription level, we must give out role-based access controls (RBAC). I over-provisioned access to make a point. I could get away with just [contributor] and [storage blob data contributor] access.
Now that the service principal has rights, we can focus on how to obtain access tokens.
The whole Python program is broken into steps or an algorithm in list form. I am going to focus on access tokens in this section. When working with service principals, you need to know the tenant id, client id, and client secret. I usually place this information in a key vault in Azure. Since I am developing with a standalone virtual machine in Azure, I will use clear text in the program. This pattern is not best practice, but I do not want to bloat the code by adding a security layer.
# # Step 0 – set variables + add libraries # # variables tenant_id = "<your tenant id>" client_id = "<your client id>" client_secret = "<your client secret>" # add libraries import requests import json from time import gmtime, strftime import time
The code below logs onto Microsoft Online using the service principal and returns a token for use with the management APIs:
# # Step 1 - get access token 4 mgmt api # # make details for [post] request headers = {'Content-Type': 'application/x-www-form-urlencoded'} url = r'https://login.microsoftonline.com/' + tenant_id + r'/oauth2/token' data = { 'grant_type':'client_credentials', 'client_id': client_id, 'client_secret': client_secret, 'resource': 'https://management.azure.com/' } # call rest api r = requests.post(url, headers=headers, data=data) # extract bearer token token1 = r.json()['access_token'] # show response print("\nstep 1 - get management token") print(" - status = {0}\n".format(r.status_code)) print(json.dumps(r.json(), indent=4, separators=(',', ': ')))
Please note the use of JSON strings, decoding JSON, and encoding JSON. This call requires a post method with a header and body. Sometimes the body needs to be encoded. It all depends on the API call.
I will not show the output variables in future screenshots. However, I wanted to point out that the variable explorer can be used to drill into objects. The variable named "token1" is for use with the management APIs. Please note that the resource is pointing to https://management.azure.com. This is important since the storage service is a different resource. Therefore, we need a different token when working with storage. Also, please note that the token expires in 3599 seconds or just shy of 1 hour.
The fifth step in my algorithm gets a token saved as a variable named "token2". This token is for use with the storage APIs. The image below shows the token that has expired.
The main point of this section is to reinforce the idea that a service principal will be used to interact with Azure Services. Since we do not want to log into each service before each call, we will use access (bearer) tokens.
Groups and Storage
System designers use resource groups to put resources that are used together in one container. The deletion of the resource group means the deletion of all objects in the group. Today, we will create a resource group and a storage account using the Azure REST API.
In the portal, you might have noticed an entry called a resource id. This is a fully qualified path to the resource. Since a tenant might have multiple subscriptions, we must select a subscription. The code below returns an array of subscriptions. It is a good thing that I only have one. Thus, I am selecting the first element of the array. Please see the documentation for details on listing subscriptions.
# # Step 2 - get subscription id # # make details for [get] request headers = {'Authorization': 'Bearer ' + token1, 'Content-Type': 'application/json'} params = {'api-version': '2016-06-01'} url = r'https://management.azure.com/subscriptions' # call rest api r = requests.get(url, headers=headers, params=params) # extract subscription subscription = r.json()['value'] [0]['subscriptionId'] # show response print("\nstep 2 - get subscription id") print(" - status = {0}\n".format(r.status_code)) print(json.dumps(r.json(), indent=4, separators=(',', ': ')))
You can see my subscription is a "pay-as-you-go" type. However, it is backed by Microsoft sponsorship since I am a data platform MVP. Thus, I get some credits each year to investigate Azure services, write articles and teach end users.
We will be using this subscription id in future API calls. The code below uses a put call to create a new resource group. Notice that the token is part of the header. It is imperative to include the API-version since the data requirements for the REST API call might change over time. Finally, the body of the call is encoded as a JSON string. Please see the documentation for details on creating a resource group.
# # Step 3 - create resource group # # make details for [put] request group = 'tipsarg4dev01' headers = {'Authorization': 'Bearer ' + token1, 'Content-Type': 'application/json'} params = {'api-version': '2021-04-01'} url = r"https://management.azure.com/subscriptions/" + subscription + "/resourcegroups/" + group data = {'location': 'eastus' } # call rest api r = requests.put(url, headers=headers, params=params, data=json.dumps(data)) # show response print("\nstep 3 - create resource group") print(" - status = {0}\n".format(r.status_code)) print(json.dumps(r.json(), indent=4, separators=(',', ': ')))
The output below shows the successful deployment of the resource group in the "eastus" location.
The code below creates a storage account. Please note that the amount of information in the body is quite large. Please see the documentation for details on creating a storage account.
# # Step 4 - create storage account # # make details for [put] request storage = 'tipsasa4dev01' headers = {'Authorization': 'Bearer ' + token1, 'Content-Type': 'application/json'} params = {'api-version': '2022-09-01'} url = r"https://management.azure.com/subscriptions/" + subscription + "/resourcegroups/" + group + '/providers/Microsoft.Storage/storageAccounts/' + storage data = { "sku": { "name": "Premium_LRS" }, "kind": "BlockBlobStorage", "location": "eastus", "properties": { "isHnsEnabled": True, "isNfsV3Enabled": False, "supportsHttpsTrafficOnly": False, "networkAcls": { "bypass": "AzureServices", "defaultAction": "Allow", "ipRules": [], "virtualNetworkRules": [] } } } # call rest api r = requests.put(url, headers=headers, params=params, data=json.dumps(data)) # show response print("\nstep 4 - create storage account") print(" - status = {0}\n".format(r.status_code))
The output of the code is shown below. It is quite small. The documentation from Mozilla shows the various HTTPS status codes. A code of 202 means the command has been accepted, but not acted upon. Azure is queue oriented. Eventually, the action on storage will be pulled from the queue and acted upon.
Some actions are done with the management API, while others are done with the storage API. The creation of the storage container and uploading a file will be done using the latter.
Containers and Files
Not only do storage actions on containers or files use a different resource endpoint, but two fields are required in the header for each call. The "x-ms-version" describes the version of the API service to use. The "x-ms-date" is the date of the request. There is a window of time in which the call can be made. Old dates will generate a "403" status code. Optionally, you can pass an "x-ms-meta-name" field to describe what is being done. Please note that the second access token is being used in this section.
# # Step 6 - create storage container # # date/time has to be close msdate = time.strftime("%a, %d %b %Y %H:%M:%S GMT", time.gmtime()) # make details for put request storage = 'tipsasa4dev01' container = 'tipsasc4dev01' headers = {'Content-Type': 'application/json', 'Authorization': 'Bearer ' + token2, 'x-ms-version': '2021-12-02', 'x-ms-date': msdate, 'x-ms-meta-name': 'MS SQL TIPS - Create Container', } url = r"https://" + storage + ".blob.core.windows.net/" + container + r"?restype=container" # call rest api r = requests.put(url, headers=headers) # show response print("\nstep 6 - create storage container") print(" - status = {0}\n".format(r.status_code))
The image below shows the requested storage container that has been created. Please see the documentation for details on creating a storage container.
I decided to upload a plain text file to blob storage. In short, you can read the text file off your local storage and save it as a blob file. Since JSON files are text, this works fine with our business case. If you want to use parquet, you will need to change the content type and read the body in as binary.
# # Step 7 - create blob (text) file # # make details for put request filename = "mssqltips.txt" filecontent = "Write Azure Rest API calls with Python + Json." headers = {'Content-Type': 'text/plain; charset=UTF-8', 'Content_length': str(len(filecontent)), 'Authorization': 'Bearer ' + token2, 'x-ms-version': '2021-12-02', 'x-ms-date': msdate, 'x-ms-meta-name': 'MS SQL TIPS - Create Blob File', 'x-ms-blob-type': 'BlockBlob' } url = r"https://" + storage + ".blob.core.windows.net/" + container + r"/" + filename # call rest api r = requests.put(url, headers=headers, data=filecontent) # show response print("\nstep 7 - create blob file") print(" - status = {0}\n".format(r.status_code))
Executing the above code produces the following output. Please see the documentation for details on creating a blob file.
To date, we have not seen the fruits of our labor. The image below was taken from the Azure Portal. The text file resides in the resource group, storage account, and storage container we created via REST API calls.
The last step is to delete this prototype infrastructure. This can be done using the management API token. The code below uses the DELETE action of the requests object to remove the container.
# # Step 8 - delete resource group # # make details for [delete] request group = 'tipsarg4dev01' headers = {'Authorization': 'Bearer ' + token1, 'Content-Type': 'application/json'} params = {'api-version': '2021-04-01'} url = r"https://management.azure.com/subscriptions/" + subscription + "/resourcegroups/" + group # call rest api r = requests.delete(url, headers=headers, params=params) # show response print("step 8 - delete resource group") print(" - status{0}\n".format(r.status_code))
The image below shows the output from executing the program. Please see the documentation for details on deleting a resource group.
Again, everything in Azure takes time to execute. I caught the resource in a deleting state on the Azure Portal.
User-defined Python programs that use the "requests" and "JSON" libraries can effectively deploy infrastructure using the Azure REST APIs.
Summary
The JavaScript Object Notation specification was developed and used in early 2001 by Douglas Crockford and Chip Morningstar. The first application of specification was a web application that used JSON objects to keep track of the session state at the server. In December 2005, Yahoo started offering web services using JSON as a way to communicate. Currently, JSON is used all the time by cloud vendors who offer web services.
Today, we talked about how to manually create JSON documents. The "loads" method of the "json" library for Python decodes a string into a native object. The "dumps" method of the same library will take a Python object and encode it as a string. Can bad JSON be harmful to programs that developers write? The answer is yes. Like SQL injection, a hacker can change the code and cause unwanted damage. For instance, I wanted to write a web service that takes a SQL statement and executes the action. It was designed to insert data into a table. However, the service account that runs against the database was given ownership writes. This scenario is an accident waiting to happen. Ensure your web servers are safe against JSON injection by sanitizing the data and using the principle with the least security needed for an action.
Python programs can be written to deploy infrastructure using the Azure REST API. JSON strings pass data back and forth from the web services at the lowest level. Would I use this technique every day? The answer is no. REST API programming is the lowest level of communication that can be achieved. There exist Azure SDK libraries for Python that create a level of abstraction. In short, the programmer does not need to worry about the nitty-gritty details of web calls. I often deploy infrastructure using the Azure Command Line Interface for interactive deployments and ARM templates for pipeline deployments. Does knowing REST API programming come in handy? The answer is yes. Certain programming frameworks, such as Azure Data Factory (ADF), might not have built-in functionality for SharePoint in Office 365. However, ADF does have support for web calls. Therefore, you can upload and download files from SharePoint using the REST API.
Next Steps
- The Python programs for JSON and REST are being shared via this link. Just download the code and start your learning from this Python tutorial.
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2023-05-18