By: Andrea Gnemmi | Updated: 2024-07-11 | Comments | Related: > PostgreSQL
Problem
We have all encountered the need to store non-structured or semi-structured data in an RDBMS; XML or JSON data in particular. This can be complicated, especially in the past with limited technical options, and even more complicated if we want to query this data efficiently.
Solution
PostgreSQL offers two different data types, JSON and JSONB, as well as many operators in order to query, index, and work efficiently with JSON. It is probably the best JSON management of all RDBMSs.
This tip will review the most important syntaxes, features, and possibilities of the PostgreSQL JSON and JSONB data types.
JSON and JSONB Data Types
PostgreSQL has two different data types to manage JSON data:
- JSON - The oldest and less feature-rich, and
- JSONB - Offers the possibility to apply indexes to the data type and other specific operator
In terms of differences, the JSON data type stores an exact copy of the data that needs to be parsed when accessed while JSONB stores a decomposed binary version of the data which is slower to insert, but faster to process because it is not reprocessed. There is also a third data type in PostgreSQL related to JSON, JSONPATH, but this tip will not analyze it.
So, let's begin by showing an example of adding a JSON column in a table, as well as the operators we can use, and then query these data types. As always, we will use the Chinook database, a GitHub free downloadable database available in multiple RDBMS formats. It is a simulation of a digital media store with some sample data. All you need to do is download the required version, which includes all the scripts for data structure and inserts for data.
-- MSSQLTips.com create table if not exists orders (order_id bigint generated always as identity (start with 1 increment by 1) not null primary key, order_details jsonb not null);
We can take a look at this newly created table in pgAdmin and then insert some data in it:
-- MSSQLTips.com insert into orders (order_details) values ('{"customer": "Astrid Gruber", "track":"For Those About To Rock (We Salute You)"}'), ('{"customer": "Astrid Gruber", "track":"Love In An Elevator"}'), ('{"customer": "Alexandre Rocha", "track":"Whole lotta love"}');
All is fine so far; we have inserted some correctly parsed JSON data. Let's try now to insert some not correctly formed JSON:
-- MSSQLTips.com insert into orders (order_details) values ('{"customer": "Alexandre Rocha", "track":"For Those About To Rock (We Salute You)"');
We receive an error as the string is not correctly ended for a JSON! That is amazing! Without relying on any complicated parsing function, we have a "free" check of our data!
JSON Operators
But it's not over! There are many JSON operators available to query this data!
Let's introduce the first:
->
This operator extracts the JSON object field with the given key. Here's an example:
-- MSSQLTips.com select order_details -> 'track' as track from orders;
As you can see, it is very easy to extract an object from the JSON. Notice that,
in this case, the resulting data is still in JSONB data type. Anyway, we can do
more. We can use the object to filter our query, in this case, extracting it as
a text data type with the operator
->>
:
-- MSSQLTips.com select order_details -> 'track' as track_json, order_details ->> 'track' as track_text from orders where order_details ->> 'customer' like 'Astrid%';
As you can see, I used both operators to compare the results with different data types.
These first two operators are valid in both JSON and JSONB data types. But, there
are some specific ones valid only with JSONB data type. An example of this
is the ?
, which is interesting
in that it returns a Boolean if the specific text is a key inside our JSON. Let's
see it in action.
-- MSSQLTips.com insert into orders (order_details) values ('{"track":"For Those About To Rock (We Salute You)"}');
We first inserted a new row in the table without the customer key in the JSON. Now, we can query the table again and check only the rows where this key is present:
-- MSSQLTips.com select order_details ->> 'customer' as customer from orders where order_details ? 'customer';
As expected, the row that we just inserted is not returned by the query. This is just an example of the operators that I mostly used. I will share a complete list of the links with the official PostgreSQL documentation.
JSON Processing and Testing Functions
Obviously, there are some functions used to test and process JSON and JSONB data
types. Let's make a couple of examples of two interesting processing functions
like jsonb_each
(and the
equivalent for JSON data type named
json_each
):
-- MSSQLTips.com select jsonb_each(order_details) from orders;
As we can see with this function, we have the JSON broken down with all keys
and values, which is a nice feature for exploring what's in it. As the result
set is a record type, we can represent it as a recordset result, maintaining the
same data type JSONB for the values or converting it to a TEXT data type using
jsonb_each_text
:
-- MSSQLTips.com select key,value from orders, jsonb_each_text(order_details);
We need to use this particular syntax to select directly from the jsonb_each_text output, and then the result set will be formatted nicely.
But there are more functions like
jsonb_object_keys
, which
will return all the keys of a JSON:
-- MSSQLTips.com select order_id, jsonb_object_keys(order_details) from orders;
As expected, the query returns only seven rows since the last row inserted has only the "track" key in the JSON.
Starting with version 16 of PostgreSQL, we also have a testing function,
IS JSON
, similar to the
function found in many other RDBMS (including SQL Server and Oracle). This function
helps recognize if a string is JSON:
-- MSSQLTips.com select order_details is json as ISJSON from orders;
As expected, the result is always true as the order_details column is a JSONB. But, below is the result when we run the function against the string we tried to insert in one of the first examples above, receiving an error:
-- MSSQLTips.com select ('{"customer": "Alexandre Rocha", "track":"For Those About To Rock (We Salute You)"') is json as ISJSON;
As expected, it returns as false since it is not JSON.
JSONB Indexing
Last but not least, a JSONB data type can be indexed, greatly helping performance when using all the JSON operators. The index type that is used is GIN. (I will write more in-depth about indexes in PostgreSQL in another tip series.) Let's check out an example by first inserting some additional rows:
-- MSSQLTips.com insert into orders (order_details) select cast('{"customer":'||'"'||"FirstName"||' '||"LastName"||'", "track":'||'"'||"Name"||'"}' as jsonb) from "Finance"."Customer" cross join "Track" where "GenreId" in (1,3,4,13) and "Name" not like '%"%';
As you can see, I filtered out the track names containing the JSON escape character
("
) to simplify the
insert, and I explicitly cast the text resulting from the query into a JSONB data
type. With a separate query, I added some more rows using the same method to have
a table of 168K rows, which is not big, but is at least significant.
We can now query our table using the JSONB operator
@>
in the filter. This
operator, specific to a JSONB data type, returns true if the correct JSON is contained
in the left one, thus perfect for a filter:
-- MSSQLTips.com select order_details -> 'track' as track from orders where order_details @> '{"customer": "Astrid Gruber"}';
Let's run the Explain Analysis of this query to obtain the plan used:
As expected, since there are no indexes, we have a sequential scan of the entire table.
Now, we can create an index on the JSONB column:
-- MSSQLTips.com create index idx_test_jsonb on orders using gin (order_details);
We can now check again the Explain Analysis to see the plan used by the optimizer to return the query:
-- MSSQLTips.com select order_details -> 'track' as track from orders where order_details @> '{"customer": "Astrid Gruber"}';
The query now uses the index created for the filter, making the query run a little bit faster.
Next Steps
In this tip, we have reviewed some of the most interesting features of JSON data type management in PostgreSQL.
Here are links to the official PostgreSQL documentation:
Here are some other tips regarding JSON:
- ISJSON enhancements in SQL Server for valid JSON format for a Value, Array or Object
- JSON Support in SQL Server 2016
- Determine if a Specific Path in a JSON Document Exists in SQL Server
- Advanced JSON Techniques in SQL Server - Part 1
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2024-07-11