Compare SQL Server, MySQL and PostgreSQL Features

By: Maria Zakourdaev | Updated: 2018-10-29 | Comments (11) | Related: > PostgreSQL

Problem

SQL Server DBAs are absolutely sure that SQL Server is the best relational database. Ever. I have never had any doubt myself.

What should you do if your company wants to cut costs and put some data assets into an open-source RDBMS? From a Data Engineer/ Data Architect point of view, which one should you suggest to use? Opensource=free. Which is unfortunately, a very attractive option for the decision makers who are responsible for the company budget.

Moreover, I have always wondered how MySQL and PostgreSQL are different? Which features does each offer and what features don't they offer?

Solution

In order to understand, how to choose between these database systems, I have spent some time and read a few manuals. You can see below the mapping that I have made for myself in order to understand the differences. I have added SQL Server to the comparison as well.

I would be happy to see comments posted for this tip and add more things to the mappings below. Furthermore, do not hesitate to post a comment if the information below is not correct or not relevant anymore in the new product versions.

In general, both MySQL and PostgreSQL databases fully support ACID and can handle large amounts of data and high levels of query concurrency.

PostgreSQL is more feature-rich and extensible and maybe a better choice for extreme cases.
MySQL is much more popular, suits web applications and e-commerce projects, there are much more blogposts/support/documentation then for PostgreSQL.
For environments with high number of connections - PostgreSQL might need a lot of memory, because each connection has its own memory. However, there are solutions to overcome this issue, like PgBouncers, external connection pools.
Manual partition management in PostgreSQL requires too much overhead and updates that move rows from one partition to another will fail.
There is 1 CPU per query limitation in MySQL and only the nested-loop join algorithm which make MySQL a less optimal choice for data warehouse systems.
If check constraint functionality is important - there are none in MySQL.

Comparison of MySQL vs. PostgreSQL vs. SQL Server

In regards to the MySQL features, I will be mainly be referring to the InnoDB storage engine.

General information for MySQL, PostgreSQL and SQL Server

	MySQL	PostgreSQL	SQL Server
Maturity	Initial release was in 1995	Initial release was in 1989	MSMS SQL Server for OS/2 was released in 1989 (together with Sybase) SQL Server 6.0 was released in 1995 marking the end of collaboration with Sybase.
Language	Written in C, has a few C++ modules	Written in C	Mostly C++ with a few exceptions
Cost	Open source / Owned by Oracle and has several paid editions	Completely free / Open source	SQL Server Express is a free edition, but it is limited to using 1 processor, 1 GB memory and 10 GB database files.

Data changes for MySQL, PostgreSQL and SQL Server

	MySQL	PostgreSQL	SQL Server
Row Updates	Updates happen in place, changed data is copied to the rollback segment. This makes vacuuming and index compaction very efficient. MySQL is slower for reads, but writes are atomic and if columns in a secondary index change, this does not require changes to all indexes.	Updates are being implemented as inserts + mark as delete for vacuum. All indexes have a link to the physical id of the row. This has an update amplifying effect because when the column gets updated, new row with new physical id gets created and all indexes require updates, even those which are not referring to the changed column to get a pointer to the new row physical id.	Row-Store database engine: In-Memory database engine: updates implemented as insert + mark for delete. Garbage collector is not non-blocking and parallel Columnstore database engine: in-place updates
Vacuum / Defragmentation	Vacuuming and index compaction are very efficient.	Vacuum performs full tables scans to find the deleted rows and quite heavy process/might impact users’ workload.	In-memory garbage collector might add max ~15% overhead, usually much less.

Querying the data for MySQL, PostgreSQL and SQL Server

	MySQL	PostgreSQL	SQL Server
The buffer pool / cache that serves queries	MySQL cache that serves user queries is called a buffer pool. This cache can be set to the size as large as needs, leaving only enough memory for other processes on the server. You can split the buffer pool into multiple parts to minimize contention for memory structures and you can pin tables to the buffer pool. Table scan or mysqldump evicts older data.	PostgreSQL maintains shared memory for data pages and, due to the fact that it is a process-based system, each connection has a native OS process of its own and has its own memory. Process is releasing the memory after the execution has finished. Therefore, has problems scaling past hundreds of active connections.	SQL Server memory is called buffer pool and its size can be set as large as needed, no option to set multiple buffer pools.
Constraints support	Supports primary keys, foreign keys, not-null constraints, unique constraints, default constraints, does not support CHECK constraints	Supports primary keys, foreign keys, not-null constraints, check constraints, unique constraints, default constraints, exclusion constraints	Supports primary keys, foreign keys, not-null constraints, check constraints, unique constraints, default constraints
Temporary tables	Supports CTE, No support for global temp tables (available outside the session scope) and no table variables. Interesting fact: You cannot refer to a TEMPORARY table more than once in the same query. For example, the following does not work: SELECT * FROM temp_table JOIN temp_table AS t2;	Supports CTE, Global and local temporary tables and table variables (using table name as a type name). Interesting fact: if you create two tables with the same name, one is temporary and another one is regular table CREATE TEMP TABLE X (…) and CREATE TABLE X (…), "select * from x" will always bring data from temporary table.	Supports CTE, Global and local temporary tables and table variables.
Window / Analytical functions	Supports: CUME_DIST, FIRST_VALUE, LAG, LAST_VALUE, LEAD, PERCENT_RANK, ROW_NUMBER, RANK, DENSE_RANK, NTILE, NTH_VALUE No PERCENTILE_CONT, PERCENTILE_DISC functions.	Supports functions: CUME_DIST, FIRST_VALUE, LAG, LAST_VALUE, LEAD, PERCENTILE_CONT, PERCENTILE_DISC, PERCENT_RANK, ROW_NUMBER, RANK, DENSE_RANK, NTILE, NTH_VALUE	Supports functions: CUME_DIST, FIRST_VALUE, LAG, LAST_VALUE, LEAD, PERCENTILE_CONT, PERCENTILE_DISC, PERCENT_RANK, ROW_NUMBER, RANK, DENSE_RANK, NTILE. Yet no NTH_VALUE function
Parallel query execution	MySQL will usually use 1 CPU per query.	Query plans can leverage multiple CPUs	Query plans can leverage multiple CPUs
Indexes	Supports index-organized tables - clustered indexes. Does not support persisted indexes / materialized views	Supports index-organized table, but updates are manual until ProstgreSQL 11 when it is automatic. Supports persisted indexes/materialized views.	Supports index-organized tables - clustered indexes that automatically maintains rows order.
Multiple indexes usage in single query	Multiple indexes might be used for the single query.	Multiple indexes might be used for the single query. If we have separate indexes on x and y, one possible implementation of a query like WHERE x = 5 AND y = 6 is to use each index with the appropriate query clause and then AND together the index results to identify the result rows.	Multiple indexes might be used for a single query (index intersection feature).
Multicolumn indexes	Multi-column indexes can have up to 16 columns	Multi-column indexes can have up to 32 columns	Multi-column indexes can have up to 16 columns
Partial indexes (an index built over a subset of a table using filter)	Does not support partial indexes	Supports partial indexes	Supports partial indexes
Join algorithms	MySQL executes joins between tables using only a nested-loop algorithm or variations of it.	Supports nested-loop joins, Hash joins and merge joins algorithms.	Supports nested-loop joins, hash joins and merge joins algorithms.
Query execution plan reuse	Maintains caches for prepared statements and stored programs on a per-session basis. Statements cached for one session are not accessible to other sessions.	Caches query plans only as long as the prepared statement is open. The query plan is disposed when the prepared statement is closed.	Has shared execution plan cache to enable queries to reuse execution plans
Statistics	Maintains persistent and non-persistent statistics (cleared on server restart)	Maintains statistics used by the planner, they are being updated by ANALYZE or VACUUM or CREATE INDEX	Maintains persistent statistics
Memory-optimized tables	MySQL has got an ability to store tables in memory. The tables that are created in memory do not support transactions, their data is vulnerable to crashes. Those tables should be used as a temporary area or as a read-only caches.	Does not offer any in-memory engine.	In-memory OLTP is integrated into SQL Server’s database engine
Columnstore or row- store	MariaDB have recently launched the column store engine for MySQL which was designed as a massively parallel database in an environment with multiple servers. It can be used instead of InnoDB storage engine.	Row-store. Does not offer any columnar storage engine.	SQL Server offers column store indexes to query large tables

JSON and Data Type Support for MySQL, PostgreSQL and SQL Server

	MySQL	Postgresql	SQL Server
JSON data type	MySQL has JSON data type support and also supports in place partial updates over the JSON instead of replacing the whole document however there are many limitations. It does not support indexing for JSON but there are workarounds.	PostgreSQL supports JSON data type and supports partial updates	SQL Server supports JSON data type and supports partial updates
Additional Advanced data types	Supports Geospatial data type. No user-defined types.	Supports Geospatial and lots of advanced data types, such as multi-dimensional arrays, user-defined types, etc.	Supports Geospatial data type, Hierarchical data

Sharding / Partitioning / Replication for MySQL, PostgreSQL and SQL Server

	MySQL	PostgreSQL	SQL Server
Partitioning support	Supports HASH partitioning (use HASH function on any column to split table into N partitions), RANGE or LIST partitioning that can be based on several columns and KEY partitioning which is similar to HASH but based on some auto generated number.	Supports RANGE and LIST partitioning but partitions and indexes on them must be manually created and old-style partitioning via table inheritance (when querying the parent table, all children tables are being queries as well, children tables have constraints on partitioning column. Interesting fact: Children tables can have more columns that parent table and indexes must be applied separately on children tables.)	Supports RANGE partitioning.
Sharding support	No good sharding implementation (MySQL Cluster is rarely deployed due to many limitations)	There are dozens of forks of Postgres which implement sharding but none of them yet haven’t been added to the community release.	No standard sharding implementation.
Replication	Master-slave replication based on statements or based on changed rows Group replication with master server automatic election	Master - slave replication based on changed rows and log shipping.	Database level: Availability Groups master-multiple slaves Log shipping On data level: Master-slave / Bi-directional master-slave/ and master-master (merge) replication

Next Steps

What are pros and cons of PostgreSQL and MySQL? With respect to reliability, speed, scalability, and features: https://www.quora.com/What-are-pros-and-cons-of-PostgreSQL-and-MySQL-With-respect-to-reliability-speed-scalability-and-features
MySQL partitioning: https://www.w3resource.com/mysql/mysql-partition.php
Creating linked servers from SQL Server to MySQL: https://www.mssqltips.com/sqlservertip/4577/create-a-linked-server-to-mysql-from-sql-server
MySQL to SQL Server coding differences: https://www.mssqltips.com/sqlservertutorial/2204/mysql-to-sql-server-coding-differences
Creating linked server from SQL Server to PostgreSQL: https://www.mssqltips.com/sqlservertip/3662/sql-server-and-postgresql-linked-server-configuration--part-2
Creating two way data access between SQL Server and PostgreSQL: https://www.mssqltips.com/sqlservertip/3661/creating-twoway-data-access-between-sql-server-and-postgresql--part-1

About the author

Maria Zakourdaev has been working with SQL Server for more than 20 years. She is also managing other database technologies such as MySQL, PostgreSQL, Redis, RedShift, CouchBase and ElasticSearch.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips

Article Last Updated: 2018-10-29

Monday, November 5, 2018 - 12:18:18 PM - Justin Patterson	Back To Top (78159)
Sharding is mispelled in the Postgres section