Compare SQL Server, MySQL and PostgreSQL Features

By:   |   Updated: 2018-10-29   |   Comments (11)   |   Related: > PostgreSQL


Problem

SQL Server DBAs are absolutely sure that SQL Server is the best relational database. Ever. I have never had any doubt myself.

What should you do if your company wants to cut costs and put some data assets into an open-source RDBMS? From a Data Engineer/ Data Architect point of view, which one should you suggest to use? Opensource=free. Which is unfortunately, a very attractive option for the decision makers who are responsible for the company budget.

Moreover, I have always wondered how MySQL and PostgreSQL are different? Which features does each offer and what features don't they offer?

Solution

In order to understand, how to choose between these database systems, I have spent some time and read a few manuals. You can see below the mapping that I have made for myself in order to understand the differences. I have added SQL Server to the comparison as well.

I would be happy to see comments posted for this tip and add more things to the mappings below. Furthermore, do not hesitate to post a comment if the information below is not correct or not relevant anymore in the new product versions.

In general, both MySQL and PostgreSQL databases fully support ACID and can handle large amounts of data and high levels of query concurrency.

  • PostgreSQL is more feature-rich and extensible and maybe a better choice for extreme cases.
  • MySQL is much more popular, suits web applications and e-commerce projects, there are much more blogposts/support/documentation then for PostgreSQL.
  • For environments with high number of connections - PostgreSQL might need a lot of memory, because each connection has its own memory. However, there are solutions to overcome this issue, like PgBouncers, external connection pools.
  • Manual partition management in PostgreSQL requires too much overhead and updates that move rows from one partition to another will fail.
  • There is 1 CPU per query limitation in MySQL and only the nested-loop join algorithm which make MySQL a less optimal choice for data warehouse systems.
  • If check constraint functionality is important - there are none in MySQL.

Comparison of MySQL vs. PostgreSQL vs. SQL Server

In regards to the MySQL features, I will be mainly be referring to the InnoDB storage engine.

General information for MySQL, PostgreSQL and SQL Server

MySQL PostgreSQL SQL Server
Maturity Initial release was in 1995 Initial release was in 1989 MSMS SQL Server for OS/2 was released in 1989 (together with Sybase)

SQL Server 6.0 was released in 1995 marking the end of collaboration with Sybase.
Language Written in C, has a few C++ modules Written in C Mostly C++ with a few exceptions
Cost Open source / Owned by Oracle and has several paid editions Completely free / Open source SQL Server Express is a free edition, but it is limited to using 1 processor, 1 GB memory and 10 GB database files. 

Data changes for MySQL, PostgreSQL and SQL Server

MySQL PostgreSQL SQL Server
Row Updates Updates happen in place, changed data is copied to the rollback segment. This makes vacuuming and index compaction very efficient. MySQL is slower for reads, but writes are atomic and if columns in a secondary index change, this does not require changes to all indexes.   Updates are being implemented as inserts + mark as delete for vacuum. All indexes have a link to the physical id of the row. This has an update amplifying effect because when the column gets updated, new row with new physical id gets created and all indexes require updates, even those which are not referring to the changed column to get a pointer to the new row physical id. Row-Store database engine:

In-Memory database engine: updates implemented as insert + mark for delete. Garbage collector is not non-blocking and parallel

Columnstore database engine: in-place updates 
Vacuum / Defragmentation Vacuuming and index compaction are very efficient. Vacuum performs full tables scans to find the deleted rows and quite heavy process/might impact users’ workload. In-memory garbage collector might add max ~15% overhead, usually much less.

Querying the data for MySQL, PostgreSQL and SQL Server

MySQL PostgreSQL SQL Server
The buffer pool / cache that serves queries MySQL cache that serves user queries is called a buffer pool. This cache can be set to the size as large as needs, leaving only enough memory for other processes on the server. You can split the buffer pool into multiple parts to minimize contention for memory structures and you can pin tables to the buffer pool. Table scan or mysqldump evicts older data. PostgreSQL maintains shared memory for data pages and, due to the fact that it is a process-based system, each connection has a native OS process of its own and has its own memory. Process is releasing the memory after the execution has finished. Therefore, has problems scaling past hundreds of active connections. SQL Server memory is called buffer pool and its size can be set as large as needed, no option to set multiple buffer pools.
Constraints support Supports primary keys, foreign keys, not-null constraints, unique constraints, default constraints, does not support CHECK constraints Supports primary keys, foreign keys, not-null constraints, check constraints, unique constraints, default constraints, exclusion constraints Supports primary keys, foreign keys, not-null constraints, check constraints, unique constraints, default constraints
Temporary tables Supports CTE, No support for global temp tables (available outside the session scope) and no table variables.

Interesting fact: You cannot refer to a TEMPORARY table more than once in the same query. For example, the following does not work: SELECT * FROM temp_table JOIN temp_table AS t2;
Supports CTE, Global and local temporary tables and table variables (using table name as a type name).

Interesting fact: if you create two tables with the same name, one is temporary and another one is regular table CREATE TEMP TABLE X (…) and CREATE TABLE X (…), "select * from x" will always bring data from temporary table.
Supports CTE, Global and local temporary tables and table variables.
Window / Analytical functions Supports:

CUME_DIST, FIRST_VALUE, LAG, LAST_VALUE, LEAD, PERCENT_RANK, ROW_NUMBER, RANK, DENSE_RANK, NTILE, NTH_VALUE No PERCENTILE_CONT, PERCENTILE_DISC functions.
Supports functions:

CUME_DIST, FIRST_VALUE, LAG, LAST_VALUE, LEAD, PERCENTILE_CONT, PERCENTILE_DISC, PERCENT_RANK, ROW_NUMBER, RANK, DENSE_RANK, NTILE, NTH_VALUE
Supports functions:

CUME_DIST, FIRST_VALUE, LAG, LAST_VALUE, LEAD, PERCENTILE_CONT, PERCENTILE_DISC, PERCENT_RANK, ROW_NUMBER, RANK, DENSE_RANK, NTILE. Yet no NTH_VALUE function
Parallel query execution MySQL will usually use 1 CPU per query. Query plans can leverage multiple CPUs Query plans can leverage multiple CPUs
Indexes Supports index-organized tables - clustered indexes.

Does not support persisted indexes / materialized views  
Supports index-organized table, but updates are manual until ProstgreSQL 11 when it is automatic.

Supports persisted indexes/materialized views.
Supports index-organized tables - clustered indexes that automatically maintains rows order.
Multiple indexes usage in single query Multiple indexes might be used for the single query. Multiple indexes might be used for the single query. If we have separate indexes on x and y, one possible implementation of a query like WHERE x = 5 AND y = 6 is to use each index with the appropriate query clause and then AND together the index results to identify the result rows. Multiple indexes might be used for a single query (index intersection feature).
Multicolumn indexes Multi-column indexes can have up to 16 columns Multi-column indexes can have up to 32 columns Multi-column indexes can have up to 16 columns
Partial indexes (an index built over a subset of a table using filter) Does not support partial indexes Supports partial indexes Supports partial indexes
Join algorithms MySQL executes joins between tables using only a nested-loop algorithm or variations of it. Supports nested-loop joins, Hash joins and merge joins algorithms. Supports nested-loop joins, hash joins and merge joins algorithms.
Query execution plan reuse Maintains caches for prepared statements and stored programs on a per-session basis. Statements cached for one session are not accessible to other sessions. Caches query plans only as long as the prepared statement is open. The query plan is disposed when the prepared statement is closed. Has shared execution plan cache to enable queries to reuse execution plans
Statistics Maintains persistent and non-persistent statistics (cleared on server restart) Maintains statistics used by the planner, they are being updated by ANALYZE or VACUUM or CREATE INDEX Maintains persistent statistics
Memory-optimized tables MySQL has got an ability to store tables in memory. The tables that are created in memory do not support transactions, their data is vulnerable to crashes. Those tables should be used as a temporary area or as a read-only caches. Does not offer any in-memory engine. In-memory OLTP is integrated into SQL Server’s database engine
Columnstore or row- store MariaDB have recently launched the column store engine for MySQL which was designed as a massively parallel database in an environment with multiple servers. It can be used instead of InnoDB storage engine. Row-store. Does not offer any columnar storage engine. SQL Server offers column store indexes to query large tables

JSON and Data Type Support for MySQL, PostgreSQL and SQL Server

MySQL Postgresql SQL Server
JSON data type MySQL has JSON data type support and also supports in place partial updates over the JSON instead of replacing the whole document however there are many limitations. It does not support indexing for JSON but there are workarounds. PostgreSQL supports JSON data type and supports partial updates SQL Server supports JSON data type and supports partial updates
Additional Advanced data types Supports Geospatial data type. No user-defined types. Supports Geospatial and lots of advanced data types, such as multi-dimensional arrays, user-defined types, etc. Supports Geospatial data type, Hierarchical data

Sharding / Partitioning / Replication for MySQL, PostgreSQL and SQL Server

MySQL PostgreSQL SQL Server
Partitioning support Supports HASH partitioning (use HASH function on any column to split table into N partitions), RANGE or LIST partitioning that can be based on several columns and KEY partitioning which is similar to HASH but based on some auto generated number. Supports RANGE and LIST partitioning but partitions and indexes on them must be manually created and old-style partitioning via table inheritance (when querying the parent table, all children tables are being queries as well, children tables have constraints on partitioning column. Interesting fact: Children tables can have more columns that parent table and indexes must be applied separately on children tables.) Supports RANGE partitioning.
Sharding support No good sharding implementation (MySQL Cluster is rarely deployed due to many limitations) There are dozens of forks of Postgres which implement sharding but none of them yet haven’t been added to the community release. No standard sharding implementation.
Replication Master-slave replication based on statements or based on changed rows

Group replication with master server automatic election
Master - slave replication based on changed rows and log shipping. Database level: Availability Groups master-multiple slaves

Log shipping

On data level: Master-slave / Bi-directional master-slave/ and master-master (merge) replication
Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Maria Zakourdaev Maria Zakourdaev has been working with SQL Server for more than 20 years. She is also managing other database technologies such as MySQL, PostgreSQL, Redis, RedShift, CouchBase and ElasticSearch.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2018-10-29

Comments For This Article




Monday, July 22, 2019 - 9:56:10 AM - Cergey Chaulin Back To Top (81833)

A few notes:

As for row updates in sql server, it is not entirely true. Often row updates are in-place - https://sqlinthewild.co.za/index.php/2011/06/21/are-all-updates-split-into-delete-insert/.

Also in sql server, there are user defined types, aggregates (with clr support). Procedures and functions can also be done using CLR.

As for MySQL, MariaDB (does it count?) has hash joins.


Monday, November 5, 2018 - 1:32:27 PM - Phillip Novak Back To Top (78161)

While the comment on MSSQL, " no option to set multiple buffer pools " is technically correct in that there is no "buffer pool" command. MSSQL does offer multiple buffer pools when implementing the Resource Governor feature. 

 


Monday, November 5, 2018 - 12:42:57 PM - Greg Robidoux Back To Top (78160)

Thanks Justin, we have updated the tip.

-Greg


Monday, November 5, 2018 - 12:18:18 PM - Justin Patterson Back To Top (78159)

Sharding is mispelled in the Postgres section 


Monday, November 5, 2018 - 9:19:55 AM - adm Back To Top (78157)

This statement is no longer true: "Manual partition management in PostgreSQL requires too much overhead and updates that move rows from one partition to another will fail"

There are significant partitioning enhancements in Postgresql 11:

"Improvements to partitioning functionality, including:
- Add support for partitioning by a hash key
- Add support for PRIMARY KEY, FOREIGN KEY, indexes, and triggers on partitioned tables
- Allow creation of a “default” partition for storing data that does not match any of the remaining partitions
- UPDATE statements that change a partition key column now cause affected rows to be moved to the appropriate partitions
- Improve SELECT performance through enhanced partition elimination strategies during query planning and execution"

https://www.postgresql.org/docs/11/static/release-11.html

 


Monday, November 5, 2018 - 3:39:36 AM - IJeb Reitsma Back To Top (78155)

This statement is wrong:

SQL Server: "No multiple index support in a single query." 

I have seen many query plans where the database engine uses two indexes on a single table, combining the results after the search.


Wednesday, October 31, 2018 - 12:18:03 AM - gwise Back To Top (78119)

 UPDATE statements that change a partition key column now cause affected rows to be moved to the appropriate partitions 

https://www.postgresql.org/docs/11/static/release-11.html

 


Tuesday, October 30, 2018 - 4:15:21 AM - Sergey Ch Back To Top (78111)

 Thank you Maria, you did a great job approving MSSQL is the best RDBMS one more time :)

 

Although there are several features I liked in MySQL and Postgre, it still looks like a clear win by MSSQL in total.

 


Monday, October 29, 2018 - 10:39:19 AM - Maria Back To Top (78109)

 Thank you @William Meitzen for your contribution! I will fix that section 

 


Monday, October 29, 2018 - 10:06:57 AM - Adel Yousuf Back To Top (78108)

Hi Maria

Thanks for this great topic, but I have a question "Where is Oracle in this comparison"?

 

Regards


Monday, October 29, 2018 - 8:22:13 AM - William Meitzen Back To Top (78106)

PostgreSQL supports index-organized tables (clustered tables).  SQL Server keeps the clustered order automatically, PostgreSQL must be told to update the clustered order ("cluster table_name").

 

PostgreSQL has temporary tables, but not table variables.















get free sql tips
agree to terms