PostgreSQL Compatibility in YugabyteDB 2.0
The team at Yugabyte and members of the community were excited to announce the general availability of YugabyteDB 2.0 this week. One of the flagship features of the release was the production readiness of the PostgreSQL compatible, Yugabyte SQL API (YSQL). In other blogs we covered Jepsen testing results, new performance benchmarks and ecosystem integrations including the GraphQL projects Hasura and Prisma. In this post we’d like to take the opportunity to dive a little deeper into the PostgreSQL compatibility features to demonstrate what’s supported and what’s around the corner.
In a nutshell, YugabyteDB 2.0 supports:
- Distributed transactions
- Serializable and snapshot isolation
- All simple and complex data types
- Foreign keys
- Stored procedures
- Most built-in functions including window functions and various expressions
- Simple and complex queries including JOINs, UPSERTs, etc
- Native JSONB support
- Auto-increment ids
- Indexes and Views
- Authorization & access control
- Plus various extensions, with more on the way!
Almost every large organization these days has (or is developing a strategy) on how to use less (not more) of monolithic SQL databases like Oracle and SQL Server for new applications. A lot of the motivation to look at distributed SQL alternatives revolves around cost, the migration of workloads to the cloud, the adoption of cloud-native technologies and a move away from proprietary licenses to 100% open source databases.
At Yugabyte we decided early on that instead of creating a brand new flavor of SQL, that we’d aim for PostgreSQL syntax and wire-protocol compatibility instead. Why? As firm believers in open source and open APIs, building a proprietary SQL API was not even an option for us. We selected PostgreSQL because we were truly impressed by the openness, maturity and feature completeness of the query layer. These attributes have created a highly passionate and thriving community of users. Industry analysts are also observing these same trends that have led to a big resurgence of PostgreSQL, especially over the last 5 years. Matt Asay’s post “Why Oracle’s missteps have led to PostgreSQL’s ‘moment’ in the database market” analyzes the trends in a fair amount of detail.
Ok, let’s dive into the compatibility details!
In this category we have 30+ different types of “basic” SQL queries which include both simple and complex SELECTs, GROUP BYs, CUBEs, ROLLUPs, UNIONs and more. Everything we’ve tested so far shows YugabyteDB supporting these types of commands without issue. This is unlike CockroachDB which offers limited PostgreSQL compatibility compared to YugabyteDB. For example you can issue queries that contain GROUPING SETS and CUBES in YugabyteDB, which you can’t in CockroachDB:
SELECT contact_title, count(contact_title) AS number_of_people, city, country FROM suppliers GROUP BY GROUPING SETS (city, country, contact_title);
SELECT product_id, supplier_id, product_name, SUM(units_in_stock) FROM products GROUP BY product_id, CUBE(product_id, supplier_id);
In this category YugabyteDB supports all the various types of JOINs your use case might require, including:
- INNER JOIN
- OUTER JOIN
- FULL OUTER JOIN
- CROSS JOIN
- NATURAL JOIN
For example a SELECT with an INNER JOIN on three tables:
SELECT customers.company_name, employees.first_name, orders.order_id FROM orders INNER JOIN customers ON customers.customer_id = orders.customer_id INNER JOIN employees ON employees.employee_id = orders.employee_id;
As you might expect, YugabyteDB supports the creating and deleting of tables, plus the altering of columns and sequences. Because YugabyteDB is a distributed SQL database, you can also safely assume that the specification of primary and foreign keys, check, unique and NOT NULL constraints all work out-of-the-box. For example:
CREATE TABLE customer_customer_demo ( customer_id bpchar NOT NULL, customer_type_id bpchar NOT NULL, PRIMARY KEY (customer_id, customer_type_id), FOREIGN KEY (customer_type_id) REFERENCES customer_demographics, FOREIGN KEY (customer_id) REFERENCES customers );
One thing to note however, is that currently YugabyteDB only permits primary keys to be specified in the CREATE TABLE statement. This limitation will be removed when this GitHub issue is resolved. (Foreign keys however, can be added in a CREATE TABLE or ALTER TABLE.)
YugabyteDB supports all simple and advanced datatypes available in PostgreSQL including:
- Boolean types
- Character types
- Numeric types
- Temporal types (date, time, etc)
- UUID types
- User-Defined types like CREATE TYPE and CREATE DOMAIN
- Special types (macaddr, lseg, point, etc)
Stored procedures allow developers to “bundle up” operations that would otherwise take several queries and round trips into a single function. Stored procedures also help minimize duplicate code, as developers can reuse existing stored procedures to perform the same actions. What’s the difference between a FUNCTION and a PROCEDURE? A function cannot execute transactions. In other words, inside a function you cannot open a new transaction, even commit or rollback the current transaction. In PostgreSQL 11, stored procedures with transactional support were introduced to remedy this problem. Unlike CockroachDB, YugabyteDB supports the range of capabilities related to stored procedures including variables, constraints, loops, transactions and functions. For example:
CREATE OR REPLACE PROCEDURE transfer(INT, INT, DEC) LANGUAGE plpgsql AS $ BEGIN -- subtracting the amount from the sender's account UPDATE accounts SET balance = balance - $3 WHERE id = $1; -- adding the amount to the receiver's account UPDATE accounts SET balance = balance + $3 WHERE id = $2; COMMIT; END; $;
Indexes are an effective way to find specific rows much faster than it could do without indexes. YugabyteDB supports both basic and advanced capabilities in regards to indexes, including CREATE, DROP, multicolumn and something CockroachDB doesn’t support, partial indexes. For example:
CREATE TABLE test(id int primary key, b int, c int, d bool); CREATE INDEX test_idx ON test(b,c) WHERE d = true; EXPLAIN SELECT * FROM test WHERE b = 0 AND d = true; Index Scan using test_idx on test (cost=0.00..5.04 rows=10 width=13) Index Cond: (b = 0) (2 rows)
Although a view doesn’t store any data (unlike a materialized view), they do provide yet another way to offer up data to a query. A view is based on one or more tables, which are known as the “base” tables. When a view is created, we are essentially creating a query, often complex, so that the data it comprises can be accessed more readily. For example:
CREATE VIEW customer_view AS SELECT cu.customer_id AS id, cu.fname || ' ' || cu.lname AS name, a.address, a.postal_code AS "zip code", a.phone, city.city, country.country, CASE WHEN cu.activebool THEN 'active' ELSE '' END AS notes, cu.store_id AS sid FROM customer cu INNER JOIN address a USING (address_id) INNER JOIN city USING (city_id) INNER JOIN country USING (country_id);
A trigger is a function invoked automatically whenever an event like an INSERT, UPDATE, or DELETE happens. For example:
CREATE TRIGGER PayRate_Change ON Employees AFTER UPDATE AS BEGIN INSERT INTO ChangeLog (EmpID, CHangedBy, DateChanged, OldRate, NewRate) SELECT i.empid, suser_sname(), getdate(), d.PsyRate, i.PsyRate FROM inserted i JOIN deleted d ON i.empid = d.empid AND d.PsyRate <> i.PsyRate; -- does not take `NULL`s into account END;
Note that currently CockroachDB does not support triggers according to this open GitHub issue.
Window functions include things like DENSE_RANK, LAST_VALUE, and NTILE. Window functions are similar to aggregate functions, like AVG( ), in that they operate on a set of rows. However, unlike an aggregate function it does not reduce the number of rows returned by the query. The term “Window” is used to describe the set of rows on which the function operates. In a nutshell, a window function returns values from the rows inside the specified window. For example:
SELECT product_name, group_name, price, LAST_VALUE (price) OVER ( PARTITION BY group_name ORDER BY price RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS highest_price_per_group FROM products INNER JOIN product_groups USING (group_id);
There are two easy ways to start exploring YSQL using readily available PostgreSQL tutorials.
- PGExercises tutorial: 81 exercises including simple and complex queries, JOINs, subqueries, aggregations, working with timestamps, string operations and recursive queries.
- PostgreSQL Tutorial: 20 in-depth guides with hundreds of exercises that get you familiar with basic and advanced features of PostgreSQL that are applicable to YugabyteDB.