Yugabyte Database Engineering Update — July 20, 2018
Welcome to the inaugural edition of the YugabyteDB Community and Engineering update series! Let’s dive in and take a look at what has happened over the last few weeks.
There has been a lot of activity in terms of meetups and events. In June, YugaByte was at DockerCon. We also did a hands-on lab about building geo-distributed cloud apps at a Datariders meetup and a talk at Samsung about building modern apps at cloud scale. We will be at Google Cloud NEXT 2018 from July 24–26, 2018. Stop by and say hello!
You can also see our other upcoming events.
YugaByte is looking for a passionate Developer Advocate! Are you excited about becoming the voice of our users? Do you love experimenting with new technologies and presenting them at conferences, meetups and workshops? We would love to talk to you. Check out our list of open positions.
We recently released YugabyteDB 1.0.4. This release builds on the 1.0 version by adding:
- Secondary indexes and unique constraints to ensure that a column does not have duplicate values.
- JSONB datatype now supports fine-grained select/update of attributes and built-in operators.
- Built-in functions to compute averages and to convert blobs to types.
- Ability to read from the local datacenter.
- Support for the ZSCORE command.
- Support for bounded staleness in follower reads.
- YugabyteDB now works with Presto.
- C++, C# and Go client drivers are now supported.
For the upcoming 1.1 release, we are working towards the general availability of a number of critical features:
Other major items on the roadmap include:
- Security features like authentication of users in YCQL and YEDIS APIs
- Support for managed Kubernetes environments such as GKE and PKS
To view a list of all items being worked on, browse to our GitHub projects page.
- There is now a tutorial on backing up data.
- Quick start guide to trying core transactional features in YugabyteDB such as distributed ACID transactions, secondary indexes and JSON documents.
- Here is a quick checklist for deploying YugabyteDB, including settings for various public clouds.
- Interested in how databases work under the hood? Read about the basics of DB storage engines and some of the more advanced considerations.
- Read all about multi-model and multi-API databases and how they simplify app development in the multi-cloud era.
- If you want to know everything about ACID transactions but do not know where to start, A Primer on ACID Transactions has you covered.
- Learn all about DynamoDB in 11 concrete points. You can also read about a comparison between MongoDB, DynamoDB and Apache Cassandra.
Videos and Technical Presentations
- Building modern apps at cloud scale, a talk that dives into the need for a microservices-based architecture in the cloud using Kubernetes.
- How YugabyteDB works on Docker and Kubernetes.
There have been a few important enhancement requests from the community.
Unique Secondary Index Enhancement
A recent request was to implement a unique secondary index in order to ensure that there were no duplicate values in a column. For example, consider an employee table which has employee_id as the primary key column, and an
There are a number of such scenarios in OLTP applications where the uniqueness of the values in a column needs to be ensured, and we believe this is a great addition to a transactional NoSQL database. Hence, we decided to prioritize this feature. Under the hood, the unique constraint performs a distributed transaction using conditional insert statements.
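The conditional-insert idea can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions, not YugabyteDB's implementation: the class `UniqueIndexSketch`, its two dictionaries, and the example values are all hypothetical, and the two writes that a real cluster would commit in one distributed transaction are simply performed one after the other.

```python
class UniqueIndexSketch:
    """Toy in-memory model of a table with a unique secondary index."""

    def __init__(self):
        self.rows = {}   # primary key -> indexed column value
        self.index = {}  # indexed column value -> primary key that owns it

    def insert(self, primary_key, value):
        """Insert or overwrite a row; return False on a duplicate value."""
        # Conditional insert: the index entry may be claimed only if no
        # other row already owns this value.
        owner = self.index.get(value)
        if owner is not None and owner != primary_key:
            return False

        # Overwriting an existing row releases its previous index entry.
        old_value = self.rows.get(primary_key)
        if old_value is not None and old_value != value:
            del self.index[old_value]

        # Assumption: a real system would commit these two writes atomically.
        self.index[value] = primary_key
        self.rows[primary_key] = value
        return True


table = UniqueIndexSketch()
first = table.insert(1, "a@example.com")   # accepted
second = table.insert(2, "a@example.com")  # rejected, value already in use
```

Keeping the index keyed by the column value is what turns the uniqueness check into a single lookup on one key.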
Fine-Grained Errors in Batch Inserts
In the current Apache Cassandra/CQL wire protocol, when any error occurs in a batch of insert operations, only a single error code can be returned. This is not ideal when only a few insert operations fail, because the app cannot find out which inserts in the batch failed.
The feature request was to implement a way to return fine-grained errors in a batch. This can now be achieved by adding a RETURNS STATUS AS ROW clause to an insert statement. As an example:
INSERT INTO t (k, c) VALUES (2, 2)
RETURNS STATUS AS ROW;
If the above insert failed because of a unique index violation, it would return an error as follows:
[applied] | [message]                                               | k | c
false     | Duplicate value disallowed by unique index k.t_unique_c | 1 | 2
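On the application side, per-statement status rows make it possible to pick out exactly which inserts failed. The following is a minimal sketch, assuming each statement in the batch yields one status row and that the rows have already been converted to dictionaries keyed by the column names shown above. The helper `failed_inserts` and the sample rows are hypothetical, and the shape of a successful row is an assumption; no driver API is implied.

```python
def failed_inserts(status_rows):
    """Return the status rows whose [applied] column is false."""
    return [row for row in status_rows if not row["[applied]"]]


# Hypothetical status rows for a batch of two inserts.
status_rows = [
    {"[applied]": True, "[message]": None, "k": 3, "c": 3},
    {
        "[applied]": False,
        "[message]": "Duplicate value disallowed by unique index k.t_unique_c",
        "k": 1,
        "c": 2,
    },
]

failures = failed_inserts(status_rows)
for row in failures:
    # Report only the failed inserts; the rest of the batch needs no action.
    print(f"insert failed for k={row['k']}, c={row['c']}: {row['[message]']}")
```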
LIMIT and OFFSET Support
LIMIT and OFFSET enable paginating through results. If a limit count is given, no more than that many rows will be returned. OFFSET says to skip that many rows before beginning to return rows. If both OFFSET and LIMIT appear, then OFFSET rows are skipped before starting to count the LIMIT rows that are returned.
We felt that LIMIT and OFFSET support would enable many use cases that need to paginate through results. Imagine an e-commerce site that wants to display the orders placed by a user in a paginated fashion, showing 10 orders per page. The first page of orders can be retrieved as follows:
SELECT * FROM orders WHERE user_id = 1000 LIMIT 10;
And the second page of orders can subsequently be retrieved using the following query:
SELECT * FROM orders WHERE user_id = 1000 LIMIT 10 OFFSET 10;
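The two queries generalize to any page: page n skips (n - 1) × page_size rows. Here is a minimal sketch of that arithmetic. The helper name `page_query` is hypothetical, pages are assumed to be numbered from 1, and the values are inlined as integers purely for illustration (a real application would normally use bind parameters).

```python
def page_query(user_id, page, page_size=10):
    """Build the query for a 1-based page of a user's orders."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")

    # Rows on all earlier pages are skipped before LIMIT starts counting.
    offset = (page - 1) * page_size

    query = f"SELECT * FROM orders WHERE user_id = {int(user_id)} LIMIT {int(page_size)}"
    if offset > 0:
        query += f" OFFSET {offset}"
    return query + ";"


first_page = page_query(1000, 1)
second_page = page_query(1000, 2)
```

With the default page size, these reproduce the first-page and second-page queries shown above.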
Highlights from GitHub, StackOverflow and Forums
Following are a few of the recent questions, comments and discussions that are worth pointing out.
Best Code Reading
Best Use Case Discussion
Here is a great discussion about a use case to track, store and serve user actions on a platform. It starts out with a discussion on microsecond precision, dives into support for various isolation levels in distributed transactions and finally discusses hash-partitioning in YCQL. Great discussion, yjiangnan!