Distributed SQL Tips and Tricks – July 17, 2020

Dorian Hoxha

Welcome to this week’s tips and tricks blog where we recap some distributed SQL questions from around the Internet. We’ll also review upcoming events, new documentation, and blogs that have been published since the last post. Got questions? Make sure to ask them on our YugabyteDB Slack channel, Forum, GitHub, or Stackoverflow. Ok, let’s dive right in:

How to identify a colocated database from the rest

One way to identify a colocated table in YugabyteDB is to look at the yb-master UI page (https://<yb-master-ip>:7000/tables). All colocated databases will have a system table entry with the name (<id>.colocated.parent.tablename).

You can also use the yb-admin utility like below:

./bin/yb-admin --master_addresses <master_addresses> list_tables_with_db_types | grep '\.colocated\.parent\.tablename

How to increment a counter in a YCQL MAP collection

At the moment doing an UPDATE a SET m['k'] = m['k'] + 1 WHERE... is not supported in YCQL (or Apache Cassandra).

For Collection Data Types, MAP/LIST/SET, + and - operators are used to append or remove the <key, value> entries from the collection.

As an alternative way we can use common columns (using of the map-key as a clustering column for better performance):

ycqlsh:k> create table t2 (id int, m_key text, m_val int, primary key((id), m_key));
ycqlsh:k> insert into t2 (id, m_key, m_val) values (100, 'k', 1);
ycqlsh:k> select * from t2;
 id  | m_key | m_val
 100 | 	k | 	1
(1 rows)
ycqlsh:k> update t2 set m_val = m_val + 1 where id = 100 and m_key = 'k';
ycqlsh:k> select * from t2;
 id  | m_key | m_val
 100 | 	k | 	2
(1 rows)

Limits in YCQL collections

Collections are designed for storing small sets of values that are not expected to grow to an arbitrary size (such as phone numbers or addresses for a user, rather than posts or messages). While collections of larger sizes are allowed, they may have a significant impact on performance for queries involving them. In particular, some list operations (insert at an index and remove elements) require a read-before-write.

If the application warrants a big number of values, it’s better to model the collection as another separate table like in the question above.

Running a plain ysql_dump while the table is receiving inserts may timeout and result in Query error: Restart read required errors.

./tserver/postgres/bin/ysql_dump -h -d postgres --data-only --table my_table -f out.txt
ysql_dump: Dumping the contents of table "sqlsecondaryindex" failed: PQgetResult() failed.
ysql_dump: Error message from server: ERROR:  Query error: Restart read required at: { read: { physical: 1592265362684030 } local_limit: { physical: 1592265375906038 } global_limit: <min> in_txn_limit: <max> serial_no: 0 }

To read against a consistent snapshot and avoid running into the “read restart” error, pass the --serializable-deferrable option to ysql_dump. For example:

~/tserver/postgres/bin/ysql_dump -h -d postgres \
   --data-only --table sqlsecondaryindex \
   --serializable-deferrable -f data1.csv

New Documentation, Blogs, Tutorials, and Videos

New Blogs

New Videos

New and Updated Docs

To support the release of YugabyteDB 2.2, we published these new docs:

Upcoming Events

Get Started

Ready to start exploring YugabyteDB features? Getting up and running locally on your laptop is fast. Visit our quickstart page to get started.

High performance distributed SQL by Yugabyte

Dorian Hoxha

Related Posts

Explore Distributed SQL and YugabyteDB in Depth

Discover the future of data management.
Learn at Yugabyte University
Get Started Business
Browse Yugabyte Docs
Explore docs Business
PostgreSQL For Cloud Native World
Read for Free Business