Lift-and-Shift of High Write-Throughput Apps with YugabyteDB
This blog explains how index design in YugabyteDB enables globally ordered reads without requiring equality predicates while maintaining high write throughput.
These indexes avoid hot tablets caused by monotonic inserts (timestamps, sequences) and still preserve ordered index scans.
YugabyteDB 2025.2.1 includes a planner optimization that can produce a globally ordered result without query changes, even when additional ranges are introduced into the index structure.
Which Indexes Can You Create?
YugabyteDB supports:
- HASH indexes, defined with double parentheses
- RANGE indexes, defined with single parentheses and ASC/DESC on the key columns
- A combination of the two, with a leading HASH key
| Access method | Ordered scans | Range predicates | Equality predicates |
|---|---|---|---|
| (1) ASC/DESC | Yes | Yes | Yes |
| (2) HASH | No | No | Yes |
| (3) HASH + trailing ASC/DESC | Yes | Yes | Yes |
| (4) BUCKET + trailing ASC/DESC | Yes | Yes | Yes |
```sql
-- (1) Range index
CREATE INDEX yb_range ON te ("timestamp" ASC);

-- (2) Hash index
CREATE INDEX yb_hash ON te ((uuid) HASH);

-- (3) Hash with trailing ASC/DESC columns
CREATE INDEX yb_hashrange ON te ((uuid) HASH, "timestamp" ASC);

-- (4) Bucket with trailing ASC/DESC columns
CREATE INDEX yb_bucketrange ON te (bucket ASC, "timestamp" ASC);
```
What query patterns are supported with each index type?
- Leading range sharding (ASC/DESC) preserves order for range scans and ORDER BY (LIMIT) queries, but monotonic inserts (e.g., NOW() timestamps or sequences) always append to the end of the range. This concentrates writes on the most recent tablet, creating hot shards and uneven tablet sizes.
- Hash sharding distributes writes evenly, but it breaks ordering on the hash key. Range queries on the hash key then require full scans and sorts.
- Hash sharding with range ordering distributes writes evenly, supports equality on the hash key, and enables ORDER BY and range scans on the range columns.
- A bucket prefix with trailing ASC/DESC columns distributes writes evenly and allows ORDER BY and range scans without any equality key.
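To make these trade-offs concrete, the sketch below pairs each index type with a query it serves well. It assumes illustrative uuid and bucket columns matching the example indexes above; the literals are placeholders.

```sql
-- (2) HASH: fast point lookup on the hash key; no ordered or range scans on it.
SELECT * FROM te WHERE uuid = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11';

-- (1) ASC/DESC: ordered range scan, but monotonic inserts all hit the last tablet.
SELECT * FROM te
WHERE "timestamp" >= now() - interval '1 hour'
ORDER BY "timestamp" DESC
LIMIT 100;

-- (3) HASH + trailing ASC/DESC: equality on the hash key, ordered within that key.
SELECT * FROM te
WHERE uuid = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11'
ORDER BY "timestamp"
LIMIT 100;

-- (4) Bucket + trailing ASC/DESC: ordered access with no natural equality key;
-- the bucket values can be enumerated explicitly, or derived by the planner,
-- as shown later in this post.
SELECT * FROM te
WHERE bucket IN (0, 1, 2)
ORDER BY "timestamp"
LIMIT 100;
```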
High-Throughput Monotonic Inserts on a Leading ASC/DESC Indexed Key
High-throughput writes to an ASC/DESC indexed key that receives monotonic inserts create hot tablets, which can overload a single node. This leads to:
- Limited scalability
- Capped throughput
- Increased latency
- Unbalanced resource usage and compaction pressure
Index Ordering and Scalability
Q: How do you preserve both index ordering and write scalability when a leading ASC/DESC indexed key receives monotonic inserts?
A: Introduce a low-cardinality prefix (a bucket).
This bucket is commonly created with an expression ("virtual") index or a GENERATED column, using the expression yb_hash_code(<key_columns>) % <number>, as below (a GENERATED column variant is sketched after the recommendations list):
```sql
CREATE INDEX yb_virtualbucket ON te
    ((yb_hash_code("timestamp") % 3) ASC, "timestamp" ASC)
    SPLIT AT VALUES ((1), (2));
```
Bucket Design Recommendations:
- Size the modulo from your write throughput and node count. yb_hash_code() returns a value in [0, 65535], so % 3 yields buckets 0, 1, and 2. In this example the modulo is 3 because there are 3 nodes and we want writes to reach all of them; with 9 nodes and high write throughput, the recommended modulo would be at least 9.
- If it’s a unique index or PK, the arguments to yb_hash_code must be a subset of the unique index key or PK columns.
- If it’s a non-unique index, the columns passed to yb_hash_code don’t matter as long as the expression is deterministic.
- The SPLIT clause is optional but recommended: with % 3, SPLIT AT VALUES ((1), (2)) creates three tablets covering buckets 0, 1, and 2, so each bucket is evenly distributed across nodes from the start.
- The bucket column or index expression column should generally be the first column of the key.
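For reference, here is a minimal sketch of the GENERATED column variant mentioned above. The table and index names are illustrative; it assumes yb_hash_code() is usable in a stored generated column (it is immutable, which is also what allows it in the index expression above).

```sql
-- Illustrative alternative: persist the bucket as a stored generated column
-- instead of an index expression.
CREATE TABLE te_gen (
    id          SERIAL,
    "timestamp" TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    bucket      INT GENERATED ALWAYS AS (yb_hash_code("timestamp") % 3) STORED,
    PRIMARY KEY (id)
);

CREATE INDEX yb_genbucket ON te_gen (bucket ASC, "timestamp" ASC)
    SPLIT AT VALUES ((1), (2));
```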
What it does:
- Write scalability across at least the number of buckets (tablets can still split further)
- Balanced resource usage across nodes
Simple User Examples
This example uses YugabyteDB 2025.2.1 on a 3-node cluster.
- Create a table with a monotonic column
```sql
DROP TABLE IF EXISTS te;

CREATE TABLE te (
    id          SERIAL PRIMARY KEY,
    "timestamp" TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
```
- Create an index that always writes to 3 nodes
```sql
CREATE INDEX yb_nothotspot ON te
    ((yb_hash_code("timestamp") % 3) ASC, "timestamp")
    INCLUDE (id)
    SPLIT AT VALUES ((1), (2));
```
Note: the “hotspot” index would have looked like this:
```sql
CREATE INDEX yb_hotspot ON te ("timestamp" ASC) INCLUDE (id);
```
- Insert monotonic data
```sql
INSERT INTO te ("timestamp")
SELECT '2026-01-01 00:00:00+00'::timestamptz
       + (gs * (('2024-01-01 00:00:00+00'::timestamptz
                 - '2026-01-01 00:00:00+00'::timestamptz) / 10000000))
FROM generate_series(0, 9999999) AS gs;
```
- Enable features that preserve global ordering across the buckets by activating planner optimizations.
```sql
ANALYZE;

SET yb_max_saop_merge_streams = 64;
SET yb_enable_derived_saops = true;
SET yb_enable_cbo = on;

ALTER DATABASE yugabyte SET yb_max_saop_merge_streams = 64;
ALTER DATABASE yugabyte SET yb_enable_derived_saops = true;
ALTER DATABASE yugabyte SET yb_enable_cbo = on;
```
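As a quick sanity check (not part of the original walkthrough), you can confirm that the 10 million rows spread evenly across the three buckets:

```sql
-- Each bucket should hold roughly one third of the rows.
SELECT yb_hash_code("timestamp") % 3 AS bucket, count(*)
FROM te
GROUP BY bucket
ORDER BY bucket;
```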
- Observe planner changes that provide global ordering with no SQL or application changes.
```sql
EXPLAIN (ANALYZE)
SELECT "timestamp" FROM te
WHERE "timestamp" >= '2020-01-01' AND "timestamp" < '2030-01-01'
ORDER BY "timestamp";
```
Notice that no quicksort is present: the yb_enable_derived_saops feature passed the “bucket” column into the index condition, and pggate then merged the streams and returned the data without a sort node.
```
Index Only Scan using yb_nothotspot on te (actual rows=10000000 loops=1)
  Index Cond: (("timestamp" >= '2020-01-01 00:00:00+00'::timestamp with time zone) AND ("timestamp" < '2030-01-01 00:00:00+00'::timestamp with time zone) AND (((yb_hash_code("timestamp") % 3)) = ANY ('{0,1,2}'::integer[])))
  Merge Sort Key: "timestamp"
  Merge Stream Key: (yb_hash_code("timestamp") % 3)
  Merge Streams: 3
```
You see how powerful this feature is once a LIMIT clause is introduced. The SQL asks for 1,000 globally ordered rows, and PostgreSQL can again return them without a sort node, while scanning only 1,000 rows per bucket. This makes it very efficient.
```sql
EXPLAIN (ANALYZE, COSTS OFF, TIMING ON, DIST)
SELECT "timestamp" FROM te
WHERE "timestamp" >= '2020-01-01' AND "timestamp" < '2030-01-01'
ORDER BY "timestamp"
LIMIT 1000;
```
```
Limit (actual time=1.682..2.258 rows=1000 loops=1)
  -> Index Only Scan using yb_nothotspot on te (actual time=1.680..2.122 rows=1000 loops=1)
       Index Cond: (("timestamp" >= '2020-01-01 00:00:00+00'::timestamp with time zone) AND ("timestamp" < '2030-01-01 00:00:00+00'::timestamp with time zone) AND (((yb_hash_code("timestamp") % 3)) = ANY ('{0,1,2}'::integer[])))
       Merge Sort Key: "timestamp"
       Merge Stream Key: (yb_hash_code("timestamp") % 3)
       Merge Streams: 3
       Storage Index Rows Scanned: 3000
Execution Time: 2.402 ms
```
This also works perfectly for more complex OLTP top-n queries, such as keyset pagination. Create a more complicated index and an extra predicate.
```sql
ALTER TABLE te ADD COLUMN key_id integer NOT NULL DEFAULT 123;
ANALYZE;
```
```sql
CREATE INDEX scalable_key_timestamp ON public.te
    ((yb_hash_code("timestamp") % 3) ASC, key_id, "timestamp" ASC, id)
    SPLIT AT VALUES ((1), (2));
```
The global ordering is still preserved for this keyset pagination without any changes to the SQL.
```sql
EXPLAIN (ANALYZE, COSTS OFF, TIMING ON)
SELECT *
FROM public.te
WHERE 1=1
  AND key_id = 123
  AND "timestamp" >= '2025-05-05 08:00:00'
  AND ("timestamp", id) > ('2025-05-05 08:00:00', 1)
ORDER BY "timestamp" ASC, id ASC
LIMIT 1000;
```
```
Limit (actual time=2.059..2.633 rows=1000 loops=1)
  -> Index Only Scan using scalable_key_timestamp on te (actual time=2.058..2.513 rows=1000 loops=1)
       Index Cond: ((key_id = 123) AND ("timestamp" >= '2025-05-05 08:00:00+00'::timestamp with time zone) AND (ROW("timestamp", id) > ROW('2025-05-05 08:00:00+00'::timestamp with time zone, 1)) AND (((yb_hash_code("timestamp") % 3)) = ANY ('{0,1,2}'::integer[])))
       Merge Sort Key: "timestamp", id
       Merge Stream Key: (yb_hash_code("timestamp") % 3)
       Merge Streams: 3
       Heap Fetches: 0
Planning Time: 0.139 ms
Execution Time: 2.777 ms
Peak Memory Usage: 24 kB
```
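For completeness, here is how a client would fetch the next page: rebind the keyset predicate with the last ("timestamp", id) pair returned by the previous page. The literal values below are hypothetical.

```sql
-- Hypothetical continuation: ('2025-05-05 08:03:17+00', 4567) stands in for
-- the last row of the previous page.
SELECT *
FROM public.te
WHERE key_id = 123
  AND "timestamp" >= '2025-05-05 08:00:00'
  AND ("timestamp", id) > ('2025-05-05 08:03:17+00', 4567)
ORDER BY "timestamp" ASC, id ASC
LIMIT 1000;
```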
Practical Guidance (Proactive Checklist)
Apply the bucket prefix:
- When my ASC/DESC indexed column receives high write throughput of monotonic values
- When my high write-throughput ASC/DESC indexed column requires range and/or ordered SQL access without hash equality
- When my index is ASC/DESC sharded and the first high-cardinality column in the key (which could be the second or third column from the first key position) is monotonically increasing
Example for (3) – this is still a hotspot; ~90% of rows may be written to a single node:
```sql
CREATE INDEX yb_hotspot
    ON te ("type" ASC, "timestamp" ASC)
    WHERE "type" IN ('SUCCESS', 'FAILED');
```
Conclusion
In distributed SQL systems, timestamp-ordered inserts, sequence-based IDs, and “latest N items” queries are common patterns in workloads such as activity feeds, time-series data, and audit logs.
This blog demonstrates why these patterns can create write hotspots in distributed architectures, and how ordering, key design, and access paths directly impact scalability. With YugabyteDB, understanding and addressing these trade-offs is now significantly simpler, enabling true lift-and-shift of high-write-throughput applications.
With the new YugabyteDB planner optimizations, a simple index change eliminates hotspots in the physical data model while still serving “the latest N items.”
The query planner automatically performs the necessary rewrites to preserve global ordering for “latest N,” allowing you to scale writes horizontally while maintaining correct result ordering and low-latency reads with no application changes.
Download the latest version of YugabyteDB and try it for yourself.