Data modeling in ScyllaDB: Choosing your primary key

Shahar Gvirtz
6 min read · Nov 25, 2020

ScyllaDB is an Apache Cassandra-compatible database system, developed with performance in mind (and therefore written in C++). In this post series, I’ll discuss some of the important things you should know when designing a data model for Scylla-based applications (most of which apply to Cassandra as well). In this post, I’ll start by discussing what a primary key is in Scylla, and how to choose one.


Choosing a primary key: the partitioning and clustering keys

A table in Scylla contains multiple partitions. When defining a table in Scylla, you choose which columns will be part of your primary key. The primary key uniquely identifies a row. There can’t be two rows with the same primary key.

The primary key in Scylla consists of two parts:

· The partitioning key, which determines which partition the specific row will belong to (and, because partitions are the physical unit in which the data is stored, it also determines which node the data will be stored on)

· The clustering key, which determines how the rows are sorted within the specific partition
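
To make this concrete, here’s a minimal sketch (the table and column names are hypothetical, just for illustration): a table of posts partitioned by user, with the rows inside each user’s partition sorted by time.

CREATE TABLE posts_by_user (
    user_id   text,
    post_time timestamp,
    content   text,
    PRIMARY KEY (user_id, post_time)  -- user_id is the partitioning key, post_time is the clustering key
);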

Choosing the primary key and its components correctly is one of the most important data modeling choices you will make.

So, how do you choose your primary key and its components? By the queries you’re going to run.
To decide what the table’s primary key will be, and what its components will be, you must know how the table is going to be queried.

Understanding how the partitioning & clustering keys work

Scylla gives you extremely fast access to a row by its primary key, and fast access to all the rows inside a specific partition. On the other hand, Scylla isn’t designed for “full table scan” queries, or for queries involving multiple partitions.
For example, let’s think about a table with columns [A, B, C, D, E], and a primary key which is ((B, A, C), E).

In that example:

· B, A, C and E are the columns of the composite primary key

· B, A, C are the columns of the partitioning key (enclosed in the inner parentheses)

· E is the clustering key column
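
In CQL, a table with that primary key could be declared roughly like this (the column types are assumed here just for illustration):

CREATE TABLE tbl (
    A text,
    B text,
    C text,
    D text,
    E text,
    PRIMARY KEY ((B, A, C), E)  -- (B, A, C) is the partitioning key, E is the clustering key
);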

This modeling will let us:

· Get a specific row by the values of B, A, C, E extremely fast (SELECT * FROM tbl WHERE B='val1' AND A='val2' AND C='val3' AND E='val4').

· Get all the rows in a partition fast. The following query will return the first bunch of rows fast, and will let us iterate through the whole partition if we want to:
SELECT * FROM tbl WHERE B='val1' AND A='val2' AND C='val3'

· Because E is the clustering key, we can sort by it as well:
SELECT * FROM tbl WHERE B='val1' AND A='val2' AND C='val3' ORDER BY E

But, unlike the “regular” B+ tree indices of an RDBMS, we won’t benefit from a query like SELECT * FROM tbl WHERE B='val1' AND A='val2': because it restricts only part of the partitioning key, this query will actually cause a full table scan.

Another difference between Scylla and an RDBMS shows up in queries like SELECT * FROM tbl WHERE B='val1' AND A='val2' AND C='val3' AND D='val4'. Here, although we’re accessing a single partition, we do some additional filtering. This filtering is not allowed without ALLOW FILTERING at the end of the query, and it’s not recommended to rely on it as a design choice (unless you know there won’t be many rows returned).
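
For completeness, this is roughly what such a query looks like with the clause appended (again, not something to rely on as a design choice):

SELECT * FROM tbl WHERE B='val1' AND A='val2' AND C='val3' AND D='val4' ALLOW FILTERING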

If we want the flexibility of sometimes restricting only some of these columns, we can use them as clustering columns. For example, how should we define our primary key if we want to support querying by B and A alone, by B, A and C, or by B, A, C and D?

In that case, we can create our table by choosing ((B, A), C, D) (B and A are the partitioning key, C and D are the clustering columns). Making both C and D clustering columns gives us the freedom to use the access patterns sketched below:
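
Roughly, with ((B, A), C, D), all of the following queries can be served efficiently, because each one restricts the full partitioning key plus a prefix of the clustering columns (the values are placeholders):

SELECT * FROM tbl WHERE B='val1' AND A='val2'
SELECT * FROM tbl WHERE B='val1' AND A='val2' AND C='val3'
SELECT * FROM tbl WHERE B='val1' AND A='val2' AND C='val3' AND D='val4'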

It’s important to note that we can’t query on D without also restricting C, unless we specify ALLOW FILTERING. If we do, we’ll get the following error message:

InvalidRequest: Error from server: code=2200 [Invalid query] message="PRIMARY KEY column "d" cannot be restricted as preceding column "c" is not restricted"
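
For reference, a query shaped like this one (restricting D while skipping C) is the kind that triggers that error:

SELECT * FROM tbl WHERE B='val1' AND A='val2' AND D='val4'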

Choosing the right partitioning & clustering key

Now that we understand how the choice of primary key affects the access patterns we can use, let’s get into some tips for choosing the partitioning and clustering key:

· Know your queries. As we saw, the definition of the primary key has a significant effect on how you can query your data, and on what types of queries will be fast. Make sure you think of all (or most) of your queries before designing your tables. If you want the great performance Scylla can give you, you must design your tables around your queries.

· Don’t base your data model on “ALLOW FILTERING”, even at the partition level. Sometimes it might be tempting to just add the “ALLOW FILTERING” clause at the end of your query and let ScyllaDB handle it for you. That’s not a good idea, for several reasons:
1. ALLOW FILTERING queries run a sequential scan over the data, which is not even the fastest way to perform a full table scan.

2. Even if you think you are filtering over a small number of rows, all within a single partition, you might be surprised in the future when one of those partitions grows and performance decreases.

3. It might have cluster-wide effects.

· Don’t be afraid of duplicating data. If you have access patterns that just can’t live together in the same table, because they need different partitioning keys, don’t be afraid of duplicating the data: create another table, with a different primary key, that is used to access the data from a different angle.
If this new table uses the exact same data and schema, you don’t even have to duplicate the data yourself: ScyllaDB can do it for you using Materialized Views or Global Secondary Indexes (sketched after this list).

· Try not to duplicate data when it isn’t needed. Sometimes it looks easier to just create multiple secondary indexes or materialized views.
But duplicating the data from various angles has its cost: it requires more space (both disk space and memory, where the data cache is stored), makes maintenance tasks slower, and might not be needed at all.
Sometimes you can avoid those additional costs just by choosing carefully what will be in your partitioning key and what will be part of your clustering key.

· Avoid large partitions. For various reasons, Scylla prefers smaller partitions. “Small” means preferably smaller than 100MB (that’s the default definition of a large partition). So it’s better to plan your schema (i.e. choose your partitioning key) in a way that keeps each partition under 100MB. Sometimes this is easy (for example, with time-series data you can simply split it into chunks small enough to stay under this recommendation), and sometimes it’s not that natural.
In our data model, some of the tables include a “BucketId” column, which is calculated by taking a consistent hash over immutable data, modulo the number of buckets. This allows us to divide a large partition (one that doesn’t split naturally into smaller chunks) into several smaller partitions, and still know exactly where each row is going to be (so when we need to access a specific row, we can programmatically reconstruct its primary key). A sketch of this scheme follows this list.
In addition to splitting the data into smaller partitions, bucketing lets us increase parallelism. By designing our services to issue concurrent requests to different partitions (and sometimes even using this bucketing as a way to split work between different nodes of our application), we were able to reduce response times for queries that involve a full scan of a single entity, by spreading that entity’s rows across multiple partitions instead of just one.
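
A minimal sketch of that bucketing idea, with hypothetical table and column names and an arbitrary bucket count of 16; the bucket value itself is computed by the application, not by Scylla:

CREATE TABLE entity_rows (
    entity_id text,
    bucket_id int,     -- computed in the application, e.g. hash(immutable_row_data) % 16
    row_key   text,
    payload   text,
    PRIMARY KEY ((entity_id, bucket_id), row_key)
);

-- Point lookup: recompute bucket_id from the same immutable data, then query a single partition
SELECT * FROM entity_rows WHERE entity_id='e1' AND bucket_id=7 AND row_key='k1';

-- "Full scan" of one entity: issue one query per bucket (0..15), potentially in parallel
SELECT * FROM entity_rows WHERE entity_id='e1' AND bucket_id=0;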
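
And, going back to the tip about letting Scylla duplicate the data for you: on the earlier tbl example with primary key ((B, A, C), E), a materialized view or a global secondary index might look roughly like this (the view name and the indexed column are just illustrative choices):

CREATE MATERIALIZED VIEW tbl_by_e AS
    SELECT * FROM tbl
    WHERE B IS NOT NULL AND A IS NOT NULL AND C IS NOT NULL AND E IS NOT NULL
    PRIMARY KEY ((E), B, A, C);

-- Or a global secondary index, if we only need to look rows up by D:
CREATE INDEX ON tbl (D);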

Closing

Unlike in an RDBMS, primary keys in Scylla aren’t just about keeping the data logically correct; they have a significant effect on the data-access patterns you can support. Also, choosing a primary key in Scylla isn’t about normalizing data. It’s about how the data will be spread across the cluster, and how it will be sorted. Scylla encourages you to de-normalize your data and design your tables based on the workload you run.
