PostgreSQL How-to: SportsDB Running on a Distributed SQL Database
SportsDB is a sample dataset compiled from multiple sources, encompassing a variety of sports including football, baseball, ice hockey and more. It also cross-references many different types of content media. It is capable of supporting queries for the most intense of sports data applications, yet is simple enough for use by those with minimal database experience. The database itself is comprised of over 100 tables and just as many sequences, unique constraints, foreign keys and indexes. The dataset also includes almost 80k rows of data. It has been ported to MySQL, SQL Server and PostgreSQL. You can check out a detailed ER diagram here.
In this post we are going to walk you through how to download and install the PostgreSQL compatible version of SportsDB onto the YugabyteDB distributed SQL database with a replication factor of 3.
For complete instructions on how to get up and running on a variety of platforms including prerequisites, check out our Quickstart Guide. In the following section we’ll cover the basic steps for getting up and running in just a few minutes with a local 3 node cluster on your Mac.
$ wget https://downloads.yugabyte.com/yugabyte-18.104.22.168-darwin.tar.gz
$ tar xvfz yugabyte-22.214.171.124-darwin.tar.gz && cd yugabyte-126.96.36.199/
Note: The above instructions are for version 1.3.0. To find the latest version of YugabyteDB, visit the quickstart page.
Add a few loopback IP addresses for the various YugabyteDB nodes to use.
sudo ifconfig lo0 alias 127.0.0.2
sudo ifconfig lo0 alias 127.0.0.3
sudo ifconfig lo0 alias 127.0.0.4
sudo ifconfig lo0 alias 127.0.0.5
sudo ifconfig lo0 alias 127.0.0.6
sudo ifconfig lo0 alias 127.0.0.7
With the command below you’ll create a 3 node cluster with a replication factor of 3.
$ ./bin/yb-ctl --rf 3 create --tserver_flags "memory_limit_hard_bytes=6442450944”
Note: Due to the hardware limitations of my laptop and size of the scripts, I temporarily bumped up the available memory accessible to YugaByte by adding the above
tserver_flags argument. Depending on your setup, you might need to bump it up as well if you see errors like the one below while executing the database scripts.
Service unavailable (yb/tserver/tablet_service.cc:239): Soft memory limit exceeded (at 96.13% of capacity)]
Now let’s take a look at the status of the cluster and all the nodes that comprise it.
$ ./bin/yb-ctl status
As you can see from the output, we have three nodes running locally with a replication factor of 3. This means that every piece of data is being replicated on all three nodes.
Next run the YSQL shell command to enter the PostgreSQL shell.
$ ./bin/ysqlsh --echo-queries
Type "help" for help.
We are now ready to build the SportsDB database.
You can download the SportsDB database that is compatible with YugabyteDB from our GitHub repo. The five files you’ll need are:
- sportsdb_tables.sql which creates tables and sequences
- sportsdb_inserts.sql which loads the sample data into the sportsdb database
- sportsdb_constraints.sql which creates unique constraints
- sportsdb_fks.sql which creates foreign key constraints
- sportsdb_indexes.sql which creates indexes
We’ve purposely broken up what would otherwise be a very large script. By breaking it up into building blocks, it’ll be easier to see what YugabyteDB is doing and spot any problems (if you encounter them) a lot easier.
CREATE DATABASE sportsdb;
Let’s confirm we have the sportsdb database by listing out the databases on our cluster.
Switch to the sportsdb database.
postgres=# \c sportsdb
You are now connected to database "sportsdb" as user "postgres".
sportsdb=# \i /Users/yugabyte/sportsdb_tables.sql
We can verify that all 203 tables and sequences have been created by executing:
Next, let’s load our database with sample data, ~80k rows.
sportsdb=# \i /Users/yugabyte/sportsdb_inserts.sql
Let’s do a simple SELECT to pull data from the basketball_defensive_stats table to verify we now have some data to play with.
sportsdb=# SELECT * FROM basketball_defensive_stats WHERE steals_total = '5';
Next, let’s create our unique constraints and foreign keys by executing:
sportsdb=# \i /Users/yugabyte/sportsdb_constraints.sql
sportsdb=# \i /Users/yugabyte/sportsdb_fks.sql
Finally, let’s create our indexes by executing:
sportsdb=# \i /Users/yugabyte/sportsdb_indexes.sql
Note: If you have worked with the SportDB sample database in the past, you know that the index creation section specifies the use of a BTREE index. YugabyteDB makes use of LSM trees, so we’ve modified the script as such. You can read more about LSM vs BTREE in our post, “A Busy Developer’s Guide to Database Storage Engines – The Basics.” Even if we had not specified LSM, you would have seen an informational message that advised you that YugabyteDB had made the switch behind the scenes.
That’s it! You are ready to start exploring SportsDB running on YugabyteDB using your favorite PostgreSQL admin or development tool. You can learn more about the SportsDB project, libraries, samples, web services and more by visiting the project page here.