Generate SQL Script in PostgreSQL

PostgreSQL Tips and Tricks
Franck Pachot

DBAs often need to generate SQL statements, specifically DDL (Data Definition Language), using SQL queries and then execute them. PostgreSQL has many features that can be helpful here.

All SQL databases provide their metadata as relational views on the catalog (i.e. database dictionary). That is not a coincidence. Codd’s rule number 4 mandates that a dynamic online catalog must be based on the relational model. Therefore, it’s logical to use SQL queries to generate DDL dynamically from the database metadata and execute the output.

YugabyteDB also uses one process per connection, mirroring PostgreSQL’s protocol and query layer. This isn’t typically a scalability problem as connections can be spread across nodes. With YugabyteDB, all nodes are equal, accepting connections with read and write transactions. Ongoing work to add a database resident connection pool—based on Odyssey—aims to address applications without client-side connection pools and microservices with too many connection pools.

Register for the Distributed SQL Summit

Join us at our free online Distributed SQL Summit on September 12 to hear about our new Connection Manager during the keynote.

Let’s walk through an example using YugabyteDB, which is Postgres-compatible and provides the same SQL language and catalog views as PostgreSQL.

yugabyte=#  select format(
             $SQL$ alter table %1$8I.%2$-30I add column %3$I text not null default %4$L $SQL$ ,
             schemaname, tablename, 'user_mod', 'init'
             ) as ddl from pg_tables where schemaname='public' and tablename like 'customer%';

                                                 ddl
------------------------------------------------------------------------------------------------------
  alter table   public.customer_customer_demo         add column user_mod text not null default 'init'
  alter table   public.customer_demographics          add column user_mod text not null default 'init'
  alter table   public.customers                      add column user_mod text not null default 'init'
(3 rows)

yugabyte=#  \gexec
ALTER TABLE
ALTER TABLE
ALTER TABLE

I have generated an ALTER TABLE statement to ADD COLUMN for a few tables following a pattern (%customer) in their name and executed the statement.

Let’s look, in detail, at each feature that I used.

The DDL statement will be a character string (text) in the result. Simple character strings can be enclosed in quotes (‘) but they may contain quotes themselves which have to be doubled (‘ ‘). This can quickly be a nightmare to read and maintain. PostgreSQL has a simple solution: Dollar-Quoted String Constants where a tag enclosed between two dollar signs ($) replaces a quote. The advantages are twofold. You can choose:

  1. a tag that will not collide with any text you have in the string
  2. different tags to enclose quotes in quotes

Note that the tag is not mandatory, and when there’s no need for multiple ones, $$ will appear a lot. Have you seen that same thing when defining a stored procedure or some procedural code in a DO block? If so, that’s because the PLpg/SQL code is declared in a character string as text and is interpreted as such at execution time.

Here is another example. Let’s use $PL$ for the PL/pgSQL code text, and $MESSAGE$ for the text to print with NOTICE:

yugabyte=# do $PL$ begin raise notice $MESSAGE$ select I'm a quote $MESSAGE$ ; end; $PL$;

NOTICE:   select I'm a quote
DO

This is equivalent to, but much easier to read, than:

yugabyte=# do ' begin raise notice '' select I''''m a quote '' ; end; ';

NOTICE:   select I'm a quote
DO

Because my DDL contains single quotes, or I want to document the intention of the text string, I built my ALTER TABLE … ADD COLUMN text with $SQL$.

The DDL I build dynamically needs to contain schemaname and tablename into it. Of course, I could concatenate strings like this:

select 'alter table '||schemaname||'.'||tablename||' add column' from pg_tables where schemaname='public' and tablename like 'customer%';

But identifiers should be double quoted when they contain special characters, reserved, or case sensitive words. The query could be:

select 'alter table "'||schemaname||'"."'||tablename||'" add column' from pg_tables where schemaname='public' and tablename like 'customer%';

I did that a lot with the Oracle database but PostgreSQL is better with the format() function. Omitting double quotes is a bug that you will probably encounter when you have table names that contain more than the basic characters. Similar to a printf() function, you can leave placeholders in the text to be replaced with the function arguments:

yugabyte=# select format('Hello %s','World');

   format
-------------
 Hello World
(1 row)

The same query, without the double quoted identifiers, is much simpler:

select format('alter table %s.%s add column', schemaname, tablename) from pg_tables where schemaname='public' and tablename like 'customer%';

And format has more than %s. I change it to %I to add quoted identifiers:

select format('alter table %I.%I add column', schemaname, tablename) from pg_tables where schemaname='public' and tablename like 'customer%';

Note that it does more than adding double quotes for all identifiers. It does this only when needed by the SQL identifier, because of case-sensitive name, reserved words, or special characters:

yugabyte=# select format('Hello %I','World');
    format
---------------
 Hello "World"
(1 row)

yugabyte=# select format('Hello %I','world');
   format
-------------
 Hello world

I like to align the keywords and use left padding by adding a width, and right padding with an additional ‘-‘ flag link in: %30I.%-30I

When you have single quotes, you can add them by using the dollar-quoted strings, but also with the %L in format:

yugabyte=# select format('Hello %L','world');

    format
---------------
 Hello 'world'
(1 row)

Sometimes you need the same argument at multiple places. Rather than repeating it in the list of arguments, an additional position can be interposed as %n$. You can try this one:

select format ('I %1$s, you %1$s, we all %1$s for ice %2$s. ', 'scream','cream');

This is how I generated my DDL statements with placeholders for schemaname, tablename but also some constants that I found more convenient to put as format’s arguments.

The next thing is to run it. Ending my query by ; displays the result. If I end it with \gexec it will execute it.

Summary: Generate SQL Script in PostgreSQL

To sum up, use:

  • $TAG$ to replace outer quotes
  • format() to build strings dynamically
  • %s for text, %L for test literals in single-quotes, %I for identifiers that may need double quotes
  • catalog views to get metadata
  • \gexec to execute the result

To clarify, the purpose of this is not to execute arbitrary DDL statements in production. Rather, it’s meant to create scripts that are easily maintainable, free from hardcoded text, and that won’t overlook double quotes for identifiers. Additionally, using a small additional where clause when querying from catalog views can aid in creating idempotent scripts that avoid executing existing statements again.

As a further example, I used this same technique in YugabyteDB to generate the 8 sequences. Watch my YugabyteDB Friday Tech Talk (YFTT), Best Practices Using Sequences, for more information.

Additional PostgreSQL Tips and Tricks

Generate Random Text Strings in PostgreSQL

Tips and Tricks Library

Franck Pachot

Related Posts

Explore Distributed SQL and YugabyteDB in Depth

Discover the future of data management.
Learn at Yugabyte University
Get Started
Browse Yugabyte Docs
Explore docs
PostgreSQL For Cloud Native World
Read for Free