Bruce Momjian

Postgres Blog


This blog is about my work on the Postgres open source database, and is published on Planet PostgreSQL. PgLife allows monitoring of all Postgres community activity.




Presentation Postgres 11 Features Presentation

Friday, September 14, 2018

Now that I have given a presentation about Postgres 11 features in New York City, I have made my slides available online.

 


Administration Multi-Host Pg_dump

Wednesday, September 12, 2018

You have probably used logical dumps as supported by pg_dump, and restores as performed by pg_restore or, more simply, psql. What you might not have realized are the many options for dumping and restoring when multiple computers are involved.

The simplest case is dumping and restoring on the same server:

$ pg_dump -h localhost -Fc test > /home/postgres/dump.sql
$ pg_restore -h localhost -d test < /home/postgres/dump.sql

or with a plain text dump:

$ pg_dump -h localhost -f /home/postgres/dump.sql test
$ psql -h localhost -f /home/postgres/dump.sql test

Where this gets interesting is with multiple hosts. You can:

$ # dump a remote database to your local machine
$ pg_dump -h remotedb.mydomain.com -f /home/postgres/dump.sql test
 
$ # dump a local database and write to a remote machine
$ pg_dump test | ssh postgres@remotedb.mydomain.com 'cat > dump.sql'
 
$ # dump a remote database and write to the same remote machine
$ pg_dump -h remotedb.mydomain.com test | ssh postgres@remotedb.mydomain.com 'cat > dump.sql'
 
$ # or a different remote machine
$ pg_dump -h remotedb1.mydomain.com test | ssh postgres@remotedb2.mydomain.com 'cat > dump.sql'

You have similar options for restores. I use psql below, but pg_restore works the same way:

$ # dump a remote database and restore to your local machine
$ pg_dump -h remotedb.mydomain.com test1 | psql test2
 
$ # dump a local database and restore to a remote machine
$ pg_dump test | ssh postgres@remotedb.mydomain.com 'psql test'
 
$ # dump a remote database and restore to the same remote machine
$ pg_dump -h remotedb.mydomain.com test1 | ssh postgres@remotedb.mydomain.com 'psql test2'
 
$ # or a different remote machine
$ pg_dump -h remotedb1.mydomain.com test | ssh postgres@remotedb2.mydomain.com 'psql test'

As you can see, there is a lot of flexibility possible.
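
If network bandwidth is a concern, the dump can also be compressed in transit. This is a minimal sketch, assuming gzip is installed on both machines:

$ # dump a local database, compress it in transit, and restore it remotely
$ pg_dump test | gzip | ssh postgres@remotedb.mydomain.com 'gunzip | psql test'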


Administration Monitoring Complexity

Monday, September 10, 2018

I have always had trouble understanding the many monitoring options available in Postgres. I was finally able to collect all popular monitoring tools into a single chart (slide 96). It shows the various levels of monitoring: OS, process, query, parser, planner, executor. It also separates instant-in-time reporting and across-time alerting/aggregation options.
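
For example, instant-in-time query monitoring needs nothing beyond the built-in statistics views. A minimal example using pg_stat_activity:

-- show currently-executing queries
SELECT pid, state, query
FROM pg_stat_activity
WHERE state <> 'idle';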


Security Signing Rows

Friday, September 7, 2018

With the rsa keys created in my previous blog entry, we can now properly sign rows to provide integrity and non-repudiation, which we did not have before. To show this, let's create a modified version of the previous schema by renaming the last column to signature:

CREATE TABLE secure_demo2 (
        id SERIAL, car_type TEXT, license TEXT, activity TEXT, 
        event_timestamp TIMESTAMP WITH TIME ZONE, username NAME, signature BYTEA);

Now, let's do the insert as before:

INSERT INTO secure_demo2
VALUES (DEFAULT, 'Mazda Miata', 'AWR-331', 'flat tire',
CURRENT_TIMESTAMP, 'user1', NULL) RETURNING *;
 id |  car_type   | license | activity  |        event_timestamp        | username | signature
----+-------------+---------+-----------+-------------------------------+----------+-----------
  1 | Mazda Miata | AWR-331 | flat tire | 2017-07-08 10:20:30.842572-04 | user1    | (null)

Now, we sign it using our private rsa key, rather than creating a random key for this:

SELECT ROW(id, car_type, license, activity, event_timestamp, username)
FROM secure_demo2
WHERE id = 1
 
-- set psql variables to match output columns
\gset
 
\set signature `echo :'row' | openssl pkeyutl -sign -inkey ~user1/.pgkey/rsa.key | xxd -p | tr -d '\n'`
 
UPDATE secure_demo2 SET signature = decode(:'signature', 'hex') WHERE id = 1;
 
SELECT * FROM secure_demo2;
 id |  car_type   | license | activity  |        event_timestamp        | username | signature
----+-------------+---------+-----------+-------------------------------+----------+-------------
  1 | Mazda Miata | AWR-331 | flat tire | 2017-07-08 10:20:30.842572-04 | user1    | \x857310...

To later verify that the data row has not been modified, we can do:

SELECT ROW(id, car_type, license, activity, event_timestamp, username), signature
FROM secure_demo2
WHERE id = 1;
                                     row                                     | signature
-----------------------------------------------------------------------------+-------------
 (1,"Mazda Miata",AWR-331,"flat tire","2017-07-08 10:20:30.842572-04",user1) | \x857310...
\gset
 
\echo `echo :'signature' | xxd -p -revert > sig.tmp`
\echo `echo :'row' | openssl pkeyutl -verify -pubin -inkey /u/postgres/keys/user1.pub -sigfile sig.tmp`
Signature Verified Successfully
 
\! rm sig.tmp

Because signature verification is done using the public key, anyone can verify that the data has not been modified. It also allows non-authors to verify that the data was created by the owner of the matching private key.

This and the previous two blog entries are related. The first one explained how to create and store a simple message authentication code (mac). The second one explained how to encrypt data on the client side using symmetric and public key cryptography. This blog entry shows how to do message authentication via public key signing, so anyone with access to the public key can verify authorship.


Security Client Row Access Control

Wednesday, September 5, 2018

Usually the database administrator controls who can access database data. However, it is possible for clients to completely control who can access data they add to the database, with the help of openssl.

First, let's create rsa keys for three users from the command line. We first create an rsa public/private key pair for each user in their home subdirectory and then make a copy of their rsa public key in the shared directory /u/postgres/keys:

# # must be run as the root user
# cd /u/postgres/keys
# # getent looks up each home directory; a quoted ~"$USER" does not tilde-expand
# for USER in user1 user2 user3
> do    HOMEDIR=$(getent passwd "$USER" | cut -d: -f6)
>       mkdir "$HOMEDIR"/.pgkey
>       chown -R "$USER" "$HOMEDIR"/.pgkey
>       chmod 0700 "$HOMEDIR"/.pgkey
>       openssl genpkey -algorithm RSA -out "$HOMEDIR"/.pgkey/rsa.key
>       chmod 0600 "$HOMEDIR"/.pgkey/*
>       openssl pkey -in "$HOMEDIR"/.pgkey/rsa.key -pubout -out "$USER".pub
> done

A more sophisticated setup would include creating a certificate authority and signing certificates for each user using the created keys. This allows the certificate authority to prove that the public keys belong to the specified users.
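
As a sketch of that approach, assuming a certificate authority key and certificate already exist in this directory under the hypothetical names ca.key and ca.crt, a signed certificate for user1 could be produced like this:

# # must be run as the root user
# cd /u/postgres/keys
# openssl req -new -key ~user1/.pgkey/rsa.key -subj /CN=user1 -out user1.csr
# openssl x509 -req -in user1.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out user1.crt
# rm user1.csr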

With this in place, it is now possible to encrypt data on the client using a public key that can only be decrypted by someone with access to the matching private key. Here is an example for user user1:

# echo test4 | openssl pkeyutl -encrypt -pubin -inkey /u/postgres/keys/user1.pub |
> openssl pkeyutl -decrypt -inkey ~user1/.pgkey/rsa.key
test4

This encrypts text with user1's public key, then decrypts it with their private key.

Now, let's create a table to hold the encrypted data and add some encrypted data:

CREATE TABLE survey1 (id SERIAL, username NAME, result BYTEA);
 
\set enc `echo secret_message | openssl pkeyutl -encrypt -pubin -inkey /u/postgres/keys/user1.pub | xxd -p | tr -d '\n'`
INSERT INTO survey1 VALUES (DEFAULT, 'user1', decode(:'enc', 'hex'));
 
-- store the other two users' results under the same survey id
\set enc `echo secret_message | openssl pkeyutl -encrypt -pubin -inkey /u/postgres/keys/user2.pub | xxd -p | tr -d '\n'`
INSERT INTO survey1 VALUES (lastval(), 'user2', decode(:'enc', 'hex'));
 
\set enc `echo secret_message | openssl pkeyutl -encrypt -pubin -inkey /u/postgres/keys/user3.pub | xxd -p | tr -d '\n'`
INSERT INTO survey1 VALUES (lastval(), 'user3', decode(:'enc', 'hex'));

We could have placed all the user-encrypted data in the same row using a bytea array:

CREATE TABLE survey2 (id SERIAL, username NAME[], result BYTEA[]);
 
\set enc1 `echo secret_message | openssl pkeyutl -encrypt -pubin -inkey /u/postgres/keys/user1.pub | xxd -p | tr -d '\n'`
\set enc2 `echo secret_message | openssl pkeyutl -encrypt -pubin -inkey /u/postgres/keys/user2.pub | xxd -p | tr -d '\n'`
\set enc3 `echo secret_message | openssl pkeyutl -encrypt -pubin -inkey /u/postgres/keys/user3.pub | xxd -p | tr -d '\n'`
 
INSERT INTO survey2 VALUES (
        DEFAULT,
        '{user1, user2, user3}',
        ARRAY[decode(:'enc1', 'hex'), decode(:'enc2', 'hex'), decode(:'enc3', 'hex')]::bytea[]);

We could also have stored the encrypted value only once by encrypting it with a random password, then encrypting that password with each user's public key and storing those copies:

CREATE TABLE survey3 (id SERIAL, result BYTEA, username NAME[], keys BYTEA[]);
 
\set key `openssl rand -hex 32`
\set enc `echo secret_message | openssl enc -aes-256-cbc -pass pass\::key | xxd -p | tr -d '\n'`
 
\set enc1 `echo :'key' | openssl pkeyutl -encrypt -pubin -inkey /u/postgres/keys/user1.pub | xxd -p | tr -d '\n'`
\set enc2 `echo :'key' | openssl pkeyutl -encrypt -pubin -inkey /u/postgres/keys/user2.pub | xxd -p | tr -d '\n'`
\set enc3 `echo :'key' | openssl pkeyutl -encrypt -pubin -inkey /u/postgres/keys/user3.pub | xxd -p | tr -d '\n'`
 
INSERT INTO survey3 VALUES (
        DEFAULT,
        :'enc',
        '{user1, user2, user3}',
        ARRAY[decode(:'enc1', 'hex'), decode(:'enc2', 'hex'), decode(:'enc3', 'hex')]::bytea[]);

To decrypt data using the first schema (one user per row, no arrays), user1 would do:

SELECT * FROM survey1 WHERE username = 'user1';
 id | username |    result
----+----------+-------------
  1 | user1    | \x80c9d0...
 
-- set psql variables to match output columns
\gset
 
-- 'cut' removes \x
\set decrypt `echo :'result' | cut -c3- | xxd -p -revert | openssl pkeyutl -decrypt -inkey ~user1/.pgkey/rsa.key`
 
SELECT :'decrypt';
    ?column?
----------------
 secret_message

The process would be similar for survey2 and survey3, as sketched below. Of course, this does not prevent database administrators from removing data from the database, and because public keys are used to encrypt, they could add data too. A message authentication code (mac) would allow such additions to be detected.
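
Here is a sketch of that decryption for survey2, assuming user1's answer is stored in the first array element:

SELECT result[1] AS result FROM survey2 WHERE id = 1;
 
-- set psql variable to match the output column
\gset
 
-- 'cut' removes \x
\set decrypt `echo :'result' | cut -c3- | xxd -p -revert | openssl pkeyutl -decrypt -inkey ~user1/.pgkey/rsa.key`
 
SELECT :'decrypt';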


Security Cryptographically Authenticated Rows

Friday, August 31, 2018

When storing data in the database, there is an assumption that you have to trust the database administrator to not modify data in the database. While this is generally true, it is possible to detect changes (but not removal) of database rows.

To illustrate this, let's first create a table:

CREATE TABLE secure_demo (
        id SERIAL, car_type TEXT, license TEXT, activity TEXT, 
        event_timestamp TIMESTAMP WITH TIME ZONE, username NAME, hmac BYTEA);

The last column (hmac) is used for change detection. Let's insert a row into the table:

INSERT INTO secure_demo
VALUES (DEFAULT, 'Mazda Miata', 'AWR-331', 'flat tire',
CURRENT_TIMESTAMP, 'user1', NULL) RETURNING *;
 id |  car_type   | license | activity  |       event_timestamp        | username | hmac
----+-------------+---------+-----------+------------------------------+----------+------
  1 | Mazda Miata | AWR-331 | flat tire | 2017-07-06 20:15:59.16807-04 | user1    |

Notice that this query also returns a text representation of the inserted row, including the computed columns id and event_timestamp.

To detect row changes, it is necessary to generate a message authentication code (mac) which is generated with a secret known only to the client. It is necessary to generate the mac on the client so the secret is never transferred to the server. These psql queries update the inserted row to store the mac:

SELECT ROW(id, car_type, license, activity, event_timestamp, username)
FROM secure_demo
WHERE id = 1;
 
-- set psql variables to match output columns
\gset
 
\set hmac `echo :'row' | openssl dgst -sha256 -binary -hmac 'MY-SECRET' | xxd -p | tr -d '\n'`
 
UPDATE secure_demo SET hmac = decode(:'hmac', 'hex') WHERE id = 1;
 
SELECT * FROM secure_demo;
 id |  car_type   | license | activity  |       event_timestamp        | username |   hmac
----+-------------+---------+-----------+------------------------------+----------+-------------
  1 | Mazda Miata | AWR-331 | flat tire | 2017-07-06 20:15:59.16807-04 | user1    | \x9549f1...

To later verify that the data row has not been modified, do:

SELECT ROW(id, car_type, license, activity, event_timestamp, username), hmac
FROM secure_demo
WHERE id = 1;
                                    row                                     |   hmac
----------------------------------------------------------------------------+-------------
 (1,"Mazda Miata",AWR-331,"flat tire","2017-07-06 20:15:59.16807-04",user1) | \x9549f1...
 
\gset
 
\echo  ' E''\\\\x'`echo :'row' | openssl dgst -sha256 -binary -hmac 'MY-SECRET' | xxd -p | tr -d '\n'`''''
 E'\\x9549f10d54c6a368499bf98eaca716128c732132680424f74cb00e1d5a175b63'
 
\echo :'hmac'
 E'\\x9549f10d54c6a368499bf98eaca716128c732132680424f74cb00e1d5a175b63'

The database administrator could replace or remove the hmac value, but this would be detected. This is because computing a proper hmac requires the MY-SECRET key, which is never sent to the server.

The above solution only allows someone with access to the secret key to determine if the row has been modified, meaning only they can check the message's integrity. A more sophisticated solution would be to use a private key to sign a hash of the row value; this would allow anyone with access to the public key to check that the row has not been modified, but still only allow those with access to the private key to generate a new signature. This would also allow for non-repudiation.

There is the risk that the row values returned by the insert do not match those that were supplied, so some client-side checks would need to be added. There is also no detection for removed rows; this is similar to the problem of trying to detect a blocked tls connection attempt.

Such a setup is clearly overkill for many databases, but there are some use-cases where data integrity guaranteed by the client, independent of the database administrator, is useful. Non-repudiation using public key infrastructure is also sometimes useful.


Security Foreign Data Wrappers and Passwords

Wednesday, August 29, 2018

Foreign data wrappers (fdw) allow data to be read and written to foreign data sources, like NoSQL stores or other Postgres servers. Unfortunately, the authentication supported by fdws is typically limited to passwords defined using create user mapping. For example, postgres_fdw only supports password-based authentication, e.g., scram. Though only the database administrator can see the password, this can still be a security issue.
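
For reference, this is where such passwords are defined; a minimal postgres_fdw sketch, assuming a remote server at remotedb.mydomain.com and the hypothetical names remote_pg and 'secret':

CREATE EXTENSION postgres_fdw;
 
CREATE SERVER remote_pg FOREIGN DATA WRAPPER postgres_fdw
        OPTIONS (host 'remotedb.mydomain.com', dbname 'test');
 
CREATE USER MAPPING FOR user1 SERVER remote_pg
        OPTIONS (user 'user1', password 'secret');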

Ideally, at least some of the Postgres fdws should support more sophisticated authentication methods, particularly SSL certificates. Another option would be to allow user authentication to be sent through fdws, so the user has the same permissions on the fdw source and target. There is no technical reason fdw authentication is limited to passwords. This problem has been discussed, and it looks like someone has a plan for solving it, so hopefully it will be improved soon.


Security Certificate Revocation Lists

Monday, August 27, 2018

If you are setting up Postgres server or client TLS/SSL certificates, be sure to also configure support for a certificate revocation list (crl). This list, distributed by the certificate authority, lists certificates that should no longer be trusted.

While the crl will initially likely be empty, a time will come when a private key used by a certificate or device is exposed in an unauthorized manner, or an employee who had access to private keys leaves your organization. When that happens, you will need the ability to invalidate certificates — having that ability pre-configured will help, especially during a crisis.
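
Configuring crl checking is simple; a sketch, assuming the certificate authority distributes a hypothetical root.crl file:

# server side, in postgresql.conf; used to check client certificates
ssl_crl_file = 'root.crl'
 
$ # client side; libpq checks ~/.postgresql/root.crl by default
$ cp root.crl ~/.postgresql/root.crl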


Performance Oracle Real Application Clusters (RAC)

Monday, July 9, 2018

I get asked about Oracle RAC often. My usual answer is that Oracle RAC gives you 50% of high availability (storage is shared, mirroring helps) and 50% of scaling (CPU and memory are scaled, storage is not). The requirement to partition applications to specific nodes to avoid cache consistency overhead is another downside. (My scaling presentation shows Oracle RAC.)

I said the community is unlikely to go in the Oracle RAC direction because it doesn't fully solve a single problem and is overly complex; the community prefers simple solutions that fully solve a problem.

For me, streaming replication fully solves the high availability problem and sharding fully solves the scaling problem. Of course, if you need both, you have to deploy both, which gives you 100% of two solutions, rather than Oracle RAC which gives you 50% of each.

However, I do think database upgrades are easier with Oracle RAC, and adding/removing nodes is much easier than with sharding. For me, this chart summarizes it:

                         HA   Scaling  Upgrade Add/Remove
        Oracle RAC       50%     50%    easy    easy
        Streaming Rep.  100%     25%*   hard    easy
        Sharding          0%    100%    hard    hard
        
        * Allows read scaling

(I posted this to pgsql-general in 2016.)


Performance Query Planner Interview

Saturday, June 23, 2018

Software Engineering Radio has just posted a one-hour audio recording of an interview I did about the Postgres query optimizer. It is generic enough to be useful to anyone wanting to understand how relational databases optimize queries.

On an unrelated note, I am leaving soon for a 25-day European Postgres speaking tour. I am speaking at one-day conferences in Zürich, London, and Amsterdam, a user group in Frankfurt, and will be presenting at ten EnterpriseDB events across Europe.


Presentation Will Postgres Live Forever?

Thursday, June 7, 2018

I had the opportunity to present an unusual topic at this year's Postgres Vision conference: Will Postgres Live Forever? It is not something I talk about often but it brings out some interesting contrasts in how open source is different from proprietary software, and why innovation is fundamental to software usage longevity. For the answer to the question, you will have to read the slides.


News Draft of Postgres 11 Release Notes

Thursday, May 17, 2018

I have completed the draft version of the Postgres 11 release notes. Consisting of 167 items, this release makes big advances in partitioning, parallelism, and server-side transaction control via procedures. One of the more unexpected yet useful features is "Allow 'quit' and 'exit' to exit psql when used in an empty buffer".
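
For example, a sketch of what that looks like in a Postgres 11 psql session:

$ psql test
psql (11.0)
Type "help" for help.
 
test=> quit
$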

The release notes will be continually updated until the final release, which is expected in September or October of this year.


Documentation Intermediate Certificates

Monday, January 22, 2018

I previously mentioned the importance of high-quality documentation, so we are always looking for improvements. This email thread from 2013 attempted to codify the rules for how to properly use intermediate ssl/tls certificates with Postgres. At that time, our documentation was updated to recommend storing intermediate certificates with root certificates because it was unclear under what circumstances intermediate certificates are transferred to the remote server to be chained to a trusted root certificate.

During research for my four security talks, I studied certificate handling. I found certificate chain resolution rules in the verify manual page. In testing various certificate locations, I also found that Postgres follows the same rules.

Based on this testing, I realized the conclusions reached in 2013 were inaccurate, or at least incomplete. While the documented procedure worked, the more practical and recommended approach is to store intermediate certificates (created with v3_ca extensions) with leaf certificates to be sent to the remote end. (I think the requirement of using the v3_ca extension when creating intermediate certificates is what caused much of the testing confusion in the past.)

This new procedure allows short-lived leaf and intermediate certificates to be replaced when they expire, while long-lived root certificate stores remain unchanged. For example, for clients to verify the server's certificate, the server would contain the intermediate and server's leaf certificates, and clients only need root certificates, which rarely change.
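
A sketch of that layout, assuming the hypothetical file names leaf.crt, intermediate.crt, and root.crt:

$ # the server sends its leaf plus the intermediate (ssl_cert_file defaults to server.crt)
$ cat leaf.crt intermediate.crt > $PGDATA/server.crt
 
$ # clients need only the long-lived root
$ cp root.crt ~/.postgresql/root.crt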

The documentation of all supported Postgres versions has been updated to recommend this new procedure. I have also added sample scripts showing how to create root-leaf and root-intermediate-leaf certificate chains.

These changes will be distributed in the next minor Postgres releases, scheduled for next month. Until this new documentation is released, you can read the updates in the Postgres 11 docs in the server and libpq ssl sections. I am hopeful this clarified documentation will encourage people to use ssl and ssl certificate verification.


Presentation Four New Security Talks

Monday, January 15, 2018

In the past few months I have completed four new security talks, totaling 294 slides. The first and third talks explain the fundamentals of cryptography and cryptographic hardware, respectively. The second and fourth talks cover applications of those fundamentals: the second covers tls, including the use of ssl certificates by Postgres, and the fourth covers the use of cryptographic hardware by applications, including Postgres.


Conference Video of Russian Interview

Wednesday, January 10, 2018

I just did a two-hour interview in English with the Russian Postgres user group. A video recording of the interview is online and covers questions asked by the Russian attendees.


Community Web Forums?

Friday, January 5, 2018

This email thread explores the idea of the community supporting web forums instead of, or in addition to, the email lists, where the majority of community development and discussion happens. Reasons stated for not pursuing web forums included:

  • It has been tried before and failed
  • Larger web forum communities already exist, e.g., Stack Overflow
  • Established Postgres community members prefer email

Ultimately I think the Postgres community needs to do a better job of publicizing the existence of external communities that help Postgres users, e.g., Slack. As an example, the Postgres irc channel is well publicized and currently has 1,100 connected users. Also, EnterpriseDB created the Postgres Rocks web forum six months ago.


Documentation Wal and Xlog

Wednesday, January 3, 2018

Postgres isn't the best at naming things. Of course, there is the old computer saying, "There are only two hard things in Computer Science: cache invalidation and naming things." With Postgres being 31 years old and developed by several different project teams, naming can be even more inconsistent.

One naming inconsistency, which we have lived with for years, is the name of the write-ahead log. Postgres references this using the acronym wal in server variables like wal_level, but the PGDATA directory containing the write-ahead log files was called pg_xlog. In "pg_xlog," the "x" stands for "trans," which is short for "transaction," and of course "log" means "log," so "xlog" was short for "transaction log." This was also confusing because there is a pg_clog directory, which records transaction status information (commits, aborts). So "xlog" or "transaction log" was already a bad name, and having it also referenced as wal just made it worse.

Postgres 10 has made the difficult change of removing references to "xlog" and "clog," renaming the directories to "pg_wal" and "pg_xact" for consistency. This email thread covers many of the gory details of what we changed and why. It isn't ideal to change the names of internal database objects, and it will cause some pain to those moving to Postgres 10, but future users will have a more consistent experience of how Postgres works. The renaming extended to the administration functions too, as shown below.
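
For example, to force a write-ahead log file switch:

-- Postgres 9.6 and earlier
SELECT pg_switch_xlog();
 
-- Postgres 10 and later
SELECT pg_switch_wal();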
