Bruce Momjian

Postgres Blog

This blog is about my work on the Postgres open source database, and is published on Planet PostgreSQL. PgLife allows monitoring of all Postgres community activity.

Online status:
Unread Postgres emails:
Email graphs: incoming, outgoing, unread, commits (details), events

Documentation Postgres 9.6 Features

Wednesday, September 21, 2016

I have written a presentation covering the important features in Postgres 9.6 and some of the features we hope for in Postgres 10.


Documentation Be Prepared for Prepare

Wednesday, June 15, 2016

Prepared statements are queries where the constants used in the query are separated from the query text. While this improves security by avoiding SQL injection attacks, it also allows repeatedly-executed queries to avoid parsing and planning overhead by executing saved generic plans that represent queries with typical constants. While generic plans don't have the advantage of being planned based on the statistics of specific constants, the avoidance of parsing and planning overhead is often more beneficial.

Before Postgres 9.2, generic plans were used for all prepared queries. In Postgres 9.2, logic was added to use a generic plan only if it has a cheaper cost after five or more executions. Unfortunately, this behavior was undocumented, causing confusion for users surprised to find the explain plans of prepared queries changing after five executions, sometimes for the worse.

After much discussion, this has been remedied by improving the Notes section of the 9.6 prepare manual page. It now explains how generic plans are created, and when they are used. I have also created an sql script that, when run through psql (output), illustrates the documented behavior.

Post a Comment

Conference Lots-O-Travel

Tuesday, May 31, 2016

Since January, I have had the pleasure of speaking about Postgres in 15 cities: Singapore, Seoul, Tokyo, San Francisco, Los Angeles, Phoenix, St. Louis, Bloomington (Illinois), Chicago, Charlotte, New York City, Brussels, Helsinki, Moscow, and Krasnoyarsk (Siberia).

I am particularly excited about the new cities I visited in Asia, and growth there will continue in the coming months. You can see from my travel map that the two areas still lacking Postgres activity are the Middle East and Africa. Fortunately, Umair Shahid has already started on the Middle East.

In more Postgres-saturated continents, like North America and Europe, there are now several conferences per year during different months, in different cities, and with different focuses. While proprietary database companies usually have just one huge conference a year per continent, our distributed conference teams allow for smaller, more frequent, more geographically distributed conferences, which better meet the needs of our users. Smaller conferences allow for more interaction with speakers and leaders. More frequent conferences allow people to attend a conference quickly, rather than waiting eleven months for the next yearly conference. Geographically distributed conferences allow for reduced travel costs, which is particularly important for first-time attendees. I realize seeing thousands of Postgres people together is motivating, but once that wears off, the benefits of more, smaller conferences are hard to beat.

Post a Comment

Presentation Non-Relational Postgres

Monday, April 25, 2016

Having worked on Postgres for twenty years, I have seen a lot of features added. However, I usually evaluate new features in isolation and rarely see the big picture of how they fit together. Last month I created a new talk, Non-Relational Postgres, which presents a comprehensive overview of non-relational Postgres storage.

The consistent use of @> (containment operator) listed on slide 70 is interesting. It also has good coverage of how character strings can be split apart and indexed.

Post a Comment

Tip Layers of Security

Saturday, April 9, 2016

Security can be complex, and for databases, it can be very complex because there are so many different users and access patterns. Postgres supports nine layers of security from the client application to a specific data row. I thought it would be interesting to list them, in order:

  1. listen_addresses: This controls what interfaces the server listens to. localhost listens just on the local network interface (e.g., while * listens on all interfaces. unix_socket_directories, unix_socket_group, and unix_socket_permissions control Unix-domain sockets in a similar manner.
  2. pg_hba.conf: This file allows specification of eleven authentication methods, filtered by host name, IP address, network range, database name, user name, and ssl usage.
  3. nologin: Setting a user to nologin prevents future logins.
  4. allow_connections: Setting the database option allow_connections to false prevents future connections to that database.
  5. grant ... connect: This controls which roles can connect to the specified database.
  6. connection limit: This per-database and per-user setting limits the number of connections; a zero value prevents new connections.
  7. grant ... schema: Usage permission is required to access objects inside a schema. (By default, the public schema has usage and create permissions granted to the public role.)
  8. grant ... table: Per-command access, e.g. select, insert, is configurable per table.
  9. row security: New in Postgres 9.5, this allows row-level access control using a create policy expression.

So, the next time you can't access something, consider the security layers involved.

Post a Comment

Performance Cpus Are Slowing Us Down

Friday, April 1, 2016

In my Database Hardware Selection Guidelines talk, I emphasized that database servers are usually limited by I/O and memory constraints, not cpu. However, on slide 24 I mentioned one use-case that is cpu-bound — read-only workloads where the working set fits into ram. Over time, this distinction has gotten more pronounced with the addition of faster I/O (SSDs) and larger-memory systems. What hasn't improved much is cpu speed, though cpu core count has certainly increased.

A subset of cpu-bound workloads are data warehouse queries, where the working set fits into ram and a large percentage of time is spent in the executor. (The executor runs a state machine, a "plan", created by the optimizer.) This workload distinction is well outlined on slide 5 of a Vitesse DB talk. For queries that spend most of their time in the executor and process data already in ram, executor overhead is significant.

There were two presentations at the recent PGDay Asia conference that highlight projects designed to reduce executor overhead, leading to massive speedups for specific workloads. The more limited approach was by Kumar Rajeev Rastogi of Huawei. Their approach is to create shared object libraries at table creation time that know about the column structure of each table — this allow specific columns to be accessed rapidly by reducing row access overhead.

A more radical approach is taken by Vitesse DB (closed source). At runtime, Vitesse DB compiles optimizer-generated plans into object files that are run during execution. This avoids executor state-machine overhead. Plans that cannot be compiled are run using the normal state-machine executor. (Both projects use LLVM.)

This clearly shows that there are workloads where the overhead of the executor is significant, and that groups trying to solve this problem. I think the Postgres community distribution is going to need to address this someday as well. (While parallelism and sharding allow multiple cpus to be used for a single query, they don't improve single-cpu throughput.)

Post a Comment

News Oracle Attacks Postgres in Russia

Tuesday, March 22, 2016

During my twenty years with Postgres, I knew the day would come when proprietary databases could no longer ignore Postgres and would start attacking us.

Well, that day has come, at least in Russia. During the past few weeks, Oracle sent a letter (Russian, English translation) to Russian partners and customers comparing Oracle favorably to Postgres as a way of cirumventing a new law favoring Russian-produced software. This is the first direct attack I have seen on Postgres, and is probably representative of the kinds of attacks we will see from other vendors and in other countries in the years to come.

The press has picked up on the news (Russian, English) and given balanced coverage. Comments on the English article were, in general, positive — I particularly liked this one. There are two Hacker News threads about it (1, 2), a community thread about it, and another community thread about Oracle RAC.

Taken together, this is an interesting time for Postgres.

Post a Comment

Thoughts The Plan for FDW-Based Sharding

Thursday, February 25, 2016

Earlier this week I posted an email to the hackers list outlining a plan for built-in sharding. Later emails in the thread clarify many of my initial remarks. (I also posted a presentation about this a year ago.)

The emails detail the steps being taken to enhance existing Postgres features (foreign data wrappers (FDW), parallelism, partitioning) to minimize the code changes necessary to add sharding. While there is still uncertainty about the final design, what workloads it will support, and even the probability of success, steady progress is being made. As feature enhancements are completed, the path to built-in Postgres sharding will become clearer.

View or Post Comments