Bruce Momjian

Postgres Blog


This blog is about my work on the Postgres open source database, and is published on Planet PostgreSQL. PgLife allows monitoring of all Postgres community activity.

Online status:
Unread Postgres emails:
Email graphs: incoming, outgoing, unread, commits (details), events



Presentation Non-Relational Postgres

Monday, April 25, 2016

Having worked on Postgres for twenty years, I have seen a lot of features added. However, I usually evaluate new features in isolation and rarely see the big picture of how they fit together. Last month I created a new talk, Non-Relational Postgres, which presents a comprehensive overview of non-relational Postgres storage.

The consistent use of @> (containment operator) listed on slide 70 is interesting. It also has good coverage of how character strings can be split apart and indexed.

 


Tip Layers of Security

Saturday, April 9, 2016

Security can be complex, and for databases, it can be very complex because there are so many different users and access patterns. Postgres supports nine layers of security from the client application to a specific data row. I thought it would be interesting to list them, in order:

  1. listen_addresses: This controls what interfaces the server listens to. localhost listens just on the local network interface (e.g. 127.0.0.1), while * listens on all interfaces. unix_socket_directories, unix_socket_group, and unix_socket_permissions control Unix-domain sockets in a similar manner.
  2. pg_hba.conf: This file allows specification of eleven authentication methods, filtered by host name, IP address, network range, database name, user name, and ssl usage.
  3. nologin: Setting a user to nologin prevents future logins.
  4. allow_connections: Setting the database option allow_connections to false prevents future connections to that database.
  5. grant ... connect: This controls which roles can connect to the specified database.
  6. connection limit: This per-database and per-user setting limits the number of connections; a zero value prevents new connections.
  7. grant ... schema: Usage permission is required to access objects inside a schema. (By default, the public schema has usage and create permissions granted to the public role.)
  8. grant ... table: Per-command access, e.g. select, insert, is configurable per table.
  9. row security: New in Postgres 9.5, this allows row-level access control using a create policy expression.

So, the next time you can't access something, consider the security layers involved.

Post a Comment

Performance Cpus Are Slowing Us Down

Friday, April 1, 2016

In my Database Hardware Selection Guidelines talk, I emphasized that database servers are usually limited by I/O and memory constraints, not cpu. However, on slide 24 I mentioned one use-case that is cpu-bound — read-only workloads where the working set fits into ram. Over time, this distinction has gotten more pronounced with the addition of faster I/O (SSDs) and larger-memory systems. What hasn't improved much is cpu speed, though cpu core count has certainly increased.

A subset of cpu-bound workloads are data warehouse queries, where the working set fits into ram and a large percentage of time is spent in the executor. (The executor runs a state machine, a "plan", created by the optimizer.) This workload distinction is well outlined on slide 5 of a Vitesse DB talk. For queries that spend most of their time in the executor and process data already in ram, executor overhead is significant.

There were two presentations at the recent PGDay Asia conference that highlight projects designed to reduce executor overhead, leading to massive speedups for specific workloads. The more limited approach was by Kumar Rajeev Rastogi of Huawei. Their approach is to create shared object libraries at table creation time that know about the column structure of each table — this allow specific columns to be accessed rapidly by reducing row access overhead.

A more radical approach is taken by Vitesse DB (closed source). At runtime, Vitesse DB compiles optimizer-generated plans into object files that are run during execution. This avoids executor state-machine overhead. Plans that cannot be compiled are run using the normal state-machine executor. (Both projects use LLVM.)

This clearly shows that there are workloads where the overhead of the executor is significant, and that groups trying to solve this problem. I think the Postgres community distribution is going to need to address this someday as well. (While parallelism and sharding allow multiple cpus to be used for a single query, they don't improve single-cpu throughput.)

Post a Comment

News Oracle Attacks Postgres in Russia

Tuesday, March 22, 2016

During my twenty years with Postgres, I knew the day would come when proprietary databases could no longer ignore Postgres and would start attacking us.

Well, that day has come, at least in Russia. During the past few weeks, Oracle sent a letter (Russian, English translation) to Russian partners and customers comparing Oracle favorably to Postgres as a way of cirumventing a new law favoring Russian-produced software. This is the first direct attack I have seen on Postgres, and is probably representative of the kinds of attacks we will see from other vendors and in other countries in the years to come.

The press has picked up on the news (Russian, English) and given balanced coverage. Comments on the English article were, in general, positive — I particularly liked this one. There are two Hacker News threads about it (1, 2), a community thread about it, and another community thread about Oracle RAC.

Taken together, this is an interesting time for Postgres.

Post a Comment

Thoughts The Plan for FDW-Based Sharding

Thursday, February 25, 2016

Earlier this week I posted an email to the hackers list outlining a plan for built-in sharding. Later emails in the thread clarify many of my initial remarks. (I also posted a presentation about this a year ago.)

The emails detail the steps being taken to enhance existing Postgres features (foreign data wrappers (FDW), parallelism, partitioning) to minimize the code changes necessary to add sharding. While there is still uncertainty about the final design, what workloads it will support, and even the probability of success, steady progress is being made. As feature enhancements are completed, the path to built-in Postgres sharding will become clearer.

Post a Comment