Internals Specialized Cpu Instructions

Friday, December 8, 2023

C compilers do a great job of converting C to assembly language (optionally) and then to primitive cpu instructions. However, some cpu instructions are too specialized or too complex to map to the C language. For example, Postgres has used test-and-set cpu instructions, written in assembly language, for decades to provide good performance for high-concurrency workloads. We have such cpu instructions for:

Since 2014, Postgres has supported atomics, which provide test-and-set, compare-and-exchange, and atomic addition. We also use compiler built-in intrinsic functions that map directly to cpu instructions for crc computations. If you are using tls/ssl with OpenSSL, the OpenSSL library will use specialized cpu instructions (if supported) to perform aes encryption.

A new feature in Postgres 16 is the use of single instruction, multiple data (simd) intrinsic functions on x86_64 and Arm cpus. Simd functions allow identical operations on 16-byte blocks — this is Postgres's first use of non-special-purpose cpu instructions. The blocks can be treated as 16 bytes or four 4-byte integers. The operations include 16-byte:

Most of these operate at the byte level, but some also process four-byte integers. These four-byte instructions are used for:

I am hopeful that more specialized cpu instruction uses will be added in future releases.

View or Post Comments

Thoughts In Praise of PostgreSQL

Wednesday, December 6, 2023

At the risk of causing undue pride in the community, I want to share a blog post in praise of Postgres. This early paragraph captures the author's sentiments:

After 25 years of persistence, and a better logo design, Postgres stands today as one of the most significant pillars of profound achievement in free software, alongside the likes of Linux and Firefox. PostgreSQL has taken a complex problem and solved it to such an effective degree that all of its competitors are essentially obsolete, perhaps with the exception of SQLite.

And he ends with, "When I think of projects which have made the most significant impacts on the free software ecosystem, and on the world at large, PostgreSQL has a place on that list."

View or Post Comments

Thoughts Multiply Time

Friday, November 24, 2023

Reading the title, you might think it is about multiplying interval values. Well, no, it is about something much larger.

If you are an application developer, database administrator, or Postgres internals developer, what value do you give to society? Well, you are paid, or people appreciate your work, so you must provide some value. Is it food, clothing, shelter, or entertainment? No, not directly. So, what is it? Time multiplication.

In this video, Rory Vaden explains that there are three kinds of tasks you can do in a day: urgent tasks, important tasks, and significant tasks. (The video is based on his book, Procrastinate on Purpose: 5 Permissions to Multiply Your Time.) He explains how to prevent urgent tasks from crowding out important tasks, but his big focus is on significant tasks, that is, tasks that, if done now, give you more time in the future. He calls this multiplying your time and justifies its significance because completing such tasks now frees up time far into the future.

So, what benefit do application developers, database administrators, and Postgres internals developers provide? They multiply time. Application developers automate tasks for users, so users have more time to do other things. Database administrators make the work of application developers more efficient. Postgres internals developers make Postgres users more efficient by adding features that make configuration easier, applications easier to write, and the database faster. When Postgres becomes faster, e.g., 1.2% in Postgres 16, how does that make an impact? Well, every operation, on average, is faster, so how many operations happen in a day? How many Postgres databases are running? Once you do the math, even a small improvement makes a big impact.

What about a performance engineer who spends a day making one query slightly faster — was it worth it? Yes, but only if the query is run many times. The benefit is not from a single query execution, because the query likely didn't take a day to run. The benefit is the performance improvement multiplied by the number of times the query will be run in the future. You can find the most time-consuming queries using pgBadger reports or by querying pg_stat_statements. It is funny that Postgres has tools to analyze such time investments.
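
For example, a query like the following (a minimal sketch, assuming the pg_stat_statements extension is installed) lists the statements that have consumed the most total execution time:

SELECT query, calls, total_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;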

With these ideas in mind, you might look at your job differently — I know I have.

View or Post Comments

Time Indexing timestamps

Wednesday, November 22, 2023

Timestamps are very precise and flexible in Postgres, but sometimes users want to do index lookups of timestamp values with less precision, i.e., by date. To illustrate this, let's create a sample table with 100k timestamp values:

CREATE TEMPORARY TABLE date_test (event_time TIMESTAMP WITH TIME ZONE);
 
INSERT INTO date_test
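-- b * 0 references the outer generate_series value so the scalar subquery is re-evaluated for each row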
SELECT
(
    SELECT '2023-03-01 00:00:00'::timestamptz +
           (floor(random() *
            (extract(EPOCH FROM '2023-04-01'::timestamptz) -
             extract(EPOCH FROM '2023-03-01'::timestamptz)) +
            b * 0)::integer || 'seconds')::interval
)
FROM generate_series(1, 100000) AS a(b);
 
ANALYZE date_test;

An initial per-date indexing approach might be to create an index on the timestamp column, cast to date:

CREATE INDEX i_date_test ON date_test ((event_time::date));
ERROR:  functions in index expression must be marked IMMUTABLE

This fails because the index might be accessed by sessions with different timezone values. An index created using the timezone value of the current session would yield incorrect results if used by a session with a different timezone value, so such indexes cannot be created. One solution is to create the index with a specific time zone:

SHOW timezone;
     TimeZone     
------------------
 America/New_York
 
CREATE INDEX i_date_test_nyc ON date_test (((event_time AT TIME ZONE 'America/New_York')::date));

This works, but the index can only be used when a matching time zone is specified in the query:

SELECT COUNT(*)
FROM date_test
WHERE (event_time AT TIME ZONE 'America/New_York')::date = '03-07-2023';
 count 
-------
  3244
 
EXPLAIN (COSTS OFF)
SELECT COUNT(*)
FROM date_test
WHERE (event_time AT TIME ZONE 'America/New_York')::date = '03-07-2023';
                                                 QUERY PLAN                                                  
-------------------------------------------------------------------------------------------------------------
 Aggregate
   ->  Bitmap Heap Scan on date_test
         Recheck Cond: (((event_time AT TIME ZONE 'America/New_York'::text))::date = '2023-03-07'::date)
         ->  Bitmap Index Scan on i_date_test_nyc
               Index Cond: (((event_time AT TIME ZONE 'America/New_York'::text))::date = '2023-03-07'::date)
 
EXPLAIN (COSTS OFF)
SELECT COUNT(*)
FROM date_test
WHERE event_time::date = '03-07-2023';
                        QUERY PLAN
-----------------------------------------------------------
 Aggregate
   ->  Seq Scan on date_test
         Filter: ((event_time)::date = '2023-03-07'::date)

While casting to date only provides day granularity, date_trunc() with a time zone specification, available in Postgres 14 and later, can be used for more fine-grained indexing, though I am only showing date granularity below:

CREATE INDEX i_date_test_trunc ON date_test ((date_trunc('day', event_time, 'America/New_York')));
 
SELECT COUNT(*)
FROM date_test
WHERE date_trunc('day', event_time, 'America/New_York') = '03-07-2023';
 count
-------
  3244
 
EXPLAIN (COSTS OFF)
SELECT COUNT(*)
FROM date_test
WHERE date_trunc('day', event_time, 'America/New_York') = '03-07-2023';
                                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate
   ->  Bitmap Heap Scan on date_test
         Recheck Cond: (date_trunc('day'::text, event_time, 'America/New_York'::text) = '2023-03-07 00:00:00-05'::timestamp with time zone)
         ->  Bitmap Index Scan on i_date_test_trunc
               Index Cond: (date_trunc('day'::text, event_time, 'America/New_York'::text) = '2023-03-07 00:00:00-05'::timestamp with time zone)

A more flexible approach is to index the timestamp column directly, without casts, and specify the date range in the query:

CREATE INDEX i_date_test ON date_test (event_time);
 
SELECT COUNT(*)
FROM date_test
WHERE event_time >= '03-07-2023' AND event_time < '03-08-2023';
 count 
-------
  3179
 
EXPLAIN (COSTS OFF)
SELECT COUNT(*)
FROM date_test
WHERE event_time >= '03-07-2023' AND event_time < '03-08-2023';
                                           QUERY PLAN
---------------------------------------------------------------------------------------------------
 Aggregate
   ->  Bitmap Heap Scan on date_test
         Recheck Cond: ((event_time >= '2023-03-07 00:00:00-05'::timestamp with time zone) AND
                        (event_time < '2023-03-08 00:00:00-05'::timestamp with time zone))
         ->  Bitmap Index Scan on i_date_test
               Index Cond: ((event_time >= '2023-03-07 00:00:00-05'::timestamp with time zone) AND
                            (event_time < '2023-03-08 00:00:00-05'::timestamp with time zone))
 
SELECT COUNT(*)
FROM date_test
WHERE event_time >= '03-07-2023' AND event_time < '03-07-2023'::date + '1 day'::interval;
 count 
-------
  3179
 
EXPLAIN (COSTS OFF)
SELECT COUNT(*)
FROM date_test
WHERE event_time >= '03-07-2023' AND event_time < '03-07-2023'::date + '1 day'::interval;
                                          QUERY PLAN
---------------------------------------------------------------------------------------------------
 Aggregate
   ->  Bitmap Heap Scan on date_test
         Recheck Cond: ((event_time >= '2023-03-07 00:00:00-05'::timestamp with time zone) AND
                        (event_time < '2023-03-08 00:00:00'::timestamp without time zone))
         ->  Bitmap Index Scan on i_date_test
               Index Cond: ((event_time >= '2023-03-07 00:00:00-05'::timestamp with time zone) AND
                            (event_time < '2023-03-08 00:00:00'::timestamp without time zone))
 
EXPLAIN (COSTS OFF)
SELECT COUNT(*)
FROM date_test
WHERE event_time >= '03-07-2023' AND event_time < '03-07-2023 08:32:00'::timestamptz + '1 minute'::interval;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Aggregate
   ->  Bitmap Heap Scan on date_test
         Recheck Cond: ((event_time >= '2023-03-07 00:00:00-05'::timestamp with time zone) AND 
                        (event_time < ('2023-03-07 08:32:00-05'::timestamp with time zone + '00:01:00'::interval)))
         ->  Bitmap Index Scan on i_date_test
               Index Cond: ((event_time >= '2023-03-07 00:00:00-05'::timestamp with time zone) AND 
                            (event_time < ('2023-03-07 08:32:00-05'::timestamp with time zone + '00:01:00'::interval)))

The last two queries show you don't need to hard-code the ending date — it can be computed, and the last query shows you can use arbitrary granularity with the same index.

We have shown three ways to index timestamp values: casting to date with a time zone specification, using date_trunc() with a time zone specification, and indexing the raw timestamp with a range specified in the query. Looking at these options:

  1. Date index cast: only day granularity, fixed to a time zone
  2. date_trunc(): fixed to a granularity and time zone
  3. Timestamp index: unlimited granularity, flexible time zone

why would you ever use anything but a plain timestamp index? The other indexing options are useful if storing a fixed time zone and granularity in the index is desirable, or if you wish to create a unique index, which requires a specific granularity and time zone to be stored in the index.
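
For example, a unique index like the one below enforces at most one row per New York calendar day (a minimal sketch; it would fail on the sample table above, which already contains many rows per day):

CREATE UNIQUE INDEX i_date_test_unique ON date_test ((date_trunc('day', event_time, 'America/New_York')));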

In summary, most people assume the first two options are the logical approach, but the third option is the most flexible and powerful. The timestamp index option does require specifying a range in the query, while the previous ones do not, though those require a time zone specification and have other downsides.

View or Post Comments

Sql Is Sql Good?

Monday, November 20, 2023

The Postgres mailing lists are full of practical discussions, but two years ago there was a 77-email thread titled "The tragedy of sql" that questioned the utility of the sql query language; it is worth a review. While opening with "A fun philosophical discussion," it states:

The world's economic output would be substantially higher (5%?) if our industry had settled on almost anything other than sql for relational databases.

It disparages object-relational mappers and suggests Datalog as an alternative query language.

The replies are difficult to summarize. Some explained that no computer language is perfect, with cobol, fortran, C, and assembler getting honorable mentions. Some mentioned that sql is unlike other languages because it is set-oriented and declarative. Some commented on whether sql's English-language-like syntax was a benefit. Quel gets mentioned as an alternative query language (and the language used during Postgres's Berkeley days). One long reply questioned whether it was wise to closely couple the sql language with data storage. The ubiquity of sql was also mentioned.

It is hard to summarize all the opinions, but I will close with this email reply:

I have the opposite perspective. As a dev/manager, sql is much more powerful at getting data storage from abstract concept, into a usable structure, than any programming language I've had the (mis)fortune of using. I've long since lost count of the massive volume of other people's code (especially orms) I've removed and replaced by updating sql statements to do all the logic, and return me exactly what I want. And typically this also comes with a (sometimes massive) performance gain.

I've managed many a programmer that would complain that sql is too hard and they don't want to learn it, but had no problem spending days learning the orm of the month that "saves them time" and writing complex inscrutable monstrosities with them.

Could sql be better? Absolutely. But in terms of bang-for-my-buck, I feel learning sql has saved me more clock-time, and improved my productivity/value probably more than any other individual language in my career.

and this one:

There is nothing natural about either relations or arrays and hashes/dictionaries. Relations are [a] pretty literal implementation of basic set theory. Having a decent understanding of basic set theory is a condition for understanding sql. Now, we can discuss whether a language implementing a mathematical theory is "good" or "bad", whatever the meaning of "good" or "bad" in the given context. Historically, sql is a good fit for the banking business and accounting and that is why it is still around.

Another misconception about sql is treating it as a general purpose programming language. sql is data description language, nothing more, nothing less. It doesn't need loops, arrays, hashes or subroutines, its principal purpose is to describe a subset of data. Without sql you would have to read all the data and filter the unnecessary stuff yourself. Furthermore, part of sql are so-called "acid requirements". Transaction can only see the data that was committed before the transaction has begun. Implementing acid takes a lot of code, that's what mvcc is all about. However, that too has its origins in accounting. You cannot pay the money you don't have. And the last thing about sql is transaction management. Without relational databases and sql, you would need a proprietary transaction manager, just like MongoDB. And MongoDB has a complex proprietary transaction manager and is losing market share. So, to recapitulate:

That is why sql is still around. And that is why we all live in a yellow subroutine (this reference is not for the millenials or younger).

View or Post Comments

Sql Transaction Block Isolation Levels

Friday, November 17, 2023

When a single query is run outside of a transaction block, it is clear how transaction visibility should behave — only transactions committed before the query started should be visible. Changes made by other sessions during the query, even if committed during the query, should not be visible. Slide 11 of my MVCC Unmasked presentation illustrates this.

For queries run in transaction blocks, the ideal behavior is less clear, so Postgres supports three options for controlling transaction block behavior. The default, read committed, causes each new query in a transaction block to get a new visibility snapshot, meaning that each new query sees changes that were committed by other sessions before the query started.

This might not be ideal, so Postgres supports repeatable read. In this mode, instead of getting a new snapshot at the start of each query in a transaction block, a snapshot is taken when the first query in the transaction block is issued, and this snapshot is used for all queries in the transaction block. This means that changes committed by other sessions during the transaction block's execution are not seen by any statements in the block. This is ideal for read-only transactions, such as those used for reporting. It guarantees that all statements see a consistent view of the database.
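
For example, a reporting session might wrap its queries like this (a minimal sketch using hypothetical tables); every statement in the block sees the same snapshot:

BEGIN ISOLATION LEVEL REPEATABLE READ, READ ONLY;
SELECT SUM(amount) FROM orders;    -- hypothetical reporting queries
SELECT COUNT(*) FROM customers;    -- sees the same snapshot as the query above
COMMIT;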

The third supported isolation level is serializable. It uses the same fixed snapshot during the transaction block, but it also checks for row changes that would be impossible if the transactions were executed serially, meaning non-concurrently. If such changes are detected, one of the transactions is aborted. Applications that do proper row locking with FOR UPDATE and FOR SHARE probably don't need to use this mode, but it is helpful for applications that prefer to retry transactions rather than guarantee proper locking behavior.

As you can see, there is a lot more functionality available beyond the default read committed isolation level, whether for read-only transactions or for applications that would rather retry transactions than implement proper locking. You might not need these other two isolation levels in your first Postgres application, but they can come in handy.

View or Post Comments

Sql Lateral Usage

Wednesday, November 15, 2023

Lateral is a powerful sql feature that allows virtual tables created in the from clause to reference real or virtual tables that appear earlier in the same from clause. Here is a simple example:

CREATE TABLE test (x) AS SELECT generate_series(1,3);
 
SELECT * FROM test AS test1 (x) JOIN LATERAL (SELECT test1.x + 1) AS test2 (x) ON (true);
 x | x
---+---
 1 | 2
 2 | 3
 3 | 4

You will notice that the subquery references test1.x, which is defined earlier in the from clause. If lateral is not supplied, an error is generated:

SELECT * FROM test AS test1 (x) JOIN (SELECT test1.x + 1) AS test2 (x) ON (true);
ERROR:  invalid reference to FROM-clause entry for table "test1"
LINE 1: SELECT * FROM test AS test1 (x) JOIN (SELECT test1.x + 1) AS...
                                                     ^
DETAIL:  There is an entry for table "test1", but it cannot be referenced from this part of the query.
HINT:  To reference that table, you must mark this subquery with LATERAL.

This query highlights a helpful hint that was added in Postgres 16. Of course, this query could have been written by adding the computed value to the target list, rather than the from clause:

SELECT *, test1.x + 1 FROM test AS test1 (x);
 x | ?column?
---+----------
 1 |        2
 2 |        3
 3 |        4

Here are two lateral queries that cannot be rewritten as additional target list columns:

SELECT * FROM test AS test1 (x) JOIN LATERAL (SELECT power(test1.x, test.x) FROM test) AS test2 (x) ON (true);
 x | x
---+----
 1 |  1
 1 |  1
 1 |  1
 2 |  2
 2 |  4
 2 |  8
 3 |  3
 3 |  9
 3 | 27
 
SELECT * FROM test AS test1 (x) JOIN LATERAL (SELECT power(test1.x, test.x) FROM test) AS test2 (x) ON (true) WHERE test1.x != test2.x;
 x | x
---+----
 2 |  4
 2 |  8
 3 |  9
 3 | 27

The first query requires lateral because the subquery generates multiple test2 rows for each test1 row. The second query also performs a comparison in the where clause, which must happen before the target list is generated.

Using columns defined in previous parts of the from clause is also common with function calls, and since there is no ambiguity for function calls, they do not require the lateral keyword, as seen below:

SELECT * FROM test AS test1 (x) JOIN LATERAL generate_series(1, test1.x) AS test2 (x) ON (true);
 x | x
---+---
 1 | 1
 2 | 1
 2 | 2
 3 | 1
 3 | 2
 3 | 3
 
SELECT * FROM test AS test1 (x) JOIN generate_series(1, test1.x) AS test2 (x) ON (true);
 x | x
---+---
 1 | 1
 2 | 1
 2 | 2
 3 | 1
 3 | 2
 3 | 3
 
SELECT * FROM test AS test1 (x) JOIN generate_series(1, test1.x) AS test2 (x) ON (true) WHERE test1.x != test2.x;
 x | x
---+---
 2 | 1
 3 | 1
 3 | 2

The final query shows a where clause restriction that references both the table and the function call. As you can see, lateral adds new flexibility to sql by allowing subqueries and function calls that reference columns appearing earlier in the from clause.

This is my 700th Postgres blog entry. I created a blog category index years ago to help people find relevant entries.

View or Post Comments

Sql Combining Queries into Ctes

Monday, November 13, 2023

My common table expression (cte) talk shows how ctes can be used to combine individual queries into a single cte query. This email thread explores the downsides of such combinations, and when combining is unreasonable. Its conclusions are:

The most glowing report of cte combining is from this email, saying, "I have yet to regret replacing a transaction with a cte over the past decade."
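
As a minimal sketch of the idea, using hypothetical tables, two statements that might otherwise run in a transaction block can often be combined into a single query using a data-modifying cte:

WITH archived AS (
    -- move old rows and hand them to the outer query
    DELETE FROM orders
    WHERE shipped_date < CURRENT_DATE - 30
    RETURNING *
)
INSERT INTO orders_archive
SELECT * FROM archived;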

View or Post Comments

Sql All About All

Friday, November 10, 2023

Sql is a powerful declarative language — you tell it what you want, and the optimizer determines the fastest way to produce the desired result. Of course, the language is not perfect. I already blogged about the non-sequential order in which select clauses are executed.

The unusual behavior I would like to explain here concerns union, intersect, and except. These three clauses allow queries to be connected with the result being either combined, intersected, or subtracted, respectively. In most aspects of the sql language, you add keywords to cause additional processing, e.g., add unique to a column definition to make it unique, add order by or limit to a select to order or limit the result.

Unfortunately, that is not how union, intersect, and except behave — they actually perform a distinct operation if no modifier keyword is supplied. In fact, union, intersect, and except support the rarely-used keyword distinct to be explicit about enabling distinct processing, i.e., union distinct. To prevent this distinct behavior, all must be used. In fact, I recommend that union, intersect, and except always use distinct or all so it is clear that a decision has been made about duplicate behavior. (I have not done so below to highlight how easily this behavior can be overlooked.)

I have created so many erroneous queries in the past that every time I use union, intersect, and except I always have to think, "Do I want all here?" and usually I do. I know the default distinct behavior is based on relational algebra, but practically I find it to be error-prone and wish it were different. However, the sql standard requires the default distinct behavior, so query writers just need to be careful when writing queries involving union, intersect, and except and consider the use of all in each case.

In researching my Beyond Joins and Indexes presentation, I discovered some unusual except behavior. When using union and intersect, I never considered the stage at which distinct processing happens; I knew it always happened before results were returned, e.g.:

VALUES (1), (1), (1), (2), (2) UNION VALUES (1), (1);
 column1
---------
       1
       2

The fact that there are three 1's and two 2's in the first values clause is insignificant because duplicates are removed before output. This is similarly true for intersect:

VALUES (1), (1), (1), (2), (2) INTERSECT VALUES (1), (1);
 column1
---------
       1

It doesn't matter how many duplicates exist on either side of union or intersect. However, except is different — if duplicate removal only happens before the results are returned, then the presence of duplicates on either side of the except clause could affect the output, e.g.:

VALUES (1), (1), (1), (2), (2) EXCEPT VALUES (1), (1);
 column1
---------
       2

In this query, if duplicate removal happened only before the result was returned, the third 1 in the first values clause would not be removed, and hence should appear in the output. What I learned is that duplicate removal effectively happens before the queries are combined by except (no 1 is returned). I say "effectively" because Postgres, and probably other relational systems, do not perform actual duplicate removal; instead they use a HashSetOp set operation that records only the first occurrence of each value, effectively removing duplicates as they are encountered and producing the distinct behavior shown above.

With except all you can see the output with no duplicate removal:

VALUES (1), (1), (1), (2), (2) EXCEPT ALL VALUES (1), (1);
 column1
---------
       1
       2
       2

I used to think of union, intersect, and except without all as simply running the joined queries, and then performing the union, intersect, or except operations. A better mental model is to consider the first query as run and stored, with duplicate removal, and later queries combined with the stored result, also with duplicate removal.

View or Post Comments

Security Scram Is Here to Stay

Wednesday, November 8, 2023

Scram was first introduced in Postgres 10 and has been enhanced in most major releases since. In summary:

Postgres 10: Added scram-sha-256 password encryption support for client/server communication and password storage

Postgres 11: Added scram channel binding support

Postgres 13: Allowed libpq to require channel binding

Postgres 14: Made scram-sha-256 the default method for password storage

Because scram has been improved over such a long time period, it can be difficult to understand how all the improvements work together. Postgres 10 introduced a new password hashing method, sha-256, and a new way of processing and transmitting passwords, scram (Salted Challenge Response Authentication Mechanism). Together they are called "scram-sha-256".

Postgres 11 added support for channel binding. This feature causes the server to send a hash of the user's password that includes the server's certificate signature. (Ssl/tls is required for channel binding.) Though the certificate signature is public, the certificate can only be used in an ssl handshake by someone who knows its private key, so it is impossible for a malicious server in the middle to impersonate the certificate's owner and successfully send the hash to the client. Unfortunately, Postgres 11 didn't give libpq a way to require channel binding, so Postgres 13 added this capability.
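
For example, a libpq-based client can refuse to connect unless channel binding is used by setting the channel_binding connection parameter, shown here with a hypothetical host and database:

$ psql "host=db.example.com dbname=test sslmode=require channel_binding=require"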

Postgres 14 took on the difficult task of changing the default hashing method for server-side password storage. I say "difficult" not because the code change was hard to make (it was not), but because the new default, if not modified by the user, would cause all new passwords to be invalid for clients that only understand md5.

Another complexity could have been that all users can't update their passwords simultaneously, so specifying an authentication method in pg_hba.conf during this transitional period could have been hard. However, the md5 authentication method was modified to support both md5 and scram-sha-256 password storage and authentication, depending on how the password is stored, which made the transition period easier. Administrators can periodically query the system tables to see if any md5-stored passwords remain and eventually switch the authentication method to scram-sha-256 in pg_hba.conf. The stored password prefix indicates its type:

CREATE USER sha256_user PASSWORD '123';
 
SET password_encryption = 'md5';
 
CREATE USER md5_user PASSWORD '123';
 
SELECT rolname, rolpassword FROM pg_authid WHERE rolname LIKE '%_user';
   rolname   |                                                              rolpassword
-------------+---------------------------------------------------------------------------------------------------------------------------------------
 sha256_user | SCRAM-SHA-256$4096:1Ad/QlH6owvCj5vXAUh2CQ==$OQU2vrsq1xSiVoBqCCQGKJUH9TM9hQlD0JilwDG+nhM=:tw//Wq8d8PIenpNsMKYACSK7KTvwHuJUClC6uMP5Elg=
 md5_user    | md57bf609b499ae3c13394233461f186b90
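
Building on the output above, a query like the following (a minimal sketch) reports the roles that still have md5-stored passwords and therefore still require the md5 authentication method:

SELECT rolname FROM pg_authid WHERE rolpassword LIKE 'md5%';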

As you can see, this feature took a while to fully implement, partly because the various scram options were not clear at the time we started, and partly because it took time for most deployed client libraries to understand this new protocol.

View or Post Comments

Presentation Explaining the Postgres Query Optimizer Improvements

Monday, November 6, 2023

I wrote Explaining the Postgres Query Optimizer in 2011, and though I have presented it 22 times, it has changed little over the years. (I created a followup presentation this year.) Periodically, I check my slides to make sure they match the output of the current Postgres version.

I did this recently using the file associated with the presentation and found that the explain plans for the final limit queries were different. I updated my slides, and then went to find the optimizer improvement responsible for this change. After applying a patch to enable compiling of older branches, I found the commit was from January 20, 2017.

What was odd is that the patch added logical replication features, and did not modify the optimizer. Thanks to Andrew Gierth, I discovered that the commit changed my explain plans because it increased the number of pg_class rows by 3%, and that changed the costs enough to change the explain plans.

Previously in the slides, limit switched only from a hash join to a nested loop join. With this change, outlined on slide 59, limit and vacuum cause plan changes in this order:

  1. Hash join
  2. Nested loop join with two index scans
  3. Merge join with two index scans
  4. Hash join
  5. Merge join with index scan and sort

I have seen subtle changes in the number of table rows cause plan changes, but I have never seen limit plans change to such an extent.

View or Post Comments

Presentation Presentations from Sql Queries

Friday, November 3, 2023

Having 61 presentations on my website, I am always looking for efficient ways to create them. For presentations that focus on sql features, I have found the most efficient method is to first write the sql queries in the order I want them to appear in my presentation. I then run the queries through psql and write its output to a text file. Next, I import the file into my document editor, LyX. I then add slide headings, markup, colors, and supporting text. Sometimes there are too many sql queries to efficiently manage in a single file, so I use multiple files, name them with numeric prefixes, e.g., 01, 02, and concatenate them together before feeding them into psql.

Over time, I sometimes modify the queries in the LyX document, so I have written a script to extract sql queries from the LyX document — that way, any modifications I make are automatically reflected in the sql file associated with each presentation. I have used this method for 11 presentations. The build process is controlled by Makefiles, so all 61 presentations (2,696 slides), and their associated sql files, can be built with a single command, which, thanks to parallelism, takes only 26 seconds.

View or Post Comments

Performance Sharding Status

Wednesday, November 1, 2023

While Postgres excels at supporting many workloads, including modern ones, there are still some workloads it cannot handle. Two areas that Postgres is currently unable to handle involve large data volumes:

Postgres is a great general-purpose database, and it might be fine if Postgres were deemed unacceptable for such workloads, especially if supporting such workloads required a major redesign of Postgres. However, could Postgres be modified incrementally to support such workloads? This is what the Postgres development community has been working on since 2015.

By improving partitioning, parallelism, and foreign data wrappers, sharding can be added to Postgres to handle the above workloads. These slides show how partitioning has been improved over the years, and these slides show how parallelism has been improved. While it would be good if foreign data wrappers could take part in all the partitioning and parallelism optimizations, one missing piece is the ability to execute partial (non-partitionwise) aggregates on partitions in parallel when the partitions are foreign tables. I am hopeful this patch by Fujii-san will be in Postgres 17 to add that capability.

I am unclear how much the existing sharding features are being used. As server hardware keeps getting more powerful, the limits of what is possible on a single server continue to expand. However, there continue to be workloads that cannot be efficiently processed by a single server, so it will be interesting to see if Postgres can expand its abilities to handle such workloads.

View or Post Comments

Administration last_seq_scan and last_idx_scan

Monday, October 30, 2023

Postgres 16 includes new columns in pg_stat_*_tables to record the last sequential and index scans on tables, and a new column in pg_stat_*_indexes to record the last index scan for specific indexes. These fields are only updated on transaction commit to minimize their overhead.

\d pg_stat_user_tables
                      View "pg_catalog.pg_stat_user_tables"
       Column        |           Type           | Collation | Nullable | Default
---------------------+--------------------------+-----------+----------+---------
 relid               | oid                      |           |          |
 schemaname          | name                     |           |          |
 relname             | name                     |           |          |
 seq_scan            | bigint                   |           |          |
 last_seq_scan       | timestamp with time zone |           |          |
 seq_tup_read        | bigint                   |           |          |
 idx_scan            | bigint                   |           |          |
 last_idx_scan       | timestamp with time zone |           |          |
…

The initial purpose of these new fields was to assist in identifying unnecessary indexes via old last_idx_scan values, and possibly the need for new indexes via recent last_seq_scan values. Of course, comparing sequential and index scan counts over time can also yield this information, but that requires preplanning. When presenting in Chicago, someone mentioned that old last_seq_scan and last_idx_scan values might indicate tables that are unnecessary. Of course, I would never automate removal of such tables, but this is a way to indicate that research is needed to determine if the tables are still necessary.
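
For example, a query like this (a minimal sketch; the three-month cutoff is arbitrary) lists indexes that have not been scanned recently, or ever:

SELECT schemaname, relname, indexrelname, last_idx_scan
FROM pg_stat_user_indexes
WHERE last_idx_scan IS NULL
   OR last_idx_scan < CURRENT_TIMESTAMP - INTERVAL '3 months'
ORDER BY last_idx_scan NULLS FIRST;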

View or Post Comments

Monitoring Query_id and pg_stat_statements

Friday, October 27, 2023

In 2014, someone in Korea requested that we add a query_id to pg_stat_activity to match the queryid in pg_stat_statements. This would allow administrators to get duration estimates of currently running queries. Postgres 14 added this capability, so here is an example of it being used. First, we must install pg_stat_statements:

$ make -C contrib/pg_stat_statements install
 
$ psql test <<END
CREATE EXTENSION pg_stat_statements;
ALTER SYSTEM SET shared_preload_libraries = pg_stat_statements;
\q
END
 
$ pg_ctl restart

With that installed, we will run a query in the background, find the running query_id from pg_stat_activity, and then look up the average execution time in pg_stat_statements:

$ # Run to create pg_stat_statements entry
$ psql -c 'SELECT pg_sleep(2.001);' test
 
$ psql -c 'SELECT pg_sleep(2.001);' test &
$ # wait for the background query to start
$ sleep 1
 
$ psql test <<END
SELECT query, query_id FROM pg_stat_activity WHERE query LIKE '%pg_sleep(2.001%' AND pid != pg_backend_pid()
\gset
          query          |      query_id
-------------------------+--------------------
 SELECT pg_sleep(2.001); | -11391618518959119
 
SELECT mean_exec_time FROM pg_stat_statements WHERE queryid = :query_id;
   mean_exec_time
--------------------
 2004.1557771666664
 
-- joining the tables
SELECT mean_exec_time
FROM pg_stat_activity, pg_stat_statements
WHERE   pg_stat_activity.query LIKE '%pg_sleep(2.001%' AND
        pid != pg_backend_pid() AND
        pg_stat_activity.query_id = pg_stat_statements.queryid;
   mean_exec_time
--------------------
 2004.1557771666664
END

You can see that pg_stat_statements accurately estimated the duration of this running query. Of course, this query's duration was predictable, but most sql query durations are not, so this method can be helpful.

View or Post Comments

Humor Postgres Music

Wednesday, October 25, 2023

It's great to see creative work inspired by Postgres, and I know of two songs about Postgres:

They were done a few years ago but are still enjoyable. I am not sure I am going to hear Feliz Navidad the same way ever again.

View or Post Comments

Hardware Power-Loss Protection

Monday, October 23, 2023

I have written extensively about the performance benefits of allowing caches to operate in write-back mode rather than write-through mode, and for databases this is only safe if cache data loss during power failure is prevented.

Over a decade ago, the only way to have a power-failure-safe cache was to have a battery, often called a battery-backed unit (bbu), on a raid controller. With the introduction of ssds, power-failure-safe caches were implemented using supercapacitors. Because power-failure-safe caching is something that few applications other than databases care about, product literature often didn't even mention the feature, which required purchasers to study product images, looking for batteries or supercapacitors to verify the feature's existence. Intel produced a helpful document about the feature in 2014.

Things have advanced in recent years, and the industry has standardized on several terms to indicate power-failure-safe caching without mentioning the actual hardware involved, e.g., batteries, supercapacitors. The most common term is "power-loss protection" (plp) and is used by:

Some Crucial products implement "power loss immunity," which is more limited than power-loss protection and not suitable for database workloads.

When I called an AVA Direct salesman to explain my new server needs, he had never heard of power-loss protection, probably because they don't configure many database servers with local storage. Once I explained its purpose, we decided on a Micron 7450 Pro nvme ssd, which clearly mentioned the power-loss protection feature. While manufacturers have gotten better at specifying power-failure-safe cache features, it still takes a little work to find supported models, and "power-loss protection" is the phrase to look for when searching for supported products.

View or Post Comments

Security Tde Status Report

Thursday, October 19, 2023

Transparent Data Encryption (tde) at the cluster level has been supported by the Oracle database for many years and has become a requirement of many security specifications and best practices. The Oracle documentation lists two practical advantages of the feature:

Interestingly, it does not mention increased protection from data being accessed by database users, database superusers, or operating system users, and that is because tde has limited value in protecting against such access. It does also automatically encrypt file system backups and provide encrypted storage in environments where operating system-level encryption is impossible.

Encryption is great — and you probably use it every day in your browser. It can also be used to log in to remote servers and send encrypted files. Postgres has supported encrypted communication between client and server for decades. In all of these cases, the encryption protects against man-in-the-middle attacks, and the two end points are assumed to be secure. However, how can encryption be used successfully between a database process and its storage? The normal encryption use-case breaks down here — are we suspecting a man-in-the-middle attack between the database process and the storage? Do we consider the file system insecure? Probably not. What Postgres wants is to write data to the file system and storage that only it can read back unencrypted.

As already stated, this is unlike typical encryption usage, where encryption and decryption happen in assumed-to-be-secure environments and only the transport is assumed to be insecure. For Postgres, its behavior is affected by the file system, so if the file system can be compromised, so can Postgres. This quandary has kept tde from being implemented in Postgres for so many years because it undermines the security value of tde encryption. In summary, tde protects against malicious users with read-only access to the file system, its storage media, and backups, but it does not protect against write access to the file system or malicious users with read-only access to database process memory.

"What about compliance?", you might ask. Regulatory compliance is an external value of tde, not a security value inherent in the feature itself. I realize this distinction might be meaningless for those who operate in environments that require such regulatory compliance, but that doesn't change the fact that implementing a feature because of its inherent value or because of external requirements is different. Every Postgres feature requires code and maintenance, and the Postgres community is rigorous in implementing features whose inherent value warrants the additional code and maintenance.

However, that doesn't mean there aren't Postgres tde options. The most creative solution is a virtual file system product by Thales. Postgres reads and writes to their virtual file system, and the Thales software writes that data encrypted to a regular file system and decrypts it on read. A more Postgres-specific implementation was done years ago by Cybertec. A patched version of Postgres 12 is available, and their Github repository supports Postgres 15. Several proprietary versions of Postgres have added tde, including edb, Crunchy Data, and Fujitsu. That's a lot of companies duplicating effort that could be done more efficiently in the community, but the community has resisted adding the feature so far, and these companies are meeting an obvious need. DBaaS cloud providers can provide simpler storage encryption since the file system is usually inaccessible to users.

An open source implementation of tde has just been announced by Percona. It is marked as experimental and is implemented as a table access method. It does not encrypt indexes, the write-ahead log (wal), or temporary tables, so it is not a full-cluster encryption solution.

The community continues to study options for full-cluster encryption. I wrote a set of patches in 2021 that implemented heap and index file encryption, but write-ahead log encryption is still needed. Additional code to encrypt temporary files is one of the most complex and invasive burdens of adding tde. Even if all of this code were written, it is unclear whether it would be accepted, since every feature added to Postgres must balance code overhead against the inherent value of the feature. (Perhaps someday the external value of features will be more influential.) This email discussion is the most recent overview of the value of tde and its limitations.

View or Post Comments

NoSQL Peanut Butter and Chocolate

Tuesday, October 17, 2023

Postgres's non-relational data type support has greatly increased its adoption in recent years. The use case I have heard is that Postgres supports relational data, and it also supports non-relational data like json. This ability is often called multi-model.

However, I am not sure that is the right focus — it highlights Postgres's ability to store two types of data workloads in the same system, but it doesn't highlight the power of supporting data that can be seamlessly organized using both approaches. Let me explain.

People normally say, "Oh, I can store my financial data using Postgres's relational data types, and I can store my browser data in jsonb." However, what about storing most of your financial data using relational data types, and storing the financial data that doesn't fit cleanly into the relational model in jsonb, full text search, or PostGIS? And some of your browser data should probably be pulled out of jsonb and stored using relational data types for better consistency and performance.

In the 1970s, there was a marketing campaign for Reese's Peanut Butter Cups that had peanut butter and chocolate eaters colliding, creating a taste that was better than the two individual flavors. I think Postgres's multi-model capability fits that better-together idea as well.

View or Post Comments

NoSQL Data Gravity

Monday, October 9, 2023

I found the term "data gravity" confusing when I first heard it because it combines a physics/astronomy term with an information technology term. However, once it was explained to me, it made perfect sense and gave a name to a behavior I have seen frequently.

The idea is that once data is stored, other data, for efficiency purposes, ideally should be stored near it. As the amount of data grows, it becomes almost necessary for other data to be stored near it. The "gravity" concept is that large amounts of data attract other data to be near it.

We see "data gravity" all the time. If we have a directory with files, and we are adding a related file, for efficient access, we put the new file in the same directory. My saved browser bookmarks are stored together. Authentication credentials are usually managed centrally. If you store data in a database, it is usually most efficient to store new data there too.

And this is where specialized databases can cause problems. Is Postgres perfect for every data need? Certainly not. Is it the most efficient place to store most of an enterprise's data? Probably. If Postgres or another central data store doesn't fit all your needs, then certainly choose a specialized data store, but realize that you are working against data gravity, and there will be costs associated with distance from the majority of your data. Those costs can be managed, but be sure to figure them into your decision when choosing a specialized data store.

Someone once complained that MongoDB was creating undesirable data silos in their organization. I thought they meant that their MongoDB data didn't have an externally-visible structure and was therefore hard to query by applications that did not create the data. While that is true, they also meant that MongoDB was defying data gravity by creating barriers that inhibited the efficient sharing of data that is possible in a centralized data store designed for multiple applications and data needs.

Specialized data stores will always be around, and we need them for specialized purposes, but more general data stores give vast efficiency benefits that should not be overlooked.

View or Post Comments

NoSQL The Relational Future Is Bright

Friday, October 6, 2023

Ten years ago, I published an article titled "The Future of Relational Databases." At the time, I stated that while "NoSQL is all the rage," relational systems, particularly Postgres, still have huge advantages and are easily adjusting to today's workloads.

Having been involved with Postgres for decades, I am always filtering my opinions to try to avoid possible bias. While I think I based my opinion on historical and technical facts, I always wondered whether I missed something because I am so involved with Postgres and had limited knowledge of the competing NoSQL solutions. I had similar doubts in the late 1990s when I felt MySQL was an inferior solution, for technical and development-process reasons, an assessment that turned out to be accurate.

During the intervening ten years, I saw NoSQL ascend and decline, clearly following the hype cycle of over-enthusiasm for new technology, while Postgres continued its gradual ascent. If I had any doubts about my previous NoSQL opinion, those doubts were eliminated by the opening keynote at this week's PGConf NYC. The keynote, titled What Goes Around Comes Around… And Around… by Andy Pavlo, went back in history to show the growth of relational systems since the 1980s and why specialty database challengers fell short. (The paper upon which the slides are based has more details.)

Today the advantage of relational systems, particularly Postgres with its open source development process and extensibility, has gotten even stronger. We are entering a phase where Postgres is the only reasonable relational database option. We have been working on improving Postgres for 27 years, and at Berkeley for ten years before that, and it seems like we might have many decades to go.

View or Post Comments

Null Null Presentation Updates

Thursday, October 5, 2023

Having delivered my Nulls Make Things Easier? talk at PGConf NYC this week, I found two ways to improve the talk. First, in researching my Beyond Joins and Indexes talk, I was reminded (slide 65) of the difference in null handling between not in and not exists. I realized this information should be in my nulls talk, so I added slide 25 to cover it.

Second, someone asked during the Q&A if I could summarize how to avoid problems with nulls, so I added slide 45 at the end. Humorously, my major recommendation was to liberally use not null column constraints where appropriate to prevent nulls from existing in the first place. Fewer nulls, fewer problems.

View or Post Comments

Documentation Railroad Diagrams

Monday, October 2, 2023

Railroad diagrams are a way of graphically representing computer language grammar. While Postgres documents command-line tools and SQL commands using the Unix manual-page-style textual representation, there is value in offering the Postgres syntax in a more graphical way.

Fortunately, two years ago Domingo Alvarez Duarte converted the Postgres grammar to Extended Backus-Naur Form (ebnf), which can be fed into online tools to generate railroad diagrams. The Postgres grammar diagram output is available, though it is long. The select diagram might be easier to understand. You can click on the tan boxes to see details.

View or Post Comments

Business Oracle Supports Postgres

Friday, September 29, 2023

At CloudWorld 2022, an Oracle executive announced support for a managed Postgres cloud service and an optimized version called Postgres Aries. Eleven months later, Oracle has announced limited availability of their Postgres cloud service, with full availability in December. (FYI, oci in this article stands for Oracle Cloud Infrastructure.) They even said some nice things about Postgres:

PostgreSQL has long been a beacon in the open source database world. With over 35 years of rigorous development, it boasts an impressive track record for reliability, robust features, and performance. Highlighted by DB-engines, its rise in market share is a testament to its adaptable data model and diverse extensions catering to a variety of use cases.

In general, companies only add products that have features that their existing products lack, and that seems to be the case here.

Oracle is "thrilled" to be offering Postgres, and I think the Postgres community should be thrilled too. With two of the three big proprietary relational vendors, Microsoft and ibm, already promoting Postgres, Oracle was the last holdout. With this announcement, they join the other big cloud vendors like aws, Google, and Alibaba in offering Postgres.

Update: New details 2023-12-10

Update: Now available 2024-03-09

View or Post Comments

Documentation Transaction Processing Chapter

Wednesday, September 27, 2023

Postgres 16 has a new chapter about transaction processing internals, which includes details about transaction identifiers, locking, subtransactions, and two-phase transactions. While this is new in Postgres 16, it also applies to all previous supported releases.

View or Post Comments

Configuration Setting Per-User Server Variables

Monday, September 25, 2023

Postgres server variables offer powerful control over how the database behaves. Privileged users often want to set superuser-restricted server variables for non-superusers. The simplest approach is for the superuser to issue alter role ... set on the non-superuser role. If the superuser wants the non-superuser role to have more control over a server variable, a security definer function can be used, e.g.:

-- Use a transaction block so the function is only visible with the proper permissions.
BEGIN WORK;
 
-- This function allows log_statement to be set to only 'ddl' or 'mod'.
CREATE OR REPLACE FUNCTION set_log_statement (value TEXT)
RETURNS VOID AS
$$
BEGIN
        IF value = 'ddl' OR value = 'mod'
        THEN    EXECUTE format('SET log_statement TO %1$s', value);
        ELSE    RAISE EXCEPTION 'log_statement cannot be set to "%"', value;
        END IF;
END
$$
LANGUAGE plpgsql
SECURITY DEFINER;
 
-- Grant non-superusers permission to execute this function.
REVOKE EXECUTE ON FUNCTION set_log_statement FROM PUBLIC;
GRANT EXECUTE ON FUNCTION set_log_statement TO non_superuser;
 
COMMIT WORK;
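
Connected as non_superuser, the role can then adjust log_statement within the allowed values; a session might look roughly like this (a sketch, with error output abbreviated):

SELECT set_log_statement('mod');     -- allowed
 
SELECT set_log_statement('all');     -- rejected by the function
ERROR:  log_statement cannot be set to "all"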

Postgres 15 added the ability for grant to give non-superusers permission to change superuser-restricted server variables, e.g.:

GRANT SET ON PARAMETER log_statement TO non_superuser;
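
These parameter-level grants are recorded in the pg_parameter_acl system catalog, added in Postgres 15, so a query like this (a minimal sketch) shows what has been granted:

SELECT parname, paracl FROM pg_parameter_acl;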

As you can see, superusers can adjust superuser-only server variables for non-superusers in several ways:

View or Post Comments

Presentation Postgres 16 Features Presentation

Wednesday, September 20, 2023

Now that I have given a presentation about Postgres 16 features in Jakarta and Chicago, I have made my slides available online.

View or Post Comments

Conference Selecting Conferences

Monday, September 18, 2023

Since the covid crisis ended, the number of Postgres conferences has returned to normal. As a frequent speaker, I regularly have to decide which conferences to attend, so I came up with the following criteria to guide my decisions:

I hope publishing these criteria helps other speakers who have to make similar decisions.

View or Post Comments

Community Leadership Abuse

Friday, September 15, 2023

U.S. President Harry Truman had a defining quote displayed on his desk, "It is amazing what you can accomplish if you do not care who gets the credit." Leaders who embrace this concept attract engaged team members who invest in their goals.

Unfortunately, not everyone follows this concept. Some people are driven by desires for power and recognition, and are willing to exploit organizational weaknesses to accomplish their goals. As I mentioned earlier, the Postgres community structure is very open, and this gives people who want power and recognition the opportunity to get involved quickly, with few guard rails to limit their efforts. They abuse this freedom and try to gain outsized influence. The community usually tolerates their behavior for a long time but eventually sees the harm being done and reacts, often completely eliminating the offending person's influence. These kinds of reactions happen rarely, so they often surprise those who are punished because they can't remember seeing or hearing about any previous punishments and thought the community was open and without limits.

In fact, the initial punishment is often invisible. It might be as simple as people ignoring them, not accepting them as speakers, or not attending their events. For people who desire power and recognition, this is a serious blow and they start to look for other ways of exploiting the system, leading to further problems which can cause more public punishments. The community also adds structures to avoid similar problems in the future.

In the past I have warned several people who eventually got into such problems. Unfortunately, they thought my warnings were simply negative and that I didn't understand the huge value they were bringing to the community with their "leadership". Once they end up in a big hole, they contact me to ask how to get out, and I am often unable to help since so much damage has been done.

View or Post Comments

Community Who's in Charge?

Monday, September 11, 2023

When you get involved with a new organization, you assume it is like previous organizations you have been involved with that have a hierarchical structure. It has a person or group of people in charge, who are either given their positions by the organization's owners or shareholders, or are elected by the organization's members. Those in charge control limited resources, like money, budgets, or property. Those not in charge work within the constraints set by those in charge, and their decision-making is limited to implementing the goals assigned to them.

Therefore, when people want to get involved with the Postgres project, they assume we have a structure similar to other organizations. They look for those in charge, asking, "Who controls limited resources?" and "How do I get into a position of leadership?", and they get confused. Sometimes they focus on the core team — they must be in charge, but they aren't elected, and the list of things they manage doesn't appear to be foundational to the project.

There is a list of committers who control changes to the source code. Are they in charge, at least of the source code? While they initiate the changes, they are subject to acceptance or rejection from anyone on the hackers email list, and they must address any feedback. Sometimes new developers who represent companies ask how they can become committers because they don't understand that committing is a mechanical process and that the hackers list is the proper place to start development discussions.

And there are other groups:

Which groups report to which other groups, and who is in charge? The answer is that the groups are distributed and operate mostly independently because they lack the resource limits that require hierarchical structures. However, they must take suggestions and criticism from almost anyone in the community and respond to them. If a group does a consistently poor job, new members will join the team to help. Almost anyone can join any of the groups, though the last three listed groups limit membership and are more hierarchical because they manage limited resources or handle sensitive information.

Who is ultimately in charge? There is a great temptation to say it is the core team, and technically this is correct because their final listed role is, "Making difficult decisions when consensus is lacking." However, this authority is used so rarely that it is almost an afterthought. Instead of thinking of the core team as the center of a vast web of control, it might be better to think of it as the core of the Earth — yes, it needs to be there for the other layers to rest on top of, but most of the Earth's activity happens near the Earth's crust, and the Earth's core can usually be safely ignored.

So, who is in charge? You are, and so is everyone else in the community. You have the ability to help in almost any area, with few hindrances except the suggestions and corrections of other community members. How do you get anything done? You give your opinion, you listen to the opinions of others, you hopefully adjust your goals based on those opinions, and once no one else complains, you are good to go! Do that for 27 years, like Postgres has, and it is amazing how good the result can be.

Update: This new web page shows many of the details. 2024-07-03

Update: My employer, edb, published a video interview on this topic. 2024-07-08

View or Post Comments

Community Email Etiquette

Friday, September 8, 2023

While a recent Postgres survey indicates that email lists are an impediment to new people getting involved with the project, email continues to be the primary means of communication in the community. Some of the reasons are historical, some are based on email's distributed nature, and some on email's flexibility of expression.

Unfortunately, fewer people are familiar with how to use email efficiently, especially on email lists with thousands of people. While the Postgres community has tried to educate users on the intricacies of email, a few issues continue to challenge them:

One additional complexity is that email composition agents differ, meaning that trimming and inline-quoting text is hard in some email agents, and the display of inline text is sometimes handled differently. While long-time community members usually use email agents where these formatting options are easy, new community members are less likely to have such email agents and less likely to understand the complexities of emailing thousands of people. Effectively, this creates a double barrier for people entering the community via the email lists. Fortunately, people are patient, and email posters eventually understand the intricacies of email, but it isn't easy at the beginning.

View or Post Comments

Business Supermarket Chicken

Wednesday, September 6, 2023

You would think that open source and the inexpensive rotisserie chickens available in many supermarkets would have little in common. Open source exists in the virtual enterprise-dominated world, while supermarket chickens are, well, very domestic and practical. However, economic forces tend to operate similarly in all spheres of life.

First, let's look at the precooked supermarket chicken. A Costco executive explained why those chickens are so cheap:

Some of the things you go to Costco for, you don't need once a week. In order to attract customers more often, they have chicken or hot dogs or the low prices on gasoline. They don't put the chickens at the front of the store. You have to walk past the blenders and the big-screen TVs. The guy who stops in to buy a rotisserie chicken for a quick, hot meal might walk past the laptops and say, "Maybe I'll go to the Costco to buy a laptop for my son for Christmas."

So, cheap chickens get people into the store, and they buy more later. With open source, its low cost gets people to try it out and adopt it for small projects. As the software proves its worthiness, its use expands and ultimately organizations standardize on it. While open source wasn't designed to use this method to expand its footprint, its zero-cost adoption has rapidly increased its growth. Once people use Postgres, and understand the value of its unique features, they would probably choose Postgres even if it were not cheaper.

The normal perception flow during Postgres adoption is:

It's a funny thing — people need low cost to try out Postgres, but once they experience it, its cost benefit is immaterial.

View or Post Comments

Business PostgreSQL Benefits and Challenges: A Snapshot

Tuesday, September 5, 2023

Ivan Panchenko wrote a great article about why and how businesses should start using Postgres. It covers Postgres's structural benefits like vendor independence, code quality, and technology like NoSQL. It also covers the challenges that could be faced, like cost, time, and in-house expertise. When I saw "Not controlled by a single vendor" in both the benefits and challenges sections, I knew he was on the right track!

View or Post Comments

Business Community Edition

Friday, September 1, 2023

Related to my previous blog entry about bait and switch, I have heard people struggle to distinguish between binaries based on source code released by the Postgres community and binaries based on modifications of the community source code. The term "community edition" has often been used to represent binaries in the first category.

I think "community edition" has several problems. First, it suggests that there is a non-community edition, i.e., an enterprise edition. While some companies do produce modified enterprise-focused versions of Postgres, the community only produces one version of the source code which is designed for hobbyists, developers, small businesses, and large enterprises.

A second problem is that companies that control the development of open source software often create a "community edition" of their software which is free, and an "enterprise edition" which requires payment and might have more features, fewer license limitations, or fewer usage restrictions. Again, the Postgres community only produces one version.

I wonder if the terms "community Postgres" or "community release" would be superior to "community edition".

View or Post Comments

Business Open Source Bait and Switch: Licensing and Beyond

Wednesday, August 30, 2023

Having been involved in open source for thirty years, I have seen many open source distribution methods, packaging systems, licensing options, and business models. In the early days there were only two classes of licenses — public domain-style licenses like BSD and MIT, and the GNU licenses. Though the GNU licenses had confusion around applications linking to static GNU libraries, the general goals of the two licenses were well understood. The Postgres license is similar to public domain-style licenses, and is accepted as an OSI-certified license.

As open source took over the enterprise IT infrastructure, companies formed around open source, and diverse licenses started to proliferate. Rather than align with the two previously popular license styles, these companies created new licenses that were closer to proprietary licenses to gain economic advantage. While the new licenses made the source visible, they often limited how binaries built from the source could be used and how modifications could be produced, and over time the licenses changed to become even more restrictive. Users often overlooked the negative effects of company control of the software development process until it was too late. Companies even used GPL licenses for revenue-generating purposes in ways that were not obvious to people who initially adopted the GPL software.

In summary, users often equated open source with "not proprietary," and hence always good. It has taken years for organizations to realize that not all open source licenses are the same, and that companies can distort the open source process to their own advantage, to the detriment of the user community. This article (comments) by Peter Zaitsev goes into the gory details of how open source is used to gain economic advantage. Its theme is "bait and switch" — that users choose open source, no matter what open source license is used or who controls development, and often don't realize the downsides until they are hit with the negative consequences.

Having worked with Postgres for 27 years, I have seen this bait-and-switch many times, as users overlooked Postgres in favor of more polished open source software backed by companies designed to exploit the attraction of open source. As open source-based companies grew, and then declined once their exploitation became widely known, Postgres continued its steady progress. Twenty-seven years later, that steady progress has paid off with an exceptionally capable feature set and a sterling reputation.

Update: Good video about this. 2024-04-30

View or Post Comments

Presentation The Postgres Trajectory

Friday, June 30, 2023

I am not known for light, fluffy talks, but rather for "down in the weeds" talks where I can go through 80 slides in 50 minutes. Maybe not something to be proud of, but it is what excites me.

So, when the PostgreSQL Singapore Community Meetup asked me to give a 20-minute talk about Postgres trends, I was challenged, and was reminded of something Robert Haas said recently about short talks often being harder than longer ones. It required me to come up with the foundational reasons Postgres is so popular, and I chose two, open source and extendibility — both have clear historic and current data on why they are important to Postgres's success.

All I then needed to do was to create a slide deck, and unsurprisingly, I already had a lot of material on these topics, though not distilled to their essence as required. I took the highlights of my material, mostly diagrams, and created a short 14-slide presentation called The Postgres Trajectory. I presented the talk virtually for Singapore and hope to use it for other events where a short, big-picture talk is required.

View or Post Comments

Release notes Postgres 16 Release Notes

Friday, May 19, 2023

I have just completed the first draft of the Postgres 16 release notes. It includes developer community feedback but still needs more xml markup and links.

The release note feature count is 199, which is 7% higher than the average count of the previous five releases. Postgres 16 Beta 1 should be released soon. The final Postgres 16 release is planned for September/October of this year.

View or Post Comments

Presentation Dissecting Partitioning

Thursday, January 12, 2023

Having delivered my new talk Dissecting Partitioning at PGDay/MED, I have now added this 109-slide presentation to my website. The talk first covers the purpose and limitations of Postgres's declarative partitioning implementation. The bulk of the talk uses SQL queries to show the many optimizations possible with partitioning. It concludes by showing some complicated data architectures made possible by partitioning.
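
For readers who have not used declarative partitioning, here is a minimal sketch of the kind of setup the talk examines; the table names and ranges are hypothetical, loosely following the documentation's measurement example:

-- A hypothetical range-partitioned table; queries that filter on the
-- partition key let the planner skip non-matching partitions.
CREATE TABLE measurement (
        city_id   INTEGER NOT NULL,
        logdate   DATE NOT NULL,
        peaktemp  INTEGER
) PARTITION BY RANGE (logdate);

CREATE TABLE measurement_2023 PARTITION OF measurement
        FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');

CREATE TABLE measurement_2024 PARTITION OF measurement
        FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

-- The planner can prune away measurement_2024 for this query.
SELECT * FROM measurement WHERE logdate = '2023-06-01';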

View or Post Comments

Presentation Beyond Joins and Indexes

Thursday, January 12, 2023

My presentation Explaining the Postgres Query Optimizer covers the details of query optimization, optimizer statistics, joins, and indexes. I have delivered it 20 times since 2011.

While that talk covers common optimizer issues, in recent years I began to feel it would be helpful to produce a comprehensive talk about the other plans the optimizer can choose. Researching all the details proved to be a challenge, and the result is Beyond Joins and Indexes, which covers the 42 other plans that were not covered in my previous optimizer talk. I delivered this talk at the DC PostgreSQL Users Group on Monday, so I have now added the slides to my website.

Update: I incorrectly stated in my two previous live presentations (Los Angeles, Russia) that memoize did not cache negative matches. I have corrected the slides on this point. 2023-04-07
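
As a hypothetical illustration of that correction, consider a join where the inner side is probed repeatedly with duplicate keys; the planner may place a Memoize node over the inner index scan, and lookups that find no matching row (negative matches) are cached as well, so repeated probes for absent keys stay cheap. The table names below are made up:

-- Hypothetical tables: many orders reference a small customer table,
-- and some customer_id values have no matching customer row.
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER, customer_id INTEGER);

-- With duplicate (and some unmatched) customer_id values on the outer side,
-- the planner may choose a nested loop with a Memoize node over the inner
-- index scan; both hits and misses are cached.
EXPLAIN SELECT *
FROM orders o
JOIN customer c ON c.id = o.customer_id;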

View or Post Comments