Bruce Momjian

Postgres Blog


This blog is about my work on the Postgres open source database, and is published on Planet PostgreSQL. PgLife allows monitoring of all Postgres community activity.




Performance Index Order Does Matter

Friday, February 24, 2017

Postgres has supported multi-column indexes since 1997, e.g. create index i_test on test (a, b, c). It can easily use an index if the supplied columns are all at the front of the index, e.g. a and b in the previous index, but it can also use the index if some of the indexed column values are not supplied, e.g. columns a and c in the previous index. It does this by looking up a in the index, then looking through the index for matches of c, ignoring values of b, e.g.

CREATE TABLE test (a INT, b INT, c INT);
INSERT INTO test
        SELECT x, x, x FROM generate_series(1, 100000) AS f(x);
CREATE INDEX i_test ON test(a, b, c);
ANALYZE test;
 
EXPLAIN (COSTS false)
SELECT * FROM test WHERE a = 1 AND c = 1;
              QUERY PLAN
--------------------------------------
 Index Only Scan using i_test on test
   Index Cond: ((a = 1) AND (c = 1))

Of course, if possible, you should always put the most commonly-supplied columns first in the index because skipping columns during index scans (called "index skip scans") is expensive.

However, if you are supplying all the column values referenced in the index, I assumed it didn't matter what order the columns were specified in the index, but Robert Haas recently mentioned this is not always true. For example, if a has many duplicate values, and b has mostly unique values, having a at the start of the index is suboptimal — better to use b first, which will more effectively narrow the search space when looking for matches on a.
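One way to check this on your own data is to create both column orders and compare the plans. A minimal sketch, using a hypothetical dup_test table where a is heavily duplicated and b is nearly unique:

CREATE TABLE dup_test (a INT, b INT);
INSERT INTO dup_test
        SELECT x % 10, x FROM generate_series(1, 100000) AS f(x);
CREATE INDEX i_dup_ab ON dup_test (a, b);
CREATE INDEX i_dup_ba ON dup_test (b, a);
ANALYZE dup_test;
 
EXPLAIN (ANALYZE true, BUFFERS true)
SELECT * FROM dup_test WHERE a = 1 AND b = 1;

Dropping one index at a time and rerunning the explain shows how much work each column order does for the same query.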

A more specific example of this is range queries. In that case, if the range test of one column is less restrictive than the equality test of another, it would be better for the equality test column to be first in an index. For example, in the queries below, the first query uses an index because it is very restrictive on the first column, the second query uses an index because it is moderately restrictive on the first column, while the third does not use an index because it is effectively unrestrictive on the first column:

EXPLAIN SELECT * FROM test WHERE a = 1 AND b >= 1 AND b <= 100000;
                               QUERY PLAN
-------------------------------------------------------------------------
 Index Only Scan using i_test on test  (cost=0.42..8.44 rows=1 width=12)
   Index Cond: ((a = 1) AND (b >= 1) AND (b <= 100000))
 
EXPLAIN SELECT * FROM test WHERE a >= 1 AND a <= 50000 AND b = 1;
                                 QUERY PLAN
----------------------------------------------------------------------------
 Index Only Scan using i_test on test  (cost=0.42..1404.10 rows=1 width=12)
   Index Cond: ((a >= 1) AND (a <= 50000) AND (b = 1))
 
EXPLAIN SELECT * FROM test WHERE a >= 1 AND a <= 100000 AND b = 1;
                       QUERY PLAN
--------------------------------------------------------
 Seq Scan on test  (cost=0.00..2291.00 rows=1 width=12)
   Filter: ((a >= 1) AND (a <= 100000) AND (b = 1))

Notice the increasing costs, even though all queries match one indexed row.

Obviously, in cases where you are not specifying all indexed columns in every query, you should put the most frequently referenced columns first in the index to avoid the overhead of index skip scans. However, for cases where most indexed columns are going to be supplied in queries, placing the most restrictive columns first in indexes is a win.

 


Performance Expression Index Statistics and Joins

Wednesday, February 22, 2017

In my previous blog post, I showed how statistics generated on expression indexes can be used to produce more accurate row counts, and potentially better plans. While I did show more accurate row counts via explain, I did not show changed query plans. I plan to do so in this blog post. First, the setup:

CREATE TABLE test1 AS
        SELECT * FROM generate_series(1, 100000) AS f(x);
CREATE TABLE test2 AS
        SELECT * FROM generate_series(1, 2) AS f(x);
 
ANALYZE test1;
ANALYZE test2;

then a join query using modulus with no expression index:

EXPLAIN SELECT test1.x
FROM test1, test2
WHERE test1.x % 2 = 1 AND test1.x = test2.x;
                           QUERY PLAN
-----------------------------------------------------------------
 Nested Loop  (cost=0.00..1959.02 rows=1 width=4)
   Join Filter: (test1.x = test2.x)
   ->  Seq Scan on test1  (cost=0.00..1943.00 rows=500 width=4)
         Filter: ((x % 2) = 1)
   ->  Materialize  (cost=0.00..1.03 rows=2 width=4)
         ->  Seq Scan on test2  (cost=0.00..1.02 rows=2 width=4)

A nested loop join is used, which is suboptimal because the row count for test1 is one hundred times too small. With proper statistics on the modulus operation on test1.x, a more efficient hash join is used:

CREATE INDEX i_test1 ON test1((x % 2));
ANALYZE test1;
ANALYZE test2;
 
EXPLAIN SELECT test1.x
FROM test1, test2
WHERE test1.x % 2 = 1 AND test1.x = test2.x;
                            QUERY PLAN
------------------------------------------------------------------
 Hash Join  (cost=1.04..2132.29 rows=1 width=4)
   Hash Cond: (test1.x = test2.x)
   ->  Seq Scan on test1  (cost=0.00..1943.00 rows=50197 width=4)
         Filter: ((x % 2) = 1)
   ->  Hash  (cost=1.02..1.02 rows=2 width=4)
         ->  Seq Scan on test2  (cost=0.00..1.02 rows=2 width=4)

Notice the test1 row count is now much more accurate, and that analyzing the base table also analyzes the expression index. The total cost is now slightly higher (2132.29 vs. 1959.02), but not because the hash join is more expensive. Rather, the nested loop plan only looked cheaper because the planner, not knowing the selectivity of the modulus operation, underestimated how many rows it would have to process.

One thing I learned in researching this blog post is how much the optimizer "loves" hash joins. If test2 has three or more rows, or if test1 has ten times more rows and parallelism is enabled, a hash join is used even without expression index statistics. Hash joins are very robust to misestimation, so they are favored by the optimizer. The takeaway is that the creation of expression indexes for statistical purposes is recommended only if testing shows they actually improve query plans, i.e. improving explain row counts alone has little benefit.


Performance Statistics on Expression Indexes

Monday, February 20, 2017

Most people know that Postgres allows the creation of indexes on expressions. This is helpful if you need index lookups of expressions used in where clauses.

However, there is another benefit to expression indexes, and that is optimizer statistics. Not only do expression indexes allow rapid lookups of matching expressions, but they also provide optimizer statistics, which improve row estimates and hence query plans. Here is an example:

CREATE TABLE test (x INTEGER);
INSERT INTO test SELECT x FROM generate_series(1, 100) AS t(x);
 
ANALYZE test;
 
SELECT COUNT(*) FROM test WHERE x % 2 = 1;
 count
-------
    50
 
EXPLAIN SELECT * FROM test WHERE x % 2 = 1;
                     QUERY PLAN
----------------------------------------------------
 Seq Scan on test  (cost=0.00..2.50 rows=1 width=4)
   Filter: ((x % 2) = 1)

The optimizer doesn't know the selectivity of the modulus operator, so it initially assumes only one row is returned. Once an expression index is created and analyze statistics generated, the optimizer knows exactly how many rows will be returned:

CREATE INDEX i_test ON test ((x % 2));
 
ANALYZE test;
 
EXPLAIN SELECT * FROM test WHERE x % 2 = 1;
                     QUERY PLAN
-----------------------------------------------------
 Seq Scan on test  (cost=0.00..2.50 rows=50 width=4)
   Filter: ((x % 2) = 1)

Interestingly, the optimizer used expression index statistics, even though the expression index itself was not used. In the example above, the modulus operator is not selective enough to make the index useful, but expression statistics would be useful for more complex queries, e.g. with joins. This method can also be used to create statistics on functions.
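For example, here is a sketch using a hypothetical customer table: an expression index on lower(name) lets analyze gather statistics on the lowercased values, improving row estimates for case-insensitive lookups even if the index itself is never chosen:

CREATE INDEX i_customer_lower_name ON customer ((lower(name)));
ANALYZE customer;
 
EXPLAIN SELECT * FROM customer WHERE lower(name) = 'smith';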

It is also possible to create an expression index that generates cross-column statistics, the benefits of which I mentioned in an earlier blog post. For example, this expression index would supply accurate statistics for state/city combinations, but queries would need to use the exact concatenation construction:

CREATE INDEX i_customer_state_city ON customer ((state || '|' || city));
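For example, a query written with the same concatenation expression (the constant is illustrative) would benefit from the improved estimate:

EXPLAIN SELECT * FROM customer
WHERE state || '|' || city = 'NY|New York City';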

It would be nice if there were a way to create expression statistics without the overhead of creating and maintaining indexes.


Performance Still No Query Hints?

Friday, February 17, 2017

So, it is 2017 and Postgres still doesn't support query hints like other relational databases? Yep, it's true, and probably will be forever since "'Oracle-style' optimizer hints" is listed in the "Features We Do Not Want" section of the Postgres todo list. A wiki page outlines the reasons for this.

While this seems perfectly logical to people who have used Postgres for years, it strikes new users as rigid and extreme. There are several reasons for this divergence.

First, what new users don't realize is that there are ways to control the optimizer, just not with inline query hints. The most useful ways are changing planner constants and other planner options, which allow you to tailor the optimizer to your specific hardware and query types. This is a more holistic approach to optimizer tuning because, if set properly, these settings improve all queries, compared to "use this index"-style hints, which must be added to every query.
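For example, here is a sketch of that kind of cluster-wide tuning (the values are illustrative, not recommendations):

ALTER SYSTEM SET random_page_cost = 1.1;        -- e.g. for SSD storage
ALTER SYSTEM SET effective_cache_size = '8GB';  -- reflect the available kernel cache
SELECT pg_reload_conf();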

There are also several crude enable settings that can be changed at the session level to prevent the optimizer from considering certain executor capabilities, like join and scan types. Unfortunately, these affect the entire query, i.e. you can't turn off sequential scans for just one table, nor can you specify which index to use. These settings are really designed for debugging.
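For example, a session-level sketch (the query is illustrative):

SET enable_seqscan = false;   -- discourage sequential scans for this session
SET enable_nestloop = false;  -- discourage nested loop joins
EXPLAIN SELECT * FROM test WHERE a = 1;
RESET enable_seqscan;
RESET enable_nestloop;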

In the same vein, an area that often causes suboptimal plans is queries that reference several columns from the same table whose values are correlated, e.g. a row with a US state value of 'NY' is more likely to have a city value of 'New York City' than a row with a state value of 'AZ' is. While inline query hints might be able to fix this case for constants, they can't help when variables are passed into the query, as with parameterized queries. A patch being considered for Postgres 10, multivariate statistics, would benefit all queries by collecting statistics on column combinations in addition to the single-column statistics already collected. This, again, is an example of a holistic approach to improving the optimizer.

There are two additional ways to control single queries that are often used in production. Common table expressions allow a query to be broken up into stages which are executed sequentially because each with clause acts as an optimization fence. Secondly, the use of offset 0 prevents subqueries from being moved into the outer query.
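Here is a sketch of both techniques (table and column names are illustrative); as of this writing, each with clause and each offset 0 subquery is planned separately, so the outer filter is not pushed down into it:

WITH only_a AS (
        SELECT * FROM big_table WHERE a = 1
)
SELECT * FROM only_a WHERE b = 1;
 
SELECT *
FROM (SELECT * FROM big_table WHERE a = 1 OFFSET 0) AS sub
WHERE b = 1;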

I expect future improvements to address other optimizer shortcomings. One area where inline query hints would really help is fixing queries in an emergency, when you don't have time to research the cause. Postgres currently doesn't have an easy solution for that.


Performance Going Deep on Stats

Wednesday, February 15, 2017

Postgres includes lots of monitoring tools that allow you to see what is happening, but there are a few settings that really go into detail, e.g. log_statement_stats:

SET client_min_messages = log;
SET log_statement_stats = true;
 
SELECT 100;
LOG:  QUERY STATISTICS
DETAIL:  ! system usage stats:
1 !       0.000000 s user, 0.000000 s system, 0.000201 s elapsed
2 !       [0.000000 s user, 0.000000 s system total]
3 !       0/0 [0/0] filesystem blocks in/out
4 !       0/14 [0/363] page faults/reclaims, 0 [0] swaps
5 !       0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
6 !       0/0 [3/0] voluntary/involuntary context switches
 ?column?
----------
      100

What do all those numbers mean? Well, the Linux getrusage() manual page has a clue — these are kernel resource usage statistics. The first two lines show the user and kernel ("system") cpu time used, as well as elapsed time. (The numbers in brackets on line #2 and following are totals for this process.) Line #3 shows I/O from/to the storage device (not from the kernel cache). Line #4 covers memory pages faulted into the process address space and reclaimed. Line #5 shows signal and ipc message activity. Line #6 shows process context switches.

Let's look at some more interesting queries by populating a new table:

CREATE TABLE test(x INTEGER);
INSERT INTO test SELECT x FROM generate_series(1, 1000) t(x);

By clearing the kernel buffers with echo 3 > /proc/sys/vm/drop_caches and restarting the server, we can see reads from the storage device (via log_statement_stats) and reads from the kernel cache (via explain (buffers true)):

EXPLAIN (ANALYZE true, BUFFERS true, TIMING false, COSTS false)
SELECT COUNT(*) FROM test;
LOG:  QUERY STATISTICS
DETAIL:  ! system usage stats:
!       0.000000 s user, 0.000000 s system, 0.019259 s elapsed
!       [0.008000 s user, 0.000000 s system total]
!       1712/0 [5840/280] filesystem blocks in/out
!       7/94 [21/586] page faults/reclaims, 0 [0] swaps
!       0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!       29/1 [73/3] voluntary/involuntary context switches
                    QUERY PLAN
---------------------------------------------------
 Aggregate (actual rows=1 loops=1)
   Buffers: shared read=5
   ->  Seq Scan on test (actual rows=1000 loops=1)
         Buffers: shared read=5
 Planning time: 3.673 ms
 Execution time: 2.221 ms
 
EXPLAIN (ANALYZE true, BUFFERS true, TIMING false, COSTS false)
SELECT COUNT(*) FROM test;
LOG:  QUERY STATISTICS
DETAIL:  ! system usage stats:
!       0.000000 s user, 0.000000 s system, 0.000408 s elapsed
!       [0.008000 s user, 0.000000 s system total]
!       0/0 [5840/288] filesystem blocks in/out
!       0/3 [21/591] page faults/reclaims, 0 [0] swaps
!       0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!       0/0 [74/3] voluntary/involuntary context switches
                    QUERY PLAN
---------------------------------------------------
 Aggregate (actual rows=1 loops=1)
   Buffers: shared hit=5
   ->  Seq Scan on test (actual rows=1000 loops=1)
         Buffers: shared hit=5
 Planning time: 0.027 ms
 Execution time: 0.250 ms

Notice the first count reads 1712 512-byte blocks from storage and 5 8kB blocks from the kernel cache into the shared buffer cache. (The unrecorded reads are probably system table information.) The second query shows no I/O from the storage device, and a hit on 5 8kB shared buffer pages.

As you might know, Postgres executes queries in stages. Fortunately, you can get rusage statistics for each stage with log_parser_stats, log_planner_stats, and log_executor_stats. (explain (analyze true) also shows planning and execution durations, which you can see above.)
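For example, each stage then logs its own statistics block (note that log_statement_stats cannot be enabled at the same time as the per-stage settings, so it is turned off first):

SET log_statement_stats = false;
SET log_parser_stats = true;
SET log_planner_stats = true;
SET log_executor_stats = true;
 
SELECT 100;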

As you can see, you can access a wealth of information about Postgres and how it is using the cpus, storage, and caches. These have a large impact on how quickly queries execute.


Performance Performance Goalposts

Monday, February 13, 2017

I often get the question, "How do I get good performance out of Postgres?," and sometimes I am kind of miffed because I get the feeling that they expect a short answer. Those who are familiar with Postgres know there is no short answer, and it really isn't Postgres's fault because there isn't a simple answer to that question for any relational database.

I guess it comes down to the complexity of relational systems. There are the sql parser, the optimizer, and background workers. No matter how much auto-tuning we do, there are still going to be things that administrators must adjust for optimal performance. And then there is the hardware, which is stressed by all relational systems. Proper tuning of Postgres must happen at all three levels:

  • Hardware
  • Database parameter configuration
  • Sql (both data modeling and queries)

Unless all three are tuned, you can't say the system is optimized. I have already covered hardware tuning in my Database Hardware Selection Guidelines presentation. Database parameter configuration is a tough one because every site has different requirements, though there are a few common tuning cases. Tuning at the sql level involves proper schema and index setup, plus monitoring with something like log_min_duration_statement, pg_stat_statements, or pgBadger. Once you identify a problem sql query, you often have to do detective work with explain.
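For example, a minimal monitoring sketch might look like this (thresholds are illustrative; pg_stat_statements column names are as of this writing):

# postgresql.conf
shared_preload_libraries = 'pg_stat_statements'   # requires a server restart
log_min_duration_statement = 250ms                # log statements slower than 250 milliseconds
 
CREATE EXTENSION pg_stat_statements;
 
SELECT query, calls, total_time
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 5;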

None of this is easy, but hey, if it were easy everyone could do it. The good news is that, when done well, database performance tuning can yield big dividends.


NoSQL Postgres Gives Developers More Options with NoSQL

Saturday, February 11, 2017

I just co-authored a blog for ibm's Linux on Power developer site. While there is similar content in my YeSQL slides, the ibm blog more concisely describes the four options available to application developers, from pure relational to pure NoSQL, and the options in between possible with Postgres.


Object bundles Creating Extensions in Schemas

Thursday, February 2, 2017

Postgres extensions are a powerful way to add functionality to Postgres, and dozens of extensions are available. However, it can be difficult to manage them. By default, extensions are loaded into the first existing schema in search_path (usually the public schema), e.g.:

CREATE EXTENSION pgcrypto;
\df
                                        List of functions
 Schema |         Name          | Result data type |        Argument data types         |  Type
--------+-----------------------+------------------+------------------------------------+--------
 public | armor                 | text             | bytea                              | normal
 public | armor                 | text             | bytea, text[], text[]              | normal
…

This loads 36 functions into the public schema. This can be difficult to manage, particularly if multiple extensions are loaded into the same database.

One solution is to place each extension into its own schema. This can be accomplished by setting search_path before loading the extension, or using the with schema clause in the create extension command:

CREATE SCHEMA pgcrypto;
 
CREATE EXTENSION pgcrypto WITH SCHEMA pgcrypto;
 
\df pgcrypto.*
                                         List of functions
  Schema  |         Name          | Result data type |        Argument data types         |  Type
----------+-----------------------+------------------+------------------------------------+--------
 pgcrypto | armor                 | text             | bytea                              | normal
 pgcrypto | armor                 | text             | bytea, text[], text[]              | normal
…

Users can then set their search_path whenever they want to use the extension, or prefix the function call with the schema name, e.g. select pgcrypto.armor('abc').
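For example, a session can do either of these:

SET search_path = pgcrypto, public;
SELECT armor('abc');
 
RESET search_path;
SELECT pgcrypto.armor('abc');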


Administration Beware of Antivirus Software with Postgres

Monday, January 30, 2017

Antivirus software helps to protect your computer from external attack. Unfortunately, it can cause problems with Postgres. Antivirus software constantly scans for files and behavior that suggests attacks, but Postgres can easily get caught in that net.

Postgres regularly creates heap, index, and wal files, and these files are binary. Depending on what data is written, these files can contain patterns that are mistaken for viruses, causing the files to be removed from Postgres-controlled directories and quarantined. Obviously, this causes bad things to happen to Postgres. Postgres sites using antivirus software should disable antivirus scans of directories controlled by Postgres.


News Updated Blog Categories, Images

Saturday, January 28, 2017

When the number of my blog entries surpassed 400, I realized that the categories I was using to group entries for easier reference were too coarse. Therefore, I have added new blog categories to make future reference easier. Also, I have dozens of new blog entries ready to post in the coming months.

Related to restructuring, I have updated many of the stock images in my presentations.


Performance Two More Interviews

Wednesday, January 25, 2017

I did two hour-long audio interviews recently with the group Three Devs and a Maybe. The first covers performance and the second covers database concurrency.


Thoughts Moscow Interview

Monday, January 23, 2017

I did a 12-minute interview in Moscow in November and a recording is now available online. The first four minutes cover Russian-specific aspects of Postgres, and the remainder discusses Postgres's general direction, challenges, and its place in the database ecosystem.


Security Using Ssl Certificates

Tuesday, January 17, 2017

Having covered ssl certificate creation and the use of certificate authorities (ca), I would like to put it all together and show how certificates and certificate authorities work to ensure trusted Postgres communication.

I have created a diagram showing server, client, and certificate authority certificates. None of these certificates is secret, e.g. the server sends its ssl certificate to the client, and vice versa. In the diagram, the server and client use the same certificate authority certificate. (Intermediate certificate authorities could also be used.)

When the client connects, the server sends its certificate to the client. The client uses the public key in its certificate authority certificate to verify that the server certificate was signed by its trusted certificate authority (the red line). It then uses the public key in the server certificate to encrypt a secret key that is sent to the server. Only a server with the matching private key can reply to generate a session key. It is not the possession of the server certificate that proves the server's identity but the possession of the private key that matches the public key stored in the server's certificate. The same is true for client certificates used for client host and user authentication (the blue line).
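The same signature check can be performed manually with openssl, assuming the certificate authority and server certificates are stored in root.crt and server.crt (file names are illustrative):

$ openssl verify -CAfile root.crt server.crt
server.crt: OK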

Hopefully this diagram helped you see how the same certificate authority certificate on the server and client allows for identity verification. Interestingly, identity verification is not required for two systems to communicate in an eavesdrop-proof manner, but if you can't be sure who you are communicating with, it isn't very useful.


Security Creating Ssl Certificates

Thursday, January 12, 2017

Having covered the choice of certificate authorities, I want to explain the internals of creating server certificates in Postgres. The instructions are already in the Postgres documentation.

When using these instructions for creating a certificate signing request (csr), two files are created:

  • certificate signing request file with extension req
  • key file, containing public and private server keys, with extension pem

(It is also possible to use an existing key file.) You can view the contents of the csr using openssl, e.g.:

$ openssl req -in server.req -text
Certificate Request:
    Data:
        Version: 0 (0x0)
        Subject: … CN=momjian.us …
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (2048 bit)
                Modulus:
                    00:b2:ea:53:a6:5e:0a:bd:dd:59:08:52:0d:bb:77:
                    …
                Exponent: 65537 (0x10001)
        Attributes:
            a0:00
    Signature Algorithm: sha256WithRSAEncryption
…
-----BEGIN CERTIFICATE REQUEST-----
…
-----END CERTIFICATE REQUEST-----

The first argument to the openssl command-line tool is always a subcommand, in this case req. (The openssl manual pages are split based on the subcommand, so man req shows openssl arguments for the subcommand req.)

The common name (cn) shown above is potentially checked by the client. The encoded section ("certificate request") can be suppressed with -noout. The file server.req already contains this information in text format because -text was specified during its creation.

The server public and private keys stored in privkey.pem can also be viewed:

$ openssl pkey -in privkey.pem -text
-----BEGIN PRIVATE KEY-----
MIIEvwIBADANBgkqhkiG9w0BAQEFAASCBKkwggSlAgEAAoIBAQCy6lOmXgq93VkI
…
-----END PRIVATE KEY-----
Private-Key: (2048 bit)
modulus:
    00:b2:ea:53:a6:5e:0a:bd:dd:59:08:52:0d:bb:77:
    …
publicExponent: 65537 (0x10001)
    …
privateExponent:
    …
prime1:
    …
prime2:
    …
exponent1:
    …
exponent2:
    …
coefficient:
    …

All of this can be very complex so I have created a diagram which illustrates what is happening. At the top-left is the server key generated by openssl req. This command also creates the certificate signing request (the csr, top-right) which contains:

  • Certificate signing information in X.509 format, e.g. common name (cn)
  • Public server key to be used by clients wishing to authenticate the server's identity
  • The above signed by the server's private key to prove that the server owner supplied this information

The certificate signing request (csr) can't be used as a certificate because it is unsigned. For self-signed certificates, the server's private key is used to sign the csr. (These are the instructions given in the Postgres documentation.) A more powerful option, as I mentioned in my previous blog post, is to use a local certificate authority. In my diagram, a certificate authority (bottom-left) uses its private key to sign the csr and create a certificate (bottom-right).
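As an illustrative sketch of both options (file names and validity period are arbitrary), the first command self-signs the csr with the server's own private key, while the second signs it with a local certificate authority's certificate and key:

$ openssl x509 -req -in server.req -signkey privkey.pem -days 365 -out server.crt
 
$ openssl x509 -req -in server.req -CA root.crt -CAkey root.key \
        -CAcreateserial -days 365 -out server.crt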

Once the server has a certificate, any client who has the signer of the server's certificate in their ~/.postgresql/root.crt file can authenticate the server certificate by using the sslmode connection parameter verify-ca. They can also verify that the certificate was created for the specific server's host name by checking the common name (cn) using verify-full. (Clients can record revoked certificates in ~/.postgresql/root.crl.)
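For example, a client connection that fully verifies the server might look like this (the host name is illustrative; root.crt is read from its default location):

$ psql "host=db.example.com dbname=postgres sslmode=verify-full"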

Hopefully you can make better use of ssl server certificates now that you understand how they are created, inspected, and verified.


Security Ssl Certificates and Certificate Authorities

Monday, January 9, 2017

When setting up ssl in Postgres, you can't just enable ssl. You must also install a signed certificate on the server.

The first step is to create a certificate signing request (csr) file that contains the host name of the database server. Once created, there are three ways to sign a csr to create a certificate:

  • Self-signed: use the key created by the certificate signing request to sign its own csr and create a certificate.
  • Local certificate authority: sign the csr file with the local certificate authority's key.
  • Public certificate authority: have a public certificate authority sign the csr.

The use of public certificate authorities doesn't make sense for most databases because it allows third parties to create trusted certificates. Their only reasonable use is when you want certificates independently issued by public certificate authorities to be trusted. This is necessary for browsers because they often connect to unaffiliated websites where trust must be established by a third party. (Browsers include a list of public certificate authorities whose website certificates they trust.)

Issuing certificates to clients as well enables additional features:

  • The server can require clients to present a valid certificate before connecting.
  • The client certificate's common name (cn) can be used to authenticate the database user.

These features require that the server and client certificates be signed by the same certificate authority.

Ssl setup in Postgres can be complicated, but it offers a unique set of security and usability features that are unmatched.


Administration Use Kill -9 Only in Emergencies

Friday, January 6, 2017

During normal server shutdown, sessions are disconnected, dirty shared buffers and pending write-ahead log (wal) records are flushed to durable storage, and a clean shutdown record is written to pg_control. During the next server start, pg_control is checked, and if the previous shutdown was clean, startup can ignore the wal and start immediately.
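You can inspect that state yourself with pg_controldata (output abbreviated):

$ pg_controldata $PGDATA | grep 'cluster state'
Database cluster state:               shut down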

Unfortunately, a clean shutdown can take some time, and impatient database administrators might get into the habit of using kill -9 or pg_ctl stop -m immediate to quicken the shutdown. While this does have the intended effect, and you will not lose any committed transactions, it greatly slows down the next database startup because all wal generated since the last completed checkpoint must be replayed. You can identify an unclean shutdown by looking at the server logs for these two ominous lines:

LOG:  database system was interrupted; last known up at 2016-10-25 12:17:28 EDT
LOG:  database system was not properly shut down; automatic recovery in progress

These crash database shutdowns can also happen if the computer crashes, the operating system crashes, or if a proper database shutdown script isn't triggered on computer shutdown.

So, in summary, administrators should avoid Postgres crash shutdowns unless shutdown time is more precious than startup time. Every time those log lines appear, Postgres is doing more work than would have been necessary if the previous shutdown had been clean.


Administration Controlling Autovacuum

Tuesday, January 3, 2017

Unlike other database systems, Postgres makes the cleanup process visible and tunable to users. Autovacuum performs recycling of old rows and updates optimizer statistics. It appears in ps command output, the pg_stat_activity system view, and optionally in the server logs via log_autovacuum_min_duration.
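For example, active autovacuum workers can be listed with a query like this:

SELECT pid, query
FROM pg_stat_activity
WHERE query LIKE 'autovacuum:%';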

Postgres also allows fine-grained control over the autovacuum cleanup process. Occasionally users find that cleanup is slowing the system down, and rather than modifying the behavior of autovacuum, they decide to turn it off via the autovacuum setting.

However, turning off autovacuum can cause problems. Initially the system will run faster since there is no cleanup overhead, but after a while old rows will clog up user tables and indexes, leading to increasing slowness. Once that happens, you can turn on autovacuum again, and it will recycle the old rows so their space can be reused, but the accumulated bloat in tables and indexes will not shrink quickly, or perhaps ever.

Turning off autovacuum is kind of like someone trying to save time by not changing the engine oil in their car. They are correct that for a while their car will be in the shop less often, but ultimately the car will be in the shop for a long time, or will stop working. Therefore, don't turn off autovacuum.

Another problem is that while autovacuum is turned off, optimizer statistics are not updated, perhaps causing slower execution plans. Fortunately, restarting autovacuum does fully fix that problem.

Now that we know that turning off autovacuum is a bad idea, what are the options if administrators want to reduce the overhead of autovacuum? Well, first, there are many autovacuum tuning parameters that allow autovacuum activity to happen less often, or consume fewer resources while running.

A more aggressive, and perhaps creative, approach is to change when autovacuum runs. Most systems have busy times and idle times. Sometimes this can be determined by the time of the day or the day of the week, e.g. Sundays, 0200-0500. In other cases it can be determined by the system's load average or number of active sql sessions in pg_stat_activity.

No matter how it is determined, wouldn't it make sense for autovacuum to run during these idle periods, and not run during busy periods? Well, it's not quite that simple. You don't really want to force autovacuum to run during idle times if there is no useful activity to perform, nor do you want it to never run during busy times in case there is a massive delete that requires cleanup or a large data change that requires updated optimizer statistics.

What you want is to encourage autovacuum to run during idle periods, and discourage it from running during busy times. Fortunately, that is easy to do, especially with the alter system command added in Postgres 9.4.

Let's assume you have your own method to determine busy and idle times. When an idle time starts, simply execute:

ALTER SYSTEM SET autovacuum_vacuum_scale_factor = 0.1;
ALTER SYSTEM SET autovacuum_analyze_scale_factor = 0.05;
 
SELECT pg_reload_conf();
 pg_reload_conf
----------------
 t

Of course, if you have customized these settings in postgresql.conf, hard-coded values might not be appropriate; instead, you can use this in psql to reduce the current value of autovacuum_vacuum_scale_factor by half:

SELECT current_setting('autovacuum_vacuum_scale_factor')::float8 / 2 AS scale
\gset
ALTER SYSTEM SET autovacuum_vacuum_scale_factor = :scale;

You might also want to reduce autovacuum_freeze_max_age and autovacuum_multixact_freeze_max_age during this period. Then, at the end of the idle period, run this:

ALTER SYSTEM RESET autovacuum_vacuum_scale_factor;
ALTER SYSTEM RESET autovacuum_analyze_scale_factor;
 
SELECT pg_reload_conf();
 pg_reload_conf
----------------
 t

This will set the values back to their defaults. Pretty simple, huh? Yet effective. A more sophisticated approach would be to proportionally decrease these settings based on the load on the system.

Let's walk through how this works. Suppose we don't change any settings during idle times and use the default autovacuum_vacuum_scale_factor of 20%. If a table has 19.5% expired rows at midnight, autovacuum will not run. When the percentage of expired rows reaches 20% at 11am, autovacuum will run and remove the expired rows. Now suppose that autovacuum_vacuum_scale_factor is reduced by half at midnight — autovacuum will run and all tables will have less than 10% expired rows by the time the idle period ends. Now, during the busy time autovacuum will only run if a table has increased from 10% expired rows to 20%.

It would be nice if this proportional behavior could be added to autovacuum but I am not sure how to implement that in a clean way.
