Bruce Momjian

Postgres Blog


This blog is about my work on the Postgres open source database, and is published on Planet PostgreSQL. PgLife allows monitoring of all Postgres community activity.

Conference Visualizing Postgres Feature Growth

Friday, November 22, 2013

As the creator of the initial draft of each major release's notes, I am aware of the many changes that go into every release. What I often cannot see is the pattern of feature growth the Postgres community accomplishes each year.

This keynote presentation from PGConf EU contained a graphic that gave me a new perspective. The presentation by EnterpriseDB's Keith Alsheimer includes valuable research and quotes, but slide 18 is the image that caught my eye. It takes Postgres features and groups them into three areas:

  • Easy to use / deploy
  • High-end enterprise requirements
  • New workloads / platforms

With examples under each section, the diagram clearly shows Postgres development targeting three different segments. With this visual, it is easier for me to see which segment a new feature targets, and when a feature might benefit one segment while harming another.

This presentation will also be given in December at the Boston PostgreSQL Users Group.

View or Post Comments

Community PgPool Improves Reliability

Tuesday, November 19, 2013

Originally released in 2006, pgpool-II is the Swiss Army Knife of Postgres middleware tools, allowing connection pooling, replication, load balancing, and parallel query. With the ascendancy of PgBouncer and streaming replication, its role has changed, but its position as an intermediary between database clients and servers has allowed it to retain a unique and valuable toolset. For example, pgpool's ability to detect and redirect read-only queries is used in many read-balancing solutions, including EnterpriseDB's cloud database offering. It is also uniquely positioned to handle server fail-over by redirecting client requests.

Unfortunately, with all these features, pgpool's complexity has hampered its adoption. Fortunately, a new focus on reliability has greatly increased its usefulness. Six months ago, EnterpriseDB employee and now pgpool committer Muhammad Usama started working on pgpool stability by restructuring the code and importing Postgres's memory manager into pgpool. Lead pgpool developer Tatsuo Ishii is now using the Coverity software scanner to find and fix pgpool bugs. Even VMware's Heikki Linnakangas has joined to help. A new regression test suite has also been developed. Pgpool developers are even considering removing parallel query to simplify the code and help focus development efforts.

These are exciting times for pgpool. The number of fixes in November alone is massive, and a new release with these fixes is already being planned. As middleware, pgpool is in a unique position to provide powerful database features, and it is great to see it growing in that role.

View or Post Comments

News PgLife Is Six Months Old

Monday, November 11, 2013

PgLife turned six months old last week. I have made a few minor improvements since its release, mostly around caching and automatically tracking major releases. Fortunately, everything it displays is automatically generated.

I continue to find it helpful for providing a summary of current Postgres community activity. The hourly user count averages five people.

View or Post Comments

Presentation New Presentations

Tuesday, November 5, 2013

I completed a few new presentations recently. Building Open Source Teams is about the mechanics of motivating and managing open source teams. I presented this in October at Moscow's National Research Nuclear University (MEPhI) and a modified version as a keynote in Prato, Italy.

My new presentation Managing Global Teams is about my experience with companies managing remote workforces, with leadership tips and my open source experience mixed in. I presented it in Maryland in September and will be presenting it in Montreal in February.

A perpetual problem for me is creating new feature presentations for every major Postgres release. These are difficult because I give Postgres features talks only once or twice for every major release, and they have to be totally rewritten every year. In the past I have used other people's presentations, but I recently found a workable solution for creating my own:

  • Create a slide for each new feature listed in the Overview section of the major release notes.
  • For each item, add the explanatory text that is below each item in the release notes' detail section.
  • Use the Postgres wiki's What's New page to get examples of all the new features.
  • Add diagrams as needed.

Using this method I was able to quickly create a 9.3 features presentation that I used in New York City.

And finally, while presenting MVCC Unmasked for the thirteenth time in Moscow, I realized that a key slide (#10) needed additional illustration, so I added slide 11, which shows a timeline of transactions, highlighting their visibility or invisibility. For those who have already seen MVCC Unmasked, this addition might provide a new perspective.

View or Post Comments

Conference Explain Output Generates Enthusiastic Applause

Friday, November 1, 2013

Explain output rarely yields an audience reaction, but today at PostgreSQL Conference Europe it generated enthusiastic applause.

How did it happen? Yesterday, Alexander Korotkov and Oleg Bartunov gave a talk about their upcoming gin index improvements. Someone asked a question, and that got them thinking about a further improvement — they developed a prototype that night. During their talk today, they showed the explain plan with their new feature. The new index type was now faster and more useful than a similar MongoDB index (see slide 62). That's fast work!

All these improvements are planned for Postgres 9.4, thanks to funding from Engine Yard.

View or Post Comments

Conference Five Events

Tuesday, September 3, 2013

I am presenting at five new events in the next two months; they are either new groups, new cities, or new venues. The events are in Maryland, Chicago, Moscow, Italy, and Dublin.

View or Post Comments

Release notes Postgres 9.3 Release Notes Updated

Monday, August 19, 2013

I have updated the Postgres 9.3 release notes to include current commits. I have also added the list of major 9.3 features.

View or Post Comments

Backup and replication What Are Timelines?

Thursday, August 1, 2013

If you have used Continuous Archiving, Point-in-Time Recovery (pitr), or Log-Shipping Replication, you have probably heard of timelines. If you have ever looked in the Write-Ahead Log (wal) subdirectory pg_xlog in your cluster's data directory, you have seen files like 000000010000000000000005; that 00000001 prefix is the timeline.
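
You can also ask a primary server which wal file it is currently writing; the first eight hex digits are the timeline. (A minimal illustration; the file name shown is just an example.)

SELECT pg_xlogfile_name(pg_current_xlog_location());
     pg_xlogfile_name
--------------------------
 000000010000000000000005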

There has been a lot of talk recently about timelines — Heikki Linnakangas gave a timeline presentation (slides) at this May's PGCon. The upcoming Postgres 9.3 release has improvements to allow streaming-only replication standbys to follow timeline switches.

So, what are timelines, and why are they needed? I have always been unclear on their purpose, but after Heikki's talk and the 9.3 improvements, I decided to give it some thought. The first sentence in the timeline section of the Postgres Manual sounds ominous, "The ability to restore the database to a previous point in time creates some complexities that are akin to science-fiction stories about time travel and parallel universes." It then explains several examples.

Basically, timelines are a way of giving a unique number to a sequence of events. For example, if your server is freshly installed via initdb, and you execute ten transactions (1-10), those transactions will be in timeline 1. If you then use Point-in-Time Recovery or do a controlled fail-over to a Log-Shipping Standby, ten new transactions (11-20) will be in timeline 2. If that happens again, new transactions will be in timeline 3 — pretty boring.

However, these simple cases are not the purpose of timelines. Timelines were designed to handle cases where transaction numbers can overlap. For example, suppose there was an unplanned Point-in-Time Recovery or promotion of a Streaming Standby to master, and only the first five transactions exist in the new cluster, i.e., transactions 6–10 were lost. (This could happen if the last wal file was lost, or if synchronous_standby_names was not enabled.) In this case, transaction numbers 6–10 would be executed twice, once in the old cluster (and lost), and again as different commands in the new cluster. This gets even more complicated if you used Point-in-Time Recovery to recover to an earlier time, e.g., after transaction 5 completed. In that case, transactions 6–10 were not technically lost, but discarded, or perhaps still exist in another cluster.

To handle these cases, timelines were created — they mark a new time era. Anytime Point-in-Time Recovery is used, or a Log-Shipping Standby is promoted to master, the cluster starts a new timeline. These timeline numbers allow overlapping transactions to be uniquely identified, e.g., transaction 6 in timeline 1 is different from transaction 6 in timeline 2.

Timelines bring order to these cases. Timelines can be specified during Point-in-Time Recovery, allowing fine-grained recovery specification. They allow wal files from different timelines to be stored in the same wal archive directory because the timeline prefix is unique, e.g., 00000002. In pre-9.3 Postgres releases, if a standby server was promoted to primary, and got a new timeline, other standbys without access to an archive directory had to be recreated to match the timeline of the newly promoted master; Postgres 9.3 will remove this limitation.

Given the flexibility of Postgres recovery, timelines are a necessity, and understanding them helps to clarify the many recovery options available.

View or Post Comments

Business Salesforce Abandoning Postgres Migration?

Thursday, June 27, 2013

Postgres adoption news always generates energetic community discussion. When Salesforce announced last year that they wanted to hire 40–50 people for a huge Postgres project, the community and the press took notice. Last month's announcement of Salesforce hiring Postgres core team member Tom Lane was big news, and Salesforce's sponsorship of a Postgres unconference weeks later helped cement their Postgres community involvement.

However, Tuesday's press release about a new nine-year Salesforce/Oracle partnership has spawned new questions about Salesforce's future with Postgres.

The press release is certainly good news for Oracle, and they needed some good news — Oracle has been having serious financial problems for the past nine quarters. Having Salesforce, one of Oracle's largest customers, exploring a Postgres migration also wasn't helpful, and this partnership might kill that.

The Salesforce motivation for this partnership is much less clear, based on their previous actions outlined above, and has generated serious community discussion. There are several possibilities:

  1. Oracle gave Salesforce a good deal and Salesforce no longer wishes to pursue a Postgres migration.
  2. The Postgres migration was never a serious option but was a way to get an acceptable deal from Oracle.
  3. The partnership is only a short-term plan and Salesforce will continue with their Postgres migration.

The scope, duration, and publicity surrounding the partnership suggest #1 or #2. It is well known that mentioning Postgres during Oracle negotiations can reduce pricing, so this could be a huge example of that (#2). Only #3 has Salesforce continuing with their Postgres migration. Whatever the truth, the community comment, "It sounds like a chess game and Postgres is the board," is perhaps the most accurate.

There is a press conference in a few hours (20:00 gmt, 16:00 America/New_York), and I might listen — it would be interesting to see if Postgres is mentioned.

Update: blog post confirming I have no inside information 2013-06-27

Update2: The press conference made no mention of Postgres, but Salesforce's ceo did say, "There's no better product in the world in the database area than Oracle." (recording) 2013-06-28

Update3: Another blog analyzing the issue 2013-07-01

Update4: New details 2014-01-08

Update5: details 2016-05-12

View or Post Comments

Business How Managers View Postgres

Wednesday, June 26, 2013

Those of us who focus on the Postgres community are well aware of how developers and administrators view Postgres — their enthusiasm helps generate significant demand for Postgres. What community-focused people don't often see is how managers and executives view Postgres, and particularly how they make Postgres deployment decisions.

This video interview of EnterpriseDB's Sean Doherty, recorded at HP Discover, is a frank discussion by non-technical people about the database landscape and Postgres's place in it. It has some good quotes:

  • "The fact that it's inexpensive is one thing, but it still has to work, otherwise, no one cares what it costs."
  • "NoSQL has got an important place to play in the database world, but right now we kind of consider it the 'big shiny red button' syndrome. Companies don't know exactly what to do with it."
  • "When we talk about mission critical, there's two things that are mission critical, really: our health, and our pocketbooks."

The portion after 11 minutes, 30 seconds is mostly about EnterpriseDB and not of general community interest.

View or Post Comments

Conference Upcoming Conferences

Monday, June 24, 2013

My conference schedule has filled up through October — I have Brussels and England in July, and Maryland and Chicago in September. October brings Dublin and perhaps Moscow. One big conference I am missing this year is pgbr in Brazil; hopefully I can attend next year.

View or Post Comments

Source code Done Studying Perl

Saturday, June 22, 2013

As Postgres has moved away from a Unix-dominated environment since 2005, there has been decreasing use of Unix shell scripts and increased reliance on Perl, particularly for building Postgres from source. With increasing Perl usage, I realized I had to learn it. I have finally learned Perl with the help of three great books, two of which are readable online:

These are concept-oriented books, which is perfect for me, rather than example-led books. Learning Perl was also necessary for developing PgLife.

View or Post Comments

Conference PGCon Conference Report

Saturday, May 25, 2013

PGCon certainly had some energizing talks and meetings this week. First, Jonathan Katz gave a tutorial about Postgres data types. Though I missed his talk, I just reviewed his slides, and they certainly give a great tour of our powerful type system.

Second, Oleg Bartunov and Teodor Sigaev gave a great presentation about merging the very popular hstore and json data types into a new hierarchical hash data type, humorously code-named 'jstore'. (Their work is funded by Engine Yard.) This generated a huge amount of discussion, which has continued into today's unconference.

Third, Alvaro Hernandez Tortosa's talk The Billion Tables Project (slides) really stretched Postgres to a point where its high-table-count resilience and limitations became more visible. A significant amount of research was required to complete this project.

The presentations were recorded, so I am hopeful the recordings will be available online soon.

View or Post Comments

Presentation New Presentation Online

Thursday, May 23, 2013

I delivered my presentation "Nulls Make Things Easier?" today at PGCon, so I have placed my slides online. The presentation is based on a series of eleven blog posts about nulls I did a few months ago.

View or Post Comments

Community PGCon Developer Meeting Concluded

Wednesday, May 22, 2013

We just concluded the PGCon Developer Meeting. I had two big items: first, EnterpriseDB has dedicated staff to start work on parallelizing Postgres queries, particularly in-memory sorts. I have previously expressed the importance (and complexity) of parallelism. Second, EnterpriseDB has also dedicated staff to help improve Pgpool-II. Pgpool is the Swiss Army Knife of replication tools, and I am hopeful that additional development work will further increase its popularity.

The Developer Meeting notes (summary) have lots of additional information about the big things coming from everyone next year.

View or Post Comments

Press Video Interview

Monday, May 20, 2013

I did a video interview while I was at ConFoo, and it is now online.

View or Post Comments

News PgLife Averages Thirty Active Users

Monday, May 13, 2013

A week after the release of PgLife, the site is averaging thirty active users. (I define an active user as an ip address that has viewed the site for at least five minutes during the past hour.) I consider that a success. Since the release of PgLife, I have increased the content update interval and added an About page explaining the site's purpose, which also includes the active user count.

The site uses ajax, Perl, Procmail rules, and Apache to collect and deliver dynamic content. Recent improvements in the Postgres mailing list archive feature set have made linking to emails much simpler.

PgLife was my first attempt at a dynamic website, and I learned a few things. First, I learned the value of having an alert file that can force a browser reload, allowing fixes and improvements to be pushed to users. Second, I used the same file to push news alerts to users, e.g., 9.3 Beta1. Third, I learned the importance of controlling browser and server caching and revalidation when serving dynamic content.

I have always wanted a dashboard that shows current Postgres activity, and PgLife meets that need for me, and thankfully others. I am very fortunate that much of the Postgres activity is recorded in a way that makes external processing easy, and I know some people are working to make more content available, which will eventually be added to PgLife.

View or Post Comments

News PgLife Is Now Live

Monday, May 6, 2013

New Postgres users are often confused by the variety of activities in the Postgres community. Even we old-timers can be overwhelmed monitoring the Postgres community. To meet these two needs, I have created a website called PgLife which dynamically displays the current activity in each of the community areas: email, development, blogging, media, news, events, and irc chat. I am hopeful people will find this useful in understanding the Postgres community, and in increasing their involvement.

View or Post Comments

Release notes 9.3 Release Notes Ready for Beta

Friday, May 3, 2013

I have completed the 9.3 release notes so they are ready for beta release. They will be regularly updated until 9.3 final.

View or Post Comments

Release notes Starting on 9.3 Release Notes

Friday, April 12, 2013

With our final 9.3 commit-fest almost complete, and few 9.3 open items, I have been asked to start on the 9.3 release notes. I have blogged about this process in the past (1, 2), so I will not bore you with the details. I expect to be done in a week.

View or Post Comments

Client Matching Client and Server Constraints

Thursday, April 11, 2013

Postgres supports all the sql-standard constraint types, e.g., unique, check. These work fine at the server level:

CREATE TABLE ledger (id SERIAL, balance NUMERIC(10,2) CHECK (balance > 0));
 
INSERT INTO ledger VALUES (DEFAULT, -2.00);
ERROR:  new row for relation "ledger" violates check constraint "ledger_balance_check"
DETAIL:  Failing row contains (1, -2.00).

While this clearly prevents invalid data from being entered into the database, it isn't very helpful in providing client-level feedback to the user. One improvement is to set the name of the constraint:

DROP TABLE ledger;
 
CREATE TABLE ledger (id SERIAL, balance NUMERIC(10,2)
                     CONSTRAINT "Zero and negative balances not allowed" CHECK (balance > 0));
 
INSERT INTO ledger VALUES (DEFAULT, -2.00);
ERROR:  new row for relation "ledger" violates check constraint "Zero and negative balances not allowed"
DETAIL:  Failing row contains (1, -2.00).

This returns a more meaningful client error string in the constraint name. (Fyi, Postgres does not record not null constraint names.) It is possible for clients to parse the error string and return the constraint name as error feedback to the user. This has a few disadvantages:

  • extracting the constraint name from the error messages is error-prone
  • different types of constraints, e.g., unique, have different message formats
  • different locale languages have different error messages, with the constraint name potentially being in a different part of the string
  • the constraint name is table-specific, and doesn't allow application-specific feedback

This will be much easier in Postgres 9.3 with the addition of the constraint name to error message output. Here is the output in psql:

\set VERBOSITY verbose
INSERT INTO ledger VALUES (DEFAULT, -2.00);
ERROR:  23514: new row for relation "ledger" violates check constraint 
        "Zero and negative balances not allowed"
DETAIL:  Failing row contains (1, -2.00).
SCHEMA NAME:  public
TABLE NAME:  ledger
CONSTRAINT NAME:  Zero and negative balances not allowed
LOCATION:  ExecConstraints, execMain.c:1672

I often tout the ability to write server-side code in the same language as database clients. This eases moving code between client and server. What is more interesting is code sharing, where you use similar code on the client and server. Will Leinweber has an interesting example where he uses plv8 to create JavaScript validation code that can be run in the browser and in the database.

Hopefully this blog entry has given you some ideas of how to harness server-side constraints in client applications, and how to share validation code between client and server. None of these approaches is perfect, but hopefully they will ease the burden of maintaining duplicate client and server validation checks.

View or Post Comments

NoSQL Dynamic Columns

Wednesday, April 10, 2013

Sql has never been good at handling dynamic columns — it can easily process tables with a fixed number of columns, but dynamic column handling has always been a challenge. (Dynamic columns are often used to implement user-defined fields.) One approach has been to define columns for every possible value that might be stored in a row; this leads to sparse tables containing mostly null values. Because Postgres stores null values in a single bit, there is little overhead to this approach. However, it is hard to call these dynamic columns since every new column requires an alter table ... add column command.

Another approach is to use an entity/attribute/value layout, where each value is stored in its own row. This obviously increases storage requirements, has performance challenges, and adds complexity to sql queries.

Fortunately, Postgres has various options to efficiently handle dynamic columns. The simplest solution is arrays, which allow multiple values of the same data type to be stored in a single field. This assumes that the values are independent, or that their ordinal position in the array is sufficiently meaningful.

A more complex approach is to associate a key with each value. (Values internally are strings, rather than bound to a specific data type.) Hstore is a very popular key/value store for Postgres, and has index lookup support for arbitrary key/value pairs. (It does not allow hierarchical storage, i.e., key/values inside a key.)
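
For example, here is a minimal hstore sketch. It assumes the hstore extension is installed, and the product table and its keys are hypothetical:

CREATE EXTENSION hstore;
 
CREATE TABLE product (id SERIAL, attrs HSTORE);
 
INSERT INTO product VALUES (DEFAULT, 'color => "red", size => "large"');
 
CREATE INDEX i_product_attrs ON product USING gin (attrs);
 
-- find rows whose attrs contain the given key/value pair
SELECT * FROM product WHERE attrs @> 'color => "red"';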

Xml and more recently json allow hierarchical storage and more naturally match client application languages. Unfortunately, none of the Postgres native index types understand the structure of json or xml data. (Json and xml documents are stored internally as single text strings.) To index json, it is necessary to create an expression index on a plv8 function (which understands json). Andrew Dunstan has a good blog entry using a plv8 function (jmember) to index specific json keys. Postgres 9.3 will add functions (blog, blog, blog, blog) to access json values directly — this will eliminate the need to use plv8 in simple cases.
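
For simple cases, a 9.3-style sketch might look like this (the docs table and the name key are hypothetical):

CREATE TABLE docs (id SERIAL, doc JSON);
 
-- expression index on a single json key, using the 9.3 ->> operator
CREATE INDEX i_docs_name ON docs ((doc->>'name'));
 
SELECT * FROM docs WHERE doc->>'name' = 'Mark';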

For arbitrary json key lookups, it is possible to use a Postgres full text search index to filter out most non-matching rows and use a json-aware function to compare the remaining rows — this avoids the requirement of creating an index per lookup key.
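
Continuing with the hypothetical docs table, a sketch of that hybrid approach lets the full text index narrow the candidate rows while a json-aware expression rechecks them:

CREATE INDEX i_docs_fts ON docs USING gin (to_tsvector('simple', doc::text));
 
SELECT * FROM docs
WHERE to_tsvector('simple', doc::text) @@ to_tsquery('simple', 'Mark')
      AND doc->>'name' = 'Mark';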

Xml has similar capabilities to json — it can store hierarchical data, and can use full text search to filter matching rows. While many server-side languages understand xml, e.g., PL/Java, PL/Perl, most people use the built-in Postgres xml functions in expression indexes.

As you can see, Postgres offers a variety of options to store dynamic columns while maintaining a fixed-column table format that is easily accessed via sql — this gives you the best of both worlds when storing dynamic data.

View or Post Comments

Pg upgrade Pg_Upgrade Now Handles Invalid Indexes

Tuesday, April 9, 2013

In December, I reported that a bug had been discovered in how pg_upgrade handled invalid indexes, and that the next release of pg_upgrade would throw an error if invalid indexes were found; that release happened in February. In an easily-overlooked development, last week's security release changed pg_dump so it now skips dumping invalid indexes. This allows the most recent version of pg_upgrade to upgrade clusters with invalid indexes, rather than throwing an error.

View or Post Comments

News Twenty-One Hours to Exploit

Monday, April 8, 2013

You probably heard about the important Postgres security releases (FAQ) last week, but maybe you thought it was mostly a theoretical threat. Well, the Schemaverse game author left a non-upgraded server open to the Internet to see how long it would take to be infected — turns out it took a sobering 21 hours. Hopefully this gets everyone's attention on the need to upgrade.

View or Post Comments

Indexing Partial Indexes

Monday, April 8, 2013

Having covered expression indexes, I want to cover another index feature — partial indexes. Partial indexes allow you to create an index on part of a table. Why would you want to do that?

  • index only frequently-accessed values, allowing smaller index files
  • avoid indexing common values, which are unlikely to be useful in an index
  • smaller index files allow faster index traversal and improve index caching
  • avoid index insertion overhead for non-indexed values
  • constrain uniqueness to a subset of table rows

People usually don't use expression or partial indexes initially, but there are certainly some use cases that can only be solved with these index features.
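
Here is a minimal sketch; the orders table and its statuses are hypothetical:

CREATE TABLE orders (id SERIAL, customer_id INTEGER, status TEXT);
 
-- index only frequently-accessed rows, keeping the index small
CREATE INDEX i_orders_active ON orders (customer_id) WHERE status <> 'archived';
 
-- constrain uniqueness to a subset of rows: at most one open order per customer
CREATE UNIQUE INDEX i_orders_one_open ON orders (customer_id) WHERE status = 'open';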

View or Post Comments

Indexing Expression Indexes

Saturday, April 6, 2013

Having covered virtual columns recently, I wanted to cover one additional type of stored value, expression indexes. Unlike virtual columns, which are computed on each access, expression indexes are stored in index files, allowing rapid access. Let's look at some examples, building on the customer table and fullname function created in my previous blog entry:

CREATE INDEX i_customer_lastname ON customer (lastname);
 
CREATE INDEX i_customer_concat ON customer ((firstname || ' ' || lastname));
 
CREATE INDEX i_customer_fullname ON customer (fullname(customer));

The first create index command simply creates a copy of the column lastname in a btree-ordered index file. The second example concatenates firstname and lastname fields and stores the result in a btree-ordered index file; it requires double-parentheses around the expression. The last example stores the output of the function fullname in an index file.

The ability to store the output of expressions and functions allows rapid access for queries that are more complex than simple column comparisons. For example,

SELECT * FROM customer WHERE firstname || ' ' || lastname = 'Mark Pennypincher';
 id | firstname |   lastname
----+-----------+--------------
  1 | Mark      | Pennypincher
 
SELECT * FROM customer WHERE fullname(customer) = 'Mark Pennypincher';
 id | firstname |   lastname
----+-----------+--------------
  1 | Mark      | Pennypincher
 
SELECT * FROM customer WHERE customer.fullname = 'Mark Pennypincher';
 id | firstname |   lastname
----+-----------+--------------
  1 | Mark      | Pennypincher

Explain allows us to see the index being used:

EXPLAIN SELECT * FROM customer WHERE firstname || ' ' || lastname = 'Mark Pennypincher';
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Index Scan using i_customer_fullname on customer  (cost=0.29..8.31 rows=1 width=12)
   Index Cond: (((firstname || ' '::text) || lastname) = 'Mark Pennypincher'::text)
 
EXPLAIN SELECT * FROM customer WHERE fullname(customer) = 'Mark Pennypincher';
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Index Scan using i_customer_fullname on customer  (cost=0.29..8.31 rows=1 width=12)
   Index Cond: (((firstname || ' '::text) || lastname) = 'Mark Pennypincher'::text)
 
EXPLAIN SELECT * FROM customer WHERE customer.fullname = 'Mark Pennypincher';
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Index Scan using i_customer_fullname on customer  (cost=0.29..8.31 rows=1 width=12)
   Index Cond: (((firstname || ' '::text) || lastname) = 'Mark Pennypincher'::text)

Notice that all three use the i_customer_fullname index. The sql fullname function was inlined when the index was created, so it was expanded to ((firstname || ' '::text) || lastname); the i_customer_concat index is simply unnecessary. Storing Soundex values in expression indexes is also useful.

Basically, you have three options for creating auto-generated columns:

  • virtual columns, via functions, computed on access
  • computed table columns, populated by triggers
  • computed index columns, populated by expression indexes

Of course, only the last one, expression indexes, allows rapid access to computed values. The example below uses sql to find the input value that yields the supplied factorial output value, e.g., what value generates a factorial of 120:

-- create a table of 1000 integers
CREATE TABLE factorial_lookup AS SELECT i FROM generate_series(1, 1000) x(i);
 
-- compute a factorial for every row and compare
EXPLAIN SELECT i FROM factorial_lookup WHERE factorial(i) = 120;
                            QUERY PLAN
------------------------------------------------------------------
 Seq Scan on factorial_lookup  (cost=0.00..52.00 rows=12 width=4)
   Filter: (factorial((i)::bigint) = 120::numeric)
 
-- create an index of factorial output values
CREATE INDEX i_factorial_lookup ON factorial_lookup (factorial(i));
 
-- generate optimizer statistics
ANALYZE factorial_lookup;
 
-- use the index for rapid factorial output comparison
EXPLAIN SELECT i FROM factorial_lookup WHERE factorial(i) = 120;
                                        QUERY PLAN
-------------------------------------------------------------------------------------------
 Index Scan using i_factorial_lookup on factorial_lookup  (cost=0.53..8.55 rows=1 width=4)
   Index Cond: (factorial((i)::bigint) = 120::numeric)

Interestingly, Postgres computes optimizer statistics on expression indexes, even though the expressions do not exist in any table:

SELECT attname FROM pg_stats WHERE tablename = 'i_factorial_lookup';
  attname
-----------
 factorial

Expression indexes are just another tool available to Postgres database users — they allow rapid access to rows based on table column expressions and functions.

View or Post Comments

Internals All Operators Call Functions

Thursday, April 4, 2013

Operators are usually easily distinguished from functions because they use punctuation, e.g., -, *, /, rather than words with letters and numbers. There are two popular types of operators, unary and binary. Unary operators take one argument, e.g., -3 (minus), !false (not). Binary operators take two arguments, one on each side of the operator, e.g., 2 + 3 (addition), 'a' || 'b' (concatenation). (C-based languages often also support the ternary operator ?:.)

Perhaps surprisingly, in Postgres comparisons are also operators, e.g., a = 3 (equality), to_tsvector('hello bye') @@ 'hello' (full text search). Comparison operators return true, false, or null, and their return values are processed by surrounding keywords or clauses, e.g., where, and. For example, to process where col1 = 10 + 1 and col2 > 5, 10 + 1 is processed with +, col1 = 11 is then processed with =, and col2 > 5 is processed with >; the results are combined by and, and its return value is passed to the query processor via where. Just as you can type \dfS to see all defined functions in psql, and \dTS to see all defined data types, \doS shows all defined operators.

You might think of operators as distinct from functions, but internally operator behavior is bound directly to functions. For example, factorial contains both operator and function definitions. The factorial operator, !, is actually the only Postgres-supplied postfix (right unary) operator:

SELECT 5!;
 ?column?
----------
      120
 
SELECT oprname, oprcode FROM pg_operator WHERE oprright = 0;
 oprname |   oprcode
---------+-------------
 !       | numeric_fac

It is also implemented as a function, factorial:

SELECT factorial(5);
 factorial
-----------
       120
 
SELECT proname, prosrc FROM pg_proc WHERE proname = 'factorial';
  proname  |   prosrc
-----------+-------------
 factorial | numeric_fac

Notice both reference numeric_fac. Where is that defined? It is defined as a C-language function, and is called by both ! and factorial(). Create operator allows users to create their own operators based upon already-defined functions, even user-defined functions. While user-defined operators are not as common as user-defined functions, they are frequently used by Postgres extensions. For example, hstore defines 14 operator combinations.
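
As a minimal sketch, here is a hypothetical "approximately equal" operator built on a user-defined function:

CREATE FUNCTION approx_eq(numeric, numeric) RETURNS boolean AS $$
    SELECT abs($1 - $2) < 0.01
$$ LANGUAGE SQL;
 
CREATE OPERATOR ~== (
    LEFTARG = numeric,
    RIGHTARG = numeric,
    PROCEDURE = approx_eq
);
 
SELECT 1.001 ~== 1.002;
 ?column?
----------
 t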

Just like functions and data types, operators are "laid on top" of the sql language in Postgres. So, the next time you call an operator, realize there is a lot going on under the hood, and you can change its behavior.

View or Post Comments

Data modeling Passing Arguments to Functions

Tuesday, April 2, 2013

There was some nifty function-calling syntax used in my previous blog post about virtual columns, so I now want to cover function calling syntax in detail. Traditionally, functions are called in sql using a procedural syntax, e.g., function(argument):

SELECT upper('a');
 upper
-------
 A
 
select pow(2, 4);
 pow
-----
  16

With Postgres, you can also pass sql composite types in a procedural manner. Composite types represent data rows or records of data values. The following query works because the function was defined in my previous blog post to accept a composite type as an argument, i.e., create function fullname(customer), where customer is a relation name, and hence a composite type:

SELECT fullname(customer) FROM customer;
     fullname
-------------------
 Mark Pennypincher

You can also append .*:

SELECT fullname(customer.*) FROM customer;
     fullname
-------------------
 Mark Pennypincher

Where it really gets interesting is when we use an alternative function call syntax with composite types that allows the function call to appear as a normal column — this is part of the original object-relational design from Berkeley. It follows a more object-oriented approach, where object variables (data storage) and object methods (functions) are accessed using a similar syntax, i.e., fullname appears as a storage column, but is actually a function name:

SELECT customer.fullname FROM customer;
     fullname
-------------------
 Mark Pennypincher

This is the syntax we used for virtual columns — hopefully they make more sense now. It is also possible to dynamically create composite types and use this calling syntax. To create composite types, you use parentheses with either a comma inside the parentheses, or a period after the closing parenthesis:

SELECT ('hi');
 ?column?
----------
 hi
 
SELECT ('hi', 'bye');
   row
----------
 (hi,bye)
 
SELECT ('hola').upper;
 upper
-------
 HOLA

The first example does not match the criteria for composite creation — notice the output does not show parentheses, meaning it is not a composite type. The second does create a composite type by including a comma. The third example illustrates calling the upper function using a dynamically-created composite type. (Row can also be used to create composite types, though the function calling syntax doesn't seem to work with them.)

One disadvantage of the composite syntax is that, because it was developed before schemas (namespaces), there is no way to specify the function's schema:

SELECT ('hola').pg_catalog.upper;
ERROR:  column notation .pg_catalog applied to type unknown, which is not a composite type
LINE 1: SELECT ('hola').pg_catalog.upper;
                ^
SELECT ('hola').(pg_catalog.upper);
ERROR:  syntax error at or near "("
LINE 1: SELECT ('hola').(pg_catalog.upper);
                        ^

Only search_path is used to locate such functions.

You can also output a data row in composite format by surrounding the relation name with parentheses:

SELECT (customer) FROM customer;
       customer
-----------------------
 (1,Mark,Pennypincher)

In summary, there are two function calling conventions — a procedural one, which everyone knows, and a less well-known composite method. The composite method can use either a composite type, like a relation name, or a dynamically-created type, which requires parentheses. The composite method could probably benefit from syntax flexibility improvements in future Postgres releases.

View or Post Comments

Data modeling Creating Virtual Columns

Monday, April 1, 2013

Databases store data in table columns, but sometimes it is useful to create non-storage columns for code clarity. For example, you might need to access a combination of columns frequently and don't want to repeatedly perform the combining operation in your application. In this case, a column can be created that acts as a virtual read-only column. In the example below, a fullname function is created that combines the firstname and lastname columns, and the fullname function is then accessed as a virtual column:

CREATE TABLE customer (id SERIAL, firstname TEXT, lastname TEXT);
 
-- create virtual column
CREATE FUNCTION fullname(customer) RETURNS text AS $$
    SELECT $1.firstname || ' ' || $1.lastname
$$ LANGUAGE SQL;
 
INSERT INTO customer VALUES (DEFAULT, 'Mark', 'Pennypincher');
 
SELECT customer.fullname FROM customer;
     fullname
-------------------
 Mark Pennypincher
 
-- the table name is not optional
SELECT fullname FROM customer;
ERROR:  column "fullname" does not exist
LINE 1: SELECT fullname FROM customer;
               ^
 
-- virtual columns are read-only
INSERT INTO customer (fullname) VALUES ('Andy Toll');
ERROR:  column "fullname" of relation "customer" does not exist
LINE 1: insert into customer (fullname) VALUES ('Andy Toll');
                              ^

You can even create column aliases in this way:

-- create column alias
CREATE FUNCTION nickname(customer) RETURNS text AS $$
    SELECT $1.firstname
$$ LANGUAGE SQL;
 
SELECT customer.nickname FROM customer;
 nickname
----------
 Mark

Only functions with a table name argument can be called in this way, i.e., as table_name.function_name. Because sql functions are often inlined before execution, their overhead is minimal — notice that the explain plan for the real and virtual column references are identical:

EXPLAIN VERBOSE SELECT customer.firstname, customer.nickname FROM customer;
                            QUERY PLAN
-------------------------------------------------------------------
 Seq Scan on public.customer  (cost=0.00..18.30 rows=830 width=32)
   Output: firstname, firstname

Such functions can also be created in other server-side languages, like PL/pgSQL:

CREATE FUNCTION fullname2(customer) RETURNS text AS $$
BEGIN
        RETURN $1.firstname || ' ' || $1.lastname;
END
$$ LANGUAGE plpgsql;
 
SELECT customer.fullname2 FROM customer;
     fullname2
-------------------
 Mark Pennypincher

Of course, much of this can already be done with views:

CREATE VIEW customer_view AS 
    SELECT firstname, lastname, firstname || ' ' || lastname AS fullname 
    FROM customer;
 
SELECT fullname FROM customer_view;
     fullname
-------------------
 Mark Pennypincher

Views are nice in that they do not require creation of a function to use them, but they do require applications to reference the view name rather than the base table name. (Of course, ddl can be added to make views updatable.) Virtual columns are nice because they are only active when the virtual column is referenced on the base table.

View or Post Comments

Thoughts Good Enough?

Friday, March 29, 2013

Building upon my previous blog post, where I declared Postgres was "good enough", I want to explore what "good enough" means. I will use an aircraft design analogy. (You might remember me exploring a Postgres/aircraft parallel before.)

In aircraft design, everything is a tradeoff between weight, cost, reliability, and safety. Let's look at safety — what would be the safest material to use in constructing an airplane? One logical selection would be to make it from the same material as the flight data recorder, i.e., the black box. It is designed to withstand the most serious crashes, so obviously, it would be the safest.

Of course, the problem is that an aircraft made with material of the same strength as a flight data recorder would be so heavy it would never get off the ground. A material must be chosen that is safe enough, and light enough so the plane can fly. Obviously, measuring best in only one dimension leads to impractical problems in other areas.

Software has similar issues — cheapest, fastest, most flexible, most featureful, most reliable, least maintenance — these are all possible measures for evaluating software, but the best solution strikes a balance in each of these areas. The fastest software might be custom-produced and written in assembler, but its cost and rigidity would make it a poor choice. Databases have to balance these measures to produce useful software for many users. For example, for some sites, NoSQL has sufficient reliability, and for others, SQLite has enough features. (I have already explored the tradeoff between configurability and maintenance overhead.)

Postgres is always working to extend its attractiveness in every major category. Sometimes we can improve one category without diminishing another, but at other times, this is not possible, and our community must choose what will be most attractive for our current and future users. These tradeoffs are never easy, but thanks to the close communication among our developers and users, we usually triangulate on an ideal outcome. Sometimes we can allow users to choose tradeoffs, e.g., between performance and durability.

So Postgres is good enough, in so many ways. It probably isn't superior in any of these categories (except perhaps cost and flexibility), but its combination of low cost, flexibility, speed, features, reliability, and minimal maintenance makes it a star performer. Our users think its combination is the best, and you might think so too.

View or Post Comments

Thoughts Postgres Is Good Enough

Thursday, March 28, 2013

With the increased attention Postgres is getting, many ask whether Postgres has the same features as Oracle, MS SQL, or DB2. Of course, the answer is that Postgres doesn't have all their features, and never will, but it is also true that those other databases will never have all of Postgres's features. The better question is whether Postgres has all the features you need for a successful deployment, and the answer to that is usually an enthusiastic "yes".

This gets into the larger question of when something is good enough. Good enough sounds negative, like you are giving up something to settle for a partial solution. However, good enough is often the best option, as the negative aspects of an ideal solution (cost, overhead, rigidity) often make good enough best. In response to Oracle's announcement of new hardware, this Slashdot comment captures how great good enough can be:

I used to work in exclusively Sun shops, and I've dealt with Oracle for years. There's little that the hardware and their database can do that can't be replicated by x64 and something like Postgres with some thought behind your architecture. For certain, the features they do have are not cost effective against the hundreds of thousands of dollars you pay for Oracle DB licensing, and the premium you pay for SPARC hardware and support.

So, in the future, if you hear someone say Postgres is good enough for their needs, don't think of them as settling for less — think of them as having chosen Postgres as the best solution they could find, because in so many situations, Postgres is the best.

View or Post Comments

Thoughts Postgres as a Data Platform

Tuesday, March 26, 2013

With all the excitement surrounding Postgres, I have heard speculation about why Postgres has become such a big success. Many point to our exemplary community structure, while others point to our technical rigor. One item of particular scrutiny has been our extensibility — I was asked about this twice at the recent New York City PGDay.

Postgres (named as post-Ingres) was designed to be extensible, meaning not only can users create server-side functions, but also custom data types, operators, index types, aggregates, and languages. This extensibility is used by our contrib extensions, PostGIS, and many others. Someone said it was like Postgres developed the sql language, then laid the data types on top. Will Leinweber, in his nyc presentation, described Postgres as a "data platform rather than a database", and his use of hstore, json, and plv8 illustrated the point well.

Extensibility certainly isn't the only reason for Postgres's success, but it is part of a larger set of ideal circumstances that allows Postgres to thrive in an environment where many other relational databases are struggling.

View or Post Comments

Conference Conference Report

Monday, March 25, 2013

I just returned from PGDay NYC. The conference had an intensity that I have not seen in other conferences recently. Last year's PGDay NYC conference was in a large facility and felt dominated by people considering Postgres for serious deployments. This year's event was in adjacent underground rooms, and the crowd seemed completely sold on Postgres. The compactness of the venue and the enthusiasm of the crowd made for a unique energy. One of the rooms even had couches and cushioned chairs — I don't think I have ever listened to a technical presentation from a couch — hard to beat.

The event was attended by four core team members and distant speakers like Joshua Drake, David Fetter, and Will Leinweber. Will's presentation was particularly exciting for me and I will be blogging about it soon.

View or Post Comments

Thoughts Table Partitioning Needs Improvement

Friday, March 8, 2013

Postgres is very good at adding features built upon existing features. This process was used when table partitioning was added in Postgres 8.1 in 2005. It was built upon three existing features:

  • table inheritance
  • check constraints
  • constraint exclusion (the constraint_exclusion setting)

Though constraint_exclusion (which controls partition selection) has received some usability improvements since 2005, the basic feature has remained unchanged. Unfortunately, the feature as implemented has several limitations which still need to be addressed:

  • Partitioning requires users to create check constraints to define the contents of each partition.
  • Because of check constraints, there is no central recording of how the partitions are segmented.
  • Also because of check constraints, there is no way to rapidly evaluate the partition contents — each check constraint must be re-evaluated. Due to this expensive operation, partition selection is done while the plan is being created, rather than executed, meaning that only query constants can be used to select partitions — joining to a partitioned column cannot eliminate unnecessary partition scans.
  • Check constraint evaluation also adds serious performance overhead for systems using thousands of partitions.
  • To route write queries to the proper partition, triggers or rules must be created on the parent table. They must know about all partitions, and anytime a partition is created or removed, this trigger or rule must be updated, unless some type of partition-name wildcard pattern is used in the trigger code.
  • There is no parallel partition scanning, though that is more due to the lack of Postgres parallelism, rather than a partitioning limitation.

There is a wiki related to table partitioning that covers some of this in detail.
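
To make the limitations above concrete, here is a minimal sketch of the inheritance-based setup; the measurement table and date ranges are hypothetical, and the write-routing trigger is omitted:

CREATE TABLE measurement (logdate DATE, value INTEGER);
 
CREATE TABLE measurement_2013_01 (
    CHECK (logdate >= DATE '2013-01-01' AND logdate < DATE '2013-02-01')
) INHERITS (measurement);
 
CREATE TABLE measurement_2013_02 (
    CHECK (logdate >= DATE '2013-02-01' AND logdate < DATE '2013-03-01')
) INHERITS (measurement);
 
-- with constraint exclusion enabled, only the January child is scanned
SET constraint_exclusion = partition;
EXPLAIN SELECT * FROM measurement WHERE logdate = DATE '2013-01-15';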

Postgres partitioning was good enough to foster its wide deployment, but the limitations of building partitioning upon existing features are clear. It is perhaps time to develop more partition-specific infrastructure to allow easier-to-use and more scalable partitioning in Postgres.

One downside is that any partitioning improvements that address these limitations would probably be less flexible than our current partitioning capabilities — in a way, it is our flexibility that has led to the performance and user-overhead requirements of our current setup. It is probably the challenge of reducing partitioning's limitations while maintaining its flexibility that has prevented improvements in this area.

View or Post Comments

Thoughts The Middleware Porting Challenge

Thursday, March 7, 2013

Last month I visited India for EnterpriseDB and talked to many government and corporate groups. We don't have many independent Postgres community members in India, so I was surprised by the serious interest in Postgres and the number of large deployments under consideration.

The issue of porting applications to Postgres was discussed. Normally, when I think of porting to Postgres, I think of company-created applications that just require a proper Postgres interface library and simple sql query modifications — and our community maintains porting guides for many databases, e.g., Oracle, MySQL, MSSQL, other. In addition, most open source application languages are rapidly moving to standardize on Postgres, so I thought most of the big hurdles for migration were solved.

However, there is one area that still has significant barriers for Postgres migration, and that is middleware, particularly proprietary middleware. If your application is written to use a middleware api, and that middleware doesn't support Postgres, there isn't much possibility of porting to Postgres unless you change your middleware software, and changing middleware software involves not just modifying queries, but rearchitecting the entire application to use a different middleware api.

At this point, almost every application language and all open source middleware has Postgres support, so hopefully more proprietary middleware vendors will add Postgres support soon.

View or Post Comments

NoSQL The Future of Relational Databases

Tuesday, March 5, 2013

A few months ago I wrote an article about how NoSQL databases are challenging relational databases, and it has recently been published. My article's goal was to explain that while NoSQL databases have advantages over relational databases, there are many disadvantages that are often overlooked. I also explained how relational databases, particularly Postgres, are adding features to adapt to NoSQL workloads.

View or Post Comments

Performance Parallelism Roadmap

Thursday, January 24, 2013

In December of 2011, I blogged about the increasing need for parallelism in the Postgres backend. (Client applications have always been able to do parallelism with subprocesses, and with threads since 2003).

Thirteen months later, I have added two parallel code paths (1, 2) to pg_upgrade. I have learned a few things in the process:

  • parallelism can produce dramatic speed improvements (4x vs 4%)
  • adding parallelism isn't difficult to code, even for MS Windows
  • only certain tasks can benefit from parallelism

Using pg_upgrade as an example, parallelism can yield a 10x performance improvement, and it only took me a few weeks to accomplish. However, to get 10x improvement, you have to have multiple large databases, and be using multiple tablespaces — others will see more moderate gains. Fortunately, I have also improved pg_upgrade performance by 5x even without parallelism.

Anyway, with those lessons learned, I am ready to start mapping out a parallelism strategy for the Postgres server. I have created a wiki that outlines the benefits, challenges, and specific opportunities for parallelism in the backend. Feel free to add items or make adjustments.

View or Post Comments

Null Null Summary (Part 11/11)

Wednesday, January 23, 2013

I hope you have found my blog series about nulls useful. I was inspired to write this series by Jeff Davis's 2009 blog entry on the topic, in which he states:

"Oh, that makes sense" — When you see individual behaviors of null, they look systematic, and your brain quickly sees a pattern and extrapolates what might happen in other situations. Often, that extrapolation is wrong, because null semantics are a mix of behaviors. I think the best way to think about null is as a Frankenstein monster of several philosophies and systems stitched together by a series of special cases.

The Wikipedia article about nulls is quite thorough, including criticisms. This C. J. Date book excerpt also has a good summary of possible null traps to avoid.

The sql standards committee had to balance convenience and consistency in designing null behavior. Unfortunately, they didn't achieve 100% success in either area, but achieved reasonable success in both.

Here is a summary of all the blog entries in this series:

  1. Nulls Make Things Easier?
  2. Explicit and Implicit Null Generation
  3. The Non-Value of Nulls
  4. The Three-Valued Logic of Nulls
  5. The Not In Null Trap
  6. Comparing Nulls
  7. Ordering and Indexing Nulls
  8. Nulls and Aggregates
  9. Mapping Nulls to Strings
  10. Nulls in Arrays and Row Expressions
  11. Null Summary

View or Post Comments

Null Nulls in Arrays and Row Expressions (Part 10/11)

Monday, January 21, 2013

Null behavior as part of an array or row expression has some unusual behaviors that are best simply memorized:

SELECT NULL::INTEGER[] IS NULL;
 ?column?
----------
 t
 
SELECT '{}'::INTEGER[] IS NULL;
 ?column?
----------
 f
 
SELECT '{NULL}'::INTEGER[] IS NULL;
 ?column?
----------
 f

A null cast to an integer array is null, while an empty array or an array containing a null is not null.

SELECT ROW() IS NULL;
 ?column?
----------
 t
 
SELECT ROW(NULL) IS NULL;
 ?column?
----------
 t
 
SELECT ROW(NULL,NULL) IS NULL;
 ?column?
----------
 t
 
SELECT ROW(NULL,1) IS NULL;
 ?column?
----------
 f
 
SELECT ROW(NULL,1) IS NOT NULL;
 ?column?
----------
 f
 
SELECT ROW(1,2) IS NOT NULL;
 ?column?
----------
 t

Empty row expressions or row expressions containing only nulls are null, while a row expression containing a mix of nulls and non-nulls is neither null nor not null. Rows containing only non-nulls are not null, of course. Basically, all of a row's values must have the same null status to return true for null checks.

Another odd behavior is using a query returning zero rows in the target list:

CREATE TABLE emptytest (x INTEGER);
 
SELECT * from emptytest;
 x
----
 
SELECT (SELECT * from emptytest);
   x
--------
 (null)
 
SELECT (SELECT * from emptytest) IS NULL;
 ?column?
----------
 t

Though the table has no rows, using it in a select target list returns a null. This is because a select with no from clause is assumed to return one row. (This was recently discussed on a Postgres email list.)

View or Post Comments

Performance Handling Growth with Postgres: 5 Tips From Instagram

Friday, January 18, 2013

Instagram has been public about their large-scale use of Postgres. They recently posted a blog entry mentioning five helpful tips about using Postgres.

View or Post Comments

Null Mapping Nulls to Strings (Part 9/11)

Friday, January 18, 2013

Sql specifies functions to map null values to and from non-null values. Coalesce returns the first passed non-null value:

SELECT COALESCE(NULL, 0);
 coalesce
----------
        0
 
SELECT COALESCE(NULL, 'I am null.');
  coalesce
------------
 I am null.
 
CREATE TABLE nullmaptest (x TEXT);
 
INSERT INTO nullmaptest VALUES ('f'), ('g'), (NULL);
 
SELECT x, COALESCE(x, 'n/a') FROM nullmaptest;
   x    | coalesce
--------+----------
 f      | f
 g      | g
 (null) | n/a
 
SELECT 'a' || COALESCE(NULL, '') || 'b';
 ?column?
----------
 ab
 
SELECT SUM(x), COALESCE(SUM(x), 0) FROM aggtest;
  sum   | coalesce
--------+----------
 (null) |        0

Notice that coalesce can return a value of any data type. It is very useful for specifying a special output string to represent null values. The second to last use of coalesce (using ||) shows how coalesce can be used to concatenate values that might be null. The last use of coalesce changes the unusual behavior I mentioned earlier about sum() returning null when no values are processed.
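
Coalesce also accepts more than two arguments, returning the first non-null one it finds, which is handy for falling back through several possibly-null values:

SELECT COALESCE(NULL, NULL, 3, 4);
 coalesce
----------
        3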

The function nullif returns a null if two parameters are equal — the inverse of coalesce:

DELETE FROM nullmaptest;
 
INSERT INTO nullmaptest VALUES ('f'), ('g'), ('n/a');
 
SELECT x, NULLIF(x, 'n/a') FROM nullmaptest;
  x  | nullif
-----+--------
 f   | f
 g   | g
 n/a | (null)
 
SELECT NULLIF('n/a', COALESCE(NULL, 'n/a'));
 nullif
--------
 (null)

The last example converts null to 'n/a', then back to null with nullif.
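
Though not a string-mapping example, a common nullif idiom worth knowing is wrapping a divisor so a division by zero yields null instead of an error, optionally paired with coalesce to supply a default:

SELECT 100 / NULLIF(0, 0);
 ?column?
----------
   (null)
 
SELECT COALESCE(100 / NULLIF(0, 0), 0);
 coalesce
----------
        0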

View or Post Comments

Null Nulls and Aggregates (Part 8/11)

Wednesday, January 16, 2013

Aggregates are also a regular source of confusion with nulls, both in how aggregates handle nulls on input, and when aggregates return nulls. Let's look at the input case first:

CREATE TABLE aggtest (x INTEGER);
 
INSERT INTO aggtest VALUES (7), (8), (NULL);
 
SELECT COUNT(*), COUNT(x), SUM(x), MIN(x), MAX(x), AVG(x) FROM aggtest;
 count | count | sum | min | max |        avg
-------+-------+-----+-----+-----+--------------------
     3 |     2 |  15 |   7 |   8 | 7.5000000000000000

There are two things to notice here:

  • COUNT(*) counts rows, regardless of any nulls stored in the rows
  • count(x) and all other aggregates ignore null values

Let's look at cases where aggregates return nulls, specifically where aggregates have no rows to process:

DELETE FROM aggtest;
 
SELECT COUNT(*), COUNT(x), SUM(x), MIN(x), MAX(x), AVG(x) FROM aggtest;
 count | count |  sum   |  min   |  max   |  avg
-------+-------+--------+--------+--------+--------
     0 |     0 | (null) | (null) | (null) | (null)

COUNT(*) and count(x) have no trouble with zero rows; they simply return zero. Min(), max(), and avg() are probably returning the best value they can for no rows. You might argue that avg() should return 0, but that would technically be computing 0 / 0, which is indeterminate in mathematics. The real problem here is sum(): the sum of zero rows is clearly zero, but the sql standards committee defined it as returning null, probably to be consistent with the other non-count aggregates. Of course, with sum() returning null for no rows, avg() returning null makes perfect sense. And, because count(x) and non-count aggregates ignore nulls, processing only null values also yields null outputs:

 
INSERT INTO aggtest VALUES (NULL);
 
SELECT COUNT(*), COUNT(x), SUM(x), MIN(x), MAX(x), AVG(x) FROM aggtest;
 count | count |  sum   |  min   |  max   |  avg
-------+-------+--------+--------+--------+--------
     1 |     0 | (null) | (null) | (null) | (null)

Group by groups nulls together, just like order by:

DELETE FROM aggtest;
DELETE 1
 
INSERT INTO aggtest VALUES (7), (8), (NULL), (NULL);
INSERT 0 4
 
SELECT x, COUNT(*), COUNT(x), SUM(x), MIN(x), MAX(x), AVG(x)
FROM aggtest
GROUP BY x
ORDER BY x;
   x    | count | count |  sum   |  min   |  max   |        avg
--------+-------+-------+--------+--------+--------+--------------------
      7 |     1 |     1 |      7 |      7 |      7 | 7.0000000000000000
      8 |     1 |     1 |      8 |      8 |      8 | 8.0000000000000000
 (null) |     2 |     0 | (null) | (null) | (null) |             (null)

The zero in the last row is interesting because even though there are two matching null rows shown by COUNT(*), count(x) ignores nulls, so it returns zero. The other null aggregate return values in that row are not because the nulls are being processed, but rather because no non-null values are seen.
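
One handy consequence of count(x) ignoring nulls is that subtracting it from COUNT(*) counts the nulls in a column:

SELECT COUNT(*) - COUNT(x) AS null_count FROM aggtest;
 null_count
------------
          2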

View or Post Comments

Null Ordering and Indexing Nulls (Part 7/11)

Monday, January 14, 2013

Having looked at the null comparisons, let's see how they behave in ordering and indexing. For ordering purposes, in a bow to practicality, nulls are considered equal to each other and are grouped together, because, hey, if they were not, ordering them wouldn't be very useful.

WITH ordertest AS (
        SELECT NULL
        UNION ALL
        SELECT 2
        UNION ALL
        SELECT 1
        UNION ALL
        SELECT NULL
)
SELECT * FROM ordertest
ORDER BY 1;
 ?column?
----------
        1
        2
   (null)
   (null)

Nulls can even be specified as coming first, rather than last:

WITH ordertest AS (
        SELECT NULL
        UNION ALL
        SELECT 2
        UNION ALL
        SELECT 1
        UNION ALL
        SELECT NULL
)
SELECT * FROM ordertest
ORDER BY 1 NULLS FIRST;
 ?column?
----------
   (null)
   (null)
        1
        2

For unique indexes, nulls are considered not equal to each other, and multiple nulls can be stored in a unique index:

CREATE TABLE uniqtest (x INTEGER);
 
CREATE UNIQUE INDEX i_uniqtest ON uniqtest (x);
 
INSERT INTO uniqtest VALUES (1), (NULL), (NULL);
 
SELECT * FROM uniqtest;
   x
--------
      1
 (null)
 (null)

In Postgres, nulls are stored in indexes and is null is an indexable operation. Of course, for null index lookups to be useful, there must be a small percentage of null values in the indexed column.
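
When nulls are rare but frequently queried, a partial index covering only the null rows keeps the index small while still making is null lookups indexable; here is a sketch using the table above (the index name is just an example):

CREATE INDEX i_uniqtest_null ON uniqtest (x) WHERE x IS NULL;
 
SELECT * FROM uniqtest WHERE x IS NULL;
   x
--------
 (null)
 (null)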

View or Post Comments

Null Comparing Nulls (Part 6/11)

Friday, January 11, 2013

I just covered how comparisons with nulls return more nulls, so how do you test for null? There are actually two clauses designed specifically to compare nulls.

The first is the is null comparison clause. It is a clause rather than an operator because I mentioned earlier that operators on nulls return nulls. Here is how it works:

SELECT NULL = NULL;
 ?column?
----------
 (null)
 
SELECT NULL IS NULL;
 ?column?
----------
 t
 
SELECT NULL IS NOT NULL;
 ?column?
----------
 f
 
SELECT * FROM inctest;
   x
--------
     30
     40
 (null)
 
SELECT * FROM inctest WHERE x IS NULL;
   x
--------
 (null)
 
SELECT * FROM inctest WHERE x IS NOT NULL;
 x
----
 30
 40

You can see that an is null comparison on a null returns true, which allows the where clause to treat the row as a match; is not null allows null values to be excluded.

Another null comparison clause is is distinct from:

SELECT 2 IS DISTINCT FROM 1;
 ?column?
----------
 t
 
SELECT NULL IS DISTINCT FROM 1;
 ?column?
----------
 t
 
SELECT NULL IS DISTINCT FROM NULL;
 ?column?
----------
 f
 
SELECT NULL <> 1;
 ?column?
----------
 (null)

is distinct from basically performs null comparisons in a two-valued logic system. For non-null inputs, it operates like a not-equal comparison. This allows for cleaner handling of the not-equals row comparison we did in an earlier blog entry:

SELECT * FROM inctest WHERE x IS DISTINCT FROM 30;
   x
--------
     40
 (null)

This returns all rows where x does not equal 30, or is null; in fact, this could have been rewritten as where x <> 30 or x is null. is not distinct from is the opposite (an equality comparison), and though it sounds like a double-negative, for non-null values it behaves the same as a plain equals comparison:

SELECT * FROM inctest WHERE x IS NOT DISTINCT FROM 30;
 x
----
 30

However, since = 30 already excludes nulls, it doesn't seem as useful, but it does allow you to do equality comparisons that consider nulls equal:

CREATE TABLE disttest (x INTEGER, y INTEGER);
CREATE TABLE
 
INSERT INTO disttest VALUES (1, 1), (2, 3), (NULL, NULL);
INSERT 0 3
SELECT * FROM disttest where x IS NOT DISTINCT FROM y;
   x    |   y
--------+--------
      1 |      1
 (null) | (null)

This is particularly useful when joining two columns that might contain nulls.
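
For example, a join that should treat null keys as matching each other can use is not distinct from; here is a sketch with two made-up tables:

CREATE TABLE t1 (k INTEGER);
CREATE TABLE t2 (k INTEGER);
 
INSERT INTO t1 VALUES (1), (NULL);
INSERT INTO t2 VALUES (1), (NULL);
 
SELECT * FROM t1 JOIN t2 ON t1.k IS NOT DISTINCT FROM t2.k;
   k    |   k
--------+--------
      1 |      1
 (null) | (null)

A regular t1.k = t2.k join would return only the row matching 1 to 1.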

View or Post Comments

Pg upgrade Pg_Upgrade Now Supports Parallel Copy/Link

Wednesday, January 9, 2013

Pg_upgrade copies or links user data and index files from the old cluster to the new one, and of course this can be slow, especially in the default copy mode. Previous releases of pg_upgrade did the copying/linking serially.

Building on my recent pg_upgrade --jobs mode addition, I have just applied a patch to allow each tablespace to be copied/linked in parallel. Here are the performance results with two tablespaces on different magnetic disks, in copy mode:

          ---- seconds ----
   GB        git     patch
    2       62.09    63.75
    4       95.93   107.22
    8      194.96   195.29
   16      494.38   348.93
   32      983.28   644.23
   64     2227.73  1244.08
  128     4735.83  2547.09

The test machine has 24gb of ram, and you can see the patch has little benefit until the kernel cache starts to fill at 16gb.

This is my last planned pg_upgrade speed improvement. While performance tuning isn't done until the execution time is negative, I will need to find new ideas to improve pg_upgrade performance further. In summary, Postgres 9.3 will include a pg_upgrade optimized for database clusters using tablespaces and with many tables and databases.

View or Post Comments

Null The Not In Null Trap (Part 5/11)

Monday, January 7, 2013

As a corollary to the three-valued logic covered in the previous blog entry, not in comparisons are a regular source of surprise. Consider this comparison of one against various values:

SELECT 1 <> 2 AND 1 <> 3;
 ?column?
----------
 t
 
SELECT 1 <> 2 AND 1 <> 3 AND 1 <> NULL;
 ?column?
----------
 (null)

The first query has an obvious result, and looking at the second query, you can now understand why it returns null because of 1 <> null. The surprises come when you aren't looking at explicit nulls but rather at a subquery that generates a null:

SELECT 'a' IN (SELECT NULL::text);
 ?column?
----------
 (null)
 
SELECT 'a' NOT IN (SELECT NULL::text);
 ?column?
----------
 (null)

Combining multi-value comparison and nulls, here is an example with the subqueries returning two rows:

SELECT 'a' IN (SELECT 'a' UNION ALL SELECT NULL);
 ?column?
----------
 t
 
SELECT 'a' NOT IN (SELECT 'a' UNION ALL SELECT NULL);
 ?column?
----------
 f
 
SELECT 'a' IN (SELECT 'b' UNION ALL SELECT NULL);
 ?column?
----------
 (null)
 
SELECT 'a' NOT IN (SELECT 'b' UNION ALL SELECT NULL);
 ?column?
----------
 (null)

The first two queries make sense because there are matches. The last two queries have no matches, so they expand to:

SELECT 'a' = 'b' OR 'a' = NULL;
 ?column?
----------
 (null)
 
SELECT 'a' <> 'b' AND 'a' <> NULL;
 ?column?
----------
 (null)

In the first case, the null result is easy to accept: 'a' does not equal 'b', and we cannot know whether it equals the null, so the or yields null. What really trips people up is the second case, where 'a' <> 'b' is true but 'a' <> null is null, so the and returns null rather than the expected true. The proper solution is never to allow not in subqueries to return nulls, usually by excluding them in the subquery where clause.
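
Here is what that looks like in practice, using a made-up lookup table with a nullable column; filtering out the nulls in the subquery restores the expected answer:

CREATE TABLE lookup (y TEXT);
 
INSERT INTO lookup VALUES ('b'), (NULL);
 
SELECT 'a' NOT IN (SELECT y FROM lookup);
 ?column?
----------
 (null)
 
SELECT 'a' NOT IN (SELECT y FROM lookup WHERE y IS NOT NULL);
 ?column?
----------
 t

Rewriting the query with not exists is another common way to avoid this trap, because exists tests for row existence rather than comparing values.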

View or Post Comments

Null The Three-Valued Logic of Nulls (Part 4/11)

Friday, January 4, 2013

Nulls exist in a three-valued logic system, which frankly sounds scary. A two-valued logic system, where things are either true or false, is very easy to understand. Three-valued logic sounds like something is wrong, and in a sense something is wrong because it injects uncertainty (null) into a logic system which normally only handles certainty. This is like the flat-nose pliers I mentioned earlier: ideally we could do everything with two-valued logic, but there are cases where three-valued logic, like the pliers, is necessary, and we have to be able to deal with such cases:

SELECT NULL = 1;
 ?column?
----------
 (null)
 
SELECT NULL = '';
 ?column?
----------
 (null)
 
SELECT NULL = NULL;
 ?column?
----------
 (null)
 
SELECT NULL < NULL + 1;
 ?column?
----------
 (null)

The first three examples make sense if you assume that null means unknown — we don't know if the unknown value equals '1' or '', and we don't know if the two null (unknown) values are equal. The last one is harder to explain — clearly any numeric is larger if one is added to it, but we don't know if that null value might mean not applicable rather than unknown, and we don't even know if it is a numeric which supports addition, so again, returning null makes sense. Also, null is used for unknown/unassigned values all over the system, so there is no guarantee that the two nulls came from the same place and represent the same value. Assuming too much about the meaning of a null value is usually where problems start.
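
The boolean operators and, or, and not follow the same rules, and it is worth seeing them in isolation before looking at where clauses:

SELECT true OR NULL;
 ?column?
----------
 t
 
SELECT true AND NULL;
 ?column?
----------
 (null)
 
SELECT false OR NULL;
 ?column?
----------
 (null)
 
SELECT NOT NULL::boolean;
 ?column?
----------
 (null)

Only true or null gives a definite answer, because the result is true no matter what the null might represent.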

If the above comparisons return null, how do they behave in query comparisons?

SELECT 1
WHERE true;
 ?column?
----------
        1
 
SELECT 1
WHERE false;
 ?column?
----------
 
SELECT 1
WHERE NULL;
 ?column?
----------
 

The where clause returns the single row when passed true, but false and null return nothing. It has been suggested that the where clause should be called the sure clause because it only returns rows it knows for sure are true. In boolean operations, null behaves as an unknown value, so true or null returns true, while true and null returns null, which behaves like false when it reaches the where clause. Keep in mind that "not true" is not the same as false; for example, not null evaluates to null, not true. Again, these are not surprising results, but they do suggest more complex queries that could cause surprises:

SELECT * FROM inctest;
   x
--------
     30
     40
 (null)
 
SELECT * FROM inctest WHERE x >= 0;
 x
----
 30
 40
 
SELECT * FROM inctest WHERE x < 0;
 x
----
 
SELECT * FROM inctest WHERE x < 0 OR x >= 0;
 x
----
 30
 40

Two-valued logic suggests that x is either positive, zero, or negative, but three-valued logic allows null values which do not fit into any numeric range; strings and dates have similar behavior. In fact, all data types can be assigned null values, and all operator comparisons with null values return null. These examples are even more bizarre:

SELECT * FROM inctest WHERE x <> 10;
 x
----
 30
 40
 
SELECT * FROM inctest WHERE x <> 10 OR x = 10;
 x
----
 30
 40

The first example kind of makes sense because we don't know if the null represents 10, but the second example would (two-valued) logically have x equaling 10 or not equaling 10, and you would think one of them is true. Null just doesn't work that way, and again shows that making assumptions about null leads to trouble. Don't assume you can split a relation into two parts without considering nulls.
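
To see the splitting problem directly, negating the comparison still does not pick up the null row, so the two result sets together cover only two of the three rows:

SELECT * FROM inctest WHERE NOT (x <> 10);
 x
----
 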

If you stay with non-null values, you never see this, but once a null is involved, the three-valued behavior becomes very clear. This reinforces the advice that it is wise to limit the use of nulls in your schema, e.g., by using not null constraints, and to be very careful when nulls are involved.

View or Post Comments

Null The Non-Value of Nulls (Part 3/11)

Wednesday, January 2, 2013

Having covered why nulls exist and how they can be created explicitly and implicitly, let's look at how nulls behave in some specific examples. I previously showed that you can specify null explicitly, but what happens when nulls interact with non-nulls:

SELECT NULL + 1;
 ?column?
----------
   (null)
 
SELECT NULL || 'a';
 ?column?
----------
 (null)
 
SELECT 'b' || NULL;
 ?column?
----------
 (null)

Above, you can see that null can interact with any data type, and it doesn't have a data type of its own. If you substitute the idea of unknown for null, the answers even make sense, e.g., unknown + 1 is unknown. While the above queries seem reasonable, you can start to see how "surprises" happen when the null is stored in a column:

CREATE TABLE inctest (x INTEGER);
 
INSERT INTO inctest VALUES (30), (40), (NULL);
 
SELECT x + 1 FROM inctest;
 ?column?
----------
       31
       41
   (null)

You can see the increment did nothing to the value in the last row. Add a where clause, and things really get complicated — I will cover that next.
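
As a side note to null having no data type of its own, pg_typeof reports an untyped null as unknown until the surrounding expression supplies a type:

SELECT pg_typeof(NULL);
 pg_typeof
-----------
 unknown
 
SELECT pg_typeof(NULL + 1);
 pg_typeof
-----------
 integer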

View or Post Comments