Thursday, October 28, 2010

Here Comes PostgreSQL West

In just a few days, I'll be off to PostgreSQL West.  I've attended both PostgreSQL East and PGCon in each of the last two years, but this will be my first trip out to PG West.  As with past conferences, this will be a good opportunity for me to catch up with people I normally speak with only via the Internet.  But there's something a little different about this one.  Take a look at the agenda.

Monday, October 25, 2010

WAL Reliability

I recently learned, somewhat to my chagrin, that operating systems are pathological liars, and in particular that they habitually lie about whether data has actually been written to disk.  If you use any database product, you should care about this, because it can result in unfixable, and in some cases undetected, corruption of your database.  First, a question.  On which of the following operating systems do fsync() and related calls behave properly out of the box?

A. Linux
B. Windows
C. MacOS
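
For readers who have not dealt with fsync() directly, here is a minimal Python sketch of what "asking the operating system to write data to disk" looks like from an application. The function name durable_write and the file name in the usage line are made up for illustration; this is not how PostgreSQL itself writes WAL. Whether the data really reaches stable storage after these calls return depends on the operating system, filesystem, and drive configuration, which is exactly the problem described above.

```python
import os
import sys

try:
    import fcntl  # POSIX only; not available on Windows
except ImportError:
    fcntl = None


def durable_write(path, data):
    """Write data to path and ask the OS to push it to stable storage.

    Hypothetical helper for illustration. Returns the resulting file size.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()  # move Python's userspace buffer into the OS cache
        fd = f.fileno()
        full_sync = getattr(fcntl, "F_FULLFSYNC", None) if fcntl else None
        if sys.platform == "darwin" and full_sync is not None:
            # On Mac OS X, plain fsync() may leave data in the drive's
            # write cache; F_FULLFSYNC asks the drive to flush that too.
            try:
                fcntl.fcntl(fd, full_sync)
            except OSError:
                os.fsync(fd)  # filesystem does not support F_FULLFSYNC
        else:
            os.fsync(fd)  # ask the kernel to flush its cache for this file
    return os.path.getsize(path)
```

Calling durable_write("wal_segment.bin", b"commit record") returns only after the sync call does, but that return is a promise from the operating system, not proof that the bytes are on the platter.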


Thursday, October 14, 2010

Choosing a Datastore

In thinking about which database might be best for any particular job, it's easy to get lost in the PR. Advocates of traditional relational database systems like Oracle and PostgreSQL tend to focus on the fact that those systems are feature-rich and provide guarantees such as atomicity, consistency, isolation, and durability (ACID), while advocates of document databases (like MongoDB) and key-value stores (memcached, Dynamo, Riak, and many others) tend to focus on performance, horizontal scalability, and ease of configuration.  This is obviously an apples-and-oranges comparison, and a good deal of misunderstanding and finger-pointing can result.  Of course, the real situation is a bit more complicated: everyone really wants to have all of these features, and any trade-off between them is bound to be difficult.

Wednesday, October 06, 2010

Down To Six

From early July until the beginning of this week, the PostgreSQL project was maintaining eight active branches: 7.4, 8.0, 8.1, 8.2, 8.3, 8.4, 9.0, and the master branch (9.1devel).  As a result, a significant number of bug fixes and security updates had to be back-patched into all of those releases.  For me, the recent switch to git has made back-patching a whole lot simpler, at least in simple cases.  But it's still a fair amount of work: some parts of the code have changed a good deal since 2003, when 7.4 was released.

Monday, October 04, 2010

SURGE Recap

Bruce Momjian and I spent Thursday and Friday of last week in Baltimore, attending Surge.  It was a great conference.  I think the best speakers were Bryan Cantrill of Joyent (@bcantrill), John Allspaw of Etsy (@allspaw), and Artur Bergman of Wikia (@crucially), but there were many other good talks as well.  The theme of the conference was scalability, and a number of speakers discussed how they'd tackled scalability challenges.  Most seem to have started out with an infrastructure based on MySQL or PostgreSQL and added other technologies around the core database to improve scalability, especially Lucene and memcached.  But there were some interesting exceptions, such as a talk by Mike Malone wherein he described building a system to manage spatial data (along the lines of PostGIS) on top of Apache Cassandra.

Some general themes I took away from the conference: