Tuesday, September 20, 2011

Postgres Open, and What Makes a Good Talk

My last few blog posts have been all about scalability and performance, mostly because that's what I've been spending most of my time on these last few months.  But I'm pleased to have another subject: Postgres Open.  I've been going to PostgreSQL conferences for a few years now, but Postgres Open - which is a new conference this year - was my first experience being on the program committee.


In general, I was very pleased with the way the conference turned out.  We didn't have quite as many attendees as I had hoped, probably partly because it was a new conference, and partly because we had some technical issues getting the schedule finalized and posted, but I'm hopeful that with a whole year to plan next year's conference (rather than less than six months) we can do a better job getting the word out.  So far, the feedback from both attendees and sponsors has been very positive.  To the extent that we've gotten any negative feedback, it's nearly all been of the form "I think the conference would be even better if..." rather than "I was really unhappy about...".


Since I was involved in the talk selection process for the conference (a first, for me), I found myself thinking a lot about what makes a good conference talk.  Obviously, there's a lot of room for differences of opinion on that point, but for what it's worth, here's my take:

1. More technical details are better than fewer technical details.
2. The process of solving the problem is often more interesting than the solution itself.

At this conference, I really enjoyed Peter van Hardenberg's talk about Heroku, which was short and incisive.  His best line: "When people say that streaming replication is easy to set up, I don't think they mean this" [at which point he clicked one button and added a replication slave to his running PostgreSQL instance].  I don't think I've ever really understood why people are so excited about the cloud, or what it means to make a piece of software ready for the cloud, but I think maybe I get it now.

I have been meaning to see Clark Evans' talk about HTSQL for a few conferences now, and I'm really glad I finally got around to it this time.  I've written a few query generators in my time, but nothing as sophisticated as what Clark has done.  Although seeing the query generator in action was kind of neat, what made the talk really interesting was his discussion of what's wrong with SQL.  This quote summed it up for me: "Teaching accidental programmers to think in terms of sets is hard."  Listening to the talk, I realized that the way I actually design queries in my head is very close to what HTSQL lets you do; having thought about them, I then translate them into SQL.  I suspect we're stuck with SQL for the foreseeable future, but Clark's observations about why it fell short of the mark were very well-articulated and match my own experience.

Selena Deckelmann - the conference chair - insisted that we should take Adam Lowry's talk about Urban Airship, and that it would be really good.  I was skeptical, but she was right.  Adam detailed the process his company had gone through in choosing a database to power their service, and he pretty much tried them all, starting and ending with PostgreSQL.  It was interesting to hear what went wrong - in the case of PostgreSQL, one of his big gripes was that replication was too hard to set up and he didn't feel confident that he had done it correctly (which, for a complex replication setup, is probably a fair complaint); most of the other solutions proved to be unreliable in various ways, but the details were interesting to hear.

What I think all of these talks have in common - along with some others that I've heard a little less recently, like Gabriele Roth's talk A Survey of PostgreSQL Monitoring Tools at last year's PostgreSQL West - is that they tell you things you can't learn from the manual, like "how does this really work under the hood?" or "what happens if you actually try this?".

Next week, I'm off to Surge, now in its second year.

3 comments:

  1. I think supporting an API whereby a sufficiently motivated developer can provide a query tree directly to the planner would be the raw material required for query language designers and language binding developers to make real steps forward.

    In fact, this same mechanism could finally put the damned "query hints" argument to rest: just allow a user to submit a query tree directly for execution and make the planner an optional service which performs query tree manipulation.

    These would not be trivial changes, I am aware, and would probably take several versions to fully realize, but I think the benefits would be immense and long-standing.

    Unrelatedly, thank you for your kind words about what we're doing. We believe that the key to moving the industry forward is to provide powerful, flexible abstractions that are reliable and predictable.

    Before we finish our work, deploying a new database of any size should be as natural as making a function call or making use of a graphical windowing system. Both of these were once tasks that required considerable investments on every project. Now we take them for granted. Let reliable data storage be so too.

    Peter

  2. You might be interested in some rambling thoughts of mine on the topic here:

    The Executor as a Virtual Machine
    http://tapoueh.org/pgsql/char10.html#sec11

    Regards,

  3. Robert,

    Thank you for your kind words. It was great to have your thoughtful engagement.

    Clark
