Tomas Vondra



Tag: postgresql
Category: Interviews
Interviewed by: Andreas Scherbaum

PostgreSQL is the world’s most advanced open source relational database. The interview series “PostgreSQL Person of the Week” presents the people who make the project what it is today.

Please tell us about yourself, your hobbies and where you are from.

My name is Tomas Vondra, I live in Prague, and I’m a PostgreSQL user, developer, contributor and committer. I work for 2ndQuadrant, one of the companies contributing to PostgreSQL and providing services related to it, and I’m also involved in the local PostgreSQL community in various ways. Aside from that I do have various sports-related hobbies - cycling for example.

Any Social Media channels of yours we should be aware of?

I pretty much use just @fuzzycz on Twitter, and I do occasionally post something on our company blog. I used to have my personal blog at pgaddict.com but I haven’t posted anything new for a long time. And LinkedIn of course.

When did you start using PostgreSQL, and why?

I think it was around 2003, or maybe 2004. I was still at university, but I was also working for a small company running a bunch of e-commerce sites - mostly on MySQL at that time. At some point we were hit by serious performance issues, which were blamed on the database, so we started looking for an alternative. There were proposals from management to try Oracle, but we the developers were not too keen on that and kept looking for other options - we eventually noticed PostgreSQL, gave it a try and stuck with it. A fun fact is that the performance issues were not really a database issue, but the application doing very silly things, generating thousands of queries for each page view. But we liked PostgreSQL so much we decided to keep using it.

Do you remember which version of PostgreSQL you started with?

I think when we started looking at PostgreSQL the latest release was still 7.4. But I think we really started using it widely with 8.0. It seems ages ago, and it didn’t even have autovacuum back then.

Have you studied at a university? If yes, was it related to computers? Did your study help you with your current job?

Yes, I studied a programme called “software engineering” at a Czech Technical University. In practice it was mostly applied math and physics, with a little bit of software-related stuff. But I did enjoy it and in retrospect I actually appreciate that. I think it gave me a lot.

What other databases are you using? Which one is your favorite?

I don’t think I’m using any other database, at least not directly. I’m sure various products I’m using have databases behind them, but I don’t really have any direct experience with them.

Which PostgreSQL projects are you currently working on?

I do contribute to various projects associated with 2ndQuadrant (like pglogical or BDR), but I’d say most of my development time is spent on the main PostgreSQL code - there are plenty of interesting improvements in a commitfest at any given time. Some of them are mine, some of them I just review or test.

How do you contribute to PostgreSQL?

I think the most obvious contributions are code-related. Either code written directly by me, or patches written by other people that I help to review and sometimes commit.

Any contributions to PostgreSQL which do not involve writing code?

I don’t know if it matches your definition of a contribution, but as I mentioned I’m involved in organizing the local PostgreSQL community - I’m the president of the Czech and Slovak PostgreSQL User Group (CSPUG), and I’m co-organizing the local conference. For me a healthy and active user community is what makes a great open-source project.

What is your favorite PostgreSQL extension?

Well, that’s a really difficult question. There are so many great and interesting extensions that it’s really hard to pick a favourite one - I do think this illustrates the benefits of extensibility built into PostgreSQL, and I’ve written a number of extensions myself. But if I had to name just one extension, I’d probably say pg_partman - I do hope we eventually get some of the features into PostgreSQL, but until then it’s a tremendous help with management of common partitioning schemes.

What is the most annoying PostgreSQL thing you can think of? And any chance to fix it?

I’m afraid I’m the victim of Stockholm syndrome - I’ve been working with PostgreSQL for so long that I got used to the annoying bits. I don’t like them, but I understand where the limitations come from. A lot of it is a consequence of extensibility, and the desire to do stuff outside core to allow different approaches / use cases. A nice example of that is the difficulty of setting up and managing a HA cluster, particularly with automatic failover, or horizontal scaling. You still need a lot of expertise and external tools to do that, but I think we’re improving things.

What is the feature you like most in the latest PostgreSQL version?

I think the REINDEX CONCURRENTLY command is pretty neat. It’s pretty deceptive - it seems simple (building a new index and dropping the old one, how hard could it be, right?) but it turns out to be pretty difficult to do safely. We’ve been trying to get this feature for a long time.

Adding to that, what feature/mechanism would you like to see in PostgreSQL? And why?

There’s a lot of those, unfortunately - columnar storage/execution, making HA easier to operate, horizontal scalability, incremental backups, and so on. Luckily, people are already working on a lot of those things.

Could you describe your PostgreSQL development toolbox?

I’d say my setup is a pretty traditional set of tools for C development on Linux. I use geany as my main development environment - I’m sure I’m not using it to its full potential, but it’s very lightweight and it gets the job done. Aside from that I use gcc/clang, gdb, perf, valgrind and traditional unix tools like grep etc. Nothing terribly uncommon, really.

Which skills are a must have for a PostgreSQL developer/user?

Not sure. Curiosity and civility, maybe? As a developer, I think it’s important to not get too attached to individual patches - the fact that I spent a lot of time working on a patch does not necessarily mean it’s a desirable feature or the right approach. And similarly, I think it’s important to be respectful when giving feedback to others - considering how diverse the PostgreSQL community is, and that we primarily communicate on a mailing list in a language which is not the native language for many, this is not entirely trivial.

Do you use any git best practices that make working with PostgreSQL easier?

My git workflow is pretty boring - develop a feature as a sequence of “meaningful” commits in a branch in a private repo clone, rebase the branch over time, send the “git format-patch” output to the mailing list, eventually get it committed to master.

Which PostgreSQL conferences do you visit? Do you submit talks?

I try to get to various conferences, particularly those in Europe - pgconf.eu, Prague PostgreSQL Developer Day, Nordic PGDay, PGDay Paris, pgconf.de, pgconf.be, pgday Ukraine, and so on. And obviously PGCon. I’ve been to various PostgreSQL conferences around the world and I’d like to visit most of them again, depending on feasibility. Yes, I do submit talks, but it’s always nice to be just an attendee and watch interesting stuff presented by other speakers.

Do you think Postgres has a high entry barrier?

Possibly, but it’s hard for me to say - I joined the community a long time ago, so I can’t really judge how much higher the bar is now. I’m sure it’s not trivial to join the development, though - both because the code base is not small (it’s ~2M lines of code, ~4x larger than in 2000), and because of unfamiliarity with the community development process.

What is your advice for people who want to start developing PostgreSQL - as in, contributing to the project? Where and how should they start?

I’d say there are about three recommendations I’d give them:

  1. If you want to develop a feature, pick a topic / patch / feature that is in some sense valuable to you. Maybe it’s something you personally find interesting, or maybe it’s a feature your application would benefit from.
  2. To get familiar with the relevant code, take a look at patches in the current commitfest, and see if there’s something related to the feature you’d like to work on, and do a review. See if the feature makes sense to you, learn to apply a patch, build it, run tests, and see which parts of the code it touches. Reviews are a very important and valuable part of our development process, and it’s a good way to learn how the code works.
  3. Don’t be ashamed of copy-pasting. A lot of the development is copying code that almost does what you want, and whacking it until it actually does that.

Do you think PostgreSQL will be here for many years in the future?

I certainly hope so, and based on the steadily growing demand I think it will.

Would you recommend Postgres for business, or for side projects?

I think it works for both, for various reasons. I think one of the main benefits for business is that there is no “owner” of PostgreSQL. There are multiple companies with about equal access to the project, cooperating on the development but also competing in various ways - providing support and other services, different hosted offerings, etc. And of course, there’s the benefit of open source licensing.

Are you reading the -hackers mailing list? Any other list?

Well, I’m certainly reading some of it - I’m trying to keep up with threads related to stuff I’m working on or somehow interesting to me. I don’t think there’s anyone able to read all of it and then also do anything else.

What other places do you hang out?

Pretty much just the mailing lists related to development and performance. I find the chat applications (IRC, Slack, …) very distracting, because of the implicit expectation of ad-hoc immediate conversation. E-mail does not have this issue: when someone sends a message, there’s no expectation of an immediate response.

Which other Open Source projects are you involved in or interested in?

None, really. At least not to an extent comparable to PostgreSQL. I do use many open source products, report bugs and so on, but when it comes to development I’m fully focused on PostgreSQL.