Markus Winand



Tags:   postgresql    modern sql    standard conformance    testing   
Category:   Interviews   
Interviewed by: Andreas Scherbaum

PostgreSQL is the World’s most advanced Open Source Relational Database. The interview series “PostgreSQL Person of the Week” presents the people who make the project what it is today. Read all interviews here.

Please tell us about yourself, your hobbies and where you are from.

I’m living in Vienna, Austria together with my wife and my 8-year-old son. Vienna is also the base of my one-man business for SQL consulting, training and publishing.

Markus Winand

Markus Winand

Any Social Media channels of yours we should be aware of?

When did you start using PostgreSQL, and why?

That was probably around 2010, when I was working on my book “SQL Performance Explained”.

The book was originally intended to cover one specific commercial database (guess which one) because this was the database that my employer at the time, a bank, used. I proposed that book to a well-known publisher. After a pretty long game of ping-pong and very positive feedback from the publisher, it was eventually rejected.

After that I was even more convinced that there is a need for this book. I continued writing and started publishing it for free on the web – under its working title “Use The Index, Luke!”. The reaction on the web was very positive. Without a publisher, I wasn’t bound to the original concept of focusing on one database, so I widened the scope to other databases.

I must admit that PostgreSQL was not on my radar at that time. A friend kept on mentioning PostgreSQL over and over again, so I had a look at it. I quickly realized that it had fewer limitations than other free open source databases, and so I added it.

Of the additional databases I covered, PostgreSQL had the biggest impact on the structure of “Use The Index, Luke!”. It turned out that the way PostgreSQL supports some features makes explaining things easier…even for users of other databases that don’t support those features.

At that time, PostgreSQL was the only major database that had proper row values support, including optimal index usage. That made keyset pagination easier to explain. PostgreSQL even had an impact on the table of contents at a very late stage of writing. I already had a section on NULL in the Oracle Database because Oracle has a pretty “unique” approach to it, which often causes headaches for people who don’t know about it. It was a pretty long section, and I wasn’t really proud of. Then I realized that it is easier to explain that oddity once the reader knows about the WHERE clause for CREATE INDEX (“partial indexes”). The Oracle Database doesn’t even allow that, but the way Oracle treats NULL values in indexes is super easy to explain once you understood partial indexes. So I added a section about partial indexes and rewrote the section on NULL in the Oracle Database. This section became much shorter and clearer.

Do you remember which version of PostgreSQL you started with?

The Web tells me that it must have been 9.0. What I actually remember is that I was desperately waiting for 9.2 because it introduced the Index Only Scan. The lack of Index Only Scan was the most critical limitation from my perspective: On the one hand, it is a very important tuning technique, and on the other hand all the other databases on “Use The Index, Luke!” were able to do index only scans for ages. From an indexing perspective, it was the most critical gap. Unfortunately, the PostgreSQL Index Only Scan is still not as reliable as it is for other databases. This is a long story that I most recently covered in my article about the INCLUDE clause for (b-tree) indexes.

My formal education was a long but rather unsuccessful exercise.

I started programming at the age of eight and took the first opportunity to attend a “computer high school” (HTL) at the age of 14. I got good grades in the technical subjects, and bad grades in English and geography. I managed to fail the first year. In Austria, that means you must repeat the entire year, so all of the subjects. That time I got even better grades in the technical subjects, but still bad grades in the other subjects. I failed again.

As you are not allowed to repeat another time, I had two options: find a school for another main subject (not computers) or start working. Luckily, I ended up in a school for electrical engineering. I managed to pass every year until the fifth and final year, which I failed – but this time in the technical main subjects. The reasons were complex and are best summarized as “lack of interest” and “being too old to be treated like a school kid”.

In total I’ve spent seven years at schools that are supposed to take only five years. I didn’t graduate. I don’t regret the time I spent there – it turns out you can keep the knowledge even if you don’t graduate ;)

What other databases are you using? Which one is your favorite?

I’m currently supporting these other databases (listed alphabetically): BigQuery, Db2, MariaDB, MySQL, Oracle DB, SQL Server, SQLite.

I’m getting the “favorite” question quite often, so I have a diplomatic answer prepared for that:

My business helps people to solve their database-related problems, no matter what database they are using. Telling clients that a problem would be easy to solve or wouldn’t even occur on another database isn’t helpful.

Every database product has its strengths. If you start a new project from scratch and you know the exact requirements, I can help you find the best fit. If you don’t know the exact requirements yet, or you don’t want to optimize for requirements that might change someday, I would personally start with PostgreSQL as a safe default.

How do you contribute to PostgreSQL?

Currently my main contribution is testing new features for their compliance with the ISO SQL standard. This is a surprisingly depressing exercise as it can involve a huge amount of effort with very little yield (i.e. none or only a few bugs identified). For example, it took me about two weeks to test the JSON/Path features of PostgreSQL 12. I did find one (minor) problem in that time. On the other hand, I’ve found several issues in other products, so my time wasn’t wasted ;)

What is your favorite PostgreSQL extension?

What is the most annoying PostgreSQL thing you can think of? And any chance to fix it? What is the feature you like most in the latest PostgreSQL version?

I can answer these questions together: The pre-12 way of running WITH queries (CTEs) was really annoying as it was surprising – some might say dangerous. I’m happy that’s fixed now.

Adding to that, what feature/mechanism would you like to see in PostgreSQL? And why?

From my modern SQL presentations, I can say that the most requested feature from developers is system versioning – i.e. time travel queries without changing the application (just DDLs). This feature also has strategic relevance as most commercial databases have it already, including Google BigQuery. In the open source space, MariaDB was the first major player to support it. That means PostgreSQL is already behind many competitors.

Still, that’s just what people are asking for. My personal pick is a different one: row pattern recognition, a.k.a. MATCH_RECOGNIZE. I think MATCH_RECOGNIZE is pretty much a game changer for SQL, just like window functions. It is basically about regular expressions, but not matching characters out of strings but rows out of tables. I have a presentation about it.

Which PostgreSQL conferences do you visit? Do you submit talks?

I have attended and spoken at several PostgreSQL events. It started with PgConf.EU in Prague in 2012. In early 2013, I gave my first ever presentation at the FOSDEM PgDay in Brussels (“Pagination done the PostgreSQL way”). The same year I presented “Indexes: The neglected performance all-rounder“ at PgConf.EU in Dublin. In 2014 I delivered an indexing tutorial at PgConf.org in Ottawa. At FOSDEM 2015, I presented “Modern SQL in PostgreSQL”. Later that year I decided that this is going to be my next big thing (https://modern-sql.com/).

Do you think PostgreSQL will be here for many years in the future?

Definitively.

Nowadays, the willingness to change the database is much higher as it was maybe 10 year ago. Many clients are asking me about pros and cons of different products. Some clients have recently completed a migration. Out of those, PostgreSQL is the by far most popular migration target.

Would you recommend Postgres for business, or for side projects?

PostgreSQL can be used in pretty much every context.

Are you reading the -hackers mailinglist? Any other list?

I tried. Twice. I had to unsubscribe as an act of self-defense.

The problem was that there are too many interesting discussions going on. I couldn’t resist reading them – I was spending too much time on it.

To get an idea what is going on before it is released or mentioned on Depesz’s “Waiting for …” series, I keep an eye on the Commitfest website.