Alexander Kukushkin



Tags:   postgresql    berlin    zalando    patroni    spilo    pgconf    pgday    ha    clustering   
Category:   Interviews   
Interview conducted by: Andreas Scherbaum

PostgreSQL is the World’s most advanced Open Source Relational Database. The interview series “PostgreSQL Person of the Week” presents the people who make the project what it is today. Read all interviews here.

Please tell us about yourself, and where you are from.

My name is Alexander. Originally I am from Russia. Since 2013 I live in Berlin and work as a Database Engineer at Zalando.

Alexander Kukushkin

Alexander Kukushkin

How do you spend your free time? What are your hobbies?

Like many other people I enjoy traveling to new places, sightseeing and trying some new food. Unfortunately, the Corona-virus made traveling impossible, therefore in the last few months I tried to explore walking-distance neighbourhoods. First of all it is a nice workout, but also sometimes you get to see very interesting animals in the wild (in a big city!). So far in my list not only something ordinary, like squirrels and rabbits, but also hares, foxes, and once even the occasional family of racoons.

Any Social Media channels of yours we should be aware of?

Any favorite movie, or show?

The Big Bang Theory

What’s still on your bucket list?

There are still quite a few continents and countries to visit.

When did you start using PostgreSQL, and why?

My first encounter was in 2001. Back then I was working as a Web Developer and we built one of the projects with PostgreSQL. I can’t say that I immediately fell in love with it, because in a few months the project was done and I didn’t return to Postgers for the next eight long years. The year 2009 was very different. It gave me an opportunity to look at PostgreSQL from another side, i.e. like a DBA. We had to set up a replica using Londiste, I did some configuration tuning, used PostGIS. Last but not least, one day we figured out that trigger functions implemented in PL/pgSQL were causing some performance issues, and I volunteered to rewrite them in C. All of that made me realise how flexible and extensible PostgreSQL is.

Do you remember which version of PostgreSQL you started with?

A very good question. I don’t remember precisely, but in 2001 it likely was 7.0 or 7.1. In 2009 I can tell for sure that it was PostgreSQL 8.4. It still didn’t have streaming support and we were using Londiste for replication. A couple of weeks after 9.0 was released we did the major upgrade with dump/restore and became happy users of streaming replication.

After six years at Moscow Institute of Physics and Technology (State University) I’ve earned a master’s degree in Applied Mathematics and Physics. My main field of study was Physics of Plasma, and it has very little to do with computers and databases. The greatest outcome of the university was that it taught me how to approach any problem from different sides, evaluate possible solutions, analyze all trade-offs, and choose a path that would work the best for the current circumstances.

What other databases are you using? Which one is your favorite?

Now I exclusively use PostgreSQL, but in the past it was also MySQL and a little bit of Oracle. Guess which one is my favorite? :)

I am a maintainer of Patroni - the PostgreSQL HA solution, and the Spilo - Docker image packaging Patroni and PostgreSQL. Those are probably my most known contributions, but there are actually a few more. Like, for example, bg_mon - a PostgreSQL background worker that collects a lot of metrics with one second resolution, stores them in memory for the next few minutes, and provides the http API exposing them in JSON format. It allows you to have a tool similar to pg_activity, pg_view, pgcenter, or similar in the browser.

How do you contribute to PostgreSQL?

I didn’t contribute to the core of PostgreSQL as much as I would like to. The recent and probably the biggest contribution was making walsenders not count as normal connections, so the replication is not affected when the amount of client backend reaches max_connections. Also, like any other user should do, I send bug reports and if I have enough knowledge in this area, I contribute a fix.

Any contributions to PostgreSQL which do not involve writing code?

Speaking at conferences doesn’t involve writing code, at least not directly.

What is your favorite PostgreSQL extension?

Extensions are multiplying the power of PostgreSQL and it is just not possible to name only one:

  • The pg_stat_statements extension is something that everyone must enable for performance monitoring and troubleshooting.
  • Despite declarative table partitioning being in core for a few years now, the pg_partman extension is still very useful, because it helps to automate the partition life-cycle.
  • Sooner or later developers want to set up regular tasks in the database, and here comes pg_cron to the rescue.
  • The pg_repack extension helps to rebuild bloated tables and indexes.

Last, but not least - PgQ. Don’t try reinventing the wheel. Either you will create the second PgQ or your solution will fail at a critical moment.

What is the most annoying PostgreSQL thing you can think of? And any chance to fix it?

That would certainly be the VACUUM. PostgreSQL 12 introduced table access methods and opened a possibility for implementing alternative storage solutions. It is pity that the zheap topic seems to have stopped receiving attention in the past year. I am happy that Cybertec recently committed to get it finished.

What is the feature you like most in the latest PostgreSQL version?

Most people would certainly mention the index deduplication, and I totally agree with them. But, as a maintainer of a HA solution for PostgreSQL I really like that starting from 13 the primary_conninfo could be changed without restarting a standby.

Adding to that, what feature/mechanism would you like to see in PostgreSQL? And why?

  • Failover of logical slots. Replication slots and logical decoding were introduced in 9.4, but since then there is still no built-in mechanism to replicate logical slot state to the standby in order to continue logical replication when the standby is promoted to the primary. However, the Patroni release 2.1.0 did resolve the problem for PostgreSQL 11 and newer.

  • In-core connection pooler. Certainly there are external solutions available, like pgBouncer, or Odyssey, but they don’t play nicely with prepared statements and could introduce some overhead for restoring client settings like statement_timeout, application_name, and similar.

  • Protocol (libpq) compression. When for example we use pg_basebackup to create a standby, all data is transferred uncompressed, producing unnecessary load on the network infrastructure. In fact, such data is easily compressible and if modern algorithms (like for example lz4) are used, the CPU overhead is insignificant.

Could you describe your PostgreSQL development toolbox?

I have been using Linux and VIM since 2000, so nothing really fancy. Although, if you are not familiar with it yet, I would really recommend you to try rr. With this tool you record a failure once, then debug the recording, deterministically, as many times as you want. The same execution is replayed every time.

Which skills are a must have for a PostgreSQL developer/user?

If we talk about developers, there are two different things that come to my mind. Being a PostgreSQL hacker requires at least some C skills, a lot of patience and being ready to participate in very-long threads on pgsql-hackers mailing list. PostgreSQL users should learn SQL (instead of relying on an ORM), get some understanding on how MVCC works, and what tasks autovacuum solves.

Do you use any git best practices, which makes working with PostgreSQL easier?

I am not an advanced git user. If something goes beyond the basic commands, often I have to google for the solution.

Which PostgreSQL conferences do you visit? Do you submit talks?

In the past I attended a few different PostgreSQL conferences as a speaker. Among them are PGConfs and PGDays of different kinds: PGConf.EU, PGConf.DE, PGConf.Russia, PGConf APAC, PGConf.Asia, FOSDEM PGDay. The most unique (and my favorite) is PGCon in Ottawa.

Do you think Postgres has a high entry barrier?

Getting PostgreSQL up and running is easy enough. Going to production is much harder, you need to think about backups, performance tuning, high-availability and automatic failover, security, monitoring and so on. Very often it includes installing and configuring third party tools, and only a few of them are referenced from the Postgres documentation. Fortunately there is a lot of free learning material available on the internet, in the form of blog posts (Planet PostgreSQL), and slides/videos from different conferences.

What is your advice for people who want to start PostgreSQL developing - as in, contributing to the project. Where and how should they start?

Get familiar with the Developer FAQ, clone the PostgreSQL source, try to compile and install it. Subscribe and start reading mailing lists, -hackers and -bugs . Then you need to find an idea of what to hack. Don’t even try to use the Todo list for that. It is more like a list of ideas which someone already tried to implement and failed. Instead, you better start reviewing and testing existing patches. Maybe someone is already working on the feature which you want to have in the core? Reviewing patches helps you to get familiar with Postgres internals, and more important, you immediately start helping other developers and therefore the whole community.

Do you think PostgreSQL will be here for many years in the future?

The project or product exists while people are developing/using it. Being an open-source, not owned by a single company and having a very diverse community of developers and users that are steadily growing, PostgreSQL will surely stay with us for many years.

Would you recommend Postgres for business, or for side projects?

The practice showed that PostgreSQL is flexible and scalable, it is a great choice for projects of any kind. I will never stop saying: if you are in doubt about which database to choose - start with PostgreSQL. Later, if you figure out that PostgreSQL wasn’t a good choice (what is very unlikely) - you can always migrate with the help of logical replication and/or foreign data wrappers.

Are you reading the -hackers mailinglist? Any other list?

I am subscribed to -hackers and -bugs . I don’t have time to carefully read all messages, but try to go through the list of subjects at least once a day and carefully read threads that caught my eye.

What other places do you hang out?

Slack , a couple of Telegram chats, but I am not very active there.

Which other Open Source projects are you involved or interested in?

Occasionally I contribute to some python modules that are used in Patroni.