Artur Zakirov




Tags: postgresql, adjust, berlin, postgrespro, russia
Category:   interviews   
Written by: Andreas Scherbaum

Please tell us about yourself, and where you are from.

I currently reside in Berlin, Germany, and work at Adjust. I grew up in a village in Bashkortostan, Russia. It is located in a green area far from the hustle of big cities. During my early years I didn’t think about living in big cities. Today it’s hard to imagine myself away from a vibrant city.

I moved to Berlin around three years ago. Before that I lived in Tokyo, Japan, for about a year.

How do you spend your free time? What are your hobbies?

I like cycling and have done a few long bicycle tours. Nowadays I prefer using a bicycle for transportation over public transport.

On weekends I like going hiking to explore areas outside of Berlin. It is a good way to discover new surroundings and to unwind after a busy week. I find www.komoot.com very useful to find new routes and www.meetup.com to find company for a hike.

Every week I go bouldering with friends. We try new routes each time, although making progress gets challenging at times.

Any Social Media channels of yours we should be aware of?

Some of my social media accounts are private. I have the following accounts, which are visible to everybody, although I don’t post often:

Any favorite movie, or show?

I like watching science fiction and fantasy. Among my favorite movies are Blade Runner 2049 and Dune. And some of my favorite TV shows are Better Call Saul, House of the Dragon and the anime series Samurai Champloo.

When did you start using PostgreSQL, and why?

I started my career with MS SQL Server. I worked with it for about 6 years. In 2015 I started working at Postgres Professional as a developer. There I actively contributed to the enhancement of the Postgres Pro fork, introducing new features. Some of the new features and bug fixes were accepted and committed to the PostgreSQL core.

I am also a contributor and maintainer for various open-source extensions, including the RUM access method, pg_variables for session-wide variables, and the pg_probackup backup manager. The experience and knowledge gained during my work at the company were very valuable, and I am very grateful for the opportunity to have worked there. Today I’m a maintainer for pg_repack, which we also use heavily where I work.

Do you remember which version of PostgreSQL you started with?

If I remember correctly, it was PostgreSQL 9.4.

What other databases are you using? Which one is your favorite?

I have an interest in exploring column-oriented databases. ClickHouse has caught my attention due to its performance and features, although I haven’t had the opportunity to use it.

At Adjust we use Apache Parquet as a column-oriented data storage format. We utilize parquet_fdw to interact with Parquet files. There are a few challenging issues that I am addressing and working on solutions for.

At my current job we have many PostgreSQL extensions, most of which are in private repositories, and I am responsible for their maintenance.

Since 2023 I’m one of the maintainers for pg_repack. Given its extensive use within our environment, I have actively worked on addressing some issues that previously made it difficult to use it smoothly.

How do you contribute to PostgreSQL?

I contribute to PostgreSQL by submitting patches for bugs that occasionally show up in our environment. One challenge we face is a rare and difficult-to-reproduce bug in streaming replication replay. I’d be happy to fix it and submit a patch, but I haven’t come up with a fix yet.

Unfortunately, nowadays it isn’t easy to find time to review patches from other contributors.

What is your favorite PostgreSQL extension?

One of my favorite PostgreSQL extensions is postgres_fdw. We use it a lot for various purposes, facilitating seamless access to remote data.

What is the most annoying PostgreSQL thing you can think of? And any chance to fix it?

Performing a PostgreSQL instance upgrade without downtime can be a challenging task. The use of pg_upgrade requires caution, especially when you have streaming replication with replicas. The documented procedure for using pg_upgrade together with rsync may seem intimidating and risky.

Another approach to minimize downtime during PostgreSQL upgrades involves logical replication. However this method comes with its own set of limitations and caveats, and it requires additional space for the target databases.

It would be highly beneficial if pg_upgrade had the capability to upgrade a replica instance as well. Currently, I lack a viable strategy for its implementation. pg_upgrade cannot perform the same upgrade procedures on a replica as it does on a primary, since the upgraded replica would then be unable to replicate data from the primary. However, one potential approach could involve utilizing the replication protocol to copy upgraded files from the primary to the replica.

What is the feature you like most in the latest PostgreSQL version?

Robert Haas recently committed support for incremental backups. I recall his proposal for the design of incremental backups back in 2019. In 2023 he proposed patches that were committed in the same year. Although this feature is not yet released, it’s going to be included in the upcoming major version, PostgreSQL 17. It is a great improvement for backups and I look forward to its release.

Adding to that, what feature/mechanism would you like to see in PostgreSQL? And why?

It would be great to have undo logs in PostgreSQL. Undo logs could effectively mitigate data bloat, a common issue in many environments. A few years ago EnterpriseDB developed the zheap storage engine, which implemented undo logs. However, it appears that the development of zheap has since stalled.

Another storage engine that implements undo logs is OrioleDB. But you can use it only with a patched PostgreSQL, which might not be convenient in some environments.

Another feature that would be good to have is index-organized tables. In our environment many of our largest tables have primary keys, and each primary key index takes space in addition to the table itself. However, index-organized tables may be inefficient with secondary indexes, making table scans through a secondary index less effective.

Should PostgreSQL have a built-in connection pooler?

It is known that PostgreSQL connections are expensive. Nowadays several popular standalone connection poolers, such as PgBouncer and Pgpool-II, offer advantages over a built-in connection pooler. They allow for redirecting client connections to a second PostgreSQL instance during maintenance on the first one. Additionally installing them on the client side may reduce latency.

However, standalone connection poolers (in case of non-session level pooling) come with some drawbacks that could be mitigated by a built-in connection pooler. A built-in solution could offer better transparency to clients and enable the use of session-level features, like GUC and other session state variables, or advisory locks.

Could you describe your PostgreSQL development toolbox?

I use VS Code and Neovim with plugins such as clangd LSP plugin and others to transform them into an IDE. For debugging, I use gdb with the gdb dashboard plugin, along with tools such as strace, valgrind and perf with FlameGraph.

Do you use any git best practices, which makes working with PostgreSQL easier?

Git provides a useful command for submitting multiple patches for a feature: git format-patch -N -vM. It generates one patch file for each of the topmost N commits, with filenames prefixed by v<M> to mark the version of the patch series. Each file includes useful metadata for reviewers, such as the author, the commit message and the patch statistics. This makes it easy to split a feature into multiple patches and to version each revision of the series.
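
To make the command concrete, here is a minimal sketch that runs it with N=3 and M=2 in a throwaway repository. The repository, the author identity, the file name feature.txt, the commit messages and the patches output directory are all made up for illustration; only the git format-patch invocation itself comes from the answer above.

```shell
#!/bin/sh
set -eu

# Throwaway repository, so nothing here touches a real checkout.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Hypothetical identity, set only for this temporary repository.
git config user.name "Example Author"
git config user.email "author@example.com"

# Four small commits; the last three stand in for one feature split into parts.
for i in 1 2 3 4; do
    echo "change $i" >> feature.txt
    git add feature.txt
    git commit -q -m "Part $i of the feature"
done

# N=3, M=2: one patch file per commit for the topmost three commits,
# marked as version 2 of the series and written to ./patches.
git format-patch -3 -v2 -o patches

# List the generated files; each name starts with the v2- prefix.
ls patches
```

When a review round leads to changes, rerunning the command with a higher M (for example -v3) produces a fresh set of files, so reviewers can tell the revisions of the series apart.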

Which PostgreSQL conferences do you visit? Do you submit talks?

Occasionally I speak at Meetups.

What is your advice for people who want to start PostgreSQL developing - as in, contributing to the project. Where and how should they start?

The PostgreSQL Developer FAQ is a great starting point, offering answers to a range of topics, from debugging tools to running tests, along with links to valuable information on PostgreSQL internals.

If you already have a patch or a feature idea you’d like to implement, you can get familiar with the Submitting a Patch wiki page. However for complex or big patch ideas it’s advisable to share a proposal description with a high-level design on the pgsql-hackers mailing list first. This step can save time and provide you with valuable insights.

If you don’t have your own patch, contributing to reviews of other patches is always a good idea. This not only helps others but also familiarizes you with the PostgreSQL community and the source code.

Do you think PostgreSQL will be here for many years in the future?

I believe that PostgreSQL will remain popular for many years. Nowadays there are areas, such as storing time series data or running analytical queries, where other database management systems may be more suitable. However reliable and solid relational DBMSs will always be in demand. In such cases the highly configurable PostgreSQL is a good choice.

Would you recommend PostgreSQL for business, or for side projects?

At my current job, we are handling vast amounts of data, and we use PostgreSQL quite extensively. It is well-suited for both business and side projects. Furthermore it provides an excellent environment for learning about database systems.

Are you reading the -hackers mailing list? Any other list?

I’m subscribed to -hackers, -general and -bugs mailing lists. I try to read threads on interesting topics or new messages.

Additionally I’m a subscriber to the Postgres Weekly mailing list. It helps to stay up-to-date with changes in PostgreSQL and related projects, as well as updates on new releases and events.

Which other Open Source projects are you involved or interested in?

Beyond PostgreSQL related projects I’m interested in the Go programming language, and the Apache Arrow projects. I try to stay informed about the latest changes in these projects and their ecosystems.

Anything else you’d like to add?

I’m happy that I’ve chosen to align my career with PostgreSQL. It has been a rewarding experience, providing me with the opportunity to connect with great people.