Oleksii Vasyliev



Reading time: 10 minutes

Tags:   postgresql (170)   pgtune (1)   book (3)   developer (1)   webdev (1)   performance (6)   kyiv (1)   ukraine (5)  
Category:   Interviews   
Interview conducted by: Andreas Scherbaum

PostgreSQL is the World’s most advanced Open Source Relational Database. The interview series “PostgreSQL Person of the Week” presents the people who make the project what it is today. Read all interviews here.

Please tell us about yourself, and where you are from.

Hello, my name is Oleksii Vasyliev. I am a web software developer at Railsware, currently located in Kyiv, Ukraine.


How do you spend your free time? What are your hobbies?

Here is the list of activities I do outside of work:

  • Open Source activities. They help me to learn and improve things which I cannot cover at work.
  • Podcasting and screencasting. I like to share interesting stuff with the dev community and talk with different engineers.
  • Traveling with my drone. Before the war I had a drone and traveled with it to make cool videos for myself.
  • Spending time with our child - it is like a separate project, which needs to be maintained.
  • Gaming and watching videos. Yep, sometimes I just want to do nothing :)

Any Social Media channels of yours we should be aware of?

Last book you read? Or a book you want to recommend to readers?

  • “The Three-Body Problem” by Liu Cixin is the last book I read (I recommend it if you like the science fiction genre).
  • “High Performance Browser Networking” by Ilya Grigorik is the last technical book I read. This book is very good and I think it should be a must-read for every developer of web applications, and also for developers of mobile applications. The book goes through most layers of the network stack in great detail, explaining TCP, UDP, TLS, how Wi-Fi and mobile networks work, HTTP and various browser APIs. The author’s idea is to really explain the working principles of these technologies and then use that deep understanding to suggest performance optimizations.
  • “Atlas Shrugged” by Ayn Rand. I recommend reading it, if you haven’t already. It is a very good introduction to the objectivist philosophical system. Or you can play the BioShock games (but better after reading the book) 😀

What does your ideal weekend look like?

Traveling to a conference in a different country, chatting with engineers at the afterparty, and visiting different places in that country or city after the conference.

What is the best advice you ever got?

“Get comfortable with failure”. Programming is hard and people make mistakes. You need to accept your own mistakes and learn from them. The motto “The master has failed more times than the beginner has tried” will help you grow in your profession.

When did you start using PostgreSQL, and why? Do you remember which version of PostgreSQL you started with?

My introduction to PostgreSQL started with version 8.1 in 2005. I was a PHP developer, and at that time the LAMP stack (Linux, Apache, MySQL, PHP) was popular for development. But for the next project in our company one engineer proposed to use a different database - PostgreSQL. Nobody knew what it was or how to use it. I volunteered to figure it out and help with this database.

I remember that at that time there was almost no information about this database in the Ukrainian or Russian languages. While I was reading and learning about it, I decided to collect and translate the information I needed into notes. In the end all these notes were converted into a free book (now it is outdated and no longer supported).

I even remember my first bad situation with PostgreSQL. At that time the “autovacuum” daemon was not enabled by default and system administrators needed to run “VACUUM” for the database on their own (with a cron script or something similar). Several months after the product was deployed, it just stopped working and the database stopped accepting any connections. From this incident I learned about the error “database is not accepting commands to avoid wraparound data loss”, single-user mode (the “--single” option) and why “VACUUM” is important for PostgreSQL.

I studied at the National Aviation University, Kyiv, Ukraine. I selected the field of study “Computer systems and networks”, because at that time I was already doing some programming and didn’t want to waste time learning things I already knew (I had already sold several software products by then). University education helped me to see computers not only as black boxes which accept my commands, but also to understand how this stuff works under the hood. In my web developer profession this knowledge is not needed all the time, but sometimes it helps me find the cause of a problem faster.

What other databases are you using? Which one is your favorite?

My default database of choice is PostgreSQL. Over many years of development I have tried a good number of databases, but this is my list of Open Source databases which I often use together with PostgreSQL (not instead of it, but together):

  • Redis
  • Apache Cassandra
  • SQLite
  • HBase
  • Elasticsearch

My closest PostgreSQL-related project is PGTune, which allows you to calculate some basic parameters for your database based on provided hardware information.
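
To give a feeling for what "calculating basic parameters from hardware information" can look like, here is a minimal sketch. It is not PGTune's actual algorithm. It only assumes two widely quoted starting-point heuristics (roughly 25% of RAM for shared_buffers and roughly 75% for effective_cache_size), and the function name `suggest_settings` is hypothetical.

```python
from typing import Dict


def suggest_settings(total_ram_mb: int) -> Dict[str, str]:
    """Suggest two starting values from the total RAM of a dedicated server.

    Assumption: common rules of thumb, not PGTune's real formulas.
    Real tuning also depends on workload, connection count and storage.
    """
    if total_ram_mb <= 0:
        raise ValueError("total_ram_mb must be positive")
    return {
        # About a quarter of RAM for PostgreSQL's own buffer cache.
        "shared_buffers": f"{total_ram_mb // 4}MB",
        # About three quarters of RAM as the planner's cache size estimate.
        "effective_cache_size": f"{total_ram_mb * 3 // 4}MB",
    }


if __name__ == "__main__":
    # Print postgresql.conf style lines for a machine with 16 GB of RAM.
    for name, value in suggest_settings(16 * 1024).items():
        print(f"{name} = {value}")
```

Such values are only a starting point and should be checked against the real workload.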

At work PostgreSQL itself is core for almost all my projects 🙂

How do you contribute to PostgreSQL? Any contributions to PostgreSQL which do not involve writing code?

I made a single contribution to the PostgreSQL code base, and I think the major work there was done by Michael Paquier.

I didn’t make any more code contributions, because my knowledge of the C language is at a basic level.

Some time ago I maintained a book about PostgreSQL in the Russian language, but decided to stop doing this in 2022. If I want to do something similar in the future, I will use English.

Right now I am focused on PostgreSQL performance (tools, tips, etc.), which I can create and share with the community. One way of contributing is giving conference talks about it.

What is your favorite PostgreSQL extension?

My favorite extension is ltree. It implements a data type ltree for representing labels of data stored in a hierarchical tree-like structure. When developing features I sometimes need to store tree structures. I can use a specialized database for such structures, or continue to use PostgreSQL and store them there. If PostgreSQL wins in the end, then ltree will be selected as the solution.
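
For readers who have not seen ltree: a value is a path of labels separated by dots, and the extension can answer questions like "is this path inside that subtree?". The sketch below imitates only that one idea in plain Python, without a database. The helper names and the sample category paths are assumptions for illustration, and the real extension offers much more (indexes, pattern matching, and so on).

```python
from typing import List


def is_descendant(path: str, ancestor: str) -> bool:
    """Return True if `path` equals `ancestor` or lies somewhere below it.

    Paths are compared label by label, so "Top.Sciences" is not
    treated as a descendant of "Top.Science".
    """
    path_labels = path.split(".")
    ancestor_labels = ancestor.split(".")
    return path_labels[: len(ancestor_labels)] == ancestor_labels


def subtree(paths: List[str], root: str) -> List[str]:
    """Return every path that belongs to the subtree rooted at `root`."""
    return [p for p in paths if is_descendant(p, root)]


# Hypothetical sample data: one path per stored tree node.
CATEGORIES = [
    "Top",
    "Top.Science",
    "Top.Science.Astronomy",
    "Top.Science.Astronomy.Cosmology",
    "Top.Hobbies",
    "Top.Hobbies.Amateurs_Astronomy",
]

if __name__ == "__main__":
    for path in subtree(CATEGORIES, "Top.Science"):
        print(path)
```

In PostgreSQL the same question is asked in SQL against an ltree column, so the whole tree stays in one ordinary table.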

What is the most annoying PostgreSQL thing you can think of? And any chance to fix it?

“There are only two kinds of languages: the ones people complain about and the ones nobody uses”. I have used PostgreSQL for a long time and can create a massive list of complaints, but let’s focus on the things which happen very often:

  • Table/index bloat. Yep, everybody can say “we have pg_repack” or “do not remove data from tables”, but the current built-in solution in PostgreSQL is VACUUM FULL, which locks the table and leads to downtime for systems which need to use this data. Maybe it is not critical for the community, and that is why we have had this issue for so long and it still has not been resolved.
  • Very costly connections. Each connection in PostgreSQL is a separate process, and that is very costly. Some other databases use threads, which allows them to support many connections without a big cost. Of course, we can solve this issue with separate products like Pgpool-II, PgBouncer, or an application-level pool. But sometimes you just want to have this out of the box.
  • Upgrading between major versions is not for nervous people. Sometimes I just want a simple process, where PostgreSQL lazily converts old data on disk to the new format if the database engine changed (so downtime will be close to zero). It would be like what the community did with NOT NULL DEFAULT <value>, which no longer blocks the table to fill in all defaults, but applies them lazily when the user requests data from the table.

What is the feature you like most in the latest PostgreSQL version?

PostgreSQL 15 doesn’t bring any special features that I had been waiting for. But I like the performance improvements for sorting, and I think it is important to keep improving this metric for the database.

Should PostgreSQL have a built-in connection pooler?

In short - yes. It is fine if this feature is optional and needs to be activated with some additional settings. But it would definitely help me as a web developer, because I wouldn’t need to set up additional tools for this.

Should PostgreSQL have built-in multi-master replication?

In my humble opinion, I don’t think we need this. I have used multi-master replication in PostgreSQL (with Bucardo) and I saw that it is difficult stuff, especially the conflict resolution mechanism. I still think the problems which can be solved by multi-master replication can also be solved without it, by building a high availability system (just add failover).

Could you describe your PostgreSQL development toolbox?

Here is the list of my daily tools:

  • Terminal: My daily driver is macOS, so I am using the iTerm terminal
  • Text editor/IDE: It depends on the programming language and platform. I’m often using:
    • Vim
    • VS Code
    • TextMate
    • Android Studio
    • Xcode
  • Browser: I am a web developer, so I mostly check my work in the Chrome and Firefox browsers
  • Languages/compilers:
    • Ruby
    • Python
    • Node.js
    • Golang
    • Crystal
    • Java/Kotlin
    • Dart
    • Objective-C

Which PostgreSQL conferences do you visit? Do you submit talks?

Before the war I often attended PGConf.EU. Here are the slides from my talk “Supercharge your PostgreSQL with extensions” in 2017.

I like this conference and the feeling of being part of a community.

Do you think PostgreSQL will be here for many years in the future?

I hope so. As long as we keep using and improving it, it will be my “default” engine to store and flexibly retrieve data. I think the extensibility of PostgreSQL, through its open interface for extensions, helps to make it a versatile tool for different tasks.

Would you recommend PostgreSQL for business, or for side projects?

Yes, yes and one more time yes.

What is your opinion on ORMs?

Let me try to be the devil’s advocate for ORMs. ORMs are created to solve some development issues:

  • Speed up development for teams by following the same approach.
  • Improve security: ORM tools are built to eliminate the possibility of SQL injection attacks.
  • Code reuse (DRY - Don’t Repeat Yourself) for the same constructs like “SELECT”, “LEFT JOIN”, “ORDER BY”, etc.
  • Depending on the ORM, you get a lot of advanced features out of the box, such as support for transactions, connection pooling, migrations, seeds, streams, and all sorts of other goodies.

I understand the hate when DBI engineers don’t see why you need an ORM if you already have SQL. But even to prevent SQL injection in a query, I need to create some “helper functions”, so that I don’t reimplement this each time I write new SQL (DRY). This means that after some time, to cover all these points, I will have created my own “ORM-like” system, even if I don’t want to use an existing solution.
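
As an illustration of such “helper functions”, here is a minimal sketch of a query helper that passes values as bound parameters instead of concatenating them into the SQL string. It uses Python's built-in sqlite3 module only so that the example runs without a server; PostgreSQL drivers follow the same idea, but with their own placeholder syntax. The `users` table, the allow-list and the function names are all hypothetical.

```python
import sqlite3
from typing import Any, Dict, List, Tuple

# Hypothetical allow-list: identifiers cannot be bound as parameters,
# so table and column names are checked against known values instead.
ALLOWED_COLUMNS = {"users": {"id", "name", "email"}}


def build_select(table: str, filters: Dict[str, Any]) -> Tuple[str, List[Any]]:
    """Build a SELECT statement with placeholders plus its parameter list."""
    if table not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown table: {table}")
    clauses: List[str] = []
    params: List[Any] = []
    for column, value in filters.items():
        if column not in ALLOWED_COLUMNS[table]:
            raise ValueError(f"unknown column: {column}")
        clauses.append(f"{column} = ?")  # the value is never put into the SQL text
        params.append(value)
    sql = f"SELECT * FROM {table}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


def find(conn: sqlite3.Connection, table: str, **filters: Any) -> List[tuple]:
    """Run the generated query and return all matching rows."""
    sql, params = build_select(table, filters)
    return conn.execute(sql, params).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Create an in-memory database with two sample users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO users (name, email) VALUES (?, ?)",
        [("alice", "alice@example.com"), ("bob", "bob@example.com")],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(find(conn, "users", name="alice"))
    # A classic injection attempt is treated as a plain string and matches nothing.
    print(find(conn, "users", name="alice' OR '1'='1"))
```

Once such a helper also has to handle joins, ordering and transactions, it starts to look very much like a small ORM.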

The major problems come not from ORMs, but from developers who use an ORM like a “black box” (they don’t understand what is going on inside) and sometimes don’t even know SQL. In the end such engineers generate such bad SQL with the ORM that other developers start to complain about ORMs because of exactly these issues. An ORM never removes the need to know SQL syntax. So it is not a “bad ORM”, it is just a “bad developer with an ORM”. You can often replace “ORM” in this phrase with something else (for example “SQL”) and the results will still be bad.

SQL versus NoSQL databases?

Apples vs oranges? 😀

The NoSQL movement is popular today. However, this does not mean that relational databases are becoming obsolete or archaic. Right now there is a symbiotic relationship between SQL and NoSQL (which also shows up in the NewSQL type of databases). We live in an era of polyglot persistence, an era of using different data stores for different needs. Relational databases no longer have a monopoly as the only data store with no alternative. Increasingly, architects choose a data store based on the nature of the data itself, how we want to manipulate it, and what volumes of data are expected.

Which other Open Source projects are you involved or interested in?

I am mostly busy with my own OSS projects, like:

etc

And occasionally I help other OSS projects (if I know the language and find a bug or have an idea for a feature or improvement).

Anything else you like to add?

Nope. Thanks for the interview 👍