Abhijit Menon-Sen

Category:   Interviews   
Interview conducted by: Andreas Scherbaum

PostgreSQL is the World’s most advanced Open Source Relational Database. The interview series “PostgreSQL Person of the Week” presents the people who make the project what it is today.

Please tell us about yourself, and where you are from.

My name is Abhijit, and my family and I live in a small village in the Himalayas in India, overlooking a peaceful little valley with a small river that we can hear when it rains heavily.

I joined 2ndQuadrant in 2012 (when I had already been contributing to Postgres off and on for many years), and now I work at EDB after it acquired 2ndQuadrant in 2020. I have worked from home (wherever that happened to be) for nearly all of my career. (I had a regular job in Delhi in the late 1990s, but I couldn’t last even a whole year of having to commute to work every day.)

How do you spend your free time? What are your hobbies?

I enjoy spending time out in nature, or listening to music at home. I also like to make things, mostly with wood. I’m fortunate to live somewhere with a dark-enough sky that I can see the Milky Way on most nights, and the meteor showers are truly awe-inspiring.

Any Social Media channels of yours we should be aware of?

Last book you read? Or a book you want to recommend to readers?

The last book I read is “Before the Coffee Gets Cold”, by Toshikazu Kawaguchi, translated from Japanese into English. It’s a whimsical story about the people who visit an old cafe to travel back in time, but under rules so restrictive that it’s hard to see the point of going at all!

I would also recommend “The Hungry Tide”, by Amitav Ghosh. Anyone could enjoy its complex, layered plot and skillful narration, but I find it all the more meaningful because it’s set in the Sundarbans (the Gangetic delta, shared by Bangladesh and India), a fascinating and beautiful but relentlessly hostile environment, of which I have many fond memories.

Any favorite movie, or show?

I love Kurosawa’s “Red Beard” and Tarkovsky’s “Andrei Rublev” for the matter-of-fact depiction of history, not making the past a grand spectacle, but using it as a mere backdrop for their deeply relatable and moving (but also very different) human themes.

I also enjoyed Tarkovsky’s “Stalker” for its surreal, slow, melancholic atmosphere and poetic interpretation of what was originally a science-fiction story. (I’ve read some people comparing this film to watching paint dry—so your mileage may vary!)

Ritwik Ghatak’s “Meghe Dhaka Tara” (Cloud-capped Star) is another magnificent film that follows a small personal tragedy in the context of an (unspoken) larger social tragedy, and unapologetically uses melodrama to tell the story in an unforgettable way.

What does your ideal weekend look like?

Putting on a loaf of bread to bake in the morning, having nice weather and time to go for a walk or a drive or to laze around reading, cooking something nice with my family, going to sleep early, and having absolutely no Postgres emergency support escalations. Not even a little “oh, we restarted Postgres and the problem is gone” one.

What’s still on your bucket list?

In January this year, after a seemingly endless period of isolating at home, my beloved partner and I did a 4000km+ road trip across the country, driving from mountains to the sea and back over the whole month, going through eight different states with no fixed itinerary. This kind of free-form trip was very different from any travel we’ve done before, and we liked it enough to want to do a lot more. But I’ve never had a bucket list.

What is the best advice you ever got?

I’m sure I’ve received a lot of good advice from people, but the only piece that I remember as such was someone telling me, “You don’t need to write about everything”, which (in context) I understood to mean that staying quiet and thinking is sometimes the best you can do, even if you have something to say. I have applied this advice to all sorts of situations for which it wasn’t originally intended, and failed to apply it in others where I should have. I didn’t like it when I first heard it, but somehow it has always stayed with me (which may surprise the person who said it as an offhand remark in a longer email to me, a stranger).

When did you start using PostgreSQL, and why?

I don’t remember exactly when I started using it. I first installed Linux in mid-1996, when an Indian computer magazine distributed a free CD with Slackware on it. (At the time, I was still using an unreliable dialup connection without direct TCP/IP access, and it worked only for a few hours at night, so downloading anything larger than a few hundred kilobytes was a nightmare.) It took me a while to get used to it, and it must have been after I finished school, around 1997 or so, when I started playing with the different databases (and other software) available on the CD.

Do you remember which version of PostgreSQL you started with?

I think the earliest version I ever used was 6.3.x, which I just installed and played around with for a while. The earliest version I used seriously was 7.1.

Have you studied at a university? If yes, was it related to computers? Did your study help you with your current job?

No, I never went to university, computers or not. I took a gap year after I finished high school, and I’ve just kept doing that again and again for twenty-four more years now.

What other databases are you using? Which one is your favorite?

I’ve looked at or used many databases (including using Oracle and Informix for some years, and using and hacking on MySQL for years), but I’ve never met anything I liked more than Postgres. I’m not religious about it—I would happily use something else if I felt it was more appropriate for some situation—but I also haven’t felt a need for anything but Postgres in a long time.

On which PostgreSQL-related projects are you currently working?

I am working on various features for Barman, an open-source Postgres backup program, as well as improvements to the backup functionality available on the server side, in the hope of seeing usable incremental backups available in core someday (my own contribution towards that goal may end up being relatively minor, but the problem is of central importance to me).

I’m also responsible for an Ansible-based deployment tool for Postgres and related software, which I built at 2ndQuadrant and have been maintaining for the past several years. It allows the declarative configuration and modification of Postgres clusters in various forms, and is a tool used across the organisation, by consultants (to prototype projects for customers), support engineers (to reproduce problems with specific configurations), and developers (to test fixes), as well as by customers themselves.

Of course, working at 2ndQuadrant/EDB means I’m also tangentially involved in a lot of other Postgres-related work, such as pglogical/BDR development.

How do you contribute to PostgreSQL?

I’ve submitted bug fixes and feature patches of varying size to Postgres for many years, and I’ve been involved with many other projects in the wider ecosystem: I was a repmgr maintainer, I’m now the technical lead for Barman, and I’ve contributed to many extensions.

Any contributions to PostgreSQL which do not involve writing code?

Answering questions on IRC (though I’m not as active these days as I used to be), and participating in mailing list discussions (ditto). Also a few conference talks, discussed below.

What is your favorite PostgreSQL extension?

Among publicly available extensions, I’m rather fond of pgaudit, which was the first major Postgres extension that I wrote (together with Ian Barwick). Its present incarnation is different from the original version, and I haven’t been involved in its development for quite a while, but it’s still excellent software.

What is the most annoying PostgreSQL thing you can think of? And any chance to fix it?

At a small scale, I would love to see an overhaul of the host-based authentication system (not just pg_hba.conf, which is confusing enough, but a host of design decisions around it). In the larger sense, I wish there were a built-in consensus mechanism that could be used to implement failover; although I generally appreciate the project’s policy of leaving some important features to external software (connection pooling is one example), I think failover is just too complicated (error-prone) and critical to be left out entirely.

Like many others, I too wish the project were just named “Postgres”.

What is the feature you like most in the latest PostgreSQL version?

I always find it hard to pick one or even a few features in a single Postgres release. If I had to choose, I would say the parallel query processing and vacuum improvements, but more for the trend of introducing significant improvements release-by-release than because of any particular change in this latest version. I also like the progress reporting changes.

In Postgres 15, I am very much looking forward to using archiving modules and pluggable base backup processing modules. They are important advances towards making Postgres backups more convenient (which, again, will take a few more releases’ worth of work to accomplish). The WAL prefetch code (which is a different approach than a prototype I worked on long ago) is also an exciting addition that solves a real performance bottleneck.

Adding to that, what feature/mechanism would you like to see in PostgreSQL? And why?

One of my favourite projects was to write an embedded web server that runs as a bgworker in Postgres, and exposes internal state through an HTTP monitoring endpoint that does not incur backend startup overhead for each statistics request. There has been some recent discussion by other people on -hackers about this approach too, and I would really like to see this feature in some form (either in core or as an extension). It would dramatically improve the overall observability of Postgres.

On a less serious note, I would like to see support for bitmap indexes, but only because I was involved in their development long ago. (The use-cases vs. expected performance for the feature were not compelling enough to continue development at the time.)

Could you describe your PostgreSQL development toolbox?

I use Neovim. I’m not much of an IDE person, but I appreciate being able to use the clang language server for code navigation/completions within my familiar vim environment.

As for the build toolchain, I just use whatever the distribution installs by default. On this Debian 11 laptop, that’s gcc 10.2.1, but I have clang-12 (LLVM) installed on another machine. I use gdb or lldb for debugging, and strace+bpftrace (a lot!) for tracing.

I really like Sourcetrail for code navigation and visualisation, and I’ve used it a lot with Postgres; but alas, it is now abandonware. I still use the last released version, because there is really no viable replacement for it.

Which skills are a must have for a PostgreSQL developer/user?

Curiosity, persistence, and an attention to detail (for developers of nearly anything).

Which PostgreSQL conferences do you visit? Do you submit talks?

I have attended PgConf.IN (Bangalore, India) since its inception, and presented talks on some aspects of the Postgres internals (locking, backups, WAL, VACUUM). I was fortunate to have an enthusiastic audience, and lots of questions in the hallways after each talk.

I write slides (in HTML, using Slidy.js) with terse phrasing and lots of examples (e.g., command output, or snippets of source code) and diagrams—but I’m terrible at using drawing programs, so I write JavaScript code to generate SVG diagrams using Raphael.js. This is probably why I can give only one talk per year.

Do you think PostgreSQL has a high entry barrier?

Postgres has always had the reputation of being harder than MySQL to install and get started with (in no small part due to the mysteries of pg_hba.conf), but I think it’s no longer deserved. The default distribution-packaged Postgres setup is perfectly usable now, and has been for many years.

In terms of getting started with developing Postgres, however, the barrier is extraordinarily low. The source code is approachable to an extent that I’ve rarely encountered with other large projects (certainly much more so than the MySQL source code, which I have also worked on, long ago). The source code layout, comments, commit messages, and overall design are the biggest factors in newcomers being able to quickly make meaningful changes to a full-featured, production-quality RDBMS. (Of course, the development community is responsive and generally quite friendly, but it’s not particularly unusual in this regard. It’s the code itself that is remarkable.)

For better or worse, actually getting significant changes merged and released in Postgres can be quite challenging. But that’s a barrier well beyond the “entry level”.

What is your advice for people who want to start developing PostgreSQL, as in, contributing to the project? Where and how should they start?

The obvious starting point is the commitfest patch review process. I would encourage anyone to read earlier patch review discussions to become familiar with the kinds of things that reviewers should consider while reading code. To go beyond patch review, I would ignore the “todo list” (a place where bad ideas traditionally go to die) and find a bug to work on from pgsql-bugs. Follow some bugs from the initial report and discussions until the eventual fix, and you will become familiar enough with the code to fix bugs yourself. Of course, if there’s a specific feature you need, that will give you some direction and more motivation to dig into the source code too.

Do you think PostgreSQL will be here for many years in the future?

I would really like to answer with an unequivocal yes—and of course Postgres is far too widely used to go away anytime soon—but I also realise that the project faces significant challenges in future to remain sustainable. For one thing, a lot of funding and developer attention has moved away from RDBMSes and SQL towards “novel”, “simple”, “fast” data stores. Whatever one may think of people’s opinions of SQL or Postgres, this change represents valuable resources being diverted away from Postgres development. In such conditions, Postgres’s reputation for being very hard to get significant new features into does not help, and more and more effort is going into solving people’s real problems outside Postgres (e.g., either in proprietary forks, or spinoff projects that care to maintain only wire-protocol compatibility).

Of course, it also doesn’t help that Postgres is coming closer and closer to fundamental design limitations as database sizes grow. This means that there are fewer low-hanging fruit to work on, and meaningful improvements would require major structural changes, for which the barrier to entry is enormously high. There are people doing excellent work on such projects (e.g., async IO support), but there are many more projects that have fallen by the wayside as well. Although the Postgres server developer community largely comprises people working for a small number of competing companies, it must somehow find better ways to work together to effect these big changes without making contributors bitter and burned out and cynical about the process.

I do hope we can solve those problems and keep the project healthy for decades to come.

Would you recommend PostgreSQL for business, or for side projects?

I’ve run business-critical Postgres servers, and been professionally involved in support/disaster recovery operations for such services, for many years now. I would not hesitate to recommend Postgres for data storage in most contexts, for side projects or for serious production use.

Are you reading the -hackers mailing list? Any other list?

I read -hackers regularly and skim -performance occasionally.

What other places do you hang out?

I’m usually on IRC (Libera #postgresql) as amenonsen (I was crab for decades, but alas, I lost that nick in the Freenode/Libera shuffle), but I’m not very active these days.

Which other Open Source projects are you involved or interested in?

I’ve been involved with FLOSS (free/libre/open source software) development since I was sixteen, and I’ve had the opportunity to contribute to a number of projects throughout my professional life. Some notable examples: I’ve been a Perl committer (“pumpkin holder”) and an Ansible committer for several years, and I was one of the founders of the Archiveopteryx mail server (archiveopteryx.org) based on Postgres, which was unfortunately a little ahead of its time. I am also particularly proud of writing the Postgres protocol “dissector” (plugin) for Wireshark (which was, at the time, still called Ethereal).

Anything else you would like to add?

I feel very fortunate to have used and contributed to Postgres in so many different ways over the majority of my working life, and I really value the productive working relationships I’ve built with many other contributors during that time.