Stefan Keller



Tag:   postgresql   
Category:   Interviews   
Interview conducted by: Andreas Scherbaum

PostgreSQL is the World’s most advanced Open Source Relational Database. The interview series “PostgreSQL Person of the Week” presents the people who make the project what it is today. Read all interviews here.

Please tell us about yourself, and where you are from.

I’m a Professor for Data Engineering since about 20 years and director of Institute for Software at Eastern University of Switzerland (short “OST”). I’m a “spatialist”; my focus is on (spatial) data engineering, spatial data analytics and open source, open data and educational resources.

The OST campus of Rapperswil is at a nice lake side at the other end of lake of Zurich. We offer Bachelors and Masters degrees in Computer Science and Data Science. We also are well known in industry for our continuing education courses which cover a broad range from Software Engineering and Human-Computer Interaction to databases - like PostgreSQL/PostGIS.

Stefan Keller

Stefan Keller

How do you spend your free time? What are your hobbies?

Family, mapping, hiking as well as skiing and motorcycling.

Any Social Media channels of yours we should be aware of?

The news blog on the website of the Institute for Software at the most beautiful campus of Switzerland.

My personal Twitter handle.

Last book you read? Or a book you want to recommend to readers?

I’m reading mostly during my holidays. I hope epub reader for mobiles will change that so that I can read while commuting. I recommend following modern form as a “must read”: The programming zines from Julia Evans! Look for example at this one about SQL.

Any favorite movie, or show?

The movie “Pirates of the Carribean”, and following sports like ice hockey, soccer and volleyball. I’m also an opera fan. I helped out singing e.g. in the prisoners’ chorus of Nabucco - but that was many years ago.

What’s still on your bucket list?

Travelling, just travelling!

Stefan Keller

Stefan Keller

When did you start using PostgreSQL, and why?

I’ve started using PostgreSQL around year 2002 when I became a professor here at Eastern University of Switzerland. I had to teach database systems and there was Oracle in place. And Oracle was a pain to install just because of its size. And also because all that license stuff. So, I looked for an alternative which was close to the SQL standard. And it should be Open Source so the students could install it on their own computers which was not a common place at the time.

This is how I found PostgreSQL. Then - since I specialized in geospatial information technology and got addicted by maps - I learned to know OpenStreetMap, the wikipedia of maps. And guess, what’s at the core of OpenStreetMap: A PostgreSQL database.

Do you remember which version of PostgreSQL you started with?

No - I thought that’s an Oracle question ;-). I had to look it up and it must have been version 7.2.

Yes, I studied Geography with a minor in Computer Science at the University of Zurich. I was an auxiliary assistant at the Institute for Informatics where I learned Lisp in addition to Pascal. That was rather helpful.

What other databases are you using? Which one is your favorite?

I love the joke “The solution is PostgreSQL - What was the question?” since it’s a kind of reaction to the NoSQL movement which somehow still seems to maintain some hype. Actually I’m using also SQLite or H2 when they fit the task. And Lucene (OpenSearch) if you count that as a database.

Many research projects are related to PostgreSQL since OpenStreetMap is a treasure trove for data engineering and spatial data analysis. And the PostGIS extension is the perfect tool for many tasks.

As a teacher I have the possibilities to work with young computer science people which are eager to learn, to program and to evaluate and benchmark. So there are always student theses which are using PostgreSQL at the persistence level.

The most recent software develpment project is called pgsynthdata. This is a synthethic data generator tool with almost no manual configuration needs. It takes a PostgreSQL database as input, clones the schema and generates purely synthetic data of almost arbitrary size. It’s a tiny but promising command line program written in Python with an open source license.

How do you contribute to PostgreSQL?

I’m not a core contributor. I sometimes maintain self-made PostgreSQL extensions written in SQL and PL/pgSQL.

My main contributions are outreaching PostgreSQL while teaching and organizing events. Actually, I claim to be the inventor of the notion of “postgresonic” which was inspired by pythonic. If I say this is postgresonic I want to express that this is good SQL coding style. And that’s my initial definition:

  • Prefer modern SQL standard over idiosyncracies, when you have a choice, like e.g. “JOIN ON …".
  • Prefer lowercase snake_case for object names (e.g. my_table, not MYTABLE).
  • Prefer operators over functions (e.g. select ‘2021-03-13’::date + ‘1 day’::interval).
  • Prefer cast over typename (e.g. not: select date ‘2021-03-12’ but see below).

I’m keen to add more and I’m open for any comments on this.

Any contributions to PostgreSQL which do not involve writing code?

As said before, I’m helping to organize for example the Swiss PGDay together with the Swiss PostgreSQL Association (SwissPUG.ch). And I regularly organize public presentations of my database seminar as part of our masters program of computer and data science.

What is your favorite PostgreSQL extension?

Definitely PostGIS - although there’s also the key-value store “hstore” and “jsonb”. As you may know, PostGIS became the leading geospatial management software tool - beyond the Open Source domain. Even implementations like Microsoft SQL Server and Google stores are referring to PostGIS.

What is the feature you like most in the latest PostgreSQL version?

PostgreSQL is a rather underestimated multi-model database which is highly extensible. This extensibility combined with a vibrant community of contributors is just unprecedent.

As a specific feature I like the recent implementations of SQL/JSON and JSON Path. I’ve actually written a whole 20 page tutorial about this. And I’m waiting for the JSON_TABLE function to be released!

Adding to that, what feature/mechanism would you like to see in PostgreSQL? And why?

I’m waiting for a native column store as an application of the pluggable storage engine. That would be a kind of milestone besides the parallelization efforts. A significant speed-up is being expected by column store processing as compared to the traditional tuple-at-a-time pipelining (“Volcano-style”). It would also be another “proof” of how extensible and up-to-date PostgreSQL is.

Could you describe your PostgreSQL development toolbox?

It’s just pgAdmin4, psql and Notepad++. Then I eventually check tools like DBeaver and Visual Code in order to be up-to-date what students may use.

Which skills are a must have for a PostgreSQL developer/user?

Modern SQL skills meaning SQL knowledge which goes beyond SQL:92.

My observation is that two kinds of people sometimes want to avoid SQL in an absurd way: Software engineers and NoSQL guys. That’s mainly just because they either don’t like it or don’t understand it. But who says “I don’t understand it” when you can say “it’s outdated or bad”.

A colleague of mine and database consultant, Lukas Eder, gave a talk that shows how some tasks are easier and faster with SQL databases than with programming with a computer language like Java.

Do you use any git best practices, which makes working with PostgreSQL easier?

We’re using git and git flow by default. This also helps PostgreSQL related projects. So for example pgsynthdata is being maintained here.

I’d rather emphasize that unit testing with PostgreSQL is a challenge - like with many other databases. Here I’m wondering why e.g. pgunit is not used more.

And - though simple enough - I often have to explain people the advantages of shell scripts and command line tools - like psql!

Which PostgreSQL conferences do you visit? Do you submit talks?

Besides the Swiss PGDay I’m sometimes visiting PGConf.EU. And yes, I submit talks occasionally if my lecturing time permits.

Do you think PostgreSQL has a high entry barrier?

No. It’s nifty and easy to install. And it has a vibrant, welcoming and diverse community.

What is your advice for people who want to start PostgreSQL developing - as in, contributing to the project. Where and how should they start?

There are so many possibilities to start in an open source project like this. The very first step probably is to get in contact with your local community or on one of the social media channels.

Do you think PostgreSQL will be here for many years in the future?

Definitely. And the reason is in it’s robustness, extensibility and community. It’s an “elephant”, as Mike Stonebraker coined it; but it’s not legacy code as he probably wanted to point to.

Are you reading the -hackers mailinglist? Any other list?

Yes, sometimes and mainly to search it’s archives.

What other places do you hang out?

Sometimes I’m on Twitter and sometimes on IRC which is another unprecedent element in the PostgreSQL ecosystem. You have to know how the Postgres IRC channel works, like before all, to ask a specific and precise question. I would guess that this kind of expertise and often instant support would cost you a fortune in commercial systems.

Anything else you like to add?

Thanks, Andreas, for your passion about databases in general and especially for PostgreSQL!