2022-08-12

We are upgrading the operating system on our Postgres database clusters

Learn when these upgrades will happen and how they will help boost performance and reliability on GitLab.com.

Continuing our work to improve the performance and reliability of GitLab.com, we are taking another step for our clusters of Postgres database nodes. These nodes have been running Ubuntu 16.04 with extended security maintenance patches, and it is now time to move them to a more current version. Usually, this kind of upgrade is a behind-the-scenes event, but an underlying technicality requires us to take a maintenance window for the upgrade (more on that below).

We have been preparing for and practicing this upgrade and are now ready to schedule the window to do this work for GitLab.com.

When will the OS upgrade take place and what does this mean for users of GitLab.com?

This change is planned for 2022-09-03 (Saturday) between 11:00 UTC and 14:00 UTC. We anticipate a service downtime of up to 180 minutes (see reference issue). During this time, GitLab.com will be completely unavailable.

We are taking downtime to ensure that the application works as expected following the OS upgrade and to minimize the risk of any data integrity issues.

Background

GitLab.com's database architecture uses two Patroni/Postgres database clusters: main and CI. We recently completed a functional decomposition, and the CI cluster now stores the data generated by GitLab CI features. Each Patroni cluster has one primary and multiple read-only replicas. In each of the Patroni clusters, the Postgres database is ~18 TB in size and runs on Ubuntu 16.04. During the scheduled change window, we will switch over to our newly built Ubuntu 20.04 clusters.

The challenge

Ubuntu 18.10 introduced an updated version of glibc (2.28), which includes a major update to locale data. That update changes the sort order of some text values, so Postgres indexes that were built with earlier versions of glibc and depend on that sort order can become corrupted. Because we are upgrading to Ubuntu 20.04, our indexes are affected by this. Therefore, during the downtime window scheduled for this work, we need to detect potentially corrupt indexes and reindex them before we enable production traffic again. We currently have the following types and approximate numbers of indexes:

 Index Type | # of Indexes
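
To illustrate why a change in sort order can break an index, here is a minimal Python sketch. It models an index as a sorted list that is searched with binary search. The two sort-key functions are toy stand-ins invented for this example, not the actual glibc collation rules, and all helper names are hypothetical. It is not how Postgres or our tooling detects affected indexes. It only shows the failure mode and why rebuilding the index fixes it.

```python
import bisect

# Toy stand-ins for two collation versions (assumptions for illustration only,
# not real glibc behavior): the "old" ordering ignores hyphens, the "new" one
# compares raw code points.
def old_sort_key(s: str) -> str:
    return s.replace("-", "")

def new_sort_key(s: str) -> str:
    return s

def build_index(values, sort_key):
    """Return the values sorted under sort_key, like the leaf order of a B-tree."""
    return sorted(values, key=sort_key)

def index_lookup(index, value, sort_key):
    """Binary search that trusts the index to be ordered by sort_key."""
    keys = [sort_key(v) for v in index]
    pos = bisect.bisect_left(keys, sort_key(value))
    return pos < len(index) and index[pos] == value

def is_consistent(index, sort_key):
    """Check that the stored order still matches the ordering in use."""
    keys = [sort_key(v) for v in index]
    return all(a <= b for a, b in zip(keys, keys[1:]))

VALUES = ["ab", "a-c", "b-a", "az"]

if __name__ == "__main__":
    # Index built before the upgrade, under the old ordering.
    index = build_index(VALUES, old_sort_key)
    print("found with old ordering:", index_lookup(index, "a-c", old_sort_key))

    # After the upgrade the same stored order is searched with the new ordering.
    print("consistent after upgrade:", is_consistent(index, new_sort_key))
    print("found with new ordering:", index_lookup(index, "a-c", new_sort_key))

    # "Reindexing" rebuilds the stored order under the new ordering.
    index = build_index(VALUES, new_sort_key)
    print("found after rebuild:", index_lookup(index, "a-c", new_sort_key))
```

In this toy model, the value is still stored, but the search looks in the wrong place because the stored order no longer matches the comparison rules. Rebuilding the list under the new rules restores correct lookups, which is the role reindexing plays during the maintenance window.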
