Postgres

Protect your Postgres Servers with Zmanda Pro

Postgres databases provide several tools to perform different types of backups and restores. Each tool has pros and cons and recommended use cases. This documentation covers backups using the pg_basebackup utility. Later we will add documentation for pg_dump.

pg_basebackup

About the utility

pg_basebackup allows you to take a base backup of a running PostgreSQL database cluster. This backup can be used for various purposes, such as setting up a standby server for high availability, creating a replica, or creating a base backup for point-in-time recovery.

Here are some key points about pg_basebackup:

  1. Purpose: The primary purpose of pg_basebackup is to create a consistent snapshot of the entire PostgreSQL database cluster with minimal downtime (on the order of seconds, up to several minutes).

  2. Online Backup: It can perform an online backup, meaning you can create a backup while the PostgreSQL server is running and accepting connections. This ensures minimal downtime for your production system.

  3. Full Copy of the Cluster: The backup created by pg_basebackup is a full copy of the database cluster, including all databases, tablespaces, and other configuration files. It can be used to restore the entire cluster to a specific point in time.

  4. WAL Shipping: Along with the base backup, pg_basebackup also copies the Write-Ahead Log (WAL) files. This ensures that the standby server can catch up with changes made on the primary server.

  5. Options and Parameters: pg_basebackup has various options and parameters that allow you to customize the backup process. You can specify the destination directory, compression options, connection parameters, and more.

Here is an example command:

PGPASSWORD=your_password pg_basebackup -h your_hostname -U your_username -D /path/to/backup/directory -P

  • PGPASSWORD=your_password: Sets the PostgreSQL password using the PGPASSWORD environment variable.

  • pg_basebackup: Initiates the PostgreSQL base backup command.

  • h your_hostname: Specifies the host where the PostgreSQL server is running.

  • U your_username: Specifies the PostgreSQL username.

  • D /path/to/backup/directory: Specifies the target directory for the backup.

  • P: Shows progress information.

Prepare the Postgres Server

  1. Create two directories, one for the full backups and the other for WAL files. Ownership of the directories must be postgres:postgres with 770 permissions. A good place may be in the home directory of the postgres user.

  2. Edit the postgres.conf file with the following configurations (generally located at /var/lib/pgsql/<version>/data/postgres.conf)

#Enable continuous archiving 
wal_level = replica

#Set the location for archiving WAL files 
archive_mode = on
archive_command = 'cp %p /path/to/inc_backup/%f' # command to copy wal files from data/pg_wal to pre-backup location. 

# Enable logs (Optional but helpful) 
log_destination = 'stderr' 
logging_collector = on 
log_directory = 'log' 
log_filename = 'postgresql-%a.log' 
log_rotation_age = 1d
  1. Update the pg_hba.conf file by adding the line host all all <ip of client system> 255.255.255.252 md5as shown below (file usually located at /var/lib/pgsql/<version>/data/pg_hba.conf) :

  1. Restart the Postgres Services

# name the of the service can change based on the version of the postgres and the host OS.
sudo systemctl restart postgres-14

# or open connection to database and run the below query
SELECT pg_reload_conf();

Configure Backups

Backups will be created using the "Files and Folders" protected item type and will make use of Before task and After task commands. We will create one protected item for full backups, and one for WAL backups.

We include unmasked passwords in our Before and After task commands. To avoid exposing passwords in the Zmanda Pro UI, you may choose to input these commands in shell scripts and provide the path to the script within Zmanda Pro.

  1. Create a Files and Folders protected item for the full backups:

    1. Items: Include /path/to/full_backup

    2. Commands:

      1. Before Task: PGPASSWORD=<password> pg_basebackup -h <host ip> -U postgres --pgdata=/path/to/full_backup --wal-method=stream --progress -v

      2. After task: rm -rf /path/to/full_backup/*

    3. Click Save

  1. Create a Files and Folders protected item for the WAL backups:

    1. Items: Include /path/to/inc_backup

    2. Commands:

      1. Before Task: PGPASSWORD=<password> psql -h <host ip> -U postgres -d postgres -c "SELECT pg_switch_wal();"

      2. After task: rm -rf /path/to/inc_backup/*

    3. Click Save

You can now run these backups manually or according to a schedule. Please note that you may see warnings next to the jobs. Check the job logs to make sure that you see (100%), 1/1 tablespace as shown below:

Restore

You must first restore the desired full backup, then sequentially restore any subsequent WAL backups.

  1. Stop the Postgres Server: systemctl stop postgresql-<version>

  2. Create target directories for the full and incremental restores. As before, the directories should be owned by postgres:postgres with 770 permission

  3. Restore the full and incremental backups to the paths you just created using the Zmanda Pro UI. To save time, you can restore all the required incremental backups for your specific point-in-time recovery in this step and then proceed with the following steps.

  4. After the restores are complete, check to ensure that all the restored files and folders are owned by postgres:postgres. Directories and files should have 700 and 600 permissions respectively.

  5. To restore the full backup, run the following commands:

# With the services stopped, remove all the contents of the Postgres data directory
rm -rf /path/to/data/directory/*

# Copy the files from your restore directory to the data directory
mv /path/to/full_restore/full_backup /path/to/data/directory/

Again, all files and folders should be owned by postgres:postgres. Files should have 600 permissions, and all directories should have 700 permissions

  1. To restore the incremental backups, update the following line in the postgres.conf file (located in the data directory):

restore_command = 'cp /path/to/incremental/restore/inc_backup/%f %p'
  1. Create a recovery.signal file in the data directory

cd /path/to/data/directory
touch recovery.signal
  1. Start the postgres server:systemctl start postgresql-<version>

  2. Once the recovery.signal file is gone, check the database to ensure you have recovered to the desired state.

  3. Remove the restored directories and contents created in Step 2. This ensures a clean slate for the next restore jobs. Otherwise, you will likely have multiple full and incremental restore jobs mixed which may cause issues down the line.

rm -rf /path/to/restore/directory

Considerations for pg_basebackup

  1. Restore granularity is limited to the entire Postgres server. Restoration of specific databases and tables is not supported (use pg_dump for higher granularity)

  2. Postgres config files (postgresql.conf, pg_hba.conf, and pg_ident.conf) are not included in incremental backups. They are only part of full backups.

  3. This approach requires a temporary storage area equal to the size of the base backup + WAL files. Large databases with many transactions may generate large sets of WAL files and large base backups.

  4. Regarding Step 3 of the Restore process, you may be required to perform a large number of restores depending on the frequency of full backups and the number of incremental backups taken between two full backups. For example, if a full backup is taken once per week and an incremental is taken every hour, you will have to perform 1 restore for the full backup, and up to 24*7 restores to reach the desired point in time.

Last updated