PostgreSQL

PostgreSQL Server Backup

When Amanda Enterprise Edition is configured and licensed for PostgreSQL backup, the ZMC allows you to select a PostgreSQL server to back up. All databases on the server are included in the backup, which can be either full (a copy of the data directory) or incremental (based on the PostgreSQL write-ahead logs, referred to as WALs).

Note that backing up PostgreSQL tablespaces is not supported.

Requirements for PostgreSQL Server Backup and Restore

These instructions assume you have already installed and licensed the Amanda Enterprise server and the PostgreSQL server being backed up. There are a number of additional requirements:

  • Make sure that your versions of Postgres conform to the tested platforms listed on the Zmanda Network Supported Platforms page.
  • Create a backup set that you intend to use for backing up the PostgreSQL server as described here.
  • On the Amanda server, edit the /etc/zmanda/zmc_aee/zmc_user_dumptypes file and add the following lines to the app_ampgsql_user definition if the values differ from the Amanda defaults (see below):

property "TMPDIR" "Path_to_temp_dir"
property "STATEDIR" "Path_to_state_dir"

Path_to_temp_dir

Temporary directory. It must have enough capacity to temporarily store the backup data for a full backup of the PostgreSQL server. Defaults to /tmp. The amandabackup user (the Amanda client user) must have read, write and execute privileges on this directory. Because the /tmp file system may not have sufficient space, Zmanda recommends setting TMPDIR to a different directory, as shown above.

Path_to_state_dir

Directory to store information about what has been backed up (i.e., the state of the backup). It requires only about 20K for each backup object/DLE. The amandabackup user must have read, write and execute privileges on this directory. Default is /var/lib/amanda/gnutar-lists.

  • On the PostgreSQL server (Amanda client), edit the /etc/amanda/backup_set_name/amanda-client.conf file to include the following lines (to make the settings global, make these changes in /etc/amanda/amanda-client.conf instead):

property "PG-DATADIR" "Path_to_PSQL_Data_Dir"

property "PSQL-PATH" "Path_to_PSQL_Binary"
property "PG-ARCHIVEDIR" "Path_to_PSQL_Archive_Dir"

property "PG-HOST" "hostname_or_directory_of_socket_file"
property "PG-PORT" "TCP_port_to_connect_to. Default: 5432"
property "PG-USER" "PostgreSQL_username"
property "PG-PASSWORD" "PSQL_Password"

property "PG-CLEANUPWAL" "Whether_to_clean_up_WAL_Yes_or_No"

property "PG-PASSFILE" "Path_to_PSQL_Password_File"

property "PG-DB" "Database_name"

If you are running PostgreSQL 8.1 or later, use the PG-PASSFILE property and comment out the PG-HOST, PG-PORT, PG-DB, PG-USER and PG-PASSWORD properties. If you are using an older version of PostgreSQL (earlier than 8.1), you cannot use PG-PASSFILE; use the PG-HOST, PG-PORT, PG-DB, PG-USER and PG-PASSWORD properties instead.

To specify multiple databases, add a prefix to the property name that corresponds to the diskname, followed by a dash. For example:
       property "PG-USER" "amandabackup"

becomes:

       property "/path/to/data/dir-PG-USER" "amandabackup"
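
As an illustration of the naming rule above, the following shell sketch prints prefixed property lines for two data directories. The helper name prefixed_property and the second path (/path/to/other/dir) are hypothetical and are used only for this example; they are not part of Amanda.

```shell
#!/bin/sh
# Print an amanda-client.conf property line whose name is prefixed with
# the diskname, following the "diskname-PROPERTY" rule described above.
# prefixed_property is a hypothetical helper, not part of Amanda.
prefixed_property() {
    diskname=$1
    name=$2
    value=$3
    printf 'property "%s-%s" "%s"\n' "$diskname" "$name" "$value"
}

# One line per data directory (the second path is a made-up example).
prefixed_property /path/to/data/dir PG-USER amandabackup
prefixed_property /path/to/other/dir PG-USER amandabackup
```

The printed lines can be pasted into amanda-client.conf; each diskname must match the data directory configured for the corresponding backup object.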

Path_to_PSQL_Data_Dir

The path to the PostgreSQL data directory.

Path_to_PSQL_Archive_Dir

The path to the PostgreSQL archive directory. Specify the path where the archive command copies files and stores them between full backup runs. The PostgreSQL user must have read, write and execute privileges in this directory. Zmanda recommends using system groups to manage permissions rather than granting access to all users.

Specify the location that has been configured in PostgreSQL for continuous write-ahead log (WAL) archiving (i.e., the location written to by the archive_command in the PostgreSQL configuration file). Note that WAL archiving must be enabled; it is not enabled by default in either PostgreSQL or PostgreSQL Plus.
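
To make the relationship between PG-ARCHIVEDIR and archive_command concrete, the sketch below prints postgresql.conf settings that archive WAL segments into a given directory. It rests on assumptions: the helper name wal_archive_settings is hypothetical, the directory /var/lib/pgsql/archive is only an example, the archive_mode setting applies to PostgreSQL 8.3 and later, and the cp-based command follows the general pattern shown in the PostgreSQL documentation rather than anything Amanda requires.

```shell
#!/bin/sh
# Print postgresql.conf settings that archive WAL segments into the
# directory used as PG-ARCHIVEDIR. wal_archive_settings is a hypothetical
# helper; archive_mode is assumed to be available (PostgreSQL 8.3 or later).
wal_archive_settings() {
    archive_dir=$1
    printf "archive_mode = on\n"
    # In archive_command, %p is the path of the WAL segment to archive
    # and %f is its file name. The test refuses to overwrite a segment
    # that has already been archived.
    printf "archive_command = 'test ! -f %s/%%f && cp %%p %s/%%f'\n" \
        "$archive_dir" "$archive_dir"
}

wal_archive_settings /var/lib/pgsql/archive
```

Whatever directory is used here is the one to enter as the PG-ARCHIVEDIR value. The server must be restarted or reloaded, as appropriate for the PostgreSQL version, before the settings take effect.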

hostname_or_directory_of_socket_file

Specify the hostname (localhost if that is appropriate) or the directory where a socket file is located. Entries beginning with / are interpreted as a socket file directory (just the directory, for example, /tmp, not /tmp/.s.PGSQL.5432). If a directory is used, the PostgreSQL server and Amanda backup server must reside on the same machine.

Path_to_PSQL_Binary

The path to the PostgreSQL psql binary executable file.

PostgreSQL_username

The PostgreSQL database user to connect as, which must have superuser privileges.

Path_to_PSQL_Password_File

Passfile to use for PostgreSQL 8.1 or later. See http://www.postgresql.org/docs/8.1/static/libpq-pgpass.html. The file must be owned by the amandabackup user and must be readable only by that user (e.g., mode 0600), as noted in the PostgreSQL documentation.
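
A minimal sketch of creating such a password file follows. The helper name write_passfile and all example values are hypothetical; the entry layout hostname:port:database:username:password is the format described in the PostgreSQL password file documentation linked above. The sketch does not escape colons or backslashes, so it assumes none of the values contain them.

```shell
#!/bin/sh
# Write a single-entry PostgreSQL password file and restrict it to its
# owner. write_passfile is a hypothetical helper, not part of Amanda.
# Entry layout: hostname:port:database:username:password
write_passfile() {
    file=$1 host=$2 port=$3 db=$4 user=$5 password=$6
    # Use a restrictive umask so a newly created file is never readable
    # by other users, then write the single entry.
    ( umask 077 && printf '%s:%s:%s:%s:%s\n' \
        "$host" "$port" "$db" "$user" "$password" > "$file" )
    # Enforce owner-only access even if the file already existed.
    chmod 600 "$file"
}

# Example (run as the amandabackup user; every value is a placeholder):
# write_passfile "$HOME/.pgpass" localhost 5432 template1 amandabackup 'password'
```

The resulting path is the value to use for PG-PASSFILE. Running the helper as the amandabackup user also satisfies the ownership requirement.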

PSQL_Password

The PostgreSQL password (deprecated in PostgreSQL 8.1).

Whether_to_clean_up_WAL_Yes_or_No

Whether to remove old WAL segment files during full backups. If enabled, WAL archive files are removed from the PG-ARCHIVEDIR location after a full backup is completed. Default is yes.

Database_name

The database to connect to. The default value is "template1", which exists in default PostgreSQL installations.

 

For further details on application properties, see amanda-client.conf(5). For specific details on PostgreSQL agent properties, see ampgsql(8).

  • The pathnames referenced above must exist, with permissions set as indicated in the table below:

Directory       Amanda   Postgres
TMPDIR          rwx
STATEDIR        rwx
PG-DATADIR               rwx
PG-ARCHIVEDIR   rwx      rwx
PG-PASSFILE     r
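
The directory permissions listed above can be checked with a short shell sketch such as the following. The helper name check_rwx is hypothetical. Because the checks apply to whichever user runs them, run it as the amandabackup user for the Amanda column and as the PostgreSQL system user for the Postgres column.

```shell
#!/bin/sh
# Report whether the current user has read, write and execute access to
# a directory. check_rwx is a hypothetical helper, not part of Amanda.
check_rwx() {
    dir=$1
    if [ -d "$dir" ] && [ -r "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ]; then
        echo "OK: $dir"
    else
        echo "FAIL: $dir"
        return 1
    fi
}

# Example: replace /tmp with the configured TMPDIR, STATEDIR,
# PG-DATADIR or PG-ARCHIVEDIR value.
# check_rwx /tmp
```

Note that the root user passes these checks for any existing directory, so running the helper as root does not confirm anything about the amandabackup or PostgreSQL users.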

 

 

  • Create a PostgreSQL role called amandabackup (or whatever system user name is used for Amanda backups). The role should be created as a LOGIN role with SUPERUSER privileges and with a password that either matches the value of PG-PASSWORD or is supplied in the PostgreSQL password file, PG-PASSFILE. This can be accomplished using either the PostgreSQL system command createuser or the PostgreSQL database command CREATE ROLE.

    For example (using the PostgreSQL database command CREATE ROLE):
CREATE ROLE amandabackup WITH SUPERUSER LOGIN PASSWORD 'password';
  • PostgreSQL does NOT allow a user to run psql as any other user. For example, the root user cannot run the psql command as another user. This causes Amanda configuration check failures. To avoid them, modify /var/lib/pgsql/data/pg_hba.conf (the actual location may differ depending on the PostgreSQL installation) to allow the amandabackup and root users to access all databases on the PostgreSQL server. The following example allows all users to access all databases (you may have to modify an existing line in the configuration file):
TYPE       DATABASE  USER         CIDR-ADDRESS         METHOD

local        all     all                                 md5

Configuring PostgreSQL Server Backups from the ZMC Backup What Page

Create a dedicated backup set for each PostgreSQL server you intend to back up. On the Backup What page you are prompted to select what type of object you want to back up. Choose PostgreSQL, and the following options are displayed:

backup_where_postgreSQL.png

Host Name
The name of the machine running the PostgreSQL server you intend to back up.
Data Directory
The path to the PostgreSQL data directory.
Encryption and Compression
Set these options as desired. They are described in more detail here.
Advanced Options - Estimate
If estimates are taking too long and the databases being backed up do not change much in size from backup to backup, use the Historical Average calculated from previous backups. In most cases, the default of Reliably Accurate is appropriate.

After you have set the options, click the Add button to add the PostgreSQL server to the backup set. Click Apply Plan to Server to commit the changes; you can then configure the backup set just as you would any other by setting the options on Backup Where, Backup How, and Backup When, etc.

For PostgreSQL backups, the data timeout must be increased on the Backup How page, because the default data timeout is not sufficient. The data timeout is specified for the backup set and cannot be specified for each Disk List Entry. Zmanda recommends setting the data timeout to 6000 seconds or higher, as shown below. Note that setting the timeout higher does not affect backup performance.

BackupHow-data-timeout-3.1.png

 

Configuring PostgreSQL Server Restores from the ZMC Restore What Page

Make sure that PostgreSQL is installed in the same location as when the backup was run. The database and log file locations should also match the original configuration.

Either select the desired PostgreSQL backup from one of the Reports, or go directly to the Restore Where page and select a PostgreSQL backup for restore. The Explore button lets you select from the most recent backups.

When you have selected the backup object that includes the PostgreSQL server for restore, the Restore What page displays the following options: 

restore-what-postgresql (1).png

Select the databases you wish to restore. Choose All for a full restore up to the last backup.  Click Next Step when you are done, and the Restore Where options are displayed:

restore-where-postgreSQL.png

Set the restore options as desired. Note that the Destination Directory and Temporary Directory must each have enough space to hold the selected backup data. If you choose the same directory for both, make sure that the selected directory has enough space to hold two copies of the backup image. Do not specify the PostgreSQL data/cluster directory as a destination, especially if PostgreSQL is running.

After reviewing the entries, click Restore to start the restore process.

When the ZMC restore process is complete, the restored files will reside on the specified host and destination directory.  Completing the recovery is accomplished outside the ZMC using the host operating system and PostgreSQL as described below.

Completing the PostgreSQL Database Point-in-time Recovery

This section describes how to perform point-in-time recovery using the database files and WAL files restored by the ZMC. You can also restore the database using the amrecover command, which is described in the next section.

  1. If the PostgreSQL server is running, stop it using the following command:

        # /etc/init.d/pgplus_83 stop
     
  2. As a safeguard, copy the entire cluster data directory from the stopped production server to a temporary location. This will require enough disk space on the system to hold two copies of the existing database. If sufficient disk space is unavailable, you should at the very least make a copy of everything in the pg_xlog subdirectory of the cluster data directory. pg_xlog may contain logs that were not archived before the system was stopped.  For example:
        # mkdir -p /opt/postgres-restore
        # cp -rp /opt/PostgresPlus/8.3/data /opt/postgres-restore/
    
     
  3. Remove all files and subdirectories under the cluster data directory.

        # rm -rf /opt/PostgresPlus/8.3/data/*
     
  4. Using the restored database dump, restore the database files by unpacking the earliest (i.e. the base or level 0) backup image:

        # cd /opt/postgres-restore/
        # ls
        data  zmc_restore_20090327141718  zmc_restore_20090329142314
        # tar xfv zmc_restore_20090327141718
        archive_dir.tar
        data_dir.tar


    The earliest backup image (zmc_restore_20090327141718 in this case) may be removed if space is needed.

    -- Unpack the data_dir.tar file to the database data directory

        # cd /opt/PostgresPlus/8.3/data/
        # tar xf /opt/postgres-restore/data_dir.tar

    -- Ensure that all files and directories are restored with the correct ownership (i.e., owned by the database system user, not by root) and permissions.

    -- Unpack the "archive_dir.tar" file and all remaining incremental PostgreSQL backup images to a temporary archive directory owned by the database system user

      # mkdir /opt/postgres-restore/archive/
      # chown postgres:postgres /opt/postgres-restore/archive/
      # cd /opt/postgres-restore/archive/
      # tar xf /opt/postgres-restore/archive_dir.tar
      # tar xf /opt/postgres-restore/zmc_restore_20090329142314
      # ls
      00000002000000000000004A  00000002000000000000004B


    -- The data_dir.tar, archive_dir.tar and all Postgres backup images may all be deleted now if desired.
     
  5. Purge any logs in pg_xlog/; these came from the backup dump and are unlikely to be current. If pg_xlog/ does not exist, create it now, taking care to re-create it as a symbolic link if the original was configured as such. If you create the pg_xlog/ directory manually, you must also re-create the subdirectory pg_xlog/archive_status/.

        # rm /opt/PostgresPlus/8.3/data/pg_xlog/*
        # rm /opt/PostgresPlus/8.3/data/pg_xlog/archive_status/*
     
  6. If you copied the unarchived WAL segment files as described in step 2, copy them now to pg_xlog/. Copy them rather than move them, in case there is a problem that requires you to start over.

        # rm -rf /opt/PostgresPlus/8.3/data/pg_xlog
        # cp -rp /opt/postgres-restore/data/pg_xlog/ /opt/PostgresPlus/8.3/data/
     
  7. Create the file recovery.conf in the cluster data directory (see the PostgreSQL documentation's Recovery Settings). It is also prudent to modify pg_hba.conf to prevent users from connecting before successful recovery has been verified.

    - Edit /opt/PostgresPlus/8.3/data/recovery.conf to include (at minimum) the following entry, which must specify the path to your temporary archive directory:

        restore_command = 'cp /opt/postgres-restore/archive/%f "%p"'

    - Change the ownership and permissions on this file so that it is owned by the database system user, and that it is only readable and writable by this user

        # chown postgres:postgres /opt/PostgresPlus/8.3/data/recovery.conf
        # chmod 0600 /opt/PostgresPlus/8.3/data/recovery.conf

  8. Start the server, which will automatically begin recovering from the archived WAL files. If the recovery stops on an error, restart the server to continue the recovery after you have corrected the error condition. Upon successful completion of the recovery, the server renames recovery.conf to recovery.done and then starts normal database operations.

         # /etc/init.d/pgplus_83 start
     
  9. Inspect the database to verify that it has been recovered to the expected point in time. If it has not, return to step 1. After the recovery is verified, allow end-user access by restoring pg_hba.conf to its production state. Further details on PostgreSQL point-in-time recovery are available in the PostgreSQL documentation; see Recovering using a Continuous Archive Backup.
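
The recovery.conf created in step 7 can also name a point in time at which WAL replay should stop. The sketch below writes such a file; it is an illustration under stated assumptions, not part of the Zmanda procedure. The helper name write_recovery_conf and the example timestamp are hypothetical, while restore_command and recovery_target_time are PostgreSQL recovery settings described in the PostgreSQL documentation's Recovery Settings.

```shell
#!/bin/sh
# Write a recovery.conf that replays archived WAL from a temporary
# archive directory and optionally stops at a target time.
# write_recovery_conf is a hypothetical helper, not part of Amanda.
write_recovery_conf() {
    conf=$1 archive_dir=$2 target_time=${3:-}
    (
        # Keep a newly created file readable and writable by its owner only.
        umask 077
        printf "restore_command = 'cp %s/%%f \"%%p\"'\n" "$archive_dir" > "$conf"
        # Without a target time, recovery replays all available WAL.
        if [ -n "$target_time" ]; then
            printf "recovery_target_time = '%s'\n" "$target_time" >> "$conf"
        fi
    )
}

# Example (paths from the steps above; the timestamp is made up):
# write_recovery_conf /opt/PostgresPlus/8.3/data/recovery.conf \
#     /opt/postgres-restore/archive '2009-03-29 12:00:00'
```

After writing the file, set its ownership and permissions as shown in step 7 before starting the server.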

 

PostgreSQL Database Recovery Using the amrecover Command

Instead of using the Zmanda Management Console, you can recover PostgreSQL database backups to the client by running the amrecover command on the client. The procedure restores the database to an alternate location or directly to the database directory (default /var/lib/pgsql). PostgreSQL performs the recovery when the service is started. Recovery from WAL can also be performed using recovery.conf as described in the previous section.

The following is an example of recovery from a full backup using the amrecover command. For the complete set of amrecover command options, see the amrecover man page. These steps are valid for Red Hat Enterprise Linux/CentOS. Other platforms will require changes to the PostgreSQL service name and database directory location.

  1. Stop the PostgreSQL service.

# service postgresql stop

  2. Copy the PostgreSQL database directory to a safe location if the data is being restored directly to the database directory.
  3. Remove all the contents of the PostgreSQL database directory (/var/lib/pgsql).
  4. Run amrecover on the backup set containing the PostgreSQL database.

# amrecover <backup set name>

  5. Select the backup image to be restored.

amrecover> setdisk <DLE-name>

  6. Select the name of the backup file.

amrecover> add <name of Postgres backup file>

  7. Switch to the directory to restore to.

amrecover> lcd /var/lib/pgsql

  8. Start the restoration process.

amrecover> extract

  9. Exit the amrecover command.

amrecover> quit

  10. Delete the PostgreSQL backup label file.

# rm /var/lib/pgsql/data/backup_label

  11. [OPTIONAL] This step is required only for point-in-time recovery. Create the file recovery.conf in the data directory (see the PostgreSQL documentation's Recovery Settings). It is also prudent to modify /var/lib/pgsql/data/pg_hba.conf to prevent users from connecting before successful recovery has been verified.

    - Edit /var/lib/pgsql/data/recovery.conf to include (at minimum) the following entry, which must specify the path to your temporary archive directory:

        restore_command = 'cp /var/lib/pgsql/archive/%f "%p"'

    - Change the ownership and permissions on this file so that it is owned by the database system user and is readable and writable only by that user

        # chown postgres:postgres /var/lib/pgsql/data/recovery.conf
        # chmod 0600 /var/lib/pgsql/data/recovery.conf

  12. Start the PostgreSQL service. PostgreSQL will start recovering the database.

# service postgresql start

Troubleshooting

If the checks or backups are failing due to PostgreSQL login problems, check that the pg_hba.conf file (a PostgreSQL configuration file located in the database cluster's data directory) is set up to allow the amandabackup user to log in to the database using the PG-USER and PG-PASSWORD specified. For further information, see the following PostgreSQL documentation:

http://www.postgresql.org/docs/8.3/static/client-authentication.html
http://www.postgresql.org/docs/8.3/static/auth-methods.html#AUTH-IDENT-MAPS