nblug : augie : dba : postgresql

Database Administration

The Basics

Connecting to and Disconnecting from the Server
Creating and Using a Database
Creating a Table

The Basics

These are some of the basics of using the PostgreSQL DBMS (Database Management System), the lessons learned here should help you function throughout the rest of this tutorial.

Connecting to and Disconnecting from the Server is easy to do. You may pass as many or as few arguments to psql as you want.

psql postgres

This should be all you need to connect to the database postgres (assuming it exists). You may be more specific in your connection by using the --host, --username, --dbname options. If you don't specify these then psql will use localhost as your default host, and your Linux username for both username and dbname.

Once you are connected you will be greeted with something like this:

Welcome to psql, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help on internal slash commands
       \g or terminate with semicolon to execute query
       \q to quit

postgres=#

Depending on your privileges that last line could also be:

postgres=>

The prompt tells you it is ready for your commands, use \q or Ctrl-D to quit.

Creating and Using a Database. You may use the SQL statement CREATE DATABASE to create a new database if you want, but PostgreSQL (as you will see later) has many shell commands which accomplish the very same thing. To create a new database from the shell use the command:

$ createdb tutorial

You should get a response like this:

CREATE DATABASE

You may now connect and use your new database like so:

psql tutorial

Creating a Table. After connecting to your database you may create a new table like so:

mysql> CREATE TABLE cheese (
name VARCHAR(15) NOT NULL,
weight INT NOT NULL,
PRIMARY KEY(name));

Configuring PostgreSQL

postgresql.conf is the main configuration file for the PostgreSQL DBMS. It will be located in your data directory (e.g. /usr/local/pgsql/data) on my Mandrake 8.2 system this is /var/lib/pgsql/data. The setup of the postgresql.conf file is similar to other config files, options are one per line. Here is an example of what a file may look like.

# this is a comment

#what port to run on
port = 5432

#hostname or address on which the postmaster is to listen for connections from client applications
virtual_host = 'superfoo.org'

#prints a line informing about each successful connection to the server log
log_connections = true

#prefixes each server log message with a timestamp
log_timestamp = true

#determines how many concurrent connections the database server will allow
max_connections = 32

The above are just a sample of what i thought were the most interesting options. For a full list of options and there meanings see the PostgreSQL documentation here. Your instillation method probalbly came with a config file already in your data directory, and you may also have an example config file (e.g. /usr/share/pgsql/postgresql.conf.sample).

A second way to set these configuration parameters is to give them as a command line option to the postmaster, such as:

postmaster -c log_connections=true log_timestamp=true

Command-line options override any conflicting settings in postgresql.conf.

There are also some neat environment variable you can set which may cut down on your typing time: PGPORT, PGHOST, PGDATABASE, PGUSER, and PGPASSWORD.

Security

Traditionally the user postgres is the owner of all PostgreSQL files and processes. If you installed from a package then your package most likely has already created a user named postgres for you, if not or if you installed form source please create a user named postgres. This is akin to running Apache as the user apache.

Client Authentication is controlled by the file pg_hba.conf in the data directory, e.g., /var/lib/pgsql/data/pg_hba.conf. (HBA stands for host-based authentication.)

The general format of the pg_hba.conf file is of a set of records, one per line. Blank lines and lines beginning with a hash character ("#") are ignored. A record is made up of a number of fields which are separated by spaces and/or tabs. Records cannot be continued across lines.

Each record specifies a connection type, a client IP address range (if relevant for the connection type), a database name or names, and the authentication method to be used for connections matching these parameters. The first record that matches the type, client address, and requested database name of a connection attempt is used to do the authentication step. There is no "fall-through" or "backup": if one record is chosen and the authentication fails, the following records are not considered. If no record matches, the access will be denied.

A record is typically in one of two forms:

local   database authentication-method [ authentication-option ]
host    database IP-address IP-mask authentication-method [ authentication-option ]

The meaning of most of these fields is as follows: (for the full story check out the pg_hba.conf documentation.)

local
This record pertains to connection attempts over Unix domain sockets.

host
This record pertains to connection attempts over TCP/IP networks.

database
Specifies the database that this record applies to. The value all specifies that it applies to all databases, while the value sameuser identifies the database with the same name as the connecting user. Otherwise, this is the name of a specific PostgreSQL database.

authentication method
trust: The connection is allowed unconditionally. This method allows any user that has login access to the client host to connect as any PostgreSQL user whatsoever.

reject: The connection is rejected unconditionally.

password: The client is required to supply a password which is required to match the database password that was set up for the user. An optional file name may be specified after the password keyword. This file is expected to contain a list of users who may connect using this record, and optionally alternative passwords for them (in the /etc/shadow form, i.e. username:password). The password is sent over the wire in clear text.

md5: Like the password method, but the password is sent over the wire encrypted using a simple challenge-response protocol. The name of a file may follow the md5 keyword. It contains a list of users who may connect using this record. However you may not give an alternative password with this option. Which means whatever password the user gives must match their password in the pg_shadow system table.

ident: This method uses the "Identification Protocol" as described in RFC 1413. It may be used to authenticate TCP/IP or Unix domain socket connections, but its reccomended use is for local connections only and not remote connections. See the Caveats section for more info on why. This authentication method will use the login name of the OS you are currently logged into to authenticate you. You may follow the ident keyword with the name of an ident map for mapping OS login names with PostgreSQL login names. See the Authentication Methods documentation for more info on how to accomplish this.

The pg_hba.conf file is read on start up of the PostgreSQL service, so if you edit the file on an active system you will need to restart the service to make it re-read the file. Below is an example pg_hba.conf file:

# TYPE       DATABASE    IP_ADDRESS    MASK               AUTHTYPE  MAP

# Allow any user on the local system to connect to any
# database under any username, but only via an IP connection:

host         all         127.0.0.1     255.255.255.255    trust

# The same, over Unix-socket connections:

local        all                                          trust

# Allow any user from any host with IP address 192.168.93.x to
# connect to database "template1" as the same username that ident on that
# host identifies him as (typically his Unix username):

host         template1   192.168.93.0  255.255.255.0      ident     sameuser

# Allow a user from host 192.168.12.10 to connect to database "template1"
# if the user's password in pg_shadow is correctly supplied:

host         template1   192.168.12.10 255.255.255.255    md5

# In the absence of preceding "host" lines, these two lines will reject
# all connection attempts from 192.168.54.1 (since that entry will be
# matched first), but allow Kerberos V5-validated connections from anywhere
# else on the Internet. The zero mask means that no bits of the host IP
# address are considered, so it matches any host:

host         all        192.168.54.1   255.255.255.255    reject
host         all        0.0.0.0        0.0.0.0            krb5

# Allow users from 192.168.x.x hosts to connect to any database, if they
# pass the ident check.  If, for example, ident says the user is "bryanh"
# and he requests to connect as PostgreSQL user "guest1", the connection
# is allowed if there is an entry in pg_ident.conf for map "omicron" that
# says "bryanh" is allowed to connect as "guest1":

host         all        192.168.0.0    255.255.0.0        ident     omicron

# If these are the only two lines for local connections, they will allow
# local users to connect only to their own databases (database named the
# same as the user name), except for administrators who may connect to
# all databases.  The file $PGDATA/admins lists the user names who are
# permitted to connect to all databases.  Passwords are required in all
# cases.  (If you prefer to use ident authorization, an ident map can
# serve a parallel purpose to the password list file used here.)

local        sameuser                                     md5
local        all                                          md5  admins

User Management

Database Users are seperate from operating system users. In practice it might be convenient to maintain a correspondence, but this is not required.

To create a user you may use the SQL commands:

CREATE USER username

And to drop a user it is:

DROP USER username

For convenience, the shell scripts createuser and dropuser are provided as wrappers around these SQL commands.

A user's attributes can be modified after creation with the ALTER USER SQL command.

Groups are a way of logically grouping users to ease management of permissions: permissions can be granted to, or revoked from, a group as a whole. To create a group, use:

CREATE GROUP name

To add users to or remove users from a group, use:

ALTER GROUP name ADD USER uname1, ...
ALTER GROUP name DROP USER uname1, ...

Privileges. There are three main types of privileges you may GRANT or REVOKE to your users: SELECT (read), INSERT (append), and UPDATE (write). (See the GRANT manual page for more information.) So to grant write access to a table named facilities to a user named frank_grimes you would issue the following:

GRANT UPDATE ON facilities TO frank_grimes;

To grant a privilege to a group, use:

GRANT SELECT ON facilities TO GROUP staff;

The special username PUBLIC may be used to grant privileges to all users on the system, and using ALL in place of a specific privilege will grant all privileges to that user or group.

To revoke a privilege use the REVOKE command:

REVOKE ALL ON facilites FROM PUBLIC;

Backup and Restore

pg_dump is the utility for backing up PostgreSQL databases. The basic usage is:

pg_dump dbname > outfile

pg_dump will write to STDOUT the SQL commands needed to reconstruct your database.

The text files created by pg_dump are intended to be read in by the psql program. The general command form to restore a dump is:

psql dbname < infile

pg_dumpall backs up each database in a given cluster and also makes sure that the state of global data such as users and groups is preserved. The basic usage is as follows:

pg_dumpall > outfile

The resulting dump is restored in the same manner as pg_dump output.

Because these utilities write their output to STDOUT you can manipulate the output with other favorite tools. Here are some examples originally written by Hannu Krosing (hannu@trust.ee):

Use compressed dumps. Use your favorite compression program, for example gzip.

pg_dump dbname | gzip > filename.gz

Reload with

createdb dbname

gunzip -c filename.gz | psql dbname

or

cat filename.gz | gunzip | psql dbname

Use split. This allows you to split the output into pieces that are acceptable in size to the underlying file system. For example, to make chunks of 1 megabyte:

pg_dump dbname | split -b 1m - filename

Reload with

createdb dbname

cat filename* | psql dbname

Caveats

Client Authentication.
The password authentication method is a great way to migrate an existing user base to PostgreSQL because you can just cut and paste your /etc/shadow values into a file in the same directory as your pg_hba.conf file, and their existing passwords will work for their PostgreSQL accounts. The caveat here is of course that the password flies over the wire in plain-text, there are solutions though to wrap PostgreSQL connections with SSL or SSH which can give you a higher level of security.

The preferred client authentication method is md5 which will transport your password in an encrypted form. The caveat is that you may not specify an alternative password, so no more cut and paste from /etc/shadow. The password the client gives must match the password stored in the PostgreSQL system table pg_shadow.

The reason why you only want to use the ident method of authentication for local connections and never for remote connections is because this procedure depends on the integrity of the client. A client may create a postgres or root account and would then be authenticated as such. Even if you trust the integrity of the client another reason not to use ident may be that a previously untrusted client will spoof their IP to match that of a trusted client thus gaining access.

Disclaimer: Most of this information is taken directly from the excelent PostgreSQL documentation located here:
http://www.postgresql.org/idocs/

Author: August Schwer augie@nblug.org