PostgreSQL/ClientServerComm
Protocol
editAll access to data is done by server (or backend) processes, to which client (or frontend) processes must connect to. In most cases instances of the two process classes reside on different hardware, but it's also possible that they run on the same computer. The communication between them uses a PostgreSQL-specific protocol, which runs over TCP/IP or over UNIX sockets. It is implemented in the C library libpq. For every incoming new connection the backend process (sometimes called the postmaster- process) creates a new postgres backend process. This backend process gets part of the PostgeSQL instance, which is responsible for data accesses and database consistency.
The protocol handles the authentication process, client request, server responses, exceptions, special situations like a NOTIFY, and the final regular or irregular termination of the connection.
Driver
editMost client programs - like psql - use this protocol directly. Drivers like ODBC, JDBC (type 4), Perl DBI, and those for Python, C, C++, and much more are also based on libpq.
You can find an extensive list of drivers at the postgres wiki [1] and some more commercial and open source implementations at the 'products' site [2].
Authentication
editClients must authenticate themselves before they get access to any data. This process has one or two stages. During the first - optional - step the client gets access to the server by satisfying the operating system hurdles. This is often realized by delivering a public ssh key. The authentication with PostgeSQL is a separate, independent step using a database-username, which may or may not correlate to an operating system username. PostgreSQL stores all rules for this second step in the file pg_hba.conf.
pg_hba.conf stores every rule in one line, one rule per line. The lines are evaluated from the top of ph_hba.conf to bottom and the first matching line applies. The main layout of these lines is as follows
local DATABASE USER METHOD [OPTIONS] host DATABASE USER ADDRESS METHOD [OPTIONS]
Words in upper case must be replaced by specific values. Lower case words like local and host are key words. They decide, for which kind of connection the rule shall apply: local for clients residing at the same computer as the backend (they use UNIX sockets for the communication) and host for clients at different computers (they use TCP/IP). There is one notable exception. In the former case clients can use the usual TCP/IP syntax --host=localhost --port=5432
to switch over to use TCP/IP. Thus the host syntax applies for them.
DATABASE and USER have to be replaced by the name of the database and the name of the database-user, for which the rule will apply. In both cases the key word ALL is possible to define, that the rule shall apply to all databases and respectively all database-users.
ADDRESS must be replaced by the hostname or the IP adress plus CIDR mask of the client, for which the rule will apply. IPv6 notation is supported.
METHOD is one of the following. The thereby defined rule (=line) applies, if database/user/address combination is the first matching combination in pg_hba.conf.
- trust: The connection is allowed without a password.
- reject: The connection is rejected.
- password: The client must send a valid user/password combination.
- md5: Same as 'password', but the password is encrypted.
- ldap: It uses LDAP as the password verification method.
- peer: The connection is allowed, if the client is authorized against the operation system with the same username as the given database username. This method is only supported on local connections.
There are some more techniques in respect to the METHOD.
Some examples:
# joe cannot connect to mydb - eg. with psql -, when he is logged in to the backend. local mydb joe reject # bill (and all other persons) can connect to mydb when they are logged in to the # backend without specifying any further password. joe will never reach this rule, he # is rejected by the rule in the line before. The rule sequence is important! local mydb all trust # joe can connect to mydb from his workstation '192.168.178.10', if he sends # the valid md5 encrypted password host mydb joe 192.168.178.10/32 md5 # every connection to mydb coming from the IP range 192.168.178.0 - 192.168.178.255 # is accepted, if they send the valid md5 encrypted password host mydb all 192.168.178.0/24 md5
For the DATABASE specification there is the special keyword REPLICATION. It denotes the streaming replication process. REPLICATION is not part of ALL and must be specified separately.
References
edit