Thursday, January 24, 2019

PostgresSQL Architecture


PostgresSQL Architecture 



Postgresql Architecture 

When PostgreSQL starts-
The Postmaster starts first and allocates the shared memory
 It also accepts connections and spins off a backend for each new connection. So each backend (server process) gets its pointers to shared memory from the postmaster.
It is disastrous if the postmaster dies with backends still running, so we have it do as little as possible, so that there isn't as much which can crash it.
Postgres does have a pool of shared memory;
however, it does not have a library or dictionary cache stored in that memory (This means that statements do need to be parsed and planned every time they are entered. If parse/plan overhead is an issue, we suggest the use of prepared statements.
While Oracle is able to avoid the repeated parse/plan overhead, it must still do enough analysis of the query to determine whether the information is present in the library cache, which also consumes some time and CPU resources. The parser is quite lightweight, so we feel that the overhead of parsing the query each time is acceptable.
1.             Shared Memory:
----------------------
SHARED BUFFERS –
The biggest chunk of shared memory is shared_buffers.
When pages from a table or index are read from the OS, they are read into shared_buffers, and the backends reference the pages and their contents right there in shared memory.
An exception is temporary tables, where (since only the creating backend can reference the temp table) data is accessed in temp_buffer space as much as possible(temp_buffers is separate. It is not in shared memory). It's faster to access process-local memory like that because you don't need to worry about pinning or locking the data, since you are not sharing it.
WAL BUFFERS -- are for buffering data to be written to the WAL files.
CLOG BUFFERS -- are one of the SLRU-style buffers oriented toward circular "rings" of data, like which transaction numbers have been committed or rolled back.
LOCK SPACE -- Memory structures in shared memory are generally protected by "lightweight" locks, which are in shared memory. Tables are protected by "heavyweight" locks which are also in shared memory (and themselves protected by lightweight locks). Of course, lightweight locks are protected by spinlocks. It gets very complicated and fussy. :-)
OTHER BUFFERS -- are probably mostly SLRU buffers besides CLOG (which was the first user of the SLRU system). SLRU is good for data where you mostly want to use recently accessed data and then you are done with it relatively quickly.
The opposite of shared memory is process-local memory -- only the one process that allocates it can access it.
Each SLRU system has a separate subdirectory. Shared memory is memory that all of the backend server processes can directly access. To prevent chaos, access to shared memory must follow some rules which tends to make it a little slower, like locking areas of memory a process will be using. Process-local memory is allocated by one backend server process, and the other backend server processes can't see it or use it, so it's faster to access, and no worries about another process trashing it while you're using it.
Getting back to the shared memory uses I hadn't talked about, CLOG buffers and SLRU buffers like multixact, notify, subtrans, serial, etc. use buffers in memory for recently accessed data, but spill to disk in those subdirectories beyond a limited number of buffers.

2. Utility Processes:

---------------------
With a default configuration I see the postmaster, the checkpointer process, the writer process, the wal writer process, the autovacuum launcher process, and the stats collector process. I think you will see more processes running if you turn on archiving or streaming replication. You might also get a process for writing the server log, depending on configuration. As their name say, WRITER -- process is responsible to write the dirty buffers to data files, CHECKPOINTER -- process is for checkpoint, WAL WRITER -- is for writing the dirty buffers in WAL buffers to WAL files, AUTOVACUUM LAUNCHER -- process lauches autovacuum when require(depends on your autovacuum settings in postgresql.conf file) and STATS COLLECTOR -- process to collect the statistics of objects in the database require by Optimizer to improve the performance.

The checkpointer process is responsible for creating safe points from which a recovery can begin; the background writer tries to keep some pages available for re-use so that processes running queries don't need to wait for page writes  in order to have free spots to use in shared buffers. Both checkpointer and writer processes writes to the same files, however the checkpointer writes all data that was dirty as of a certain time (the start of the checkpoint) regardless of how often it was used since dirtied, and the background writer writes data that hasn't been used recently, regardless of when it was first dirtied.  Neither knows or cares whether the data being written was committed, rolled back, or still in progress.

Follow link Taken artcile from - Memory Structure

PostgreSQL is a client/server type relational database management system with the multi-process   architecture and runs on a single host.A collection of multiple processes cooperatively managing one  database cluster is usually referred to as a 'PostgreSQL server', and it contains the following types of   processes:

·       A postgres server process is a parent of all processes related to a database cluster management.

·       Each backend process handles all queries and statements issued by a connected client.

·       Various background processes perform processes of each feature (e.g., VACUUM and CHECKPOINT processes) for database management.


·       In the replication associated processes, they perform the streaming replication. In the background worker process supported from version 9.3, it can perform any processing implemented by users.  


An example of the process architecture in PostgreSQL.


This figure shows processes of a PostgreSQL server: a postgres server process, two backend processes, seven background processes, and two client processes. The database cluster, the shared memory, and two client processes are also illustrated.



Memory Architecture

Memory architecture in PostgreSQL can be classified into two broad categories:
Local memory area – allocated by each backend process for its own use.
Shared memory area – used by all processes of a PostgreSQL server.



Article taken from - http://bajis-postgres.blogspot.com
                         


No comments: