PostgreSQL Çorbası: Ağustos 2019

Giriş
Açıklaması şöyle.

PostgreSQL has two important files where most of the developer-friendly configurations are defined: "postgresql.conf" and "pg_hba.conf". You can find these files under "C:\Program Files\PostgreSQL\11\data" or incase of Linux "/var/lib/pgsql/11/data/".

Not : pg_hba.conf Dosyası yazısına bakabilirsiniz.

autovacuum Alanı

Örnek

Şöyle yaparız

autovacuum = on
autoanalyze = on

Açıklaması şöyle

1. Autovacuum ensures that space occupied by deleted or outdated data is reclaimed, preventing unnecessary bloat and optimizing query execution.
2. Auto analyze updates table statistics, enabling the query optimizer to generate accurate query plans. By configuring and monitoring these processes, you can prevent performance degradation due to excessive table bloat or outdated statistics.

autovacuum_freeze_max_age Alanı

Açıklaması şöyle

This is the max age (in no. of transactions) a table’s pg_class.relfrozenxid can attain before vacuum is forced in order to prevent XID wraparound.

Default is 200m which is ~10% of the possible 2³¹ values (wraparound). Now 10% may seem too less and one might be tempted to increase it. But that would just delay the inevitable forced vacuum process further (thereby also increasing the time it’d take to complete).

The right value can be found out based on your transaction burn rate i.e. how fast the XIDs will be used up. But we might just be able to get away with the default value itself if other things are configured correctly (as explained a bit later).

vacuum_freeze_min_age

Açıklaması şöyle

The cutoff age (in transactions) that vacuum should use to decide whether to freeze row versions while scanning a table.

Default is 50m which means all XIDs that are older than 50m will be frozen and the youngest XID in the table will have an age of 50m. This is very very conservative number. Consider transaction advancement rate of 50m in 24 hours. Unless we have transactions or queries that run for 24 hours, do we really need that old XIDs to be present? Another way to think about it is — even after vacuum is run, we are only 50m transactions away from the next forced vacuum. From a practical standpoint, this value can be reduced by a lot (considering the default 200m autovacuum_max_freeze_age).

The right value can be decided based on two things — the rate of transactions and the duration of longest running query/transaction served by the cluster.

A vacuum freeze instead of plain vaccumimplicitly sets this value to 0 i.e. XID on all rows will be frozen.

vacuum_freeze_table_age

Açıklaması şöyle

If the table’s pg_class.relfrozenxid value reaches this threshold, then vacuum will aggressively scan the tuples i.e. it will visit every page that may contain unfrozen tuples, not just the ones that contain dead tuples. This is provided because normally, vacuum uses an optimization to avoid scanning all pages. The visibility_map tells which pages may contain dead tuples and vacuum can only scan those.

Default value is 150m. Recommended is 80% of autovacuum_freeze_max_age

Again, a vacuum freeze instead of plain vaccumimplicitly sets this value to 0 ie. XID on all rows will be frozen.

autovacuum_vacuum_scale_factor Alanı

Varsayılan değer 0.2. Tablodaki satırların %20'sı MVCC yüzünden atıl hale gelirse auto vacuum başlar. Yazma işlemi daha fazla olan tablolarda autovacuum_vacuum_scale_factor daha düşük bir değer olabilir

Örnek

Şöyle yaparız

# Increase Shared Memory
shared_buffers = 4GB

# Optimize Disk I/O
random_page_cost = 1.1

# Tune Autovacuum Settings
autovacuum_vacuum_scale_factor = 0.1

commit_delay Alanı

Açıklaması şöyle

PostgreSQL has a commit_delay configuration where transaction commits are delayed for a configured amount. During this delay, more commits can arrive and be batched in the WAL memory (write-ahead log). If delay elapses, the database flushes the WAL changes from all batched transactions once saving multiple disk I/Os.

Örnek

Şöyle yaparız

# postgresql.conf
# Set commit_delay to 10 milliseconds
commit_delay = 10ms
# Set wal_writer_delay to 5 milliseconds
wal_writer_delay = 5ms

veya şöyle yaparız

ALTER SYSTEM SET commit_delay TO '30ms';
ALTER SYSTEM SET wal_writer_delay TO '200ms';

deadlock_timeout Alanı

Açıklaması şöyle

In PostgreSQL, there's a configuration parameter called deadlock_timeout that sets the time to wait on a lock before checking for a deadlock. If the system detects a deadlock, it rolls back one of the transactions and returns an error.

effective_io_concurrency Alanı

Açıklaması şöyle

This parameter determines the maximum number of parallel I/O requests that a PostgreSQL instance can execute simultaneously, which in turn affects how efficiently the database can retrieve data from disk.
...
If your storage system has multiple disks, you may want to set the parameter to a value greater than 1 to take advantage of parallelism, resulting in improved I/O performance. However, if your storage system has a single disk, it is recommended to set the parameter to 1 to avoid the overhead of parallel I/O.

keystore_location Alanı

keystore_location ve tablespace_encryption_algorithm birlikte kullanılarak "Tablespace-Level Encryption" yapılabilir.

listen_addresses Alanı
İlk kurulumda sadece localhost'tan bağlantıya izin veriliyor. Aslında benzer bir yöntem diğer veri tabanları tarafından da takip ediliyor. Örneğin MySql için açıklama şöyle

Usually, mysql server is accessed only from the same machine, in which case you can set it to listen only on a unix socket, or on a loopback interface.

If it indeed needs to be accessed from other machines, these are generally just a few ones, in which case you can firewall the port from any host but those that legitimately need access to it. (Moreover, if all mysql users are host-restricted, mysql itself won't allow connecting from hosts different than those)

Sadece localhost'tan bağlanmak genellikle yetersiz olduğu için değiştirmek gerekiyor. Açıklaması şöyle

By default, PostgreSQL DB server listen address is set to the 'localhost' , and we need to change it so it accepts connection from any IP address; or you can use comma separated list of addresses.

Genellikle "*" değeri atanarak makinedeki her ağ arayüzünden bağlanılabilmesi şeklinde değiştiriliyor.

Şöyle yaparız

listen_addresses = ‘*’

Açıklaması şöyle.

Make PostgreSQL listen on all IP Addresses. This is mandatory if you want to connect or debug remotely. Search for "listen_addresses" and use '*' as its value. You can also change the port number (default 5432)

1. Eğer host based authentication yöntemi kullanılıyorsa pg_hba.conf dosyasında da benzer bir değişiklik yapmak gerekiyor. Bu dosyada hangi istemcilerin veri tabanı sunucusuna bağlanabileceği belirtiliyor.

2. Bu değişiklikten sonra veri tabanı sunucusunu tekrar başlatmak gerekiyor.

Örnek

Şöyle yaparız

# /etc/init.d/postgresql restart

log_min_duration_statement

Açıklaması şöyle

log_min_duration_statement: just logging the statements whose duration is more than this value in milliseconds. This is very useful for detecting slow queries.

Açıklaması şöyle

It sets the minimum execution time in milliseconds (ms) above which all statements will be logged.

The default value for the log_min_duration_statement parameter is -1, which disables logging statements.

Setting the PostgreSQL parameter log_min_duration_statement to 0 will print all statements' durations.

Örnek

Şöyle yaparız

# set log_min_duration_statement=1;
SET

# show log_min_duration_statement;
 log_min_duration_statement
----------------------------
 1ms
(1 row)

PostgreSQL log dosyalarındaki statement log cümleleri şöyledir

LOG:  duration: 5.477 ms  statement: insert into abc values(1);
LOG:  duration: 8.656 ms  statement: insert into abc values(2);

log_statement_sample_rate

log_destination

Bu alanlar için Audit Trails yazısına bakabilirsiniz

log_destination Alanı

PostgreSQL 15 için açıklama şöyle

PostgreSQL supports several methods for logging server messages, including stderr (default), csvlog, and syslog. With this release, jsonlog is also supported which is convenient for exporting logs and debugging. In order to enable this, add the jsonlog under the log_destination inside the postgresql.conf file.

This will be a nice feature to have with exports to logging and monitoring tools like HoneyComb and DataDog etc.

max_connections Alanı
Açıklaması şöyle.

Maximum connections to DB, default is 100. Based on your application's requirement, you can set this number to a higher value. However, it is recommended to use a connections pool to safeguard from database overhead.

PostgreSQL thread kullanmıyor, process kullanıyor. Bu yüzden şeklen şöyle

Örnek

Şöyle yaparız

listen_addresses = '*'
port = 5432
max_connections = 20
shared_buffers = 128MB
temp_buffers = 8MB
work_mem = 4MB
wal_level = logical
max_wal_senders = 3

max_parallel_workers_per_gather Alanı

Açıklaması şöyle

PostgreSQL allows parallel query execution to utilize multiple CPU cores for computationally intensive queries. The max_parallel_workers_per_gather setting controls the maximum number of worker processes that can be used per query.

Örnek

Şöyle yaparız

ALTER SYSTEM SET max_parallel_workers_per_gather = 4;

work_mem Alanı

Açıklaması şöyle

PostgreSQL uses the work_mem configuration parameter to determine the amount of memory available for sorting and hashing operations. Properly configuring work_mem can greatly impact the performance of queries involving sorting or hash joins.

Configuring the work_mem parameter in PostgreSQL is crucial for optimizing query performance, but it requires finding the right balance. Increasing work_mem on high-performance systems can speed up sorting and hashing operations, leading to faster query execution.

However, setting it too high might cause excessive memory usage and potential memory contention, impacting overall performance. To determine the ideal work_mem value, start conservatively and monitor system performance while gradually increasing it. Keep a close watch on memory consumption and query execution times. Additionally, consider using partitioned queries or limiting result set sizes for resource-intensive queries to prevent excessive memory usage. Striking the right balance ensures PostgreSQL operates efficiently without overwhelming system memory and maximizes the benefits of work_mem for improved query performance.

Örnek

Şöyle yaparız

SET work_mem = "128MB";

shared_buffers Alanı

PostgreSQL Shared Buffer yazısına taşıdım

wal_level Alanı

Write-Ahead Log yazısına taşıdım

wal_writer_delay Alanı

Write-Ahead Log yazısına taşıdım

Örnek

Şöyle yaparız

max_wal_size = 8GB
min_wal_size = 512MB
checkpoint_completion_target = 0.9

PostgreSQL Çorbası

22 Ağustos 2019 Perşembe

pg_dumpall komutu

20 Ağustos 2019 Salı

High Cardinality - Öğe Sayısı

19 Ağustos 2019 Pazartesi

postgresql.conf Dosyası