Wednesday, December 27, 2023

ROW LEVEL SECURITY - CREATE POLICY or DROP POLICY

Introduction
The explanation is as follows
Row Level Security (RLS) lets you define access control policies at the row level within tables. This means that you can specify which users can SELECT, INSERT, UPDATE, or DELETE rows based on the actual data within the row.
Example
Suppose we have a table like this
CREATE TABLE client (
    client_id INT PRIMARY KEY,
    client_name VARCHAR(255) NOT NULL,
    email VARCHAR(255),
    registration_date DATE DEFAULT CURRENT_DATE
);
To enable ROW LEVEL SECURITY we do this
ALTER TABLE client ENABLE ROW LEVEL SECURITY;
To drop a previously created policy we do this
DROP POLICY IF EXISTS client_isolation_policy ON client;
Then we do this
CREATE POLICY client_isolation_policy ON client
  USING (current_user = CONCAT('appuser_', client_id::text));
If 'appuser_' plus the client_id column value equals current_user, that row is shown to the user.

Let's create a new user
CREATE USER appuser_1001 WITH ENCRYPTED PASSWORD 'password';

GRANT CONNECT ON DATABASE sample_db TO appuser_1001;

GRANT USAGE ON SCHEMA public TO appuser_1001;

GRANT SELECT ON ALL TABLES IN SCHEMA public TO appuser_1001;

GRANT INSERT, UPDATE ON ALL TABLES IN SCHEMA public TO appuser_1001;
A superuser such as postgres is not affected and can see everything (table owners also bypass RLS unless FORCE ROW LEVEL SECURITY is set on the table).
select * from client c ;

client_id|client_name|email          |registration_date|
---------+-----------+---------------+-----------------+
     1001|Client 1001|abc@example.com|       2023-10-17|
     1002|Client 1002|abc@example.com|       2023-10-17|
The appuser_1001 user can only see the rows whose client_id is 1001.
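To test this from a superuser session without reconnecting, we can switch to the new role first; current_user then becomes appuser_1001 and the policy applies:
SET ROLE appuser_1001;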
select * from client c ;

client_id|client_name|email          |registration_date|
---------+-----------+---------------+-----------------+
     1001|Client 1001|abc@example.com|       2023-10-17|

Column Types - ULID - Universally Unique Lexicographically Sortable Identifier

I first saw this type in the article Performance of ULID and UUID in Postgres Database. The explanation is as follows
ULIDs contain a 48-bit timestamp followed by 80 bits of random data. The timestamp allows ULIDs to be sorted, which works much better with the b-tree data structure databases use for indexing.
UUIDs cause a performance loss because their randomness scatters writes across the b-tree index. The other options are:
ULIDs
Snowflake
Composite keys

Monday, December 25, 2023

Checkpointing

Introduction
Checkpointing, in short, means actually applying the changes recorded in the WAL to the data page files.

The explanation is as follows
The question here is: can I control how data pages are flushed to disk? And the answer is yes, this is called checkpointing, where a new WAL record called a checkpoint record is written after the entire set of data pages in memory is flushed to disk. This is a very delicate process; too much checkpointing sure makes startup faster, but it does take a toll on your I/O, as checkpointing is an I/O-heavy operation and, as we mentioned before, data pages are very large.
The explanation is as follows. Pages are written to disk and that moment is recorded
Technically speaking we are consistent in memory, because we write to the pages as we write the WAL, and because we always read and write from and to memory first we get the latest and greatest. So data file and index pages are up to date in memory but they are out of date on disk.

So in the happy path, I issue a commit, my WAL is flushed successfully to disk, and the database returns OK to the client (me) who issued the commit. So the WAL is updated. Then moments later the database starts flushing my full pages to disk, and now both the data files and the WAL are in sync on disk. We call this a checkpoint, and the process that creates it is checkpointing. The database will actually create a checkpoint record that says: I hereby declare that at this moment in time, everything written to the WAL is completely in sync with what we have in the data files on disk.
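A checkpoint can also be triggered manually, and the checkpoint-related settings can be inspected through pg_settings; a small sketch:
CHECKPOINT;  -- forces an immediate checkpoint (superuser, or the pg_checkpoint role on newer versions)

SELECT name, setting, short_desc
FROM pg_settings
WHERE name LIKE '%checkpoint%';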
Undo logs
The explanation is as follows. If the database crashes midway through checkpointing, some of the dirty pages in memory will have been written to disk and some will not.
Ok. Here is another edge case for you: I flush my WAL and commit, a checkpointer process kicks in and starts flushing dirty pages to disk (dirty pages are pages in memory that have been written to), and midway through checkpointing the database crashes!

You now have half of the data pages updated and half of them out of date with respect to the WAL. The WAL only has the changes; it has no clue what the original rows looked like. So for that we need another log that records the original state of the row before we made the change, and that is the undo log.

Now with the undo log, the database first checks for transactions that started but didn't commit and rolls those back; any changes made by those transactions must be discarded from the WAL. Any changes that made it to the data files MUST be undone so we go back to a consistent state where all rows in the data pages are trusted and committed. And finally we redo the WAL changes on those original pages to bring them back in sync with the WAL.

The undo logs are also used by transactions that need MVCC visibility to the old version of the row. This means a long-running transaction that makes massive changes may slow down repeatable read/serializable isolation level read transactions, as those have to go to the undo log and restore the original row before use.

Postgres's clever architecture can get away without undo logs entirely; I explain it here.

Thursday, December 14, 2023

FOR LOOP

Example - Batch Deletion
We do it like this
BEGIN; -- BEGIN A TRANSACTION

DO $$
DECLARE
  batch RECORD;
BEGIN
  -- FOR .. LOOP is PL/pgSQL, so it must run inside a DO block (or a function)
  FOR batch IN
    SELECT id FROM large_table
    ORDER BY id ASC
    LIMIT 10000 -- 10k records in a batch
  LOOP
    DELETE FROM large_table
    WHERE id = batch.id;
  END LOOP;
END $$;

COMMIT;  -- COMMITTING THE TRANSACTION
The explanation is as follows
With batch deletion, we’re gently removing a bunch of records in each go. This not only keeps our database humming but also reduces the risk of causing chaos. ALWAYS RUN THIS IN A TRANSACTION AND COMMIT IT TOGETHER
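As a side note, the same batch can be deleted without a loop using a single set-based statement; a sketch under the same table assumptions:
DELETE FROM large_table
WHERE id IN (
  SELECT id FROM large_table
  ORDER BY id ASC
  LIMIT 10000
);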

Thursday, December 7, 2023

pg_partman

Introduction
The explanation is as follows
pg_partman is an extension for PostgreSQL that provides automatic partitioning of tables. 
...
pg_partman simplifies the process of managing partitioned tables by automating the creation and maintenance of partitions. It includes features to handle both time-based and serial-based partitioning. This extension is particularly useful for tables that grow over time and need to be partitioned based on date or other criteria.
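A minimal sketch of how it is used, assuming pg_partman 4.x and a hypothetical events table partitioned natively by day:
CREATE SCHEMA partman;
CREATE EXTENSION pg_partman SCHEMA partman;

CREATE TABLE public.events (
  id bigint,
  created_at timestamptz NOT NULL
) PARTITION BY RANGE (created_at);

-- pre-creates child partitions and registers the table for ongoing maintenance
SELECT partman.create_parent(
  p_parent_table := 'public.events',
  p_control := 'created_at',
  p_type := 'native',
  p_interval := 'daily'
);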


Parallel Sequential Scan

Introduction
It is like a Sequential Scan, except that it is performed by multiple workers in parallel

Example
We do it like this. Since the table has no index, a full table scan is performed
CREATE TABLE t1 (
    a   int,
    b   int
);
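The plan below assumes the table holds about a million rows; a sketch that loads matching data (the distribution is an assumption):
INSERT INTO t1 SELECT i % 10000, i FROM generate_series(1, 1000000) i;
ANALYZE t1;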
test=# explain analyze select count(*) from t1 where a=9999;
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=10633.65..10633.66 rows=1 width=8) (actual time=43.930..47.172 rows=1 loops=1)
   ->  Gather  (cost=10633.44..10633.65 rows=2 width=8) (actual time=43.766..47.164 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=9633.44..9633.45 rows=1 width=8) (actual time=34.394..34.395 rows=1 loops=3)
               ->  Parallel Seq Scan on t1  (cost=0.00..9633.33 rows=42 width=0) (actual time=34.373..34.385 rows=33 loops=3)
                     Filter: (a = 9999)
                     Rows Removed by Filter: 333300
 Planning Time: 0.313 ms
 Execution Time: 47.250 ms
(10 rows)

Pghero

Introduction
The explanation is as follows
Pghero is a performance dashboard and monitoring tool for PostgreSQL. It provides an easy-to-use web interface to track database metrics, query performance, and index usage. Pghero helps you identify slow queries, analyze database trends, and optimize your PostgreSQL database for better performance.

Autovacuum Daemon

Introduction
The explanation is as follows
Postgres automates the VACUUM processing using the autovacuum daemon. By default, it runs every 1 minute. When autovacuum wakes up, it invokes three workers. These workers do the VACUUM processing on the target tables.

You can query pg_settings to check various configurations for the autovacuum process in Postgres:
We do it like this
SELECT
  name,
  setting,
  category,
  short_desc
FROM pg_settings
WHERE name LIKE '%autovacuum%';
The autovacuum_vacuum_scale_factor Field
The explanation is as follows
The default value is 0.2: autovacuum starts once 20% of a table's rows have become dead due to MVCC. For tables with heavier write traffic, autovacuum_vacuum_scale_factor can be set to a lower value.
Example
We do it like this
ALTER TABLE SampleTable2 SET
  (autovacuum_vacuum_scale_factor = 0, autovacuum_vacuum_threshold = 100);
The explanation is as follows
 ... we set up autovacuum for SampleTable2 to run once it has more than 100 DEAD tuples.
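To see which tables are actually accumulating dead tuples and when autovacuum last ran on them, we can query pg_stat_user_tables:
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;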

Monday, December 4, 2023

Soft Delete

Introduction
The explanation is as follows
When using the soft delete mechanism on the database, you might run into a situation where a record with a unique constraint was deleted and you want to reuse the unique value.
Example
Suppose we have a table like this
CREATE TABLE users (
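  -- note: uuid_generate_v7() is not built into Postgres; it assumes a UUIDv7 extension such as pg_uuidv7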
  user_id UUID PRIMARY KEY DEFAULT uuid_generate_v7(),
  email TEXT NOT NULL,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
  updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
  deleted_at TIMESTAMP WITH TIME ZONE,
  created_by UUID NOT NULL,
  updated_by UUID NOT NULL,
  FOREIGN KEY (created_by) REFERENCES users(user_id),
  FOREIGN KEY (updated_by) REFERENCES users(user_id),
  is_system SMALLINT DEFAULT 0 NOT NULL CHECK (is_system BETWEEN 0 AND 1)
);
The explanation is as follows
Whether a row should be created that has the same natural key as a ‘deleted’ row depends on the table. If this is desirable, unique indexes should be written with a WHERE clause, for example:
We do it like this. That way, once a row is soft-deleted, another user with the same email value can be created
CREATE UNIQUE INDEX users_email_unique
  ON users(email, deleted_at)
  WHERE deleted_at IS NULL;
Or we do it like this. Here the email field is treated as case insensitive, so a case insensitive collation is used
CREATE UNIQUE INDEX users_email_unique
  ON users(email COLLATE case_insensitive)
  WHERE deleted_at IS NULL;
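For the second index to work, the case_insensitive collation has to be created first; the usual ICU definition from the PostgreSQL documentation is:
CREATE COLLATION case_insensitive (
  provider = icu,
  locale = 'und-u-ks-level2',
  deterministic = false
);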


Sunday, December 3, 2023

CREATE TEMPORARY TABLE

Introduction
A TEMPORARY TABLE can be emptied in two ways
1. DELETE (all rows)
2. TRUNCATE

DELETE is not a good option anyway.
The explanation for TRUNCATE is as follows
TRUNCATE simply creates a new file on disk and does an UPDATE of the pg_class table.
So
1. It causes the pg_class system table to bloat
2. It requires disk I/O

Solution 1
The explanation is as follows
In Linux, there is one approach that can significantly reduce disk usage of temporary tables. It is to allocate a separate RAM disk for temporary tables.
The temp_tablespaces option is used for this
Example
We do it like this
mkdir /mnt/dbtemp
chmod 700 /mnt/dbtemp
chown postgres:postgres /mnt/dbtemp
Then we do this in SQL. A tablespace that uses the specified directory is created; under /mnt/dbtemp a new subdirectory such as /mnt/dbtemp/PG_13_202007201/ is created
CREATE TABLESPACE temp LOCATION '/mnt/dbtemp';
Then we do this. A new directory such as /mnt/dbtemp/PG_13_202007201/936082 is created for the table
CREATE TEMPORARY TABLE mytemptable (key0 int8) TABLESPACE temp;
We add this to the /etc/fstab file; on the next restart the directory is mounted automatically.
tmpfs /mnt/dbtemp/PG_13_202007201/936082 tmpfs \ 
  rw,nodev,nosuid,noatime,nodiratime,size=1G 0 0
But rather than wait for the next restart, we mount the directory onto RAM right away
mount /mnt/dbtemp/PG_13_202007201/936082
Then we set this in the postgresql.conf file
temp_tablespaces = 'temp'
and to apply it we run
SELECT pg_reload_conf();
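To verify that the setting took effect we do this
SHOW temp_tablespaces;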