Friday, December 30, 2022

CREATE INDEX HASH_INDEX

Introduction
Pictorially it looks like this: rows with the same hash value are chained together like a linked list.


HASH CODE
The explanation is as follows
Despite the b-tree index, which can store many values without reducing the expected performance, the hash index has a limit of 2³²-1 of unique hash codes (different values may have the same hash codes). Therefore, increasing the number of duplicates (in terms of hash codes) negatively affects the index performance.
EQUALITY OPERATOR
The explanation is as follows
One of the reasons why the b-tree index is so standard is its flexibility because it supports all comparison operators. Hash index, on the other hand, supports only equality operators.
Some of the downsides are as follows
  • Limited search capabilities: Hash indexes are designed to handle only equality searches (i.e., “find all records where column A equals a specific value”). They are not well-suited for range queries or sorting.
  • Collisions: Hash indexes can have collisions, where multiple keys map to the same hash value. This can result in degraded performance, as the database must perform additional operations to resolve the collisions.
  • Unpredictable storage requirements: The size of a hash index cannot be predicted in advance, as it depends on the number of unique values in the indexed column. This can make it difficult to plan for storage requirements.
Example
We do it as follows
CREATE TABLE IF NOT EXISTS hash_table(example varchar);

CREATE INDEX IF NOT EXISTS hash_index ON hash_table USING hash(example);
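Only equality predicates can use this index; a range predicate cannot. A minimal sketch against the hash_table created above (the literal values are made up):
-- Equality lookup: hash_index can be used
SELECT * FROM hash_table WHERE example = 'apple';

-- Range lookup: a hash index cannot serve this predicate
SELECT * FROM hash_table WHERE example > 'apple';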

Friday, December 23, 2022

VACUUM FULL - Not Used on a Production Database. It Blocks the Table and Returns Disk Space

Introduction
The syntax is as follows
VACUUM FULL tablename;
Other Options

Why VACUUM FULL Is Not Used on a Production Database
The explanation is as follows. It is not used on a production database because it does return disk space to the operating system, but it locks the table. It is only used during maintenance windows. If a table has to be shrunk on a live system, the pg_repack extension can be used instead.
By contrast, the full VACUUM does reclaim the free space to the operating system, but it blocks application workloads. You can think of it as Java’s “stop-the-world” garbage collection pause. It’s only in PostgreSQL that such a pause can last for hours (or days). Thus, database admins try their best to prevent the full VACUUM from happening at all.
The explanation is as follows
This will reclaim all the space used by the bloat and stale tuples, but in turn it creates a lot of problems, like locking the table (it hangs), that ultimately make it not a production-use solution. The only time we can use it is when the application can tolerate a downtime taken for maintenance.
The explanation is as follows.
VACUUM FULL
The VACUUM FULL process returns the space to the operating system, ... It does the following tasks.

1. VACUUM FULL process obtains an exclusive lock on the table.
2. It creates a new empty storage table file.
3. Copy the live tuples to the new table storage.
4. Removes the old table file and frees the storage.
5. It rebuilds all associated table indexes, updates the system catalogs and statistics.
Example
To vacuum the table named SampleTable, we do the following
VACUUM FULL SampleTable;
Then, to see the effect of this command, we do the following.
SELECT
  COUNT(*) AS npages,
  ROUND(100 * AVG(avail) / 8192, 2) AS avg_fs_ratio
FROM
    pg_freespace('SampleTable');
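Note that the pg_freespace() function used above is provided by the pg_freespacemap extension, so it may need to be installed first:
CREATE EXTENSION IF NOT EXISTS pg_freespacemap;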

REPACK Extension - For Storage Optimization

To install it, we do the following
CREATE EXTENSION pg_repack;
The explanation is as follows. It copies the table to a new location and rebuilds the indexes, which saves storage space.
It creates a new table like we have seen in reindexing and copies all the data from the old table to a new table and after that rebuild the indexes. After full copying is done new table acts as the main one and the old one is dropped.
We do it as follows
pg_repack -k --table mytable mydb;


PG_STAT_ALL_INDEXES System View

seq_scan Field
Example
The explanation is as follows
Suppose you had a database with many indexes and wanted to monitor how well they were being used. You can compare the number of index scans against the number of sequential scans.
We do it as follows. seq_scan is not a column of pg_stat_all_indexes itself; it comes from pg_stat_all_tables, so the two views are joined on relid
SELECT s.indexrelname, s.idx_scan, t.seq_scan
FROM pg_stat_all_indexes s
JOIN pg_stat_all_tables t USING (relid)
ORDER BY (s.idx_scan + t.seq_scan) DESC LIMIT 10;
idx_scan Field
Example
To find unused indexes, we do the following
SELECT
  relname,
  indexrelname,
  idx_scan,
  idx_tup_read,
  idx_tup_fetch,
  pg_size_pretty(pg_relation_size(indexrelname::regclass)) as size
FROM
pg_stat_all_indexes
WHERE
  schemaname = 'public'
  and indexrelname not like 'pg_toast_%'
  and idx_scan = 0
  and idx_tup_read = 0
  and idx_tup_fetch = 0
  and indexrelname != 'SequelizeData_pkey'
ORDER BY
  pg_relation_size(indexrelname::regclass) desc;
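Once such an index has been confirmed to be safe to remove, it can be dropped without blocking writes; a minimal sketch with a made-up index name:
DROP INDEX CONCURRENTLY IF EXISTS some_unused_index;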

Wednesday, December 21, 2022

Tuesday, December 20, 2022

Docker Compose and PostgreSQL Debezium

Introduction
The images are as follows
debezium/postgres:13, zookeeper, confluentinc/cp-enterprise-kafka, and debezium/connect are always present

In addition
confluentinc/cp-schema-registry, confluentinc/ksqldb-server, provectuslabs/kafka-ui, and dpage/pgadmin4 can also be used

Example
We do it as follows
version: "3.7"
services:
  postgres:
    image: debezium/postgres:13
    ports:
      - 5432:5432
    healthcheck:
      test: "pg_isready -U postgresuser -d shipment_db"
      interval: 2s
      timeout: 20s
      retries: 10
    environment:
      - POSTGRES_USER=postgresuser
      - POSTGRES_PASSWORD=postgrespw
      - POSTGRES_DB=shipment_db
      - PGPASSWORD=postgrespw
    volumes:
      - ./scripts:/docker-entrypoint-initdb.d
  zookeeper:
    image: confluentinc/cp-zookeeper:5.5.3
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-enterprise-kafka:5.5.3
    depends_on: [zookeeper]
    environment:
      KAFKA_ZOOKEEPER_CONNECT: "zookeeper:2181"
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_BROKER_ID: 1
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_JMX_PORT: 9991
    ports:
    - 9092:9092
  debezium:
    image: debezium/connect:1.4
    environment:
      BOOTSTRAP_SERVERS: kafka:9092
      GROUP_ID: 1
      CONFIG_STORAGE_TOPIC: connect_configs
      OFFSET_STORAGE_TOPIC: connect_offsets
      KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
      VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
    depends_on: [kafka]
    ports:
      - 8083:8083
  schema-registry:
    image: confluentinc/cp-schema-registry:5.5.3
    environment:
     - SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=zookeeper:2181
     - SCHEMA_REGISTRY_HOST_NAME=schema-registry
     - SCHEMA_REGISTRY_LISTENERS=http://schema-registry:8081,http://localhost:8081
    depends_on: [zookeeper, kafka]

Example
We do it as follows
version: "3.7"
services:
  postgres:
    image: debezium/postgres:13
    ports:
      - 5432:5432
    environment:
      - POSTGRES_USER=admin
      - POSTGRES_PASSWORD=root
      - POSTGRES_DB=food
  pgadmin:
    image: dpage/pgadmin4
    environment:
      - PGADMIN_DEFAULT_EMAIL=admin@admin.com
      - PGADMIN_DEFAULT_PASSWORD=root
    ports:
      - '5050:80'
    restart: always
  zookeeper:
    image: confluentinc/cp-zookeeper:5.5.3
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-enterprise-kafka:5.5.3
    depends_on: [zookeeper]
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_JMX_PORT: 9991
    ports:
      - 29092:29092
  debezium:
    image: debezium/connect:1.4
    environment:
      BOOTSTRAP_SERVERS: kafka:9092
      GROUP_ID: 1
      CONFIG_STORAGE_TOPIC: connect_configs
      OFFSET_STORAGE_TOPIC: connect_offsets
      STATUS_STORAGE_TOPIC: my_status_topic
      KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
      VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8085
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8085
    depends_on: [kafka]
    ports:
      - 8083:8083
  schema-registry:
    image: confluentinc/cp-schema-registry:5.5.3
    environment:
      - SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=zookeeper:2181
      - SCHEMA_REGISTRY_HOST_NAME=schema-registry
      - SCHEMA_REGISTRY_LISTENERS=http://schema-registry:8085,http://localhost:8085
    ports:
      - 8085:8085
    depends_on: [zookeeper, kafka]

Example - postgres + debezium + kafka
Suppose we have a file like this. Four containers are defined here: postgres, zookeeper, kafka, and connector. The username for connecting to Postgres is "appuser" and the password is "qwerty"
version: '3.1'
services:
    postgres:
        image: debezium/postgres
        environment:
          POSTGRES_PASSWORD: qwerty
          POSTGRES_USER: appuser
        volumes:
           - ./postgres:/data/postgres
        ports:
          - 6532:6532
    zookeeper:
        image: confluentinc/cp-zookeeper
        ports:
          - "2181:2181"
        environment:
          ZOOKEEPER_CLIENT_PORT: 2181
    kafka:
        image: confluentinc/cp-kafka
        depends_on:
          - zookeeper
          - postgres
        ports:
          - "9092:9092"
        environment:
          KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
          KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
          KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
          KAFKA_LOG_CLEANER_DELETE_RETENTION_MS: 5000
          KAFKA_BROKER_ID: 1
          KAFKA_MIN_INSYNC_REPLICAS: 1
    connector:
        image: debezium/connect:latest
        ports:
          - "8083:8083"
        environment:
          GROUP_ID: 1
          CONFIG_STORAGE_TOPIC: my_connect_configs
          OFFSET_STORAGE_TOPIC: my_connect_offsets
          BOOTSTRAP_SERVERS: kafka:9092
        depends_on:
          - zookeeper
          - postgres
          - kafka
Example - postgres + debezium + kafka
We do it as follows
services:
  db:
    image: postgres:latest
    ports:
      - "5432:5432"
    environment:
      - POSTGRES_PASSWORD=arctype

  zookeeper:
    image: debezium/zookeeper
    ports:
      - "2181:2181"
      - "2888:2888"
      - "3888:3888"

  kafka:
    image: debezium/kafka
    ports:
      - "9092:9092"
      - "29092:29092"
    depends_on:
      - zookeeper
    environment:
      - ZOOKEEPER_CONNECT=zookeeper:2181
      - KAFKA_ADVERTISED_LISTENERS=LISTENER_EXT://localhost:29092,LISTENER_INT://kafka:9092
      - KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=LISTENER_INT:PLAINTEXT,LISTENER_EXT:PLAINTEXT
      - KAFKA_LISTENERS=LISTENER_INT://0.0.0.0:9092,LISTENER_EXT://0.0.0.0:29092
      - KAFKA_INTER_BROKER_LISTENER_NAME=LISTENER_INT

  connect:
    image: debezium/connect
    ports:
      - "8083:8083"
    environment:
      - BOOTSTRAP_SERVERS=kafka:9092
      - GROUP_ID=1
      - CONFIG_STORAGE_TOPIC=my_connect_configs
      - OFFSET_STORAGE_TOPIC=my_connect_offsets
      - STATUS_STORAGE_TOPIC=my_connect_statuses
    depends_on:
      - zookeeper
      - kafka
Example - postgres + debezium + kafka
We do it as follows
version: "3.7"
services:
  postgres:
    image: debezium/postgres:13
    ports:
      - 5432:5432
    environment:
      - POSTGRES_USER=docker
      - POSTGRES_PASSWORD=docker
      - POSTGRES_DB=profile_service_dev
  zookeeper:
    image: confluentinc/cp-zookeeper:5.5.3
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-enterprise-kafka:5.5.3
    depends_on: [zookeeper]
    environment:
      KAFKA_ZOOKEEPER_CONNECT: "zookeeper:2181"
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_BROKER_ID: 1
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_JMX_PORT: 9991
    ports:
      - 9092:9092
  kafka-ui:
    container_name: kafka-ui
    image: provectuslabs/kafka-ui:latest
    ports:
      - 8080:8080
    depends_on:
      - zookeeper
      - kafka
      - schema-registry
    environment:
      KAFKA_CLUSTERS_0_NAME: local
      KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS: kafka:9092
      KAFKA_CLUSTERS_0_ZOOKEEPER: zookeeper:2181
      KAFKA_CLUSTERS_0_JMXPORT: 9997
      KAFKA_CLUSTERS_0_SCHEMAREGISTRY: http://schema-registry:8081
  debezium:
    image: debezium/connect:1.4
    environment:
      BOOTSTRAP_SERVERS: kafka:9092
      GROUP_ID: 1
      CONFIG_STORAGE_TOPIC: connect_configs
      OFFSET_STORAGE_TOPIC: connect_offsets
      KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
      VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
    depends_on: [kafka]
    ports:
      - 8083:8083
  schema-registry:
    image: confluentinc/cp-schema-registry:5.5.3
    environment:
      - SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=zookeeper:2181
      - SCHEMA_REGISTRY_HOST_NAME=schema-registry
      - SCHEMA_REGISTRY_LISTENERS=http://schema-registry:8081,http://localhost:8081
    ports:
      - 8081:8081
    depends_on: [zookeeper, kafka]
Example
We do it as follows
version: '3.1'

services:

  db:
    image: postgres
    container_name: postgres
    restart: always
    environment:
      POSTGRES_PASSWORD: example
    ports:
      - 5432:5432
    volumes:
      - ./data:/var/lib/postgresql/data
    command:
      - postgres
      - -c
      - wal_level=logical

  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    container_name: zookeeper
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000

  kafka:
    image: confluentinc/cp-enterprise-kafka:latest
    container_name: kafka
    depends_on:
      - zookeeper
    ports:
      - 9092:9092
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: "true"
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 100
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1

  schema-registry:
    image: confluentinc/cp-schema-registry:latest
    container_name: schema-registry
    ports:
      - 8081:8081
    depends_on:
      - zookeeper
      - kafka
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: zookeeper:2181
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: PLAINTEXT://kafka:29092

  kafka-connect:
    image: confluentinc/cp-kafka-connect:latest
    container_name: kafka-connect
    depends_on:
      - zookeeper
      - kafka
      - schema-registry
    ports:
      - 8083:8083
    environment:
      CONNECT_BOOTSTRAP_SERVERS: "kafka:29092"
      CONNECT_REST_PORT: 8083
      CONNECT_REST_ADVERTISED_HOST_NAME: "kafka-connect"
      CONNECT_GROUP_ID: compose-connect-group
      CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
      CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
      CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
      CONNECT_KEY_CONVERTER: org.apache.kafka.connect.storage.StringConverter
      CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: 'http://schema-registry:8081'
      CONNECT_LOG4J_ROOT_LOGLEVEL: "INFO"
      CONNECT_LOG4J_LOGGERS: "org.apache.kafka.connect.runtime.rest=WARN,org.reflections=ERROR"
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_PRODUCER_CONFIG_ACKS: 1
      CONNECT_PLUGIN_PATH: '/usr/share/java,/usr/share/confluent-hub-components'
    command:
      - /bin/bash
      - -c
      - |
        echo "Installing Connector"
        confluent-hub install --no-prompt confluentinc/kafka-connect-jdbc:latest
        confluent-hub install --no-prompt debezium/debezium-connector-postgresql:1.9.3
        /etc/confluent/docker/run &
        sleep infinity
  ksqldb:
    image: confluentinc/ksqldb-server:latest
    container_name: ksqldb
    depends_on:
      - kafka
      - schema-registry
    ports:
      - 8088:8088
    environment:
      KSQL_LISTENERS: http://0.0.0.0:8088
      KSQL_BOOTSTRAP_SERVERS: kafka:29092
      KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: "true"
      KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: "true"
      KSQL_KSQL_SCHEMA_REGISTRY_URL: http://schema-registry:8081
      KSQL_STREAMS_PRODUCER_MAX_BLOCK_MS: 9223372036854775807
      KSQL_KSQL_CONNECT_URL: http://kafka-connect:8083
      KSQL_KSQL_SERVICE_ID: crypto-ksql
      KSQL_KSQL_HIDDEN_TOPICS: '^_.*'



Friday, December 16, 2022

PG_STAT_USER_TABLES System View - Contains Performance Information

n_dead_tup Column

Example - shows dead tuples
We do it as follows
SELECT 
  relname AS TableName ,
  n_live_tup AS LiveTuples,
  n_dead_tup AS DeadTuples ,
  last_autovacuum AS Autovacuum ,
  last_autoanalyze AS Autoanalyze
FROM pg_stat_user_tables;
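A variant that ranks tables by how bloated they look; a minimal sketch (the dead_ratio alias is made up):
SELECT
  relname,
  n_live_tup,
  n_dead_tup,
  ROUND(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 2) AS dead_ratio
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;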
n_live_tup Column
Example
We do it as follows
-- find least read tables
SELECT schemaname, relname, seq_tup_read, idx_tup_fetch
FROM   pg_catalog.pg_stat_user_tables
ORDER  BY seq_tup_read + idx_tup_fetch ASC;

-- find least write tables
SELECT schemaname, relname, n_tup_ins, n_tup_del, n_tup_upd
FROM   pg_catalog.pg_stat_user_tables
ORDER  BY n_tup_ins + n_tup_del + n_tup_upd ASC;

-- find empty tables
SELECT schemaname, relname,n_live_tup
FROM pg_catalog.pg_stat_user_tables
WHERE n_live_tup < 1; 


Tuesday, December 6, 2022

PG_STAT_ACTIVITY System View - For Monitoring Performance

Introduction
The explanation is as follows
There are several built-in tools and utilities available in PostgreSQL for monitoring performance metrics, such as the pg_stat_activity view and the pg_stat_database view.
query_start Column
Example
We do it as follows. This lets us see the queries that have been running the longest
SELECT now() - pg_stat_activity.query_start AS duration,
  usesysid,
  usename,
  client_addr,
  pid,
  state,
  datname,
  query,
  backend_type
FROM pg_stat_activity
order by duration desc;
state Column
Returns values such as active/idle

Example
To group connections by the state column, we do the following. It shows the state of the connections
SELECT count(*),state FROM pg_stat_activity GROUP BY 2;
 count |            state
-------+------------------------------
     7 | active
    32 | idle
    22 | idle in transaction
     4 | idle in transaction(aborted)
(4 rows)
datname Column
Means the name of the database in which the process is running

Example - Kill Connection
To kill all connections except our own, we do the following
SELECT 
    pg_terminate_backend(pid) 
FROM 
    pg_stat_activity 
WHERE 
    -- don't kill my own connection!
    pid <> pg_backend_pid()
    -- don't kill the connections to other databases
    AND datname = 'database_name'
    ;
pid Column
Means the unique process id

query Column
Means the actual (truncated) query text
Example
We do it as follows
SELECT 
  datname as database_name, 
  pid as Process_ID,
  usename as username,
  application_name,
  client_addr as Host,
  client_port as Port,
  state,
  query 
FROM pg_stat_activity;

usename Column
Means the database user name
Example
We do it as follows
SELECT pid, datname, usename, query, state
FROM pg_stat_activity;
usesysid Column
Means the database user id



Monday, December 5, 2022

Type Casts

Introduction
The explanation is as follows
A type cast specifies a conversion from one data type to another.

PostgreSQL accepts two equivalent syntaxes for type casts, 
1. the PostgreSQL-specific value::type and
2. the SQL-standard CAST(value AS type).
1. The Old Method
text
Example
We do it as follows. Here a text[] array is being created
SELECT '{apple,cherry apple, avocado}'::text[];
Example
We do it as follows. Here three varchar(255) columns are concatenated.
SELECT 
(((instructors.title::text || ' '::text) || instructors.first_name::text) || ' '::text) || instructors.last_name::text 
AS name
FROM instructors;
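The SQL-standard CAST(value AS type) syntax mentioned in the introduction works the same way; a minimal sketch (the literal is made up):
SELECT CAST('123' AS integer);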


COUNT(*) vs COUNT(1)

Introduction
Summary: in short, COUNT(*) and COUNT(1) both count all rows, including rows with NULL values. In most databases there is no difference between the two, but in PostgreSQL there is.

Long Explanation
The explanation is as follows. In other words, there should really be no performance difference between COUNT(*) and COUNT(1)
- COUNT(*) counts all the tuples in a group
- COUNT(<expr>) counts all the tuples in a group for which <expr> evaluates to something that IS NOT NULL
...
Now that we know the theory behind these COUNT expressions, what’s the difference between COUNT(*) and COUNT(1). There is none, effectively. The 1 expression in COUNT(1) evaluates a constant expression for each row in the group, and it can be proven that this constant expression will never evaluate to NULL, so effectively, we’re running COUNT(*), counting ALL the rows in the group again.
However, PostgreSQL is reported to show a performance difference. The explanation is as follows. Consequently, using COUNT(*) may be the better choice.
As it is now in 2019, given the database versions mentioned above, unfortunately, there is a significant difference between COUNT(*) and COUNT(1) in PostgreSQL. Luckily (and this is rare in SQL), all the other dialects don’t care and thus, consistently using COUNT(*), rather than COUNT(1) is a slightly better choice for ALL measured database products from this article.
Some Notes on COUNT(*)
COUNT(*) counts how many tuples there are. The explanation is as follows
Why is count(*) efficient?
When you use count(*), the database engine will use an index to count the rows.
If we do a LEFT JOIN with COUNT(*), a row whose joined columns are all NULL is still a tuple, so it is counted as 1

Example
Suppose we have the following SQL. Here the actor and film tables are combined with LEFT JOIN. It lists actors and their film counts.
SELECT actor_id, a.first_name, a.last_name, COUNT(*) AS count
FROM actor AS a
LEFT JOIN film_actor AS fa USING (actor_id)
LEFT JOIN film AS f USING (film_id)
GROUP BY actor_id
ORDER BY count ASC, actor_id ASC;
The output is as follows. SUSAN DAVIS gets a count of 1 even though she has no films, because COUNT(*) counts how many rows there are.
actor_id|first_name |last_name   |count |
--------|-----------|------------|----- |
     201|SUSAN      |DAVIS       | 1    |
     148|EMILY      |DEE         |14    |
      35|JUDY       |DEAN        |15    |
     199|JULIA      |FAWCETT     |15    |
     186|JULIA      |ZELLWEGER   |16    |
      31|SISSY      |SOBIESKI    |18    |
      71|ADAM       |GRANT       |18    |
       1|PENELOPE   |GUINESS     |19    |
      30|SANDRA     |PECK        |19    |
This time we use COUNT(film_id) instead of COUNT(*).
SELECT actor_id, a.first_name, a.last_name, COUNT(film_id) AS c
FROM actor AS a
LEFT JOIN film_actor AS fa USING (actor_id)
LEFT JOIN film AS f USING (film_id)
GROUP BY actor_id
ORDER BY c ASC, actor_id ASC;
The output is as follows
actor_id|first_name |last_name   |c |
--------|-----------|------------|--|
     201|SUSAN      |DAVIS       | 0|
     148|EMILY      |DEE         |14|
      35|JUDY       |DEAN        |15|
     199|JULIA      |FAWCETT     |15|
     186|JULIA      |ZELLWEGER   |16|
      31|SISSY      |SOBIESKI    |18|
      71|ADAM       |GRANT       |18|
       1|PENELOPE   |GUINESS     |19|
      30|SANDRA     |PECK        |19|



Friday, December 2, 2022

Debezium Connector

Introduction
It reads the CDC records from PostgreSQL and writes them to a Kafka topic.

1. The connector is given a name
2. connector.class is always specified as io.debezium.connector.postgresql.PostgresConnector. This connector is for reading from the database

3. The database connection information is defined. These fields are
database.hostname
database.port
database.user
database.password
database.dbname

4. database.server.name defines the prefix of the topic name that will be created in Kafka.
5. table.include.list specifies the tables to read, or schema.whitelist specifies the schema.
6. snapshot.mode can be set to always.

7. If we want to turn the database changes, i.e. the WAL output, into JSON, we do the following
"plugin.name": "wal2json",
"value.converter": "org.apache.kafka.connect.json.JsonConverter"

key.converter and value.converter Fields

Example
Suppose we have the following JSON
{
    "name": "warehouse-connector",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "plugin.name": "pgoutput",
        "database.hostname": "postgres",
        "database.port": "5432",
        "database.user": "docker",
        "database.password": "docker",
        "database.dbname": "profile_service_dev",
        "database.server.name": "postgres",
        "snapshot.mode": "always",
        "table.include.list": "public.warehouse"
    }
}
We do it as follows
curl -i -X POST -H "Accept:application/json" \
  -H "Content-Type:application/json" \
  127.0.0.1:8083/connectors/ \
  --data "@connector.json"
The changes in the database appear in the Kafka topic named "postgres.public.warehouse". The figure shows it


- The connect_configs and connect_offsets topics are Debezium's own topics. They use underscores because the config and offset topic names were specified in the Debezium configuration.
- connect-status is also a Debezium topic, but Debezium created it on its own
If we look at the messages of the warehouse table, they look like this (figure)


Example
We do it as follows. We post to Debezium on port 8083.
curl -X POST  http://localhost:8083/connectors/ \
  -H 'content-type: application/json' \
  -d '{
   "name": "student-outbox-connector",
   "config": {
      "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
      "tasks.max": "1",
      "database.hostname": "postgres",
      "database.port": "5432",
      "database.user": "postgres",
      "database.password": "postgres",
      "database.dbname": "user_DB",
      "database.server.name": "pg-outbox-server",
      "tombstones.on.delete": "false",
      "table.whitelist": "public.outbox",
      "transforms": "outbox",
      "transforms.outbox.type": "com.eresh.outbox.OutboxTransformer"
   }
}'

Example
We do it as follows. It writes the CDC records of the retail.orders_info table to the topic named myserver.retail.orders_info. The topic name is generated by Debezium as serverName.schemaName.tableName
{
    "name": "pg-orders-source",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "localhost",
        "database.port": "5432",
        "database.user": "postgres",
        "database.password": "password",
        "database.dbname": "postgres",
        "database.server.name": "myserver",
        "plugin.name": "wal2json",
        "table.include.list": "retail.orders_info",
        "value.converter": "org.apache.kafka.connect.json.JsonConverter"
    }
}

Example
We do it as follows. The changes in the database appear in the Kafka topic named "postgres.public.shipments"
curl -H 'Content-Type: application/json' debezium:8083/connectors --data '
{
  "name": "shipments-connector",  
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector", 
    "plugin.name": "pgoutput",
    "database.hostname": "postgres", 
    "database.port": "5432", 
    "database.user": "postgresuser", 
    "database.password": "postgrespw", 
    "database.dbname" : "shipment_db", 
    "database.server.name": "postgres", 
    "table.include.list": "public.shipments" 
  }
}'
Example
We do it as follows
{
  "name": "fulfillment-connector",  
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector", 
    "database.hostname": "192.168.99.100", 
    "database.port": "5432", 
    "database.user": "postgres", 
    "database.password": "postgres", 
    "database.dbname" : "postgres", 
    "database.server.name": "fulfillment", 
    "table.include.list": "public.inventory" 
  }
}
column.include.list Field
Example
We do it as follows. It writes to the topic named debezium.event_store.mt_events
{
    "name": "postgres-debezium-connector",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "localhost",
        "database.port": "5432",
        "database.user": "postgres",
        "database.password": "1234",
        "database.dbname": "MartenDB",
        "table.include.list": "event_store.mt_events",
        "column.include.list": "event_store.mt_events.data",
        "plugin.name": "pgoutput",
        "topic.prefix": "debezium",
        "slot.name": "debezium_replication_slot"
    }
}

transforms Field
Example
We do it as follows
"table.include.list": "debezium_postgres_demo.outbox",


// It routes the events, to the topic based on the value written to the destination field 
// of the Outbox table:
// Debezium defaults the topic name to outbox.event followed by the value in 
// the route.by.field above, or the aggregatetype column if that is not set. 
// In this case as the item is written to the database with a destination value of item, 
// the topic name resolves to outbox.event.item.

"transforms": "outbox",
"transforms.outbox.type": "io.debezium.transforms.outbox.EventRouter",
"transforms.outbox.route.by.field": "destination",

// Events are transformed to use the value of the Outbox id field as the event key, 
// and to use the value of the payload field as the event payload
"transforms.outbox.table.field.event.key": "id",
"transforms.outbox.table.field.event.payload": "payload",



Thursday, December 1, 2022

MERGE - Arrives With Postgres 15

Introduction
The explanation is as follows
The MERGE command is one of the newest additions to the Postgres DB with the v15 release. It has been a standard feature on other SQL variants like SQL Server before. Take a look at the RFC on graphql-engine to see how the various similar operations are supported through the Hasura GraphQL API on SQL Server.

Merge statements are a combination of INSERT, UPDATE and DELETE.

Merge statement allows bringing in data from a source table into a target table.
It merges a source table into the specified target table. MERGE and INSERT ... ON CONFLICT, i.e. UPSERT, are different things. The explanation is as follows
Note: MERGE is often (incorrectly) used interchangeably with the term UPSERT.
The explanation is as follows
What about MERGE?

SQL-standard MERGE actually has poorly defined concurrency semantics and is not suitable for upserting without locking a table first.

It's a really useful OLAP statement for data merging, but it's not actually a useful solution for concurrency-safe upsert. There's lots of advice to people using other DBMSes to use MERGE for upserts, but it's actually wrong.
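For a concurrency-safe upsert, INSERT ... ON CONFLICT is usually used instead of MERGE; a minimal sketch using the wines table from the example below, assuming winename has a unique constraint and the inserted values are made up:
INSERT INTO wines (winename, stock)
VALUES ('Some Winery', 24)
ON CONFLICT (winename)
DO UPDATE SET stock = wines.stock + EXCLUDED.stock;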

Example
We do it as follows
MERGE INTO customer_history c
  USING daily_orders d
  ON (c.customer_id = d.customer_id)

  WHEN MATCHED THEN
    UPDATE SET     -- Existing customer, update the order count and the timestamp of order.
      order_count = c.order_count + 1,
      last_order_id = d.order_id

  WHEN NOT MATCHED THEN       -- New entry, record it.
    INSERT (customer_id, last_order_id, order_center, order_count, last_order)
      VALUES (customer_id, d.order_id, d.order_center, 1, d.order_time);
Example
We do it as follows
MERGE INTO wines w
USING wine_stock_changes s
ON s.winename = w.winename
 WHEN NOT MATCHED AND s.stock_delta > 0 THEN
   INSERT VALUES(s.winename, s.stock_delta)
 WHEN MATCHED AND w.stock + s.stock_delta > 0 THEN
   UPDATE SET stock = w.stock + s.stock_delta
 WHEN MATCHED THEN
  DELETE;






Sunday, November 20, 2022

EVERY - Aggregate Function

Example
We do it as follows
SELECT author_id, EVERY( title LIKE '%a')
FROM book
GROUP BY author_id


author_id every
--- ---
1 false
2 true


Analytic Functions / Window Functions - ROWS BETWEEN

Example
The question is as follows
For example, if we want to sum up the revenue from all previous years till this year, we can use this window function!
We do it as follows
SELECT year, SUM(revenue)
OVER (
  ORDER BY year
  ROWS BETWEEN
    UNBOUNDED PRECEDING
    AND
    CURRENT ROW
) AS running_sum
FROM revenue_table

Tuesday, November 8, 2022

Preparation For Using Debezium

1. Create a User and Role
We do it as follows
CREATE USER cdcuser WITH PASSWORD 'cdcpassword' REPLICATION LOGIN;

CREATE ROLE replication_group WITH USER foouser, cdcuser;
GRANT CREATE ON DATABASE quant_core TO replication_group;
GRANT USAGE ON SCHEMA foo TO replication_group;
GRANT CREATE ON SCHEMA foo TO replication_group;

CREATE TABLE foo.cdc_heartbeat (
    heartbeat bit
);
ALTER TABLE foo.cdc_heartbeat OWNER TO cdcuser;
2. WAL Level
The wal_level setting in the var/lib/postgresql/data/postgresql.conf file is set to logical
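Assuming superuser access, the same setting can also be changed from SQL; a minimal sketch (a server restart is still required):
ALTER SYSTEM SET wal_level = logical;  -- written to postgresql.auto.conf
SHOW wal_level;                        -- should report 'logical' after the restart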

Thursday, October 20, 2022

SQL Commands

Introduction
SQL commands can be grouped as follows (one example of each group is shown after the list)
1. DDL – Data Definition Language
2. DQL – Data Query Language
3. DML – Data Manipulation Language
4. DCL – Data Control Language
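A minimal sketch with one statement per group; the table t and the role some_user are made up for illustration:
CREATE TABLE t (id int);          -- DDL: defines structure
SELECT * FROM t;                  -- DQL: reads data
INSERT INTO t VALUES (1);         -- DML: changes data
GRANT SELECT ON t TO some_user;   -- DCL: controls access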

The figure shows it



INFORMATION_SCHEMA.COLUMNS System View

Introduction
The explanation is as follows
It's a part of the SQL-92 standard, and it's implemented by most major database engines (with the notable exception of Oracle).
data_type Column
Example
If we want to see the columns whose type is OID, we do the following
SELECT * FROM information_schema.columns WHERE data_type = 'oid';
table_name Column
Example
We do it as follows
SELECT column_name, ordinal_position, is_nullable, data_type
FROM information_schema.columns WHERE table_name='...'

Thursday, September 29, 2022

CREATE INDEX - Partial Index

Introduction
The explanation is as follows
If you know that your query will always ignore columns with defined values, you can make use of a partial index to save space and make it faster.
Example - Equals
We do it as follows
CREATE INDEX app_user_address_is_main ON app_user_address (user_id)
    WHERE is_main = TRUE;
The explanation is as follows
In this query, you’ll be creating an index that only indexes rows if the column is_main equals TRUE.
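A query of the following shape can then use the partial index (the user id value is made up):
SELECT * FROM app_user_address
WHERE user_id = 42 AND is_main = TRUE;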
Example - Equals
We do it as follows
CREATE INDEX balances_total_balance_nan_idx ON balances_snapshots ((true)) WHERE total_balance = 'NaN';
Example - B-Tree Index + Equals
We do it as follows
CREATE INDEX id_btree ON sample_data USING BTREE(id) WHERE id = 200000;
Example - Not Equals
We do it as follows
CREATE INDEX ON f (parent_sha256) WHERE parent_sha256 <> sha256;
CREATE UNIQUE INDEX
Here CREATE UNIQUE INDEX is used instead of CREATE INDEX

Example
We do it as follows
CREATE UNIQUE INDEX ON users (user_id)
Example
We do it as follows
CREATE UNIQUE INDEX event_unique_ower_uqx -- index name
ON mytable (event_id)
WHERE (status = 'owner');