Friday, 30 December 2022

CREATE INDEX HASH_INDEX

Introduction
Conceptually it looks like this: rows with the same hash value are chained together like a linked list.


HASH CODE
The explanation is as follows:
Unlike the b-tree index, which can store many values without reducing the expected performance, the hash index has a limit of 2³²-1 unique hash codes (different values may have the same hash code). Therefore, increasing the number of duplicates (in terms of hash codes) negatively affects the index performance.
EQUALITY OPERATOR
The explanation is as follows:
One of the reasons why the b-tree index is so standard is its flexibility because it supports all comparison operators. Hash index, on the other hand, supports only equality operators.
Some drawbacks are as follows:
  • Limited search capabilities: Hash indexes are designed to handle only equality searches (i.e., “find all records where column A equals a specific value”). They are not well-suited for range queries or sorting.
  • Collisions: Hash indexes can have collisions, where multiple keys map to the same hash value. This can result in degraded performance, as the database must perform additional operations to resolve the collisions.
  • Unpredictable storage requirements: The size of a hash index cannot be predicted in advance, as it depends on the number of unique values in the indexed column. This can make it difficult to plan for storage requirements.
Example
We do it like this:
CREATE TABLE IF NOT EXISTS hash_table(example varchar);

CREATE INDEX IF NOT EXISTS hash_index ON hash_table USING hash(example);
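The chaining and the equality-only lookup described above can be illustrated with a small Python sketch. This is only a toy model under stated assumptions: the class name ToyHashIndex, the bucket count, and the sample values are hypothetical, and PostgreSQL's real on-disk hash index layout is more involved.

```python
from collections import defaultdict


class ToyHashIndex:
    """Toy model of a hash index: each bucket chains the rows that share a hash code."""

    def __init__(self, bucket_count=8):
        # Deliberately small bucket count (an assumption) so collisions are likely.
        self.bucket_count = bucket_count
        self.buckets = defaultdict(list)

    def _bucket(self, key):
        return hash(key) % self.bucket_count

    def insert(self, key, row_id):
        # Rows with the same hash code end up in the same chain.
        self.buckets[self._bucket(key)].append((key, row_id))

    def lookup(self, key):
        # Equality search only: walk the chain and compare the keys,
        # because different keys may share the same bucket.
        return [row_id for k, row_id in self.buckets[self._bucket(key)] if k == key]


index = ToyHashIndex()
for row_id, value in enumerate(["apple", "cherry", "avocado", "apple"]):
    index.insert(value, row_id)

apple_rows = index.lookup("apple")
missing_rows = index.lookup("banana")
print(apple_rows, missing_rows)
```

The longer a chain gets, the more key comparisons a lookup needs, which is the collision cost mentioned above. A range query such as "all values greater than apple" has no efficient path through this structure, because the buckets carry no ordering.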

Friday, 23 December 2022

VACUUM FULL - Not Used on Production Databases. It Locks the Table and Returns Disk Space

Introduction
The syntax is as follows:
VACUUM FULL tablename;
Other Options

Why VACUUM FULL Is Not Used on Production Databases
The explanation is as follows. It is not used on a production database because, although it returns disk space, it locks the table. It is only used during maintenance. If a table has to be shrunk in a live environment, the REPACK extension can be used instead.
By contrast, the full VACUUM does reclaim the free space to the operating system, but it blocks application workloads. You can think of it as Java’s “stop-the-world” garbage collection pause. It’s only in PostgreSQL that such a pause can last for hours (or days). Thus, database admins try their best to prevent the full VACUUM from happening at all.
The explanation is as follows:
This will reclaim all the space used by bloat and stale tuples, but in turn it creates problems such as locking the table (the application hangs), which ultimately makes it unsuitable for production use. The only time we can use it is during application downtime taken for maintenance.
The explanation is as follows:
VACUUM FULL
The VACUUM FULL process returns the space to the operating system, ... It does the following tasks.

1. VACUUM FULL process obtains an exclusive lock on the table.
2. It creates a new empty storage table file.
3. Copy the live tuples to the new table storage.
4. Removes the old table file and frees the storage.
5. It rebuilds all associated table indexes, updates the system catalogs and statistics.
Example
To vacuum the table named SampleTable, we do the following:
VACUUM FULL SampleTable;
Then, to see the effect of this command, we do the following. The pg_freespace function is provided by the pg_freespacemap extension.
SELECT
  COUNT(*) AS npages,
  ROUND(100 * AVG(avail) / 8192, 2) AS avg_fs_ratio
FROM
    pg_freespace('SampleTable');

REPACK Extension - For Storage Optimization

To install it, we do the following:
CREATE EXTENSION pg_repack;
The explanation is as follows. It copies the table to a new location and rebuilds the indexes, which saves storage space.
It creates a new table like we have seen in reindexing and copies all the data from the old table to a new table and after that rebuild the indexes. After full copying is done new table acts as the main one and the old one is dropped.
We do it like this:
pg_repack -k --table mytable mydb;


PG_STAT_ALL_INDEXES System View

seq_scan Field
Example
The explanation is as follows:
Suppose you had a database with many indexes and wanted to monitor how well they were being used. You can compare the number of index scans against the number of sequential scans.
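We do it like this. Since seq_scan is a column of pg_stat_all_tables rather than pg_stat_all_indexes, the comparison is made per table:
SELECT relname, idx_scan, seq_scan
FROM pg_stat_all_tables
ORDER BY (COALESCE(idx_scan, 0) + COALESCE(seq_scan, 0)) DESC LIMIT 10;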
idx_scan Field
Example
To find unused indexes, we do the following:
SELECT
  relname,
  indexrelname,
  idx_scan,
  idx_tup_read,
  idx_tup_fetch,
  -- indexrelid avoids casting a possibly mixed-case index name to regclass
  pg_size_pretty(pg_relation_size(indexrelid)) as size
FROM
  pg_stat_all_indexes
WHERE
  schemaname = 'public'
  and indexrelname not like 'pg_toast_%'
  and idx_scan = 0
  and idx_tup_read = 0
  and idx_tup_fetch = 0
  and indexrelname != 'SequelizeData_pkey'
ORDER BY
  pg_relation_size(indexrelid) desc;

Wednesday, 21 December 2022

CREATE DATABASE

Example
We do it like this:
> CREATE DATABASE pg_dev OWNER postgres;


Tuesday, 20 December 2022

Docker Compose and PostgreSQL Debezium

Introduction
The images are as follows:
debezium/postgres:13, zookeeper, confluentinc/cp-enterprise-kafka, and debezium/connect are always present.

In addition,
confluentinc/cp-schema-registry, confluentinc/ksqldb-server, provectuslabs/kafka-ui, and dpage/pgadmin4 can be used.

Example
We do it like this:
version: "3.7"
services:
  postgres:
    image: debezium/postgres:13
    ports:
      - 5432:5432
    healthcheck:
      test: "pg_isready -U postgresuser -d shipment_db"
      interval: 2s
      timeout: 20s
      retries: 10
    environment:
      - POSTGRES_USER=postgresuser
      - POSTGRES_PASSWORD=postgrespw
      - POSTGRES_DB=shipment_db
      - PGPASSWORD=postgrespw
    volumes:
      - ./scripts:/docker-entrypoint-initdb.d
  zookeeper:
    image: confluentinc/cp-zookeeper:5.5.3
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-enterprise-kafka:5.5.3
    depends_on: [zookeeper]
    environment:
      KAFKA_ZOOKEEPER_CONNECT: "zookeeper:2181"
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_BROKER_ID: 1
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_JMX_PORT: 9991
    ports:
    - 9092:9092
  debezium:
    image: debezium/connect:1.4
    environment:
      BOOTSTRAP_SERVERS: kafka:9092
      GROUP_ID: 1
      CONFIG_STORAGE_TOPIC: connect_configs
      OFFSET_STORAGE_TOPIC: connect_offsets
      KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
      VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
    depends_on: [kafka]
    ports:
      - 8083:8083
  schema-registry:
    image: confluentinc/cp-schema-registry:5.5.3
    environment:
     - SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=zookeeper:2181
     - SCHEMA_REGISTRY_HOST_NAME=schema-registry
     - SCHEMA_REGISTRY_LISTENERS=http://schema-registry:8081,http://localhost:8081
    depends_on: [zookeeper, kafka]

Example
We do it like this:
version: "3.7"
services:
  postgres:
    image: debezium/postgres:13
    ports:
      - 5432:5432
    environment:
      - POSTGRES_USER=admin
      - POSTGRES_PASSWORD=root
      - POSTGRES_DB=food
  pgadmin:
    image: dpage/pgadmin4
    environment:
      - PGADMIN_DEFAULT_EMAIL=admin@admin.com
      - PGADMIN_DEFAULT_PASSWORD=root
    ports:
      - '5050:80'
    restart: always
  zookeeper:
    image: confluentinc/cp-zookeeper:5.5.3
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-enterprise-kafka:5.5.3
    depends_on: [zookeeper]
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_JMX_PORT: 9991
    ports:
      - 29092:29092
  debezium:
    image: debezium/connect:1.4
    environment:
      BOOTSTRAP_SERVERS: kafka:9092
      GROUP_ID: 1
      CONFIG_STORAGE_TOPIC: connect_configs
      OFFSET_STORAGE_TOPIC: connect_offsets
      STATUS_STORAGE_TOPIC: my_status_topic
      KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
      VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8085
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8085
    depends_on: [kafka]
    ports:
      - 8083:8083
  schema-registry:
    image: confluentinc/cp-schema-registry:5.5.3
    environment:
      - SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=zookeeper:2181
      - SCHEMA_REGISTRY_HOST_NAME=schema-registry
      - SCHEMA_REGISTRY_LISTENERS=http://schema-registry:8085,http://localhost:8085
    ports:
      - 8085:8085
    depends_on: [zookeeper, kafka]

Example - postgres + debezium + kafka
Suppose we have a file like the following. Four containers are defined here: postgres, zookeeper, kafka, and connector. To connect to Postgres, the user name is "appuser" and the password is "qwerty".
version: '3.1'
services:
    postgres:
        image: debezium/postgres
        environment:
          POSTGRES_PASSWORD: qwerty
          POSTGRES_USER: appuser
        volumes:
           - ./postgres:/data/postgres
        ports:
          - 6532:6532
    zookeeper:
        image: confluentinc/cp-zookeeper
        ports:
          - "2181:2181"
        environment:
          ZOOKEEPER_CLIENT_PORT: 2181
    kafka:
        image: confluentinc/cp-kafka
        depends_on:
          - zookeeper
          - postgres
        ports:
          - "9092:9092"
        environment:
          KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
          KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
          KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
          KAFKA_LOG_CLEANER_DELETE_RETENTION_MS: 5000
          KAFKA_BROKER_ID: 1
          KAFKA_MIN_INSYNC_REPLICAS: 1
    connector:
        image: debezium/connect:latest
        ports:
          - "8083:8083"
        environment:
          GROUP_ID: 1
          CONFIG_STORAGE_TOPIC: my_connect_configs
          OFFSET_STORAGE_TOPIC: my_connect_offsets
          BOOTSTRAP_SERVERS: kafka:9092
        depends_on:
          - zookeeper
          - postgres
          - kafka
Example - postgres + debezium + kafka
We do it like this:
services:
  db:
    image: postgres:latest
    ports:
      - "5432:5432"
    environment:
      - POSTGRES_PASSWORD=arctype

  zookeeper:
    image: debezium/zookeeper
    ports:
      - "2181:2181"
      - "2888:2888"
      - "3888:3888"

  kafka:
    image: debezium/kafka
    ports:
      - "9092:9092"
      - "29092:29092"
    depends_on:
      - zookeeper
    environment:
      - ZOOKEEPER_CONNECT=zookeeper:2181
      - KAFKA_ADVERTISED_LISTENERS=LISTENER_EXT://localhost:29092,LISTENER_INT://kafka:9092
      - KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=LISTENER_INT:PLAINTEXT,LISTENER_EXT:PLAINTEXT
      - KAFKA_LISTENERS=LISTENER_INT://0.0.0.0:9092,LISTENER_EXT://0.0.0.0:29092
      - KAFKA_INTER_BROKER_LISTENER_NAME=LISTENER_INT

  connect:
    image: debezium/connect
    ports:
      - "8083:8083"
    environment:
      - BOOTSTRAP_SERVERS=kafka:9092
      - GROUP_ID=1
      - CONFIG_STORAGE_TOPIC=my_connect_configs
      - OFFSET_STORAGE_TOPIC=my_connect_offsets
      - STATUS_STORAGE_TOPIC=my_connect_statuses
    depends_on:
      - zookeeper
      - kafka
Example - postgres + debezium + kafka
We do it like this:
version: "3.7"
services:
  postgres:
    image: debezium/postgres:13
    ports:
      - 5432:5432
    environment:
      - POSTGRES_USER=docker
      - POSTGRES_PASSWORD=docker
      - POSTGRES_DB=profile_service_dev
  zookeeper:
    image: confluentinc/cp-zookeeper:5.5.3
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-enterprise-kafka:5.5.3
    depends_on: [zookeeper]
    environment:
      KAFKA_ZOOKEEPER_CONNECT: "zookeeper:2181"
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_BROKER_ID: 1
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_JMX_PORT: 9991
    ports:
      - 9092:9092
  kafka-ui:
    container_name: kafka-ui
    image: provectuslabs/kafka-ui:latest
    ports:
      - 8080:8080
    depends_on:
      - zookeeper
      - kafka
      - schema-registry
    environment:
      KAFKA_CLUSTERS_0_NAME: local
      KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS: kafka:9092
      KAFKA_CLUSTERS_0_ZOOKEEPER: zookeeper:2181
      KAFKA_CLUSTERS_0_JMXPORT: 9997
      KAFKA_CLUSTERS_0_SCHEMAREGISTRY: http://schema-registry:8081
  debezium:
    image: debezium/connect:1.4
    environment:
      BOOTSTRAP_SERVERS: kafka:9092
      GROUP_ID: 1
      CONFIG_STORAGE_TOPIC: connect_configs
      OFFSET_STORAGE_TOPIC: connect_offsets
      KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
      VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
    depends_on: [kafka]
    ports:
      - 8083:8083
  schema-registry:
    image: confluentinc/cp-schema-registry:5.5.3
    environment:
      - SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=zookeeper:2181
      - SCHEMA_REGISTRY_HOST_NAME=schema-registry
      - SCHEMA_REGISTRY_LISTENERS=http://schema-registry:8081,http://localhost:8081
    ports:
      - 8081:8081
    depends_on: [zookeeper, kafka]
Example
We do it like this:
version: '3.1'

services:

  db:
    image: postgres
    container_name: postgres
    restart: always
    environment:
      POSTGRES_PASSWORD: example
    ports:
      - 5432:5432
    volumes:
      - ./data:/var/lib/postgresql/data
    command:
      - postgres
      - -c
      - wal_level=logical

  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    container_name: zookeeper
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000

  kafka:
    image: confluentinc/cp-enterprise-kafka:latest
    container_name: kafka
    depends_on:
      - zookeeper
    ports:
      - 9092:9092
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: "true"
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 100
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1

  schema-registry:
    image: confluentinc/cp-schema-registry:latest
    container_name: schema-registry
    ports:
      - 8081:8081
    depends_on:
      - zookeeper
      - kafka
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: zookeeper:2181
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: PLAINTEXT://kafka:29092

  kafka-connect:
    image: confluentinc/cp-kafka-connect:latest
    container_name: kafka-connect
    depends_on:
      - zookeeper
      - kafka
      - schema-registry
    ports:
      - 8083:8083
    environment:
      CONNECT_BOOTSTRAP_SERVERS: "kafka:29092"
      CONNECT_REST_PORT: 8083
      CONNECT_REST_ADVERTISED_HOST_NAME: "kafka-connect"
      CONNECT_GROUP_ID: compose-connect-group
      CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
      CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
      CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
      CONNECT_KEY_CONVERTER: org.apache.kafka.connect.storage.StringConverter
      CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: 'http://schema-registry:8081'
      CONNECT_LOG4J_ROOT_LOGLEVEL: "INFO"
      CONNECT_LOG4J_LOGGERS: "org.apache.kafka.connect.runtime.rest=WARN,org.reflections=ERROR"
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_PRODUCER_CONFIG_ACKS: 1
      CONNECT_PLUGIN_PATH: '/usr/share/java,/usr/share/confluent-hub-components'
    command:
      - /bin/bash
      - -c
      - |
        echo "Installing Connector"
        confluent-hub install --no-prompt confluentinc/kafka-connect-jdbc:latest
        confluent-hub install --no-prompt debezium/debezium-connector-postgresql:1.9.3
        /etc/confluent/docker/run &
        sleep infinity
  ksqldb:
    image: confluentinc/ksqldb-server:latest
    container_name: ksqldb
    depends_on:
      - kafka
      - schema-registry
    ports:
      - 8088:8088
    environment:
      KSQL_LISTENERS: http://0.0.0.0:8088
      KSQL_BOOTSTRAP_SERVERS: kafka:29092
      KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: "true"
      KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: "true"
      KSQL_KSQL_SCHEMA_REGISTRY_URL: http://schema-registry:8081
      KSQL_STREAMS_PRODUCER_MAX_BLOCK_MS: 9223372036854775807
      KSQL_KSQL_CONNECT_URL: http://kafka-connect:8083
      KSQL_KSQL_SERVICE_ID: crypto-ksql
      KSQL_KSQL_HIDDEN_TOPICS: '^_.*'



Friday, 16 December 2022

PG_STAT_USER_TABLES System View - Contains Performance Information

n_dead_tup Column

Example - shows dead tuples
We do it like this:
SELECT 
  relname AS TableName ,
  n_live_tup AS LiveTuples,
  n_dead_tup AS DeadTuples ,
  last_autovacuum AS Autovacuum ,
  last_autoanalyze AS Autoanalyze
FROM pg_stat_user_tables;
n_live_tup Column
Example
We do it like this:
-- find least read tables
SELECT schemaname, relname, seq_tup_read, idx_tup_fetch
FROM   pg_catalog.pg_stat_user_tables
ORDER  BY seq_tup_read + idx_tup_fetch ASC;

-- find least write tables
SELECT schemaname, relname, n_tup_ins, n_tup_del, n_tup_upd
FROM   pg_catalog.pg_stat_user_tables
ORDER  BY n_tup_ins + n_tup_del + n_tup_upd ASC;

-- find empty tables
SELECT schemaname, relname,n_live_tup
FROM pg_catalog.pg_stat_user_tables
WHERE n_live_tup < 1; 


Tuesday, 6 December 2022

PG_STAT_ACTIVITY System View - For Monitoring Performance

Introduction
The explanation is as follows:
There are several built-in tools and utilities available in PostgreSQL for monitoring performance metrics, such as the pg_stat_activity view and the pg_stat_database view.
query_start Column
Example
We do it like this. This way we can see the queries that have been running the longest.
SELECT now() - pg_stat_activity.query_start AS duration,
  usesysid,
  usename,
  client_addr,
  pid,
  state,
  datname,
  query,
  backend_type
FROM pg_stat_activity
order by duration desc;
state Column
Returns values such as active/idle.

Example
To group the connections by the state column, we do the following. It shows the state of the connections.
SELECT count(*),state FROM pg_stat_activity GROUP BY 2;
 count |            state
-------+------------------------------
     7 | active
    32 | idle
    22 | idle in transaction
     4 | idle in transaction(aborted)
(4 rows)
datname Column
The name of the database in which the process is running.

Example - Kill Connection
To kill all connections except my own, we do the following:
SELECT
    pg_terminate_backend(pid)
FROM
    pg_stat_activity
WHERE
    -- don't kill my own connection!
    -- (the column is named pid since PostgreSQL 9.2; it was procpid before)
    pid <> pg_backend_pid()
    -- don't kill the connections to other databases
    AND datname = 'database_name'
    ;
pid Column
The unique id of the process.

query Column
The actual query text, possibly truncated.
Example
We do it like this:
SELECT 
  datname as database_name, 
  pid as Process_ID,
  usename as username,
  application_name,
  client_addr as Host,
  client_port as Port,
  state,
  query 
FROM pg_stat_activity;

usename Column
The database user name.
Example
We do it like this:
SELECT pid, datname, usename, query, state
FROM pg_stat_activity;
usesysid Column
The database user id.



Monday, 5 December 2022

Type Casts

Introduction
The explanation is as follows:
A type cast specifies a conversion from one data type to another.

PostgreSQL accepts two equivalent syntaxes for type casts, 
1. the PostgreSQL-specific value::type and
2. the SQL-standard CAST(value AS type).
1. The Old Method (value::type)
text
Example
We do it like this. Here a text[] is created:
SELECT '{apple,cherry apple, avocado}'::text[];
Example
We do it like this. Here three varchar(255) columns are concatenated:
SELECT 
(((instructors.title::text || ' '::text) || instructors.first_name::text) || ' '::text) || instructors.last_name::text 
AS name
FROM instructors;


COUNT(*) vs COUNT(1)

Introduction
Summary: In short, COUNT(*) and COUNT(1) both count all rows, including those with NULL values. In most databases there is no difference between the two, but in PostgreSQL there is one.

Long Explanation
The explanation is as follows. In other words, there should really be no performance difference between COUNT(*) and COUNT(1).
- COUNT(*) counts all the tuples in a group
- COUNT(<expr>) counts all the tuples in a group for which <expr> evaluates to something that IS NOT NULL
...
Now that we know the theory behind these COUNT expressions, what’s the difference between COUNT(*) and COUNT(1). There is none, effectively. The 1 expression in COUNT(1) evaluates a constant expression for each row in the group, and it can be proven that this constant expression will never evaluate to NULL, so effectively, we’re running COUNT(*), counting ALL the rows in the group again.
However, PostgreSQL is reported to show a performance difference. The explanation is as follows. Therefore, using COUNT(*) may be the better choice.
As it is now in 2019, given the database versions mentioned above, unfortunately, there is a significant difference between COUNT(*) and COUNT(1) in PostgreSQL. Luckily (and this is rare in SQL), all the other dialects don’t care and thus, consistently using COUNT(*), rather than COUNT(1) is a slightly better choice for ALL measured database products from this article.
Some Notes on COUNT(*)
COUNT(*) shows how many tuples there are. The explanation is as follows:
Why is count(*) efficient?
When you use count(*), the database engine will use an index to count the rows. T
If we use COUNT(*) with a LEFT JOIN, a row is still counted as 1 even when its joined columns are NULL, because it is still a tuple.

Example
Suppose we have the following SQL. Here the actor and film tables are combined with LEFT JOIN. It shows the actors and their film counts.
SELECT actor_id, a.first_name, a.last_name, COUNT(*) AS count
FROM actor AS a
LEFT JOIN film_actor AS fa USING (actor_id)
LEFT JOIN film AS f USING (film_id)
GROUP BY actor_id
ORDER BY count ASC, actor_id ASC;
The output is as follows. SUSAN DAVIS gets a count of 1 even though she has no films, because COUNT(*) shows how many rows there are.
actor_id|first_name |last_name   |count |
--------|-----------|------------|----- |
     201|SUSAN      |DAVIS       | 1    |
     148|EMILY      |DEE         |14    |
      35|JUDY       |DEAN        |15    |
     199|JULIA      |FAWCETT     |15    |
     186|JULIA      |ZELLWEGER   |16    |
      31|SISSY      |SOBIESKI    |18    |
      71|ADAM       |GRANT       |18    |
       1|PENELOPE   |GUINESS     |19    |
      30|SANDRA     |PECK        |19    |
This time we use COUNT(film_id) instead of COUNT(*).
SELECT actor_id, a.first_name, a.last_name, COUNT(film_id) AS c
FROM actor AS a
LEFT JOIN film_actor AS fa USING (actor_id)
LEFT JOIN film AS f USING (film_id)
GROUP BY actor_id
ORDER BY c ASC, actor_id ASC;
The output is as follows:
actor_id|first_name |last_name   |c |
--------|-----------|------------|--|
     201|SUSAN      |DAVIS       | 0|
     148|EMILY      |DEE         |14|
      35|JUDY       |DEAN        |15|
     199|JULIA      |FAWCETT     |15|
     186|JULIA      |ZELLWEGER   |16|
      31|SISSY      |SOBIESKI    |18|
      71|ADAM       |GRANT       |18|
       1|PENELOPE   |GUINESS     |19|
      30|SANDRA     |PECK        |19|
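The difference between the two outputs above can be reproduced with a small self-contained Python sketch. As an assumption made purely so the example runs without a server, it uses an in-memory SQLite database instead of PostgreSQL, with a cut-down hypothetical version of the actor and film_actor tables. The NULL handling of COUNT(*) and COUNT(column) shown here follows standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER);
    -- Actor 201 deliberately has no films.
    INSERT INTO actor VALUES (1, 'GUINESS'), (201, 'DAVIS');
    INSERT INTO film_actor VALUES (1, 10), (1, 11);
""")

rows = conn.execute("""
    SELECT a.actor_id,
           COUNT(*)          AS count_star,  -- counts rows, including the NULL-extended one
           COUNT(fa.film_id) AS count_film   -- counts only non-NULL film_id values
    FROM actor AS a
    LEFT JOIN film_actor AS fa ON fa.actor_id = a.actor_id
    GROUP BY a.actor_id
    ORDER BY a.actor_id
""").fetchall()

print(rows)
```

For the actor without films, the LEFT JOIN still produces one row with a NULL film_id, so COUNT(*) reports 1 while COUNT(fa.film_id) reports 0.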



Friday, 2 December 2022

Debezium Connector

Introduction
It reads the CDC records in PostgreSQL and writes them to a topic in Kafka.

1. The connector is given a name.
2. connector.class is always set to io.debezium.connector.postgresql.PostgresConnector. This connector reads from the database.

3. The database connection information is defined. The fields are as follows:
database.hostname
database.port
database.user
database.password
database.dbname

4. database.server.name defines the prefix of the topic names that are created in Kafka.
5. table.include.list specifies the tables to read; alternatively, schema.whitelist specifies the schemas.
6. snapshot.mode can be set to always.

7. If we want the database changes, that is the WAL output, as JSON, we do the following:
"plugin.name": "wal2json",
"value.converter": "org.apache.kafka.connect.json.JsonConverter"

key.converter and value.converter Fields

Example
Suppose we have the following JSON:
{
    "name": "warehouse-connector",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "plugin.name": "pgoutput",
        "database.hostname": "postgres",
        "database.port": "5432",
        "database.user": "docker",
        "database.password": "docker",
        "database.dbname": "profile_service_dev",
        "database.server.name": "postgres",
        "snapshot.mode": "always",
        "table.include.list": "public.warehouse"
    }
}
We do it like this:
curl -i -X POST -H "Accept:application/json" \
  -H "Content-Type:application/json" \
  127.0.0.1:8083/connectors/ \
  --data "@connector.json"
Changes in the database appear in the Kafka topic named "postgres.public.warehouse".


- The connect_configs and connect_offsets topics are Debezium's own topics. They use underscores because the config and offset topics were named in the Debezium configuration.
- connect-status is also a Debezium topic, but Debezium created it itself.
The messages of the warehouse table can be viewed in this topic.


Example
We do it like this. We send the request to Debezium on port 8083.
curl -X POST  http://localhost:8083/connectors/ \
  -H 'content-type: application/json' \
  -d '{
   "name": "student-outbox-connector",
   "config": {
      "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
      "tasks.max": "1",
      "database.hostname": "postgres",
      "database.port": "5432",
      "database.user": "postgres",
      "database.password": "postgres",
      "database.dbname": "user_DB",
      "database.server.name": "pg-outbox-server",
      "tombstones.on.delete": "false",
      "table.whitelist": "public.outbox",
      "transforms": "outbox",
      "transforms.outbox.type": "com.eresh.outbox.OutboxTransformer"
   }
}'

Example
We do it like this. It writes the CDC records of the retail.orders_info table to the topic named myserver.retail.orders_info. Debezium creates the topic name in the form serverName.schemaName.tableName.
{
    "name": "pg-orders-source",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "localhost",
        "database.port": "5432",
        "database.user": "postgres",
        "database.password": "password",
        "database.dbname": "postgres",
        "database.server.name": "myserver",
        "plugin.name": "wal2json",
        "table.include.list": "retail.orders_info",
        "value.converter": "org.apache.kafka.connect.json.JsonConverter"
    }
}
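The serverName.schemaName.tableName naming rule described above can be sketched in a few lines of Python. The helper name debezium_topic_name is hypothetical and only mirrors the rule as stated in this post; it is not part of Debezium, and it assumes every table in table.include.list is schema-qualified.

```python
def debezium_topic_name(server_name: str, table: str) -> str:
    """Build serverName.schemaName.tableName from a schema-qualified table name."""
    schema_name, table_name = table.split(".", 1)
    return f"{server_name}.{schema_name}.{table_name}"


# The two relevant settings from the connector configuration above.
config = {
    "database.server.name": "myserver",
    "table.include.list": "retail.orders_info",
}

# table.include.list is a comma-separated list, so each entry yields one topic.
topics = [
    debezium_topic_name(config["database.server.name"], table.strip())
    for table in config["table.include.list"].split(",")
]

print(topics)
```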

Example
We do it like this. Changes in the database appear in the Kafka topic named "postgres.public.shipments".
curl -H 'Content-Type: application/json' debezium:8083/connectors --data '
{
  "name": "shipments-connector",  
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector", 
    "plugin.name": "pgoutput",
    "database.hostname": "postgres", 
    "database.port": "5432", 
    "database.user": "postgresuser", 
    "database.password": "postgrespw", 
    "database.dbname" : "shipment_db", 
    "database.server.name": "postgres", 
    "table.include.list": "public.shipments" 
  }
}'
Example
We do it like this:
{
  "name": "fulfillment-connector",  
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector", 
    "database.hostname": "192.168.99.100", 
    "database.port": "5432", 
    "database.user": "postgres", 
    "database.password": "postgres", 
    "database.dbname" : "postgres", 
    "database.server.name": "fulfillment", 
    "table.include.list": "public.inventory" 
  }
}
column.include.list Field
Example
We do it like this. It writes to the topic named debezium.event_store.mt_events.
{
    "name": "postgres-debezium-connector",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "localhost",
        "database.port": "5432",
        "database.user": "postgres",
        "database.password": "1234",
        "database.dbname": "MartenDB",
        "table.include.list": "event_store.mt_events",
        "column.include.list": "event_store.mt_events.data",
        "plugin.name": "pgoutput",
        "topic.prefix": "debezium",
        "slot.name": "debezium_replication_slot"
    }
}

transforms Field
Example
We do it like this:
"table.include.list": "debezium_postgres_demo.outbox",


// It routes the events, to the topic based on the value written to the destination field 
// of the Outbox table:
// Debezium defaults the topic name to outbox.event followed by the value in 
// the route.by.field above, or the aggregatetype column if that is not set. 
// In this case as the item is written to the database with a destination value of item, 
// the topic name resolves to outbox.event.item.

"transforms": "outbox",
"transforms.outbox.type": "io.debezium.transforms.outbox.EventRouter",
"transforms.outbox.route.by.field": "destination",

// Events are transformed to use the value of the Outbox id field as the event key, 
// and to use the value of the payload field as the event payload
"transforms.outbox.table.field.event.key": "id",
"transforms.outbox.table.field.event.payload": "payload",
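The routing described in the comments above can be sketched in Python. This is only an illustration of the stated behaviour (topic outbox.event.<value of the route.by.field column>, key taken from id, payload taken from payload). The function route_outbox_row and the sample row are hypothetical, not Debezium code.

```python
def route_outbox_row(row, route_by_field="destination",
                     key_field="id", payload_field="payload"):
    """Return (topic, key, payload) for one outbox row, following the settings above."""
    # Topic name: the default "outbox.event" prefix followed by the routing value.
    topic = f"outbox.event.{row[route_by_field]}"
    return topic, row[key_field], row[payload_field]


# Hypothetical outbox row written with a destination value of "item".
outbox_row = {"id": "42", "destination": "item", "payload": '{"name": "book"}'}
routed = route_outbox_row(outbox_row)

print(routed)
```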



Thursday, 1 December 2022

MERGE - Arrives with Postgres 15

Introduction
The explanation is as follows:
The MERGE command is one of the newest additions to the Postgres DB with the v15 release. It has been a standard feature on other SQL variants like SQL Server before. Take a look at the RFC on graphql-engine to see how the various similar operations are supported through the Hasura GraphQL API on SQL Server.

Merge statements are a combination of INSERT, UPDATE and DELETE.

Merge statement allows bringing in data from a source table into a target table.
It merges the source table into the specified target table. MERGE and INSERT ... ON CONFLICT (that is, UPSERT) are different things. The explanation is as follows:
Note: MERGE is often (incorrectly) used interchangeably with the term UPSERT.
The explanation is as follows:
What about MERGE?

SQL-standard MERGE actually has poorly defined concurrency semantics and is not suitable for upserting without locking a table first.

It's a really useful OLAP statement for data merging, but it's not actually a useful solution for concurrency-safe upsert. There's lots of advice to people using other DBMSes to use MERGE for upserts, but it's actually wrong.

Example
We do it like this:
MERGE INTO customer_history c
  USING daily_orders d
  ON (c.customer_id = d.customer_id)

  WHEN MATCHED THEN
    UPDATE SET     -- Existing customer, update the order count and the timestamp of order.
      order_count = c.order_count + 1,
      last_order_id = d.order_id

  WHEN NOT MATCHED THEN       -- New entry, record it.
    INSERT (customer_id, last_order_id, order_center, order_count, last_order)
      VALUES (customer_id, d.order_id, d.order_center, 1, d.order_time);
Example
We do it like this:
MERGE INTO wines w
USING wine_stock_changes s
ON s.winename = w.winename
 WHEN NOT MATCHED AND s.stock_delta > 0 THEN
   INSERT VALUES(s.winename, s.stock_delta)
 WHEN MATCHED AND w.stock + s.stock_delta > 0 THEN
   UPDATE SET stock = w.stock + s.stock_delta
 WHEN MATCHED THEN
  DELETE;
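The way the WHEN clauses of the wines example are evaluated (in order, with the first clause whose condition holds being applied) can be illustrated with a plain Python sketch. It models the tables as a dict and a list, assumes each wine appears at most once in the change list, and uses hypothetical wine names. It illustrates the clause order only and says nothing about the concurrency behaviour discussed above.

```python
def merge_wine_stock(wines, stock_changes):
    """Apply the WHEN clauses of the wines MERGE to a dict of winename -> stock."""
    result = dict(wines)
    for winename, stock_delta in stock_changes:
        if winename not in result:
            # WHEN NOT MATCHED AND s.stock_delta > 0 THEN INSERT
            if stock_delta > 0:
                result[winename] = stock_delta
        elif result[winename] + stock_delta > 0:
            # WHEN MATCHED AND w.stock + s.stock_delta > 0 THEN UPDATE
            result[winename] += stock_delta
        else:
            # WHEN MATCHED THEN DELETE (stock would drop to zero or below)
            del result[winename]
    return result


wines = {"merlot": 10, "syrah": 5}
changes = [("merlot", -3), ("syrah", -5), ("riesling", 4), ("malbec", -1)]
merged = merge_wine_stock(wines, changes)

print(merged)
```

Here merlot is updated, syrah is deleted because its stock would reach zero, riesling is inserted, and malbec is ignored because no clause matches an unknown wine with a non-positive delta.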






Sunday, 20 November 2022

EVERY - Aggregate Function

Example
We do it like this:
SELECT author_id, EVERY( title LIKE '%a')
FROM book
GROUP BY author_id


author_id every
--- ---
1 false
2 true
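What EVERY computes per group can be mimicked in Python with all(). The book rows below are hypothetical sample data chosen to reproduce the output shown above, and endswith("a") stands in for the case-sensitive pattern LIKE '%a'.

```python
from itertools import groupby

# Hypothetical (author_id, title) rows of the book table.
books = [
    (1, "Emma"), (1, "Persuasion"),
    (2, "Lolita"), (2, "Ada"),
]

# EVERY(title LIKE '%a') ... GROUP BY author_id:
# true only if the condition holds for every row of the group.
every_by_author = {
    author_id: all(title.endswith("a") for _, title in group)
    for author_id, group in groupby(sorted(books), key=lambda book: book[0])
}

print(every_by_author)
```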


Analytic Functions / Window Functions - ROWS BETWEEN

Example
The question is as follows:
For example, if we want to sum up the revenue from all previous years till this year, we can use this window function!
We do it like this. The ORDER BY inside the window is what makes the sum accumulate year by year:
SELECT year, SUM(revenue)
OVER (
  ORDER BY year
  ROWS BETWEEN
    UNBOUNDED PRECEDING
    AND
    CURRENT ROW
) AS running_sum
FROM revenue_table;
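A running sum of this kind can be tried out without a PostgreSQL server. The sketch below assumes an in-memory SQLite database as a stand-in (window functions need SQLite 3.25 or newer) and uses made-up revenue figures. The ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW frame is standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE revenue_table (year INTEGER, revenue INTEGER);
    -- Made-up sample figures.
    INSERT INTO revenue_table VALUES (2020, 100), (2021, 150), (2022, 50);
""")

running = conn.execute("""
    SELECT year,
           SUM(revenue) OVER (
               ORDER BY year
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS running_sum
    FROM revenue_table
    ORDER BY year
""").fetchall()

print(running)
```

Each row's running_sum adds the current year's revenue to the total of all earlier years, which is what the frame clause expresses.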

Tuesday, 8 November 2022

Preparation for Using Debezium

Introduction

1. WAL Level
The wal_level setting in the /var/lib/postgresql/data/postgresql.conf file is set to logical.

2. Creating a Group/User and a Publication + Slot
The explanation is as follows:
1. To stream changes your user needs the Replication attribute. Without it, you'll hit permission errors. 
2. After updating the parameters and granting the right permissions, the next steps are to create a publication, set up a replication slot, and start streaming changes.
The explanation is as follows:
A publication defines which tables' changes you're publishing. 
The explanation is as follows:
 A replication slot ensures changes stick around in the WAL until your consumer reads them.
Example
We do it like this. Here a replication_role group with the Replication attribute is created first. Then this role is granted to a user.
CREATE ROLE replication_role WITH REPLICATION LOGIN;
CREATE USER replicator WITH PASSWORD 'your-secure-password';
GRANT replication_role TO replicator;
-- or: ALTER USER replicator REPLICATION;
We do it like this. The publication is created with SQL, and the pg_recvlogical command creates the slot and streams the changes.
# Let's create a publication for the users table (SQL, run in psql)
CREATE PUBLICATION users_pub FOR TABLE users;

# Let's create a slot
pg_recvlogical -h $SERVER_NAME -U replicator -d postgres --slot users_slot \
  --create-slot -P wal2json

# Start streaming changes and see what they look like:
pg_recvlogical -h $SERVER_NAME -U replicator -d postgres --slot users_slot \
  --start -o pretty-print=1 -f -
Example
We do it like this. Here a user named cdcuser with the Replication attribute is created first. Then a group named replication_group is created. Finally, privileges are granted to the replication_group group.
CREATE USER cdcuser WITH PASSWORD 'cdcpassword' REPLICATION LOGIN;

CREATE ROLE replication_group WITH USER foouser, cdcuser;
GRANT CREATE ON DATABASE quant_core TO replication_group;
GRANT USAGE ON SCHEMA foo TO replication_group;
GRANT CREATE ON SCHEMA foo TO replication_group;

CREATE TABLE foo.cdc_heartbeat (
    heartbeat bit
);
ALTER TABLE foo.cdc_heartbeat OWNER TO cdcuser;



Thursday, 20 October 2022

SQL Commands

Introduction
SQL commands can be grouped as follows:
1. DDL – Data Definition Language
2. DQL – Data Query Language
3. DML – Data Manipulation Language
4. DCL – Data Control Language




INFORMATION_SCHEMA.COLUMNS System View

Introduction
The explanation is as follows:
It's a part of the SQL-92 standard, and it's implemented by most major database engines (with the notable exception of Oracle).
data_type Column
Example
To see the columns of type OID, we do the following:
SELECT * FROM information_schema.columns WHERE data_type = 'oid';
table_name Column
Example
We do it like this:
SELECT column_name, ordinal_position, is_nullable, data_type
FROM information_schema.columns WHERE table_name='...';