Thursday, December 24, 2020

ALTER TABLE

Introduction
The syntax is as follows (note that PostgreSQL uses ALTER COLUMN ... TYPE rather than MODIFY)
ALTER TABLE table_name
ADD COLUMN column_name data_type [constraint],
ALTER COLUMN column_name TYPE data_type,
DROP COLUMN column_name,
ADD CONSTRAINT constraint_name constraint_definition,
DROP CONSTRAINT constraint_name;
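A concrete sketch (the person table and its columns here are hypothetical):
-- Hypothetical table and column names, for illustration only
ALTER TABLE person ADD COLUMN email varchar(255);
ALTER TABLE person ADD CONSTRAINT person_email_uk UNIQUE (email);
ALTER TABLE person ALTER COLUMN email TYPE text;
ALTER TABLE person DROP CONSTRAINT person_email_uk;
ALTER TABLE person DROP COLUMN email;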

DROP COLUMN
Example
Normally we do the following
ALTER TABLE foo DROP COLUMN bar;
However, this sometimes fails with a "vsnprintf failed: Invalid argument" error; switching the message locale works around it. In that case we do the following
SET lc_messages = 'C';
ALTER TABLE foo DROP COLUMN bar;
REPLICA IDENTITY
Example
We do the following
ALTER TABLE ingredients REPLICA IDENTITY FULL;
The explanation is as follows
The ALTER TABLE command with the REPLICA IDENTITY clause is used to set the replication identity for a table. When using Debezium's PostgreSQL connector, the connector requires a unique primary key or a unique identifier to keep track of changes. If a table does not have a primary key or a unique identifier, you will get an error, saying that the table does not have a replica identity, which is used for tracking changes. By running ALTER TABLE ... REPLICA IDENTITY FULL, you're setting the table's replication identity to "full", which means that the entire row is used as the identifier for change tracking purposes.
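To inspect a table's current setting, the relreplident column of pg_class can be queried:
-- d = default (primary key), n = nothing, f = full, i = index
SELECT relreplident FROM pg_class WHERE relname = 'ingredients';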

Monday, December 21, 2020

Docker Compose and PostgreSQL

Introduction
Some notes on using PostgreSQL with Docker Compose

Image Name
These can be, for example:
- postgres:11.1
- postgres:13.3
- postgres:15.1
- postgres:15rc2
- debezium/postgres
- debezium/postgres:13

The Simplest Setup
We do the following
version: '3'
services:

  authorization-db:
    image: postgres:11.1
    container_name: auth-db
    ports:
      - "5432:5432"
The command Field
These can be, for example:
- max_connections
- max_prepared_transactions

Example - max_connections
We do the following
services:
  database:
    image: postgres:latest
    command: postgres -c 'max_connections=250'
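Once the container is up, a quick sanity check from psql confirms the override:
SHOW max_connections;
-- expected: 250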
The environment Field
The environment variables can be, for example:
- POSTGRES_PASSWORD
- POSTGRES_USER
- POSTGRES_DB

Example
We do the following. Here two databases are run.
version: '3'
services:
  course-catalog-operational-db:
    image: postgres:13.3
    container_name: course-catalog-operational-db
    command:
      - "postgres"
      - "-c"
      - "wal_level=logical"
    environment:
      POSTGRES_PASSWORD: 123456
      POSTGRES_DB: course-catalog-db
    ports:
      - "5433:5432"
  instructors-legacy-db:
    image: postgres:13.3
    container_name: instructors-legacy-db
    command:
      - "postgres"
      - "-c"
      - "wal_level=logical"
    environment:
      POSTGRES_PASSWORD: 123456
      POSTGRES_DB: instructors-db
    ports:
      - "5434:5432"
    volumes:
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql
Example - Hasura GraphQL + PostgreSQL 15
We do the following
version: '3.6'
services:
  postgres:
    image: postgres:15rc2
    restart: always
    volumes:
    - db_data:/var/lib/postgresql/data
    environment:
      POSTGRES_PASSWORD: postgrespassword
    ports:
    - "5432:5432"
  graphql-engine:
    image: hasura/graphql-engine:v2.13.0
    ports:
    - "8080:8080"
    depends_on:
    - "postgres"
    restart: always
    environment:
      ## postgres database to store Hasura metadata
      HASURA_GRAPHQL_METADATA_DATABASE_URL: postgres://postgres:postgrespassword@postgres:5432/postgres
      ## this env var can be used to add the above postgres database to Hasura as a data source. this can be removed/updated based on your needs
      PG_DATABASE_URL: postgres://postgres:postgrespassword@postgres:5432/postgres
      ## enable the console served by server
      HASURA_GRAPHQL_ENABLE_CONSOLE: "true" # set to "false" to disable console
      ## enable debugging mode. It is recommended to disable this in production
      HASURA_GRAPHQL_DEV_MODE: "true"
      HASURA_GRAPHQL_ENABLED_LOG_TYPES: startup, http-log, webhook-log, websocket-log, query-log
      ## uncomment next line to set an admin secret
      HASURA_GRAPHQL_ADMIN_SECRET: myadminsecretkey
volumes:
  db_data:
docker-entrypoint-initdb.d
Specifies the SQL files to run when the database is initialized for the first time
Example
We do the following
services:
  postgres:
    image: postgres
    ports:
      - "5432:5432"
    restart: always
    environment:
      POSTGRES_PASSWORD: password
      POSTGRES_DB: blogdb
      POSTGRES_USER: user
    volumes:
      - ./data/postgresql:/var/lib/postgresql
      - ./pg-initdb.d:/docker-entrypoint-initdb.d
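Scripts in the mounted directory run only when an empty data directory is initialized for the first time. A minimal sketch of what ./pg-initdb.d/init.sql might contain (the posts table is hypothetical):
-- Hypothetical schema created on first startup
CREATE TABLE IF NOT EXISTS posts (
  id serial PRIMARY KEY,
  title text NOT NULL,
  created_at timestamp DEFAULT now()
);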
healthcheck
Example
We do the following.
-q specifies quiet mode
-d specifies the database name
-U specifies the user name
version: '3'
services:
  postgres:
    image: postgres:13.1
    healthcheck:
      test: [ "CMD", "pg_isready", "-q", "-d", "postgres", "-U", "root" ]
      timeout: 45s
      interval: 10s
      retries: 10
    restart: always
    environment:
      - POSTGRES_USER=root
      - POSTGRES_PASSWORD=password
      - APP_DB_USER=docker
      - APP_DB_PASS=docker
      - APP_DB_NAME=docker
    volumes:
      - ./db:/docker-entrypoint-initdb.d/
    ports:
      - 5432:5432
Example
We do the following. The user name is given with -U
postgres:
  container_name: scheduling-airflow-postgres
  image: postgres:13
  environment:
    POSTGRES_USER: airflow
    POSTGRES_PASSWORD: airflow
    POSTGRES_DB: airflow
  deploy:
    resources:
      limits:
        cpus: "0.40"
        memory: 1200M
  volumes:
    - postgres-db-volume:/var/lib/postgresql/data
  healthcheck:
    test: ["CMD", "pg_isready", "-U", "airflow"]
    interval: 5s
    retries: 5
  restart: always
  profiles:
    - scheduling
  networks:
    - datastack  
The restart Field
Usually the value always is used. The explanation is as follows
restart always : is used to restart the container if there is an error when creating the container.

The volumes Field
The container's /var/lib/postgresql/data directory is bound to a volume
Example
We do the following
version: '3.8'

services:
  ...
  db:
    image: postgres:15.2
    restart: always
    environment:
      POSTGRES_USER: book-user
      POSTGRES_PASSWORD: k9ZqLC
      POSTGRES_DB: bookdb
    volumes:
      - db-data:/var/lib/postgresql/data
    ports:
      - 6432:5432
volumes:
  db-data:
    driver: local



Friday, December 18, 2020

The pgbench Command

Introduction
The pgbench command creates test tables in the database and populates them with data (100,000 rows in pgbench_accounts per unit of scale factor). We then run benchmarks against them.

Example
We do the following
$ pgbench -c 10 -j 2 -t 1000 my_benchmark_test_db -h 127.0.0.1 -p 5444 -U postgres
Password:
pgbench (15.1 (Ubuntu 15.1-1.pgdg22.04+1))
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 50
query mode: simple
number of clients: 10
number of threads: 2
maximum number of tries: 1
number of transactions per client: 1000
number of transactions actually processed: 10000/10000
number of failed transactions: 0 (0.000%)
latency average = 75.438 ms
initial connection time = 160.700 ms
tps = 132.559344 (without initial connection time)
$
Next, let's change the shared_buffers setting. Initially it is as follows
$ show shared_buffers;
 shared_buffers
----------------
 128MB
(1 row)
Let's do the following
sudo vi /etc/postgresql/15/main/postgresql.conf

...
#------------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#------------------------------------------------------------------------------
# - Memory -
shared_buffers = 1GB                    # min 128kB
                                        # (change requires restart)
Let's restart the database
sudo systemctl restart postgresql
Let's check the value
$ show shared_buffers;
 shared_buffers
----------------
 1GB
(1 row)
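The same value can also be read from pg_settings; note that shared_buffers is stored in 8 kB pages:
SELECT setting, unit FROM pg_settings WHERE name = 'shared_buffers';
-- setting = 131072, unit = 8kB  ->  131072 * 8 kB = 1 GB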
Let's run the test again
$ pgbench -c 10 -j 2 -t 1000 my_benchmark_test_db -h 127.0.0.1 -p 5444 -U postgres
Password:
pgbench (15.1 (Ubuntu 15.1-1.pgdg22.04+1))
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 50
query mode: simple
number of clients: 10
number of threads: 2
maximum number of tries: 1
number of transactions per client: 1000
number of transactions actually processed: 10000/10000
number of failed transactions: 0 (0.000%)
latency average = 47.632 ms
initial connection time = 148.379 ms
tps = 209.944478 (without initial connection time)
The average latency dropped from 75.438 ms to 47.632 ms, and transactions per second rose from about 132.6 to 209.9


-C option - Connection Overhead
The explanation is as follows
The -C option in the pgbench indicates that for every single transaction, pgbench will close the open connection and create a new one. This is useful for measuring the connection overhead.
Example
We do the following
$ pgbench -c 20 -t 100 -S my_benchmark_test_db -h 127.0.0.1 -p 6432 -U my_db_user -C -f mysql.sql
Password:
pgbench (15.1 (Ubuntu 15.1-1.pgdg22.04+1))
starting vacuum...end.
transaction type: multiple scripts
scaling factor: 50
query mode: simple
number of clients: 20
number of threads: 1
maximum number of tries: 1
number of transactions per client: 100
number of transactions actually processed: 2000/2000
number of failed transactions: 0 (0.000%)
latency average = 178.276 ms
average connection time = 8.867 ms
tps = 112.185757 (including reconnection times)
SQL script 1: 
 - weight: 1 (targets 50.0% of total)
 - 1022 transactions (51.1% of total, tps = 57.326922)
 - number of failed transactions: 0 (0.000%)
 - latency average = 85.993 ms
 - latency stddev = 50.377 ms
SQL script 2: mysql.sql
 - weight: 1 (targets 50.0% of total)
 - 978 transactions (48.9% of total, tps = 54.858835)
 - number of failed transactions: 0 (0.000%)
 - latency average = 84.039 ms
 - latency stddev = 51.036 ms
-c option - the_number_of_clients_to_connect_with
Specifies how many connections will be opened.
Example
We do the following. Requesting 1000 clients directly against the server on port 5432 fails:
$  pgbench -c 1000 -T 60 my_benchmark_test_db -h 127.0.0.1 -p 5432 -U my_db_user
Password:
pgbench (15.1 (Ubuntu 15.1-1.pgdg22.04+1))
starting vacuum...end.
pgbench: error: connection to server at "127.0.0.1", port 5432 failed: FATAL:  sorry, too many clients already
connection to server at "127.0.0.1", port 5432 failed: FATAL:  sorry, too many clients already
pgbench: error: could not create connection for client 44
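The failure above is bounded by the server's max_connections setting (the default is 100), which we can check with:
SHOW max_connections;
-- e.g. 100, far below the 1000 clients requested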
Example
We do the following. This time the connection goes through port 6432 and all 1000 clients succeed.
pgbench -c 1000 -T 60 my_benchmark_test_db -h 127.0.0.1 -p 6432 -U my_db_user
Password:
pgbench (15.1 (Ubuntu 15.1-1.pgdg22.04+1))
starting vacuum...end.
transaction type: 
scaling factor: 50
query mode: simple
number of clients: 1000
number of threads: 1
maximum number of tries: 1
duration: 60 s
number of transactions actually processed: 47370
number of failed transactions: 0 (0.000%)
latency average = 1106.280 ms
initial connection time = 8788.955 ms
tps = 903.930420 (without initial connection time)
dmi@dmi-VirtualBox:~$
-i option - initialize
Example
We do the following
$ /usr/pgsql-10/bin/pgbench -i -s 5 testdb_1
...

$ psql testdb_1

testdb_1=# \dt+
                    List of relations
 Schema |       Name       | Type  |  Owner   |  Size
--------+------------------+-------+----------+---------
 public | pgbench_accounts | table | postgres | 64 MB
 public | pgbench_branches | table | postgres | 40 kB
 public | pgbench_history  | table | postgres | 0 bytes
 public | pgbench_tellers  | table | postgres | 40 kB
(4 rows)
-s option - scale
Specifies the scale factor: how many multiples of the default data set (100,000 rows in pgbench_accounts) to generate

Example
We do the following
$ pgbench -i -s 50 my_benchmark_test_db -h 127.0.0.1 -p 5444 -U postgres
Password:
dropping old tables...
NOTICE:  table "pgbench_accounts" does not exist, skipping
NOTICE:  table "pgbench_branches" does not exist, skipping
NOTICE:  table "pgbench_history" does not exist, skipping
NOTICE:  table "pgbench_tellers" does not exist, skipping
creating tables...
generating data (client-side)...
5000000 of 5000000 tuples (100%) done (elapsed 10.19 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 30.29 s (drop tables 0.05 s, create tables 0.04 s, client-side generate 10.64 s, vacuum 4.75 s, primary keys 14.81 s).
$
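To confirm the generated volume, count the rows (50 x 100,000 = 5,000,000):
SELECT count(*) FROM pgbench_accounts;
-- 5000000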
-t option - the_number_of_transactions_to_execute
The syntax is as follows
pgbench -c <the_number_of_clients_to_connect_with> -j <the_number_of_workers_processes> 
  -t <the_number_of_transactions_to_execute> <sample_db_name>
-T option - duration of the test

Example
We do the following
pgbench -c 10 -j 2 -t 1000 my_benchmark_test_db -h 127.0.0.1 -p 5444 -U postgres
Example
We do the following
pgbench -c 50 -j 2 -T 180 benchmark_delay
The explanation is as follows
In this example, -c sets the number of client connections, -T defines the duration of the test in seconds, and -U specifies the user.

Wednesday, December 16, 2020

PostGIS ST_DISTANCE

Introduction
The signature is as follows
ST_Distance(geometry g1, geometry g2);
Example
We find the distance between two points as follows.
SELECT ST_Distance(ST_GeomFromText('POINT(27.185425 88.124582)',4326),
 ST_GeomFromText('POINT(27.1854258 88.124500)', 4326));
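Note that ST_Distance on geometry values in SRID 4326 returns the result in degrees; casting both arguments to geography returns meters instead:
SELECT ST_Distance(
  ST_GeomFromText('POINT(27.185425 88.124582)', 4326)::geography,
  ST_GeomFromText('POINT(27.1854258 88.124500)', 4326)::geography);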
Example
Let's add a new column to the table and put an index on it
ALTER TABLE clients_details_locations ADD COLUMN geom geometry(Point, 4326);

UPDATE clients_details_locations 
   SET geom = ST_SetSRID(ST_MakePoint(longitude , latitude), 4326);

CREATE INDEX clients_details_locations_geom_idx  ON clients_details_locations 
  USING GIST (geom);
To find the points nearest to a given point we do the following
SELECT ... order by st_distance(geom,client_point)
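A fuller sketch of that query (the client coordinates here are hypothetical); the <-> distance operator lets PostgreSQL use the GIST index created above:
-- Hypothetical client location; <-> orders rows by distance to it
SELECT *
FROM clients_details_locations
ORDER BY geom <-> ST_SetSRID(ST_MakePoint(28.97, 41.01), 4326)
LIMIT 10;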


PostGIS Extension

Introduction
The explanation is as follows
PostGIS is the spatial database extension for PostgreSQL. It has over 300 different built-ins and functions to make it easier to work with spatial data.
PostgreSQL supports many more things besides. The explanation is as follows
- OLTP (Online Transaction Processing)
We can use PostgreSQL for CRUD (Create-Read-Update-Delete) operations.
- OLAP (Online Analytical Processing)
We can use PostgreSQL for analytical processing. PostgreSQL is based on 𝐇𝐓𝐀𝐏 (Hybrid transactional/analytical processing) architecture, so it can handle both OLTP and OLAP well.
- FDW (Foreign Data Wrapper)
A FDW is an extension available in PostgreSQL that allows us to access a table or schema in one database from another.
- Streaming
PipelineDB is a PostgreSQL extension for high-performance time-series aggregation, designed to power real-time reporting and analytics applications.
- Geospatial
PostGIS is a spatial database extender for PostgreSQL object-relational database. It adds support for geographic objects, allowing location queries to be run in SQL.
- Time Series
Timescale extends PostgreSQL for time series and analytics. For example, developers can combine relentless streams of financial and tick data with other business data to build new apps and uncover unique insights.
- Distributed Tables
CitusData scales Postgres by distributing data & queries. 

What Is Spatial Data?
The explanation is as follows; in short, it is data that carries location information.
Spatial Data, often referred to as geospatial data, is any data that contains information about a specific location. In layman's terms, spatial data is data about location.
What Are Spatial Data Types?
The explanation is as follows
The two primary spatial data types are Geometric and Geographic data.

Geographic data is data that can be mapped to a sphere (the sphere in question is usually planet earth). Geographic data typically refers to longitude and latitude related to the location of an object on earth. GPS data is a good example of geographic data.

Geometric data is data that can be mapped to a two-dimensional flat surface. A good example of geometric data would be the floor plan of a building.
The explanation is as follows
PostgreSQL natively supports NoSQL as well as a rich set of data types, including Numeric Types, Boolean Type, Network Address, Bit String Types, Arrays, Composite Types, Object Identifier Types, Pseudo-Types, and even Geometric Types like Points, Line Segments, Boxes, Paths, Polygons, and Circles. It also supports JSON, hstore, and XML, and users can even add new types using the CREATE TYPE command. Postgres also supports a lot of SQL syntaxes, such as common table expressions, Windows functions, and table inheritance.
Checking the PostGIS Installation
We run one of these two commands
SELECT PostGIS_version();
SELECT PostGIS_full_version();
If it is not installed and messages are localized in Turkish, the output is
HATA: postgis_full_version() fonksiyonu mevcut değildir
In English it is
ERROR:  function postgis_full_version() does not exist
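If it is missing, the extension can be enabled per database (the PostGIS packages must already be installed on the server):
CREATE EXTENSION IF NOT EXISTS postgis;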
Other
Some other related posts:
- ST_MakePoint
- ST_MakeValid

Monday, December 7, 2020

LIKE

Introduction
If we want an exact equality check, we use "="

Example
We do the following
WHERE description = 'FPS'
LIKE
Example
We do the following
WHERE description LIKE '%FPS'
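A couple of common variants (ILIKE is PostgreSQL's case-insensitive form; % matches any sequence of characters, _ matches a single character):
WHERE description LIKE '%FPS%'   -- contains FPS anywhere
WHERE description ILIKE '%fps%'  -- same, but case-insensitive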

Thursday, December 3, 2020

CROSS JOIN

Introduction
For each row in the left table, it pairs every row in the right table (a Cartesian product)

Example - interval
Suppose we have the following two tables
create table employee (
    id int,
    name char(20),
    division_id int
);

create table attendance (
    id int,
    employee_id int,
    activity_type int,
    created_at timestamp
);
If we want to see attendance values day by day, we do the following
SELECT days::date AS created_date, e.*
FROM (SELECT MIN(created_at), MAX(created_at) FROM attendance) AS r(startdate, enddate),
     generate_series(startdate::timestamp, enddate::timestamp, interval '1 day') g(days)
     CROSS JOIN employee e
The output is as follows. Here, for each created_date, all employees (5 in total) are listed one by one
created_date id name division_id
2020-11-18 1 John    1
2020-11-18 2 Amber   2
2020-11-18 3 Mike    1
2020-11-18 4 Jimmy   1
2020-11-18 5 Kathy   2
2020-11-19 1 John    1
2020-11-19 2 Amber   2
2020-11-19 3 Mike    1
2020-11-19 4 Jimmy   1
2020-11-19 5 Kathy   2

The TIMESTAMP Type

Introduction
A precision (0 to 6) can be given when casting. The explanation is as follows.
Postgres allows you to specify precision(0 to 6) while casting to TIMESTAMP
Example
select (now() at time zone 'utc') normally gives the following output
2020-12-03 09:39:28.992948
We do the following
select (now() at time zone 'utc') :: timestamp(3)
As output we get the following value, ending in 972, with three fractional digits.
2019-01-29 08:54:28.972
Example - String Input
We do the following.
select timestamp '2012-08-31 01:00:00';
The output is as follows
2012-08-31 01:00:00
Subtraction
The explanation is as follows.
Subtracting timestamps produces an INTERVAL data type. INTERVALs are a special data type for representing the difference between two TIMESTAMP types. When subtracting timestamps, Postgres will typically give an interval in terms of days, hours, minutes, seconds, without venturing into months. This generally makes life easier, since months are of variable lengths.
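A minimal sketch showing the resulting INTERVAL:
-- Subtracting two timestamps yields an INTERVAL
SELECT timestamp '2012-08-31 01:00:00'
     - timestamp '2012-08-29 22:30:00' AS diff;
-- diff = 1 day 02:30:00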