Thursday, December 24, 2020

ALTER TABLE

Introduction
The syntax is as follows (note that PostgreSQL uses ALTER COLUMN ... TYPE rather than MODIFY)
ALTER TABLE table_name
ADD COLUMN column_name data_type [constraint],
ALTER COLUMN column_name TYPE data_type,
DROP COLUMN column_name,
ADD CONSTRAINT constraint_name constraint_definition,
DROP CONSTRAINT constraint_name;
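A concrete sketch (the person table and its columns here are hypothetical):
-- Hypothetical table and column names, for illustration only
ALTER TABLE person ADD COLUMN email varchar(255);
ALTER TABLE person ADD CONSTRAINT person_email_uk UNIQUE (email);
ALTER TABLE person ALTER COLUMN email TYPE text;
ALTER TABLE person DROP CONSTRAINT person_email_uk;
ALTER TABLE person DROP COLUMN email;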

DROP COLUMN
Example
Normally we do the following
ALTER TABLE foo DROP COLUMN bar;
However, this sometimes fails with a "vsnprintf failed: Invalid argument" error; switching the message locale works around it. In that case we do the following
SET lc_messages = 'C';
ALTER TABLE foo DROP COLUMN bar;
REPLICA IDENTITY
Example
We do the following
ALTER TABLE ingredients REPLICA IDENTITY FULL;
The explanation is as follows
The ALTER TABLE command with the REPLICA IDENTITY clause is used to set the replication identity for a table. When using Debezium's PostgreSQL connector, the connector requires a unique primary key or a unique identifier to keep track of changes. If a table does not have a primary key or a unique identifier, you will get an error, saying that the table does not have a replica identity, which is used for tracking changes. By running ALTER TABLE ... REPLICA IDENTITY FULL, you're setting the table's replication identity to "full", which means that the entire row is used as the identifier for change tracking purposes.
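To inspect a table's current setting, the relreplident column of pg_class can be queried:
-- d = default (primary key), n = nothing, f = full, i = index
SELECT relreplident FROM pg_class WHERE relname = 'ingredients';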

Monday, December 21, 2020

Docker Compose and PostgreSQL

Introduction
Some notes on using PostgreSQL with Docker Compose

Image Name
These can be, for example:
- postgres:11.1
- postgres:13.3
- postgres:15.1
- postgres:15rc2
- debezium/postgres
- debezium/postgres:13

The Simplest Setup
We do the following
version: '3'
services:

  authorization-db:
    image: postgres:11.1
    container_name: auth-db
    ports:
      - "5432:5432"
The command Field
These can be, for example:
- max_connections
- max_prepared_transactions

Example - max_connections
We do the following
services:
  database:
    image: postgres:latest
    command: postgres -c 'max_connections=250'
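Once the container is up, a quick sanity check from psql confirms the override:
SHOW max_connections;
-- expected: 250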
The environment Field
The environment variables can be, for example:
- POSTGRES_PASSWORD
- POSTGRES_USER
- POSTGRES_DB

Example
We do the following. Here two databases are run.
version: '3'
services:
  course-catalog-operational-db:
    image: postgres:13.3
    container_name: course-catalog-operational-db
    command:
      - "postgres"
      - "-c"
      - "wal_level=logical"
    environment:
      POSTGRES_PASSWORD: 123456
      POSTGRES_DB: course-catalog-db
    ports:
      - "5433:5432"
  instructors-legacy-db:
    image: postgres:13.3
    container_name: instructors-legacy-db
    command:
      - "postgres"
      - "-c"
      - "wal_level=logical"
    environment:
      POSTGRES_PASSWORD: 123456
      POSTGRES_DB: instructors-db
    ports:
      - "5434:5432"
    volumes:
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql
Example - Hasura GraphQL + PostgreSQL 15
We do the following
version: '3.6'
services:
  postgres:
    image: postgres:15rc2
    restart: always
    volumes:
    - db_data:/var/lib/postgresql/data
    environment:
      POSTGRES_PASSWORD: postgrespassword
    ports:
    - "5432:5432"
  graphql-engine:
    image: hasura/graphql-engine:v2.13.0
    ports:
    - "8080:8080"
    depends_on:
    - "postgres"
    restart: always
    environment:
      ## postgres database to store Hasura metadata
      HASURA_GRAPHQL_METADATA_DATABASE_URL: postgres://postgres:postgrespassword@postgres:5432/postgres
      ## this env var can be used to add the above postgres database to Hasura as a data source. this can be removed/updated based on your needs
      PG_DATABASE_URL: postgres://postgres:postgrespassword@postgres:5432/postgres
      ## enable the console served by server
      HASURA_GRAPHQL_ENABLE_CONSOLE: "true" # set to "false" to disable console
      ## enable debugging mode. It is recommended to disable this in production
      HASURA_GRAPHQL_DEV_MODE: "true"
      HASURA_GRAPHQL_ENABLED_LOG_TYPES: startup, http-log, webhook-log, websocket-log, query-log
      ## uncomment next line to set an admin secret
      HASURA_GRAPHQL_ADMIN_SECRET: myadminsecretkey
volumes:
  db_data:
docker-entrypoint-initdb.d
Specifies the SQL files to run when the database is initialized for the first time
Example
We do the following
services:
  postgres:
    image: postgres
    ports:
      - "5432:5432"
    restart: always
    environment:
      POSTGRES_PASSWORD: password
      POSTGRES_DB: blogdb
      POSTGRES_USER: user
    volumes:
      - ./data/postgresql:/var/lib/postgresql
      - ./pg-initdb.d:/docker-entrypoint-initdb.d
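Scripts in the mounted directory run only when an empty data directory is initialized for the first time. A minimal sketch of what ./pg-initdb.d/init.sql might contain (the posts table is hypothetical):
-- Hypothetical schema created on first startup
CREATE TABLE IF NOT EXISTS posts (
  id serial PRIMARY KEY,
  title text NOT NULL,
  created_at timestamp DEFAULT now()
);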
healthcheck
Example
We do the following.
-q specifies quiet mode
-d specifies the database name
-U specifies the user name
version: '3'
services:
  postgres:
    image: postgres:13.1
    healthcheck:
      test: [ "CMD", "pg_isready", "-q", "-d", "postgres", "-U", "root" ]
      timeout: 45s
      interval: 10s
      retries: 10
    restart: always
    environment:
      - POSTGRES_USER=root
      - POSTGRES_PASSWORD=password
      - APP_DB_USER=docker
      - APP_DB_PASS=docker
      - APP_DB_NAME=docker
    volumes:
      - ./db:/docker-entrypoint-initdb.d/
    ports:
      - 5432:5432
Example
We do the following. The user name is given with -U
postgres:
  container_name: scheduling-airflow-postgres
  image: postgres:13
  environment:
    POSTGRES_USER: airflow
    POSTGRES_PASSWORD: airflow
    POSTGRES_DB: airflow
  deploy:
    resources:
      limits:
        cpus: "0.40"
        memory: 1200M
  volumes:
    - postgres-db-volume:/var/lib/postgresql/data
  healthcheck:
    test: ["CMD", "pg_isready", "-U", "airflow"]
    interval: 5s
    retries: 5
  restart: always
  profiles:
    - scheduling
  networks:
    - datastack  
The restart Field
Usually the value always is used. The explanation is as follows
restart always : is used to restart the container if there is an error when creating the container.

The volumes Field
The container's /var/lib/postgresql/data directory is bound to a volume
Example
We do the following
version: '3.8'

services:
  ...
  db:
    image: postgres:15.2
    restart: always
    environment:
      POSTGRES_USER: book-user
      POSTGRES_PASSWORD: k9ZqLC
      POSTGRES_DB: bookdb
    volumes:
      - db-data:/var/lib/postgresql/data
    ports:
      - 6432:5432
volumes:
  db-data:
    driver: local



Friday, December 18, 2020

The pgbench Command

Introduction
The pgbench command creates test tables in the database and populates them with data (100,000 rows in pgbench_accounts per unit of scale factor). We then run benchmarks against them.

Example
We do the following
$ pgbench -c 10 -j 2 -t 1000 my_benchmark_test_db -h 127.0.0.1 -p 5444 -U postgres
Password:
pgbench (15.1 (Ubuntu 15.1-1.pgdg22.04+1))
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 50
query mode: simple
number of clients: 10
number of threads: 2
maximum number of tries: 1
number of transactions per client: 1000
number of transactions actually processed: 10000/10000
number of failed transactions: 0 (0.000%)
latency average = 75.438 ms
initial connection time = 160.700 ms
tps = 132.559344 (without initial connection time)
$
Next, let's change the shared_buffers setting. Initially it is as follows
$ show shared_buffers;
 shared_buffers
----------------
 128MB
(1 row)
Let's do the following
sudo vi /etc/postgresql/15/main/postgresql.conf

...
#------------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#------------------------------------------------------------------------------
# - Memory -
shared_buffers = 1GB                    # min 128kB
                                        # (change requires restart)
Let's restart the database
sudo systemctl restart postgresql
Let's check the value
$ show shared_buffers;
 shared_buffers
----------------
 1GB
(1 row)
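The same value can also be read from pg_settings; note that shared_buffers is stored in 8 kB pages:
SELECT setting, unit FROM pg_settings WHERE name = 'shared_buffers';
-- setting = 131072, unit = 8kB  ->  131072 * 8 kB = 1 GB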
Let's run the test again
$ pgbench -c 10 -j 2 -t 1000 my_benchmark_test_db -h 127.0.0.1 -p 5444 -U postgres
Password:
pgbench (15.1 (Ubuntu 15.1-1.pgdg22.04+1))
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 50
query mode: simple
number of clients: 10
number of threads: 2
maximum number of tries: 1
number of transactions per client: 1000
number of transactions actually processed: 10000/10000
number of failed transactions: 0 (0.000%)
latency average = 47.632 ms
initial connection time = 148.379 ms
tps = 209.944478 (without initial connection time)
The average latency dropped from 75.438 ms to 47.632 ms, and transactions per second rose from about 132.6 to 209.9


-C option - Connection Overhead
The explanation is as follows
The -C option in the pgbench indicates that for every single transaction, pgbench will close the open connection and create a new one. This is useful for measuring the connection overhead.
Example
We do the following
$ pgbench -c 20 -t 100 -S my_benchmark_test_db -h 127.0.0.1 -p 6432 -U my_db_user -C -f mysql.sql
Password:
pgbench (15.1 (Ubuntu 15.1-1.pgdg22.04+1))
starting vacuum...end.
transaction type: multiple scripts
scaling factor: 50
query mode: simple
number of clients: 20
number of threads: 1
maximum number of tries: 1
number of transactions per client: 100
number of transactions actually processed: 2000/2000
number of failed transactions: 0 (0.000%)
latency average = 178.276 ms
average connection time = 8.867 ms
tps = 112.185757 (including reconnection times)
SQL script 1: 
 - weight: 1 (targets 50.0% of total)
 - 1022 transactions (51.1% of total, tps = 57.326922)
 - number of failed transactions: 0 (0.000%)
 - latency average = 85.993 ms
 - latency stddev = 50.377 ms
SQL script 2: mysql.sql
 - weight: 1 (targets 50.0% of total)
 - 978 transactions (48.9% of total, tps = 54.858835)
 - number of failed transactions: 0 (0.000%)
 - latency average = 84.039 ms
 - latency stddev = 51.036 ms
-c option - the_number_of_clients_to_connect_with
Specifies how many connections will be opened.
Example
We do the following. Requesting 1000 clients directly against the server on port 5432 fails:
$  pgbench -c 1000 -T 60 my_benchmark_test_db -h 127.0.0.1 -p 5432 -U my_db_user
Password:
pgbench (15.1 (Ubuntu 15.1-1.pgdg22.04+1))
starting vacuum...end.
pgbench: error: connection to server at "127.0.0.1", port 5432 failed: FATAL:  sorry, too many clients already
connection to server at "127.0.0.1", port 5432 failed: FATAL:  sorry, too many clients already
pgbench: error: could not create connection for client 44
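The failure above is bounded by the server's max_connections setting (the default is 100), which we can check with:
SHOW max_connections;
-- e.g. 100, far below the 1000 clients requested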
Example
We do the following. This time the connection goes through port 6432 and all 1000 clients succeed.
pgbench -c 1000 -T 60 my_benchmark_test_db -h 127.0.0.1 -p 6432 -U my_db_user
Password:
pgbench (15.1 (Ubuntu 15.1-1.pgdg22.04+1))
starting vacuum...end.
transaction type: 
scaling factor: 50
query mode: simple
number of clients: 1000
number of threads: 1
maximum number of tries: 1
duration: 60 s
number of transactions actually processed: 47370
number of failed transactions: 0 (0.000%)
latency average = 1106.280 ms
initial connection time = 8788.955 ms
tps = 903.930420 (without initial connection time)
dmi@dmi-VirtualBox:~$
-i option - initialize
Example
We do the following
$ /usr/pgsql-10/bin/pgbench -i -s 5 testdb_1
...

$ psql testdb_1

testdb_1=# \dt+
                    List of relations
 Schema |       Name       | Type  |  Owner   |  Size
--------+------------------+-------+----------+---------
 public | pgbench_accounts | table | postgres | 64 MB
 public | pgbench_branches | table | postgres | 40 kB
 public | pgbench_history  | table | postgres | 0 bytes
 public | pgbench_tellers  | table | postgres | 40 kB
(4 rows)
-s option - scale
Specifies the scale factor: how many multiples of the default data set (100,000 rows in pgbench_accounts) to generate

Example
We do the following
$ pgbench -i -s 50 my_benchmark_test_db -h 127.0.0.1 -p 5444 -U postgres
Password:
dropping old tables...
NOTICE:  table "pgbench_accounts" does not exist, skipping
NOTICE:  table "pgbench_branches" does not exist, skipping
NOTICE:  table "pgbench_history" does not exist, skipping
NOTICE:  table "pgbench_tellers" does not exist, skipping
creating tables...
generating data (client-side)...
5000000 of 5000000 tuples (100%) done (elapsed 10.19 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 30.29 s (drop tables 0.05 s, create tables 0.04 s, client-side generate 10.64 s, vacuum 4.75 s, primary keys 14.81 s).
$
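To confirm the generated volume, count the rows (50 x 100,000 = 5,000,000):
SELECT count(*) FROM pgbench_accounts;
-- 5000000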
-t option - the_number_of_transactions_to_execute
The syntax is as follows
pgbench -c <the_number_of_clients_to_connect_with> -j <the_number_of_workers_processes> 
  -t <the_number_of_transactions_to_execute> <sample_db_name>
-T option - duration of the test

Example
We do the following
pgbench -c 10 -j 2 -t 1000 my_benchmark_test_db -h 127.0.0.1 -p 5444 -U postgres
Example
We do the following
pgbench -c 50 -j 2 -T 180 benchmark_delay
The explanation is as follows
In this example, -c sets the number of client connections, -T defines the duration of the test in seconds, and -U specifies the user.

Wednesday, December 16, 2020

PostGIS ST_DISTANCE

Introduction
The signature is as follows
ST_Distance(geometry g1, geometry g2);
Example
We find the distance between two points as follows.
SELECT ST_Distance(ST_GeomFromText('POINT(27.185425 88.124582)',4326),
 ST_GeomFromText('POINT(27.1854258 88.124500)', 4326));
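Note that ST_Distance on geometry values in SRID 4326 returns the result in degrees; casting both arguments to geography returns meters instead:
SELECT ST_Distance(
  ST_GeomFromText('POINT(27.185425 88.124582)', 4326)::geography,
  ST_GeomFromText('POINT(27.1854258 88.124500)', 4326)::geography);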
Example
Let's add a new column to the table and put an index on it
ALTER TABLE clients_details_locations ADD COLUMN geom geometry(Point, 4326);

UPDATE clients_details_locations 
   SET geom = ST_SetSRID(ST_MakePoint(longitude , latitude), 4326);

CREATE INDEX clients_details_locations_geom_idx  ON clients_details_locations 
  USING GIST (geom);
To find the points nearest to a given point we do the following
SELECT ... order by st_distance(geom,client_point)
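A fuller sketch of that query (the client coordinates here are hypothetical); the <-> distance operator lets PostgreSQL use the GIST index created above:
-- Hypothetical client location; <-> orders rows by distance to it
SELECT *
FROM clients_details_locations
ORDER BY geom <-> ST_SetSRID(ST_MakePoint(28.97, 41.01), 4326)
LIMIT 10;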


PostGIS Extension

Introduction
The explanation is as follows
PostGIS is the spatial database extension for PostgreSQL. It has over 300 different built-ins and functions to make it easier to work with spatial data.
PostgreSQL supports many more things besides. The explanation is as follows
- OLTP (Online Transaction Processing)
We can use PostgreSQL for CRUD (Create-Read-Update-Delete) operations.
- OLAP (Online Analytical Processing)
We can use PostgreSQL for analytical processing. PostgreSQL is based on 𝐇𝐓𝐀𝐏 (Hybrid transactional/analytical processing) architecture, so it can handle both OLTP and OLAP well.
- FDW (Foreign Data Wrapper)
A FDW is an extension available in PostgreSQL that allows us to access a table or schema in one database from another.
- Streaming
PipelineDB is a PostgreSQL extension for high-performance time-series aggregation, designed to power real-time reporting and analytics applications.
- Geospatial
PostGIS is a spatial database extender for PostgreSQL object-relational database. It adds support for geographic objects, allowing location queries to be run in SQL.
- Time Series
Timescale extends PostgreSQL for time series and analytics. For example, developers can combine relentless streams of financial and tick data with other business data to build new apps and uncover unique insights.
- Distributed Tables
CitusData scales Postgres by distributing data & queries. 

What Is Spatial Data?
The explanation is as follows; in short, it is data that carries location information.
Spatial Data, often referred to as geospatial data, is any data that contains information about a specific location. In layman's terms, spatial data is data about location.
What Are Spatial Data Types?
The explanation is as follows
The two primary spatial data types are Geometric and Geographic data.

Geographic data is data that can be mapped to a sphere (the sphere in question is usually planet earth). Geographic data typically refers to longitude and latitude related to the location of an object on earth. GPS data is a good example of geographic data.

Geometric data is data that can be mapped to a two-dimensional flat surface. A good example of geometric data would be the floor plan of a building.
The explanation is as follows
PostgreSQL natively supports NoSQL as well as a rich set of data types, including Numeric Types, Boolean Type, Network Address, Bit String Types, Arrays, Composite Types, Object Identifier Types, Pseudo-Types, and even Geometric Types like Points, Line Segments, Boxes, Paths, Polygons, and Circles. It also supports JSON, hstore, and XML, and users can even add new types using the CREATE TYPE command. Postgres also supports a lot of SQL syntaxes, such as common table expressions, Windows functions, and table inheritance.
Checking the PostGIS Installation
We run one of these two commands
SELECT PostGIS_version();
SELECT PostGIS_full_version();
If it is not installed and messages are localized in Turkish, the output is
HATA: postgis_full_version() fonksiyonu mevcut değildir
In English it is
ERROR:  function postgis_full_version() does not exist
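If it is missing, the extension can be enabled per database (the PostGIS packages must already be installed on the server):
CREATE EXTENSION IF NOT EXISTS postgis;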
Other
Some other related posts:
- ST_MakePoint
- ST_MakeValid

Monday, December 7, 2020

LIKE

Introduction
If we want an exact equality check, we use "="

Example
We do the following
WHERE description = 'FPS'
LIKE
Example
We do the following
WHERE description LIKE '%FPS'
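A couple of common variants (ILIKE is PostgreSQL's case-insensitive form; % matches any sequence of characters, _ matches a single character):
WHERE description LIKE '%FPS%'   -- contains FPS anywhere
WHERE description ILIKE '%fps%'  -- same, but case-insensitive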

Thursday, December 3, 2020

CROSS JOIN

Introduction
For each row in the left table, it pairs every row in the right table (a Cartesian product)

Example - interval
Suppose we have the following two tables
create table employee (
    id int,
    name char(20),
    division_id int
);

create table attendance (
    id int,
    employee_id int,
    activity_type int,
    created_at timestamp
);
If we want to see attendance values day by day, we do the following
SELECT days::date AS created_date, e.*
FROM (SELECT MIN(created_at), MAX(created_at) FROM attendance) AS r(startdate, enddate),
     generate_series(startdate::timestamp, enddate::timestamp, interval '1 day') g(days)
     CROSS JOIN employee e
The output is as follows. Here, for each created_date, all employees (5 in total) are listed one by one
created_date id name division_id
2020-11-18 1 John    1
2020-11-18 2 Amber   2
2020-11-18 3 Mike    1
2020-11-18 4 Jimmy   1
2020-11-18 5 Kathy   2
2020-11-19 1 John    1
2020-11-19 2 Amber   2
2020-11-19 3 Mike    1
2020-11-19 4 Jimmy   1
2020-11-19 5 Kathy   2

The TIMESTAMP Type

Introduction
A precision (0 to 6) can be given when casting. The explanation is as follows.
Postgres allows you to specify precision(0 to 6) while casting to TIMESTAMP
Example
select (now() at time zone 'utc') normally gives the following output
2020-12-03 09:39:28.992948
We do the following
select (now() at time zone 'utc') :: timestamp(3)
As output we get the following value, ending in 972, with three fractional digits.
2019-01-29 08:54:28.972
Example - String Input
We do the following.
select timestamp '2012-08-31 01:00:00';
The output is as follows
2012-08-31 01:00:00
Subtraction
The explanation is as follows.
Subtracting timestamps produces an INTERVAL data type. INTERVALs are a special data type for representing the difference between two TIMESTAMP types. When subtracting timestamps, Postgres will typically give an interval in terms of days, hours, minutes, seconds, without venturing into months. This generally makes life easier, since months are of variable lengths.
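A minimal sketch showing the resulting INTERVAL:
-- Subtracting two timestamps yields an INTERVAL
SELECT timestamp '2012-08-31 01:00:00'
     - timestamp '2012-08-29 22:30:00' AS diff;
-- diff = 1 day 02:30:00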