PostgreSQL Çorbası: Temmuz 2019

25 Temmuz 2019 Perşembe

Date/Time Sütun Tipleri - DATE - Sadece Tarihi Saklar

Giriş
Java'daki LocalDate sınıfına denk gelir. Sadece tarih bilgisini saklar. Çıktı olarak şunu görürüz

2012-02-11

Cast To Date

Açıklaması şöyle

A type cast specifies a conversion from one data type to another.

PostgreSQL accepts two equivalent syntaxes for type casts, the PostgreSQL-specific value::type and the SQL-standard CAST(value AS type).

In this specific case, '{apple,cherry apple, avocado}'::text[]; takes the string literal {apple,cherry apple, avocado} and tells PostgreSQL to interpret it as an array of text.

Şöyle yaparız

SELECT '{apple,cherry apple, avocado}'::text[];

Örnek

timestamp tipindeki sütunun sadece date kısmını almak için şöyle yaparız

select a.created_at::date from attendance a;

Java 7

java.util.Date kullanılır

Java 8
Java'daki LocalDateTime sınıfına denk gelir. Açıklaması şöyle.

Örnek
Şöyle yaparız.

@Column(name = "local_date", columnDefinition = "DATE")
private LocalDate localDate;

Örnek - Date vs Char

Date tipi ile ORDER BY kullanılabilir. Char ile kullanmak doğru olmaz. Farkı görmek için şöyle yaparız

SELECT char_date, date_date FROM table1 ORDER BY char_date ; 

+-------------+-------------+
| char_date   | date_date   | 
+-------------+-------------+
| 01-Mar-2021 | 01-Mar-2021 |
| 12-Feb-2021 | 12-Feb-2021 |
| 23-Jan-2021 | 23-Jan-2021 | 
+-------------+-------------+

SELECT char_date, date_date FROM table1 ORDER BY date_date ; 

+-------------+-------------+
| char_date   | date_date   | 
+-------------+-------------+
| 23-Jan-2021 | 23-Jan-2021 | 
| 12-Feb-2021 | 12-Feb-2021 |
| 01-Mar-2021 | 01-Mar-2021 |
+-------------+-------------+

Date/Time Sütun Tipleri - TIMESTAMP WITHOUT TIME ZONE - Kullanmamak Lazım Çünkü Moment Değildir

Giriş
Kullanmayın diyorum çünkü Java'daki LocalDateTime gibidir.

Not : DATE sütun tipine de bakabilirsiniz.

Not : Verilen bir integer değerini time zone ile formatlamak için TO_TIMESTAMP metodu yazısına bakılabilir.

Açıklaması şöyle. Yani WITHOUT TIME ZONE kısmı yazılmayabilir.

The SQL standard requires that writing just timestamp be equivalent to timestamp without time zone, and PostgreSQL honors that behavior.

Bu alan zaman çizgisinden bağımsız bir tarih ve saati saklamak için kullanılır. Açıklaması şöyle.

This type does not represent a moment. It represents a vague idea of potential moments along a range of about 26-27 hours (the range of time zones around the globe).

Verilen bir örnek şöyle.

The WITHOUT only makes sense when you have the vague idea of a date-time rather than a fixed point on the timeline. For example, "Christmas this year starts at 2016-12-25T00:00:00" would be stored in the WITHOUT as it applies to any time zone, not yet having been applied to any one single time zone to get an actual moment on the timeline.

Verilen bir başka örnek şöyle.

All our factories around the world take lunch at 12:30 PM". That would be LocalTime, and for a particular date, LocalDateTime. But that has no real meaning, not an actual point on the timeline, until you apply a time zone to get a ZonedDateTime. That lunch break will be at different points on the timeline in the Delhi factory than the Düsseldorf factory and different again at the Detroit factory.

Örnek
Şöyle yaparız.

CREATE TABLE reference_values
(
  created_on timestamp without time zone,
  ...
)

Örnek

Şöyle yaparız

create table attendance (
    id int,
    employee_id int,
    activity_type int,
    created_at timestamp
);

INSERT INTO attendance VALUES
(1,	1,	1,'2020-11-18 7:10:25'),
(2,	2,	1,'2020-11-18 7:30:25'),
(3,	3,	1,'2020-11-18 7:50:25'),
(4,	2,	2,'2020-11-18 19:10:25'),
(5,	3,	2,'2020-11-18 19:22:38'),
(6,	1,	2,'2020-11-18 20:01:05'),
(7,	1,	1,'2020-11-19 7:11:23');

Java 8 Kullanımı

java.time.LocalDateTime ile bu sütun tipi kullanılabilir. Açıklaması şöyle.

The equivalent of Postgres’ TIMESTAMP WITHOUT TIME ZONE in java.time is java.time.LocalDateTime.

Örnek
Şöyle yaparız.

@Column(name = "local_date_time", columnDefinition = "TIMESTAMP")
private LocalDateTime localDateTime;

Örnek

Elimizde şöyle bir Java kodu olsun

@Entity
@Table(name = "date_tb")
public class Data implements Serializable {
  @GeneratedValue(strategy = GenerationType.IDENTITY)
  private Integer id;

  @Column(name = "date_str")
  private String dateStr;

  @Column(name = "date")
  private Date date;

  @Column(name = "local_time")
  private LocalTime localTime;

  @Column(name = "local_Date")
  private LocalDate localDate;

  @JsonFormat(shape = JsonFormat.Shape.STRING, pattern = "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'")
  @Column(name = "local_datetime_dt")
  private LocalDateTime localDateTimeDt;

  @Column(name = "local_datetime_ts")
  private LocalDateTime localDateTimeTs;

  @Column(name = "offset_datetime")
  private OffsetDateTime offsetDateTime;

  @Column(name = "zoned_datetime")
  private ZonedDateTime zonedDateTime;
}

Şu SQL üretilir. LocalDateTime için TIMESTAMP WITHOUT TIME ZONE üretilmesi normal, ancak OffsetDateTime ve ZonedDateTime için niye TIMESTAMP WITH TIME ZONE üretilmiyor bilmiyorum

#Mapping to this PostgreSQL 
CREATE TABLE IF NOT EXISTS date_tb (
  "id" SERIAL,
  "date_str" VARCHAR(500) NULL,
  "date" TIMESTAMP NULL,
  "local_time" TIMESTAMP NULL,
  "local_date"  DATE NULL,
  "local_datetime_dt" TIMESTAMP NULL,
  "local_datetime_ts" TIMESTAMP NULL,
  "offset_datetime" TIMESTAMP NULL,
  "zoned_datetime" TIMESTAMP NULL,

   PRIMARY KEY ("id")
)

Java 7 Kullanımı
Hibernate/JPA veritabanını oluştururken java.sql.Timestamp tipindeki alanı Postgre'de TIMESTAMP tipine çevrilir. Ancak bence bu doğru değil. Açıklaması şöyle.

For example: Your local time zone is GMT+2. You store "2012-12-25 10:00:00 UTC". The actual value stored in the database is "2012-12-25 12:00:00".

24 Temmuz 2019 Çarşamba

GENERATE_SERIES - Belirtilen iki aralık arasında değer dizisi üretir

Giriş

Belirtilen iki aralık arasında değer dizisi üretir. Tabloyu doldurmak/populate için kullanılabilir

Örnek
Tabloyu doldurmak için şöyle yaparız.
-- Sample table with 1 million rows CREATE TABLE large_table ( id INT, name VARCHAR(50), address VARCHAR(100) ); -- Insert 1 million rows INSERT INTO large_table SELECT id, concat('Name ', id), concat('Address ', id) FROM generate_series(1, 1000000) AS id;
Örnek - interval

Elimizde şöyle iki tablo olsun

create table employee (
    id int,
    name char(20),
    division_id int
);

create table attendance (
    id int,
    employee_id int,
    activity_type int,
    created_at timestamp
);

attendance değerlerini günlük olarak görmek istersek şöyle yaparız. FROM içindeki SELECT cümlesinde attendance tablosundaki en küçük ve en büyük created_at değerleri alınıyor. Bu değerler generate_series()'e veriliyor. Bu çağrı sonucunda g g(days) ile days isimli bir tablo oluşturuluyor. Bu tablonun date isimli bir sütunu var. Dışarıdaki SELECT içinde bu sütuna erişiliyor.

SELECT 
  days::date AS created_date,
  e.*
   
FROM (SELECT MIN(created_at), MAX(created_at) FROM attendance) AS r(startdate,enddate),
 generate_series(startdate::timestamp,enddate::timestamp,interval '1 day') g(days)
CROSS JOIN  employee e

Çıktı olarak şunu alırız

created_date	id	name	division_id
2020-11-18	1	John    1
2020-11-18	2	Amber   2
2020-11-18	3	Mike    1
2020-11-18	4	Jimmy   1
2020-11-18	5	Kathy   2
2020-11-19	1	John    1
2020-11-19	2	Amber   2
2020-11-19	3	Mike    1
2020-11-19	4	Jimmy   1
2020-11-19	5	Kathy   2

Örnek

Şöyle yaparız. Bu sefer belirtilen iki tarih arasında 1 günlük aralıklar oluşturuluyor

SELECT 
     d::date AS created_date,  
     e.id,  
     e.company_id,
     e.division_id
     
FROM(SELECT MIN(date '2020-11-23'), MAX(date '2020-11-24') FROM employee_table)
AS r(startdate,enddate), 
  generate_series(startdate::timestamp,enddate::timestamp,interval '1 day') g(d)
CROSS JOIN  employee_table e

JSON_AGG

Giriş
Select ile gelen veriyi json dizini haline getirir.

Örnek
Şöyle yaparız.

SELECT json_agg(your_texts) FROM your_table

Çıktı olarak şunu alırız.

["687ccca","da075bd","4194d"]

String Functions - String Birleştirme

Giriş
Or (||) ile yapılır.

Örnek
Şöyle yaparız.

translate metodu - string + from + to
Açıklaması şöyle.

Any character in string that matches a character in the from set is replaced by the corresponding character in the to set. If from is longer than to, occurrences of the extra characters in from are removed.

Örnek
Y yerine Z, Z yerine Y koymak için şöyle yaparız.

select translate(SN, 'YZ', 'ZY') from my_table;

9 Temmuz 2019 Salı

pg_trgm Module - Pattern Matching Full Text Search İçindir

Giriş
Pattern matching veya full text search başlığı altına girer. Açıklaması şöyle. trgm kısaltması trigrams anlamına gelir.

The pg_trgm module provides functions and operators for determining the similarity of alphanumeric text based on trigram matching

trigram için açıklaması şöyle. Yani 3 karakter anlamına gelir.

Postgres provides a module called pg_trgm which ships different functions to work with trigrams matching. A trigram is a group of three consecutive characters taken from a string. Based on this we can measure the similarity of two strings by counting the number of trigrams they share and estimating how similar they are on a scale between 0 and 1. For example, the string “hello” would be represented by the following set of trigrams:

“ h”, “ he”, “hel”, “ell”, “llo”, “lo “

In order to make use of this module we need to enable it using the following statement:

CREATE EXTENSION pg_trgm;

pg_trm module için açıklama şöyle. Yani diğer araçlar kadar gelişmiş olmayabilir

PostgreSQL has a lot of advanced extensions and indexes to speed up the same. But one should keep in mind that these have their own limitations in terms of speed and might not work with languages other than English/multi-byte encoding, such as UTF-8. However, these fill the critical gap where the functionalities mentioned are necessary, but we cannot go for a full-fledged search solution such as Solr/Elasticsearch.

Kurulum

Şöyle yaparız

CREATE EXTENSION pg_trgm;

tsvector Nedir?

tsvector Veri Yapısı Nedir? yazısına bakabilirsiniz

to_tsvector Metodu - Text to tsvector Type

to_tsvector metodu yazısına bakabilirsiniz

tsquery Nedir? - Searching Against the Vector

Örnek

Şöyle yaparız ve çıktı olarak şunu alırız

SELECT to_tsvector('the cat got scared by a cucumber') 
@@to_tsquery('scares')

true

Açıklaması şöyle

By now, we know that the tsvector is not a standard text column, and there is a lot of pre-processing that happens for easier search. We need to use a particular function called tsquery to search against the tsvector column.

The output as we see is simply true or false. In this case, note that we searched for the word scares, but still, the result was true. That was because of the normalization we saw earlier. The @@ operator is used for direct matches. We have other operators that perform more sophisticated matches.

Örnek

Şöyle yaparız

SELECT episode_narrative FROM storm_events WHERE
  episode_narrative_tsv @@ to_tsquery('logan & international & airport')

GIST Index Yaratma - Generalized Search Tree

Örnek

Şöyle yaparız

CREATE TABLE IF NOT EXISTS PEOPLE (
  ID VARCHAR(36) PRIMARY KEY,
  NICKNAME VARCHAR(32) UNIQUE NOT NULL,
  NAME VARCHAR(100) NOT NULL,
  BIRTHDATE VARCHAR(10) NOT NULL,
  STACK TEXT[],
  SEARCH_TRGM TEXT
);

CREATE EXTENSION IF NOT EXISTS PG_TRGM;
CREATE INDEX CONCURRENTLY IF NOT EXISTS IDX_PEAPLE_SEARCH_TGRM 
ON PEOPLE USING GIST (SEARCH_TRGM GIST_TRGM_OPS);

GIN Index Yaratma - Generalized Inverted Index

gin biraz daha hızlı bir Full Text Search index galiba.

Örnek

Şöyle yaparız

CREATE INDEX trgm_idx on storm_events USING gin(event_type GIN_TGRM_OPS);

Örnek - tsvector alanına göre index

Şöyle yaparız

ALTER TABLE storm_events ADD COLUMN episode_narrative_tsv tsvector;
    
UPDATE storm_events SET episode_narrative_tsv = to_tsvector(episode_narrative);

CREATE INDEX tsv_idx ON storm_events USING gin(episode_narrative_tsv);

operator <->

Örnek
Şöyle yaparız.

SELECT t, 'super cool' <-> t AS dist
  FROM test_trgm
  ORDER BY dist DESC LIMIT 10;

similarity metodu
Açıklaması şöyle

The SIMILARITY function returns what is called a score. We can adjust this score according to the matches we need. The default score is 0.3 And if we are not worried about the score, then we can use the below shorthand.

SELECT * FROM storm_events WHERE event_type % 'Drot'

Örnek

Tablomuzda şöyle bir veri olsun

- Hello
? Hello
| Hello
$ Hello
! Hello
!? Hello

Şöyle yaparız.

SELECT my_column, similarity('$ Hello', my_column) AS sml
FROM my_table
WHERE my_column % '$ Hello'
ORDER BY sml DESC, my_column;

Çıktı olarak 1 alırız.

show_trgm metodu
Örnek
Şöyle yaparız.

SELECT show_trgm('$ Hello');

            show_trgm            
---------------------------------
 {"  h"," he",ell,hel,llo,"lo "}
(1 row)

SELECT show_trgm('- Hello');

            show_trgm            
---------------------------------
 {"  h"," he",ell,hel,llo,"lo "}
(1 row)

word_similarity metodu
Örnek ver

strict_word_similarity metodu
Örnek ver

show_limit metodu
Örnek ver

show_limit metodu
Örnek ver

STRING_TO_ARRAY metodu

Elimizde şöyle bir tablo olsun

CREATE SEQUENCE todo_id_seq
  START WITH 1
  INCREMENT BY 10
  NO MINVALUE
  NO MAXVALUE CACHE 1;

CREATE TABLE t_todo
(
  id         BIGINT                 NOT NULL DEFAULT nextval('todo_id_seq'),
  ext_ref    CHARACTER VARYING(255) NOT NULL,
  title      CHARACTER VARYING(255) NOT NULL,
  active     BOOLEAN                NOT NULL DEFAULT TRUE,
  created_at TIMESTAMP              NOT NULL DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP              NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (id),
  UNIQUE (ext_ref),
  UNIQUE (title)
);

Şöyle yaparız

@Repository
public interface ToDoRepository extends JpaRepository<ToDo, Long> {

  @Query(value = "SELECT * FROM t_todo todo WHERE ?1 %
                  ANY(STRING_TO_ARRAY(todo.title,' '))", 
         nativeQuery = true)
 List<ToDo> findAllMatching(String partialTitle);
}

Açıklaması şöyle

One more thing…the % operator uses the default similarity threshold having a value of 0.3 but if we need to adjust it (e.g 0.5) we can use the SIMILARITY function:

Bu durumda şöyle yaparız

SELECT * FROM t_todo todo WHERE SIMILARITY(todo.title, 'Clean') > 0.4

2. Root (Kök ve Lexeme) Kelime Arama

Örnek

Elimizde şöyle bir tablo olsun

CREATE TABLE log(
   name TEXT,
   description TEXT
);

INSERT INTO log VALUES('ERROR1','Failed to retreive credentials');
INSERT INTO log VALUES('ERROR2','Fatal error. No records present. Please try again with a
 different value');
INSERT INTO log VALUES('ERROR3','Unable to connect. Credentials missing');

Arama için şöyle yaparız

SELECT * FROM log WHERE to_tsvector(description) @@ to_tsquery('miss');

Açıklaması şöyle

The to_tsvector function converts the values to their lexeme and the to_tsquery function will try to match the words.

3. Levenshtein Matching

Kurulum

Şöyle yaparız

CREATE EXTENSION fuzzystrmatch;
Örnek

Şöyle yaparız

SELECT * FROM storm_events WHERE levenshtein(event_type, 'Drot') < 4

4. Phonetic Match

Açıklaması şöyle

The fuzzystrmatch extension has another cool feature where we can search for text which does not have similar spelling but sounds the same.

Örnek

Şöyle yaparız. Telaffuzu "he" kelimesine benzeyen Heat ve Hail gibi şeyler döndürür

SELECT DISTINCT(event_type) FROM storm_events WHERE DIFFERENCE(event_type, 'he') > 2