Tuesday, 20 August 2019

High Cardinality - Number of Distinct Elements

Introduction
Cardinality is defined as follows.
In the world of databases, cardinality refers to the number of unique values contained in a particular column, or field, of a database.
Example
A worked example is as follows.
Say there are 10,000 pieces of equipment, each with 100 sensors, running 10 different firmware versions, spread across 100 sites:

The maximum cardinality of this dataset then becomes 1 billion [10,000 x 100 x 10 x 100].
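The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the field names (`sensor_id`, `firmware_version`, `site_id`) are hypothetical labels for the counts in the quote, and the product is an upper bound that assumes every combination of values can actually occur.

```python
from math import prod

# Distinct-value counts taken from the example above.
# The field names are hypothetical labels, not from a real schema.
distinct_counts = {
    "equipment_id": 10_000,
    "sensor_id": 100,
    "firmware_version": 10,
    "site_id": 100,
}


def max_cardinality(counts):
    """Upper bound on distinct combinations: product of per-field counts."""
    return prod(counts.values())


print(f"{max_cardinality(distinct_counts):,}")  # prints 1,000,000,000
```

In a real dataset the observed cardinality is usually lower than this bound, because not every combination appears in the data.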

Now imagine that the equipment can move as well, and we’d like to store the precise GPS location (lat, long) and use that as indexed metadata to query by. Because (lat, long) is a continuous field (as opposed to a discrete field like equipment_id), by indexing on location, the max cardinality of this dataset is now infinitely large (unbounded).
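To see why a continuous field behaves differently, here is a small sketch with made-up data. The `simulate_readings` helper, the starting coordinates, and the step size are all assumptions chosen for illustration. The number of distinct `equipment_id` values stays fixed, while the number of distinct (lat, long) pairs keeps growing with the number of readings.

```python
def simulate_readings(n_readings, n_equipment=5):
    """Yield (equipment_id, (lat, long)) pairs for equipment drifting in tiny steps.

    Hypothetical data: coordinates start at (41.0, 29.0) and move 1e-6 degrees
    per reading, so every reading has a new location.
    """
    for i in range(n_readings):
        equipment_id = i % n_equipment
        lat = 41.0 + i * 1e-6
        long = 29.0 + i * 1e-6
        yield equipment_id, (lat, long)


def cardinality(values):
    """Number of distinct values in an iterable."""
    return len(set(values))


for n in (100, 1_000, 10_000):
    readings = list(simulate_readings(n))
    id_cardinality = cardinality(eq for eq, _ in readings)
    location_cardinality = cardinality(loc for _, loc in readings)
    print(n, id_cardinality, location_cardinality)
```

The discrete field is bounded by the number of pieces of equipment, but the location field has as many distinct values as there are readings.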
When Cardinality Is High
High cardinality is not necessarily a bad thing in itself. However, some ways to reduce cardinality are as follows.
1. The first question you can answer is: do you need every unique value that you’re storing? For example, you might be able to insert data every minute instead of every 5 seconds without losing the patterns in your data.
2. Another option is to expire data after a specified window of time to keep the dataset smaller.
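The two options above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not how any particular database does it: the `downsample` and `expire` helpers are hypothetical, averaging is just one possible aggregation, and the 30-day retention window is an arbitrary choice.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from statistics import mean


def downsample(readings, bucket=timedelta(minutes=1)):
    """Average (timestamp, value) readings into fixed-width time buckets."""
    width = bucket.total_seconds()
    buckets = defaultdict(list)
    for ts, value in readings:
        # Floor the timestamp to the start of its bucket.
        start = ts.timestamp() // width * width
        buckets[start].append(value)
    return [
        (datetime.fromtimestamp(start, tz=timezone.utc), mean(values))
        for start, values in sorted(buckets.items())
    ]


def expire(readings, now, retention=timedelta(days=30)):
    """Keep only readings that fall inside the retention window (assumed 30 days)."""
    cutoff = now - retention
    return [(ts, value) for ts, value in readings if ts >= cutoff]


# Two minutes of made-up readings, one every 5 seconds.
start = datetime(2019, 8, 20, tzinfo=timezone.utc)
raw = [(start + timedelta(seconds=5 * i), float(i)) for i in range(24)]

per_minute = downsample(raw)
print(len(raw), len(per_minute))  # prints 24 2

recent = expire(raw, now=start + timedelta(days=31))
print(len(recent))  # prints 0
```

Whether averaging is acceptable depends on the data. If short spikes matter, keeping the minimum and maximum per bucket alongside the mean may be a better fit.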
