Wednesday, February 26, 2020

Costs of Partitioning

Depending on the actual value distribution and the partitioning criteria, the main memory consumption of a table can increase or decrease when it is changed from a non-partitioned to a partitioned table. This may seem counterintuitive at first; the root cause is the dictionary compression applied to the table's columns.
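To make the role of dictionary compression concrete, here is a minimal sketch of the general idea. It is a simplified model, not the database's actual implementation: the function name `dictionary_encode` and the sample data are assumptions made for illustration. Each column is stored as a dictionary of its distinct values plus an index vector of value IDs, and the number of bits per ID depends on the dictionary size.

```python
from math import ceil, log2

def dictionary_encode(column):
    """Return (dictionary, index_vector, bits_per_value) for one column.

    Simplified model: the dictionary holds the sorted distinct values and
    the index vector stores each row's position in that dictionary.
    """
    dictionary = sorted(set(column))
    position = {value: i for i, value in enumerate(dictionary)}
    index_vector = [position[value] for value in column]
    # Bits needed to address every dictionary entry (at least 1).
    bits_per_value = max(1, ceil(log2(len(dictionary))))
    return dictionary, index_vector, bits_per_value

# Hypothetical sample column with three distinct values.
months = ["JAN", "FEB", "JAN", "MAR", "FEB", "JAN"]
dictionary, index_vector, bits_per_value = dictionary_encode(months)
print(dictionary, index_vector, bits_per_value)
```

In this model, memory depends on two things: how many entries the dictionaries hold and how wide each index vector entry is. Partitioning can move both in either direction.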

  • Increased memory consumption due to partitioning

    A table has two attributes, MONTH and YEAR, and contains data for all 12 months and two distinct years (2013 and 2014). When the table is partitioned by YEAR, each partition keeps its own dictionary for the MONTH attribute, so the same 12 values are held in memory twice (once for 2013 and once for 2014), which increases memory consumption.
  • Decreased memory consumption due to partitioning

    A table has two attributes, GENDER and FIRSTNAME, and stores data about German customers. When the table is partitioned by GENDER, it is divided into two groups (female and male). In Germany, the sets of first names used for females and for males are limited and largely disjoint. As a result, the FIRSTNAME dictionary is implicitly split into two almost distinct dictionaries, each containing roughly n/2 distinct values, compared with n distinct values in the unpartitioned table. Each entry in the index vector therefore needs about log2(n) - 1 bits instead of log2(n) bits. Because there is virtually no redundancy between the two dictionaries, partitioning reduces memory consumption.
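The two cases above can be checked with a rough back-of-the-envelope model. The sketch below is an illustration under stated assumptions, not a measurement of any real system: it counts dictionary entries and index vector bits only, the helper names (`column_cost`, `partitioned_cost`) are hypothetical, and the first names are synthetic placeholders (`F0`..`F7`, `M0`..`M7`) chosen so that the two groups are fully disjoint.

```python
from math import ceil, log2

def bits_per_value(distinct_count):
    """Bits needed to address every entry of a dictionary (at least 1)."""
    return max(1, ceil(log2(distinct_count)))

def column_cost(values):
    """Return (dictionary_entries, index_vector_bits) for one column fragment."""
    distinct = len(set(values))
    return distinct, len(values) * bits_per_value(distinct)

def partitioned_cost(rows, key_index, value_index):
    """Sum the per-partition costs when rows are split by the key column."""
    partitions = {}
    for row in rows:
        partitions.setdefault(row[key_index], []).append(row[value_index])
    costs = [column_cost(values) for values in partitions.values()]
    return sum(c[0] for c in costs), sum(c[1] for c in costs)

# Case 1: (YEAR, MONTH) rows, all 12 months for 2013 and 2014.
month_rows = [(year, month) for year in (2013, 2014) for month in range(1, 13)]
month_flat = column_cost([month for _, month in month_rows])
month_split = partitioned_cost(month_rows, 0, 1)

# Case 2: (GENDER, FIRSTNAME) rows with disjoint placeholder names.
name_rows = [("F", f"F{i}") for i in range(8)] + [("M", f"M{i}") for i in range(8)]
name_flat = column_cost([name for _, name in name_rows])
name_split = partitioned_cost(name_rows, 0, 1)

print("MONTH     unpartitioned:", month_flat, "partitioned by YEAR:", month_split)
print("FIRSTNAME unpartitioned:", name_flat, "partitioned by GENDER:", name_split)
```

Under these assumptions, partitioning by YEAR doubles the number of MONTH dictionary entries while the index vector stays the same size. Partitioning by GENDER leaves the total number of FIRSTNAME dictionary entries unchanged but saves one bit per row in the index vector.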
