2020년 2월 24일 월요일

Costs Associated with Indexes


Costs Associated with Indexes

Indexes entail certain costs. Memory consumption and incremental maintenance are two major cost factors that need to be considered.

Memory Consumption

To store the mapping information of value IDs to records, each index uses an inverted list that needs to be kept in main memory. This list typically requires an amount of memory that is of the same order of magnitude as the index vector of the corresponding attribute. When creating a concatenated index (for more information, see above), there is even more overhead because an additional dictionary containing the concatenated values of all participating columns needs to be created as well. It is difficult to estimate the corresponding overhead, but it is usually notably higher than the summed-up size of the dictionaries of the participating columns. Therefore, concatenated indexes should be created with care.
The memory needed for inverted individual indexes includes the memory for inverted indexes on all indexed columns. This memory usage can be queried as a sum of M_CS_COLUMNS.MEMORY_SIZE_INDEX for all indexed columns.
The main advantage of the inverted individual index is its low memory footprint. The memory needed for inverted individual indexes is much smaller than the memory used for the internal index key column that is required for inverted value and inverted hash indexes. In addition, the data and log I/O overhead, table load time, CPU needed for the delta merge, and the overhead of DML update operations are also reduced because an internal column does not need to be maintained. Similarly, the DDL to create an inverted individual index is also much faster than for the other index types because the concatenated string does not need to be generated.

Incremental Maintenance

Whenever a DML operation is performed on the base table, the corresponding index structures need to be updated as well (for example, by inserting or deleting entries). These additional maintenance costs add to the costs on the base relation, and, depending on the number of indexes created, the number of attributes in the base table, and the number of attributes in the individual indexes, might even dominate the actual update time. Again, this requires that care is taken when creating additional indexes.

댓글 없음:

댓글 쓰기