Costs Associated with Indexes
Indexes
entail certain costs. Memory consumption and incremental maintenance are two
major cost factors that need to be considered.
Memory Consumption
To store
the mapping information of value IDs to records, each index uses an inverted
list that needs to be kept in main memory. This list typically requires an
amount of memory that is of the same order of magnitude as the index vector of
the corresponding attribute. When creating a
concatenated index (for more information, see above), there is even more
overhead because an additional dictionary containing the concatenated values of
all participating columns needs to be created as well. It is difficult
to estimate the corresponding overhead, but it is usually notably higher than
the summed-up size of the dictionaries of the participating columns. Therefore,
concatenated indexes should be created with care.
The
memory needed for inverted individual indexes includes the memory for inverted
indexes on all indexed columns. This memory usage can be queried as a sum of M_CS_COLUMNS.MEMORY_SIZE_INDEX for all indexed
columns.
The main advantage of the inverted individual index is its low memory
footprint. The
memory needed for inverted individual indexes is much smaller than the memory
used for the internal index key column that is required for inverted value and
inverted hash indexes. In addition, the data and log I/O overhead, table load
time, CPU needed for the delta merge, and the overhead of DML update operations
are also reduced because an internal column does not
need to be maintained. Similarly, the DDL to create an inverted
individual index is also much faster than for the other index types because the
concatenated string does not need to be generated.
Incremental
Maintenance
Whenever
a DML operation is performed on the base table, the corresponding index
structures need to be updated as well (for example, by inserting or deleting
entries). These additional maintenance costs add to the costs on the base
relation, and, depending on the number of indexes
created, the number of attributes in the base
table, and the number of attributes in the
individual indexes, might even dominate the actual
update time. Again, this requires that care is taken when creating
additional indexes.