Encoded Vector Index or EVI
- It’s an IBM’s version of Bitmap index.
- EVI is relatively new compared to Radix Index.
- Logical files do not support EVI.
- Use “CREATE ENCODED VECTOR INDEX ….” Statement to create EVI index on table
- It’s very efficient for:
- Small set or say defined set of distinct keys like States in USA
- Read-only table
Quick look into Bitmap Index
Let say this table represent the keys and corresponding RRNs.
Here is the Bitmap index for these keys(with RRN)
|TX||1||0||0||1||0||RRN 001 and 005 contains the key APPLE|
|AZ||0||1||0||0||0||RRN 002 contains the key MANGO|
|OH||0||0||1||0||1||RRN 003 and 005 contains the key PEACH|
- BitMap Index is good for small values.
- As the number of keys increase (new RRNs) numbers of bits needs to store the index also increased. (Issue 1)
- When a new key is added, System has to update the complete index. (Issue 2)
So, IBM create a modified version called Encoded Vector Index aka EVI
EVI: Encoded Vector Index
EVI has 2 parts
- Symbol table
- It contains a distinct list of keys
- A reference (called code) to an element in Vector list
- And it may contain the values for some SQL functions like sum(), count(),avg().
- Vector list
- It’s just a list
- Each position represents the RRN from base table
- This position is the only link between the Index and the base table.
- The value at a position is the link between Symbol table and this list.
Let say this is the data in the base table
The Symbol table might look like this
|Key||Code (ref to Vector list)||Count()||Some other functions output|
Vector List :
|Position in the list||Value|
So, let say system need to find a match for key value “TX”
- Look into the Symbol table for the code for this key value TX = 99
- Now check the vector list for all the values equals to 99
- Available at position 1 and 4
- RRN 1 and 4 in the base table contains the record for the key “TX”.
If you add a new Key in the symbol table, it will only impact a single entry in the vector list. So unlike Bitmap Index system don’t need to update the complete index. The system will just create a new entry in the vector list.
Memory allocation for the CODE (the reference between the Symbol table and the Vector list)
- By default, DB2 allocate 1 byte for the code.
- Means total 256 distinct keys (28) can be added in the symbol table.
- If the number of distinct keys increased beyond 256
- DB2 will allocate 2 bytes (then 4 bytes .. 8 bytes and so on) for the code
- And rebuild the index.
- So, every time size increase, System has to rebuild the complete Index.
- To limit the number of these rebuilds, you can give some idea to the system of how many distinct keys you are expecting in this index.
- So, based on this idea, System will allocate an appropriate number of bytes for the code.
CREATE ENCODED VECTOR INDEX XYZ on ABC(state)
WITH 3650 DISTINCT VALUES
- From this statement, System gets the idea that there might be 3650 distinct keys in this index.
- So, 1-byte won’t be enough for 3650 values
- But 2 bytes will be enough for now.
- This is not a hard limit. If Number of distinct keys in the index goes beyond the limit of 2 bytes, System will allocate extra space and rebuild the index.