Index

DB2 for I: Encoded Vector Index


Encoded Vector Index or EVI

  1. It’s an IBM’s version of Bitmap index.
  2. EVI is relatively new compared to Radix Index.
  3. Logical files do not support EVI.
  4. Use “CREATE ENCODED VECTOR INDEX ….” Statement to create EVI index on table
  5. It’s very efficient for:
    1. Small set or say defined set of distinct keys like States in USA
    2. Read-only table

Quick look into Bitmap Index

Let say this table represent the keys and corresponding RRNs.

KEY RRN
TX 004
AZ 002
OH 003
TX 001
OH 005

Here is the Bitmap index for these keys(with RRN)

001 002 003 004 005
TX 1 0 0 1 0 RRN 001 and 005 contains the key APPLE
AZ 0 1 0 0 0 RRN 002 contains the key MANGO
OH 0 0 1 0 1 RRN 003 and 005 contains the key PEACH

 

  1. BitMap Index is good for small values.
  2. As the number of keys increase (new RRNs) numbers of bits needs to store the index also increased. (Issue 1)
  3. When a new key is added, System has to update the complete index. (Issue 2)

So, IBM create a modified version called Encoded Vector Index aka EVI

EVI: Encoded Vector Index

EVI has 2 parts

  1. Symbol table
    1. It contains a distinct list of keys
    2. A reference (called code) to an element in Vector list
    3. And it may contain the values for some SQL functions like sum(), count(),avg().
  2. Vector list
    1. It’s just a list
    2. Each position represents the RRN from base table
    3. This position is the only link between the Index and the base table.
    4. The value at a position is the link between Symbol table and this list.

 

Let say this is the data in the base table

 

Key RRN
TX 004
AZ 002
OH 003
TX 001
OH 005

 

The Symbol table might look like this

Key Code (ref to Vector list) Count() Some other functions output
TX 99 2 ….
AZ 67 1 ….
OH 82 2 ….

 

Vector List :

Position in the list Value
1 99
2 67
3 82
4 99
5 82

So, let say system need to find a match for key value “TX”

  1. Look into the Symbol table for the code for this key value TX = 99
  2. Now check the vector list for all the values equals to 99
    1. Available at position 1 and 4
  3. RRN 1 and 4 in the base table contains the record for the key “TX”.

If you add a new Key in the symbol table, it will only impact a single entry in the vector list. So unlike Bitmap Index system don’t need to update the complete index. The system will just create a new entry in the vector list.

 

Memory allocation for the CODE (the reference between the Symbol table and the Vector list)

  1. By default, DB2 allocate 1 byte for the code.
    1. Means total 256 distinct keys (28) can be added in the symbol table.
    2. If the number of distinct keys increased beyond 256
      1. DB2 will allocate 2 bytes (then 4 bytes .. 8 bytes and so on) for the code
      2. And rebuild the index.
    3. So, every time size increase, System has to rebuild the complete Index.
  2. To limit the number of these rebuilds, you can give some idea to the system of how many distinct keys you are expecting in this index.
    1. So, based on this idea, System will allocate an appropriate number of bytes for the code.

CREATE ENCODED VECTOR INDEX XYZ on ABC(state)

WITH 3650 DISTINCT VALUES

 

  1. From this statement, System gets the idea that there might be 3650 distinct keys in this index.
    1. So, 1-byte won’t be enough for 3650 values
    2. But 2 bytes will be enough for now.

 

  1. This is not a hard limit. If Number of distinct keys in the index goes beyond the limit of 2 bytes, System will allocate extra space and rebuild the index.

More references

  1. https://www.ibm.com/developerworks/ibmi/library/i-evi-only-access-in-ibm-db2-for-i/index.html
  2. https://www.ibm.com/support/knowledgecenter/en/ssw_ibm_i_73/rzajq/howeviworks.htm

IBM i developer.

View Comments
There are currently no comments.