Index

DB2 for I: Binary-Radix Index


Binary-Radix Index

  1. This is default index for CRTLF command and CREATE INDEX statement.
  2. It uses the Radix tree data structure.
    1. The radix tree data structure is a compressed version of the digital tree also called Trie (try) data structure.

There are the main terms in action here
1. Binary tree data structure
2. Trie data structure
3. Radix tree data structure

Binary tree data structure:(Google for more details)

1. It’s a tree data structure where each node can have maximum 2 child nodes.
a. Simple called left node and the right node
2. This structure is arranged in such a way that the value of left child node is less than the value parent node and value of right child node is greater than or equal to the value of parent node.
3. If use Binary search algorithm to look find a match.

Something like this

Binary tree

More details here.

Trie data structure

  1. It’s a tree-like data structure.
  2. Each node represents a step or a try towards the required match.
  3. Its root node is always empty.
  4. Each node can contain any number of child nodes.
  5. Right node always contains bigger value than its sibling left node.

 

For example, let say we have following string key values

  1. APPLE
  2. CAN
  3. CAT
  4. CORN
  5. RAT
  6. RED
  7. RUSH

In this case, Trie data structure uses individual characters from each string to make sure that a value can be retrieved from the data structure by reading a single branch starting from top node.

So, these values will be represented in this way

 

Here is how this data structure find a value

  1. Find “RED”

  1. Find “CORN”

 

There is one more term in play here that is “redundant nodes”. Any node with just a single child is called a redundant node. For example, from the branch for the key “APPLE” node A is a redundant node.

Why the redundant nodes matter?

We have only one key starting with letter A. And still, we created 5 nodes to represent a single value and data structure have to take 6 steps to find the match.

Here are other redundant  nodes

 

Radix Tree Data structure(Not the Binary-Radix index).

As I said earlier “Radix tree data structure is a compressed version of Trie (try) data structure.”

So, it compresses these redundant nodes it can save a lot of space and steps to find the final match.

For example, if we compress the five nodes for key “APPLE” into a single node for string “APPLE” it will save a lot of space and step to find the match and that’s what Radix tree does.

So, for the same set of keys, Radix tree will look like this:

  1. All the redundant nodes are compressed to make a single node.
  2. Like Trie data structure, one node can have any number of child nodes.

 

Back to the Binary-Radix Index.

  1. It’s a mix of Binary Tree and Radix tree
    1. Each node can have maximum 2 child nodes. Simply called the left child and right child.
  2. It uses the Binary search algorithm to find a match.

 

Let say data in the table looks something like this

Key field RRN
CAN 1
APPLE 2
CAT 3
RED 4
RAT 5
RUSH 6
CORN 7

 

Now if you search for key = ”RUSH”

  1. This data structure will go through following nodes
    1. ROOT
    2. “R”
    3. “USH”
    4. And will return the RRN = 6.
  2. Now RDBMS can quickly find a record in the table with RRN and return the all required columns.
  3. Nodes also contain some metadata.
  4. Access path maintenance all about rearranging this tree to reflect the changes in the base table.

 

More references:

  1. https://medium.com/basecs/compressing-radix-trees-without-too-many-tears-a2e658adb9a0

IBM i developer.

View Comments
There are currently no comments.