SQL OLAP

DB2 SQL OLAP : RANK and DENSE_RANK


In any exam how are participants are ranked ?

  1. Participant(s) with highest marks got the 1st rank.
  2. Participant(s) with 2nd highest marks got the 2nd rank and so on.


Here is a sample table with participant names with marks

Name Marks
Marco Aguilar 400
Jaime Collier 600
Ryan Sherman 900
Sara Obrien 300
Rebecca Collins 900
Kristopher Barnett 800
Sara Obrien 600
Judith Knight 700
Christine Lawrence 500
Veronica Stewart 600

  1. Rank function work in almost same manner as we define ranks in any exam.
  2. There is a minor difference in DENSE_RANK() and RANK() functions.
  3. Same as ROW_NUMBER()  OVER() function is required.
  4. But ORDER BY parameter is required in OVER() function.
    1. Black OVER() will not work
    2. If you “PARTITION BY” it must be followed by “ORDER BY”.
  5. “ORDER BY” defines how to want to give the ranks.

 

1. DENSE_RANK()

  1. This work exactly as the exam ranking login defined in the starting of this article.
    1. Based on “ORDER BY” top most row and all other matching rows get rank 1.
    2. 2nd and all matching rows get rank 2 and so on.
  2. There will be not gaps in rank sequence. Means no sequence number will be missing between 1 and highest rank sequence.

OVER(ORDER BY…)

select a.*, DENSE_RANK() OVER(ORDER BY MARKS DESC) as DENSE_RANK from RESULT a
  1. Both rows with 900 marks are ranked 1.
  2. Row with 800 marks is ranked 2.
  3. All 3 rows with 600 marks are ranked 4.

 

Name Marks DENSE_RANK
Ryan Sherman 900 1
Rebecca Collins 900 1
Kristopher Barnett 800 2
Judith Knight 700 3
Jaime Collier 600 4
Sara Obrien 600 4
Veronica Stewart 600 4
Christine Lawrence 500 5
Marco Aguilar 400 6
Sara Obrien 300 7

 

OVER(PARTITION BY… ORDER BY…)

  1. “PARTITION BY” works same as in ROW_NUMBER() function.
    1. Create a virtual partition in rows and each partition has its own set of sequence number starting with 1.
select a.*, DENSE_RANK() OVER(PARTITION BY MARKS  ORDER BY NAME) as DENSE_RANK 

from RESULT a
  1. Partitions are created by MARKS so each set of equal marks get its own set of rank sequence.
  2. After that ranks are assigned based on NAME (ORDER BY NAME).
Name Marks DENSE_RANK
Sara Obrien 300 1
Marco Aguilar 400 1
Christine Lawrence 500 1
Jaime Collier 600 1
Sara Obrien 600 2
Veronica Stewart 600 3
Judith Knight 700 1
Kristopher Barnett 800 1
Rebecca Collins 900 1
Ryan Sherman 900 2

 

2. RANK()

Here is how RANK() is different from DENSE_RANK().

  1. If you ask all participants to make a queue by their marks only.(Highest mark in front and lowest in back).
  2. Queue will look like some this
Position in queue 1 2 3 4 5 6 7 8 9 10
Marks 900 900 800 700 600 600 600 500 400 300

Here is how RANK() works

  1. Same marks always get the same rank. So
  2. Top 900 occupy position 1 and 2. But you can not tell which one will go for position 1 and which one for position 2.
    1. So both gets RANK 1
  3. Both 900 got RANK 1 but they occupied first 2 positions
  4. So 800 will get the RANK 3. This will create a gap as there is no one with RANK 2.
  5. 700 will get the RANK 4.
  6. Three 600 with occupy position 5,6 and 7 and again you can not tell which one gets which position.
    1. So assign all three same RANK 5.
  7. Now 500 will get RANK 8 creating gaps for RANK 6 and RANK 7.
  8. and SO ON…

This example with RANK() and DENSE_RANK() will help you understand how RANK() works in comparison of DENSE_RANK()

select a.*, 
RANK() OVER(ORDER BY MARKS DESC) as RANK, 
DENSE_RANK() OVER(ORDER BY MARKS DESC) as DENSE_RANK  
from RESULT a

 

Name Marks RANK DENSE_RANK
Ryan Sherman 900 1 1
Rebecca Collins 900 1 1
Kristopher Barnett 800 3 2
Judith Knight 700 4 3
Jaime Collier 600 5 4
Sara Obrien 600 5 4
Veronica Stewart 600 5 4
Christine Lawrence 500 8 5
Marco Aguilar 400 9 6
Sara Obrien 300 10 7
  1. There is not gaps in DENSE_RANK with all values between 1 to 7.
    1. Maximum rank can be less than total number of rows.
  2. There are gaps es in RANK with missing ranks 2,6 and 7.
    1. Maximum rank is always equal to number of rows(if PARITION BY is not used).

Its not like that gaps will occur in every case of RANK(). Based on values used for “ORDER BY” gaps may or may not occur.

select a.*, RANK() OVER(ORDER BY MARKS DESC, name) as RANK from RESULT a
  1. No gaps
Name Marks RANK
Rebecca Collins 900 1
Ryan Sherman 900 2
Kristopher Barnett 800 3
Judith Knight 700 4
Jaime Collier 600 5
Sara Obrien 600 6
Veronica Stewart 600 7
Christine Lawrence 500 8
Marco Aguilar 400 9
Sara Obrien 300 10

 

RANK() OVER(PARTITION BY… ORDER BY…) work in same concept as in DENSE_RANK() OVER(PARTITION BY… ORDER BY…) as above.


IBM i developer.

View Comments
There are currently no comments.