Forum Stats

  • 3,827,867 Users
  • 2,260,833 Discussions
  • 7,897,399 Comments

Discussions

Quick Tip: Oracle Database 12c Makes Top-N, Inner-N, and Bottom-N Querying Easier

top.gun
top.gun Member Posts: 3,666 Gold Crown
edited Jun 29, 2016 1:56PM in Database Software

Quick Tip: Oracle Database 12c Makes Top-N, Inner-N, and Bottom-N Querying Easier

by Paul Guerin, @top.gun

See how to construct queries to return rows or a percentage of rows—with or without ties—making top-n, inner-n, and bottom-n querying easier than ever before.

Using a top-n query, it's easy to exclude trailing rows in order to answer a question such as "Who are the highest n salary earners?" The legacy top-n query uses the ROWNUM pseudo column, or even the ROW_NUMBER and PERCENT_RANK analytic functions, to exclude trailing rows.

Now, in Oracle Database 12c, there is an easier method to exclude rows, and there is the option of including ties too. In addition, the syntax of the top-n query is much less ambiguous than before. This article examines these features further.

Oracle Database 12c also offers the ability to exclude rows according to a percentage, rather than according to a row count. Excluding rows as a percentage makes the top-n query even more versatile, and means it's possible to answer the more difficult question of "Who are the highest n percent of salary earners?"

Let's examine the Oracle Database 12c top-n query, and as a bonus examine the inner-n and bottom-n queries using the PERCENT_RANK analytic function.

Top-N Rows

Arguably the most common query to retrieve the top-n rows is one that utilizes the ROWNUM pseudo column.

For example, the following query attempts to determine the top five salary earners using the ROWNUM pseudo column to exclude all rows after the fifth.

-- common method to determine the top 5 rows
SELECT employee_id, first_name, last_name, salary
FROM employees
WHERE rownum<=5
ORDER BY salary DESC;

EMPLOYEE_ID FIRST_NAME           LAST_NAME                     SALARY
----------- -------------------- ------------------------- ----------
        100 Steven               King                           24000
        101 Neena                Kochhar                        17000
        102 Lex                  De Haan                        17000
        103 Alexander            Hunold                          9000
        104 Bruce                Ernst                           6000

While the structure of the above query is quite common, it doesn't quite do what was intended. In fact, rather than return the first five rows of a descending sort, the query above fetches five rows and then sorts them in descending order. Sorting any five rows isn't always the same as returning the first five rows of a sort, so unexpected results can be returned.

The top-n query that is guaranteed to sort first, and then return the first five rows is done using the ROWNUM pseudo column outside of an inline view.

Consequently, the top-n query to correctly determine the top five salary earners is as follows:

-- correct method to determine the top 5 rows
SELECT * 
FROM (
  SELECT employee_id, first_name, last_name, salary
  FROM employees 
  ORDER BY salary DESC
)
WHERE rownum<=5;

EMPLOYEE_ID FIRST_NAME           LAST_NAME                     SALARY
----------- -------------------- ------------------------- ----------
        100 Steven               King                           24000
        101 Neena                Kochhar                        17000
        102 Lex                  De Haan                        17000
        145 John                 Russell                        14000
        146 Karen                Partners                       13500

So the result set above is correct, and the SQL query is unambiguous, but Oracle Database 12c introduces a simpler syntax.

Consequently, it's cleaner to determine the top five salary earners using the Oracle Database 12c syntax with a FETCH FIRST n ROWS ONLY clause:

-- Oracle Database 12c top 5 rows
SELECT employee_id, first_name, last_name, salary
FROM employees
ORDER BY salary DESC
FETCH FIRST 5 ROWS ONLY;

EMPLOYEE_ID FIRST_NAME           LAST_NAME                     SALARY
----------- -------------------- ------------------------- ----------
        100 Steven               King                           24000
        101 Neena                Kochhar                        17000
        102 Lex                  De Haan                        17000
        145 John                 Russell                        14000
        146 Karen                Partners                       13500

Now, it needs to be remembered that tied rows are not necessarily included in the result set. If there were 10 employees tied with the highest salary, then still no more than five rows are returned. If there were 10 other employees tied with the fifth highest salary earner, then still only five rows are returned.

Fortunately Oracle Database 12c includes a clause to include ties.

Top-N Rows with Ties

Using the Oracle Database 12c FETCH ROWS WITH TIES clause, it's very easy to include the extra rows that are ties.

Examine the following SQL statement:

-- top 2 rows with ties
SELECT employee_id, first_name, last_name, salary
FROM employees
ORDER BY salary DESC
FETCH FIRST 2 ROWS WITH TIES;

EMPLOYEE_ID FIRST_NAME           LAST_NAME                     SALARY
----------- -------------------- ------------------------- ----------
        100 Steven               King                           24000
        101 Neena                Kochhar                        17000
        102 Lex                  De Haan                        17000

3 rows selected.

Only the first two rows are requested, but because the second row ties with another row, the result set includes three rows.

Moreover Oracle Database 12c also features the ability to limit the number of rows according to a percentage.

Top-N Percent of Rows with Ties

Say we wanted to ask a different question such as "Who are the highest n percent of salary earners?" We might want to include ties rather than return just a strict percentage, so this time the FETCH PERCENT ROWS WITH TIES clause is used.

-- top 10 percent with ties
SELECT employee_id, first_name, last_name, salary
FROM employees
ORDER BY salary DESC
FETCH FIRST 10 PERCENT ROWS WITH TIES;

EMPLOYEE_ID FIRST_NAME           LAST_NAME                     SALARY
----------- -------------------- ------------------------- ----------
        100 Steven               King                           24000
        101 Neena                Kochhar                        17000
        102 Lex                  De Haan                        17000
        145 John                 Russell                        14000
        146 Karen                Partners                       13500
        201 Michael              Hartstein                      13000
        108 Nancy                Greenberg                      12008
        205 Shelley              Higgins                        12008
        147 Alberto              Errazuriz                      12000
        168 Lisa                 Ozer                           11500
        114 Den                  Raphaely                       11000
        148 Gerald               Cambrault                      11000
        174 Ellen                Abel                           11000

13 rows selected.

In the employees table, there are three employees with a salary of 11,000. Because they are ties, the WITH TIES subclause ensures the result set includes them.

Inner-N Percent and Bottom-N Percent

Also as versatile are the inner-n percent and bottom-n percent queries. This time the easiest method is to use the PERCENT_RANK analytic.

The following two examples return

  • the second 10 percent of the top salary earners
  • the last 10 percent of the top salary earners

We do need to use an inline view, but the absence of the ROWNUM pseudo column means the query is still unambiguous.

-- between 10 and 20 percent
SELECT * FROM (
  SELECT employee_id, first_name, last_name, salary,
    percent_rank() OVER (ORDER BY salary desc)*100 pct
  FROM employees
)
WHERE pct BETWEEN 10 AND 20;

EMPLOYEE_ID FIRST_NAME           LAST_NAME                     SALARY        PCT
----------- -------------------- ------------------------- ---------- ----------
        162 Clara                Vishney                        10500 12.2641509
        149 Eleni                Zlotkey                        10500 12.2641509
        169 Harrison             Bloom                          10000 14.1509434
        156 Janette              King                           10000 14.1509434
        150 Peter                Tucker                         10000 14.1509434
        204 Hermann              Baer                           10000 14.1509434
        170 Tayler               Fox                             9600 17.9245283
        151 David                Bernstein                       9500 18.8679245
        157 Patrick              Sully                           9500 18.8679245
        163 Danielle             Greene                          9500 18.8679245

10 rows selected. 

-- last 10 percent
SELECT * FROM (
  SELECT employee_id, first_name, last_name, salary,
    percent_rank() OVER (ORDER BY salary desc)*100 pct
  FROM employees
)
WHERE pct >= 90;

EMPLOYEE_ID FIRST_NAME           LAST_NAME                     SALARY        PCT
----------- -------------------- ------------------------- ---------- ----------
        140 Joshua               Patel                           2500 90.5660377
        119 Karen                Colmenares                      2500 90.5660377
        182 Martha               Sullivan                        2500 90.5660377
        131 James                Marlow                          2500 90.5660377
        144 Peter                Vargas                          2500 90.5660377
        191 Randall              Perkins                         2500 90.5660377
        127 James                Landry                          2400 96.2264151
        135 Ki                   Gee                             2400 96.2264151
        128 Steven               Markle                          2200 98.1132075
        136 Hazel                Philtanker                      2200 98.1132075
        132 TJ                   Olson                           2100        100

11 rows selected. 

Conclusion

There you are—with Oracle Database 12c, limiting a result set by rows or the percentage of rows is unambiguous because the ROWNUM pseudo column is not used anymore.

The query can be constructed to return rows or the percentage of rows—with or without ties—making top-n, inner-n, and bottom-n querying easier than ever before.

See Also

About the Author

Paul Guerin is an internationally recognized thought leader on Oracle Database and a performance architect. He has been the principle consultant for many blue-chip companies from Australia, France, Hong Kong, and the United States. Moreover he has presented at some of the world's leading Oracle conferences, including Oracle OpenWorld 2013. Paul is based from a global delivery center in Southeast Asia and is a participant in the Oracle ACE program. Paul is a respected expert in the Oracle Community; follow him at @Top Gun

Comments

  • Virgil Ionescu
    Virgil Ionescu Member Posts: 1
    edited Dec 27, 2018 3:14PM

    One of the easiest way to write TOP N QUERIES in older versions of ORACLE (ver.8.0) is to make use of the power of UNION operator. UNION will order the records by the columns in the query.

    This functionality of the UNION operator is less well known, if not very poorly known.

    Example:

    SELECT * FROM
    (SELECT EMP_NO, EMP_NAME FROM EMP_TABLE
    UNION
    SELECT 99999999999,'' FROM DUAL)
    WHERE ROWNUM<=5

    where 99999999999 is bigger then all values in EMP_NO.

    Or, if you want to select TOP 5 salary employees with the highest 5 salaries:

    SELECT EMP_NO, EMP_NAME, 99999999999999-TMP_EMP_SAL
    FROM
    (SELECT 99999999999999-EMP_SAL TMP_EMP_SAL, EMP_NO, EMP_NAME 
     FROM EMP_TABLE
     UNION
     SELECT 99999999999999,0,'' FROM DUAL)
     WHERE ROWNUM<=5

    Obviously, this way of using the UNION operator can also be used in the newer versions of ORACLE and even in other databases.
    Enjoy the power of the UNION operator!

    Regards,

    Virgil Ionescu

    ***Moderator action (Timo) removed private contact data and signature.***

    ALOKCLOUD