This site is currently read-only as we are migrating to Oracle Forums for an improved community experience. You will not be able to initiate activity until January 30th, when you will be able to use this site as normal.

    Forum Stats

  • 3,889,958 Users
  • 2,269,775 Discussions
  • 7,916,823 Comments

Discussions

When we should use Scaler subquery and how it works

Bikram
Bikram Member Posts: 33 Green Ribbon

Hi All,

I wanted to know about when to use scaler subquery and how the scaler subquery works internally.

It would be great if I got answer with some examples or any documentation will also helpful.


Regards,

Bikram

Tagged:

Best Answer

  • BluShadow
    BluShadow Mr UKMember, Moderator Posts: 42,706 Red Diamond
    Answer ✓

    A scalar subquery is one that returns just a single value and row, so it can be used wherever a single value is required in a query e.g.

    SQL> ed
    Wrote file afiedt.buf
    
      1  select e.empno
      2        ,e.ename
      3        ,(select dname from dept d where d.deptno = e.deptno) as department
      4* from   emp e
    SQL> /
    
         EMPNO ENAME      DEPARTMENT
    ---------- ---------- --------------
          7839 KING       ACCOUNTING
          7698 BLAKE      SALES
          7782 CLARK      ACCOUNTING
          7566 JONES      RESEARCH
          7788 SCOTT      RESEARCH
          7902 FORD       RESEARCH
          7369 SMITH      RESEARCH
          7499 ALLEN      SALES
          7521 WARD       SALES
          7654 MARTIN     SALES
          7844 TURNER     SALES
          7876 ADAMS      RESEARCH
          7900 JAMES      SALES
          7934 MILLER     ACCOUNTING
    
    14 rows selected.
    

    Here, the department name is obtained for each employee record, as we know that we shall only retrieve a single department for a given deptno and that given deptno relates to the employee we're looking at.

    Without the d.deptno = e.deptno condition it's not scalar as the subquery would try and return multiple rows from the dept table, and you would get an exception.

    SQL> ed
    Wrote file afiedt.buf
    
      1  select e.empno
      2        ,e.ename
      3        ,(select dname from dept d) as department
      4* from   emp e
    SQL> /
          ,(select dname from dept d) as department
            *
    ERROR at line 3:
    ORA-01427: single-row subquery returns more than one row
    

    In reality though, it's often a good idea to avoid using scalar subqueries (they can make the query harder to read for starters), as usually you can do exactly what you need by using join conditions in the main query...

    SQL> ed
    Wrote file afiedt.buf
    
      1  select e.empno
      2        ,e.ename
      3        ,d.dname as department
      4  from   emp e
      5*        join dept d on (d.deptno = e.deptno)
    SQL> /
    
         EMPNO ENAME      DEPARTMENT
    ---------- ---------- --------------
          7839 KING       ACCOUNTING
          7698 BLAKE      SALES
          7782 CLARK      ACCOUNTING
          7566 JONES      RESEARCH
          7788 SCOTT      RESEARCH
          7902 FORD       RESEARCH
          7369 SMITH      RESEARCH
          7499 ALLEN      SALES
          7521 WARD       SALES
          7654 MARTIN     SALES
          7844 TURNER     SALES
          7876 ADAMS      RESEARCH
          7900 JAMES      SALES
          7934 MILLER     ACCOUNTING
    
    14 rows selected.
    


    The only 'advantage' to using a scalar subquery would be if there was no data returned by the subquery. In that case the value of the column would be null, but the main query would succeed in returning the other data. Using a regular (INNER) JOIN on the other hand would fail to return any rows that didn't have a matching record... though of course you can get around that using an OUTER JOIN...

    SQL> ed
    Wrote file afiedt.buf
    
    
      1  select e.empno
      2        ,e.ename
      3        ,d.dname as department
      4  from   emp e
      5*        left outer join dept d on (d.deptno = e.deptno)
    SQL> /
    
         EMPNO ENAME      DEPARTMENT
    ---------- ---------- --------------
          7839 KING       ACCOUNTING
          7782 CLARK      ACCOUNTING
          7934 MILLER     ACCOUNTING
          7566 JONES      RESEARCH
          7788 SCOTT      RESEARCH
          7902 FORD       RESEARCH
          7369 SMITH      RESEARCH
          7876 ADAMS      RESEARCH
          7698 BLAKE      SALES
          7499 ALLEN      SALES
          7521 WARD       SALES
          7654 MARTIN     SALES
          7844 TURNER     SALES
          7900 JAMES      SALES
    
    14 rows selected.
    

    In this case, every employee has a department that exists in department anyway, so that's not a problem.

Answers

  • Frank Kulash
    Frank Kulash Boston, USAMember, Moderator Posts: 43,002 Red Diamond
    edited Nov 25, 2022 2:19PM

    Hi, @Bikram

    See

    Because scalar subqueries can return values from other tables (or other rows from the same table) they can sometimes be used instead of joins, but that doesn't mean they are necessarily better than joins. For getting values from other rows of the same table, analytic functions or MATCH_RECOGNIZE may be better than scalar sub-queries.

  • BluShadow
    BluShadow Mr UKMember, Moderator Posts: 42,706 Red Diamond
    Answer ✓

    A scalar subquery is one that returns just a single value and row, so it can be used wherever a single value is required in a query e.g.

    SQL> ed
    Wrote file afiedt.buf
    
      1  select e.empno
      2        ,e.ename
      3        ,(select dname from dept d where d.deptno = e.deptno) as department
      4* from   emp e
    SQL> /
    
         EMPNO ENAME      DEPARTMENT
    ---------- ---------- --------------
          7839 KING       ACCOUNTING
          7698 BLAKE      SALES
          7782 CLARK      ACCOUNTING
          7566 JONES      RESEARCH
          7788 SCOTT      RESEARCH
          7902 FORD       RESEARCH
          7369 SMITH      RESEARCH
          7499 ALLEN      SALES
          7521 WARD       SALES
          7654 MARTIN     SALES
          7844 TURNER     SALES
          7876 ADAMS      RESEARCH
          7900 JAMES      SALES
          7934 MILLER     ACCOUNTING
    
    14 rows selected.
    

    Here, the department name is obtained for each employee record, as we know that we shall only retrieve a single department for a given deptno and that given deptno relates to the employee we're looking at.

    Without the d.deptno = e.deptno condition it's not scalar as the subquery would try and return multiple rows from the dept table, and you would get an exception.

    SQL> ed
    Wrote file afiedt.buf
    
      1  select e.empno
      2        ,e.ename
      3        ,(select dname from dept d) as department
      4* from   emp e
    SQL> /
          ,(select dname from dept d) as department
            *
    ERROR at line 3:
    ORA-01427: single-row subquery returns more than one row
    

    In reality though, it's often a good idea to avoid using scalar subqueries (they can make the query harder to read for starters), as usually you can do exactly what you need by using join conditions in the main query...

    SQL> ed
    Wrote file afiedt.buf
    
      1  select e.empno
      2        ,e.ename
      3        ,d.dname as department
      4  from   emp e
      5*        join dept d on (d.deptno = e.deptno)
    SQL> /
    
         EMPNO ENAME      DEPARTMENT
    ---------- ---------- --------------
          7839 KING       ACCOUNTING
          7698 BLAKE      SALES
          7782 CLARK      ACCOUNTING
          7566 JONES      RESEARCH
          7788 SCOTT      RESEARCH
          7902 FORD       RESEARCH
          7369 SMITH      RESEARCH
          7499 ALLEN      SALES
          7521 WARD       SALES
          7654 MARTIN     SALES
          7844 TURNER     SALES
          7876 ADAMS      RESEARCH
          7900 JAMES      SALES
          7934 MILLER     ACCOUNTING
    
    14 rows selected.
    


    The only 'advantage' to using a scalar subquery would be if there was no data returned by the subquery. In that case the value of the column would be null, but the main query would succeed in returning the other data. Using a regular (INNER) JOIN on the other hand would fail to return any rows that didn't have a matching record... though of course you can get around that using an OUTER JOIN...

    SQL> ed
    Wrote file afiedt.buf
    
    
      1  select e.empno
      2        ,e.ename
      3        ,d.dname as department
      4  from   emp e
      5*        left outer join dept d on (d.deptno = e.deptno)
    SQL> /
    
         EMPNO ENAME      DEPARTMENT
    ---------- ---------- --------------
          7839 KING       ACCOUNTING
          7782 CLARK      ACCOUNTING
          7934 MILLER     ACCOUNTING
          7566 JONES      RESEARCH
          7788 SCOTT      RESEARCH
          7902 FORD       RESEARCH
          7369 SMITH      RESEARCH
          7876 ADAMS      RESEARCH
          7698 BLAKE      SALES
          7499 ALLEN      SALES
          7521 WARD       SALES
          7654 MARTIN     SALES
          7844 TURNER     SALES
          7900 JAMES      SALES
    
    14 rows selected.
    

    In this case, every employee has a department that exists in department anyway, so that's not a problem.

  • mathguy
    mathguy Member Posts: 11,041 Black Diamond

    Here is a case where scalar subqueries are actually useful.

    Suppose you have a column in a table, the column is indexed, and you want to find the min and the max value in that column. Let's make up a test table:

    create table tbl (id primary key)
    nologging
    as
      with gen (x) as (select level from dual connect by level <= 1e3)
      select rownum from gen cross join gen
    ;
    


    Table TBL has a single column ID, with values from 1 to 1 million. We declared ID as primary key, so there will be a (unique) index on ID. (The fact that the index is a unique index, or that ID is not nullable, have nothing to do with the example; this is just an easy way to get several things done with as little work on my part as possible.)

    Now suppose we didn't know the values in the ID column, and we wanted to find the MIN and the MAX. We could write a query like this:

    select min(id) as min_id, max(id) as max_id from tbl;
    

    This works fine, on my system I get the correct result, in about 0.022 seconds.

    However, I learned - on this forum - some years ago that a different way to write the query, using scalar subqueries, is much more efficient for some reason. Indeed, on my system I get the same correct answer in 0.001 seconds (an order of magnitude faster) if I write the query like this:

    select (select min(id) from tbl) as min_id, (select max(id) from tbl) as max_id from dual;
    

    It would seem that this second query should take longer, since it reads the same data twice. But that isn't what happens. In the second query, we ask for the min value in a scalar subquery. That value can be read directly from the index - we can go straight to the beginning of the index and read the value; we don't need to read all the values, as is done in the first query. Similar thing for the max. You can see this by comparing the execution plans:

    First query:

    ---------------------------------------------------------------------------
    | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |      |     1 |     5 |   428   (2)| 00:00:01 |
    |   1 |  SORT AGGREGATE    |      |     1 |     5 |            |          |
    |   2 |   TABLE ACCESS FULL| TBL  |  1000K|  4882K|   428   (2)| 00:00:01 |
    ---------------------------------------------------------------------------
    

    Second query:

    --------------------------------------------------------------------------------------------
    | Id  | Operation                  | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
    --------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT           |               |     1 |       |     8   (0)| 00:00:01 |
    |   1 |  SORT AGGREGATE            |               |     1 |     5 |            |          |
    |   2 |   INDEX FULL SCAN (MIN/MAX)| SYS_C00277489 |     1 |     5 |     3   (0)| 00:00:01 |
    |   3 |  SORT AGGREGATE            |               |     1 |     5 |            |          |
    |   4 |   INDEX FULL SCAN (MIN/MAX)| SYS_C00277489 |     1 |     5 |     3   (0)| 00:00:01 |
    |   5 |  FAST DUAL                 |               |     1 |       |     2   (0)| 00:00:01 |
    --------------------------------------------------------------------------------------------
    
  • Solomon Yakobson
    Solomon Yakobson Member Posts: 20,213 Red Diamond

    This was always a mystery to me - why optimizer opts for full scan when both min and max are used and not reading extreme leaf blocks on both sides of the index.

    SY.


  • Bikram
    Bikram Member Posts: 33 Green Ribbon

    Thanks @BluShadow explaining every expects of scaler sub-query with an example. Now for a given requirement, i can easily understand whether i should use scaler subquery or join.

  • Solomon Yakobson
    Solomon Yakobson Member Posts: 20,213 Red Diamond

    Blue,

    Both examples with inner & outer joins aren't equivalent to original. Original query will raise an error if scalar subquery returns more than one row while inner & outer join replacements will not.

    SY.

    BluShadow