1 2 Previous Next 15 Replies Latest reply on Mar 21, 2017 3:56 AM by Barbara Boehmer

    Issues combining a Subquery

    Rundas2000

      Hello community,

      Im having a lot of trouble combining these two queries. Im close to graduating my college with a bachelors degree in database administration. However my strong point relies on MS SQL, now going back into SQL developer i have been struggling a bit. A classmate has provided me questions she receives in class and i appreciate it as a way to practice and improve my skills.

       

      The question is this: (Note this is base of the "cityjails" table provided with the Oracle 11g SQL book by Joan Casteel)

      List the names of probation officers who have had less than the average number of criminals assigned.

      I created this so far, but im getting multiple names even though i added the distinct clause.

      //

      SELECT Distinct (FIRST ||' '|| LAST) AS "Probation Officers", A.CRIMINAL_ID, A.PROB_ID

      FROM SENTENCES A

      JOIN PROB_OFFICERS B ON

      A.PROB_ID = B.PROB_ID

      WHERE A.PROB_ID < (SELECT AVG(CRIMINAL_ID) FROM SENTENCES)

      //

      These are the results.

       

      Any idea what im doing wrong?

       

      Thank you.

        • 1. Re: Issues combining a Subquery
          mathguy

          If you need to list the names of the probation officers who... (whatever), why do you also have CRIMINAL_ID and PROB_ID in the SELECT clause? That is what makes the DISTINCT irrelevant.

           

          This simply answers your question. Do you also want to know how to solve the problem in the most efficient way, or just to understand what was wrong in your attempt?

          • 2. Re: Issues combining a Subquery
            Frank Kulash

            Hi,

             

            Welcome to the Forum!

             

            Whenever you have a question, please post a little sample data (CREATE TABLE and INSERT statements, relevant columns only) for all the tables involved, and the exact results you want from that data, so that the people who want to help you can re-create the problem and test their ideas.  (If you're using commonly available Oracle supplied tables, like those in the scott schema, then you don't need to post the sample data, but the cityjails table you mentioned is not one of those, so you need to post some sample data.)

            Explain, using specific examples, how you get those results from that data.

            Always say what version of Oracle you're using (e.g. 11.2.0.2.0).

            See the forum FAQ: Re: 2. How do I ask a question on the forums?

             

            2ee2ed54-f31d-46d5-8df4-1eb5aae86da7 wrote:

            ...

            List the names of probation officers who have had less than the average number of criminals assigned.

            I created this so far, but im getting multiple names even though i added the distinct clause.

            ...

            SELECT DISTINCT means the results set won't contain 2 or more identical rows; it can contain 2 or more rows if the rows are not exactly the same.  For example, if the names are identical, but the criminal_id column is different, then the rows are not exactly the same.

             

             

            //

            SELECT Distinct (FIRST ||' '|| LAST) AS "Probation Officers", A.CRIMINAL_ID, A.PROB_ID

            FROM SENTENCES A

            JOIN PROB_OFFICERS B ON

            A.PROB_ID = B.PROB_ID

            WHERE A.PROB_ID < (SELECT AVG(CRIMINAL_ID) FROM SENTENCES)

            //

            ...

            Don't confuse the average number of criminals with the average criminal_id number.  If there are only 10 criminals, they might all have id numbers over 100 thousand.  Most "averages" (depending on exactly what you mean by that) of criminals would be 10 or less, but the average criminal_id would be over 100 thousand.

             

            When you do any kind of programming, take baby steps.  Write as little code as possible, test it, fix any problems, test again, and only when it's working perfectly, only the add another line or two of code, and test again.

             

            In this case, let's look at the sub-query.  You want a sub-query that will find the average number of criminals assigned to a probation officer, on average.  How can you figure that out?  You might start with finding how many criminals each officer has assigned, and take the average of those numbers.  Okay, so write a query that finds all the probation officers.  When you get that much working perfectly, modify the query to count how many criminals each one has assigned.  WHen you get that much working perfectly, then (and only then) modify it to compute the average of those counts.

             

            If you get stuck, post your query, and a specific question (along with the sample data and desired results, of course).  Show what you did on the last baby step, and describe the problem you're having on the current baby step.  If you get any error messages, post the complete messages.

            • 3. Re: Issues combining a Subquery
              Rundas2000

              My apologies, i'll rephrase my initial question.

               

              The question: Using a subquery, list the names of probation officers who have had less than the average number of criminals assigned.

              Version: Oracle Database 11g Express Edition Release 11.2.0.2.0 - 64bit Production

               

              What i have created:

              //

              SELECT Distinct (FIRST ||' '|| LAST) AS "Probation Officers", A.CRIMINAL_ID, A.PROB_ID

              FROM SENTENCES A

              JOIN PROB_OFFICERS B ON

              A.PROB_ID = B.PROB_ID

              WHERE A.PROB_ID < (SELECT AVG(CRIMINAL_ID) FROM SENTENCES)

              //

               

              The SAMPLE-Table information:

              --------------------------------------------------------

              -- DDL for Table SENTENCES

              --------------------------------------------------------

              CREATE TABLE "SYSTEM"."SENTENCES"

                 (    "SENTENCE_ID" NUMBER(6,0),

                  "CRIMINAL_ID" NUMBER(9,0) NOT NULL ENABLE,

                  "TYPE" CHAR(1 BYTE),

                  "PROB_ID" NUMBER(5,0),

                  "START_DATE" DATE,

                  "END_DATE" DATE,

                  "VIOLATIONS" NUMBER(3,0),

                   CONSTRAINT "SENTENCES_ID_PK" PRIMARY KEY ("SENTENCE_ID"),

                   CONSTRAINT "SENTENCES_CRIMEID_FK" FOREIGN KEY ("CRIMINAL_ID")

                    REFERENCES "SYSTEM"."CRIMINALS" ("CRIMINAL_ID") ENABLE,

                   CONSTRAINT "SENTENCES_PROBID_FK" FOREIGN KEY ("PROB_ID")

                    REFERENCES "SYSTEM"."PROB_OFFICERS" ("PROB_ID") ENABLE

                 )

               

              --------------------------------------------------------

              --  DDL for Table PROB_OFFICERS

              --------------------------------------------------------

              CREATE TABLE "SYSTEM"."PROB_OFFICERS"

                 (    "PROB_ID" NUMBER(5,0),

                  "LAST" VARCHAR2(15 BYTE),

                  "FIRST" VARCHAR2(10 BYTE),

                  "STREET" VARCHAR2(30 BYTE),

                  "CITY" VARCHAR2(20 BYTE),

                  "STATE" CHAR(2 BYTE),

                  "ZIP" CHAR(5 BYTE),

                  "PHONE" CHAR(10 BYTE),

                  "EMAIL" VARCHAR2(30 BYTE),

                  "STATUS" CHAR(1 BYTE) DEFAULT 'A',

                  "MGR_ID" NUMBER(5,0),

                  "PAGER#" CHAR(10 BYTE),

                   CONSTRAINT "PROBOFFICERS_ID_PK" PRIMARY KEY ("PROB_ID")

              )

               

              --------------------------------------------------------

              --  DDL for Table CRIMINALS

              --------------------------------------------------------

               

              CREATE TABLE "SYSTEM"."CRIMINALS"

                 (    "CRIMINAL_ID" NUMBER(6,0),

                  "LAST" VARCHAR2(15 BYTE),

                  "FIRST" VARCHAR2(10 BYTE),

                  "STREET" VARCHAR2(30 BYTE),

                  "CITY" VARCHAR2(20 BYTE),

                  "STATE" CHAR(2 BYTE),

                  "ZIP" CHAR(5 BYTE),

                  "PHONE" CHAR(10 BYTE),

                  "V_STATUS" CHAR(1 BYTE) DEFAULT 'N',

                  "P_STATUS" CHAR(1 BYTE) DEFAULT 'N',

                   CONSTRAINT "CRIMINALS_ID_PK" PRIMARY KEY ("CRIMINAL_ID")

              )

               

              And lastly my Results:

               

              -------------------------

              The Problem

              ------------------------

               

              By using the "Distinct" Keyword i thought it would prevent George from appearing multiple times as the results show. So what i have are the queries separated into two queries that give me the results i want

               

              ------------------------

              The 2 Queries separated

              ------------------------

               

              SELECT DISTINCT (FIRST ||' '|| LAST) AS "Probation Officers", PROB_ID

              FROM PROB_OFFICERS;

               

               

              SELECT DISTINCT CRIMINAL_ID, PROB_ID

              FROM SENTENCES;

               

              A friend of mine is doing it as part of her homework, i myself however would like to learn it for my own experience. Since Oracle has a high demand in the DBA field and it would be great to learn as much as possible.

              • 4. Re: Issues combining a Subquery
                Rundas2000

                Both please,

                I think the answer is the following:

                //

                SELECT Distinct (FIRST ||' '|| LAST) AS "Probation Officers"

                FROM SENTENCES A

                JOIN PROB_OFFICERS B ON

                A.PROB_ID = B.PROB_ID

                WHERE A.PROB_ID < (SELECT AVG(CRIMINAL_ID) FROM SENTENCES)

                //

                 

                As what you was right haha. (im overthinking this question)

                 

                My Results where:

                 

                • 5. Re: Issues combining a Subquery
                  mathguy

                  There's something I can't quite tell from the information provided so far (sorry if it's there and I missed it - I didn't spend a lot of time reading everything). It seems in the SENTENCES table you have PO (probation officer) id's and criminal id's, but it's a sentences table, so I don't know if the same criminal may appear more than once (for different sentences), and if so, whether the same criminal may have been assigned to different PO's. Perhaps the last part doesn't make much sense, but the first is possible.

                   

                  Anyway. Without sample data, I can't test on your tables anyway. Instead I will test on the SCOTT schema. Among other things, it has a table EMP and a table DEPT. In EMP there are 14 rows (for 14 employees); in DEPT there are three departments, 10, 20 and 30. They have three, five, and six employees respectively. (Note that here the employees are unique in the EMP schema - this is not quite equivalent to your SENTENCES table. But, to allow for duplicate rows for the same employee, I will throw in a DISTINCT EMPNO even though I know in my case it is unnecessary.)

                   

                  It is also unclear if there are PO's who have no criminals assigned (so perhaps they would not be included in the SENTENCES table, even though they are in the PROB_OFFICERS table), and if there are, it is unclear if they should be included in the computation of the average. I will assume the answer to the second question is NO (it would also automatically be NO if the answer to the first question was NO).

                   

                  To get the average number of employees per department, I need something like

                   

                      select count (distinct empno) / count (distinct deptno) from emp

                   

                  Then I can use this as a subquery in the HAVING clause of an aggregate query on EMP. Finally I can join to the DEPT table just to get the department names.

                   

                  select s.deptno, d.dname
                  from   ( select   deptno
                           from     scott.emp
                           group by deptno
                           having   count(distinct empno) <
                                       (select count(distinct empno) / count(distinct deptno) from scott.emp)
                         ) s
                             join dept d on s.deptno = d.deptno
                  ;

                   

                  DEPTNO   DNAME
                  ------   ----------
                      10   ACCOUNTING

                  • 6. Re: Issues combining a Subquery
                    mathguy

                    Did you read Frank's reply?

                     

                    I strongly advise you to always read Frank's contributions, and to ask for an explanation when you don't understand what he is saying. Don't worry, you would be in good company - many things he says still fly over my head. (Well, OK, maybe I am not good company - but I am sure I'm not the only one.)

                     

                    Frank pointed out that in your original attempt - and still now, in your most recent one - that you take an average over CRIMINAL_ID. That makes no sense whatsoever. Criminal ID's may be something like 93320390 and 94568734   -   how is the average of such meaningless numbers relevant to "average criminals per PO"?  And, you are comparing a PO's ID number to an average (or whatever is on the other side)... the same question, how is the PO's ID related to how many criminals they were assigned?

                    • 7. Re: Issues combining a Subquery
                      Rundas2000

                      That confuses me, i used criminal ID because its the unique identifier to find the criminal. Should i use something else instead of criminal id? How can i provide you with the sample data by the way, i thought it was what i posted above?

                       

                      Thank you so much for the help too, im really glad i got involved in this forum!

                      • 8. Re: Issues combining a Subquery
                        mathguy

                        CRIMINAL_ID is, indeed, how you identify "the" criminal (or "a" criminal). But you don't need to identify them, do you? If I tell you Officer Mathguy has 12 criminals assigned to him, that is all you need - do you also need to identify those criminals? All you need is a COUNT(criminal_id). (Or COUNT DISTINCT if the same criminal may appear as assigned to the same PO more than once in the SENTENCES table.)

                         

                        Perhaps your confusion comes from ... why COUNT(criminal_id) is OK but AVG(criminal_id) is not OK.  Answer: because AVG actually takes every single value, as a number (assuming they are numbers to begin with - they could also be strings, in general) and takes their mathematical average. On the other hand, COUNT only counts how many non-null values there are; COUNT doesn't care what the actual values are, and it doesn't process them in any manner (other than, COUNT DISTINCT counts how many distinct, non-null values there are).

                         

                        To know how many criminals were assigned to a PO, you do a COUNT(criminal_id), or perhaps a COUNT(distinct criminal_ID) if there can be duplicates. Similarly, you use counts - of total (distinct) criminals and total (distinct) PO's in the entire table to find the average number of criminals per PO.  Just like I did in my example, to find the number of employees in each department, the total number of (distinct) employees in the company, and the total number of (distinct) departments in the company - the last two are used to get the average employees per department.

                        • 9. Re: Issues combining a Subquery
                          mathguy

                          Regarding sample data:  You provided statements to re-create your tables, and that is half of the job, and it's all good. The other half is to provide INSERT statements to create rows in those tables; otherwise they are empty. If your actual tables have 4000 rows, it's not necessary to provide INSERT statements for all of them; about 20 to 30 should suffice in almost all cases. Use your judgment on what to include - you want to cover enough "special situations" for reasonable testing.

                           

                          With databases from a book, you must also be careful with copyright - see what you are allowed to copy and post on a free website. Publishers vary in their policies.

                          • 10. Re: Issues combining a Subquery
                            Rundas2000

                            Should i script the table and paste it here?

                            Also your info helped a lot, this became my new subquery to get an avarage.

                            SELECT COUNT(DISTINCT PROB_ID) / COUNT(DISTINCT CRIMINAL_ID) FROM SENTENCES.

                            I get a result but when i run it with the query the results are blank.

                            • 11. Re: Issues combining a Subquery
                              mathguy

                              First - the math. (I am happy to say it - because it provides "job security" for people like me: it doesn't matter how skilled one is at programming; the math has to be right too, or else one may write beautiful code implementing the wrong computations.)

                               

                              To compute the average number of criminals per probation officer, you need to divide the count of criminals by the count of officers. 300 criminals, 100 officers, the average is 300/100. In your latest computation (where you got 0.3) you computed it the other way around: you divided the count of officers by that of criminals.

                               

                              The INSERT statements: How you do that depends on your front-end, like SQL Developer or Toad. I use SQL Developer most of the time. If I had a table with 4000 (or 40 million) rows and I wanted to create INSERT statements for 30 of them, I would write something like   SELECT * FROM <table> WHERE ROWNUM <= 30,  then I would select the results in the Query Result pane, and "Export" as INSERT statements. Whatever you use, it must have similar capabilities; the thing you should never do, and you should never have to do, is to write such statements by hand!

                              • 12. Re: Issues combining a Subquery
                                Frank Kulash

                                Hi,

                                Rundas2000 wrote:

                                 

                                Both please,

                                I think the answer is the following:

                                //

                                SELECT Distinct (FIRST ||' '|| LAST) AS "Probation Officers"

                                FROM SENTENCES A

                                JOIN PROB_OFFICERS B ON

                                A.PROB_ID = B.PROB_ID

                                WHERE A.PROB_ID < (SELECT AVG(CRIMINAL_ID) FROM SENTENCES)

                                //

                                ...

                                Without seeing your sample data, I can't say much for sure.

                                 

                                Even without seeing your data, I'll bet that if the query above is producing the right results from that sample data, it's purely by chance, and the same query will not produce what you want with other sample data.

                                • 13. Re: Issues combining a Subquery
                                  Barbara Boehmer

                                  The following is a simplification of your problem, using only the necessary tables and columns and some data from the Oracle emp and dept demo tables.

                                   

                                  Suppose you have data like this:

                                   

                                  SCOTT@orcl_12.1.0.2.0> SELECT b.prob_id, b.last, b.first FROM prob_officers b ORDER BY b.prob_id

                                    2  /

                                   

                                    PROB_ID LAST            FIRST

                                  ---------- --------------- ----------

                                          10 ACCOUNTING      NEW YORK

                                          20 RESEARCH        DALLAS

                                          30 SALES          CHICAGO

                                          40 OPERATIONS      BOSTON

                                   

                                  4 rows selected.

                                   

                                  SCOTT@orcl_12.1.0.2.0> SELECT a.prob_id, a.criminal_id FROM sentences a ORDER BY a.prob_id, a.criminal_id

                                    2  /

                                   

                                    PROB_ID CRIMINAL_ID

                                  ---------- -----------

                                          10        7782

                                          10        7839

                                          10        7934

                                          20        7369

                                          20        7566

                                          20        7788

                                          20        7876

                                          20        7902

                                          30        7499

                                          30        7521

                                          30        7654

                                          30        7698

                                          30        7844

                                          30        7900

                                   

                                  14 rows selected.

                                   

                                   

                                  1.  You can get the probation officers and how many criminals are assigned to each one like this:

                                   

                                  SCOTT@orcl_12.1.0.2.0> SELECT b.first || ' ' || b.last AS "Probation Officers",

                                    2          COUNT (DISTINCT a.criminal_id) criminals_assigned

                                    3  FROM  sentences a JOIN prob_officers b ON (a.prob_id = b.prob_id)

                                    4  GROUP  BY b.prob_id, b.last, b.first

                                    5  ORDER  BY b.prob_id

                                    6  /

                                   

                                  Probation Officers        CRIMINALS_ASSIGNED

                                  -------------------------- ------------------

                                  NEW YORK ACCOUNTING                        3

                                  DALLAS RESEARCH                            5

                                  CHICAGO SALES                              6

                                   

                                  3 rows selected.

                                   

                                   

                                  2.  You can get the average number of criminals assigned to probation officers like this:

                                   

                                  SCOTT@orcl_12.1.0.2.0> SELECT AVG (COUNT (DISTINCT criminal_id)) average_criminals_assigned

                                    2  FROM  sentences

                                    3  GROUP  BY prob_id

                                    4  /

                                   

                                  AVERAGE_CRIMINALS_ASSIGNED

                                  --------------------------

                                                  4.66666667

                                   

                                  1 row selected.

                                   

                                   

                                  3.  You can get the probation officers having less than average number of criminals assigned like this, using the first query from above, a HAVING clause for comparison of the count from the first query to the average from the second query, followed by the second query from above:

                                   

                                  SCOTT@orcl_12.1.0.2.0> SELECT b.first || ' ' || b.last AS "Probation Officers"

                                    2  FROM  sentences a JOIN prob_officers b ON (a.prob_id = b.prob_id)

                                    3  GROUP  BY b.prob_id, b.last, b.first

                                    4  HAVING COUNT (DISTINCT a.criminal_id) <

                                    5          (SELECT AVG (COUNT (DISTINCT criminal_id))

                                    6          FROM  sentences

                                    7          GROUP  BY prob_id)

                                    8  /

                                   

                                  Probation Officers

                                  --------------------------

                                  NEW YORK ACCOUNTING

                                   

                                  1 row selected.

                                  • 14. Re: Issues combining a Subquery
                                    mathguy

                                    This is the same as Reply 5, right?

                                    1 2 Previous Next