Skip to Main Content

SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Interested in getting your voice heard by members of the Developer Marketing team at Oracle? Check out this post for AppDev or this post for AI focus group information.

Complex select query help (Group by)

user12050217Oct 13 2009 — edited Dec 21 2009
Hi,
I am in need your help to create a single select query for below table data and output which requies. I am using oracle 10g on RHEL 5 version.

create table bustour(bussno number(4),start_date date,passengers number(2));
alter session set nls_date_format='dd-mm-yyyy';
insert into bustour values (4123,'04-10-2009',20);
insert into bustour values (4123,'05-10-2009',25);
insert into bustour values (4123,'05-10-2009',18);
insert into bustour values (4123,'06-10-2009',15);

insert into bustour values (6138,'03-10-2009',16);
insert into bustour values (6138,'03-10-2009',19);
insert into bustour values (6138,'04-10-2009',22);
insert into bustour values (6138,'04-10-2009',13);

insert into bustour values (4123,'07-10-2009',23);
insert into bustour values (4123,'08-10-2009',27);
insert into bustour values (4123,'11-10-2009',15);

insert into bustour values (6138,'05-10-2009',16);
insert into bustour values (6138,'05-10-2009',13);
insert into bustour values (6138,'05-10-2009',18);
insert into bustour values (6138,'05-10-2009',24);
insert into bustour values (6138,'07-10-2009',20);
insert into bustour values (6138,'08-10-2009',18);
SQL> select * from bustour;

    BUSSNO START_DATE PASSENGERS
---------- ---------- ----------
      4123 04-10-2009         20
      4123 05-10-2009         25
      4123 05-10-2009         18
      4123 06-10-2009         15
      6138 03-10-2009         16
      6138 03-10-2009         19
      6138 04-10-2009         22
      6138 04-10-2009         13
      4123 07-10-2009         23
      4123 08-10-2009         27
      4123 11-10-2009         15

    BUSSNO START_DATE PASSENGERS
---------- ---------- ----------
      6138 05-10-2009         16
      6138 05-10-2009         13
      6138 05-10-2009         18
      6138 05-10-2009         24
      6138 07-10-2009         20
      6138 08-10-2009         18

17 rows selected.

I want query output as below :

Bussno  start_date      end_Date   totalpassengers   maxpessenger  maxpassdate
4123    04-10-09        06-10-09          78              25         05-10-09
6138    03-10-09        04-10-09          70              22         04-10-09 
4123    07-10-09        11-10-09          65              27         08-10-09
6138    05-10-09        08-10-09         109              24         05-10-09
So, that we can know on which particular trip-schedule we got maximum passenger and date. (To calculate driver's % of bonus)

Please help me to write the query.

Best regards
Nisha
This post has been answered by Boneist on Oct 13 2009
Jump to Answer

Comments

Centinul
How are you determining the end date?
user12050217
Thanks. Upto the value of busno column remain same, because a complete bus trip data first entered and then another bus trip data.
Centinul
Are there any other columns in this table that you are not providing? The reason I ask is because it seems as if your solution is dependent on when a particular item is inserted into the table. However, there is no column that indicates which items were inserted first such as a monotonically increasing value (a.k.a. sequence value). The order of the result set cannot be guaranteed without using an ORDER BY clause.

Maybe I am missing something here but I cannot see an easy way to order the set initially to get the grouping you desire.

Thanks!
Boneist
user12050217 wrote:
Thanks. Upto the value of busno column remain same, because a complete bus trip data first entered and then another bus trip data.
Yes, but how are you determining the order in which the rows were inserted into the table?

Think of a table like a pile of clothes on the floor - how can you tell the order with which each item of clothing was added to the pile if there isn't something that's keeping a note of when it was added to the pile? Answer: you can't.

If your table doesn't have a column indicating the order the rows were inserted, then you're going to find it difficult to get the date you're after.
user12050217
No sir, there are any other column in the table, becasue we gets only bus number, no of passengers (tickets issued by bill book) and date of bus running by drivers; so we just feed the complete data of a particular bus. Now our manager has asked me to produce the above query result, so that they can further calculate the extra money for the drivers.

Thanking you,
Nisha
Boneist
You'll need to go back to your managers and tell them that you won't be able to get the data they're after.

However, it's a very strange requirement - "give me these random ranges determined by when they were inserted into the table" - if insert order is important, then I'd say the table needs a new column (timestamp) to hold the date + time the row was added plus a trigger to populate the column.

I'd go back to your manager and ask what range of dates they'd be most interested in - weekly or monthly?
Karthick2003
Nisha,

You must understand the simple fact that oracle does not have any predefined order for a table. So unless you use the ORDER BY clause the order in which the data are returned by oracle cannot be determined.

Hence you must find a column which will determine the order in which the BUSSNO must be retrieved. If you cant find one then you are out of luck.
Boneist
Eg. if they wanted the data weekly, then you'd do something like:
with my_tab as (select 4123 busno, to_date('04/10/2009', 'dd/mm/yyyy') start_date, 20 passengers from dual union all
                select 4123 busno, to_date('05/10/2009', 'dd/mm/yyyy') start_date, 25 passengers from dual union all
                select 4123 busno, to_date('05/10/2009', 'dd/mm/yyyy') start_date, 18 passengers from dual union all
                select 4123 busno, to_date('06/10/2009', 'dd/mm/yyyy') start_date, 15 passengers from dual union all
                select 6138 busno, to_date('03/10/2009', 'dd/mm/yyyy') start_date, 16 passengers from dual union all
                select 6138 busno, to_date('03/10/2009', 'dd/mm/yyyy') start_date, 19 passengers from dual union all
                select 6138 busno, to_date('04/10/2009', 'dd/mm/yyyy') start_date, 22 passengers from dual union all
                select 6138 busno, to_date('04/10/2009', 'dd/mm/yyyy') start_date, 13 passengers from dual union all
                select 4123 busno, to_date('07/10/2009', 'dd/mm/yyyy') start_date, 23 passengers from dual union all
                select 4123 busno, to_date('08/10/2009', 'dd/mm/yyyy') start_date, 27 passengers from dual union all
                select 4123 busno, to_date('11/10/2009', 'dd/mm/yyyy') start_date, 15 passengers from dual union all
                select 6138 busno, to_date('05/10/2009', 'dd/mm/yyyy') start_date, 16 passengers from dual union all
                select 6138 busno, to_date('05/10/2009', 'dd/mm/yyyy') start_date, 13 passengers from dual union all
                select 6138 busno, to_date('05/10/2009', 'dd/mm/yyyy') start_date, 18 passengers from dual union all
                select 6138 busno, to_date('05/10/2009', 'dd/mm/yyyy') start_date, 24 passengers from dual union all
                select 6138 busno, to_date('07/10/2009', 'dd/mm/yyyy') start_date, 20 passengers from dual union all
                select 6138 busno, to_date('08/10/2009', 'dd/mm/yyyy') start_date, 18 passengers from dual)
-- end of mimicking your data; use SQL below:
select busno,
       trunc(start_date, 'iw') start_date,
       trunc(start_date, 'iw') + 7 end_date,
       sum(passengers) total_passengers,
       max(passengers) max_passengers
from   my_tab
group by busno, trunc(start_date, 'iw')
order by trunc(start_date, 'iw'), busno;

     BUSNO START_DATE END_DATE  TOTAL_PASSENGERS MAX_PASSENGERS
---------- ---------- --------- ---------------- --------------
      4123 28/SEP/09  05/OCT/09               20             20
      6138 28/SEP/09  05/OCT/09               70             22
      4123 05/OCT/09  12/OCT/09              123             27
      6138 05/OCT/09  12/OCT/09              109             24
user12050217
I would like to thanks to all for your fast and valuable replies. Here issue is how to order by; so i am assuming that order by rownum can be work; because once entered data, it will never updated and as we insert the rows, i hope it will be inserted at the bottom (free space in the extent). But if it is still require the change the table structure i.e. add a new column, i am ready to add the new column but please guide me in more details.

Thanks to all you once again.
Nisha
Karthick2003
ROWNUM is a pseudocolumn which is generated when a row is outputted. So ordering by ROWNUM does not make any sense.

Your best possible option is to have a timestamp column that will the capture the inserted time and you can just order by that column.
Centinul
user12050217 wrote:
I would like to thanks to all for your fast and valuable replies. Here issue is how to order by; so i am assuming that order by rownum can be work; because once entered data, it will never updated and as we insert the rows, i hope it will be inserted at the bottom (free space in the extent).
As was mentioned previously there is no predefined order. There is no default that says data is returned in the order it was inserted. If there was this could make index access painfully slow.

I believe you are going to have to modify the table structure.
user12050217
SQL> create table newbus as select rownum id,bussno,start_date,passengers from bustour;

Table created.

SQL> select * from newbus;

        ID     BUSSNO START_DATE PASSENGERS
---------- ---------- ---------- ----------
         1       4123 04-10-2009         20
         2       4123 05-10-2009         25
         3       4123 05-10-2009         18
         4       4123 06-10-2009         15
         5       6138 03-10-2009         16
         6       6138 03-10-2009         19
         7       6138 04-10-2009         22
         8       6138 04-10-2009         13
         9       4123 07-10-2009         23
        10       4123 08-10-2009         27
        11       4123 11-10-2009         15

        ID     BUSSNO START_DATE PASSENGERS
---------- ---------- ---------- ----------
        12       6138 05-10-2009         16
        13       6138 05-10-2009         13
        14       6138 05-10-2009         18
        15       6138 05-10-2009         24
        16       6138 07-10-2009         20
        17       6138 08-10-2009         18

17 rows selected.

SQL> select * from newbus order by id;

        ID     BUSSNO START_DATE PASSENGERS
---------- ---------- ---------- ----------
         1       4123 04-10-2009         20
         2       4123 05-10-2009         25
         3       4123 05-10-2009         18
         4       4123 06-10-2009         15
         5       6138 03-10-2009         16
         6       6138 03-10-2009         19
         7       6138 04-10-2009         22
         8       6138 04-10-2009         13
         9       4123 07-10-2009         23
        10       4123 08-10-2009         27
        11       4123 11-10-2009         15

        ID     BUSSNO START_DATE PASSENGERS
---------- ---------- ---------- ----------
        12       6138 05-10-2009         16
        13       6138 05-10-2009         13
        14       6138 05-10-2009         18
        15       6138 05-10-2009         24
        16       6138 07-10-2009         20
        17       6138 08-10-2009         18

17 rows selected.

SQL>
Now i think i have id column for order by clause. We will update our application to insert the id value max(id)+1.
Regards
kendenny
Don't use max(id)+1, use a sequence. If you use max(id)+1 and two users enter data at the same time you'll get duplicate id's.
Boneist
user12050217 wrote:
Now i think i have id column for order by clause. We will update our application to insert the id value max(id)+1.
As well as not using a sequence to generate your id's, relying on id's for ordering can produce incorrect results, as rows can get an id, and then wait for the user to finish their lunch before they click on the "submit" button. If you're interested in the time at which the row was inserted, then, as I've suggested before, use a timestamp column, along with a trigger to update the time to systimestamp.

That way, you'll always get the time at which the row was added into the table.

I still fail to see why your managers require differing time ranges per bus number to work out bonuses; surely a consistent range (a week/4 days/a month/etc) would be better? Have you queried that with them?
user12050217
Our management has decided to give bonus to drivers, on running the bus on its full sitting capacity. So, when bus's trip starts and go to end; for those no. of days (which are not fixed and certain) a certain % of bonus will be calculated as per my query output;including value of totalpassengers. As far as application and sequence concern, only one data entry operator runs the application and enters the data in the table. If still it require to use sequence; i shall keep in mind when updation in application done.

Regards
Boneist
user12050217 wrote:
As far as application and sequence concern, only one data entry operator runs the application and enters the data in the table.
Maybe... for now. What happens if they get someone else in to help out?

And will the operator(s) know that the order in which they key the data is important? What determines the order of how they key in the data?
If still it require to use sequence; i shall keep in mind when updation in application done.
No. If you're going to go with ID's, USE A SEQUENCE to generate them!! Trying to work out the max id and then add 1 to it is only going to get slower the more rows you have in the table. It's a bad idea to re-invent the wheel!
user12050217
Sir,
Earlier it was issue that table do'nt have order by column, so i added it. Now, i am not able to understand the use of sequence / id column to create the required output please. Here id column is the record number and its another issue, when we update the application. Now, table is having order by column, so please guide me how to write the select query on the existing table. Rest will be sorted out in due course please.

I am not getting your mean by "No. If you're going to go with ID's, USE A SEQUENCE to generate them!! "
Kindly elaborate more.
Thank you sir.
Boneist
You create a sequence to auto generate the ids for you.

Then you insert them into the table like:
insert into table_name (id_column, ....) values (sequence_name.nextval, .....);
That way, you don't have to worry about working out what the next id you should pick; it's already done for you.

You could also have a trigger to do the insert if no id is specified.
user12050217
Ok, with this i will do the required changes in the application its sequence / id issue. But i am waiting for select query please.

Regards
Boneist
Answer
Ok, something like this:
with my_tab as (select 1 id, 4123 busno, to_date('04/10/2009', 'dd/mm/yyyy') start_date, 20 passengers from dual union all
                select 2 id, 4123 busno, to_date('05/10/2009', 'dd/mm/yyyy') start_date, 25 passengers from dual union all
                select 3 id, 4123 busno, to_date('05/10/2009', 'dd/mm/yyyy') start_date, 18 passengers from dual union all
                select 4 id, 4123 busno, to_date('06/10/2009', 'dd/mm/yyyy') start_date, 15 passengers from dual union all
                select 5 id, 6138 busno, to_date('03/10/2009', 'dd/mm/yyyy') start_date, 16 passengers from dual union all
                select 6 id, 6138 busno, to_date('03/10/2009', 'dd/mm/yyyy') start_date, 19 passengers from dual union all
                select 7 id, 6138 busno, to_date('04/10/2009', 'dd/mm/yyyy') start_date, 22 passengers from dual union all
                select 8 id, 6138 busno, to_date('04/10/2009', 'dd/mm/yyyy') start_date, 13 passengers from dual union all
                select 9 id, 4123 busno, to_date('07/10/2009', 'dd/mm/yyyy') start_date, 23 passengers from dual union all
                select 10 id, 4123 busno, to_date('08/10/2009', 'dd/mm/yyyy') start_date, 27 passengers from dual union all
                select 11 id, 4123 busno, to_date('11/10/2009', 'dd/mm/yyyy') start_date, 15 passengers from dual union all
                select 12 id, 6138 busno, to_date('05/10/2009', 'dd/mm/yyyy') start_date, 16 passengers from dual union all
                select 13 id, 6138 busno, to_date('05/10/2009', 'dd/mm/yyyy') start_date, 13 passengers from dual union all
                select 14 id, 6138 busno, to_date('05/10/2009', 'dd/mm/yyyy') start_date, 18 passengers from dual union all
                select 15 id, 6138 busno, to_date('05/10/2009', 'dd/mm/yyyy') start_date, 24 passengers from dual union all
                select 16 id, 6138 busno, to_date('07/10/2009', 'dd/mm/yyyy') start_date, 20 passengers from dual union all
                select 17 id, 6138 busno, to_date('08/10/2009', 'dd/mm/yyyy') start_date, 18 passengers from dual)
-- end of mimicking your data; use SQL below:
select busno,
       min(start_date) start_date,
       max(start_date) end_date,
       sum(passengers) total_passengers,
       max(passengers) max_passengers,
       max_pass_date
from   (select id,
               start_date,
               busno,
               passengers,
               distance,
               max(start_date) keep (dense_rank first order by passengers desc) over (partition by distance, busno) max_pass_date
        from   (select id,
                       start_date,
                       busno,
                       passengers,
                       -- using tabibitosan method, borrowed from Aketi Jyuuzou 
                       dense_rank() over (order by id) -
                          row_number() over (partition by busno order by id) distance
                from   my_tab))
group by busno, distance, max_pass_date
order by min(id)

     BUSNO START_DATE END_DATE   TOTAL_PASSENGERS MAX_PASSENGERS MAX_PASS_DATE
---------- ---------- ---------- ---------------- -------------- -------------
      4123 04/10/2009 06/10/2009               78             25 05/10/2009   
      6138 03/10/2009 04/10/2009               70             22 04/10/2009   
      4123 07/10/2009 11/10/2009               65             27 08/10/2009   
      6138 05/10/2009 08/10/2009              109             24 05/10/2009   
Marked as Answer by user12050217 · Sep 27 2020
Centinul
This is what I came up with, I'm not sure how full proof it is to be perfectly honest.

Removed for a couple of reasons:

1. The previously presented solution was better (nice solution Boneist) :)
2. My max passenger date logic was incorrect.

Edited by: Centinul on Oct 13, 2009 11:10 AM

Boneist is there a link on the `Net for this: tabibitosan method, borrowed from Aketi Jyuuzou
Boneist
Sure; you don't think I {noformat}*remembered*{noformat} how to do it, do you?! *{;-)

Here's where I first came across it: 3320506
And here's another one: 3677639

It was the 2nd link that I, um, "borrowed" from! ("copied" doesn't sound quite so nice... *{;-) )


ETA: In fact, I only really "got" what it does by printing out the rownum and the dense_rank values to work out why one minus the other gives a consistent result; I can has teh dumb *{;-)

Edited by: Boneist on 13-Oct-2009 16:40
Centinul
Boneist wrote:
ETA: In fact, I only really "got" what it does by printing out the rownum and the dense_rank values to work out why one minus the other gives a consistent result; I can has teh dumb *{;-)
That's the first thing I did when I was looking at your method :)

Thanks for the links!
user12050217
Thank you sir Boneist. Your query is giving me the exact output which i want.

Regards
Nisha Sharma
Aketi Jyuuzou
I am glad that "sense of tabibitosan" is used ;-)
with my_tab as (
select 1 id, 4123 busno, to_date('04/10/2009', 'dd/mm/yyyy') start_date, 20 passengers from dual union all
select 2 id, 4123 busno, to_date('05/10/2009', 'dd/mm/yyyy') start_date, 25 passengers from dual union all
select 3 id, 4123 busno, to_date('05/10/2009', 'dd/mm/yyyy') start_date, 18 passengers from dual union all
select 4 id, 4123 busno, to_date('06/10/2009', 'dd/mm/yyyy') start_date, 15 passengers from dual union all
select 5 id, 6138 busno, to_date('03/10/2009', 'dd/mm/yyyy') start_date, 16 passengers from dual union all
select 6 id, 6138 busno, to_date('03/10/2009', 'dd/mm/yyyy') start_date, 19 passengers from dual union all
select 7 id, 6138 busno, to_date('04/10/2009', 'dd/mm/yyyy') start_date, 22 passengers from dual union all
select 8 id, 6138 busno, to_date('04/10/2009', 'dd/mm/yyyy') start_date, 13 passengers from dual union all
select 9 id, 4123 busno, to_date('07/10/2009', 'dd/mm/yyyy') start_date, 23 passengers from dual union all
select 10 id, 4123 busno, to_date('08/10/2009', 'dd/mm/yyyy') start_date, 27 passengers from dual union all
select 11 id, 4123 busno, to_date('11/10/2009', 'dd/mm/yyyy') start_date, 15 passengers from dual union all
select 12 id, 6138 busno, to_date('05/10/2009', 'dd/mm/yyyy') start_date, 16 passengers from dual union all
select 13 id, 6138 busno, to_date('05/10/2009', 'dd/mm/yyyy') start_date, 13 passengers from dual union all
select 14 id, 6138 busno, to_date('05/10/2009', 'dd/mm/yyyy') start_date, 18 passengers from dual union all
select 15 id, 6138 busno, to_date('05/10/2009', 'dd/mm/yyyy') start_date, 24 passengers from dual union all
select 16 id, 6138 busno, to_date('07/10/2009', 'dd/mm/yyyy') start_date, 20 passengers from dual union all
select 17 id, 6138 busno, to_date('08/10/2009', 'dd/mm/yyyy') start_date, 18 passengers from dual)
select busno,min(start_date),max(start_date),sum(passengers),max(passengers),
max(start_date) Keep(Dense_Rank Last order by passengers) as max_pass_date
  from (select ID,busno,start_date,passengers,
         Row_Number() over(order by id)
        -Row_Number() over(partition by busno order by id) as distance
          from my_tab)
group by busno,distance
order by min(id);

BUSNO  MIN(STAR  MAX(STAR  SUM(PASSENGERS)  MAX(PASSENGERS)  MAX_PASS
-----  --------  --------  ---------------  ---------------  --------
 4123  09-10-04  09-10-06               78               25  09-10-05
 6138  09-10-03  09-10-04               70               22  09-10-04
 4123  09-10-07  09-10-11               65               27  09-10-08
 6138  09-10-05  09-10-08              109               24  09-10-05
1 - 25
Locked Post
New comments cannot be posted to this locked post.

Post Details

Locked on Jan 18 2010
Added on Oct 13 2009
25 comments
2,404 views