8 Replies Latest reply: Nov 13, 2012 1:28 PM by Iordan Iotzov RSS

    How to update random rows?

    973995
      Hello All,

      I have got answers table, there are two columns which belong to that table. This table consists 1 million rows. The answer column can be numeric or text.
      create table answers(
      answer_id number primary key,
      answer varchar2(50)
      );
      My question is

      I want to update 50 random rows to 100 which is equal to '0' (zero).

      The below code works but I just wonder does it work correctly? I mean, I want to choose these rows randomly. However it is very hard to update random. Is this the best way? Do you have any suggestions?
      update answers set answer = '100' where answer_id in( 
      with
      get_ids as (select answer_id from answers where answer = '0' and rownum <= 50 order by dbms_random.value)
      select answer_id from get_ids);
      In addition, when I read threads relevant to random update, somebody recommends using rowid, do you think should I use rowid instead of answer_id? (answer_id is also unique value in my table) If I use rowid will it be faster?

      Thanks for your help.
        • 1. Re: How to update random rows?
          Frank Kulash
          Hi,
          970992 wrote:
          Hello All,

          I have got answers table, there are two columns which belong to that table. This table consists 1 million rows. The answer column can be numeric or text.
          create table answers(
          answer_id number primary key,
          answer varchar2(50)
          );
          My question is

          I want to update 50 random rows to 100 which is equal to '0' (zero).

          The below code works but I just wonder does it work correctly? I mean, I want to choose these rows randomly. However it is very hard to update random. Is this the best way?
          Almost. You want to pick the 50 rows after the ORDER BY has been applied, like this:
          update  answers 
          set      answer = '100' 
          where     answer_id in
          ( 
          with
          get_ids as (
                      select    answer_id 
                      from      answers
                      where         answer = '0'
          --            and      rownum <= 50          -- DON'T use ROWNUM here
                      order by  dbms_random.value
                  )
                  select  answer_id 
                  from    get_ids
                  where   rownum <= 50          -- Use ROWNUM here
          );
          Do you have any suggestions?
          update answers set answer = '100' where answer_id in( 
          with
          get_ids as (select answer_id from answers where answer = '0' and rownum <= 50 order by dbms_random.value)
          select answer_id from get_ids);
          The ORDER BY clause is the last part of a query to be run, after the WHERE clause is complete. As originally posted, you were picking 50 rows that had answer = '0', and then sorting them in random order. You want to pick all the rows that have answer = '0', then sort them in random order, and, last of all, pick the frist 50 from the list.
          In addition, when I read threads relevant to random update, somebody recommends using rowid, do you think should I use rowid instead of answer_id? (answer_id is also unique value in my table) If I use rowid will it be faster?
          Yes, using ROWID will be faster.
          • 2. Re: How to update random rows?
            AlbertoFaenza
            Hi,

            this one:
            select answer_id from answers where answer = '0' and rownum <= 50 order by dbms_random.value
            will always get the same 50 rows randomly order.

            You might be interested in this thread: {thread:id=1097955}

            It shows the use of SAMPLE clause that in your case might be efficient.
            Unfortunately it cannot be used in a subquery.

            Suppose that your table has 1Million records, 50 rows are 0.005%.
            In this case you can use something like:
            SELECT answer_id
              FROM answers SAMPLE (0.01)
             WHERE ROWNUM <= 50;
            Will select 50 random rows from 0.01% of the rows of the table randomly taken.
            I have used 0.01% as the sample is approximate and limited the output with ROWNUM. Your table will be scanned anyway only for 0.01% of total records.

            Unfortunately in your case you have the WHERE clause that select rows having answer='0' and this make this option useless for you but you may still keep in mind this option for some other use.

            Regards.
            Al

            Edited by: Alberto Faenza on Nov 13, 2012 5:20 PM
            SAMPLE clause cannot be specified in a subquery.

            Edited by: Alberto Faenza on Nov 13, 2012 5:37 PM
            Clarification added for answer='0'
            • 3. Re: How to update random rows?
              973995
              Thanks for your comment. I see that it is hard to use update and random together. I am just trying to find out efficient way in order to use both of them.


              By the way, you said

              >
              Hi,

              this one:
              select answer_id from answers where answer = '0' and rownum <= 50 order by dbms_random.value
              will always get the same 50 rows randomly order.
              >

              Because of the order by clause is considered after where clause, is this the reason?

              Edited by: 970992 on 13.Kas.2012 09:10
              • 4. Re: How to update random rows?
                973995
                Ok, I will use the rownum in the last query in with clause.

                Thanks for your remarkable comments.
                • 5. Re: How to update random rows?
                  973995
                  any ideas?
                  • 6. Re: How to update random rows?
                    EdStevens
                    970992 wrote:
                    Hello All,

                    I have got answers table, there are two columns which belong to that table. This table consists 1 million rows. The answer column can be numeric or text.
                    create table answers(
                    answer_id number primary key,
                    answer varchar2(50)
                    );
                    "The answer column can be numeric or text"


                    No, it cannot. It is text. It is always text. It is text by your own definition.

                    It may very well be true that some of the values contain only numeric characters, but that does not make the value numeric. It is still text. If you try to do any arithmetic against it, oracle will attempt to do a type cast conversion behind your back - which mar or may not succeed. But the values are still text in the table.
                    SQL> create table answers(
                     answer_id number primary key,
                     answer varchar2(50)
                     );
                    --
                    insert into answers values (1,100);
                    insert into answers values (2,200);
                    commit;
                    select sum(answer) from answers;
                    
                      2    3    4
                    Table created.
                    
                    SQL> SQL>
                    1 row created.
                    
                    SQL>
                    1 row created.
                    
                    SQL>
                    Commit complete.
                    
                    SQL>
                    SUM(ANSWER)
                    -----------
                            300
                    
                    SQL> SQL> insert into answers values (3,'heres your sign');
                    
                    1 row created.
                    
                    SQL> select sum(answer) from answers;
                    select sum(answer) from answers
                               *
                    ERROR at line 1:
                    ORA-01722: invalid number
                    SQL> truncate table answers;
                    
                    Table truncated.
                    
                    SQL> insert into answers values (1,'100');
                    
                    1 row created.
                    
                    SQL> insert into answers values (2,'200');
                    
                    1 row created.
                    
                    SQL> select sum(answer) from answers;
                    
                    SUM(ANSWER)
                    -----------
                            300
                    
                    SQL> insert into answers values (3,'heres your sign');
                    
                    1 row created.
                    
                    SQL> select sum(answer) from answers;
                    select sum(answer) from answers
                               *
                    ERROR at line 1:
                    ORA-01722: invalid number
                    My question is

                    I want to update 50 random rows to 100 which is equal to '0' (zero).

                    The below code works but I just wonder does it work correctly? I mean, I want to choose these rows randomly. However it is very hard to update random. Is this the best way? Do you have any suggestions?
                    update answers set answer = '100' where answer_id in( 
                    with
                    get_ids as (select answer_id from answers where answer = '0' and rownum <= 50 order by dbms_random.value)
                    select answer_id from get_ids);
                    In addition, when I read threads relevant to random update, somebody recommends using rowid, do you think should I use rowid instead of answer_id? (answer_id is also unique value in my table) If I use rowid will it be faster?

                    Thanks for your help.
                    • 7. Re: How to update random rows?
                      AlbertoFaenza
                      970992 wrote:
                      Thanks for your comment. I see that it is hard to use update and random together. I am just trying to find out efficient way in order to use both of them.


                      By the way, you said

                      >
                      Hi,

                      this one:
                      select answer_id from answers where answer = '0' and rownum <= 50 order by dbms_random.value
                      will always get the same 50 rows randomly order.
                      >

                      Because of the order by clause is considered after where clause, is this the reason?

                      Edited by: 970992 on 13.Kas.2012 09:10
                      The order by clause is sorting the result of the query. So you first get 50 rows and then you sort them in this case. Frank has shown the correct use in his post.

                      Regards.
                      Al

                      Edited by: Alberto Faenza on Nov 13, 2012 7:04 PM
                      • 8. Re: How to update random rows?
                        Iordan Iotzov
                        While SAMPLE returns a random set every time, the order does not appear to be random. That is, if you have a table that was populated with an increasing sequence then SAMPLE would return (almost) the records in the order they are entered /I did a small test in 11.2.0.3/. That means that we should keep the “order by dbms_random.value”.

                        Also, answer='0' does not preclude the use of SAMPLE, but requires us to do some additional planning. For instance, if one third of the records have answer='0', then we have to retrieve three times as much data from the SAMPLE query, so after elimination we can get what we want.

                        Finally, to get around the SAMPLE limitation (IN CLAUSE), we can switch to PL/SQL:
                        For l in ( select answer_id from 
                                  (   select    answer_id 
                                    from      answers sample (0.03)
                                    where         answer = '0'
                        --            and      rownum <= 50          -- DON'T use ROWNUM here
                                    order by  dbms_random.value
                                 )
                                where   rownum <= 50          -- Use ROWNUM here
                        ) loop
                             update answers set answer = '100' where answer_id  = l. answer_id   ;
                        End loop;
                        Iordan Iotzov
                        http://iiotzov.wordpress.com/