10 Replies Latest reply on Mar 1, 2012 4:30 PM by Frank Kulash

    insert Duplicate rows with difrent languages

    916212
      Hello everybody,
      I need your help in solving this query with the most performant way.
      I have a table:
      Create Table "POSITION" (
      id  number,
      TEXT varchar2(50),
      Language Varchar2(1)
      );
      insert statments
      INSERT INTO POSITION 
      VALUES (1, 'Secretary', 'E');
      INSERT INTO POSITION 
      VALUES (1, 'Secrétaire', 'F');
      INSERT INTO POSITION 
      VALUES (1, 'Segretario', 'I');
      INSERT INTO POSITION 
      VALUES (1, 'Secretario', 'S');
      INSERT INTO POSITION 
      VALUES (2, 'Assistance', 'E');
      INSERT INTO POSITION 
      VALUES (2, 'Ayuda', 'S');
      As you have seen the text changes in diffrent languages for the same id. for whatever reasons the data is in complete in all languages (some ids missing english some french and some italian). I need a query to insert the missing language(s) in my example above Id number 2 the language French and Italian is missing so the query should copy the record English instead but with Language = 'F' and one more time with language = 'I' and so on.
      in addition the query should look first to copy English if it doesn't exist then French if not then Italian when more than language exists.

      hope I make it clear. as i said this query should be performant also as the data size is big.



      Thanks for your help
        • 1. Re: insert Duplicate rows with difrent languages
          Frank Kulash
          Hi,

          Thanks for posting the CREATE TABLE and INSERT statements; tha's very helpful.

          Here's one way to do what you want:
          MERGE INTO     position     dst
          USING     (
                    WITH       all_language     AS
                    (
                         SELECT  'E' AS language, 1 AS ord_num     FROM dual     UNION ALL
                         SELECT     'F',           2          FROM dual     UNION ALL
                         SELECT     'I',           3          FROM dual     UNION ALL
                         SELECT     'S',           4          FROM dual
                    )
                    SELECT     p.id
                    ,     FIRST_VALUE (p.text IGNORE NULLS) 
                              OVER ( PARTITION BY  p.id
                                     ORDER BY      l.ord_num
                                   )     AS top_text
                    ,     l.language
                    FROM           all_language     l
                    LEFT OUTER JOIN  position     p     PARTITION BY (p.id)
                                             ON  l.language     = p.language
               )               src
          ON     (     src.id          = dst.id
               AND     src.language     = dst.language
               )
          WHEN NOT MATCHED THEN 
          INSERT     (dst.id, dst.text,     dst.language)
          VALUES     (src.id, src.top_text, src.language)
          ;
          It might be just as efficient to do this as an INSERT, rather than a MERGE.

          You may already have a table such as all_language that tells what the languages are, and in what order they should be used for getting missing values. If so, use that table where I used the sub-query all_language above. If you don't have such a table, think about creating one; it could be useful for a number of reasons, including validating language when you INSERT or UPDATE position.
          In this particular case, the alphabetic order of the codes 'E', 'F', 'I' and 'S' happens to be the order yoiu want, but it's bad programming practice to rely on that. If they change the picking order, or add another language, the code would require much larger changes.

          You might consider adding a new column, that tells whether the word in text really is a word in the given language, or was copied from some other language just to fill a void. I suggest another VARCHAR2 (1) column, called source_language, that will be equal to language when the text is actually a word in that language.
          • 2. Re: insert Duplicate rows with difrent languages
            916212
            Thanks for the Ideas
            I created a table
            Create Table "all_languages" (
            ord_num  number,
            language varchar2(1)
            );
            INSERT INTO ALL_LANGUAGES 
            VALUES (1, 'E');
            INSERT INTO ALL_LANGUAGES 
            VALUES (2, 'F');
            INSERT INTO ALL_LANGUAGES 
            VALUES (3, 'I');
            INSERT INTO ALL_LANGUAGES 
            VALUES (4, 'S');
            and the position table changed like this
            Create Table "POSITION" (id  number,TEXT varchar2(50),Language Varchar2(1), source_language Varchar2(1));
            I didn't exactly get what you meant by using insert rather than merge. can you show me what it is your idea with inserts.

            and yes you are right i created this table all_languages. in case they change the order then they change in this table.


            regarding the text of the Language for example 'E' this will never change I guess. not in the near future unless English language suddenly disapears then no one will understand this post :))))


            Thanks again
            • 3. Re: insert Duplicate rows with difrent languages
              Frank Kulash
              Hi,
              913209 wrote:
              Thanks for the Ideas
              I created a table
              Create Table "all_languages" (
              ord_num  number,
              language varchar2(1)
              );
              Remember, anything inside quotes (either single-quotes or double-quotes) is case-sensitive. Table names with lower-case letters are more trouble than they're worth. Avoid using double-quotes.
              Try something like this for all_languages:
              Create Table all_languages 
              (     ord_num       number
              ,     language      varchar2 (1)     PRIMARY KEY
              );
              It doesn't hurt to have a primary key, and having one enables you to use language in foreign key constraints:
              Create Table POSITION 
              (     id            number
              ,     TEXT           varchar2 (20)     NOT NULL
              ,     Language      Varchar2 (1)     CONSTRAINT  pos_lang_fk
                                       REFERENCES  all_languages (language)     
              ,     source_language     varchar2 (1)     CONSTRAINT  pos_SOURCE_lang_fk
                                       REFERENCES  all_languages (language)     
              );
              I put a NOT NULL constraint on text; it matters in the INSERT statement below.
              I didn't exactly get what you meant by using insert rather than merge. can you show me what it is your idea with inserts.
              Here's one way:
              INSERT INTO     position (id, text, language, source_language)
              WITH     got_analytics     AS
              (
                   SELECT     p.id
                   ,     p.text          AS orig_text
                   ,     FIRST_VALUE (p.text IGNORE NULLS) 
                             OVER ( PARTITION BY  p.id
                                    ORDER BY      l.ord_num
                                  )     AS top_text
                   ,     l.language
                   ,     FIRST_VALUE (p.source_language IGNORE NULLS)
                             OVER ( PARTITION BY  p.id
                                    ORDER BY      l.ord_num
                                  )     AS source_language
                   FROM           all_languageS     l
                   LEFT OUTER JOIN  position     p     PARTITION BY (p.id)
                                            ON  l.language     = p.language
              )
              SELECT     id, top_text, language, source_language
              FROM     got_analytics
              WHERE     orig_text     IS NULL
              ;
              This looks simpler than the MERGE I posted earlier, but that's partly because now we have a real all_languages table, and don't need to create one in a sub-query.
              and yes you are right i created this table all_languages. in case they change the order then they change in this table.
              Exactly! Any authorized user can make those changes, and your INSERT statements (or other code) do not have to change at all.
              regarding the text of the Language for example 'E' this will never change I guess. not in the near future unless English language suddenly disapears then no one will understand this post :))))
              English disappearing is not the only problem. (Actually, if you quit uisng English in this application, it would work anyway.)
              What I was concerned about was, for example, if they decide to add the Arabic language as 'A'. Would you want Arabic words to become the first choice, just because 'A' comes before 'E'?
              1 person found this helpful
              • 4. Re: insert Duplicate rows with difrent languages
                916212
                Hello,
                just an update.
                after many tests and combination I found out that the insert statement will not work fully. if the first language in the all_languages table doesn't exist in the position table.

                the merge statement also uses the same concept.

                can you give more ideas?


                Thanks

                Edited by: 913209 on Feb 29, 2012 7:42 AM
                • 5. Re: insert Duplicate rows with difrent languages
                  916212
                  after tests I will re open this question again
                  • 6. Re: insert Duplicate rows with difrent languages
                    Frank Kulash
                    Hi,
                    913209 wrote:
                    Hello,
                    just an update.
                    after many tests and combination I found out that the insert statement will not work fully. if the first language in the all_languages table doesn't exist in the position table.

                    the merge statement also uses the same concept.

                    can you give more ideas?
                    Sorry, I don't understand what you're saying.

                    Whenever you have a problem, please post a little sample data (CREATE TABLE and INSERT statements, relevant columns only) from all tables.
                    Also post the results you want from that data, and an explanation of how you get those results from that data, with specific examples.
                    If you're asking about a DML statement, such as UPDATE, the CREATE TABLE and INSERT statements should re-create the tables as they are before the DML, and the results will be the contents of the changed table(s) when everything is finished.
                    • 7. Re: insert Duplicate rows with difrent languages
                      916212
                      Sorry for misleading you. maybe i was not clear enough.

                      let me explain more my question.
                      The situation is we are in multinational company every country maintain position text with their own language.

                      Now they may be maintaining with more than one language for example French and English.

                      The query should insert the missing text of the Position ID by copying it from the original text but with another Language field.

                      after this query we should always have 4 records for each Position ID. Moreover not to overwrite the original texts only check the records missing and insert it. ANY language can be missing.

                      To add more difficulty to the query I thought of this Language Order or the ALL_Languages Table. which says if the text available in English and French then better copy the English text because people will understand it more than French for Example. if the English doesn't exist then copy the French and so on.

                      Coming back to the Example.

                      the query should do this
                      INSERT INTO POSITION
                      VALUES (2, 'Assistance', 'F','E');
                      INSERT INTO POSITION 
                      VALUES (2, 'Assistance', 'I','E');
                      IF English did not exist before for ID=2

                      then it should do this
                      INSERT INTO POSITION
                      VALUES (2, 'Ayuda', 'F','S');
                      INSERT INTO POSITION 
                      VALUES (2, 'Ayuda', 'I','S');
                      INSERT INTO POSITION 
                      VALUES (2, 'Ayuda', 'E','S');
                      if all Languages exists Like ID=1 the query should not add more rows to ID=1.

                      if all IDS have 4 records then it should prompt "no records inserted"


                      Hope i make my self clear.

                      Best regards
                      • 8. Re: insert Duplicate rows with difrent languages
                        Frank Kulash
                        Hi,

                        Okay, I see my mistake now.
                        You can use MERGE like this:
                        MERGE INTO     position     dst
                        USING     (
                                  SELECT     p.id
                                  ,     FIRST_VALUE (p.text IGNORE NULLS) 
                                            OVER ( PARTITION BY  p.id
                                                   ORDER BY      l.ord_num
                                                   ROWS BETWEEN  UNBOUNDED PRECEDING
                                                            AND          UNBOUNDED FOLLOWING
                                                 )     AS top_text
                                  ,     FIRST_VALUE (p.lsource_anguage IGNORE NULLS) 
                                            OVER ( PARTITION BY  p.id
                                                   ORDER BY      l.ord_num
                                                   ROWS BETWEEN  UNBOUNDED PRECEDING
                                                            AND          UNBOUNDED FOLLOWING
                                                 )     AS source_language
                                  ,     l.language
                                  FROM           all_languages     l
                                  LEFT OUTER JOIN  position     p     PARTITION BY (p.id)
                                                           ON  l.language     = p.language
                             )               src
                        ON     (     src.id          = dst.id
                             AND     src.language     = dst.language
                             )
                        WHEN NOT MATCHED THEN 
                        INSERT     (dst.id, dst.text,     dst.language, dst.source_language)
                        VALUES     (src.id, src.top_text, src.language, src.source_language)
                        ;
                        Or you can use INSERT, like this:
                        INSERT INTO     position (id, text, language, source_language)
                        WITH     got_analytics     AS
                        (
                             SELECT     p.id
                             ,     p.text          AS orig_text
                             ,     FIRST_VALUE (p.text IGNORE NULLS) 
                                       OVER ( PARTITION BY  p.id
                                              ORDER BY      l.ord_num
                                              ROWS BETWEEN  UNBOUNDED PRECEDING
                                                       AND          UNBOUNDED FOLLOWING
                                            )     AS top_text
                             ,     l.language
                             ,     FIRST_VALUE (p.source_language IGNORE NULLS)
                                       OVER ( PARTITION BY  p.id
                                              ORDER BY      l.ord_num
                                              ROWS BETWEEN  UNBOUNDED PRECEDING
                                                       AND          UNBOUNDED FOLLOWING
                                            )     AS source_language
                             FROM           all_languageS     l
                             LEFT OUTER JOIN  position     p     PARTITION BY (p.id)
                                                      ON  l.language     = p.language
                        )
                        SELECT     id, top_text, language, source_language
                        FROM     got_analytics
                        WHERE     orig_text     IS NULL
                        ;
                        This is basically what I posted before, except that I added
                        ...                      ROWS BETWEEN  UNBOUNDED PRECEDING
                                                       AND          UNBOUNDED FOLLOWING
                        to all the analytic clauses.
                        The problem was that when you say "ORDER BY ord_num", that creates a default window that includes only the current ord_num and lower values. We need to look at all values, lower and higher, so we have to override the default windowing.

                        Edited by: Frank Kulash on Feb 29, 2012 8:58 PM

                        Edited by: Frank Kulash on Mar 1, 2012 11:31 AM
                        Added correct MERGE statement.
                        • 9. Re: insert Duplicate rows with difrent languages
                          916212
                          Hello,
                          Source Language comes as null. I guess you have to substitute in the query P.source_language with P.Language

                          Best Regards

                          Edited by: 913209 on Mar 1, 2012 3:10 AM
                          • 10. Re: insert Duplicate rows with difrent languages
                            Frank Kulash
                            Hi,
                            913209 wrote:
                            Hello,
                            Source Language comes as null.
                            Not when I run it.
                            `       ID TEXT                 L S
                            ---------- -------------------- - -
                                     2 Ayuda                E S
                                     2 Ayuda                F S
                                     2 Ayuda                I S
                                     2 Ayuda                S S
                            Post a complete test case that people can run to re-create the problem.