This discussion is archived
10 Replies Latest reply: Mar 1, 2012 8:30 AM by Frank Kulash RSS

insert Duplicate rows with difrent languages

916212 Newbie
Currently Being Moderated
Hello everybody,
I need your help in solving this query with the most performant way.
I have a table:
Create Table "POSITION" (
id  number,
TEXT varchar2(50),
Language Varchar2(1)
);
insert statments
INSERT INTO POSITION 
VALUES (1, 'Secretary', 'E');
INSERT INTO POSITION 
VALUES (1, 'Secrétaire', 'F');
INSERT INTO POSITION 
VALUES (1, 'Segretario', 'I');
INSERT INTO POSITION 
VALUES (1, 'Secretario', 'S');
INSERT INTO POSITION 
VALUES (2, 'Assistance', 'E');
INSERT INTO POSITION 
VALUES (2, 'Ayuda', 'S');
As you have seen the text changes in diffrent languages for the same id. for whatever reasons the data is in complete in all languages (some ids missing english some french and some italian). I need a query to insert the missing language(s) in my example above Id number 2 the language French and Italian is missing so the query should copy the record English instead but with Language = 'F' and one more time with language = 'I' and so on.
in addition the query should look first to copy English if it doesn't exist then French if not then Italian when more than language exists.

hope I make it clear. as i said this query should be performant also as the data size is big.



Thanks for your help
  • 1. Re: insert Duplicate rows with difrent languages
    Frank Kulash Guru
    Currently Being Moderated
    Hi,

    Thanks for posting the CREATE TABLE and INSERT statements; tha's very helpful.

    Here's one way to do what you want:
    MERGE INTO     position     dst
    USING     (
              WITH       all_language     AS
              (
                   SELECT  'E' AS language, 1 AS ord_num     FROM dual     UNION ALL
                   SELECT     'F',           2          FROM dual     UNION ALL
                   SELECT     'I',           3          FROM dual     UNION ALL
                   SELECT     'S',           4          FROM dual
              )
              SELECT     p.id
              ,     FIRST_VALUE (p.text IGNORE NULLS) 
                        OVER ( PARTITION BY  p.id
                               ORDER BY      l.ord_num
                             )     AS top_text
              ,     l.language
              FROM           all_language     l
              LEFT OUTER JOIN  position     p     PARTITION BY (p.id)
                                       ON  l.language     = p.language
         )               src
    ON     (     src.id          = dst.id
         AND     src.language     = dst.language
         )
    WHEN NOT MATCHED THEN 
    INSERT     (dst.id, dst.text,     dst.language)
    VALUES     (src.id, src.top_text, src.language)
    ;
    It might be just as efficient to do this as an INSERT, rather than a MERGE.

    You may already have a table such as all_language that tells what the languages are, and in what order they should be used for getting missing values. If so, use that table where I used the sub-query all_language above. If you don't have such a table, think about creating one; it could be useful for a number of reasons, including validating language when you INSERT or UPDATE position.
    In this particular case, the alphabetic order of the codes 'E', 'F', 'I' and 'S' happens to be the order yoiu want, but it's bad programming practice to rely on that. If they change the picking order, or add another language, the code would require much larger changes.

    You might consider adding a new column, that tells whether the word in text really is a word in the given language, or was copied from some other language just to fill a void. I suggest another VARCHAR2 (1) column, called source_language, that will be equal to language when the text is actually a word in that language.
  • 2. Re: insert Duplicate rows with difrent languages
    916212 Newbie
    Currently Being Moderated
    Thanks for the Ideas
    I created a table
    Create Table "all_languages" (
    ord_num  number,
    language varchar2(1)
    );
    INSERT INTO ALL_LANGUAGES 
    VALUES (1, 'E');
    INSERT INTO ALL_LANGUAGES 
    VALUES (2, 'F');
    INSERT INTO ALL_LANGUAGES 
    VALUES (3, 'I');
    INSERT INTO ALL_LANGUAGES 
    VALUES (4, 'S');
    and the position table changed like this
    Create Table "POSITION" (id  number,TEXT varchar2(50),Language Varchar2(1), source_language Varchar2(1));
    I didn't exactly get what you meant by using insert rather than merge. can you show me what it is your idea with inserts.

    and yes you are right i created this table all_languages. in case they change the order then they change in this table.


    regarding the text of the Language for example 'E' this will never change I guess. not in the near future unless English language suddenly disapears then no one will understand this post :))))


    Thanks again
  • 3. Re: insert Duplicate rows with difrent languages
    Frank Kulash Guru
    Currently Being Moderated
    Hi,
    913209 wrote:
    Thanks for the Ideas
    I created a table
    Create Table "all_languages" (
    ord_num  number,
    language varchar2(1)
    );
    Remember, anything inside quotes (either single-quotes or double-quotes) is case-sensitive. Table names with lower-case letters are more trouble than they're worth. Avoid using double-quotes.
    Try something like this for all_languages:
    Create Table all_languages 
    (     ord_num       number
    ,     language      varchar2 (1)     PRIMARY KEY
    );
    It doesn't hurt to have a primary key, and having one enables you to use language in foreign key constraints:
    Create Table POSITION 
    (     id            number
    ,     TEXT           varchar2 (20)     NOT NULL
    ,     Language      Varchar2 (1)     CONSTRAINT  pos_lang_fk
                             REFERENCES  all_languages (language)     
    ,     source_language     varchar2 (1)     CONSTRAINT  pos_SOURCE_lang_fk
                             REFERENCES  all_languages (language)     
    );
    I put a NOT NULL constraint on text; it matters in the INSERT statement below.
    I didn't exactly get what you meant by using insert rather than merge. can you show me what it is your idea with inserts.
    Here's one way:
    INSERT INTO     position (id, text, language, source_language)
    WITH     got_analytics     AS
    (
         SELECT     p.id
         ,     p.text          AS orig_text
         ,     FIRST_VALUE (p.text IGNORE NULLS) 
                   OVER ( PARTITION BY  p.id
                          ORDER BY      l.ord_num
                        )     AS top_text
         ,     l.language
         ,     FIRST_VALUE (p.source_language IGNORE NULLS)
                   OVER ( PARTITION BY  p.id
                          ORDER BY      l.ord_num
                        )     AS source_language
         FROM           all_languageS     l
         LEFT OUTER JOIN  position     p     PARTITION BY (p.id)
                                  ON  l.language     = p.language
    )
    SELECT     id, top_text, language, source_language
    FROM     got_analytics
    WHERE     orig_text     IS NULL
    ;
    This looks simpler than the MERGE I posted earlier, but that's partly because now we have a real all_languages table, and don't need to create one in a sub-query.
    and yes you are right i created this table all_languages. in case they change the order then they change in this table.
    Exactly! Any authorized user can make those changes, and your INSERT statements (or other code) do not have to change at all.
    regarding the text of the Language for example 'E' this will never change I guess. not in the near future unless English language suddenly disapears then no one will understand this post :))))
    English disappearing is not the only problem. (Actually, if you quit uisng English in this application, it would work anyway.)
    What I was concerned about was, for example, if they decide to add the Arabic language as 'A'. Would you want Arabic words to become the first choice, just because 'A' comes before 'E'?
  • 4. Re: insert Duplicate rows with difrent languages
    916212 Newbie
    Currently Being Moderated
    Hello,
    just an update.
    after many tests and combination I found out that the insert statement will not work fully. if the first language in the all_languages table doesn't exist in the position table.

    the merge statement also uses the same concept.

    can you give more ideas?


    Thanks

    Edited by: 913209 on Feb 29, 2012 7:42 AM
  • 5. Re: insert Duplicate rows with difrent languages
    916212 Newbie
    Currently Being Moderated
    after tests I will re open this question again
  • 6. Re: insert Duplicate rows with difrent languages
    Frank Kulash Guru
    Currently Being Moderated
    Hi,
    913209 wrote:
    Hello,
    just an update.
    after many tests and combination I found out that the insert statement will not work fully. if the first language in the all_languages table doesn't exist in the position table.

    the merge statement also uses the same concept.

    can you give more ideas?
    Sorry, I don't understand what you're saying.

    Whenever you have a problem, please post a little sample data (CREATE TABLE and INSERT statements, relevant columns only) from all tables.
    Also post the results you want from that data, and an explanation of how you get those results from that data, with specific examples.
    If you're asking about a DML statement, such as UPDATE, the CREATE TABLE and INSERT statements should re-create the tables as they are before the DML, and the results will be the contents of the changed table(s) when everything is finished.
  • 7. Re: insert Duplicate rows with difrent languages
    916212 Newbie
    Currently Being Moderated
    Sorry for misleading you. maybe i was not clear enough.

    let me explain more my question.
    The situation is we are in multinational company every country maintain position text with their own language.

    Now they may be maintaining with more than one language for example French and English.

    The query should insert the missing text of the Position ID by copying it from the original text but with another Language field.

    after this query we should always have 4 records for each Position ID. Moreover not to overwrite the original texts only check the records missing and insert it. ANY language can be missing.

    To add more difficulty to the query I thought of this Language Order or the ALL_Languages Table. which says if the text available in English and French then better copy the English text because people will understand it more than French for Example. if the English doesn't exist then copy the French and so on.

    Coming back to the Example.

    the query should do this
    INSERT INTO POSITION
    VALUES (2, 'Assistance', 'F','E');
    INSERT INTO POSITION 
    VALUES (2, 'Assistance', 'I','E');
    IF English did not exist before for ID=2

    then it should do this
    INSERT INTO POSITION
    VALUES (2, 'Ayuda', 'F','S');
    INSERT INTO POSITION 
    VALUES (2, 'Ayuda', 'I','S');
    INSERT INTO POSITION 
    VALUES (2, 'Ayuda', 'E','S');
    if all Languages exists Like ID=1 the query should not add more rows to ID=1.

    if all IDS have 4 records then it should prompt "no records inserted"


    Hope i make my self clear.

    Best regards
  • 8. Re: insert Duplicate rows with difrent languages
    Frank Kulash Guru
    Currently Being Moderated
    Hi,

    Okay, I see my mistake now.
    You can use MERGE like this:
    MERGE INTO     position     dst
    USING     (
              SELECT     p.id
              ,     FIRST_VALUE (p.text IGNORE NULLS) 
                        OVER ( PARTITION BY  p.id
                               ORDER BY      l.ord_num
                               ROWS BETWEEN  UNBOUNDED PRECEDING
                                        AND          UNBOUNDED FOLLOWING
                             )     AS top_text
              ,     FIRST_VALUE (p.lsource_anguage IGNORE NULLS) 
                        OVER ( PARTITION BY  p.id
                               ORDER BY      l.ord_num
                               ROWS BETWEEN  UNBOUNDED PRECEDING
                                        AND          UNBOUNDED FOLLOWING
                             )     AS source_language
              ,     l.language
              FROM           all_languages     l
              LEFT OUTER JOIN  position     p     PARTITION BY (p.id)
                                       ON  l.language     = p.language
         )               src
    ON     (     src.id          = dst.id
         AND     src.language     = dst.language
         )
    WHEN NOT MATCHED THEN 
    INSERT     (dst.id, dst.text,     dst.language, dst.source_language)
    VALUES     (src.id, src.top_text, src.language, src.source_language)
    ;
    Or you can use INSERT, like this:
    INSERT INTO     position (id, text, language, source_language)
    WITH     got_analytics     AS
    (
         SELECT     p.id
         ,     p.text          AS orig_text
         ,     FIRST_VALUE (p.text IGNORE NULLS) 
                   OVER ( PARTITION BY  p.id
                          ORDER BY      l.ord_num
                          ROWS BETWEEN  UNBOUNDED PRECEDING
                                   AND          UNBOUNDED FOLLOWING
                        )     AS top_text
         ,     l.language
         ,     FIRST_VALUE (p.source_language IGNORE NULLS)
                   OVER ( PARTITION BY  p.id
                          ORDER BY      l.ord_num
                          ROWS BETWEEN  UNBOUNDED PRECEDING
                                   AND          UNBOUNDED FOLLOWING
                        )     AS source_language
         FROM           all_languageS     l
         LEFT OUTER JOIN  position     p     PARTITION BY (p.id)
                                  ON  l.language     = p.language
    )
    SELECT     id, top_text, language, source_language
    FROM     got_analytics
    WHERE     orig_text     IS NULL
    ;
    This is basically what I posted before, except that I added
    ...                      ROWS BETWEEN  UNBOUNDED PRECEDING
                                   AND          UNBOUNDED FOLLOWING
    to all the analytic clauses.
    The problem was that when you say "ORDER BY ord_num", that creates a default window that includes only the current ord_num and lower values. We need to look at all values, lower and higher, so we have to override the default windowing.

    Edited by: Frank Kulash on Feb 29, 2012 8:58 PM

    Edited by: Frank Kulash on Mar 1, 2012 11:31 AM
    Added correct MERGE statement.
  • 9. Re: insert Duplicate rows with difrent languages
    916212 Newbie
    Currently Being Moderated
    Hello,
    Source Language comes as null. I guess you have to substitute in the query P.source_language with P.Language

    Best Regards

    Edited by: 913209 on Mar 1, 2012 3:10 AM
  • 10. Re: insert Duplicate rows with difrent languages
    Frank Kulash Guru
    Currently Being Moderated
    Hi,
    913209 wrote:
    Hello,
    Source Language comes as null.
    Not when I run it.
    `       ID TEXT                 L S
    ---------- -------------------- - -
             2 Ayuda                E S
             2 Ayuda                F S
             2 Ayuda                I S
             2 Ayuda                S S
    Post a complete test case that people can run to re-create the problem.

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points