Forum Stats

  • 3,838,681 Users
  • 2,262,393 Discussions
  • 7,900,730 Comments

Discussions

ODI cannot load CSV with fields enclosed by double-quotes, delimited by comma

3673816
3673816 Member Posts: 79
edited Jul 2, 2018 3:09PM in Data Integrator

Hi.

I have a CSV file with fields delimited by comma.

Some fields, however are also enclosed by double-quotes, specifically the ones whose text itself has commas, as a way of differentiating commas in the text vs commas that are field delimiters.

So, something like this:

--------------

123,abcd,"abc,d"

456,defg,"def,g"

Three fields, delimited by comma. However, the third field is enclosed by double-quotes and has commas in its text. This comma is not a delimited, but a part of the text itself.

I have having a hard time using ODI to load this file into Oracle.

In Oracle External table it is easy, I would just do:

         FIELDS TERMINATED BY ','

         OPTIONALLY ENCLOSED BY '"'

In ODI, in the datastore "Files" tab I don't see anything that specifies "optionally enclosed by".

Also, in the mapping, I am using the "LKM SQL to Oracle (Built-In).Global" module, which does not have the "optionally enclosed by" attribute. Neither does the "LKM File to Oracle (EXTERNAL TABLE)" module.

So, how can I load double-quoted fields without the actual double-quotes in the resulting Oracle column and, more importantly, escape the text-related commas and not treat them as field separators?

Thank you

Boris

Tagged:
3173245

Best Answer

  • JimmyOTNC
    JimmyOTNC Member Posts: 162 Bronze Badge
    edited Jul 2, 2018 1:12PM Answer ✓

    when you define the file (on the designer, models), on the files tab:

         In the field separator section:

              put other ,

          In the Text Delimiter: put " (the double quote).

    Try this and let me know if this work for you.

    3173245

Answers

  • 3673816
    3673816 Member Posts: 79
    edited Jul 2, 2018 12:18PM

    The way I understand it, the Load Knowledge Module can be tweaked to treat fields optionally enclosed by '"'.

    The LKM File to Oracle (EXTERNAL TABLE) module has the "Create External Table" task. In it, I do see the following code:

      FIELDS  TERMINATED BY  x'<%=odiRef.getSrcTablesList("", "[XFILE_SEP_FIELD]", "", "")%>'
      <% if(odiRef.getSrcTablesList("", "[FILE_ENC_FIELD]", "", "").equals("")){%>
      <%} else {%>OPTIONALLY ENCLOSED BY '<%=odiRef.getSrcTablesList("", "[FILE_ENC_FIELD]", "", "").substring(0,1)%>' AND '<%=odiRef.getSrcTablesList("", "[FILE_ENC_FIELD]", "", "").substring(1,2)%>' <%}%>
      <%=odiRef.getOption("EXT_MISSING_FIELD")%>

    I am not sure how this code can be tweaked to get this type of behavior:       
    OPTIONALLY ENCLOSED BY '"'

    Also, the LKM SQL to Oracle module (the built-in version of which I am using), has a bunch of tasks, but I am not even sure which task would be responsible for controlling this behavior and what the code might look like.
    We are mostly using the "LKM SQL to Oracle" built-in module to load CSV files into our Oracle tables, so finding a way to do it with this module would be actually more preferable than with the "LKM File to Oracle (EXTERNAL TABLE)" module.

    Thanks

    Boris

  • JimmyOTNC
    JimmyOTNC Member Posts: 162 Bronze Badge
    edited Jul 2, 2018 1:12PM Answer ✓

    when you define the file (on the designer, models), on the files tab:

         In the field separator section:

              put other ,

          In the Text Delimiter: put " (the double quote).

    Try this and let me know if this work for you.

    3173245
  • 3673816
    3673816 Member Posts: 79
    edited Jul 2, 2018 2:10PM

    Thanks JimmyOTNC.

    This is working now.

    I already had the comma (,) in the Field Separator field.

    I just added the double quote (") in the Text Delimiter field.

    In essence ODI's Text Delimiter is synonymous with "OPTIONALLY ENCLOSED BY" attribute. It wasn't intuitive enough for me.

    At some point, I would love to learn how to manage/control things like that with Knowledge Modules, as I think that would have been another approach, if more complicated. Looks like that FILE_ENC_FIELD could have been defined as a flexfield, but that's where my digging stopped before I used your recommendation and it worked.

    Thanks!

  • JimmyOTNC
    JimmyOTNC Member Posts: 162 Bronze Badge
    edited Jul 2, 2018 3:09PM

    Glad it worked for you. Yes, the doc is not intuitive at all.

This discussion has been closed.