Oracle Fusion Data Intelligence Idea Lab

Welcome to the Oracle Analytics Community: Please complete your User Profile and upload your Profile Picture

More sophisticated Text data types in Data Flows

Needs Votes
31
Views
4
Comments

We use data flows a lot, and have seen an improvement in them over the years.

There's an annoying issue to do with how text data types are handled, that can cause very slow performance on the data flow if you save the data flow Database Connection (red data set) rather than csv (green data set). This is because of text lengths.

So;

  1. Converting to Text using the UI (right click > Convert to Text) will convert using VARCHAR(4000) - regardless of the actual length in the column
  2. Using CONCAT or || to add columns or transform columns will often create a CLOB object

We have seen that runtime of a data flow can go from 2-3 hours to 2-3 minutes just by using CAST statements to convert the relevant columns.

The issue is that there's nothing within the UI of the data flow to tell you that this is the case (it tells you text only, not Varchar(xx) etc; all that can be done is to run the data flow (create a "Red" data set) and look at the detail in SQL developer.

Can something be done on this? I could think of the following solutions;

  1. Surface detailed data types on the save screen; Column1 is Text AND a CLOB
  2. Some sort of automatic conversion of text length that can be run and convert them automatically (something like Column A is a CLOB/VARCHAR(4000) but the maximum character length is 20 so we'll convert to VARCHAR(25) to be on the safe side so we don't crop too much.
4
4 votes

Needs Votes · Last Updated

Comments

  • Chris Allies
    Chris Allies Rank 3 - Community Apprentice

    In case it helps someone else; we have a data set that helps us to identify this issue on a temporary basis by creating a data set using the below SQL. This is helpful but still needs a data set to be created first which can waste time - a proper solution is still needed on the front end.

    select c.TABLE_NAME, c.COLUMN_NAME, c.DATA_TYPE, c.DATA_LENGTH, c.DATA_PRECISION, c.DATA_SCALE, c.NULLABLE
    from all_tab_columns c
    inner join
    all_objects o
    on c.TABLE_NAME = o.OBJECT_NAME
    where c.OWNER = 'OAX_USER'
    and o.object_type='TABLE'
    order by 1

  • GabriellePrichard-Oracle
    edited Jun 17, 2025 2:05PM

    Thanks for the detailed feedback @Chris Allies— I’ll review this with our team internally. Improving type handling could be a great enhancement. Out of curiosity, which database are you using with your data flow? If you know the column can be safely trimmed, your approach of using 'cast' is definitely the right workaround. Since we can’t always predict future data size, it's tricky for us to determine an ideal column length.

  • Giorgio Ticinelli
    Giorgio Ticinelli Rank 3 - Community Apprentice

    Hi Gabrielle, Chris is using the FDI instance, so the database is the ADW built-in database. What Chris says it's been verified by me and the only way for him to avoid this problem is using CAST, because there is no option to choose the data type. Data flows are very popular because they are extremely simple to use, but adding the possibility to choose a data type could be useful (without complicating too much). In the worst case scenario, we had to support Chris checking the database tables in the back end, identifying the CLOB column that was causing issues. Thanks, Giorgio

  • Chris Allies
    Chris Allies Rank 3 - Community Apprentice
    edited Jun 18, 2025 12:48PM

    Thanks Gabrielle - I think honestly the easiest thing to implement would be to just highlight the VARCHAR(xxx) value in the Save Data part of the data flow, option 2 I was just spitballing.

    I'd also question the design choice of Right Click > Convert to Text in a data flow defaulting to Cast(column as varchar(4000)) as that's also likely to have a performance implication. Varchar(100) would probably be enough in most use cases? This would reduce runtime and thus make for a better UX.