Converting to Date in Oracle DV Data Editor (Missing Documentation?)
Hi all,
I'm looking for help converting a column to date from text, I tried looking at posts and the following documentation that appears to be missing a step behind converting to date.
Context:
- Data is a Csv file loaded into OAC DV
- The following transformations are being done in the data editor that prompts when loading a dataset
Problem:
The column "LAST_UPDATE_DATE" is reading date data in a text format. I am performing the following steps to attempt to convert this to a date form:
- Click the three dots at the column header
- Click "Convert to Date"
- The following Wizard appears:
I can't find any Oracle documentation on this wizard or how to use it. The output I'm trying to achieve is the following format:
- 6/7/2022 5:34:01 AM
Inputting the following leads to errors
At this point I'll settle for just having a short date (e.g. "6/7/22") but I can't seem to figure this out either.
Can anyone help?
Best Answer
-
Ok, I had to give it a try to find out what was going on.
I call it a bug: the function exists, the formula is valid, but still it doesn't work.
The workaround is to use a TIMESTAMP instead of a date:
TIMESTAMPADD(SQL_TSI_DAY, "LAST_UPDATE_DATE", TIMESTAMP '1900-01-01 00:00:00')
If you are motivated to raise an SR to report the bug when entering the version with
DATE '1900-01-01'
feel free to do it :D3
Answers
-
If your source is a CSV file, there is not an issue, it works as it should.
CSV doesn't have data types, it's text or number. You can tell Excel to convert a number to a date, but it was (and still is) a number and it's Excel doing the conversion. Being a number, you can't turn it into a date just by trying a cast or a format, you first need to calculate the date from that number.
In Excel dates are actually numbers, just displayed as dates "visually": it's a number representing the number of days from 01/01/1900 (if not wrong).
In this case you could get real dates by using TIMESTAMPADD, and adding that number of days to 01/01/1900.
Something like this could work:
TIMESTAMPADD(sql_tsi_day, "LAST_UPDATE_DATE", DATE '1900-01-01')
And if your LAST_UPDATE_DATE is not an integer, the fractional part is the fraction of day representing the time.
1 -
@Gianni Ceresa Thank you for your response!
Great point about how data is stored in a Csv.
I think ultimately what I'm looking for is the OAC equivalent of that visual change to display the stored number as a date. This data will be used to provide visuals aggregated by date, so I am concerned about these visuals showing the numerical value of the date, as opposed to the date itself.
I tried the formula you suggested and retrieved this error:
In an attempt to troubleshoot this error, I played around with the data type and changed it from text to number, and vice versa to see if the formula would work, however it gave the same error every time.
0 -
You should write
DATE '1900-01-01'
, the word "DATE" is important because it's a casting of the text following as a date. As an alternative you can do an explicit casting with the CAST function or use the TO_DATETIME function to generate a date.And technically there can't be the equivalent to Excel with dates as numbers, mostly because Excel does its own cooking. The idea that a date is the number of days from 01-01-1900 is a very Excel thing. The closest "standard" would be the epoch time: the number of seconds from 01-01-1970. But even then it is still not a date, it's a number.
And that's why OAC has a number of functions to convert text or numbers into real dates (because data types do matter!).
This forum has a very poor style, one could expect a different background for "code", but I can't control the style of the forum to make the code pasted more visible.
TIMESTAMPADD(sql_tsi_day, "LAST_UPDATE_DATE", DATE '1900-01-01')
0 -
I see, so it appears there is no visual conversion I can do for the date, at least here in the data editor, and specifically for .csv file types.
I rewrote that formula and this time received no error! However, it then made every value in my date column null:
Going back to my point about the dates showing as numerical values in Visuals, I went and created a very simple line chart to see if there was a way I could change the date's display to no avail (take a look at the x-axis)
Any ideas on what else I could try?
0 -
I would say the issue could be a circular reference: you are redefining the formula for "LAST_UPDATE_DATE" to be a formula referencing "LAST_UPDATE_DATE".
Add a new column with the formula, you can keep your "LAST_UPDATE_DATE" as a number if you want, but it's pointless (so you can hide it), and then you have a new column (name it what you want, but not "LAST_UPDATE_DATE") with the real dates.And again: stop wasting your time trying to display dates from your numbers directly. It's impossible, because that number isn't a standard way to represent a date anywhere else than Excel. It can't be done just with a click as a display transformation.
0 -
@Gianni Ceresa Thank you for your consistent support.
I tried re-writing your above formula as a new column (as opposed to a transformation) and received the same null results:
Let me know if there is anything else you think I can try. I feel if we can find the answer to this it will be very helpful to future analysts using OAC.
0 -
@Gianni Ceresa You are the man! Thank you for helping me figure this out, I posted your response as the best answer. I will definitely be looking into a bug report.
Thanks again!
2 -
Glad it works @Allen Braun, and thanks for accepting the answer! That really helps out the community and we appreciate it :).
0 -
Hi @Allen Braun Looks like @Gianni Ceresa provided a way to solve your issue - in the future, these two blogs may also be helpful with some complex text columns that need to be converted to dates.
If you have a sample file that you can share with me here or privately - I can take a look and see why our semantic profiler did not recognize it as a date. My contact info is in my profile. thanks.
3 -
In this case, there is nothing that can be done on "text to date" conversion, because the data is not a date at all.
This is an example of the data: the first column is the "date".
In Excel, those numbers can be turned into dates visually, just telling Excel that the column is supposed to be formatted as a date.
But nowhere else it will be a date, because it isn't a standard like the Unix time or things like that. It's a pure Excel thing. And when you load an Excel file you could see it as a date because Excel has formats, but in a CSV it's just a number.
1 -
Thanks @Gianni Ceresa , the details provided may help us automate the conversion in the future.
0 -
@Luis E. Rivas -Oracle Your blogs are fantastic! Part two of your blog is more applicable to this specific dataset, but I couldn't get a date/time format to work using the included Date Converter. I do have a sample dataset I can send you - message me to let me know what's the best way to give it to you.
1 -
Just one note of caution (I don't have access to validate this right now) wrt the proposed solution:
TIMESTAMPADD(SQL_TSI_DAY, "LAST_UPDATE_DATE", TIMESTAMP '1900-01-01 00:00:00')
Excel converts '1900-01-01 00:00:00' to a value of 1. In fact, Excel formats the value of 0 to the imaginary 1/0/1900, not 12/31/1899.
You may want to validate whether your converted dates are off by a day. If so, you will have to change your anchor day:
TIMESTAMPADD(SQL_TSI_DAY, "LAST_UPDATE_DATE", TIMESTAMP '1899-12-31 00:00:00')
0 -
@Ambient , possible!
It's one of those Excel things that would deserve a test (or to search to confirm it), I just remember it is from 1900, just to annoy everybody having to handle it :D
But OP should easily know if his dates are off by 1 day or not.
0 -
Yes, Excel is "special" when it comes to these things and can't do things normally. Luckily DV correct for this where basic dates are concerned:
Problems re-appear when you do time arithmetic as you can imagine.
Never trust Excel, kids! ;)
1 -
@Christian Berg and @Ambient both great points! I went ahead and referenced the source data and did some spot checking to make sure the dates lined up. As it turns out the dates I converted using the following formula was off by two days!
TIMESTAMPADD(SQL_TSI_DAY, "LAST_UPDATE_DATE", TIMESTAMP '1900-01-01 00:00:00')
To accomdate this I added a - 2 to the end of the formula which corrected the dates.
TIMESTAMPADD(SQL_TSI_DAY, "LAST_UPDATE_DATE", TIMESTAMP '1900-01-01 00:00:00') - 2
1 -
The fact that the difference was 2 didn't feel right to me at first, but then I remembered a project that I was on in the mid-aughts. At the time we found out that Excel had purposely perpetuated a bug from the Lotus 123 days to preserve data integrity, by including Feb. 29, 1900, even though 1900 was not a leap year.
one of those little bits of trivia…
0 -
And that's why everybody love Excel : it comes with its own logic, but because of the number of users, it is always right 😁
Not like databases or any other type of files where data types exists and are a real thing 🙄
1