Skip to Main Content

Database Software

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Support ANSI SQL behaviour for null/empty Character string data types

Sebastien A.Nov 6 2017 — edited Nov 7 2017

Oracle considers the empty string '' to be the same as null. This differs from what most SQL dialect do and it prevents compliance with ANSI SQL.

Proposed Idea

Have a database configuration that would make oracle compliant by:

  • Considering null/empty string as different
    • in comparison
    • while stored (inserting then selecting an empty string should return an empty string)
  • Modify concatenation operator behave as specified in the spec ('a' || null = null, and 'a' || '' = 'a')
  • Modify functions having a string as input to comply with the spec. For instance LENGTH(null) should yield null but LENGTH('') should yield 0.

This configuration would need to be opt-in to avoid breaking changes for existing oracle application.

Note:

It seems that VARCHAR2 was created to keep VARCHAR  datatype available to support the standard behaviour in the future.

However I a database level configuration would be needed to configure how empty string literals should be considered.

Motivation

The goal of such idea is to be able to make SQL vendor independant application easier to write.

Some application developped with standard SQL in mind and actually making a difference between '' and null  have a hard time being made compatible with Oracle.

ANSI SQL: empty character string vs null behaviour

Even though we can find lot of places on the web people saying that in standard SQL empty character string should not be considered null I couldn't find in the SQL specification a place that explicitely states that.

However what the spec says the following

Concatenating character string with null should yield null

From ISO/IEC 9075-2:2003 (E) 6.28 <string value expression>:

    <concatenation> ::= <character value expression> <concatenation operator> <character factor>

    ...

    b) Let S1 and S2 be the result of the <character value expression> and <character factor>, respectively. Case:

        i) If either S1 or S2 is the null value, then the result of the <concatenation> is the null value.

So the spec says 'foo' || null = null but in oracle 'foo' || null = 'foo'.

Even if not explicited in the spec 'foo' || '' should yield 'foo'. (I can't think of an argument to have anything else).

Hence one cannot comply with the spec if considering '' and null because it requires need different results for input '' and null.

Length(null) should yield null value

From ISO/IEC 9075-2:2003 (E) 6.27 <numeric value function>

<length expression> ::=

<char length expression>

| <octet length expression>

<char length expression> ::=

{ CHAR_LENGTH | CHARACTER_LENGTH } <left paren> <string value expression>

[ USING <char length units> ] <right paren>

...

General Rules

1) If the value of one or more <string value expression>s, <datetime value expression>s, <interval value

expression>s, and <collection value expression>s that are simply contained in a <numeric value function>

is the null value, then the result of the <numeric value function> is the null value.

So LENGTH(null) should yield null but LENGTH('') should yield 0.

Hence one cannot comply with the spec if considering '' and null because it requires need different results for input '' and null.

<literal> are used to specify non-null values

From ISO/IEC 9075-2:2003 (E) 5.3 <literal>.

Function: Specify a non-null value.

This implies that '' should not be null or not be a valid string literal.

But in oracle '' is considered to be null.

Update:

Rephrasing to:

* clarify what the proposed idea.

* summarize evidences from SQL specification.

Comments

BluShadow

Unfortunately, making a change to the way the standard datatypes behave would typically break thousands of databases around the world.  I can't see that happening any time soon.

Sven W.

Can you reference the ansi document where it shows how NULLs should be handled for varchar2 strings?

Sebastien A.

I know it would be a breaking change. It should be opt-in configuration like SQL server did it with ANSI_NULLS

Sebastien A.

Can you reference the ansi document where it shows how NULLs should be handled for varchar2 strings?

I'm no SQL standard expert the only thing I could personnaly found in the spec is in ISO/IEC 9075-2:2003 (E) 5.3 <literal>.

5.3 <literal>

Function: Specify a non-null value.

This implies that '' should not be null or not be a valid string literal.

Additionally there are plenty of places on the web where people claim this behaviour is not ANSI like those

* https://stackoverflow.com/a/1171200/1069454

* https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:5984520277372

Sebastien A.

I'm no SQL standard expert the only thing I could personnaly found in the spec is in ISO/IEC 9075-2:2003 (E) 5.3 <literal>.

5.3 <literal>

Function: Specify a non-null value.

This implies that '' should not be null or not be a valid string literal.

Additionally there are plenty of places on the web where people claim this behaviour is not ANSI like those

* https://stackoverflow.com/a/1171200/1069454

* https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:5984520277372

I found another incompatibility with the SQL Spec induced by the fact that oracle considers '' as null.

From ISO/IEC 9075-2:2003 (E) 6.28 <string value expression>:

<concatenation> ::= <character value expression> <concatenation operator> <character factor>
...
b) Let S1 and S2 be the result of the <character value expression> and <character factor>, respectively. Case:
    i) If either S1 or S2 is the null value, then the result of the <concatenation> is the null value.

But in oracle concatenating null behaves as if not concatenating 'foo' || null yields 'foo' instead of null.

Sven W.

I'm no SQL standard expert the only thing I could personnaly found in the spec is in ISO/IEC 9075-2:2003 (E) 5.3 <literal>.

5.3 <literal>

Function: Specify a non-null value.

This implies that '' should not be null or not be a valid string literal.

Additionally there are plenty of places on the web where people claim this behaviour is not ANSI like those

* https://stackoverflow.com/a/1171200/1069454

* https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:5984520277372

A literal is something completlly different. You can't use this part of the spec. Before this discussion escalates further, it would be good to really check the standard and see what is written there. First need to get the facts right.

Also what would the consequence of your suggestion be?

If you declare a column as NOT NULL, you would allow to insert an empty string '' into the column? This would break most applications.

If the only issue is how to compare NULL values, then maybe a different kind of comparison operator, like === (compare value, type and NULL) would be better suited to solve or avoid the problems that you see with the empty=NULL logic.

BluShadow

A literal is something completlly different. You can't use this part of the spec. Before this discussion escalates further, it would be good to really check the standard and see what is written there. First need to get the facts right.

Also what would the consequence of your suggestion be?

If you declare a column as NOT NULL, you would allow to insert an empty string '' into the column? This would break most applications.

If the only issue is how to compare NULL values, then maybe a different kind of comparison operator, like === (compare value, type and NULL) would be better suited to solve or avoid the problems that you see with the empty=NULL logic.

Having worked on Ingres databases, the number of problems caused by having empty strings not the same as null, was more than I can count.  When viewing the data, you cannot tell if a column is containing nulls, empty strings, or a mixture of both, so you always had to try and remember to check for both things in your queries.  Most people usually forgot, and most people ended up with queries producing the wrong results.  It's actually great to work on Oracle where an empty string is null... as it just means we have one check to make, rather than two.  I can hardly think of any reason for storing an empty string in any of the different applications I've needed to write.

Sebastien A.

A literal is something completlly different. You can't use this part of the spec. Before this discussion escalates further, it would be good to really check the standard and see what is written there. First need to get the facts right.

Also what would the consequence of your suggestion be?

If you declare a column as NOT NULL, you would allow to insert an empty string '' into the column? This would break most applications.

If the only issue is how to compare NULL values, then maybe a different kind of comparison operator, like === (compare value, type and NULL) would be better suited to solve or avoid the problems that you see with the empty=NULL logic.

Sven W. wrote:

A literal is something completlly different. You can't use this part of the spec. Before this discussion escalates further, it would be good to really check the standard and see what is written there. First need to get the facts right.

I updated the idea title and description to be more specific on the problem and use the terminology used in the SQL standard spec.

The fact is that it seems that oracle is non standard compliant when it comes to handling null/empty character strings. I found 2 evidences listed above after scanning through the SQL standard spec.

Also what would the consequence of your suggestion be?

If you declare a column as NOT NULL, you would allow to insert an empty string '' into the column? This would break most applications.

The goal of the idea is to provide configuration to make oracle behave the standard way.

This would  be opt-in to prevent breaking existing oracle applications.

Having a configuration to make oracle behave the standard way would allow applications to be more RDBMS independent. That's why standard exist.

If the only issue is how to compare NULL values, then maybe a different kind of comparison operator, like === (compare value, type and NULL) would be better suited to solve or avoid the problems that you see with the empty=NULL logic.

I don't think the only issue is how to compare strings.

For other RDBMS null and empty strings are 2 different things but on oracle there is no way to make such differences. People resort to all sort of hack when porting to oracle to deal with this lack of difference (e.g. Oracle RDBMS empty string and null ).

Lukas Eder

Having worked on Ingres databases, the number of problems caused by having empty strings not the same as null, was more than I can count.  When viewing the data, you cannot tell if a column is containing nulls, empty strings, or a mixture of both, so you always had to try and remember to check for both things in your queries.  Most people usually forgot, and most people ended up with queries producing the wrong results.  It's actually great to work on Oracle where an empty string is null... as it just means we have one check to make, rather than two.  I can hardly think of any reason for storing an empty string in any of the different applications I've needed to write.

I don't buy this argument. I cannot recall any system / language where empty strings and null are the same thing, including Java, JavaScript, Scala, Kotlin, and pretty much every other SQL dialect. I don't even see the advantage of this "equality". Consider the inconsistent behaviour between 1 + NULL IS NULL and 'a' || NULL = 'a'. Why isn't zero (the "empty integer") and NULL the same thing to at least be consistent?

Lukas Eder

I know it would be a breaking change. It should be opt-in configuration like SQL server did it with ANSI_NULLS

I definitely think that this would need to be introduced as an opt-in change at first, given its tremendous potential impact on everything.

BluShadow

I don't buy this argument. I cannot recall any system / language where empty strings and null are the same thing, including Java, JavaScript, Scala, Kotlin, and pretty much every other SQL dialect. I don't even see the advantage of this "equality". Consider the inconsistent behaviour between 1 + NULL IS NULL and 'a' || NULL = 'a'. Why isn't zero (the "empty integer") and NULL the same thing to at least be consistent?

Lukas Eder wrote:

I don't buy this argument. I cannot recall any system / language where empty strings and null are the same thing, including Java, JavaScript, Scala, Kotlin, and pretty much every other SQL dialect. I don't even see the advantage of this "equality". Consider the inconsistent behaviour between 1 + NULL IS NULL and 'a' || NULL = 'a'. Why isn't zero (the "empty integer") and NULL the same thing to at least be consistent?

zero is a known value

an empty string is... well.... it has no value.

Can't compare Oranges with Apples.

Sven W.

I don't buy this argument. I cannot recall any system / language where empty strings and null are the same thing, including Java, JavaScript, Scala, Kotlin, and pretty much every other SQL dialect. I don't even see the advantage of this "equality". Consider the inconsistent behaviour between 1 + NULL IS NULL and 'a' || NULL = 'a'. Why isn't zero (the "empty integer") and NULL the same thing to at least be consistent?

Lukas Eder wrote:

...Consider the inconsistent behaviour between 1 + NULL IS NULL and 'a' || NULL = 'a'. Why isn't zero (the "empty integer") and NULL the same thing to at least be consistent?

To be fair, this is a different behaviour for different operators. I assume, this is also documented somewhere as ANSI behaviour.

But it is not about the value itself.

However we see that in this specific case, the NULL values for strings already behaves like sarod wants it to be.

How I understand the "idea" is that

a) the IS NULL comparison would return FALSE it the value is an empty string and TRUE if it is null.

b) the equality comparison A=B returns TRUE if both values are empty strings, instead of NULL (=FALSE in the where clause) as it is currently.

I think having a new type of equality comparison operator. would solve most problems that might be the reason for the suggestion.

As a side note: Standard compliance usually is good.

However to assume that if two databases are standard compliant, that you can copy code from one to another, is a fallacy.

Sebastien A.

Lukas Eder wrote:

...Consider the inconsistent behaviour between 1 + NULL IS NULL and 'a' || NULL = 'a'. Why isn't zero (the "empty integer") and NULL the same thing to at least be consistent?

To be fair, this is a different behaviour for different operators. I assume, this is also documented somewhere as ANSI behaviour.

But it is not about the value itself.

However we see that in this specific case, the NULL values for strings already behaves like sarod wants it to be.

How I understand the "idea" is that

a) the IS NULL comparison would return FALSE it the value is an empty string and TRUE if it is null.

b) the equality comparison A=B returns TRUE if both values are empty strings, instead of NULL (=FALSE in the where clause) as it is currently.

I think having a new type of equality comparison operator. would solve most problems that might be the reason for the suggestion.

As a side note: Standard compliance usually is good.

However to assume that if two databases are standard compliant, that you can copy code from one to another, is a fallacy.

Sven W. wrote:

Lukas Eder wrote:

...Consider the inconsistent behaviour between 1 + NULL IS NULL and 'a' || NULL = 'a'. Why isn't zero (the "empty integer") and NULL the same thing to at least be consistent?

To be fair, this is a different behaviour for different operators. I assume, this is also documented somewhere as ANSI behaviour.

But it is not about the value itself.

However we see that in this specific case, the NULL values for strings already behaves like sarod wants it to be.

I think there is a misunderstanding here. What I would expect is to have the string concatenation behave as specified by the standard. That is  'a' || NULL should yield NULL and 'a' || '' shoudl yield ''

As a side note: Standard compliance usually is good.

However to assume that if two databases are standard compliant, that you can copy code from one to another, is a fallacy.

I agree that a standard will never be able to standardize 100% of behaviour of something as complex as SQL and there will always be corner cases where RDBMS behaves differently.

However string concatenation is clearly specified in AINSI SQL and current Oracle behaviour when concatenating string with null is not compliant with the spec.

Additionally the spec basically says:

*   'a' || NULL should yield NULL

*   'a' || '' should yiled 'a'
And I don't see how oracle could comply to the spec without differentiating empty string from null string.

GregV

I agree with the fact that it would probably break many things in place. Though I don't see the need for the distinction, I do think there are inconsistencies in some built-in functions' results.

'a' || NULL should return NULL, as should Length('a' || NULL)

William Robertson

Personally I think having two kinds of 'empty' is madness, the ANSI committee were crazy to come up with it, and Oracle has done well to stand up against it all these years. However, the standard is out there and there is a portability argument for at least providing tools to support it.

Perhaps a new 'a' prefix could be added for text literals so that a'' would give an ANSI-style empty string.

Lukas Eder

Sven W. wrote:

Lukas Eder wrote:

...Consider the inconsistent behaviour between 1 + NULL IS NULL and 'a' || NULL = 'a'. Why isn't zero (the "empty integer") and NULL the same thing to at least be consistent?

To be fair, this is a different behaviour for different operators. I assume, this is also documented somewhere as ANSI behaviour.

But it is not about the value itself.

However we see that in this specific case, the NULL values for strings already behaves like sarod wants it to be.

I think there is a misunderstanding here. What I would expect is to have the string concatenation behave as specified by the standard. That is  'a' || NULL should yield NULL and 'a' || '' shoudl yield ''

As a side note: Standard compliance usually is good.

However to assume that if two databases are standard compliant, that you can copy code from one to another, is a fallacy.

I agree that a standard will never be able to standardize 100% of behaviour of something as complex as SQL and there will always be corner cases where RDBMS behaves differently.

However string concatenation is clearly specified in AINSI SQL and current Oracle behaviour when concatenating string with null is not compliant with the spec.

Additionally the spec basically says:

*   'a' || NULL should yield NULL

*   'a' || '' should yiled 'a'
And I don't see how oracle could comply to the spec without differentiating empty string from null string.

'a' || '' shoudl yield ''

That's a typo, I'm assuming? It should yield 'a'

Lukas Eder

Lukas Eder wrote:

I don't buy this argument. I cannot recall any system / language where empty strings and null are the same thing, including Java, JavaScript, Scala, Kotlin, and pretty much every other SQL dialect. I don't even see the advantage of this "equality". Consider the inconsistent behaviour between 1 + NULL IS NULL and 'a' || NULL = 'a'. Why isn't zero (the "empty integer") and NULL the same thing to at least be consistent?

zero is a known value

an empty string is... well.... it has no value.

Can't compare Oranges with Apples.

It is a well defined value and concept: https://en.wikipedia.org/wiki/Empty_string. It is the identity element of the concatenation operation, just like zero is the identity element of the addition operation. So, really. This isn't apples and oranges. This is just Oracle being super quirky.

I understand that you've never seen the point of it from your experience, but that doesn't mean that the concept of an empty string is unnecessary / foreign / surprising / etc. Again, it's a well-defined and useful concept in pretty much every other language / platform.

BluShadow

It is a well defined value and concept: https://en.wikipedia.org/wiki/Empty_string. It is the identity element of the concatenation operation, just like zero is the identity element of the addition operation. So, really. This isn't apples and oranges. This is just Oracle being super quirky.

I understand that you've never seen the point of it from your experience, but that doesn't mean that the concept of an empty string is unnecessary / foreign / surprising / etc. Again, it's a well-defined and useful concept in pretty much every other language / platform.

You don't have to tell me.  I've worked on more than just Oracle databases in my time.  Ingres is a good example as it has always treated empty strings as being different from null.  I have good experience of the numerous bugs and issues that occurred because people hadn't accounted for both empty strings and nulls in their queries, and also the developers in my team who had to enlist my help to find the issues because there's no visual difference when they are looking at the data as both things look blank on the screen.

It was a blessing to work with Oracle where that annoyance (and after years of it, it really had become an annoyance) was removed and then we only had to worry about the difference between nulls and strings with only spaces in them.

William Robertson

It is a well defined value and concept: https://en.wikipedia.org/wiki/Empty_string. It is the identity element of the concatenation operation, just like zero is the identity element of the addition operation. So, really. This isn't apples and oranges. This is just Oracle being super quirky.

I understand that you've never seen the point of it from your experience, but that doesn't mean that the concept of an empty string is unnecessary / foreign / surprising / etc. Again, it's a well-defined and useful concept in pretty much every other language / platform.

I am sure it is useful in a 3GL to be able to distinguish between initialised and uninitialised variables or have the flexibility to set strings to some non-empty dummy value, especially when a string is implemented as an array (and of course we have "is empty" for arrays), but really in SQL I think it is a huge mistake. We had a situation on my project just recently where a "rating not in ('C','D')" condition against a complex view was excluding unrated data, and my manager, from a SQL Server background, thought we needed to go and check whether the values were being stored as null or as empty strings. I was able to tell him there is no such distinction in Oracle and we all saved some time we might have wasted in comparing different kinds of emptiness. I expect there is a third or fourth kind of nothingness we could come up with if we put our minds to it, but those would be mistakes too.

BluShadow

It is a well defined value and concept: https://en.wikipedia.org/wiki/Empty_string. It is the identity element of the concatenation operation, just like zero is the identity element of the addition operation. So, really. This isn't apples and oranges. This is just Oracle being super quirky.

I understand that you've never seen the point of it from your experience, but that doesn't mean that the concept of an empty string is unnecessary / foreign / surprising / etc. Again, it's a well-defined and useful concept in pretty much every other language / platform.

Lukas Eder wrote:

Again, it's a well-defined and useful concept in pretty much every other language / platform.

Are you sure?

Looking at MS ODBC documentation:

https://docs.microsoft.com/en-us/sql/odbc/microsoft/data-type-limitations

quote:

Zero-Length Strings When a dBASE, Microsoft Excel, Paradox, or Textdriver is used, inserting a zero-length string into a column actually inserts a NULL instead.

Interestingly, when googling a little, it's surprising to see that people are still raising issues about their queries not doing what they expect and finding it's because they are not catering for both null and empty string in those other database.

As someone who's spent a long time on the Oracle community, especially in the area of SQL and PL/SQL, I have to say it's rare to see questions that relate to people being confused over empty strings being null.  When we do get such questions, it's clearly people who've come from other databases and are extremely new to Oracle, but that is not very often.

Lukas Eder

Lukas Eder wrote:

Again, it's a well-defined and useful concept in pretty much every other language / platform.

Are you sure?

Looking at MS ODBC documentation:

https://docs.microsoft.com/en-us/sql/odbc/microsoft/data-type-limitations

quote:

Zero-Length Strings When a dBASE, Microsoft Excel, Paradox, or Textdriver is used, inserting a zero-length string into a column actually inserts a NULL instead.

Interestingly, when googling a little, it's surprising to see that people are still raising issues about their queries not doing what they expect and finding it's because they are not catering for both null and empty string in those other database.

As someone who's spent a long time on the Oracle community, especially in the area of SQL and PL/SQL, I have to say it's rare to see questions that relate to people being confused over empty strings being null.  When we do get such questions, it's clearly people who've come from other databases and are extremely new to Oracle, but that is not very often.

Looking at MS ODBC documentation

We can discuss funky implementation details of old tech, or well-defined concepts in maths and computer science I see, you're biased by the first, I strive to achieve the latter.

When we do get such questions, it's clearly people who've come from other databases and are extremely new to Oracle, but that is not very often.

In other words: Oracle is an environment where people migrate away from, not to? ;-)

GregV

Having worked with the Oracle Database for quite some years now, I tend to agree with BluShadow. Making the distinction between empty string and null string would probably make things worse. The developers of the crappy third-party software we're implementing had a better idea: let's store a "null" value as 1 blank character ' '. Wonderful, now we have to use NVL(colname, ' ') = ' ' practically in all our queries...

Sven W.

Lukas Eder wrote:

Again, it's a well-defined and useful concept in pretty much every other language / platform.

Are you sure?

Looking at MS ODBC documentation:

https://docs.microsoft.com/en-us/sql/odbc/microsoft/data-type-limitations

quote:

Zero-Length Strings When a dBASE, Microsoft Excel, Paradox, or Textdriver is used, inserting a zero-length string into a column actually inserts a NULL instead.

Interestingly, when googling a little, it's surprising to see that people are still raising issues about their queries not doing what they expect and finding it's because they are not catering for both null and empty string in those other database.

As someone who's spent a long time on the Oracle community, especially in the area of SQL and PL/SQL, I have to say it's rare to see questions that relate to people being confused over empty strings being null.  When we do get such questions, it's clearly people who've come from other databases and are extremely new to Oracle, but that is not very often.

BluShadow wrote:

...

As someone who's spent a long time on the Oracle community, especially in the area of SQL and PL/SQL, I have to say it's rare to see questions that relate to people being confused over empty strings being null. When we do get such questions, it's clearly people who've come from other databases and are extremely new to Oracle, but that is not very often.

I always find it strange how lengthy the discussions are in the Javascript community when the need to consider null, unkown, empty and zero length strings.

Here is such a fine example: https://stackoverflow.com/questions/154059/how-do-you-check-for-an-empty-string-in-javascript

I especially like this part of the discussion (which is totally confusing for sql/plsql developers):

Huh? If you are expecting a string, empty(0) and empty(7) should return the same value.Bennett McElwee Apr 1 '14 at 22:44

      

In my particular case - empty("0") must return false (because that is a not empty string), but empty(0) must return true because it is empty Andron Apr 2 '14 at 11:24

      

But 0 isn't empty! It's a number, and numbers can't be full or empty. Of course, it's your function and so must satisfy your requirements, but empty is a misleading name in this case.Bennett McElwee Apr 2 '14 at 21:15

The question for me is: Is there really an advantage to go this route? I firmly believe "no there is not". I know I'm pretty biased by my experience, but in this case, I do not see why we ever want to add more confusion. Especially since the only argument we had so far is to be more "standard compliant". If that is the only reason, then better change the standard.

user-15sr5

@lukas-eder great suggestion. I'm currently struggling with this same issue. An empty string is different from null for the same reason that an empty array is different from null, i.e. having an empty container is not the same as having no container. I will have to add tons of Oracle specific logic and will suffer degraded performance and it makes me sad.

Adding such support as opt-in would not break anyone's database. It's only a question of whether the Oracle people see enough need for this, which it doesn't seem they do, which is fine. But please don't use backwards compatibility as a reason not to do it.

On having to filter out for empty strings on top of null: no ! That's exactly the point. An empty string is intentional. It doesn't mean that there is no data, it means that the data is an empty string. You do not have to remove empty strings on top of nulls because you filter out nulls for one reason, and you would filter empty strings for another reason.

1 - 24

Post Details

Added on Nov 6 2017
24 comments
5,289 views