Forum Stats

  • 3,824,927 Users
  • 2,260,440 Discussions
  • 7,896,351 Comments

Discussions

Allow to set negative substr length: SUBSTR(text,position,-length)

Sven W.
Sven W. Member Posts: 10,535 Gold Crown
edited Jul 22, 2019 10:35AM in Database Ideas - Ideas

syntax

SUBSTR(text,position,-length,length_mode)

example

text="abcdefgh"

SUBSTR(text,1,-2) => "abcdef"

result

This would fetch a part from a string that is a little less than the original string.

The suggestion would be a shortcut (and hopefully one with a higher performance) to

SUBSTR(text, 1, LENGTH(text)-2))

reason

The advantage is that this eliminates the need to write "text" - which could be a complex expression - twice.

We already can select a few characters from the end of the string using a negative position.

SUBSTR(text,-2) => "gh"

returns the last two chars from a text. However currently there is no quick way to fetch the first part.

The suggestion would complete this.

-- Edit --

added 4th parameter length_mode: needed so that existing code still produces NULL for negative lengths. See further comments in this thread.

Sven W.KayKThorsten KettnerFatMartinRApexBinefloo_barErik van RoonTimur AkhmadeevJorge RimblasJeffrey KemppnoskoGregVctriebsdstuber
14 votes

Active · Last Updated

Comments

  • FatMartinR
    FatMartinR Member Posts: 190 Blue Ribbon

    Whilst I appreciate your idea is based on the <span class="codeinlineitalic" style="font-style: italic;">substring_length</span>, what would it mean if <span class="codeinlineitalic" style="font-style: italic;">position</span> was also negative?

    https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions162.htm

    SUBSTR

    Syntax

    substr::=

    Description of substr.gif follows

    Description of the illustration substr.gif

    Purpose

    The SUBSTR functions return a portion of <span class="codeinlineitalic" style="font-style: italic;">char</span>, beginning at character <span class="codeinlineitalic" style="font-style: italic;">position</span>, <span class="codeinlineitalic" style="font-style: italic;">substring_length</span> characters long. SUBSTR calculates lengths using characters as defined by the input character set. SUBSTRB uses bytes instead of characters. SUBSTRC uses Unicode complete characters. SUBSTR2 uses UCS2 code points. SUBSTR4 uses UCS4 code points.

    • If <span class="codeinlineitalic" style="font-style: italic;">position</span> is 0, then it is treated as 1.
    • If <span class="codeinlineitalic" style="font-style: italic;">position</span> is positive, then Oracle Database counts from the beginning of <span class="codeinlineitalic" style="font-style: italic;">char</span> to find the first character.
    • If <span class="codeinlineitalic" style="font-style: italic;">position</span> is negative, then Oracle counts backward from the end of <span class="codeinlineitalic" style="font-style: italic;">char</span>.
    • If <span class="codeinlineitalic" style="font-style: italic;">substring_length</span> is omitted, then Oracle returns all characters to the end of <span class="codeinlineitalic" style="font-style: italic;">char</span>. If <span class="codeinlineitalic" style="font-style: italic;">substring_length</span> is less than 1, then Oracle returns null.

    <span class="codeinlineitalic" style="font-style: italic;">char</span> can be any of the datatypes CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB. Both <span class="codeinlineitalic" style="font-style: italic;">position</span> and <span class="codeinlineitalic" style="font-style: italic;">substring_length</span> must be of datatype NUMBER, or any datatype that can be implicitly converted to NUMBER, and must resolve to an integer. The return value is the same datatype as <span class="codeinlineitalic" style="font-style: italic;">char</span>. Floating-point numbers passed as arguments to SUBSTR are automatically converted to integers.

  • Sven W.
    Sven W. Member Posts: 10,535 Gold Crown
    edited Jul 17, 2019 3:36AM

    Whilst I appreciate your idea is based on the <span class="codeinlineitalic" style="font-style: italic;">substring_length</span>, what would it mean if <span class="codeinlineitalic" style="font-style: italic;">position</span> was also negative?

    https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions162.htm

    SUBSTR

    Syntax

    substr::=

    Description of substr.gif follows

    Description of the illustration substr.gif

    Purpose

    The SUBSTR functions return a portion of <span class="codeinlineitalic" style="font-style: italic;">char</span>, beginning at character <span class="codeinlineitalic" style="font-style: italic;">position</span>, <span class="codeinlineitalic" style="font-style: italic;">substring_length</span> characters long. SUBSTR calculates lengths using characters as defined by the input character set. SUBSTRB uses bytes instead of characters. SUBSTRC uses Unicode complete characters. SUBSTR2 uses UCS2 code points. SUBSTR4 uses UCS4 code points.

    • If <span class="codeinlineitalic" style="font-style: italic;">position</span> is 0, then it is treated as 1.
    • If <span class="codeinlineitalic" style="font-style: italic;">position</span> is positive, then Oracle Database counts from the beginning of <span class="codeinlineitalic" style="font-style: italic;">char</span> to find the first character.
    • If <span class="codeinlineitalic" style="font-style: italic;">position</span> is negative, then Oracle counts backward from the end of <span class="codeinlineitalic" style="font-style: italic;">char</span>.
    • If <span class="codeinlineitalic" style="font-style: italic;">substring_length</span> is omitted, then Oracle returns all characters to the end of <span class="codeinlineitalic" style="font-style: italic;">char</span>. If <span class="codeinlineitalic" style="font-style: italic;">substring_length</span> is less than 1, then Oracle returns null.

    <span class="codeinlineitalic" style="font-style: italic;">char</span> can be any of the datatypes CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB. Both <span class="codeinlineitalic" style="font-style: italic;">position</span> and <span class="codeinlineitalic" style="font-style: italic;">substring_length</span> must be of datatype NUMBER, or any datatype that can be implicitly converted to NUMBER, and must resolve to an integer. The return value is the same datatype as <span class="codeinlineitalic" style="font-style: italic;">char</span>. Floating-point numbers passed as arguments to SUBSTR are automatically converted to integers.

    Excellent point.

    I think there are two possible options.

    a) raise an error if both parameters are negative => not my favorite solution

    b) allow both parameters to be negative

    b1) start the counting always from the back of the string

    b2) start the counting from the resulting position

    My first idea was more in line with b2). However this results in some effects that might be more surprising than b1).

    There also could be some ambiguity which needs resolving.

    Some examples:

    negative position, negative length

    -- b1) count from the end of the stringsubstr('abcdefgh',-2,-1) => 'g'substr('abcdefgh',-2,-4) => null

    there could be an alternative (b2)

    -- b2) count to the left of the positionsubstr('abcdefgh',-2,-1) => 'f'
    substr('abcdefgh',-2,-4) => 'cdef'

    I think this needs careful consideration.

    positive position, negative length

    substr('abcdefgh',4,-2) => 'ef'substr('abcdefgh',7,-2) => null

    One major issue I see is that this suggestion could break existing (slightly strange) code.

    Currently

    substr('abcdefgh', 1, -1) => null

    While it doesn't make much sense to have a negative length, there is a defined result. The suggestion would change this result.

    So maybe yet an additional parameter is needed that influences the length behaviour.

    substr(text, position, length, length_mode)

    Only when length_mode is set to 'new behaviour' then negative lengths work as I suggested.

    -- edit --

    result of examples changed.

  • FatMartinR
    FatMartinR Member Posts: 190 Blue Ribbon

    Excellent point.

    I think there are two possible options.

    a) raise an error if both parameters are negative => not my favorite solution

    b) allow both parameters to be negative

    b1) start the counting always from the back of the string

    b2) start the counting from the resulting position

    My first idea was more in line with b2). However this results in some effects that might be more surprising than b1).

    There also could be some ambiguity which needs resolving.

    Some examples:

    negative position, negative length

    -- b1) count from the end of the stringsubstr('abcdefgh',-2,-1) => 'g'substr('abcdefgh',-2,-4) => null

    there could be an alternative (b2)

    -- b2) count to the left of the positionsubstr('abcdefgh',-2,-1) => 'f'
    substr('abcdefgh',-2,-4) => 'cdef'

    I think this needs careful consideration.

    positive position, negative length

    substr('abcdefgh',4,-2) => 'ef'substr('abcdefgh',7,-2) => null

    One major issue I see is that this suggestion could break existing (slightly strange) code.

    Currently

    substr('abcdefgh', 1, -1) => null

    While it doesn't make much sense to have a negative length, there is a defined result. The suggestion would change this result.

    So maybe yet an additional parameter is needed that influences the length behaviour.

    substr(text, position, length, length_mode)

    Only when length_mode is set to 'new behaviour' then negative lengths work as I suggested.

    -- edit --

    result of examples changed.

    My point was actually this.    We already have your desired functionality.

    • If <span class="codeinlineitalic" style="font-weight: inherit; font-style: italic; font-family: inherit;">position</span> is negative, then Oracle counts backward from the end of <span class="codeinlineitalic" style="font-weight: inherit; font-style: italic; font-family: inherit;">char</span>.
  • Thorsten Kettner
    Thorsten Kettner Member Posts: 42 Red Ribbon

    My point was actually this.    We already have your desired functionality.

    • If <span class="codeinlineitalic" style="font-weight: inherit; font-style: italic; font-family: inherit;">position</span> is negative, then Oracle counts backward from the end of <span class="codeinlineitalic" style="font-weight: inherit; font-style: italic; font-family: inherit;">char</span>.

    No, we don't. At least not in a convienient way with a single call to substr and mere number literals as position and length parameters.

    Sven wants select substr('abcdefgh', 1, -2) from dual; to get 'abcdef', while select substr('abcdefgh', -2, -2) from dual; gets 'gh' which is the opposite.

    Sven W.
  • FatMartinR
    FatMartinR Member Posts: 190 Blue Ribbon

    No, we don't. At least not in a convienient way with a single call to substr and mere number literals as position and length parameters.

    Sven wants select substr('abcdefgh', 1, -2) from dual; to get 'abcdef', while select substr('abcdefgh', -2, -2) from dual; gets 'gh' which is the opposite.

    Oh yeah.   I see what you mean now.

    Yeah, why not.   It's short-hand I suppose, so goes to make for more easily read code.

    Sven W.