Forum Stats

  • 3,741,299 Users
  • 2,248,407 Discussions
  • 7,861,732 Comments

Discussions

Equivalents To SQL Server's STUFF() Function

Ray Winkelman
Ray Winkelman Member Posts: 75 Green Ribbon
edited Jul 28, 2016 5:16PM in SQL & PL/SQL

Hey Guys!

Let's have some fun with strings today.

So, I've got some T-SQL code I'm translating into PL-SQL, and I've been encountering some uses of SQL Server's STUFF() function. Microsoft's documentation on the STUFF() function states:

The STUFF function inserts a string into another string. It deletes a specified length of characters in the first string at the start position and then inserts the second string into the first string at the start position.

The header of the STUFF() function accepts the following arguments:

STUFF ( character_expression , start , length , replaceWith_expression )


So, considering the following code as our example, how can we manipulate a string in PL-SQL with as few calls to system functions as possible? Let's aim for the most performant solution - as always.

select stuff( 
( 
    select '123456789' 
    from   table_name for clob path('')
), 1, 4, ''
) as example 
from dual;

And, GO!

John StegemanFrank KulashRay WinkelmanEtbinAndrewSayerJonathan HultUser_D1JQ2

Best Answer

  • Paulzip
    Paulzip Member Posts: 8,370 Blue Diamond
    edited Jul 28, 2016 1:05PM Accepted Answer

    create or replace function Stuff(pExpr varchar2,

                                     pStart integer,

                                     pLength integer,

                                     pReplace varchar2) return varchar2 is

    begin

      return case

               when pStart is not null and pLength is not null and

                    pStart > 0 and pStart < length(pExpr) then

                 substr(pExpr, 1, pStart - 1)||pReplace||substr(pExpr, pStart + pLength)

               else null

             end; 

    end;

    Ray WinkelmanJonathan Hult

Answers

  • John Stegeman
    John Stegeman Member Posts: 24,269 Blue Diamond
    edited Jul 28, 2016 1:01PM

    INSTR() and SUBSTR() would work just fine edit: DOH, what was I thinking?!?! Saw Frank's reply and realized that I typed without thinking. @Frank Kulash, we also need ||

    How it would work is left as an exercise for the reader.

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 40,727 Red Diamond
    edited Jul 28, 2016 1:03PM

    Hi,

    To get the same results in Oracle, SUBSTR is the only function you need.  In general, you'll have to call that function twice: once to get the part of the original string before start, and once to get the part after (start + length).

    If you really, really wanted to minimize the number of function calls, you can do it with a single call to REGEXP_REPLACE, but it would require more || operations.  Since the || operrator does the same thing as the CONCAT function, I assume you'd want to minimize them, too.

    John Stegeman
  • Paulzip
    Paulzip Member Posts: 8,370 Blue Diamond
    edited Jul 28, 2016 1:05PM Accepted Answer

    create or replace function Stuff(pExpr varchar2,

                                     pStart integer,

                                     pLength integer,

                                     pReplace varchar2) return varchar2 is

    begin

      return case

               when pStart is not null and pLength is not null and

                    pStart > 0 and pStart < length(pExpr) then

                 substr(pExpr, 1, pStart - 1)||pReplace||substr(pExpr, pStart + pLength)

               else null

             end; 

    end;

    Ray WinkelmanJonathan Hult
  • BluShadow
    BluShadow Member, Moderator Posts: 41,088 Red Diamond
    edited Jul 28, 2016 1:04PM

    Or it could be done using a single regexp_replace function, but again, if this isn't a question about an issue, I'll leave it up to those who want to have the exercise.

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 40,727 Red Diamond
    edited Jul 28, 2016 1:17PM

    Hi,

    John Stegeman wrote:
    
    Frank Kulash, we also need || 
    ...
    
    

    || is an operator, not a function.

    As OP phrased it, this sounds like a question to test how clever you are and how well you know Oracle SQL, regardless of any practical uses.

    Of course, OP may have phrased the question poorly, and may actually want a simple and/or efficient way, regardless of what tools and techniques are used.

    John Stegeman
  • Ray Winkelman
    Ray Winkelman Member Posts: 75 Green Ribbon
    edited Jul 28, 2016 1:30PM

    Hey guys.

    To clear up some ambiguity here: I'm looking for the least processing intensive drop-in replacement for the STUFF() method. Paulzip provided exactly what I was looking for, bravo!

    Is anyone willing to contribute the regex solution? I'd like to compare the explain plan of both routines. Please wrap the regex solution in a function called STUFF() like Paulzip did.

    This may seem like a trivial question to some of you veterans, but it's important to me that this function is optimal, as I'll be nesting calls to this in queries expected to return thousands of rows.

    Here's Paulzip's plan, explained:

    -----------------------------------------------------------------

    | Id  | Operation        | Name | Rows  | Cost (%CPU)| Time     |

    -----------------------------------------------------------------

    |   0 | SELECT STATEMENT |      |     1 |     2   (0)| 00:00:01 |

    |   1 |  FAST DUAL       |      |     1 |     2   (0)| 00:00:01 |

    -----------------------------------------------------------------

  • John Stegeman
    John Stegeman Member Posts: 24,269 Blue Diamond
    edited Jul 28, 2016 1:45PM

    regexp will NOT be the one you want. Guaranteed.

    Oh, and by the way - your execution plan has nothing to do with the function. I guess you did "select stuf(....) from dual" You'll get the same execution plan with a stuff() function written to use regexp.

    Frank KulashRay WinkelmanAndrewSayerRay Winkelman
  • Ray Winkelman
    Ray Winkelman Member Posts: 75 Green Ribbon
    edited Jul 28, 2016 1:58PM

    Yeah?

    Although it wasn't a verbosely explained plan, I thought that the statistics were inclusive of the function call. Am I wrong?

    Should I execute plan the statement instead?

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 40,727 Red Diamond
    edited Jul 28, 2016 1:59PM

    Hi,

    John Stegeman wrote:
    
    ... your execution plan has nothing to do with the function. I guess you did "select stuf(....) from dual" You'll get the same execution plan with a stuff() function written to use regexp.
    

    Good point!

    OP: the execution plan describes how rows are fetched.  In most cases, how the rows are fetched has much more to do with efficiency than what you do with the rows after you get them.  However, what you do with the rows after you get them can be significant.

    Imagine you want to travel to another country, and have lunch after you get there.  If you're concerned about cost, you'll focus on how you get to the other country: a first-class plane ticket might cost a lot more than a second-class train ticket.  Where you have lunch and what you eat will make a difference, too, but probably not nearly as much.

    Ray WinkelmanRay Winkelman
  • John Stegeman
    John Stegeman Member Posts: 24,269 Blue Diamond
    edited Jul 28, 2016 2:02PM

    Explain plan is not going to be useful in evaluating the performance of a function (as long as the function isn't itself doing a bunch of SQL).

    If you wanted to measure performance, I'd be doing something like using v$sesstat snapshots before and after executing the function a few (dozens, hundreds, thousands, depending on the function) times and seeing how much CPU is used and also measuring the elapsed times.

    Ray WinkelmanRay Winkelman
  • CarlosDLG
    CarlosDLG Member Posts: 1,361 Silver Trophy
    edited Jul 28, 2016 2:03PM
    Is anyone willing to contribute the regex solution?
    

    For your own learning, you should try to build the regexp version.  You can then post it here to get feedback.

    John StegemanRay Winkelman
  • BluShadow
    BluShadow Member, Moderator Posts: 41,088 Red Diamond
    edited Jul 28, 2016 2:19PM
    John Stegeman wrote:
    
    regexp will NOT be the one you want. Guaranteed.
    
    

    Maybe... maybe not.

    We know regexp functions are more CPU intensive in general, but the payoff between a single fairly simple regexp compared to several regular string functions may be ok... it depends on what you're doing.  Certainly if it's just for a small set of data, the chances are you won't notice any real difference.

  • Ray Winkelman
    Ray Winkelman Member Posts: 75 Green Ribbon
    edited Jul 28, 2016 2:22PM

    Provided I thought that the solution will be a long one, I would do that. 

    It's because I know that the solution will be a short one, I know that critiquing my efforts will almost certainly require a greater collective effort than someone who's better versed in the regex library posting a proficient solution.

    Trust me. I'm just a monkey with a keyboard.

    On another note, I trust John entirely (And my own intuitions) when he says:

    regexp will NOT be the one you want. Guaranteed.
    

    So it's time to mark this sucker as answered!

    I only asked for the regex example because it was entertained earlier in the thread, and if provided, we would have exhausted all (the discussed) possible solutions to my problem.

  • Etbin
    Etbin Member Posts: 8,968 Gold Crown
    edited Jul 28, 2016 2:29PM

    More fuction calls might provide more not null results.

    Replacing nvl(p_start,0) with case when p_start is null then 0 else p_start end and accordingly might be faster.


    with

    data as

    (select '123456789' p_expr,3 p_start,5 p_length,'***' p_stuff from dual union all

    select null,null,null,null from dual union all

    select null,null,null,'123456789' from dual union all

    select null,0,0,'123456789' from dual union all

    select null,0,3,'123456789' from dual union all

    select null,3,0,'123456789' from dual union all

    select '+~+~+~+~',0,0,'123456789' from dual union all

    select '+~+~+~+~',3,0,'123456789' from dual union all

    select '+~+~+~+~',0,3,'123456789' from dual union all

    select 'ABCDEFGH',3,11,'abcdefgh' from dual union all

    select 'ABCDEFGH',3,3,'abcdefgh' from dual union all

    select 'ABCDEFGH',3,3,null from dual union all

    select 'ABCDEFGHIJK',-5,3,'!!!' from dual union all

    select 'ABCDEFGHIJK',-5,-3,'!!!' from dual union all

    select 'ABCDEFGHIJK',5,-3,'!!!' from dual union all

    select '987654321',null,null,'123456789' from dual union all

    select '987654321',null,null,null from dual

    )

    select p_expr,p_start,p_length,p_stuff,

           case when p_start is not null

                 and p_length is not null

                 and p_start > 0

                 and p_start < length(p_expr)

                then substr(p_expr,1,p_start - 1)||p_stuff||substr(p_expr,p_start + p_length)

               else null

             end result,

             substr(p_expr,1,nvl(p_start,0) - 1)||p_stuff||substr(p_expr,nvl(p_start,0) + nvl(p_length,0)) variant

      from data

    P_EXPRP_STARTP_LENGTHP_STUFFRESULTVARIANT
    12345678935***12***8912***89
    ------
    ---123456789-123456789
    -00123456789-123456789
    -03123456789-123456789
    -30123456789-123456789
    +~+~+~+~00123456789-123456789+~+~+~+~
    +~+~+~+~30123456789+~123456789+~+~+~+~123456789+~+~+~
    +~+~+~+~03123456789-123456789+~+~+~
    ABCDEFGH311abcdefghABabcdefghABabcdefgh
    ABCDEFGH33abcdefghABabcdefghFGHABabcdefghFGH
    ABCDEFGH33-ABFGHABFGH
    ABCDEFGHIJK-53!!!-!!!JK
    ABCDEFGHIJK-5-3!!!-!!!DEFGHIJK
    ABCDEFGHIJK5-3!!!ABCD!!!BCDEFGHIJKABCD!!!BCDEFGHIJK
    987654321--123456789-123456789987654321
    987654321----987654321

    Regards

    Etbin

    Ray WinkelmanRay WinkelmanJonathan Hult
  • John Stegeman
    John Stegeman Member Posts: 24,269 Blue Diamond
    edited Jul 28, 2016 2:25PM

    Never trust, always verify

    At any rate, substr and concatenation are pretty simple operations, so I still believe it will be faster than regexp, Blu's point notwithstanding

    @BluShadow what about the overhead of having to parse/compile the regular expression itself? My "gut feel" is that just that part of the work will take more resource than the simple substr/||

    Ray WinkelmanAndrewSayer
  • Paulzip
    Paulzip Member Posts: 8,370 Blue Diamond
    edited Jul 28, 2016 2:28PM

    A regular expression solution will be slower, I can pretty much guarantee that. I forgot to add deterministic clause to my function which would allow result caching, so would be faster if the same inputs occurred.

    I'd be very surprised if you could create something faster, as any solution would still need to use substr.

    EtbinRay Winkelman
  • John Stegeman
    John Stegeman Member Posts: 24,269 Blue Diamond
    edited Jul 28, 2016 2:28PM

    and FYI: Google found the same function

  • Ray Winkelman
    Ray Winkelman Member Posts: 75 Green Ribbon
    edited Jul 28, 2016 2:45PM

    I love when an actually smart person contributes useful information in a place you would expect smart people to contribute useful information.

    You rock, Etbin.

  • BluShadow
    BluShadow Member, Moderator Posts: 41,088 Red Diamond
    edited Jul 28, 2016 3:30PM

    regexp is certainly likely to be slower in most cases, but for the purpose of illustrating (I don't have a database to hand to test at the minute)...

    It would be something like:

    regexp_replace(source_string, '^(.{4}).{3}(.*)$', '\1STUFFSTRING\2')


    So, in this example, the "4" is the number of characters to keep on the left

    The "3" is the number of characters to discard

    and the "STUFFSTRING" is the string to put in place of the 3 characters removed.

    Whilst regexp are generally considered CPU intensive, this pattern is relatively straightforward.

    The first 4 character are easy to parse,  The next 3 characters are easy to parse and the remainder of the string are easy to parse.

    The replacement is just including the first 4 character and the remainder of the string around the string you want to include in place of the 3 that are removed.

    The use of "^" to indicate the start of the string and the "$" to indicate the end of the string will help with parsing as it provides definite anchor points for the parsing to consider.

    Certainly, other regular expression patterns can be more complex and more CPU intensive, especially if they include "|" (or), and/or variable string length patterns etc.

    The best way to determine what's more performant between regexp and regular string functions would be to test it on a good volume of data.

    Ray WinkelmanRay Winkelman
  • BluShadow
    BluShadow Member, Moderator Posts: 41,088 Red Diamond
    edited Jul 28, 2016 3:35PM
    John Stegeman wrote:
    
     BluShadow what about the overhead of having to parse/compile the regular expression itself? My "gut feel" is that just that part of the work will take more resource than the simple substr/||
    

    Overhead of parsing, whether it's code or string parsing, always depends on the complexity of what you're parsing.  Some things are simple and parsed quickly, other things are complex and slow.

    In this 'challenge' the obtaining of a fixed set of characters from the left, followed by a fixed set of characters to discard, followed by the remaining characters is not complex.  Sure, SUBSTR can use relative byte positioning rather than having to do much in the way of parsing, but the difference I suspect will be negligible unless we're processing large volumes of data.

  • Paulzip
    Paulzip Member Posts: 8,370 Blue Diamond
    edited Jul 28, 2016 3:43PM
    Etbin wrote:
    
    More fuction calls might provide more not null results. 
    Replacing nvl(p_start,0) with case when p_start is null then 0 else p_start end and accordingly might be faster.
    Regards Etbin

    Mine might be more nulls, but it matches the SQL Server notes on the operation and parameters of Stuff function!

    Ray WinkelmanRay Winkelman
  • Etbin
    Etbin Member Posts: 8,968 Gold Crown
    edited Jul 28, 2016 5:16PM

    Until now I was only aware SQL Server exists and most probably such Stuff function parameters specifications (actually limitations) and notes on operation are just another reason to let the before mentioned awareness remain the same.

    Might not be irrelevant to the OP though, but at least in principle, migrating from SQL Server to Oracle requiring Oracle to work according to SQL Server specifications sounds weird (at least to me).

    A kind of modify-everything-without-changing-anything.

    Regards

    Etbin

This discussion has been closed.