8 Replies Latest reply: Jan 22, 2012 8:40 AM by 912516

strcpy corrupting data

908674 Newbie
The following code produces strange behaviour in a specific context.
It is used to shift an array of char to the left.

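#include <cstring>    // std::strcpy
#include <iostream>

int main(int argc, char** argv)
{
    char* p = &argv[1][0];          // assumes one argument was supplied
    std::cout << p << std::endl;
    std::strcpy(p, p + 1);          // source and destination overlap
    std::cout << p << std::endl;
    return -1;
}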

It produces the following output when compiled with CC -m64
on a system whose /etc/release reads:
Oracle Solaris 10 9/10 s10x_u9wos_14a X86
Copyright (c) 2010, Oracle and/or its affiliates. All rights reserved.
Assembled 11 August 2010

a.out "%%ABCDEF%%123"
%%ABCDEF%%123
%ABCDF%%%123

It works with CC -m32.
It works in both 32-bit and 64-bit builds on:
Solaris 10 10/09 s10x_u8wos_08a X86
Copyright 2009 Sun Microsystems, Inc. All Rights Reserved.
Use is subject to license terms.
Assembled 16 September 2009

The case above is one where it fails,
but the failure depends on the input.

For instance,
a.out "123456789012"
123456789012
23456789012

works fine, which suggests there is nothing wrong with using strcpy the way shown above. It has been used like this for years.
I suspect something is wrong in the 64-bit libc of Solaris 10 u9 on x86.

Has anyone come across this before, and
does anyone know whether it has been corrected?

thanks
do
  • 1. Re: strcpy corrupting data
    Steve_Clamage Pro
    Function strcpy requires disjoint source and target arrays. The effect of trying to copy overlapping ranges, as in your case, is undefined. Reference: C standard, section 7.21.2.3. (C++ inherits string functions and their semantics from C.)
    The memcpy function has the same restriction, but the memmove function allows overlapping ranges.

    The strcpy and memcpy functions have the non-overlapping restriction to allow for the most efficient copying possible. Copying can start at either end of the arrays, and multiple bytes can be copied at one time.

    When you need to copy overlapping objects, use memmove. It works correctly whether ranges overlap or not.
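
A minimal sketch of the memmove approach for the shift-left case discussed above. The helper name shift_left is hypothetical and does not come from the thread; it only assumes a NUL-terminated, writable string.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper (not from the thread): drops the first character of a
// NUL-terminated string in place. memmove is specified to handle overlap.
void shift_left(char* s)
{
    assert(s != nullptr);
    if (*s == '\0') {
        return;                     // empty string: nothing to shift
    }
    // Move the remaining characters plus the terminating NUL one byte down.
    std::memmove(s, s + 1, std::strlen(s + 1) + 1);
}
```

Called on a writable buffer holding "%%ABCDEF%%123", this leaves "%ABCDEF%%123" in the buffer, regardless of string length or alignment.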
  • 2. Re: strcpy corrupting data
    rajp Newbie
    The dynamic memory access checking feature of Code Analyzer in the latest Studio tools would report these kinds of errors. For example in this case, it would have warned you about the overlapping memory areas.

    Raj
  • 3. Re: strcpy corrupting data
    908674 Newbie
    I understand your point perfectly. I am well aware of overlapping ranges and that memmove should be used. Nevertheless, I find the difference in behaviour between different values (same length) very strange. I believe there is a regression somewhere.
  • 4. Re: strcpy corrupting data
    Steve_Clamage Pro
    "Undefined behavior" means you cannot expect any particular result, ever. Results can vary with any change to anything -- the compile-time platform, the run-time platform, the compiler version, compiler options, the OS, the OS update level, the run-time inputs to the application, etc.

    In your example, the strings that behaved differently were not the same length, at least as shown in your first post.
    %%ABCDEF%%123
    123456789012
    The algorithm used in the Solaris version of strcpy is sensitive to the length of the string.

    A change in behavior when behavior is undefined does not constitute a regression -- because no particular result could reasonably be expected in the first place.
  • 5. Re: strcpy corrupting data
    908674 Newbie
    My mistake about the length. I would have noticed the similar behaviour if I had paid more attention.
    Unfortunately, that does not help much. The code is part of legacy code, which means it has been used for years on various OSes (7 in total) and is working on site. Though it should not have been coded like this in the first place, it worked on Solaris 10 x86 up to update 8 and fails on update 9 (not tested with update 10), so Solaris has altered the implementation. Even though I cannot rely on undefined behaviour, I cannot help considering it a regression. For now, I have to advise against upgrading the OS without careful consideration.
    Note, as mentioned before, that it works when compiled in 32-bit. Is the algorithm different between 32-bit and 64-bit? Surprisingly or not, yes.
    The change of behaviour appears at a length of 13, which I find strange as well.

    Edited by: 905671 on Jan 4, 2012 1:48 AM

    Edited by: 905671 on Jan 4, 2012 1:59 AM
  • 6. Re: strcpy corrupting data
    Steve_Clamage Pro
    As you know, strcpy is in libc.so.1, distributed as part of Solaris. Although the interface to standard functions is stable, the implementation can vary to fix bugs (unlikely for strcpy) or to improve performance (likely for strcpy). The implementation varies not only for SPARC vs x86, but also for 32-bit vs 64-bit. Critical small functions like strcpy are written in assembler.

    Sometimes at high optimization levels, small system functions will be generated inline by the compiler code generator instead of by calling the library function. The compiler won't do that unless it can know that the generated code will be faster. Example: copying a literal string -- the size and alignment are known at compile time.

    On modern CPUs, you get a performance gain if you can copy a word or double word at a time instead of a byte at a time. The strcpy function does in fact check for alignment and whether the end of the string is at the end of a double-word, word, or half-word, and copies according to what it finds. The effect of strcpy(p, p+1) can therefore be different when the length is a multiple of 4 (or 8) bytes than when it is not, and can be different depending on the alignment of the start of the string.
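
To illustrate how block-wise copying can interact with overlapping ranges, here is a toy routine. It is NOT the Solaris strcpy, and the strategy is an assumption chosen for illustration: a range of 8 to 16 bytes is copied as two 8-byte blocks, the second aligned to the end of the range. The name copy_head_and_tail is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Illustrative only: NOT the Solaris strcpy. Copies n bytes (8 <= n <= 16)
// as two 8-byte blocks, the second one aligned to the end of the range.
// The result is correct when dst and src do not overlap.
void copy_head_and_tail(char* dst, const char* src, std::size_t n)
{
    assert(dst != nullptr && src != nullptr);
    assert(n >= 8 && n <= 16);
    unsigned char block[8];                 // stands in for a 64-bit register

    std::memcpy(block, src, 8);             // load the first 8 bytes
    std::memcpy(dst, block, 8);             // store them

    std::memcpy(block, src + n - 8, 8);     // load the last 8 bytes (after the first store)
    std::memcpy(dst + n - 8, block, 8);     // store them
}
```

With disjoint buffers this copies correctly. Called as copy_head_and_tail(p, p + 1, 13) on "%%ABCDEF%%123", the second load reads bytes that the first store has already overwritten, and the buffer ends up as "%ABCDF%%%123". It is only a toy: it would also garble the 12-character input that worked in the thread, so it should not be read as a model of the actual libc routine.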

    If you have to use legacy code that you cannot modify, and if the code depends on undefined behavior and works only by accident of implementation, you cannot update any system software (compiler, linker, libraries) the code depends on without first doing thorough testing. In such cases, it would be worthwhile IMHO to figure out how to modify the code, or to replace it with code that you can maintain.
  • 7. Re: strcpy corrupting data
    Darryl Gove Newbie
    Hi,

    I'd be very concerned that the application relies on a particular variety of "undefined behaviour". The problem in these situations is not that the application fails on a particular Solaris update; it is that the failure may not be detected on other OS versions or patch levels. You should be able to use LD_PRELOAD[_32|_64] to load a library that provides the desired strcpy behaviour. This will enable you to avoid both detected and undetected issues with the legacy application.

    You can see some of the implementations of strcpy at opensolaris.org: http://src.opensolaris.org/source/search?q=&project=onnv&defs=strcpy&refs=&path=&hist=

    Regards,

    Darryl.
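
A sketch of what the core of such a preloaded replacement might look like. The function name overlap_safe_strcpy is hypothetical; in an actual preload library the function would have to be exported under the name strcpy with C linkage, and how the library is built is left out here.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical name: a real preload library would export this as `strcpy`
// with C linkage so that the application's calls resolve to it.
extern "C" char* overlap_safe_strcpy(char* dst, const char* src)
{
    assert(dst != nullptr && src != nullptr);
    // Measure the source first, then let memmove deal with any overlap.
    // The +1 copies the terminating NUL as well.
    std::memmove(dst, src, std::strlen(src) + 1);
    return dst;                             // same return convention as strcpy
}
```

This trades some speed for predictable results on overlapping arguments, which is the behaviour the legacy code has been relying on.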
  • 8. Re: strcpy corrupting data
    912516 Newbie
    You are writing into the memory area (argv) that is provided by the implementation. The C standard does permit a program to modify the strings that argv points to, but the storage is not yours to resize or reuse.

    If we want to modify and use that data freely, it is safer to copy the parts we need into char arrays that we define ourselves.
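
A minimal sketch of working on a private copy instead of on argv itself, here using std::string rather than a raw char array. The helper name without_first_char is hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: copies a command-line argument into storage owned by
// the program and removes the first character from that copy.
std::string without_first_char(const char* arg)
{
    assert(arg != nullptr);
    std::string copy(arg);          // private copy; the argv string is untouched
    if (!copy.empty()) {
        copy.erase(0, 1);           // drop the first character
    }
    return copy;
}
```

In the original program this could be used as std::string s = without_first_char(argv[1]); after checking that argc is at least 2.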

    Also, if we want to copy data within a char array, memmove is preferred, in this style:

    /* memmove example */
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char str[] = "memmove can be very useful......";
        memmove(str + 20, str + 15, 11);   /* overlapping ranges are allowed */
        puts(str);
        return 0;
    }

    I hope this helps.

    Edited by: 909513 on Jan 22, 2012 6:40 PM
