Function strcpy requires disjoint source and target arrays. The effect of trying to copy overlapping ranges, as in your case, is undefined. Reference: the description of strcpy in the C standard (section 7.21.2.3 in C99). (C++ inherits string functions and their semantics from C.)
The memcpy function has the same restriction, but the memmove function allows overlapping ranges.
The strcpy and memcpy functions have the non-overlapping restriction to allow for the most efficient copying possible. Copying can start at either end of the arrays, and multiple bytes can be copied at one time.
When you need to copy overlapping objects, use memmove. It works correctly whether ranges overlap or not.
"Undefined behavior" means you cannot expect any particular result, ever. Results can vary with any change to anything -- the compile-time platform, the run-time platform, the compiler version, compiler options, the OS, the OS update level, the run-time inputs to the application, etc.
In your example, the strings that behaved differently were not the same length, at least as shown in your first post.
The algorithm used in the Solaris version of strcpy is sensitive to the length of the string.
A change in behavior when behavior is undefined does not constitute a regression -- because no particular result could reasonably be expected in the first place.
My mistake about the length. I would have noticed the similar behaviour if I had paid more attention.
Unfortunately, it does not help much. The code in question is part of a legacy code base, which means it has been used for years on various operating systems (7 in total) and is working on site. Though it shouldn't have been coded like this in the first place, it worked on Solaris 10 x86 up to update 8 and does not work in update 9 (not tested with update 10), so Solaris has altered the implementation. Even though I cannot rely on "undefined behaviour", I cannot help considering it a regression. Now I have to advise against upgrading the OS without careful consideration.
Note, as mentioned before, that it works when compiled in 32-bit. Is the algorithm different between 32-bit and 64-bit? Surprisingly or not, yes.
It appears that a change of behaviour takes place with a length of 13, which I consider strange as well.
Edited by: 905671 on Jan 4, 2012 1:48 AM
Edited by: 905671 on Jan 4, 2012 1:59 AM
As you know, strcpy is in libc.so.1, distributed as part of Solaris. Although the interface to standard functions is stable, the implementation can vary to fix bugs (unlikely for strcpy) or to improve performance (likely for strcpy). The implementation varies not only for SPARC vs x86, but for 32-bit vs 64-bit. Critical small functions like strcpy are written in assembler.
Sometimes at high optimization levels, small system functions will be generated inline by the compiler code generator instead of by calling the library function. The compiler won't do that unless it can know that the generated code will be faster. Example: copying a literal string -- the size and alignment are known at compile time.
On modern CPUs, you get a performance gain if you can copy a word or double word at a time instead of a byte at a time. The strcpy function does in fact check for alignment and whether the end of the string is at the end of a double-word, word, or half-word, and copies according to what it finds. The effect of strcpy(p, p+1) can therefore be different when the length is a multiple of 4 (or 8) bytes than when it is not, and can be different depending on the alignment of the start of the string.
If you have to use legacy code that you cannot modify, and if the code depends on undefined behavior and works only by accident of implementation, you cannot update any system software (compiler, linker, libraries) the code depends on without first doing thorough testing. In such cases, it would be worthwhile IMHO to figure out how to modify the code, or to replace it with code that you can maintain.
I'd be very concerned that the application relies on a particular variety of "undefined behaviour". The problem in these situations is not that the application fails on a particular Solaris update; it is that the failure may go undetected on other OS versions or patch levels. You should be able to use LD_PRELOAD[_32|_64] to load a library that provides the desired strcpy behaviour. This will enable you to avoid both detected and undetected issues with the legacy application.
You can see some of the implementations of strcpy at opensolaris.org: http://src.opensolaris.org/source/search?q=&project=onnv&defs=strcpy&refs=&path=&hist=
905671 wrote:
the following code produces strange behaviour in a specific context.
It's used to shift an array of char to the left
#include <string.h>
#include <iostream>
int main( int argc, char** argv)
{
char* p = argv[1]; // subscript assumed here; it was lost from the post
std::cout << p << std::endl;
strcpy( p, p+1);
std::cout << p << std::endl;
}
produces the following output when it's compiled with CC -m64
Oracle Solaris 10 9/10 s10x_u9wos_14a X86
Copyright (c) 2010, Oracle and/or its affiliates. All rights reserved.
Assembled 11 August 2010
You are trying to write to a memory area (argv) provided and reserved by the implementation (any C implementation), that is meant only to be read.
If we want to modify and use its data, we have to create one or more copies of the data we want, in char arrays defined by us.
Also, if we want to copy data inside a char array, memmove is preferred, in the style:
/* memmove example */
char str[] = "memmove can be very useful......";
I hope these help.
Edited by: 909513 on Jan 22, 2012 6:40 PM