Skip to Main Content

Hardware

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

vdbench: Repeatable false negative validation errors

4207177Mar 9 2020 — edited Mar 9 2020

Failing to find an apparent bug in our system, I decided to go back to basics and test the

tester, vdbench, in this case.

It looks like I've hit a bug where data validation errors are spuriously generated. Collegues

have verified this is other systems my sanity, also seeing similar same results. The following

reproducing test uses a local file for the device/lun for each run (the underlying filesytem

seems to make no difference to the results).

Vdbench is vdbench50407.

For this test, only iorate is changed to trigger the issue. When iorate is < ~176 test runs

generally succeed, and generally fail above this, the results vary. Each failure is an apparent

data validation error as reported by vdbench.

Unless we've missed something, as far as we understand, all these tests should pass.

To reproduce (with all files produced under sub directory $top/), with the assumption of free

disk space, free ports, and available memory for jvms:

{

    typeset top="$(pwd)/test.vdbench"

    mkdir -p "$top"

    find "$top/" -depth -mindepth 1 -delete

    cat <<. >"$top/parmfile"

data_errors=1

journal=noflush

validate=read

sd=sd1,openflags=o_direct,lun=\$lun,journal=\$journal

wd=wd1,sd=sd1,seekpct=eof,rdpct=50

rd=rd1,warmup=0,elapsed=10,interval=2,maxdata=0,iorate=\$iorate,threads=1,wd=wd1,xfersize=2.5k

.

    typeset -a tests=()

    for iorate in 1 4 16 64 $(seq 128 2 256) 257 1024 2048

    do

        (

            typeset test="$top/test.iorate-$iorate"

            mkdir -p "$test"

            truncate -s 1G "$test/dev"

            vdbench -p "$((5070 + $iorate))" -o "$test" -f "$top/parmfile" "lun=$test/dev" "journal=$test" "iorate=$iorate" -vq

            typeset rc="${?#0}"

            printf "%s %s\\n" "$iorate" "${rc:+fail:rc=}${rc:-success}" >>"$test.rc"

        ) &

        tests+=("$!")

        ((${#tests[@]} < 4)) || { wait "${tests[0]}"; tests=("${tests[@]:1}"); }

    done

    wait

    printf "%8s %s\\n" "iorate" "test-result" $(cat "$top/"*".rc" | sort -n) | tee "$top/results"

}

Example results are:

  iorate test-result

       1 success

       4 success

      16 success

      64 success

     128 success

     130 success

     132 success

     134 success

     136 success

     138 success

     140 success

     142 success

     144 success

     146 success

     148 success

     150 success

     152 success

     154 fail:rc=157

     156 success

     158 success

     160 success

     162 success

     164 success

     166 success

     168 fail:rc=157

     170 fail:rc=157

     172 success

     174 fail:rc=157

     176 fail:rc=157

     178 fail:rc=157

     180 fail:rc=157

     182 fail:rc=157

     184 fail:rc=157

     186 fail:rc=157

     188 fail:rc=157

     190 fail:rc=157

     192 fail:rc=157

     194 fail:rc=157

     196 fail:rc=157

     198 fail:rc=157

     200 fail:rc=157

     202 fail:rc=157

     204 fail:rc=157

     206 fail:rc=157

     208 fail:rc=157

     210 fail:rc=157

     212 fail:rc=157

     214 fail:rc=157

     216 fail:rc=157

     218 fail:rc=157

     220 fail:rc=157

     222 fail:rc=157

     224 fail:rc=157

     226 fail:rc=157

     228 fail:rc=157

     230 fail:rc=157

     232 fail:rc=157

     234 fail:rc=157

     236 fail:rc=157

     238 fail:rc=157

     240 fail:rc=157

     242 fail:rc=157

     244 fail:rc=157

     246 fail:rc=157

     248 fail:rc=157

     250 fail:rc=157

     252 fail:rc=157

     254 fail:rc=157

     256 fail:rc=157

     257 fail:rc=157

    1024 fail:rc=157

    2048 fail:rc=157

A failed test run log example (local paths removed) is:

Vdbench error log, created 09:01:35 Mar 09 2020 UTC

Error log. If there are no error messages beyond this line then there were no Data Validation or I/O errors.

09:01:36.416 localhost-0: Created new mmap file: .../test.vdbench/test.iorate-154/vdbench.sd1.mmap.0 mmap size: 409.602k

09:01:36.418 localhost-0: Allocating Data Validation map: 419,432 one-byte entries for each 2,560-byte block.

09:01:36.419 localhost-0: Created new DV map: sd1 size: 1,073,741,824 bytes; key block size: 2560; entries: 419,430

09:01:36.420 localhost-0: Creating journal file .../test.vdbench/test.iorate-154/sd1.jnl

09:01:36.426 localhost-0: Writing Data Validation map to .../test.vdbench/test.iorate-154/sd1.jnl

09:01:36.530 localhost-0: Writing Data Validation map to .../test.vdbench/test.iorate-154/sd1.map

09:01:42.885 localhost-0: 09:01:42.876

09:01:42.885 localhost-0: 09:01:42.877 Time of first corruption: Mon Mar 09 2020 09:01:42.876 UTC

09:01:42.885 localhost-0: 09:01:42.877

09:01:42.885 localhost-0: 09:01:42.879 At least one Data Validation error has been detected.                              

09:01:42.885 localhost-0: 09:01:42.879

09:01:42.885 localhost-0: 09:01:42.879 Terminology:                                                                       

09:01:42.885 localhost-0: 09:01:42.879 - Data block: a block of xfersize= bytes.                                          

09:01:42.885 localhost-0: 09:01:42.879 - Key block:  the smallest xfersize specified by the user which is the unit of     

09:01:42.885 localhost-0: 09:01:42.879               data that Data Validation keeps track of.                            

09:01:42.885 localhost-0: 09:01:42.879 - Sector:     512 bytes of disk storage, regardless of actual storage sector size. 

09:01:42.886 localhost-0: 09:01:42.879 - Lba:        Logical Byte Address, not to be confused with Logical Block Address. 

09:01:42.886 localhost-0: 09:01:42.880

09:01:42.886 localhost-0: 09:01:42.880

09:01:42.886 localhost-0: 09:01:42.880 The output starts with a summary of a data block, followed by a summary of each    

09:01:42.886 localhost-0: 09:01:42.880 key block. If all sectors in a key block show a similar type of data corruption    

09:01:42.886 localhost-0: 09:01:42.880 only the FIRST sector of the key block will be reported.                           

09:01:42.886 localhost-0: 09:01:42.880 For all other cases, ALL sectors will be reported.                                 

09:01:42.886 localhost-0: 09:01:42.880

09:01:42.886 localhost-0: 09:01:42.880 Contents of the first 32 bytes of each sector:                                     

09:01:42.886 localhost-0: 09:01:42.880

09:01:42.886 localhost-0: 09:01:42.880 Byte 0x00 -  0x07: Byte offset of this disk block

09:01:42.886 localhost-0: 09:01:42.880 Byte 0x08 -  0x0f: Timestamp: number of milliseconds since 1/1/1970

09:01:42.887 localhost-0: 09:01:42.881 Byte 0x10        : Data Validation key from 1 - 126

09:01:42.887 localhost-0: 09:01:42.881 Byte 0x11        : Checksum of timestamp

09:01:42.887 localhost-0: 09:01:42.881 Byte 0x12 -  0x13: Reserved

09:01:42.887 localhost-0: 09:01:42.881 Byte 0x14 -  0x1b: SD or FSD name in ASCII hexadecimal

09:01:42.887 localhost-0: 09:01:42.881 Byte 0x1c -  0x1f: Owner ID when written

09:01:42.887 localhost-0: 09:01:42.881 Byte 0x20 - 0x1ff: 480 bytes of compression data pattern

09:01:42.887 localhost-0: 09:01:42.881

09:01:42.887 localhost-0: 09:01:42.881 On the left:  the data that was expected ('.' marks unknown value).

09:01:42.887 localhost-0: 09:01:42.881 On the right: the data that was found.

09:01:42.887 localhost-0: 09:01:42.881

09:01:42.895 localhost-0: 09:01:42.884

09:01:42.895 localhost-0: 09:01:42.884

09:01:42.895 localhost-0: 09:01:42.884

09:01:42.895 localhost-0: 09:01:42.885 Corrupted data block for sd=sd1,lun=.../test.vdbench/test.iorate-154/dev; lba: 2,096,640 (0x001ffe00) xfersize=2560

09:01:42.895 localhost-0: 09:01:42.885

09:01:42.895 localhost-0: 09:01:42.885 Data block has 1 key block(s) of 2560 bytes each.

09:01:42.895 localhost-0: 09:01:42.885 All key blocks are corrupted.

09:01:42.895 localhost-0: 09:01:42.885 Key block lba: 0x001ffe00

09:01:42.895 localhost-0: 09:01:42.887    Key block of 2,560 bytes has 5 512-byte sectors.

09:01:42.896 localhost-0: 09:01:42.887    Timeline:

09:01:42.896 localhost-0: 09:01:42.890    Mon Mar 09 2020 09:01:42.865 UTC Sector last written. (As found in the first corrupted sector, timestamp is taken just BEFORE the actual write).

09:01:42.896 localhost-0: 09:01:42.890    Mon Mar 09 2020 09:01:42.870 UTC Key block first found to be corrupted during a read-after-write.

09:01:42.896 localhost-0: 09:01:42.891

09:01:42.896 localhost-0: 09:01:42.891    4 of 5 sectors are corrupted.

09:01:42.896 localhost-0: 09:01:42.891    All 4 corrupted sectors will be reported.

09:01:42.896 localhost-0: 09:01:42.891

09:01:42.896 localhost-0: 09:01:42.891         Data Validation error for sd=sd1,lun=.../test.vdbench/test.iorate-154/dev

09:01:42.896 localhost-0: 09:01:42.891         Block lba: 0x001ffe00; sector lba: 0x00200000; Key block size: 2560; relative sector in data block: 0x01 ( 1)

09:01:42.896 localhost-0: 09:01:42.891         ===> Compression pattern miscompare.

09:01:42.896 localhost-0: 09:01:42.891         Timestamp found in sector: Mon Mar 09 2020 09:01:42.865 UTC

09:01:42.896 localhost-0: 09:01:42.891 0x000   00000000 00200000 ........ ........   00000000 00200000 00000170 be874c51

09:01:42.897 localhost-0: 09:01:42.893 0x010   52..0000 20316473 20202020 00000000   52530000 20316473 20202020 00000e48

09:01:42.897 localhost-0: 09:01:42.893         There are no mismatches in bytes 32-511

09:01:42.898 localhost-0: 09:01:42.895

09:01:42.898 localhost-0: 09:01:42.895         Data Validation error for sd=sd1,lun=.../test.vdbench/test.iorate-154/dev

09:01:42.898 localhost-0: 09:01:42.895         Block lba: 0x001ffe00; sector lba: 0x00200200; Key block size: 2560; relative sector in data block: 0x02 ( 2)

09:01:42.898 localhost-0: 09:01:42.896         ===> Compression pattern miscompare.

09:01:42.898 localhost-0: 09:01:42.896         Timestamp found in sector: Mon Mar 09 2020 09:01:42.865 UTC

09:01:42.899 localhost-0: 09:01:42.896 0x000   00000000 00200200 ........ ........   00000000 00200200 00000170 be874c51

09:01:42.899 localhost-0: 09:01:42.896 0x010   52..0000 20316473 20202020 00000000   52530000 20316473 20202020 00000e48

09:01:42.899 localhost-0: 09:01:42.896         There are no mismatches in bytes 32-511

09:01:42.900 localhost-0: 09:01:42.898

09:01:42.901 localhost-0: 09:01:42.898         Data Validation error for sd=sd1,lun=.../test.vdbench/test.iorate-154/dev

09:01:42.901 localhost-0: 09:01:42.898         Block lba: 0x001ffe00; sector lba: 0x00200400; Key block size: 2560; relative sector in data block: 0x03 ( 3)

09:01:42.901 localhost-0: 09:01:42.898         ===> Compression pattern miscompare.

09:01:42.901 localhost-0: 09:01:42.898         Timestamp found in sector: Mon Mar 09 2020 09:01:42.865 UTC

09:01:42.901 localhost-0: 09:01:42.898 0x000   00000000 00200400 ........ ........   00000000 00200400 00000170 be874c51

09:01:42.901 localhost-0: 09:01:42.899 0x010   52..0000 20316473 20202020 00000000   52530000 20316473 20202020 00000e48

09:01:42.901 localhost-0: 09:01:42.899         There are no mismatches in bytes 32-511

09:01:42.902 localhost-0: 09:01:42.900

09:01:42.902 localhost-0: 09:01:42.900         Data Validation error for sd=sd1,lun=.../test.vdbench/test.iorate-154/dev

09:01:42.902 localhost-0: 09:01:42.900         Block lba: 0x001ffe00; sector lba: 0x00200600; Key block size: 2560; relative sector in data block: 0x04 ( 4)

09:01:42.902 localhost-0: 09:01:42.900         ===> Compression pattern miscompare.

09:01:42.903 localhost-0: 09:01:42.900         Timestamp found in sector: Mon Mar 09 2020 09:01:42.865 UTC

09:01:42.903 localhost-0: 09:01:42.900 0x000   00000000 00200600 ........ ........   00000000 00200600 00000170 be874c51

09:01:42.903 localhost-0: 09:01:42.900 0x010   52..0000 20316473 20202020 00000000   52530000 20316473 20202020 00000e48

09:01:42.903 localhost-0: 09:01:42.901         There are no mismatches in bytes 32-511

09:01:42.906 localhost-0: 09:01:42.905 op: read   lun: .../test.vdbench/test.iorate-154/dev lba:      2096640 0x001FFE00 xfer:     2560 errno: 60003: '60003            A Data Validation error was discovered'

Comments

kdario

Hmm, it seems that this forum has some issues(or 'features')

Post above is reply to this thread: Image field cannot refresh when navigate records in jsf page so I don't know how this became totally new thread

Dario

BAO.SZ

Dario,

As you said, this topic is off the original thread (Image field cannot refresh when navigate records in jsf page), but is valuable to me. so I branch it as a new one.

I have two furthur question2:

1.Your suggestion to refresh page data seems a bit complicated, does it the best practice for the job?

--It seems, we also need to use Active Data technology?

2.Can have more evidence or reference docs on this sayings:

"normal way" to refresh page is usually very bad for applications based on JSF/ADF

Thanks.

BAO.SZ

I use the Actions-->branch feature.

But I cannot edit the first branched post, and you become the author of this thread.

Timo Hahn

BAZ, now things get messi

Your intention was good, however branching away the answer isn't something you, we or anybody should do. Now @"kdario" is in the drivers seat for this thread, not you. It's a flaw in the software that you can branch away part of the thread, not your fault.

Anyway I suggest that you open a new thread with your follow up question referring to the original thread. Abandon this one, please. We are happy to answer to the new thread.

Timo

BAO.SZ

Timo and kdario,

I'd glad to open a new thread for the topic. and thank you all very much for your patiences and kindness.

I think the 'Action-->.branch' is an useful feature when some improvement been made, that is:

When make the branching, the operator (which should be the author of the original thread) should be still the 'driver of the new thread',

and can edit the title and post content for the new thread.(but now cannot).

The benefit of above feature is :

1. the new thread usaually will have some relation with the original one, so keep the relation between them will make the contents/threads in this forum more relevent.

A relevant connected network of information will be more useful than a vast volume of unrelated segements of posts.

2.little inputing will be needed by branch if the author can edit the original branch out post.


Timo Hahn

I agree with you, however this is not the right place to discuss this. There is already an enhancement request (idea as it called in the new forum wording) for this at where you can add your comment and vote for the request to get implemented.

Timo

1 - 6

Post Details

Added on Mar 9 2020
0 comments
900 views