File System Definition: aborting: open failed for

MarkDaniels Member Posts: 5
edited Apr 13, 2018 3:05PM in Vdbench

Hello,

File:

compratio=3

*Host Definition

hd=default,user=root,jvms=10,shell=ssh

hd=host1,system=server1

hd=host2,system=server2

hd=host3,system=server3

hd=host4,system=server4

hd=host5,system=server5

hd=host6,system=server6

hd=host7,system=server7

hd=host8,system=server8

#File System Definition

#width=3, depth=8, files=10: total files = width^depth * files = 3^8 * 10 = 65610 files

fsd=default,depth=8,width=3,files=10,sizes=(16k,30,32k,20,64k,20,128k,10,512k,15,1024k,5),openflags=directio,shared=yes

fsd=fsd1,anchor=/nfs1

fsd=fsd2,anchor=/nfs2

fsd=fsd3,anchor=/nfs3

fsd=fsd4,anchor=/nfs4

fsd=fsd5,anchor=/nfs5

fsd=fsd6,anchor=/nfs6

fsd=fsd7,anchor=/nfs7

fsd=fsd8,anchor=/nfs8

*File System Workload Definition

fwd=fwd1,fsd=fsd1,host=host1,xfersize=8k,operation=read,fileio=random,fileselect=sequential,threads=32

fwd=fwd2,fsd=fsd2,host=host2,xfersize=8k,operation=read,fileio=sequential,fileselect=sequential,threads=32

fwd=fwd3,fsd=fsd3,host=host3,xfersize=16k,operation=read,fileio=random,fileselect=sequential,threads=32

fwd=fwd4,fsd=fsd4,host=host4,xfersize=16k,operation=read,fileio=sequential,fileselect=sequential,threads=32

fwd=fwd5,fsd=fsd5,host=host5,xfersize=8k,operation=read,fileio=random,fileselect=sequential,threads=32

fwd=fwd6_delete,fsd=fsd6,host=host6,xfersize=8k,operation=delete,fileio=sequential,fileselect=random,threads=16

fwd=fwd6_create,fsd=fsd6,host=host6,xfersize=8k,operation=create,fileio=sequential,fileselect=random,threads=16

fwd=fwd7,fsd=fsd7,host=host7,xfersize=4k,operation=read,fileio=random,fileselect=sequential,threads=32

fwd=fwd8_delete,fsd=fsd8,host=host8,xfersize=4k,operation=delete,fileio=sequential,fileselect=sequential,threads=16

fwd=fwd8_create,fsd=fsd8,host=host8,xfersize=4k,operation=create,fileio=sequential,fileselect=sequential,threads=16

*Run Definition

#rd=create,fwd=*,format=yes,fwdrate=max,threads=128,interval=1

rd=pure_nfs5a,fwd=(*),fwdrate=(10000-60000,2000),format=no,elapsed=120,interval=5,pause=15,warmup=15
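As a sanity check on the file count in the File System Definition comment above, here is a minimal Python sketch of the arithmetic. It assumes files are created only in the lowest-level directories (which is what the width^depth * files formula in the comment implies) and that "k" in `sizes=` means 1024 bytes. The function names are hypothetical helpers, not part of vdbench.

```python
def total_files(width: int, depth: int, files: int) -> int:
    """Number of files when only the width**depth leaf directories hold files."""
    return width ** depth * files


def mean_file_size_kib(sizes):
    """Weighted mean of (size_in_KiB, percent) pairs; percentages must sum to 100."""
    total_pct = sum(pct for _, pct in sizes)
    if total_pct != 100:
        raise ValueError(f"percentages sum to {total_pct}, expected 100")
    return sum(size * pct for size, pct in sizes) / 100


# Values copied from fsd=default in the parameter file above.
SIZES = [(16, 30), (32, 20), (64, 20), (128, 10), (512, 15), (1024, 5)]

n = total_files(width=3, depth=8, files=10)
mean_kib = mean_file_size_kib(SIZES)
print(f"files per FSD: {n}")
print(f"mean file size: {mean_kib} KiB")
print(f"approx. data per FSD: {n * mean_kib / 1024 ** 2:.2f} GiB")
```

The data-per-FSD figure is only an estimate, since the actual mix of sizes is drawn from the percentage distribution.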

Then from time to time I got the following error:

Slave host7-0 aborting: open failed for /nfs7/vdb.1_1.dir/vdb.2_2.dir/vdb.3_3.dir/vdb.4_1.dir/vdb.5_2.dir/vdb.6_2.dir/vdb.7_1.dir/vdb.8_3.dir/vdb_f0001.file

file_open(), open /nfs3/vdb.1_2.dir/vdb.2_3.dir/vdb.3_3.dir/vdb.4_2.dir/vdb.5_3.dir/vdb.6_3.dir/vdb.7_3.dir/vdb.8_2.dir/vdb_f0001.file failed

(From different hosts and different files)

I have changed multiple configurations such as:

  • fileio=(random,shared)
  • fileselect=(xxx,once) (catastrophic lol)
  • Changed the number of JVMs, changed the number of threads to a minimum, but no luck
  • I ran the "create" run first, so that all the files already exist

Error states:

12:54:44.002 Starting RD=pure_nfs5a; elapsed=120 warmup=15; fwdrate=14000; For loops: iorate=14000

12:54:44.005 09:54:44.007 task_run_all(): 33 tasks

12:56:26.534 09:56:26.535 file_open(), open /nfs3/vdb.1_2.dir/vdb.2_3.dir/vdb.3_3.dir/vdb.4_2.dir/vdb.5_3.dir/vdb.6_3.dir/vdb.7_3.dir/vdb.8_2.dir/vdb_f0001.file failed

12:56:26.534 09:56:26.536 error: 13          <-- Do you know what this means?

12:56:26.534 09:56:26.536 Memory total Java heap:  419.000 MB; Free:  240.212 MB; Used:  178.788 MB;

12:56:26.534 09:56:26.536 Maximum native memory allocation:      524,288; Current allocation:      524,288          <-- Is this a problem?

12:56:26.534 09:56:26.536

12:56:26.535 09:56:26.536 open failed for /nfs3/vdb.1_2.dir/vdb.2_3.dir/vdb.3_3.dir/vdb.4_2.dir/vdb.5_3.dir/vdb.6_3.dir/vdb.7_3.dir/vdb.8_2.dir/vdb_f0001.file

12:56:26.535 09:56:26.537

12:56:26.536 java.lang.RuntimeException: open failed for /nfs3/vdb.1_2.dir/vdb.2_3.dir/vdb.3_3.dir/vdb.4_2.dir/vdb.5_3.dir/vdb.6_3.dir/vdb.7_3.dir/vdb.8_2.dir/vdb_f0001.file

12:56:26.536    at Vdb.common.failure(common.java:308)

12:56:26.536    at Vdb.ActiveFile.openFile(ActiveFile.java:158)

12:56:26.536    at Vdb.FwgThread.openForRead(FwgThread.java:351)

12:56:26.536    at Vdb.FwgThread.openFile(FwgThread.java:343)

12:56:26.536    at Vdb.OpRead.doRandomRead(OpRead.java:90)

12:56:26.536    at Vdb.OpRead.doOperation(OpRead.java:44)

12:56:26.536    at Vdb.FwgThread.run(FwgThread.java:157)

Any thoughts?

Answers

  • Henk Vandenbergh-Oracle Member Posts: 813
    edited Apr 13, 2018 2:35PM

    Assuming this is Linux: errno 13: EACCES Permission Denied.
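The numeric code in the log can be decoded on the client itself. This is a minimal sketch using Python's standard `errno` and `os` modules. It reports the mapping for the platform it runs on, so it should be run on one of the Linux hosts that produced the error.

```python
import errno
import os

code = 13  # the value printed after "error:" in the vdbench log

# errno.errorcode maps numeric codes to their symbolic names on this platform.
name = errno.errorcode.get(code, "unknown")

# os.strerror returns the platform's human-readable message for the code.
message = os.strerror(code)

print(f"errno {code}: {name} ({message})")
```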

    Somehow the OS, together with your file server, has decided that it cannot access some files.

    Why is not clear; unfortunately, there is not much diagnostic information available.

    - After a failed test, have you gone to the file in question to see what its current state is, does it exist etc?

    - Are you sure that your file systems mounted across your eight hosts are not stepping on each other's toes, as in "host3 and host4 are both pointing to the same file system"?

    - Are you sure your clients have enough resources? I have seen it happen that, for instance, an open() request cannot get the memory it needs and therefore returns an error code that ultimately results in 'some errno'.

    In this new 'virtual OS' world, virtual systems are sometimes too small and run into all kinds of problems.

    - I notice that the way you are testing you have no need for the 'shared=yes' FSD parameter. I can't think of it making a difference, but try it without.

    - Maybe your file server does indeed have some issues?

    - What version of vdbench?
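For the first check in the list above (looking at the state of the failing file), a small script can walk the path from the root down and report each component. This is a sketch, not a vdbench feature: `check_path` is a hypothetical helper. EACCES on open() can be caused by a missing search (x) permission on any directory in the path as well as by a missing read permission on the file itself, so every level is inspected. Run it as the same user vdbench runs as (user=root in the parameter file above), on the host that reported the failure.

```python
import os
import stat


def check_path(path: str):
    """Return (component, status) pairs from the filesystem root down to path."""
    # Collect the path and all of its parents, then visit them root-first.
    parts = []
    current = os.path.abspath(path)
    while True:
        parts.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the root
            break
        current = parent

    results = []
    for component in reversed(parts):
        try:
            st = os.stat(component)
        except OSError as exc:
            results.append((component, f"stat failed: errno {exc.errno} ({exc.strerror})"))
            break  # nothing below this point can be inspected
        if stat.S_ISDIR(st.st_mode):
            need, allowed = "search (x)", os.access(component, os.X_OK)
        else:
            need, allowed = "read (r)", os.access(component, os.R_OK)
        results.append((
            component,
            f"mode={stat.filemode(st.st_mode)} uid={st.st_uid} gid={st.st_gid} "
            f"{need}: {'ok' if allowed else 'DENIED'}",
        ))
    return results


if __name__ == "__main__":
    # Path copied from the error message in the original post.
    failing = ("/nfs3/vdb.1_2.dir/vdb.2_3.dir/vdb.3_3.dir/vdb.4_2.dir/"
               "vdb.5_3.dir/vdb.6_3.dir/vdb.7_3.dir/vdb.8_2.dir/vdb_f0001.file")
    for component, status in check_path(failing):
        print(f"{component}: {status}")
```

One caveat: `os.access` is evaluated on the client and is not always reliable on network filesystems, so an "ok" here does not rule out the NFS server denying the request. Since the failure is intermittent, it is worth running the check right after an abort.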

    This is the best I can think of for now.

    Henk.

  • MarkDaniels Member Posts: 5
    edited Apr 13, 2018 3:05PM

    - After a failed test, have you gone to the file in question to see what its current state is, does it exist etc?

    Yes, it exists; I checked.

    - Are you sure that your file systems mounted across your eight hosts are not stepping on each other's toes, as in "host3 and host4 are both pointing to the same file system"?

    Yes, they are pointing to the proper places; I triple-checked.

    - Are you sure your clients have enough resources? I have seen it happen that, for instance, an open() request cannot get the memory it needs and therefore returns an error code that ultimately results in 'some errno'.

    In this new 'virtual OS' world, virtual systems are sometimes too small and run into all kinds of problems.

    Yes, this is a virtual OS. The master had only 8 GB of memory; I'm increasing it to 128 GB, the same as the slaves. That was my initial concern.

    - I notice that the way you are testing you have no need for the 'shared=yes' FSD parameter. I can't think of it making a difference, but try it without.

    I will remove it.

    - Maybe your file server does indeed have some issues?

    I'm monitoring it; it's not even at 10% of its full resources.

    - What version of vdbench?

    vdbench50401 - I actually didn't check that. I'm upgrading it right now.

    I will reply again shortly.

This discussion has been closed.