Hello,
According to http://www.oracle.com/technetwork/server-storage/sun-sparc-enterprise/documentation/sparc-t7-m7-server-architecture-2702… "These engines can process 32 independent data streams, offloading the processor cores to do other work."
Let's pbind "yes > /dev/null" onto every hardware thread of a single core and then launch vector_in_range() on one of those threads. A single execution of vector_in_range() takes about 80 ms (zone memory has still not been increased; case ID: 497386-1217697831), so we run it in a loop. Here is the resulting CPU time sharing (prstat microstate output):
PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWP
5334 dglushe* 99 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0 60 68K 0 yes/1
5327 dglushe* 99 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0 60 68K 0 yes/1
5336 dglushe* 99 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0 60 68K 0 yes/1
5330 dglushe* 99 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0 60 68K 0 yes/1
5332 dglushe* 99 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0 60 68K 0 yes/1
5340 dglushe* 99 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0 60 68K 0 yes/1
5338 dglushe* 99 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0 60 68K 0 yes/1
5345 dglushe* 46 14 0.1 0.0 0.0 0.0 0.0 39 0 48 .1M 0 dax-in-range/1
5342 dglushe* 39 0.4 0.0 0.0 0.0 0.0 0.0 61 0 49 24K 0 yes/1
dax-in-range spends most of its user time in the following stack:
libdax.so.1`dax_read_results+0x1c4
libdax_query.so.1`dax_query_execute+0x160
libdax_query.so.1`dax_scan+0xdb4
vector.so`dax_scan+0x12c
vector.so`vectorScanStream+0x1d8
vector.so`vectorFilter+0x418
vector.so`vectorInRange+0x2c
vector.so`vector_in_range+0x24
dax-in-range`main+0x1d0
dax-in-range`_start+0x108
So is it that vector.so simply uses DAX inefficiently, or is the offloading statement itself incorrect? Either way, vector_in_range() clearly steals CPU time from the "yes > /dev/null" processes instead of leaving the core free for other work.
Thank you.