3 Replies Latest reply: Jul 14, 2009 9:26 PM by 807567

    Thread migration

    807567
      Apologies if this was answered in another post - I could not find it anywhere...

      We just upgraded from Solaris 8 on an Enterprise 6500 to Solaris 10 on a T5240 with 2 CPUs, 8 cores each, configured to run 128 virtual CPUs.

      I am an Informix DBA who is seeing some odd performance on the new server. As this is not an Informix forum I will not bore you with those details. My question is this: can a process running on one of the 128 virtual CPUs migrate to another virtual CPU? With the Informix database server we start, for example, 16 Unix processes called oninit. A thread started by Informix can begin running on one oninit process and, when necessary, migrate to another oninit process. Since these oninit processes are bound by processor affinity to a specified (virtual) CPU, I wonder if some of the odd performance we are seeing is related to the fact that the oninits can "swap" threads between each other, but maybe the OS's virtual processors cannot do the same thing with the oninit processes?

      Thanks,

      Mike

      Edited by: Shoeless_MIke on Jul 10, 2009 1:27 PM
        • 1. Re: Thread migration
          807567
          Mike,

          I'll do my best to respond, but like you say, I'm not exactly an Informix guy. Let me get this straight: the Informix startup launches 16 oninit processes. Unless the Informix code specifically binds those processes to a particular CPU via processor_bind(2), the thread execution context for each of those 16 processes can block and be rescheduled on a different physical CPU (context switching). Although threads have an affinity toward the CPU they last ran on, they can be forced over to the second CPU if necessary. Ideally, you would pin them yourself with the shell command pbind(1M). Note that context switching across cores within the same physical CPU has minimal performance impact.
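
          Just for reference, here's a minimal C sketch of the processor_bind(2) call that pbind(1M) wraps; the pid, the CPU id, and the function name are placeholders of mine, so check the man page before relying on it:
          #include <sys/types.h>
          #include <sys/processor.h>
          #include <sys/procset.h>
          #include <stdio.h>

          /* Bind one process to a single virtual CPU.  The previous binding
           * (or PBIND_NONE if it was unbound) comes back in obind. */
          int bind_to_cpu(pid_t pid, processorid_t cpu)
          {
              processorid_t obind;

              if (processor_bind(P_PID, pid, cpu, &obind) != 0) {
                  perror("processor_bind");
                  return -1;
              }
              printf("pid %ld bound to CPU %d (previous binding: %d)\n",
                     (long)pid, (int)cpu, (int)obind);
              return 0;
          }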

          Do you know how many threads these Informix processes spawn? If they execute via a single thread each, then you're starving yourself - to fix that you would need to start 128 Informix processes. To minimize context switching, bind the first 64 to processor #1 and the remaining 64 to processor #2 (a sketch of one way to do that with processor sets follows). You could also split the current 16 processes between the two CPUs, since I'd bet all 16 are running on CPU #1. You could also (though I don't recommend it) carve up resources and assign each Informix process an even fraction of system memory (say 64 GB / 128 processes).
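
          Here's roughly what that split could look like with processor sets (pset_create(2), pset_assign(2), pset_bind(2)). This is only a sketch: the function name is mine, it needs root, and the assumption that virtual CPUs 0-63 sit on the first chip and 64-127 on the second has to be confirmed with psrinfo -pv on your box:
          #include <sys/types.h>
          #include <sys/processor.h>
          #include <sys/procset.h>
          #include <sys/pset.h>
          #include <stdio.h>

          /* Put virtual CPUs [first..last] into a new processor set and
           * bind a list of pids to it. */
          int bind_pids_to_cpu_range(pid_t *pids, int npids,
                                     processorid_t first, processorid_t last)
          {
              psetid_t pset, old;
              processorid_t cpu;
              int i;

              if (pset_create(&pset) != 0) {
                  perror("pset_create");
                  return -1;
              }
              for (cpu = first; cpu <= last; cpu++)
                  if (pset_assign(pset, cpu, &old) != 0)
                      perror("pset_assign");      /* CPU may be offline */
              for (i = 0; i < npids; i++)
                  if (pset_bind(pset, P_PID, pids[i], &old) != 0)
                      perror("pset_bind");
              return 0;
          }
          Keep in mind that if Informix already binds its own processes at startup, this kind of external binding could fight with it, so treat it as an experiment.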

          The next item you may want to look into is the process page size. Solaris defaults to 8 Kbytes (8192 bytes), which sucks for a database system. Change it to a reasonably large value. For example:
          (tbrown@bullshark) [tbrown]: pagesize -a
          8192
          65536
          4194304 <-- (I usually use this one)
          268435456
          (tbrown@bullshark) [tbrown]: ppgsz -o heap=4M,stack=512K -p <Informix process pid>
          Back to your question (paraphrased): "can processes swap threads?" The answer is no. Threads share their parent process's execution context. The only way for threads in different processes to communicate is through a named pipe, shared memory, doors, memory-mapped files, sockets, etc. (a small shared-memory sketch follows). What you describe sounds like process-level (main thread) context switching across cores, across processors, or both. It could also be serialized execution where threads run in serial FIFO order (yikes).
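
          For what it's worth, here's a minimal sketch of the shared-memory flavor (POSIX shm_open(3RT) plus mmap(2), link with -lrt on Solaris). The segment name /informix_demo is just a placeholder - Informix of course manages its own shared-memory segments already:
          #include <sys/mman.h>
          #include <sys/stat.h>
          #include <fcntl.h>
          #include <stdio.h>
          #include <string.h>
          #include <unistd.h>

          /* Create (or open) a named shared-memory segment and map it.
           * Any cooperating process that maps the same name sees the same
           * bytes, which is how state can "migrate" between processes. */
          int main(void)
          {
              const char *name = "/informix_demo";      /* placeholder name */
              size_t len = 4096;

              int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
              if (fd == -1) { perror("shm_open"); return 1; }
              if (ftruncate(fd, len) == -1) { perror("ftruncate"); return 1; }

              char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
              if (p == MAP_FAILED) { perror("mmap"); return 1; }

              strcpy(p, "thread context would live here");
              printf("segment %s mapped at %p: %s\n", name, (void *)p, p);

              munmap(p, len);
              close(fd);
              /* shm_unlink(name) when the last process is done with it */
              return 0;
          }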

          I'm battling database performance myself on our T5240 (same config, max memory, different db). Hope I answered at least some aspect of your question.

          Good luck,
          Tracy S. Brown
          • 2. Re: Thread migration
            807567
            Tracy -

            Thanks for the reply...

             We do bind our oninit processes to the Sun virtual CPUs when the database server starts up. I am not sure what system call Informix makes to perform the bind, but from looking at 'top' output I never see an oninit process run on a CPU other than the one it was "bound" to at startup.

             Informix will spawn any number of threads that execute on the oninit processes. The primary threads that run on these oninit processes are "sqlexec" threads, which, as the name implies, perform SQL work. The other main threads that run on the oninits are kaio threads, which perform all I/O requests. The # of kaio threads is equal to the # of oninit processes that we configure to start when the engine starts, while the # of sqlexec threads is determined entirely by user activity. It is not unusual to see several hundred sqlexec threads on a busy Informix server. While Informix binds its oninit processes to physical or virtual Sun CPUs, it does not bind its sqlexec or kaio threads to the oninit processes. An sqlexec thread that "comes to life" on an oninit process can migrate and run on another available oninit process if it is switched off its current oninit by the internal Informix scheduler. The context for all of the Informix threads is stored in shared memory that all of the oninit processes can access, so the migration of an sqlexec or kaio thread between oninits is seamless.

             I wonder how the kernel/CPUs handle this scenario:

             An sqlexec thread is running, executing its code on oninit #1. The oninit #1 process is bound to virtual/logical CPU #1. The sqlexec thread reaches a point where it needs to yield to another waiting thread, so it is switched off oninit #1 and goes to sleep. When the thread comes back to life it begins running on a different oninit process, oninit #7, which is bound to CPU #7. How would that CPU know anything about the context of the sqlexec thread that was running on a different CPU?

             I know this scenario occurs in situations where there are, for example, 8 physical CPUs - not logical but real CPUs. When the Informix threads migrate between oninit processes there is a cost involved:

            sqlexec1 ----> oninit#1 -----> CPU1 (This cpu has some context information relative to the sqlexec thread)

            sqlexec1 yields. wakes up...

            sqlexec1 ----> oninit#7 -------> CPU7 (This cpu has information relative to whatever was running in this process prior to the yield-switch)

            I hope that makes some sense...

            Mike
            • 3. Re: Thread migration
              807567
              Mike,

               I suspect that your Oninit LWPs are being starved of resources because of thread priorities. As demand increases for these LWPs to accept a socket connection and spawn an sqlexec thread, the Oninit LWPs will be forced to yield processor time. It's worse than that, because the default scheduling class (timeshare: TS) uses a sliding-scale quantum that sets each thread's execution interval (the maximum amount of time the thread can run before it is switched off). I'd be willing to bet this is my problem too. I've seen upwards of 640 threads per second on my database at peak load, with only 128 virtual processors for the scheduler to choose from. And it performs like a drunk dog skydiving - not pretty. The following example shows how to view the thread priority levels and their associated quantum intervals:
              [tbrown]: dispadmin -g -c TS
              # Time Sharing Dispatcher Configuration
              RES=1000
              
              # ts_quantum  ts_tqexp  ts_slpret  ts_maxwait ts_lwait  PRIORITY LEVEL
                     200         0        50           0        50        #     0
                     <... snip ...>
                     160         0        51           0        51        #    10
                     <... snip ...>
                     120        10        52           0        52        #    20
                     <... snip ...>
                      80        20        53           0        53        #    30
                      <... snip ...>
                      40        30        55           0        55        #    40
                      <... snip ...>
                      40        48        58           0        59        #    58
                      20        49        59       32000        59        #    59
               The TS thread priorities range between -60 and +60, with 0 being middle of the road. I'm guessing that the default priority is 0. Each time a thread is forced off the processor its priority gets bumped up by 1. This is all fine, but your Oninit processes shouldn't have to compete against however many hundreds or thousands of threads eager for execution time. I'd try setting the 16 Oninit LWPs to priority 60, if your Oninit LWPs are actually running as TS threads. For example:
               [tbrown]: priocntl -s -c TS -m 60 -p 60 -i pid <Oninit pid number>
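               And in case it's ever handier to do that from a program instead of the shell, here's a rough sketch of the priocntl(2) interface the command sits on top of. The function name is mine, you need the privileges to raise the priority, and I'm assuming the target process is already in the TS class:
               #include <sys/types.h>
               #include <sys/priocntl.h>
               #include <sys/tspriocntl.h>
               #include <sys/procset.h>
               #include <string.h>
               #include <stdio.h>

               /* Push one pid to TS user priority `upri` (needs privileges). */
               int set_ts_priority(pid_t pid, pri_t upri)
               {
                   pcinfo_t  pcinfo;
                   pcparms_t pcparms;
                   tsparms_t *tsp;

                   /* Look up the class id for the TS class. */
                   (void) strcpy(pcinfo.pc_clname, "TS");
                   if (priocntl(0, 0, PC_GETCID, (caddr_t)&pcinfo) == -1) {
                       perror("PC_GETCID");
                       return -1;
                   }

                   pcparms.pc_cid = pcinfo.pc_cid;
                   tsp = (tsparms_t *)pcparms.pc_clparms;
                   tsp->ts_uprilim = upri;   /* raise the limit too, or upri is capped */
                   tsp->ts_upri    = upri;

                   if (priocntl(P_PID, pid, PC_SETPARMS, (caddr_t)&pcparms) == -1) {
                       perror("PC_SETPARMS");
                       return -1;
                   }
                   return 0;
               }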
               However, I'm running my db in a zone that does not schedule my database threads as TS threads. Instead these threads are scheduled in the fair share class (FSS), roughly round-robin execution intervals of up to the quantum before blocking (it's more complicated than that due to projects, tasks, and other resource-allocation admin stuff). I need to investigate the commands for changing thread priorities in zones. You can find out which class your threads are scheduled in using ps. For example:
              [uid=0(root)@sevengill] [tbrown]: ps -Lce | grep ksh
               15808     1  FSS  59 pts/1       0:00 ksh
               16269     1  FSS  59 pts/1       0:00 ksh
               16142     1  FSS  59 pts/1       0:00 ksh
               As you can see, my shell is running as a fair share thread at priority 59 (effectively the top of the range, since the priorities start at 0). If for some reason you cannot change the Oninit LWP priorities, you can at least give them more chances at handling an inbound request by changing the thread quantum. The quantum interval is expressed in terms of something called the resolution (reported as RES). The command to display it is:
              [uid=0(root)@sevengill] [tbrown]: dispadmin -g -c FSS
              #
              # Fair Share Scheduler Configuration
              #
              RES=1000
              #
              # Time Quantum
              #
              QUANTUM=110
               Convoluted explanation skipped - with RES=1000 the quantum is expressed in milliseconds, so the quantum for my FSS threads is 110 milliseconds. That means an FSS thread can have a maximum execution interval of 110 milliseconds, which is a fairly large value when cycles are counted in nanoseconds. You might try reducing this interval so your Oninit LWPs get more turns on the processor. For example:
              (tbrown@nickle) [tbrown]: dispadmin -g -c FSS -r 100
              #
              # Fair Share Scheduler Configuration
              #
              RES=100
              #
              # Time Quantum
              #
              QUANTUM=11
               One gotcha: with -g the -r flag only changes the resolution the table is displayed in, not the quantum itself - QUANTUM=11 at RES=100 is 11 units of 1/100 second, i.e. 11 * 10 ms = the same 110 ms as before. To actually change the quantum you dump the table with dispadmin -g, edit the value, and load it back with dispadmin -s file -c FSS. You might try cutting the quantum roughly in half to start; at least that's where I'm going to start my testing.

              I haven't experimented with any of this yet on the T5240 - just my workstation. I'm just now getting up to speed on thread classes and class priorities (not to mention the convoluted quantum interval).

              (not sure how much this actually helps)

              Good luck,
              Tracy S Brown

              Edited by: pthread_mutex_impl on Jul 14, 2009 7:15 PM