
Chapter 4 - Managing Pthreads

Pthreads Programming
Bradford Nichols, Dick Buttlar and Jacqueline Proulx Farrell
 Copyright © 1996 O'Reilly & Associates, Inc.

Scheduling Pthreads
The operating system continuously selects a single thread to run from a systemwide collection of all threads that are neither waiting for the completion of an I/O request nor blocked by some other activity. Many threaded programs have no reason to interfere with the default behavior of the system's scheduler. Nevertheless, the Pthreads standard defines a thread-scheduling interface that allows programs with real-time tasks to get involved in the process.
Using the Pthreads scheduling feature, you can designate how threads share the available processing power. You may decide that all threads should have equal access to all available CPUs, or you can give some threads preferential treatment. In some applications, it's beneficial to give those threads that perform important tasks an advantage over those that perform background work. For instance, in a process-control application, a thread that responds to input for special devices could be given priority over a thread that simply maintains the log. Used in conjunction with POSIX real-time extensions, such as memory locking and real-time clocks, the Pthreads scheduling feature lets you create real-time applications in which the threads with important tasks can be guaranteed to complete their tasks in a predictable, finite amount of time.*
 *See the book POSIX.4: Programming for the Real World by Bill O. Gallmeister, from O'Reilly & Associates, for in-depth discussion of the POSIX real-time extensions.
Note that, even though the Pthreads standard specifies a scheduling interface, it allows vendors to support or not support its programming interface at their option. If your system supports the scheduling programming interface, the compile-time constant _POSIX_THREAD_PRIORITY_SCHEDULING will be TRUE.*
 *If your implementation supports the POSIX real-time extensions, you can use the sched_yield call to force some broad form of scheduling. A sched_yield call places the calling thread at the end of its scheduling priority queue and lets another thread of the same priority take its place. 
Scheduling Priority and Policy
The eligibility of any given thread for special scheduling treatment is determined by the settings of two thread-specific attributes:
 Scheduling priority
A thread's scheduling priority, in relation to that of other threads, determines which thread gets preferential access to the available CPUs at any given time.
 Scheduling policy
A thread's scheduling policy is a way of expressing how threads of the same priority run and share the available CPUs.
We'll be using these terms throughout the discussions that follow. Once we've set the stage with some background information about scheduling scope, we'll consider the scheduling priority and policy thread attributes in much greater detail.
Scheduling Scope and Allocation Domains
The concept of scheduling scope refers to the inclusiveness of the scheduling activity in which a thread participates. In other words, scope determines how many threads—and which threads—a given thread must compete against when it's time for the scheduler to select one of them to run on a free CPU.
Because some operating system kernels know little about threads, the scope of thread scheduling depends upon the abilities of an implementation.* A given implementation may allow you to schedule threads either in process scope or in system scope. When scheduling occurs in process scope, threads are scheduled against only other threads in the same program. When scheduling occurs in system scope, threads are scheduled against all other active threads systemwide. Implementations may also provide a thread attribute that allows you to set the scheduling scope on a per-thread basis. Here, too, you can choose that a thread participate in scheduling in either process or system scope.
 *As we'll discuss in Chapter 6, Practical Considerations, some systems provide the abstraction of a thread within the container of the process without any help from the kernel. On these systems the lower-level operating system kernel schedules processes to run, not threads.
The discussion of scheduling scope is complicated when multiprocessing systems are involved. Many operating systems allow collections of CPUs to be treated as separate units for scheduling purposes. In Digital UNIX, for example, such a grouping is called a processor set and can be created by system calls or administrative commands. The Pthreads standard does recognize that such groupings may exist and refers to them as scheduling allocation domains. However, to avoid forcing all vendors to implement specific allocation domain sizes, the standard leaves all policies and interfaces relating to them undefined. As a result, there's a wide range of standard-compliant implementations out there. Some vendors, such as Digital, provide rich functionality, and others provide very little, even placing all CPUs in a single allocation domain.
Figure 4-5: Scheduling with system scope and one allocation domain
Figure 4-5 shows a system using only system scheduling scope and a single allocation domain. On one side of the scheduler we have processes containing one or more threads that need to be scheduled. On the other side the scheduler has the available CPU processing power of the system combined into the one allocation domain. The scheduler compares the priorities of all runnable threads of all processes systemwide when selecting a thread to run on an available CPU. It gives the thread with the highest priority first preference, regardless of which process it belongs to.
Figure 4-6 shows a system with only process scope and a single allocation domain.
Figure 4-6: Scheduling with process scope and one allocation domain
The standard requires a scheduler that supports process scope to compare the scheduling priority of a thread only to the priorities of other threads of the same process. How the scheduler makes the comparison is also undefined. As a result, the priorities set by the Pthreads library on a system that provides this type of scheduling may not necessarily have any systemwide meaning.
For instance, consider such a scheduler on a multiprocessing system on which the threads of a given process (Process A) are competing for CPUs. Process A has three threads, one with very high priority and two with medium priority. The scheduler can place the high priority thread on one of the CPUs and thus meet the standard's requirements for process-scope scheduling. It need do no more—even if other CPUs in the allocation domain have lower priority threads from other processes running on them. The scheduler can leave Process A's remaining runnable medium priority threads waiting for its high priority thread to finish running. Thus, this type of scheduling can deny a multithreaded application the benefit of multiple CPUs within the allocation domain.
An implementation that uses system-scope scheduling with a single allocation domain, such as the one we showed in Figure 4-5, behaves quite differently. If the threads of a process in system scope have high enough priorities, they will be scheduled on multiple CPUs at the same time. System-scope scheduling is thereby much more useful than process-scope scheduling for real-time or parallel processing applications when only a single allocation domain is available.
Figure 4-7 shows a system with multiple allocation domains supporting both process and system scope. The threads of Process A all have process scheduling scope and exclusive access to an allocation domain. Process B's threads have system scope and their own allocation domain as well. The threads of all other processes have system scope and are assigned to the remaining allocation domain.
Figure 4-7: Scheduling with process and system scope and multiple allocation domains
Because the threads of Process A and Process B don't share an allocation domain with those of other processes, they will execute more predictably. Their threads will never wait for a higher priority thread of another process to finish or preempt another process's lower priority thread. Because Process B's threads use system scope, they will always be able to simultaneously access the multiple CPUs within its domain. However, because Process A's threads use process scope, they may not always be able to do so. It depends on the implementation on which they run.
You should take into account one potential pitfall of using multiple scheduler allocation domains if your implementation allows you to define them. When none of the threads in Process A or B are running on the CPUs in their allocation domains, the CPUs are idle, regardless of the load on other CPUs in other domains. You may in fact obtain higher overall CPU utilization by limiting the number of allocation domains. Be certain that you understand the characteristics of your application and its threads before you set scheduling policies that affect its performance and behavior.
If an implementation allows you to select the scheduling scope of a thread using a per-thread attribute, you'll probably set up the thread's attribute object, as shown in Example 4-21.
Example 4-21: Setting Scheduling Scope in an Attribute Object (sched.c)
pthread_attr_t custom_sched_attr;
        pthread_attr_init(&custom_sched_attr);
        pthread_attr_setscope(&custom_sched_attr, PTHREAD_SCOPE_SYSTEM);
        pthread_create(&thread, &custom_sched_attr, ...);
The pthread_attr_setscope function sets the scheduling-scope attribute in a thread attribute object to either system-scope scheduling (PTHREAD_SCOPE_SYSTEM), as in Example 4-21, or process-scope scheduling (PTHREAD_SCOPE_PROCESS). Conversely, you'd use pthread_attr_getscope to obtain the current scope setting of an attribute object.
For the remainder of our discussion, we'll try to ignore scope. We can't avoid using terms that have different meanings depending upon what type of scheduling scope is active. As a cheat sheet for those occasions when these terms appear, refer to the following:
 When we say pool of threads, we mean:
In process scope: all other threads in the same process
In system scope: all threads of all processes in the same allocation domain
 When we say scheduler, we mean:
In process scope: the Pthreads library and/or the scheduler in the operating system's kernel
In system scope: the scheduler in the operating system's kernel
 When we say processing slot, we mean:
In process scope: the portion of CPU time allocated to the process as a whole within its allocation domain
In system scope: the portion of CPU time allocated to a specific thread within its allocation domain
Runnable and Blocked Threads
In selecting a thread for a processing slot, the scheduler first considers whether it is runnable or blocked. A blocked thread must wait for some particular event, such as I/O completion, a mutex, or a signal on a condition variable, before it can continue its execution. By contrast, a runnable thread can resume execution as soon as it's given a processing slot.
After it has weeded out the blocked threads, the scheduler must select one of the remaining runnable threads to which it will give the processing slot. If there are enough slots for all runnable threads (for instance, there are four CPUs and four threads), the scheduler doesn't need to apply its scheduling algorithm at all, and all runnable threads will get a chance to run simultaneously.
Scheduling Priority
The selection algorithm that the scheduler uses is affected by each runnable thread's scheduling priority and scheduling policy. As we mentioned before, these are per-thread attributes; we'll show you how to set them in a few pages.
The scheduler begins by looking at an array of priority queues, as shown in Figure 4-8. There is a queue for each scheduling priority, and the threads assigned a given priority reside on that priority's queue. When looking for a thread to run in a processing slot, the scheduler starts with the highest priority queue and works its way down to the lower priority queues until it finds the first thread.
Figure 4-8: Priority queues
In this illustration only three of the priority queues hold runnable threads. When running threads either involuntarily give up their processing slot (more on this later) or go from blocked to runnable, they are placed at the end of the queue for their priority. Over time, the population of the priority queues will grow and decline.
Whenever a thread with a higher priority than the current running thread becomes runnable, it interrupts the running thread and replaces it in the processing slot. From the standpoint of the thread that's been replaced, this is known as an involuntary context switch.
Scheduling Policy
A thread's scheduling policy determines how long it runs when it moves from the head of its priority queue to a processing slot. The two main scheduling policies are SCHED_FIFO and SCHED_RR:
 SCHED_FIFO
This policy (first-in first-out) lets a thread run until it either exits or blocks. As soon as it becomes unblocked, a blocked thread that has given up its processing slot is placed at the end of its priority queue.
 SCHED_RR
This policy (round robin) allows a thread to run for only a fixed amount of time before it must yield its processing slot to another thread of the same priority. This fixed amount of time is usually referred to as a quantum. When a thread is interrupted, it is placed at the end of its priority queue.
The Pthreads standard defines an additional policy, SCHED_OTHER, and leaves its behavior up to the implementors. On most systems, selecting SCHED_OTHER will give a thread a policy that uses some sort of time sharing with priority adjustment. By default, all threads start life with the SCHED_OTHER policy. After all, time sharing with priority adjustment is the typical UNIX scheduling algorithm for processes. It works like SCHED_RR, giving threads a quantum of time in which to run. Unlike SCHED_FIFO and SCHED_RR, however, it causes the scheduler to occasionally adjust a thread's priority without any input from the programmer. This priority adjustment favors threads that don't use all their quantum before blocking, increasing their priority. The idea behind this policy is that it gives interactive I/O-bound threads preferential treatment over CPU-bound threads that consume all their quantum.
The definitions of SCHED_FIFO, SCHED_RR, and SCHED_OTHER actually come from the POSIX real-time extensions (POSIX.1b). Any Pthreads implementation that uses the compile-time constant _POSIX_THREAD_PRIORITY_SCHEDULING will also recognize them. As we'll continue our discussion, we'll find other POSIX.1b features that are useful in manipulating priorities.
Using Priorities and Policies
Although you can set different scheduling priorities and policies for each thread in an application, and even dynamically change them in a running thread, most applications don't need this complexity.
A real-time application designer would typically first make a broad division between those tasks that must be completed in a finite amount of time and those that are less time critical. Those threads with real-time tasks would be given a SCHED_FIFO policy and high priority. The remaining threads would be given a SCHED_RR policy and a lower priority. The scheduling priority of all of these threads would be set to be higher than those of any other threads on the system. Ideally the host would be capable of system-scope scheduling.
As shown in Figure 4-9, the real-time threads of the real-time application will always get access to the CPU when they are runnable, because they have higher priority than any other thread on the system. When a real-time thread gets the CPU it will complete its task without interruption (unless, of course, it blocks—but that would be a result of poor design). No other thread can preempt it; no quantum stands in its way. These threads behave like event (or interrupt) handlers; they wait for something to happen and then process it to completion within the shortest time possible.
Figure 4-9: Using policies and priorities in an application
Because of their high priority, the non-real-time threads in the application also get preferential treatment, but they must share the CPU with each other as their quantums expire. These threads usually perform the background processing for the application.
An example of this kind of real-time application would be a program that runs chemical processing equipment. The threads that deploy hardware control algorithms—periodically reading sensors, computing new control values, and sending signals to actuators—would run with the SCHED_FIFO policy and a high priority. Other threads that performed the less critical tasks—updating accounting records for chemicals used and recording the hours for employees running the equipment—would run with the SCHED_RR policy and at a lower priority.
Setting Scheduling Policy and Priority
You can set a thread's scheduling policy and priority in the thread attribute object you specify in the pthread_create call that creates the thread. Assume that we have a thread attribute object named custom_sched_attr. We've initialized it with a call to pthread_attr_init. We specify it in calls to pthread_attr_setschedpolicy to set the scheduling policy and pthread_attr_setschedparam to set the scheduling priority, as shown in Example 4-22.
Example 4-22: Setting a Thread's Scheduling Attributes (sched.c)
pthread_attr_t custom_sched_attr;
int fifo_max_prio, fifo_min_prio, fifo_mid_prio;
struct sched_param fifo_param;
  pthread_attr_init(&custom_sched_attr);
  pthread_attr_setinheritsched(&custom_sched_attr, PTHREAD_EXPLICIT_SCHED);
  pthread_attr_setschedpolicy(&custom_sched_attr, SCHED_FIFO);
  fifo_max_prio = sched_get_priority_max(SCHED_FIFO);
  fifo_min_prio = sched_get_priority_min(SCHED_FIFO);
  fifo_mid_prio = (fifo_min_prio + fifo_max_prio)/2;
  fifo_param.sched_priority = fifo_mid_prio;
  pthread_attr_setschedparam(&custom_sched_attr, &fifo_param);
  pthread_create(&(threads[i]), &custom_sched_attr, ...);
The way in which pthread_attr_setschedparam is used demands a little more explanation.
When you use pthread_attr_setschedpolicy to set a thread's policy to SCHED_FIFO or SCHED_RR, you can also call pthread_attr_setschedparam to set its parameters. The pthread_attr_setschedparam function takes two arguments: the first is a thread attribute object, the second is a curious thing defined in the POSIX.1b standard and known as a struct sched_param. It looks like this:
struct sched_param {
    int sched_priority;
};
That's it. The struct sched_param has only a single required member and specifies a single attribute—a scheduling priority. (Some Pthreads implementations may store other information in this structure.) Let's see how we stick a priority into this thing.
The POSIX.1b standard specifies that there must be at least 32 unique priority values apiece for the SCHED_RR and SCHED_FIFO priorities. (The standard does not require that there be defined priorities for SCHED_OTHER.) The absolute values and actual range of the priorities depend upon the implementation, but one thing's for certain—you can use sched_get_priority_max and sched_get_priority_min to get a handle on them.
In our example, we call sched_get_priority_max and sched_get_priority_min to obtain the maximum and minimum priority values for the SCHED_FIFO policy. We add the two together and divide by two, coming up with a priority level that's happily in the middle of the SCHED_FIFO priority range. It's this priority value that we insert in the priority member of our struct sched_param. A call to pthread_attr_setschedparam and, voila!—our thread has a nice middling priority with which to work.
Before we leave our discussion of setting a thread's scheduling attributes statically when the thread is created, we'll make one final point. If you must retrieve the scheduling attribute settings from a thread attribute object, you can use the functions pthread_attr_getschedpolicy and pthread_attr_getschedparam. They work in the same way as the corresponding functions for other thread attributes.
Now we'll look at a way to set the scheduling policy and priority of a selected thread while it's running. In Example 4-23, we set a target thread's policy to SCHED_FIFO and its priority to the priority level stored in the variable fifo_min_prio.
Example 4-23: Setting Policy and Priority Dynamically (sched.c)
struct sched_param fifo_sched_param;

fifo_sched_param.sched_priority = fifo_min_prio;
pthread_setschedparam(threads[i], SCHED_FIFO, &fifo_sched_param);
As you can see, the pthread_setschedparam call sets both policy and priority at the same time. Conversely, the pthread_getschedparam function returns the current policy and priority for a specified thread. Be careful when you use the pthread_setschedparam function to dynamically adjust another thread's priority. If you raise a thread's priority higher than your own and it is runnable, it will preempt you when you make the call.
If you decide to use scheduling, you don't need to individually set the scheduling attributes of each thread you create. Instead, you can specify that each thread should inherit its scheduling characteristics from the thread that created it. Like other per-thread scheduling attributes, the inheritance attribute is specified in the attribute object used at thread creation, as shown in Example 4-24.
Example 4-24: Setting Scheduling Inheritance in an Attribute Object (sched.c)
pthread_attr_t custom_sched_attr;
        pthread_attr_init(&custom_sched_attr);
        pthread_attr_setinheritsched(&custom_sched_attr, PTHREAD_INHERIT_SCHED);
        pthread_create(&thread, &custom_sched_attr, ...);
The pthread_attr_setinheritsched function takes a thread attribute object as its first argument and as its second argument either the PTHREAD_INHERIT_SCHED flag or the PTHREAD_EXPLICIT_SCHED flag. You can obtain the current inheritance attribute from an attribute object by calling pthread_attr_getinheritsched.
Scheduling in the ATM Server
We're now ready to assign different scheduling priorities to the worker threads in our ATM server, based on the type of transaction they are processing. To illustrate how our server might use scheduling attributes, we'll give highest priority to the threads that service deposit requests. After all, time is money and the sooner the bank has your money the sooner they can start making money with it. Specifically, we'll add code to our server so that deposit threads run at a high priority with a SCHED_FIFO scheduling policy and the other threads run at a lower priority using a SCHED_RR scheduling policy.
We don't need to change worker thread code; only the boss thread concerns itself with setting scheduling attributes. We'll globally declare some additional thread attribute objects (pthread_attr_t) in our atm_server_init routine in Example 4-25 and prepare them to be used by the boss thread when it creates worker threads.
Example 4-25: Creating Attribute Objects for Worker Threads (sched.c)
/* global variables */
pthread_attr_t custom_attr_fifo, custom_attr_rr;
int fifo_max_prio, rr_min_prio;
struct sched_param fifo_param, rr_param;
  pthread_attr_init(&custom_attr_fifo);
  pthread_attr_setinheritsched(&custom_attr_fifo, PTHREAD_EXPLICIT_SCHED);
  pthread_attr_setschedpolicy(&custom_attr_fifo, SCHED_FIFO);
  fifo_param.sched_priority = sched_get_priority_max(SCHED_FIFO);
  pthread_attr_setschedparam(&custom_attr_fifo, &fifo_param);
  pthread_attr_init(&custom_attr_rr);
  pthread_attr_setinheritsched(&custom_attr_rr, PTHREAD_EXPLICIT_SCHED);
  pthread_attr_setschedpolicy(&custom_attr_rr, SCHED_RR);
  rr_param.sched_priority = sched_get_priority_min(SCHED_RR);
  pthread_attr_setschedparam(&custom_attr_rr, &rr_param);
The boss thread will use the custom_attr_fifo attribute object when creating deposit threads. The atm_server_init routine sets this attribute object to use the SCHED_FIFO scheduling policy and the maximum priority defined for the policy. The boss thread will use the custom_attr_rr attribute object for all other worker threads. It is set with the SCHED_RR scheduling policy and the minimum priority defined for the policy. The boss thread uses these attribute objects in the server's main routine:
Example 4-26: Creating threads with custom scheduling attributes (sched.c)
extern int
main(int argc, char **argv)
{
  workorder_t *workorderp;
  pthread_t *worker_threadp;
  int trans_id;

  atm_server_init(argc, argv);

  for (;;) {
    /*** Wait for a request ***/
    workorderp = (workorder_t *)malloc(sizeof(workorder_t));
    server_comm_get_request(&workorderp->conn, workorderp->req_buf);

    sscanf(workorderp->req_buf, "%d", &trans_id);
    worker_threadp = (pthread_t *)malloc(sizeof(pthread_t));

    switch (trans_id) {
      case DEPOSIT_TRANS:
         pthread_create(worker_threadp, &custom_attr_fifo, process_request,
                       (void *)workorderp);
         break;
      default:
         pthread_create(worker_threadp, &custom_attr_rr, process_request,
                       (void *)workorderp);
         break;
    }
  }
  return 0;
}
In our server's main routine, the boss thread checks the request type before creating a thread to process it. If the request is a deposit, the boss specifies the custom_attr_fifo attribute object in the pthread_create call. Otherwise, it uses the custom_attr_rr attribute object.
