!standard D.16                                      09-10-22  AI05-0167-1/01
!class Amendment 09-10-22
!status work item 09-10-22
!status received 09-10-22
!priority Medium
!difficulty Medium
!subject Managing affinities for programs executing on multiprocessor platforms

!summary

Facilities are provided to allow a multiprocessor platform to be partitioned into a number of non-overlapping allocation domains (ADs). Every task is scheduled within an AD. A task may also be assigned to execute on just one CPU from within its AD.

!problem

An increasing number of embedded applications are now executed on multiprocessor and multicore platforms. For non-real-time programs it is usually acceptable for the mapping of tasks to CPUs to be implementation defined and hidden from the program. For real-time programs it may not be acceptable for this mapping to be hidden from the program. The mapping is often known as the "affinity" of the task. The control of affinities is as important as the control of priorities. The ability to control the affinity of a task is needed in Ada.

!proposal

The following collection of additions to the language is concerned with supporting the execution of multi-tasking Ada programs on SMPs (identical multiprocessors).

The following issues are addressed:
- representing CPUs
- controlling task affinities
- identifying interrupt affinities
- supporting different scheduling schemes
- implementation advice and documentation requirements

A simple integer type is used to represent the range of CPUs. The range starts from one to cater more naturally for the single CPU case. There is a default that is necessary in other definitions (see the group budgets AI). The CPUs are potentially split into sets and hence an array of Boolean is defined (although perhaps a container should be used).
These definitions are given here in a child package of System, although they could just be added to System:

    package Ada.System.MultiProcessors is
       Number_of_CPUs : constant Positive := ;
       type CPU is range 1 .. Number_of_CPUs;
       Default_CPU : constant CPU := ;
       type CPU_Set is array (CPU) of Boolean;
    end Ada.System.MultiProcessors;

The following package (again defined here as an extension to System) allows the group of CPUs to be partitioned into a finite set of non-overlapping 'Allocation_Domains' (ADs). One AD is defined to be the 'System' AD; the environment task and any tasks derived from that task are allocated to the 'System' AD. Tasks can be allocated to an AD and be globally scheduled within that AD. Alternatively they can be allocated to an AD and assigned to a specific CPU within that AD. A task cannot be allocated to more than one AD, or assigned to more than one CPU.

    with Ada.Task_Identification; use Ada.Task_Identification;
    with Ada.Real_Time; use Ada.Real_Time;
    with Ada.System.MultiProcessors; use Ada.System.MultiProcessors;
    package Ada.System.Allocation_Domains is

       type Allocation_Domain is limited private;

       System_Allocation_Domain : constant Allocation_Domain;

       NON_EMPTY_SYSTEM_Allocation_DOMAIN : exception;
       CPU_NOT_IN_SYSTEM_DOMAIN : exception;
       CPU_ASSIGNED_IN_SYSTEM_DOMAIN : exception;
       CPU_NOT_IN_Allocation_DOMAIN : exception;

       function Create(PS : CPU_Set) return Allocation_Domain;
         -- raises CPU_NOT_IN_SYSTEM_DOMAIN if a CPU in PS is not in
         --   System_Allocation_Domain
         -- raises CPU_ASSIGNED_IN_SYSTEM_DOMAIN if a CPU in PS is in
         --   System_Allocation_Domain but has an assigned task
         -- raises NON_EMPTY_SYSTEM_Allocation_DOMAIN if the allocation
         --   would leave System_Allocation_Domain empty

       function Get_CPU_Set(AD : Allocation_Domain) return CPU_Set;

       function Get_Allocation_Domain(Tid : Task_Id := Current_Task)
          return Allocation_Domain;

       procedure Allocate_Task(AD : in out Allocation_Domain;
                               Tid : Task_Id := Current_Task);

       procedure Allocate_Task(AD : in out Allocation_Domain;
                               P : CPU;
                               Tid : Task_Id := Current_Task);
         -- raises CPU_NOT_IN_Allocation_DOMAIN if P not in AD

       procedure Set_CPU(P : CPU; Tid : Task_Id := Current_Task);
         -- raises CPU_NOT_IN_Allocation_DOMAIN if P not in current AD for Tid

       procedure Free_CPU(Tid : Task_Id := Current_Task);

       function Get_CPU(Tid : Task_Id := Current_Task) return CPU;

       procedure Delay_Until_And_Set_CPU(T : Ada.Real_Time.Time; P : CPU);
         -- raises CPU_NOT_IN_Allocation_DOMAIN if P not in the current AD
         --   of the calling task

    private
       type Allocation_Domain is new CPU_Set;
       System_Allocation_Domain : constant Allocation_Domain :=
          (others => True);
    end Ada.System.Allocation_Domains;

The required behaviour of each subprogram is as follows:

Create: creates a new AD and moves CPUs from the 'System' AD to this new AD. A CPU cannot be moved if it has a task assigned to it. The 'System' AD must not be emptied of CPUs as it always contains the environment task. Note that there is still only one environment task.

Get_CPU_Set and Get_Allocation_Domain are straightforward.

There are two Allocate_Task procedures. One allocates the task just to an AD (for global scheduling); the other allocates it to an AD and assigns a specific CPU within that AD (for partitioned scheduling). The language could allow tasks to migrate between ADs, or raise an exception if the task is already allocated.

Set_CPU assigns the task to the CPU. The task can now only execute on that CPU.

Free_CPU removes the CPU-specific assignment. The task can now execute on any CPU within its AD.

Get_CPU returns the CPU on which the designated task is executing. Note that if the task is not assigned to a specific CPU then the value returned may change asynchronously. An alternative behaviour would be to raise an exception if the task is not assigned.

Delay_Until_And_Set_CPU delays a task and then assigns the task to the specified CPU when the delay expires. This is needed for some scheduling schemes.
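The three checks described for Create can be sketched as a small, self-contained model. This is only an illustration under stated assumptions, not the proposed package itself: the 4-CPU platform, the Has_Task table (a task is assumed to be assigned to CPU 1) and the use of a plain CPU_Set as the returned domain are all hypothetical, and the check order is one possible choice.

```ada
with Ada.Text_IO;

procedure AD_Create_Model is

   --  Assumption: a 4-CPU platform (the proposal leaves this value
   --  implementation defined).
   Number_Of_CPUs : constant Positive := 4;
   type CPU is range 1 .. Number_Of_CPUs;
   type CPU_Set is array (CPU) of Boolean;

   --  Exception names follow the proposed package.
   CPU_NOT_IN_SYSTEM_DOMAIN           : exception;
   CPU_ASSIGNED_IN_SYSTEM_DOMAIN      : exception;
   NON_EMPTY_SYSTEM_Allocation_DOMAIN : exception;

   --  Model state (hypothetical): CPUs still in the 'System' AD, and
   --  CPUs that currently have a task assigned to them.
   System_AD : CPU_Set := (others => True);
   Has_Task  : constant CPU_Set := (1 => True, others => False);

   --  Moves the CPUs in PS out of the 'System' AD into a new AD.
   --  The 'System' AD is left unchanged if any check fails.
   function Create (PS : CPU_Set) return CPU_Set is
      Remaining : CPU_Set := System_AD;
   begin
      for P in CPU loop
         if PS (P) then
            if not System_AD (P) then
               raise CPU_NOT_IN_SYSTEM_DOMAIN;
            elsif Has_Task (P) then
               raise CPU_ASSIGNED_IN_SYSTEM_DOMAIN;
            end if;
            Remaining (P) := False;
         end if;
      end loop;
      if Remaining = CPU_Set'(others => False) then
         raise NON_EMPTY_SYSTEM_Allocation_DOMAIN;
      end if;
      System_AD := Remaining;
      return PS;
   end Create;

   AD_1 : CPU_Set;

begin
   AD_1 := Create ((3 | 4 => True, others => False));
   Ada.Text_IO.Put_Line ("CPU 3 in new AD:    " & Boolean'Image (AD_1 (3)));
   Ada.Text_IO.Put_Line
     ("CPU 3 in System AD: " & Boolean'Image (System_AD (3)));

   --  A second request for CPU 3 fails: it has left the 'System' AD.
   begin
      AD_1 := Create ((3 => True, others => False));
   exception
      when CPU_NOT_IN_SYSTEM_DOMAIN =>
         Ada.Text_IO.Put_Line ("CPU 3 is no longer in the System AD");
   end;
end AD_Create_Model;
```

Because every new AD is carved out of the 'System' AD and a CPU can leave it only once, the domains in this model cannot overlap, which is the property the proposal relies on.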

In addition to these two packages there are two new pragmas required to control the affinity of tasks during activation:

    pragma Allocation_Domain (AD : Allocation_Domain);
    pragma CPU (P : CPU);

If no affinities are declared then a task will inherit the AD and CPU (if assigned) of its parent task.

For protected objects there is no need for affinities; it is the tasks that have ADs and possibly an assigned CPU. PO code will run on the task's CPU. There is however a need to know on what CPU interrupt code will execute; this will allow an associated task to be assigned the same CPU. Hence the following (which could be added to Ada.Interrupts):

    function Get_CPUs(I : Interrupt_Id) return CPU_Set;

For most platforms only one CPU will be identified in the returned set. But it is possible for more than one CPU to be capable of handling a particular interrupt (although an occurrence of the interrupt will only be delivered to one of these CPUs).

Each AD is, in effect, scheduled independently, and hence could be subject to different dispatching policies. This is supported as follows. All ADs have the same range of priorities (System.Any_Priority). The 'System' AD, System_Allocation_Domain, is subject to the policies defined using the configuration pragmas Task_Dispatching_Policy and Priority_Specific_Dispatching. All other ADs have as default those of System_Allocation_Domain but can re-define their policies by using the following library routines:

    procedure Task_Dispatching_Policy(AD : Allocation_Domain; ...);
    procedure Priority_Specific_Dispatching(AD : Allocation_Domain; ...);

Extra definitions will be needed to make these equivalent to the current pragmas. These procedures could be defined in the package above. Ideally a program should only be able to call Create and these dispatching policy routines at the library level.
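
The defaulting rule for dispatching policies can be illustrated with a small model. This is a hedged sketch only: AD_Id, User_AD and the Effective table are hypothetical, the policy names are borrowed from the existing Annex D policy identifiers, and the elided parameters of the proposed routine are replaced here by a single Policy value.

```ada
with Ada.Text_IO;

procedure AD_Policy_Model is

   --  Hypothetical identifiers, for illustration only.
   type Policy is
     (FIFO_Within_Priorities,
      Round_Robin_Within_Priorities,
      EDF_Across_Priorities);
   type AD_Id is (System_AD, AD_1, AD_2);
   subtype User_AD is AD_Id range AD_1 .. AD_2;

   --  Stands in for the policy set by the configuration pragma.
   System_Policy : constant Policy := FIFO_Within_Priorities;

   --  Every AD starts with the policy of the 'System' AD.
   Effective : array (AD_Id) of Policy := (others => System_Policy);

   --  Models the proposed library routine. Restricting the parameter
   --  to User_AD is a choice of this model: the 'System' AD keeps the
   --  policy given by the configuration pragma.
   procedure Task_Dispatching_Policy (AD : User_AD; P : Policy) is
   begin
      Effective (AD) := P;
   end Task_Dispatching_Policy;

begin
   Task_Dispatching_Policy (AD_2, EDF_Across_Priorities);
   for AD in AD_Id loop
      Ada.Text_IO.Put_Line
        (AD_Id'Image (AD) & " uses " & Policy'Image (Effective (AD)));
   end loop;
end AD_Policy_Model;
```

In this model AD_1 is never re-defined and so keeps the policy of the 'System' AD, while AD_2 is dispatched under its own policy.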

There are a number of implementation characteristics that must be documented, and there will be certain implementation advice useful to include in the ARM. For example, the CPU(s) on which the clock interrupt is handled, and hence on which delay queue and ready queue manipulations (and user code - Timing Events) are executed, must be documented. As there is no scheduling between ADs, an implementation is recommended to have distinct queues per AD.

An implementation must also document how it implements Ceiling_Locking for protected actions, in particular what happens to a task that calls an occupied PO. The implementation advice will be for the task to spin at its current active priority level, and then run at the (higher) ceiling level once it has gained access.

!wording

** TBD **

!discussion

An increasing number of embedded applications are now executed on multiprocessor and multicore platforms. For non-real-time programs it is usually acceptable for the mapping of tasks to CPUs to be implementation defined and hidden from the program. For real-time programs this is not the case. The control of affinities is as important as the control of priorities.

In the literature on multiprocessor scheduling there are two main approaches to affinity: global scheduling, where the tasks can run on any CPU (and potentially migrate at run-time); and partitioned scheduling, where tasks are anchored to a single CPU. From these schemes, two further variants are commonly discussed: for global scheduling, tasks are restricted to a subset of the available CPUs; and for partitioned scheduling, the program can explicitly change a task's affinity and hence cause it to be moved at run-time.

Restricting the set of CPUs on which a task can be globally scheduled supports scalability - as platforms move to contain hundreds of CPUs, the overheads of allowing full task migration become excessive and outweigh any advantage that might accrue from global scheduling.
Controlled changing of a task's affinity has been shown to lead to improved schedulability for certain types of application.

These four schemes can be used with any form of dispatching, for example fixed priority or EDF. For multiprocessors, EDF is no longer optimal and is not always better than fixed priority. Also, global scheduling is usually better than partitioning, but not always. New dispatching algorithms will be defined in the future (it is an active research area); the provisions defined above will allow many of these algorithms to be programmed using the controls provided. However, a fully flexible model with, for example, overlapping allocation domains is not supported by the workshop (IRTAW). It was felt better to remove a constraint later rather than attempt to add one.

Protected objects require a real lock on a multiprocessor platform unless all user tasks are assigned the same CPU. Spin locking (where tasks spin at their current active priority) is an adequate scheme. Programmer control over ceilings allows protocols such as non-preemptive execution of 'shared' POs to be programmed. No further language provision is required.

The provisions outlined in this AI formed the main focus of the 14th IRTAW. They are the result of considerable discussion and evaluation. The starting point was a number of papers at the workshop, including: Supporting Execution on Multiprocessor Platforms, by Burns and Wellings; Providing Additional Real-Time Capability and Flexibility for Ada 2005, by Rod White; Towards a Ravenscar Extension for Multiprocessor Systems, by Ruiz; and Realtime Paradigms Needed Post Ada 2005, by Michell et al. There were also relevant papers and discussions at the previous workshop.

The notion of an allocation domain has some similarities to the current Ada notion of a partition. However, the workshop felt that the two notions were not identical and that using partitions for allocation domains would not be effective.
Partitions are more likely to have a role with non-SMP (i.e. CC-NUMA) architectures.

The definition of ADs is such that a simple system with just one AD will not need to consider these domains. Moreover, if global dispatching using fixed priorities is adequate then the program can be silent on all affinity issues.

!example

** TBD **

--!corrigendum D.16(1)

!ACATS test

Add an ACATS C-Test of this package.

!appendix

****************************************************************