Version 1.3 of ai12s/ai12-0323-1.txt

!standard D.16(16/5)          19-03-11 AI12-0323-1/02
!class Amendment 19-03-07
!status Amendment 1-2012 19-03-11
!status ARG Approved 8-0-2 19-03-11
!status work item 19-03-07
!status received 19-02-26
!priority Low
!difficulty Easy
!subject Implementation Advice for the CPU aspect for protected types
!summary
If CPU is statically specified for a protected type, then the implementation should not use busy-waiting.
!problem
The Implementation Advice in AI12-0281-1 seems to assume that the specification of CPU is static. But that is not required by the language. In the case where CPU is not specified statically, the implementation would have to determine dynamically whether to use busy-waiting or whether an active priority change alone is sufficient. This could make protected actions far more expensive than necessary, especially if it is necessary to make a system call to implement busy-waiting.
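The difference between the static and the non-static case can be sketched as follows. This is only a minimal model, written in Python as a neutral modeling language rather than as Ada runtime code. The names Protocol, protocol_for, and compile_start_sequence are hypothetical. The model assumes that busy-waiting is needed exactly when the protected object is not assigned to a processor; the value 0 for Not_A_Specific_CPU is the one given in System.Multiprocessors.

```python
from enum import Enum
from typing import Callable, Optional, Tuple

NOT_A_SPECIFIC_CPU = 0  # same value as System.Multiprocessors.Not_A_Specific_CPU


class Protocol(Enum):
    PRIORITY_ONLY = "raise active priority only"
    BUSY_WAIT = "busy-wait lock, then raise active priority"


def protocol_for(po_cpu: int) -> Protocol:
    # Assumption of this model: a PO assigned to one processor needs no lock.
    if po_cpu == NOT_A_SPECIFIC_CPU:
        return Protocol.BUSY_WAIT
    return Protocol.PRIORITY_ONLY


def compile_start_sequence(
    static_cpu: Optional[int],
) -> Callable[[int], Tuple[Protocol, int]]:
    """Model code generation for "start a protected action".

    The returned function takes the PO's CPU value at run time and returns
    (protocol used, number of run-time tests executed for this call).
    """
    if static_cpu is not None:
        chosen = protocol_for(static_cpu)  # decided once, at compile time
        return lambda po_cpu: (chosen, 0)
    # CPU not static: both protocols must be available, one test per call.
    return lambda po_cpu: (protocol_for(po_cpu), 1)


static_start = compile_start_sequence(static_cpu=2)
dynamic_start = compile_start_sequence(static_cpu=None)
```

In this model, static_start never executes a test at run time, while dynamic_start pays for one on every call and must carry both protocols in the generated code.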
!proposal
(See Summary.)
!wording
Modify D.16(16/5):
Starting a protected action on a protected object {statically} assigned to a processor should be implemented without busy-waiting.
!discussion
As proposed, this wording applies even when the tasks aren't statically assigned a CPU. In that case, the task would have to check if it is allowed to use the protected object before starting the action, but that check would only require the values that would be stored in the task's TCB and in the protected object -- so there still is no need to use out-of-line code to start such an action.
Note that a task that is starting a protected action is running, so it isn't present on any queues. Thus changing the active priority (which likely is a number in the task's TCB), along with any necessary check that the task is allowed to use the PO (that is, its priority and CPU), is the only thing necessary to start a protected action on a PO that is using ceiling locking on a single CPU. In particular, no interaction with the task runtime is needed. (For a protected subprogram call, an interaction would be needed only when lowering the priority, as that might cause the task to be preempted.)
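The model described above can be sketched in a few lines. This is a hedged illustration in Python, not code from any Ada runtime. The names TaskControlBlock, ProtectedObject, start_protected_action, and end_protected_action are hypothetical, and ProgramError merely stands in for Program_Error. The sketch assumes the protected object is assigned to a specific CPU and uses ceiling locking.

```python
from dataclasses import dataclass

NOT_A_SPECIFIC_CPU = 0  # same value as System.Multiprocessors.Not_A_Specific_CPU


class ProgramError(Exception):
    """Stands in for Ada's Program_Error in this model."""


@dataclass
class TaskControlBlock:
    cpu: int
    active_priority: int


@dataclass
class ProtectedObject:
    cpu: int      # assumed to be a specific CPU, not NOT_A_SPECIFIC_CPU
    ceiling: int


def start_protected_action(tcb: TaskControlBlock, po: ProtectedObject) -> int:
    """Start an action using only fields of the TCB and of the PO.

    Returns the priority to restore when the action ends.
    """
    # CPU check: an unassigned task (cpu == 0) also fails this comparison.
    if tcb.cpu != po.cpu:
        raise ProgramError("caller is not assigned to the PO's processor")
    # Ceiling check.
    if tcb.active_priority > po.ceiling:
        raise ProgramError("ceiling violation")
    saved = tcb.active_priority
    tcb.active_priority = po.ceiling  # just a number in the TCB; no queue is touched
    return saved


def end_protected_action(tcb: TaskControlBlock, saved: int) -> bool:
    """Restore the priority; True means the scheduler must be consulted,
    because lowering the priority might cause the task to be preempted."""
    lowered = saved < tcb.active_priority
    tcb.active_priority = saved
    return lowered
```

Only end_protected_action reports a possible interaction with the scheduler; start_protected_action reads and writes nothing but the two records.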
---
We had three different suggestions for fixing this problem:
(1) The proposed solution.
(2) Just completely delete the advice. The author rejects this solution as users should be able to depend on setting CPU to ensure that protected actions do not use locks. We can't require this normatively, since the concept of busy-waiting and of a lock aren't well-defined; but Implementation Advice (which requires documentation as to whether it is followed) allows users to determine whether an implementation follows the advice.
(3) Leave the advice as it is. It is argued that task runtime interactions will be more expensive than any test, and that the code to start a protected action will be in the runtime anyway. However, the model given above shows that the runtime need not be involved for starting a protected action for a protected subprogram call, so this argument appears to be false. Leaving the advice unmodified requires a more expensive implementation than necessary on some targets.
!corrigendum D.16(14/3)
Replace the paragraph:
The CPU value determines the processor on which the task will activate and execute; the task is said to be assigned to that processor. If the CPU value is Not_A_Specific_CPU, then the task is not assigned to a processor. A task without a CPU aspect specified will activate and execute on the same processor as its activating task if the activating task is assigned a processor. If the CPU value is not in the range of System.Multiprocessors.CPU_Range or is greater than Number_Of_CPUs the task is defined to have failed, and it becomes a completed task (see 9.2).
by:
For a task, the CPU value determines the processor on which the task will activate and execute; the task is said to be assigned to that processor. If the CPU value is Not_A_Specific_CPU, then the task is not assigned to a processor. A task without a CPU aspect specified will activate and execute on the same processor as its activating task if the activating task is assigned a processor. If the CPU value is not in the range of System.Multiprocessors.CPU_Range or is greater than Number_Of_CPUs the task is defined to have failed, and it becomes a completed task (see 9.2).
For a protected type, the CPU value determines the processor on which calling tasks will execute; the protected object is said to be assigned to that processor. If the CPU value is Not_A_Specific_CPU, then the protected object is not assigned to a processor. A call to a protected object that is assigned to a processor from a task that is not assigned a processor or is assigned a different processor raises Program_Error.
Implementation Advice
Starting a protected action on a protected object statically assigned to a processor should be implemented without busy-waiting.
!ASIS
[Not sure. It seems like some new capabilities might be needed, but I didn't check - Editor.]
!ACATS test
ACATS B- and C-Tests are needed to check that the new capabilities are supported.
!appendix

From: Randy Brukardt
Sent: Tuesday, February 26, 2018  11:46 PM

When we discussed this AI today, we mentioned that (in general) aspect CPU is 
not static, so the compiler cannot always know whether or not busy-waiting is 
needed. We didn't think this is a problem, since aspect CPU is usually static,
and is always static for profiles Ravenscar/Yorvik.

The Implementation Advice in this AI reads:

   Starting a protected action on a protected object assigned to a processor 
   should be implemented without busy-waiting.

This is not possible in general, as noted above. Should we fix this wording so 
it applies only to cases where the implementation could reasonably do the 
right thing? A strategic insertion of "statically" should do the trick:

   Starting a protected action on a protected object statically assigned to 
   a processor should be implemented without busy-waiting.

Generally, we try to avoid asking the impossible in implementation advice, and 
while I suppose it could be accomplished with heroic efforts (generating all 
of the operations both ways, and picking one at runtime), that would seem to 
defeat the goal of analysis (especially at the generated code level) and 
surely efficiency.

That seems especially true in this case, where the advice is as much a 
statement to users of what they ought to expect as it is to implementers.

Thoughts?

****************************************************************

From: Tucker Taft
Sent: Wednesday, February 27, 2018  6:35 AM

Your update seems fine.  But I also don't see why you couldn't check at
run-time to decide whether to use busy waiting or just rely on the raising 
of priority, though of course doing that test does involve additional 
overhead.

****************************************************************

From: Tullio Vardanega
Sent: Wednesday, February 27, 2018  9:33 AM

As noted in yesterday's discussion, the goal here is for the application 
designer to be able to assert that the program is deadlock-free even when
running on a multicore processor. Making this claim soundly and simply 
requires the PO to run on the same CPU as the tasks that call its protected 
actions. And this involves a static guarantee.
Applying this reasoning to Randy's proposed fix to the Implementation Advice, 
we should be able to say that the addition of "statically" refers to the called 
PO and its callers. In other words it is not sufficient that the PO is 
statically assigned to a given CPU, if the tasks that call it may move about.
If they did, their priority (or deadline) assignment would lose meaning and 
therefore all bets of predictable execution behaviour would be off.

****************************************************************

From: Bob Duff
Sent: Wednesday, February 27, 2018  9:51 AM

> The Implementation Advice in this AI reads:

I think it's a waste of time discussing this Advice.
I suggest you delete it.

Steve argued that it tells Ada programmers what to expect.
I would agree, except that programmers don't read the RM.
If they want to know what to expect, they can consult gnat docs.

****************************************************************

From: Tucker Taft
Sent: Wednesday, February 27, 2018  10:00 AM

> I think it's a waste of time discussing this Advice.
> I suggest you delete it.

I think starting to talk about "static" is probably a mistake here.  I really
think the run-time system can be smart enough, presuming the CPU 
specifications are recorded somewhere in the run-time data structures (which I 
believe is necessary), to use this information to avoid a spin lock.  I don't 
see any further help from the compiler is necessary.

> Steve argued that it tells Ada programmers what to expect.
> I would agree, except that programmers don't read the RM.
> If they want to know what to expect, they can consult gnat docs.

That seems a bit parochial.  In fact, I think Ada programmers do read the 
manual more than non-Ada programmers read their language manual.  In part that 
is because Ada is one of the few languages that *has* an official manual! ;-)

****************************************************************

From: Richard Wai
Sent: Wednesday, February 27, 2018  10:35 AM

> I would agree, except that programmers don't read the RM.
> If they want to know what to expect, they can consult gnat docs.

Honestly as an Ada programmer, I refer to the RM heavily and avoid GNAT docs 
where at all possible (as well as extensions), since that is not really 
portable.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, February 27, 2018  3:26 PM

I agree. If no one ever reads the RM or cares about portable code, why have it 
at all? Besides, I hear a lot more people that don't know how to do something 
because they didn't read the GNAT docs, than those who don't know something 
because they didn't read the RM.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, February 27, 2018  3:39 PM

> I think starting to talk about "static" is probably a mistake here.  I 
> really think the run-time system can be smart enough, presuming the 
> CPU specifications are recorded somewhere in the run-time data 
> structures (which I believe is necessary), to use this information to 
> avoid a spin lock.  I don't see any further help from the compiler is 
> necessary.

The entire point, as I understand it, of the single-processor ceiling priority 
model is that it eliminates many places where you have to interact with the 
runtime system (which can be expensive). In particular, starting a protected
action does not require any use of the runtime system. (If you're running, 
there cannot be any higher priority task running, so you can enter the PO 
without any interaction with the runtime. You do have to raise your active
priority, but since you're running, that is just a number in the TCB
-- you can't be on any queues.) You do need an interaction when the priority 
is lowered when you *leave* the PO (that might cause you to be preempted) and
possibly for an entry if the barrier is closed.

Similarly, if there is a spin-lock, I'd expect that to be generated directly 
in-line, because it changes nothing about the running state of the task.

So, it's possible to support both, but at the cost of greatly increasing the 
code size for the interaction (not only do both methods have to be supported, 
but you would also need code to check the current CPU assignments of both 
sides - which would take multiple tests in order to deal with Not_A_Specific_CPU). The 
added overhead would at least partially defeat the purpose. I see no reason to 
have advice to do something stupid - to get this guarantee, you should use 
static CPU assignment.

****************************************************************

From: Tucker Taft
Sent: Wednesday, February 27, 2018  4:31 PM

I think you are overstating it.  I suspect that these things won't be in 
directly generated code, but might be parts of the run-time that are inlined, 
if your compiler supports that.  In that case, you can also generate the 
appropriate "if" statements around the calls on the different locking 
approaches, and perhaps try to inline everything, presumably eliminating 
unneeded code if various conditions are static.  And when I say "run time"
I don't mean going to a separate kernel, I just mean calling some run-time
library, at least in a "bare board" system.

Ultimately, if we can't agree, we should just drop the Impl. Advice.

****************************************************************

From: Bob Duff
Sent: Wednesday, February 27, 2018  6:15 PM

No further comment.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, February 27, 2018  6:34 PM

Be that as it may, Tullio pointed out that the intended use is when all of
the CPUs (both tasks and protected types) are static -- otherwise, the 
analysis needed to verify the deadlock avoidance gets complicated.

I don't see any point in putting a potentially expensive requirement on 
implementations in cases outside of the intended usage. All that does is make 
Ada programs run slower (at a minimum, by requiring two extra memory reads to 
retrieve and compare the CPU values for each protected action).

> Ultimately, if we can't agree, we should just drop the Impl. Advice.

I think that the *proper* advice is valuable to programmers, as they know that 
they can expect this property (and lean on their implementer if it isn't 
there). But I'm the wrong person to make that determination. (After all, I 
wouldn't care if the entirety of Annex D was dropped!)

****************************************************************
