!standard C.6(13.2/3) 16-10-02 AI12-0128-1/09 !standard C.6(19) !standard C.6(20) !standard C.6(22/2) !standard C.6(25/4) !class Amendment 14-10-03 !status Amendment 1-2012 16-08-04 !status WG9 Approved 16-10-08 !status ARG Approved 9-0-3 16-06-12 !status work item 14-10-03 !status received 14-07-14 !priority Medium !difficulty Hard !subject Exact size access to parts of composite atomic objects !summary Memory accesses to subcomponents of an atomic composite object must read or write the entire object. !problem It is not unusual for hardware to require access in a particular size. Indeed, Ada has C.6(22/2) to make it possible for users of Atomic and Volatile objects to be able to match such hardware requirements. Ada also has record representation clauses to make it possible to represent bit fields in terms of a record rather than having to use obscure masking operations. Unfortunately, when these two capabilities are put together, problems occur. In particular, access to components of a composite atomic object might have the wrong size. C.6(15) only applies to the object as a whole, so compilers are free to use the wrong size to access the components. The language should be defined so this doesn't happen. !proposal (See Wording.) !wording Add after C.6(13.2/3) [the end of the Legality Rules section] If a nonatomic subcomponent of an atomic object is passed as the actual parameter in a call then the formal parameter shall allow pass by copy (and, at run time, the parameter shall be passed by copy). A nonatomic subcomponent of an atomic object shall not be used as an actual for a generic formal of mode in out. A nonatomic subcomponent of an atomic type shall not be aliased. A nonatomic subcomponent of an atomic type or object shall not have components that are specified to be independently addressable. 
Add after C.6(19) [the end of the Dynamic Semantics section] All reads of or writes to any nonatomic subcomponent of an atomic object shall be implemented by reading and/or writing all of the nearest enclosing atomic object. Implementation Note: For example, if a 32-bit record object has four nonatomic components, each occupying one byte, then an assignment to one of those components might normally be implemented on some target machines via some sort of store_byte instruction; if the record object is atomic then instead a 32-bit read-modify-write must be performed. That read-modify-write need not be atomic, although the read and the write must each separately be atomic. Note that it doesn't matter whether the store_byte instruction would have executed atomically. This rule is needed in some cases for memory-mapped device registers. Modify C.6(20): [the Implementation Requirements section] The external effect of a program (see 1.1.3) is defined to include each read and update of a volatile or atomic object. The implementation shall not generate any memory reads or updates of atomic or volatile objects other than those specified by the program. {However, there may be target-dependent cases where reading or writing a volatile but nonatomic object (typically a component) necessarily involves reading and/or writing neighboring storage, and that neighboring storage might overlap a volatile object.} Modify C.6(22/2): [part of the Implementation Advice section] A load or store of a volatile object whose size is a multiple of System.Storage_Unit and whose alignment is nonzero, should be implemented by accessing exactly the bits of the object and no others{, except in the case of a volatile but nonatomic subcomponent of an atomic object}. 
Add after C.6(25/4): [an additional user Note] 11 When mapping an Ada object to a memory-mapped hardware register, the Ada object should be declared atomic to ensure that the compiler will read and write exactly the bits of the register as specified in the source code and no others. AARM Discussion: This is especially important for a write-only hardware register, as in such a case a read-modify-write cycle will not work. The only time the language guarantees such a cycle will not happen is when writing an entire atomic object. If one wants to write individual components of a write-only hardware register (assuming the hardware supports that), those also need to be declared atomic. !discussion Consider the case of a memory mapped device register which is represented as an atomic object of a record or array type. Suppose further that there is a requirement that all memory references to this object must read or write the entire object/register. For example, if it is a 32-bit register and you want to modify one byte of the register, then you must perform a 32-bit load into a 32-bit temporary, modify the appropriate byte of the temporary, and then copy the value of the temporary into the register via a 32-bit assignment. This AI is intended to improve Ada's support for this situation, which is viewed as being common enough to warrant such support. The solution chosen in this AI is to unconditionally assume that any composite atomic object is subject to this requirement. An alternative which was considered would be to introduce a new Boolean aspect, Whole_Object_Reference, which could be used to indicate that a given atomic composite object is subject to this requirement. This finer-grained specification was deemed to introduce too much complexity (e.g., should the aspect be specifiable for a formal parameter, with corresponding changes to the conformance rules?). 
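As an illustration only (not part of the proposed wording), the manual load/modify/store sequence described at the start of this discussion might be written as in the sketch below. The names Control, Register and Temp and the four-byte field layout are hypothetical. The sketch also assumes a target where System.Storage_Unit is 8 and where a 32-bit, 4-byte-aligned object can be accessed atomically. A real device register would additionally carry an Address aspect, omitted here so the sketch can run on an ordinary host.

```ada
with Ada.Text_IO;

procedure Whole_Register_Update is

   type Byte is mod 2**8 with Size => 8;

   --  Hypothetical register layout: four one-byte fields in one 32-bit word.
   type Control is record
      Mode    : Byte;
      Speed   : Byte;
      Channel : Byte;
      Flags   : Byte;
   end record
     with Size => 32, Alignment => 4;

   for Control use record
      Mode    at 0 range  0 ..  7;
      Speed   at 0 range  8 .. 15;
      Channel at 0 range 16 .. 23;
      Flags   at 0 range 24 .. 31;
   end record;

   --  Stand-in for a memory-mapped register (no Address aspect here).
   Register : Control with Atomic;

   --  Ordinary, nonatomic working copy.
   Temp : Control;

begin
   Register := (Mode => 1, Speed => 2, Channel => 3, Flags => 4);

   Temp := Register;        --  read of the atomic object as a whole
   Temp.Speed := 16#7F#;    --  modify only the local copy
   Register := Temp;        --  write of the atomic object as a whole

   --  Check the result through another whole-object read.
   Temp := Register;
   pragma Assert (Temp.Mode = 1 and then Temp.Speed = 16#7F#);
   Ada.Text_IO.Put_Line ("Speed =" & Byte'Image (Temp.Speed));
end Whole_Register_Update;
```

Under the rule proposed in this AI, a direct assignment such as Register.Speed := 16#7F#; would be required to perform the same whole-object read and whole-object write implicitly, so the explicit temporary would be a matter of style rather than necessity.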
C.6 already imposes rules to ensure that any code which manipulates a reference to a volatile object (e.g., via 'Access or via passing an actual parameter by reference) knows that the referenced object is volatile - problems result if code can access a volatile object without knowing that the object is volatile. Similar rules are added for preventing references to subcomponents of an atomic object from "leaking out" to contexts which are unaware of the existence of the enclosing atomic object. This "whole object reference" requirement implies that two components of an atomic object are never independently addressable. This could theoretically introduce an incompatibility in the case of a program which has two tasks which asynchronously manipulate different components of an atomic object (in the case where the two components were, until this AI came along, independently addressable). Another such case involves the case of an atomic record with two components, C1 and C2. Reading or writing C2 will now introduce reads and/or writes of C1, violating the usual rule about implicit reads of volatile objects (note that C1 is necessarily volatile because atomicity implies volatility and volatility of an object implies volatility of a component thereof); but the point of this AI is that this change is considered to be desirable. There is also a permission mentioned explicitly in the new wording which was previously only implicit (and even that is debatable): However, there may be target-dependent cases where reading or writing a volatile but nonatomic object (typically a component) necessarily involves reading and/or writing neighboring storage, and that neighboring storage might overlap a volatile object. Consider, for example, a volatile array of 32 Booleans with a component size of 1. An assignment to an element of this array will typically require read-modify-writing at least the enclosing byte. 
Note that there are no atomic objects in this scenario, so this permission is only tangentially related to the main topic of this AI.

!example

   type Status is record
      Ready  : Boolean;
      Length : Integer range 0 .. 15;
   end record;
   for Status use record
      Ready  at 0 range 0 .. 0;
      Length at 0 range 1 .. 5;
   end record;

   Status_Register : Status
      with Address => ..., Size => 32, Atomic => True;

   Temporary : Status;

   Status_Register := (Ready => True, Length => 0);

   Temporary := Status_Register;

   if Temporary.Ready then
      null;
   end if;

   if Status_Register.Ready then -- Reads entire register
      null;
   end if;

   Status_Register.Length := 10; -- Prereads entire register,
                                 -- then writes entire register.

!corrigendum C.6(13.2/3)

@dinsa
It is illegal to specify a representation aspect for a component, object or type for which the aspect Independent or Independent_Components is True, in a way that prevents the implementation from providing the independent addressability required by the aspect.
@dinst
If a nonatomic subcomponent of an atomic object is passed as the actual parameter in a call then the formal parameter shall allow pass by copy (and, at run time, the parameter shall be passed by copy). A nonatomic subcomponent of an atomic object shall not be used as an actual for a generic formal of mode @b<in out>. A nonatomic subcomponent of an atomic type shall not be aliased. A nonatomic subcomponent of an atomic type or object shall not have components that are specified to be independently addressable.

!corrigendum C.6(19)

@dinsa
If an actual parameter is atomic or volatile, and the corresponding formal parameter is not, then the parameter is passed by copy.
@dinst
All reads of or writes to any nonatomic subcomponent of an atomic object shall be implemented by reading and/or writing all of the nearest enclosing atomic object.

!corrigendum C.6(20)

@drepl
The external effect of a program (see 1.1.3) is defined to include each read and update of a volatile or atomic object.
The implementation shall not generate any memory reads or updates of atomic or volatile objects other than those specified by the program. @dby The external effect of a program (see 1.1.3) is defined to include each read and update of a volatile or atomic object. The implementation shall not generate any memory reads or updates of atomic or volatile objects other than those specified by the program. However, there may be target-dependent cases where reading or writing a volatile but nonatomic object (typically a component) necessarily involves reading and/or writing neighboring storage, and that neighboring storage might overlap a volatile object. !corrigendum C.6(22/2) @drepl A load or store of a volatile object whose size is a multiple of System.Storage_Unit and whose alignment is nonzero, should be implemented by accessing exactly the bits of the object and no others. @dby A load or store of a volatile object whose size is a multiple of System.Storage_Unit and whose alignment is nonzero, should be implemented by accessing exactly the bits of the object and no others, except in the case of a volatile but nonatomic subcomponent of an atomic object. !corrigendum C.6(25/4) @dinsa @xindent<@s9<10 Specifying the Pack aspect cannot override the effect of specifying an Atomic or Atomic_Components aspect.>> @dinst @xindent<@s9<11 When mapping an Ada object to a memory-mapped hardware register, the Ada object should be declared atomic to ensure that the compiler will read and write exactly the bits of the register as specified in the source code and no others.>> !ASIS No ASIS effect. !ACATS test An ACATS B-Test should be created to check the new Legality Rules. The actual read/write size cannot be tested by the ACATS, unfortunately, so we don't need or want a C-Test. !appendix !topic Does Atomic on a record apply to it's components ? 
!reference Ada 2012 RMC.6(15) !from Simon Clubley 2014-07-13 !keywords Atomic update record components This submission was prompted by an issue identified recently in comp.lang.ada. On an ARM target, a 32-bit record, with multiple bitfields, was defined as Atomic. However, the generated code showed that when a bitfield, instead of the record as a whole, was referenced, GNAT sometimes used ldrb/strb (byte level access instructions) instead of ldr/str (32 bit access instructions). This broke the hardware requirement that the register this record was being used to access must be accessed in units of 32 bits. Scenario: Consider a 32 bit record marked as Atomic. This record consists of multiple bitfields and is used to model the bitfields in a 32 bit memory mapped device register. The hardware requires the device register to be read from and written to in units of 32 bits. Now consider the following statement: Device_Register.bitfield := 1; When this specific bitfield in the record, say 4 bits wide, is written to, does the Atomic attribute on the record require the record to be accessed in units of 32 bits or is the compiler permitted to generate code to access, say, 8 bits of the record only ? C.6(15) states: For an atomic object (including an atomic component) all reads and updates of the object as a whole are indivisible. There are conflicting opinions about the above rule in comp.lang.ada. The words "as a whole" in C.6(15) were used to justify the position that access to this single bitfield is not required to be in units of the record size which on the surface seemed reasonable. However, other opinions are that C.6(15) does apply when accessing this single bitfield. Which opinion is correct ? I would like C.6(15) to apply, but I am more interested in obtaining a firm decision so other options can be considered if it doesn't. 
As a related question, when C.6(15) does apply, does it also apply to all the bits included by 'Size when 'Size has been used to increase the size of the atomic object (say from 5 bits to 32 bits) ? Possible solutions: If C.6(15) applies when accessing a single bitfield in an Atomic record, then this is clearly a GNAT bug and needs handling as such. If C.6(15) is deemed not to apply to a single bitfield, then we need a mechanism to indicate, in the program source code, that the compiler is not allowed to reference a bitfield in an Atomic record by just accessing one segment of that record's memory. If the partial aggregate proposal is approved, then this could be the mechanism. If it is not approved, then a pragma/aspect along the lines of "No_Segmented_Access" could be the mechanism to enforce this. I do believe a mechanism is required in Ada itself however. The special requirements of the register should be declared in the source code, for the compiler (and maintainers) to see, so that it can be guaranteed to be honoured when the code is compiled. **************************************************************** From: Jeff Cousins Sent: Tuesday, July 15, 2014 4:15 AM C.6 8/3's "Finally, if an object is volatile, then so are all of its subcomponents [(the same does not apply to atomic)]." seems to support the view that the words "as a whole" in C.6(15) win. **************************************************************** From: Tucker Taft Sent: Tuesday, July 15, 2014 12:01 PM > ... C.6(15) states: > > For an atomic object (including an atomic component) all reads and > updates of the object as a whole are indivisible. > > There are conflicting opinions about the above rule in comp.lang.ada. > The words "as a whole" in C.6(15) were used to justify the position > that access to this single bitfield is not required to be in units of > the record size which on the surface seemed reasonable. 
> > However, other opinions are that C.6(15) does apply when accessing > this single bitfield. > > Which opinion is correct ? C.6(15) is only referring to reads and updates of the object as a whole. There are no special "atomicness" rules when referring to individual components, though all such components are considered volatile. > I would like C.6(15) to apply, but I am more interested in obtaining a > firm decision so other options can be considered if it doesn't. > > As a related question, when C.6(15) does apply, does it also apply to > all the bits included by 'Size when 'Size has been used to increase > the size of the atomic object (say from 5 bits to 32 bits) ? The definition of 'Size on an elementary object is that it refers to the number of bits normally read/written, as specified in 13.1(7/2): "The representation of an object consists of a certain number of bits (the size of the object). For an object of an elementary type, these are the bits that are normally read or updated by the machine code when loading, storing, or operating-on the value of the object...." For a non-atomic composite object, extra bits might not be loaded/stored, again quoting from that same paragraph: "... For a composite object, padding bits might not be read or updated in any given composite operation, depending on the implementation." For an atomic composite object, although it is not stated explicitly, the implication is that the 'Size of an atomic object determines the number of bits read/written indivisibly (e.g. see C.6(11/4)). > Possible solutions: > > If C.6(15) applies when accessing a single bitfield in an Atomic > record, then this is clearly a GNAT bug and needs handling as such. > > If C.6(15) is deemed not to apply to a single bitfield, then we need a > mechanism to indicate, in the program source code, that the compiler > is not allowed to reference a bitfield in an Atomic record by just > accessing one segment of that record's memory. 
> > If the partial aggregate proposal is approved, then this could be the > mechanism. If it is not approved, then a pragma/aspect along the lines > of "No_Segmented_Access" could be the mechanism to enforce this. > > I do believe a mechanism is required in Ada itself however. The > special requirements of the register should be declared in the source > code, for the compiler (and maintainers) to see, so that it can be > guaranteed to be honoured when the code is compiled. For special-purpose hardware requirements such as this, it seems better to fetch the word as a whole into a temp, set the bit of interest, and then store the word back as a whole. Expecting the compiler to give you this level of control is probably overkill. Do you have examples of other languages that promise this level of control? If so, how is it specified? **************************************************************** From: Randy Brukardt Sent: Tuesday, July 15, 2014 5:36 PM > > However, other opinions are that C.6(15) does apply when accessing > > this single bitfield. > > > > Which opinion is correct ? > > C.6(15) is only referring to reads and updates of the object as a > whole. There are no special "atomicness" rules when referring to > individual components, though all such components are considered > volatile. This seems to defeat the purpose of C.6(15), which was specifically to support the sort of hardware register access noted by the questioner. OTOH, there's plenty of evidence that this (misguided) interpretation was intended (which I missed during the initial comp.lang.ada discussion, so I put fuel on this fire mistakenly). That means that fixing it would be incompatible, and that's not likely to fly. ... > > I do believe a mechanism is required in Ada itself however. The > > special requirements of the register should be declared in the > > source code, for the compiler (and maintainers) to see, so that it > > can be guaranteed to be honoured when the code is compiled. 
> > For special-purpose hardware requirements such as this, it seems > better to fetch the word as a whole into a temp, set the bit of > interest, and then store the word back as a whole. Sure, this makes sense. But if someone mistakenly does the wrong thing, the compiler just generates the wrong code without any complaint. That's not the Ada way (I hope). > Expecting the compiler to give you this level of control is probably > overkill. Do you have examples of other languages that promise this > level of control? If so, how is it specified? So far as I can tell, other languages don't *need* this sort of control, because a hardware register would never be modeled as a composite type. In C, for instance, one would generally use bit-masking operations to get at the parts of the register, so it would only be read as a whole. This problem happens because us Ada people say "hey, we have a better way to handle bit fields! Just declare an appropriate record and record representation clause and let the compiler handle all of the mess!". But if someone does that on an atomic hardware register, and writes natural access code, the wrong code is generated. The reason that we added Implementation Advice C.6(22/2) and C.6(23/2) was specifically to support the case of accessing hardware registers. It seems obnoxious that we say, "and, oh by the way, a record type will work here, but don't use it like one or we'll generate the wrong code.". There ought to be a way for the compiler to tell the user that the wrong code will be generated. I agree with you that an explicit temporary is needed. We don't want to get into implicit volatile operations (everything must appear in the source). But that means that any individual component access to a hardware register is a mistake. Thus, I think that an aspect is needed here that causes the compiler to reject any case where anything other than the exact bits of the entire atomic object would be accessed. 
That can't be the default for compatibility reasons, but it certainly should be possible. (I can't quite imagine any case where accessing part of an atomic object ever makes sense, for any expected use of atomic, but someone almost certainly has code that expects that to work.) I would suggest an aspect No_Partial_Access, which could only be given on atomic objects. That would have no effect if the atomic object's type was elementary, but for a composite object, it would prevent any direct access to components of the object. That would require that a temporary be used to read or update just part of the object (as it should -- it's important that what happens to the other bits be specified in such a case). Access to memory-mapped hardware has always been considered important in Ada, and using bit-mapped records rather than bit masks has always been considered an Ada advantage -- and it's silly that they don't work well together. At the least, we should provide a means so that bad uses are detected, not just silently executed (incorrectly because of the hardware limitations). **************************************************************** From: Simon Clubley Sent: Wednesday, July 15, 2014 3:55 PM >> > Which opinion is correct ? >> >> C.6(15) is only referring to reads and updates of the object as a >> whole. There are no special "atomicness" rules when referring to >> individual components, though all such components are considered >> volatile. > > This seems to defeat the purpose of C.6(15), which was specifically to > support the sort of hardware register access noted by the questioner. > OTOH, there's plenty of evidence that this (misguided) interpretation > was intended (which I missed during the initial comp.lang.ada > discussion, so I put fuel on this fire mistakenly). That means that > fixing it would be incompatible, and that's not likely to fly. > [First, my apologies for any delayed responses. 
The way Ada-Comment is described in the LRM makes it sound like a dropbox for public comments which are then discussed internally. I didn't realise two way communication was possible for the submitter after something was submitted.] Thanks, Randy. So at least we now know GNAT is behaving in a way compatible with how C.6(15) was intended to be interpreted when it was written. > ... >> > I do believe a mechanism is required in Ada itself however. The >> > special requirements of the register should be declared in the >> > source code, for the compiler (and maintainers) to see, so that it >> > can be guaranteed to be honoured when the code is compiled. >> >> For special-purpose hardware requirements such as this, it seems >> better to fetch the word as a whole into a temp, set the bit of >> interest, and then store the word back as a whole. One issue is that while excessive terseness is bad (C and friends), excessive verbosity can also hinder readability especially when it's boilerplate type code such as the above. One concern is such code could cause a routine to grow in size very quickly if it has to set up a number of registers for a device. > Sure, this makes sense. But if someone mistakenly does the wrong > thing, the compiler just generates the wrong code without any > complaint. That's not the Ada way (I hope). > >> Expecting the compiler to give you this level of control is >> probably overkill. Do you have examples of other languages that >> promise this level of control? If so, how is it specified? No I don't, and in a way that's kind of the point. Ada is good at allowing people to model the actual problem in ways that some other languages are not. I was just looking to enhance Ada's abilities further. > So far as I can tell, other languages don't *need* this sort of > control, because a hardware register would never be modeled as a > composite type. 
In C, for instance, one would generally use > bit-masking operations to get at the parts of the register, so it would only be read as a whole. Exactly. In C, the register is treated as an opaque integer and any meaning to those bitfields is placed in header files as opaque constants with no internal structure as far as the compiler is concerned. As well as not looking as readable in the code as bitfields would be, this severely reduces the ability of the compiler to do error checking at compile time. > I agree with you that an explicit temporary is needed. We don't want > to get into implicit volatile operations (everything must appear in the source). > But that means that any individual component access to a hardware > register is a mistake. Given the arguments above, I think C.6(15) is probably going to stand as-is so I accept a temporary is needed in current Ada compilers as they exist today. I'll discuss the partial aggregate option for future Ada versions on the other thread. > This, I think that an aspect is needed here that causes the compiler > to reject any case where anything other than the exact bits of the > entire atomic object would be accesses. That can't be the default for > compatibility reasons, but it certainly should be possible. (I can't > quite imagine any case where accessing part of an atomic object ever > makes sense, for any expected use of atomic, but someone almost > certainly has code that expects that to work.) > > I would suggest an aspect No_Partial_Access, which could only be given > on atomic objects. That would have no effect if the atomic object's > type was elementary, but for a composite object, it would prevent any > direct access to components of the object. That would require that a > temporary be used to read or update just part of the object (as it > should -- it's important that what happens to the other bits be specified in such a case). 
Another way of looking at No_Partial_Access is that, rather than denying component access, it (or something like it) could command the compiler to generate code which accesses components of the object by accessing the full object itself (ie: in units of the object size). If this was not possible, such code would be rejected by the compiler. A specific example for ARM: for a 32-bit record, the compiler would only be able to generate code which used ldr (a 32-bit load operation) and would be forbidden from generating code which used ldrb (an 8-bit load operation). Is that a viable option ? **************************************************************** From: Randy Brukardt Sent: Tuesday, July 22, 2014 1:26 PM Only in the sense that "anything is possible". It would take substantial additional work to allow it. First, it's important to remember that while Atomic doesn't apply to components, Volatile does (C.6(8/3)), and Atomic implies Volatile. So the components of a composite atomic object are volatile. That means that C.6(20) and C.6(22/2) apply to such components. That means no implicit reads or writes of such components, nor any access to extra bits. This last part means that not only CAN a compiler use a byte load to read a component of an atomic object, it MUST use such a load if it wants to follow C.6(22/2) exactly. Or in short, the language says that a compiler SHOULD do the wrong thing for such a component. We could of course add an exception to the Implementation Advice for components of atomic objects that have No_Partial_Access applied. That would complicate an otherwise simple rule, but otherwise seems harmless. In that case, it would be possible for component reads to be implemented by a "read whole object atomically/extract component from value" sequence. 
But it's not possible for component writes to work that way, because the other components have to come from somewhere (except of course in the degenerate case of having only one component in the object). If one pre-reads them from the atomic object, we get a violation of C.6(20) [remember, these are volatile, so implicit reads aren't allowed]. Just using default-initialized components (or random junk) doesn't make any sense implicitly - it could change components not explicitly accessed in the source. So it seems in any case we have to disallow component writes. As such, I went with the simplest approach (make it possible to detect bugs without any semantic change to fix them). Disallowing both reads and writes is easy and consistent. That seems more likely to get through the ARG. And in any case, we could in the future allow reads compatibly. Since the real solution to component writes is partial aggregates, component writes are not needed anyway. Perhaps component reads should be allowed as that would make writing the rest of the code easier. But without partial aggregates, reads don't seem to buy anything; you'll need to get in the habit of using temporaries for the entire object. **************************************************************** From: Simon Clubley Sent: Wednesday, July 23, 2014 2:31 PM Thank you for the detailed write-up and analysis Randy. After reading the various responses, I can see why temporaries are going to have to continue to be required for current Ada compilers and why the current interpretation of C.6(15) needs to stand as-is. On the plus side, at least there is now a firm write-up on the issue. :-) Thanks to everyone for their comments, **************************************************************** From: Matthias Richter Sent: Friday, August 15, 2014 7:46 AM > > I do believe a mechanism is required in Ada itself however. 
The > > special requirements of the register should be declared in the > > source code, for the compiler (and maintainers) to see, so that it > > can be guaranteed to be honoured when the code is compiled. > > For special-purpose hardware requirements such as this, it seems > better to fetch the word as a whole into a temp, set the bit of > interest, and then store the word back as a whole.> > Expecting the compiler to give you this level of control is probably > overkill. Do you> > have examples of other languages that promise this level of control? > If so, how is it specified? Yes, there is one (at least for ARM, see below). It is called 'C'. Volatile bitfields in 'C' generally are known as not well-defined regarding their layout and therefore they are avoided by many (most?) C programmers, but they offer control over the access width the compiler has to use for reads or writes of the individual fields. There are detailed requirements regarding the access width of bitfields in the AAPCS (ARM ABI specification). I am not sure if it is defined in the C standard. The relevant paragraph in the AAPCS (http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042e/IHI0042E_aapcs.pdf) is: - 7.1.7.5 Volatile bit-fields - preserving number and width of container - accesses - - When a volatile bit-field is read, its container must be read exactly once - using the access width appropriate to the type of the container. - - When a volatile bit-field is written, its container must be read exactly - once and written exactly once using the access width appropriate to the type - of the container. The two accesses are not atomic. - - Multiple accesses to the same volatile bit-field, or to additional volatile - bit-fields within the same container may not be merged. For example, an - increment of a volatile bit-field must always be implemented as two reads - and a write. 
- - Note - Note the volatile access rules apply even when the width and alignment of - the bit-field imply that the access could be achieved more efficiently using - a narrower type. For a write operation the read must always occur even if - the entire contents of the container will be replaced. - - If the containers of two volatile bit-fields overlap then access to one - bit-field will cause an access to the other. For example, in - struct S {volatile int a:8; volatile char b:2}; - an access to a will also cause an access to b, but not vice-versa. - If the container of a non-volatile bit-field overlaps a volatile bit-field - then it is undefined whether access to the non-volatile field will cause the - volatile field to be accessed. The type of the container of an individual bitfield dictates the access width to be used for reads and writes of this field. In other paragraphs rules for the layout of structs are defined. If a compiler is AAPCS-compliant, the layout is well-defined and predictable, although there is no explicit control as with representation clauses like in Ada. (I don't know if ABI specifications for other architectures contain similar definitions like the AAPCS) I would expect _at least_ the same level of control with Ada. Actually, it should be better (whatever better means in this context...) because, as pointed out by Randy: > This problem happens because us Ada people say "hey, we have a better > way to handle bit fields! Just declare an appropriate record and > record representation clause and let the compiler handle all of the mess!" 
****************************************************************

From: Matthias Richter
Sent: Sunday, September 14, 2014 9:45 AM

!topic Adding pragma/aspect to force specific read/write width
!reference Ada 2012 RM{clause unsure}
!from Matthias Richter 2014-09-14
!keywords read write device register bitfields access width

Background
----------

Different hardware architectures allow different access widths for reads/writes to/from memory or memory-mapped I/O:

1. Only word (e.g. 32 bit) accesses are allowed - the instruction set contains only word read and write instructions
2. Word, halfword and byte (32, 16 and 8 bit) accesses are allowed over the entire address space
3. Word, halfword and byte (32, 16 and 8 bit) accesses are generally allowed, but there are some addresses or address ranges where only word accesses are allowed

In case 1 or 2 there is usually no problem. Compilers know which instructions may be used for a particular architecture. In contrast, a compiler usually doesn't know when there are restrictions in portions of the address space (case 3) - and there is currently no way to tell an Ada compiler about such restrictions.

This problem became visible with the recently introduced Ada compilers for ARM microcontrollers. These microcontrollers generally allow word, halfword and byte accesses, but some addresses/address ranges (memory-mapped I/O registers) must be written and read only as 32 bit words. In this case it is possible that a compiler generates illegal instructions. It is very likely that other microcontrollers have similar restrictions.

There was a lively discussion on this topic in comp.lang.ada some weeks ago. That discussion focused mainly on atomicity and led to the proposal of a partial aggregate syntax. As a result, two comments/proposals were sent to this list by Simon Clubley.
The partial aggregate syntax is an interesting feature (and would be an elegant solution for the simultaneous update of multiple fields in I/O registers), but I don't think it would solve the problem with illegal instructions on architectures with restrictions in portions of their address space. I will elaborate on this below.

When does the problem occur?
----------------------------

The problem occurs when the following conditions are met:

- The hardware has a non-uniform address space with respect to the allowed access width (case 3 above)
- The compiler (strictly speaking its optimiser) knows that a particular write changes only a fraction of a word - or that only a fraction of a read word is actually used. It may then generate a byte (or halfword) write or read which covers only the affected fraction of the word

The generated byte (or halfword) write or read may be illegal for the address in question.

On architectures as described in case 2 above, this optimisation is legitimate, because in 'normal memory' it leads to exactly the same result as the write/read of the entire word. There might be pathological cases, but usually it leads to the same result with I/O registers, too. Architectures as described in case 1 above are obviously the trivial case; this kind of optimisation is impossible there.

Typically, partial writes or reads of records with bitfields may be optimised this way. The problem was first observed with writes or reads to fields of a record which was mapped to an I/O register via representation and address clauses. (This observation led to the already mentioned discussion in comp.lang.ada.) There may be other situations where this optimisation might occur.

Proposal
--------

Adding a pragma and/or aspect 'access width' would give the possibility to specify the allowed read/write width for a variable.
- The 'access width' may be specified for a subtype or an object
- The 'access width' may be specified for a record component: Writes or reads of this component must be done with the specified width
- The 'access width' may be specified for a record: All writes or reads of the record as a whole or of any of its components must be done with the specified width.
- Writes to components smaller than the given 'access width' must be realised as a read-modify-write cycle
- Allowed values are the number of bits of the read and write instructions available on the particular architecture (e.g. 8, 16, 32). If other values are given, the compiler shall reject the program
- Naming: 'access width' is a first proposal. Other (better?) proposals are appreciated

Existing workarounds
--------------------

- Don't use records with representation clauses for I/O registers, use bitmasks (like in C) and word read/writes instead. This is - to put it mildly - extremely ugly. Records with representation clauses were always advertised as the better solution in Ada. And indeed, it seems to be a more Ada-like level of abstraction to do it that way.
- Use temporary variables (in fact, make the read-modify-write cycle explicit): Read the entire record, write the field which should be modified and write back the whole record. From my point of view, this is not an adequate level of abstraction either. Given the fact that in typical microcontroller programs *) there are a lot of these register writes and reads, this approach is ugly, too.

*) I examined a program for motor control (not written in Ada, but that doesn't matter for this question). In this program, nearly 30% of non-empty code lines contain a read or a write of an I/O register. This is a more or less typical value for programs on small or middle-range microcontrollers.
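
[Illustration added in editing, not part of the original message. The following is a minimal sketch of the temporary-variable workaround described above. The register layout, the field names and the bit positions are hypothetical, and an ordinary variable stands in for the device register so that the sketch can run on a host; real code would add an address clause for the register. Whether each whole-object access really compiles to a single 32-bit load or store still depends on the implementation following C.6(22/2).]

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Register_Update is

   --  Hypothetical 32-bit control register; names and bit positions are
   --  invented for this sketch and do not describe any real device.
   type Prescaler_Type is range 0 .. 255;

   type Control_Register is record
      Enable    : Boolean;
      Interrupt : Boolean;
      Prescaler : Prescaler_Type;
   end record;

   for Control_Register use record
      Enable    at 0 range 0 .. 0;
      Interrupt at 0 range 1 .. 1;
      Prescaler at 0 range 8 .. 15;
   end record;
   for Control_Register'Size use 32;
   for Control_Register'Alignment use 4;

   --  Stand-in for the device register.  Real code would also give an
   --  address clause placing Reg at the hardware address.
   Reg : Control_Register :=
     (Enable => False, Interrupt => False, Prescaler => 1);
   pragma Atomic (Reg);

   Temp : Control_Register;

begin
   Temp := Reg;            --  read the atomic object as a whole
   Temp.Enable := True;    --  modify only the local copy
   Reg := Temp;            --  write the atomic object as a whole

   Temp := Reg;            --  read back as a whole before inspecting fields
   pragma Assert (Temp.Enable);
   pragma Assert (Temp.Prescaler = 1);
   Put_Line ("Enable    = " & Boolean'Image (Temp.Enable));
   Put_Line ("Prescaler =" & Prescaler_Type'Image (Temp.Prescaler));
end Register_Update;
```

The source text never names a component of Reg directly, so every access the program specifies is an access to the whole object.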
Note 1: There is no guarantee that these workarounds are not affected by the access width optimisation and therefore it is theoretically possible that illegal instructions are generated. Though it is very probable that a sensible compiler will do the 'right' thing here. I would expect a higher level of definedness from Ada...

Note 2: On other architectures (case 1 and 2 above), records with representation clauses work as expected. It's a pity that they don't work on modern, very popular microcontroller families.

Other solutions
---------------

In theory, there is a possibility to solve the problem outside the language. One could give the compiler a table where the appropriate access widths for particular address ranges are specified. This table could look similar to a linker script.

I don't like this approach: You would have to maintain information about the I/O registers in two places - in the Ada spec where the records with representation clauses are declared and in the said table. I would prefer to specify these things in a common place. I would see the allowed access width as a complement to the address.

(Note: It is enough of an annoyance that the user has to write these register specifications himself. In the 'C' world this is usually done by the processor manufacturer or the compiler vendor)

Atomic or volatile?
-------------------

- 'atomic' doesn't inhibit the optimisation described above. A byte write has exactly the same result as a word write which leaves the other three bytes unchanged - if a byte write is allowed at the particular address. And since it is not split in multiple writes, it is 'atomic' in its literal sense.
- For a write to parts of an atomic object I would expect that the full read-modify-write cycle is atomic. On most architectures this is very expensive (it requires things like interrupt locks unless there are special instructions for atomic read-modify-write cycles).
I don't think that this kind of thing should be done implicitly, so it probably would be best to forbid partial writes to atomic objects. I'm afraid that it would break compatibility with existing compilers to completely forbid it, so at least it should be required for a compiler to emit a warning about possibly unexpected behaviour in this case.
- The semantics of 'volatile' are usually sufficient for register accesses:
  - An object is read exactly once
  - An object is written exactly once
  It should be clarified that in case of a partial write of an object, this is done by a read-modify-write cycle consisting of exactly one read followed by exactly one write. I am not sure if [C.6] allows these read-modify-write cycles. If not, I propose to change it
  - existing compilers do it already this way (at least GNAT, I don't know about others)
  - this is the behaviour expected by a programmer who writes this type of low-level routine

Other Languages
---------------

In 'C', there is a notion of a 'container type' of a bitfield. That is not the type of the struct (record) as one could think, it is a type specifier given with each field of a bitfield struct. The AAPCS (ARM ABI specification) requires explicitly that accesses to bitfields must be done with the access width corresponding to its 'container type'. (See also my comment from 2014-08-15)

So, at least on ARM (with an AAPCS-compliant compiler), 'C' offers the possibility to use bitfields with full control over the access width and with predictable representation (this is defined also in the AAPCS). It's a pity that this is not possible with Ada...

Conclusion
----------

To make records with representation clauses usable for the access to IO registers (as advertised since the very beginnings of Ada) on today's microcontrollers, a possibility to specify an allowed access width (complementing the address clause) is highly desirable and should be added to the language.
Existing workarounds are 1) ugly and 2) their correct working is not really guaranteed. Even in 'C' it is possible to use a very similar construct (at least and especially on the microcontroller family suffering from the problem in Ada). **************************************************************** From: Randy Brukardt Sent: Tuesday, September 30, 2014 5:04 PM ... > - 'atomic' doesn't inhibit the optimisation described above. Yes it does. C.6(22/2) says that accessing only part of the bits is wrong. (Note that this is Implementation Advice only because what it means to access bits of the object is undefined and undefinable in Ada terms. We believe that IA is stronger than an requirement in this case because an implementation is required to document any IA that is not followed.) Remember that all atomic objects are also volatile. > A byte write has exactly the same result as a word write which leaves > the other three bytes unchanged - if a byte write is allowed at the > particular address. And since it is not split in multiple writes, it > is 'atomic' in its literal sense. No, such a write clearly violates C.6(22/2). The problem discussed on comp.lang.ada came about because the object in question was composite. That means that all of the components are also volatile. Thus, accessing a single component HAD to be done as a byte read or write -- as the component itself is volatile -- such an operation is not even atomic (the whole object is not considered). A compiler that wrote the entire object in this case would be violating C.6(22/2) for the component. That's the bug: a composite volatile variable that must be accessed in a fixed size can only be accessed as a whole, never as an individual component. The requirement C.6(20) and the advice C.6(22/2) together mean that the program source must not attempt such an operation (if the read/write size actually matters). 
The Ada compiler ought to give some support for this (but it can't be the default for compatibility reasons and because in many situations there is no problem). I am proposing an aspect for this purpose (as a response to Simon Clubley's question). > - For a write to parts of an atomic object I would expect that the > full read- modify-write cycle is atomic. See above. Writing to a part of an atomic object is going to result in nonsense because of the other rules of the language. Making some sort of atomic read/write cycle is going to break the semantics of the necessarily volatile components. The only sensible thing is to prevent writing part of an atomic object altogether, which ought to be done with help from the compiler (not the current situation of telling people not to do that). That's the point of the proposed aspect. [I'm being a little more vague than usual because I haven't actually written up the AI yet, I'll probably do that later this week.] **************************************************************** From: Matthias Richter Sent: Sunday, October 5, 2014 6:55 AM > > - 'atomic' doesn't inhibit the optimisation described above. > > Yes it does. C.6(22/2) says that accessing only part of the bits is wrong. > (Note that this is Implementation Advice only because what it means to > access bits of the object is undefined and undefinable in Ada terms. > We believe that IA is stronger than an requirement in this case > because an implementation is required to document any IA that is not > followed.) Remember that all atomic objects are also volatile. > > > A byte write has exactly the same result as a word write which > > leaves the other three bytes unchanged - if a byte write is allowed > > at the particular address. And since it is not split in multiple > > writes, it is 'atomic' in its literal sense. > > No, such a write clearly violates C.6(22/2). Are these rules meant to be followed literally? 
Or is it sufficient when the behaviour of the program is 'as if' the rules are followed? Apparently, GNAT (or rather its makers) has chosen the latter interpretation:

| 7.1.10 Atomic Variables and Optimization
|
| There are two considerations with regard to performance when atomic
| variables are used.
|
| First, the RM only guarantees that access to atomic variables be
| atomic, it has nothing to say about how this is achieved, though there
| is a strong implication that this should not be achieved by explicit
| locking code. Indeed GNAT will never generate any locking code for
| atomic variable access (it will simply reject any attempt to make a
| variable or type atomic if the atomic access cannot be achieved
| without such locking code).
|
| That being said, it is important to understand that you cannot assume
| that the entire variable will always be accessed. Consider this example:
|
|    type R is record
|       A,B,C,D : Character;
|    end record;
|    for R'Size use 32;
|    for R'Alignment use 4;
|
|    RV : R;
|    pragma Atomic (RV);
|    X : Character;
|    ...
|    X := RV.B;
|
| You cannot assume that the reference to RV.B will read the entire
| 32-bit variable with a single load instruction. It is perfectly
| legitimate if the hardware allows it to do a byte read of just the B
| field. This read is still atomic, which is all the RM requires. GNAT
| can and does take advantage of this, depending on the architecture and
| optimization level.
| Any assumption to the contrary is non-portable and risky. Even if you
| examine the assembly language and see a full 32-bit load, this might
| change in a future version of the compiler.

(source: https://docs.adacore.com/gnat-unw-docs/html/gnat_ugn_8.html#SEC97 )

I think the clause 'if the hardware allows it' in the last paragraph is part of the problem: Compilers assume that the smallest addressable unit is the same over the whole address space.
There is no way to tell a compiler if there are exceptions for specific addresses or address ranges.

Same problem in C.6(22/2): System.Storage_Unit is a global value. There is no way to tell if there are exceptions for specific addresses or address ranges.

> The problem discussed on comp.lang.ada came about because the object in > question was composite. That means that all of the components are also > volatile. Thus, accessing a single component HAD to be done as a byte read > or write -- as the component itself is volatile -- such an operation is not > even atomic (the whole object is not considered). A compiler that wrote the > entire object in this case would be violating C.6(22/2) for the component. > > That's the bug: a composite volatile variable that must be accessed in a > fixed size can only be accessed as a whole, never as an individual > component. The requirement C.6(20) and the advice C.6(22/2) together mean > that the program source must not attempt such an operation (if the > read/write size actually matters). The Ada compiler ought to give some > support for this (but it can't be the default for compatibility reasons and > because in many situations there is no problem). I am proposing an aspect > for this purpose (as a response to Simon Clubley's question).

I still think it would be very useful if it were possible to directly access single fields of a record which is e.g. mapped to an IO register.

It is an adequate level of abstraction, because it reflects the intent of the programmer: He doesn't want to do a read-modify-write cycle (which would be explicit if one has to use temporary variables), he simply wants to access a single field of the record. The read-modify-write cycle is only a means to an end. Of course, the programmer should know that a read-modify-write cycle is generated - but someone who does this kind of low-level programming knows anyway that there is no other way of doing these accesses.
The semantics of 'C'-bitfields described in the 'Procedure Call Standard for the ARM Architecture' http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042e/IHI0042E_aapcs.pdf may be appropriate:

| 7.1.7.5
| Volatile bit-fields - preserving number and width of container accesses
| When a volatile bit-field is read, its container must be read exactly once
| using the access width appropriate to the type of the container.
| When a volatile bit-field is written, its container must be read exactly
| once and written exactly once using the access width appropriate to the
| type of the container. The two accesses are not atomic. Multiple accesses
| to the same volatile bit-field, or to additional volatile bit-fields
| within the same container may not be merged. For example, an increment of
| a volatile bit-field must always be implemented as two reads and a write.
| Note Note the volatile access rules apply even when the width and
| alignment of the bit-field imply that the access could be achieved more
| efficiently using a narrower type. For a write operation the read must
| always occur even if the entire contents of the container will be
| replaced. If the containers of two volatile bit-fields overlap then
| access to one bit-field will cause an access to the other. For example,
| in struct S {volatile int a:8; volatile char b:2}; an access to a will
| also cause an access to b, but not vice-versa.
| If the container of a non-volatile bit-field overlaps a volatile bit-field
| then it is undefined whether access to the non-volatile field will cause
| the volatile field to be accessed.

Would this be doable if not the whole record but only its components are marked as volatile? A possibility to specify the access width (if it's different from System.Storage_Unit) still would be necessary (that would be the equivalent of the 'container type' in 'C').

> > - For a write to parts of an atomic object I would expect > > that the full read- modify-write cycle is atomic.
> > See above. Writing to a part of an atomic object is going to result in > nonsense because of the other rules of the language. Making some sort of > atomic read/write cycle is going to break the semantics of the necessarily > volatile components. The only sensible thing is to prevent writing part of > an atomic object altogether, which ought to be done with help from the > compiler (not the current situation of telling people not to do that). > That's the point of the proposed aspect.

I wanted to say: 'If partial writes are allowed, I would expect that the full read-modify-write cycle is atomic'. Because of the problems you mentioned and the fact that atomic read-modify-write cycles are quite heavyweight on most architectures, I think as well that it would be best to forbid partial writes to atomic objects.

Fortunately, in the use case 'IO registers', atomic accesses are rarely needed. In a sensible program, there are usually no concurrent accesses to the same IO register from multiple tasks. In the rare cases where concurrent accesses can occur, the access control can (and should) be programmed explicitly.

****************************************************************

From: Randy Brukardt
Sent: Monday, October 6, 2014 8:45 PM

> > No, such a write clearly violates C.6(22/2). > > Are these rules meant to be followed literally?

Yes.

> Or is it sufficient when the behaviour of the program is 'as if' the > rules are followed?

Well, this is not testable in terms of the "effect" of an Ada program - they make no change to that effect. That's why they can only be "advice". There is no way that such rules could ever be anything more, since the only way to tell if they are followed or not is to examine the generated code -- which is out of bounds for the Standard and for the ACATS.

OTOH, we have the advice so that the intent of the language is clear (even if it doesn't formally mean anything).
Thus, if they are violated, you can go to your implementer and complain (not that they have to do anything if they don't want, but then at least you'll know something about your implementer). So to answer your question, it's sufficient in the sense that advice can be ignored (but that fact ought to be documented), but it clearly violates the intent of the advice -- ignoring the advice without documentation of some sort is wrong. > Apparently, GNAT (respective its makers) has chosen the latter > interpretation: > > | 7.1.10 Atomic Variables and Optimization > | > | There are two considerations with regard to performance when atomic > | variables are used. > | > | First, the RM only guarantees that access to atomic variables be > | atomic, it has nothing to say about how this is achieved, though > | there is a strong implication that this should not be achieved by > | explicit locking code. Indeed GNAT will never generate any locking > | code for atomic variable access (it will simply reject any attempt > | to make a variable or type atomic if the atomic access cannot be > | achieved without such locking code). > | > | That being said, it is important to understand that you cannot > | assume that the entire variable will always be accessed. Consider this example: > | > | > | > | type R is record > | > | A,B,C,D : Character; > | > | end record; > | for R'Size use 32; > | for R'Alignment use 4; > | > | RV : R; > | pragma Atomic (RV); > | X : Character; > | ... > | X := RV.B; > | > | You cannot assume that the reference to RV.B will read the entire > | 32-bit variable with a single load instruction. It is perfectly > | legitimate if the hardware allows it to do a byte read of just the B > | field. This read is still atomic, which is all the RM requires. GNAT > | can and does take advantage of this, depending on the architecture > | and optimization level. > | Any assumption to the contrary is non-portable and risky. 
Even if > | you examine the assembly language and see a full 32-bit load, this > | might change in a future version of the compiler. > > (source: > https://docs.adacore.com/gnat-unw-docs/html/gnat_ugn_8.html#SEC97 ) Actually, their documentation is misleading here, but not in the way that you are hoping. RV.B is an access to a volatile object, and as such C.6(22/2) applies to it. Therefore, if the machine supports an 8-bit read, then the compiler ought to use it to avoid accessing the other bits of the object. Using a 32-bit read is wrong if an 8-bit read is available; it violates the advice and that fact should be documented somehow. This is NOT any sort of optimization, it's the way things are prescribed to work. Expecting this to work means that you're expecting the Advice to be followed in some cases and ignored in other cases, and you're pretty much expecting the compiler to guess in which cases you think that is important. (Note again that none of this discussion has anything in particular to do with Atomic; these rules apply to all Volatile objects and Atomic is just a special case of that.) > I think, the clause 'if the hardware allows it' in the last paragraph > is part of the problem: Compilers assume that the smallest addressable > unit is the same over the whole address space. There is no way to tell > a compiler if there are exceptions for specific addresses or address > ranges. > > Same problem in C6(22/2): System.Storage_Unit is a global value. > There is no way to tell if there are exceptions for specific addresses > or address ranges. The combination of C.6(20) and C.6(22/2) makes this irrelevant -- if you don't WRITE any accesses of incorrect size, you won't GET any accesses of incorrect size. But any component access is certain to be of incorrect size for the object as a whole, so you must not write those. > > The problem discussed on comp.lang.ada came about because the object > > in question was composite. 
That means that all of the components are > > also volatile. Thus, accessing a single component HAD to be done as > > a byte read or write -- as the component itself is volatile -- such > > an operation is not even atomic (the whole object is not > > considered). A compiler that wrote the entire object in this case would > > be violating C.6(22/2) for the component. > > > > That's the bug: a composite volatile variable that must be accessed > > in a fixed size can only be accessed as a whole, never as an > > individual component. The requirement C.6(20) and the advice > > C.6(22/2) together mean that the program source must not attempt > > such an operation (if the read/write size actually matters). The > > Ada compiler ought to give some support for this (but it can't be > > the default for compatibility reasons and because in many situations > > there is no problem). I am proposing an aspect for this purpose > > (as a response to Simon Clubley's question). > > I still think it would be very useful if it would be possible to > directly access single fields of a record which is e.g. mapped to an IO > register. Lots of things are "very useful" but don't make much sense in the grand scheme of Ada. We don't want to introduce incompatibility specifically for this relatively unlikely case. For instance, changing the rules so these components aren't volatile would almost certainly cause other problems with the language, and probably cause problems for existing users who don't have this particular size issue. I could imagine some special "notwithstanding" rules for reads of atomic components, but I don't think that those alone would be very useful. OTOH, having a special rule for writes of atomic components is madness, IMHO. Such a rule would necessarily require a read-before-write, which would mean that C.6(20) would have to be repealed for them. I think that C.6(20) is the cornerstone of the C.6 rules. 
One reason is that it means that you can access write-only registers purely in Ada. If you have an implicit read-before-write, that would fail on hardware with write-only device registers. You could say "users with write-only registers ought to avoid using components", but that just means that we'd be swapping one problem for another nearly as likely problem.

Now, it's possible that my thinking on C.6(20) is not shared by the rest of the ARG. In that case, C.6 needs a pretty significant overhaul, and that would necessarily be incompatible (at least formally) by eliminating guarantees. (I've not heard anyone express the opinion that C.6 needs such an overhaul, but I could be wrong.)

To me, this is an area where abstraction is a bad idea, mainly because there are as many different rules for accessing hardware devices as there are devices. We want Ada to be able to manage as many of them as possible, and having volatile objects be WYSIWYG makes that possible.

> It is an adequate level of abstraction, because it reflects the intent > of the > programmer: He don't want to do a read-modify-write cycle (which would > be explicit if one has to use temporary variables), he simply wants to > access a single field of the record. The read-modify-write cycle is > only a means to an end. Of course, the programmer should know that a > read-modify-write cycle is generated - but someone who does this kind > of low-level programming knows anyways that there is no other way of doing these accesses.

This I quite doubt; you're assuming that programmers are thinking at this very low level when writing this code. But the entire reason for using Ada is to get away from that level (else a small assembler subprogram would probably be better); people are unlikely to remember it. And you have the same problem as before, now it's just on write-only 32-bit-only registers instead of all 32-bit-only registers.
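
[Illustration added in editing, not part of the original message. The write-only register case mentioned above can be sketched as a whole-object aggregate assignment, which names every component and therefore needs no previous value of the register. The type, the field names and the bit positions are hypothetical, and an ordinary variable stands in for the device register so that the sketch can run on a host.]

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Write_Only_Register is

   --  Hypothetical 32-bit command register; names and bit positions are
   --  invented for this sketch and do not describe any real device.
   type Channel_Type is range 0 .. 15;

   type Command_Register is record
      Start   : Boolean;
      Reset   : Boolean;
      Channel : Channel_Type;
   end record;

   for Command_Register use record
      Start   at 0 range 0 .. 0;
      Reset   at 0 range 1 .. 1;
      Channel at 0 range 4 .. 7;
   end record;
   for Command_Register'Size use 32;
   for Command_Register'Alignment use 4;

   --  Stand-in for the device register.  Real code would also give an
   --  address clause placing Cmd at the hardware address.
   Cmd : Command_Register :=
     (Start => False, Reset => False, Channel => 0);
   pragma Atomic (Cmd);

   Check : Command_Register;

begin
   --  Every component is given in the aggregate, so the source specifies
   --  one write of the whole object and no read of the register.
   Cmd := (Start => True, Reset => False, Channel => 3);

   --  Host-only check; a real write-only register could not be read back.
   Check := Cmd;
   pragma Assert (Check.Start);
   pragma Assert (Check.Channel = 3);
   Put_Line ("Start   = " & Boolean'Image (Check.Start));
   Put_Line ("Channel =" & Channel_Type'Image (Check.Channel));
end Write_Only_Register;
```

A component assignment such as Cmd.Start := True would instead leave the other components to come from somewhere, which is the read-before-write problem described above.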
I could see trying to make reads do only 32-bit accesses in such cases, but the problem is avoiding incompatibilities with existing code that is perfectly happy with the existing smaller reads. There'd probably have to be a declaration that such reads are what you wanted, which is the point of the proposed aspect (see AI12-0128-1, not yet posted but it will be no later than next Monday). ... > > See above. Writing to a part of an atomic object is going to result > > in nonsense because of the other rules of the language. Making some > > sort of atomic read/write cycle is going to break the semantics of > > the necessarily volatile components. The only sensible thing is to > > prevent writing part of an atomic object altogether, which ought to > > be done with help from the compiler (not the current situation of telling > > people not to do that). > > That's the point of the proposed aspect. > > I wanted to say: 'If partial writes are allowed, I would expect that > the full read-modify-write cycle is atomic'. > Because of the problems you mentioned and the fact that atomic > read-modify- write cycles are quite heavyweight on most architectures, > i think as well that it would be best to forbid partial writes to atomic > objects. > > Fortunatetly, in the use case 'IO registers', atomic accesses are > rarely needed. In a sensible program, there are usually no concurrent > accesses to the same IO register from multiple tasks. In the rare > cases where concurrent accesses can occur, the access control can (and > should) be programmed explicitely. I agree here. I would expect that writing a single component would be fairly rare, since usually one has to set most/all of the status bits to trigger an I/O. An aggregate makes more sense in that case. The question of reads of components is more interesting. 
I didn't want to go through the headache of defining rules specifically for that case (I don't think such rules could be in the upcoming Corrigendum, while a purely Legality aspect might make it in). And I didn't think they were particularly useful without some sort of write rule. But perhaps it's not going to make the Corrigendum anyway, and maybe they're useful enough, in which case more expansive rules might work.

Anyway, stay tuned...

****************************************************************

From: Matthias Richter
Sent: Sunday, October 12, 2014 8:43 AM

> > Are these rules meant to be followed literally?
>
> Yes.
>
> > Or is it sufficient when the behaviour of the program is 'as if' the rules are followed?
>
> Well, this is not testable in terms of the "effect" of an Ada program - they make no change to that effect. That's why they can only be "advice". There is no way that such rules could ever be anything more, since the only way to tell if they are followed or not is to examine the generated code -- which is out of bounds for the Standard and for the ACATS.
>
> OTOH, we have the advice so that the intent of the language is clear (even if it doesn't formally mean anything). Thus, if they are violated, you can go to your implementer and complain (not that they have to do anything if they don't want, but then at least you'll know something about your implementer).

I don't think I have a reason to complain if the "effect of the program" is really the same - then it doesn't matter how it is realized. As you said, it cannot be tested anyway.

My definition of the "effect of the program" includes all side effects and also the temporal order of actions. It certainly does not include the exact timing - as far as I know no high level language claims to give control over this.

> > > That's the bug: a composite volatile variable that must be accessed in a fixed size can only be accessed as a whole, never as an individual component. The requirement C.6(20) and the advice C.6(22/2) together mean that the program source must not attempt such an operation (if the read/write size actually matters).

I still find the combination of C.6(20) and C.6(22/2) very confusing:

C.6(22/2) requires that "exactly the bits of the object and no others" should be accessed _only_ if the size of the volatile object is a multiple of System.Storage_Unit.

C.6(22/2b) and (22/2c) in the AARM make clear that when this condition is not met (the size of the object is not a multiple of System.Storage_Unit), it is allowed to access bits outside of the object. It is not said that then the access isn't allowed at all.

In C.6(22/2b) not only bit-mapped record components, but also packed array components are mentioned. At least in the case of a packed array the 'neighbors' of a volatile component are inevitably volatile, too. So it isn't said that the access is only allowed if the 'neighbors' are not volatile, either.

But now, C.6(20) comes into play. It categorically forbids all accesses to volatile objects which are not explicitly written in the program - at least, I would interpret it this way. This is a contradiction to the intent expressed in C.6(22/2b) and (22/2c).

I have the impression that this overshoots the mark. C.6(20a...i) in the AARM show a number of optimizations which must be prevented for volatile objects. These rules are needed, but I see no reason
- to forbid the access to bits outside the object, but residing inside the same smallest accessible unit
- to forbid the extra read to the same smallest accessible unit which is needed for a read-modify-write cycle

In the very rare cases where the extra read would cause unintended side effects (e. g.
registers, where a read triggers an action), one should simply not write partial write operations, since it is clear that they cannot be realized without a read-modify-write. In cases other than this special one, the extra read does no harm.

Since existing compilers (at least GNAT, I must admit that I don't know what others do) are working this way, I would suggest relaxing the requirements of C.6(20) as described before. I don't generally like the idea of adopting paradigms from the 'C' world, but the paragraph about 'volatile bit fields' in the AAPCS (I quoted that paragraph in my last mail) looks well-thought-out.

> > I still think it would be very useful if it would be possible to directly access single fields of a record which is e.g. mapped to an IO register.
>
> Lots of things are "very useful" but don't make much sense in the grand scheme of Ada. We don't want to introduce incompatibility specifically for this relatively unlikely case.

It is not unlikely. In typical microcontroller programs, there may be such operations in about every third line. Ada was always advertised as suitable for low level programming as well as for large software systems. It seems that the low level aspect was not as visible as it becomes now with the availability of Ada implementations for microcontrollers.

> For instance, changing the rules so these components aren't volatile would almost certainly cause other problems with the language, and probably cause problems for existing users who don't have this particular size issue.
>
> I could imagine some special "notwithstanding" rules for reads of atomic components, but I don't think that those alone would be very useful.
>
> OTOH, having a special rule for writes of atomic components is madness, IMHO. Such a rule would necessarily require a read-before-write, which would mean that C.6(20) would have to be repealed for them. I think that C.6(20) is the cornerstone of the C.6 rules.
> One reason is that it means that you can access write-only registers purely in Ada. If you have an implicit read-before-write, that would fail on hardware with write-only device registers. You could say "users with write-only registers ought to avoid using components", but that just means that we'd be swapping one problem for another nearly as likely problem.

I indeed would say that. That way round it looks more logical, because it is obvious that it is really impossible to write components smaller than/not a multiple of an addressable unit if a register is not readable, because no one (including a compiler) can know what to put in the 'other' bits.

> Now, it's possible that my thinking on C.6(20) is not shared by the rest of the ARG. In that case, C.6 needs a pretty significant overhaul, and that would necessarily be incompatible (at least formally) by eliminating guarantees. (I've not heard anyone express the opinion that C.6 needs such an overhaul, but I could be wrong.)
>
> To me, this is an area where abstraction is a bad idea, mainly because there are as many different rules for accessing hardware devices as there are devices. We want Ada to be able to manage as many of them as possible, and having volatile objects be WYSIWYG makes that possible.
>
> > It is an adequate level of abstraction, because it reflects the intent of the programmer: He doesn't want to do a read-modify-write cycle (which would be explicit if one has to use temporary variables), he simply wants to access a single field of the record. The read-modify-write cycle is only a means to an end. Of course, the programmer should know that a read-modify-write cycle is generated - but someone who does this kind of low-level programming knows anyway that there is no other way of doing these accesses.
>
> This I quite doubt; you're assuming that programmers are thinking at this very low level when writing this code.
I'm not assuming it, I'm quite confident of it. It may be better to introduce a 'successfully' in the sentence: 'Someone who successfully does this kind of low-level programming...' To successfully use the integrated peripherals of a microcontroller, you need to know in detail how they work internally and what every bit in the registers does. Arduino & Co. are spreading the illusion that this knowledge isn't needed anymore, but that doesn't work except for very simple applications. If you have this level of knowledge, then it is a trivial fact that accesses to fractions of an addressable unit must result in a read-modify-write cycle.

> But the entire reason for using Ada is to get away from that level (else a small assembler subprogram would probably be better)

Given the frequency of occurrence of such operations, this could be translated to "don't use Ada, write your program in assembly language"... And I definitely want to get away from _that_.

> I agree here. I would expect that writing a single component would be fairly rare, since usually one has to set most/all of the status bits to trigger an I/O. An aggregate makes more sense in that case.

My experience is the other way round. Writes to multiple bits / fields occur mainly in initialization code. Often, in that code registers are written as a whole. If only some bits / fields need to be written, this can often be replaced by a write of the entire register, since the values of the remaining bits are usually known - this could be used as a workaround for the non-existing partial aggregates.

During the 'normal' execution of the program (after finishing the initialization part), most accesses affect only single bits or single fields. Examples:
- A bit is written to start an A/D-conversion or a serial transmission or...
- A bit is written to enable/disable a specific interrupt source
- A bit is written to reset a counter
- A 3-bit field is written to select one (of eight) analog inputs
- A bit is read to see if the A/D converter is ready
...

That multiple fields must be accessed at the same time is quite rare. More often a specific sequence of accesses is required.

> The question of reads of components is more interesting. I didn't want to go through the headache of defining rules specifically for that case (I don't think such rules could be in the upcoming Corrigendum, while a purely Legality aspect might make it in). And I didn't think they were particularly useful without some sort of write rule. But perhaps it's not going to make the Corrigendum anyway, and maybe they're useful enough, in which case more expansive rules might work.
>
> Anyway, stay tuned...

Perhaps we should separate the two loosely coupled topics?

1. How can components smaller than the addressable unit be accessed?/should that be allowed at all?

2. How can the value of System.Storage_Unit be overridden for a specific object? (to guarantee that inappropriate instructions are never generated for accesses to that object, in case that the hardware doesn't allow the same access width for this object like in System.Storage_Unit globally specified)

When I wrote my original comment/proposal (basically topic 2.), I didn't realize that 1. could be a question at all - perhaps because with GNAT it simply worked as I expected it to work - with the exception of the special case when byte accesses are not allowed anywhere.

****************************************************************

From: Randy Brukardt
Sent: Monday, October 13, 2014 9:15 PM

> I don't think I have a reason to complain if the "effect of the program" is really the same - then it doesn't matter how it is realized. As you said, it cannot be tested anyway.
> My definition of the "effect of the program" includes all side effects and also the temporal order of actions. It certainly does not include the exact timing - as far as I know no high level language claims to give control over this.

But you can complain if the operation accesses bits other than those that are part of the object (which is the point in question). That's not really part of the "effect" of the program in Ada terms, but it still would violate the Implementation Advice. But only when you access the *whole* object.

> > > > That's the bug: a composite volatile variable that must be accessed in a fixed size can only be accessed as a whole, never as an individual component. The requirement C.6(20) and the advice C.6(22/2) together mean that the program source must not attempt such an operation (if the read/write size actually matters).
>
> I still find the combination of C.6(20) and C.6(22/2) very confusing:
> C.6(22/2) requires that "exactly the bits of the object and no others" should be accessed _only_ if the size of the volatile object is a multiple of System.Storage_Unit.
> C.6(22/2b) and (22/2c) in the AARM make clear that when this condition is not met (the size of the object is not a multiple of System.Storage_Unit), it is allowed to access bits outside of the object. It is not said that then the access isn't allowed at all.
> In C.6(22/2b) not only bit-mapped record components, but also packed array components are mentioned. At least in the case of a packed array the 'neighbors' of a volatile component are inevitably volatile, too. So it isn't said that the access is only allowed if the 'neighbors' are not volatile, either.

The discussion in C.6(22.2b) and C.6(22.2c) only make sense for reads, IMHO. C.6(20) is a requirement and as such it has priority over any advice. So my opinion is that writing volatile components that would *require* a read-before-write cannot be allowed.
But that's MY opinion, not the opinion of the whole ARG. If read-before-write is allowed, then IMHO C.6(20) is garbage and needs serious changes to make it clear that read-before-write is allowed (which means that safe access to write-only things cannot be guaranteed).

...

> In the very rare cases where the extra read would cause unintended side effects (e. g. registers, where a read triggers an action), one should simply not write partial write operations, since it is clear that they cannot be realized without a read-modify-write. In cases other than this special one, the extra read does no harm.

Now you've destroyed your own argument: you're saying that the current situation is OK in special situations. But YOUR special situation (which is pretty unlikely as well) is important enough to have all kinds of extra mechanism in the language. I don't think that's going to fly.

...

> > Lots of things are "very useful" but don't make much sense in the grand scheme of Ada. We don't want to introduce incompatibility specifically for this relatively unlikely case.
>
> It is not unlikely. In typical microcontroller programs, there may be such operations in about every third line.
> Ada was always advertised as suitable for low level programming as well as for large software systems. It seems that the low level aspect was not as visible as it becomes now with the availability of Ada implementations for microcontrollers.

I would not have expected most devices to be that brain-damaged. In my (very limited) experience, such devices were pretty rare (most were properly memory-mapped); what was common was write-only ports (and reads that trigger events). But you only want to fix one of those three cases, and your suggestion would make the situation much worse for those other cases (since it would look like it would work, but it wouldn't).

...

> > But the entire reason for using Ada is to get away from that level (else a small assembler subprogram would probably be better)
>
> Given the frequency of occurrence of such operations, this could be translated to "don't use Ada, write your program in assembly language"...
> And I definitely want to get away from _that_.

I'd put that more into the category of "don't use brain-damaged devices". :-) But most of these effects happen because the devices have side-effects when read or written or both -- and you don't want to address that. I think one has to address both situations.

Moreover, I don't want programmers to have to understand much more than the layout and address of the device that they're programming. Otherwise, you're still requiring "high-priests" to program anything embedded, which is silly. (I certainly didn't understand any of the stuff you're talking about when I programmed devices back in the day.)

> Perhaps we should separate the two loosely coupled topics?
>
> 1. How can components smaller than the addressable unit be accessed?/should that be allowed at all?

This is only interesting for writes.

> 2. How can the value of System.Storage_Unit be overridden for a specific object? (to guarantee that inappropriate instructions are never generated for accesses to that object, in case that the hardware doesn't allow the same access width for this object like in System.Storage_Unit globally specified)

System.Storage_Unit has nothing to do with how memory is addressed. I don't know why you keep focusing on that. If a machine has bit instructions, they probably would be used but that isn't suddenly going to cause Storage_Unit to be 1. You're interested in having more control over instruction selection, and that's it.

> When I wrote my original comment/proposal (basically topic 2.), I didn't realize that 1.
> could be a question at all - perhaps because with GNAT it simply worked as I expected it to work - with the exception of the special case when byte accesses are not allowed anywhere.

As best as I can tell, there's little appetite for doing anything here. I at least wanted to do something so that a programmer with this problem would know that they have to write whole object accesses. I think the odds that we would go any further are slim (particularly as it's highly unlikely that the third-party code generators that most vendors use could be restricted more than they already are - we already had to weaken Volatile to make it closer to the C definition for this reason).

****************************************************************

From: Matthias Richter
Sent: Sunday, October 19, 2014 8:01 AM

> The discussion in C.6(22.2b) and C.6(22.2c) only make sense for reads, IMHO.
> C.6(20) is a requirement and as such it has priority over any advice.

That a requirement has priority over advice clearly makes sense. But doesn't C.6(20) forbid reads of bits belonging to other components ('neighbours'), too (not only writes)? So the question remains whether there is a possibility to use the permissions of C.6(22) or whether they always violate C.6(20).

> > In the very rare cases where the extra read would cause unintended side effects (e. g. registers, where a read triggers an action), one should simply not write partial write operations, since it is clear that they cannot be realized without a read-modify-write. In cases other than this special one, the extra read does no harm.
>
> Now you've destroyed your own argument: you're saying that the current situation is OK in special situations. But YOUR special situation (which is pretty unlikely as well) is important enough to have all kinds of extra mechanism in the language. I don't think that's going to fly.

It depends on the point of view.
You rate what is 'very common' and what is 'unlikely' differently than I do. I'm afraid we won't reach a consensus on that.

> I would not have expected most devices to be that brain-damaged. In my (very limited) experience, such devices were pretty rare (most were properly memory-mapped);

I would prefer devices with a properly, uniformly memory-mapped I/O space, too. I wouldn't call the others brain-damaged ;-) but their design is clearly ugly. They are an outcome of design reuse where things would better not have been reused. I can't say if most devices are that way, but it is not an insignificant amount. I first stumbled on this problem using an ARM Cortex-M microcontroller made by the manufacturer that sells most of them (of ARM Cortex-M). It is very likely that Cortex-M will become (if they aren't already) the predominant family of low- and mid-range microcontrollers. Many of the existing non-ARM families will probably disappear from the market. So it seems unwise to ignore these not-so-nicely designed devices. I certainly don't know them all, but there are ARM processors from other vendors which suffer from the same problem.

> what was common was write-only ports (and reads that trigger events).

Even manufacturers of brain-damaged devices don't put multiple fields in write-only registers. (There may be pathological cases, but these are _very_ rare...) Write-only registers which contain a single value and therefore are always written as a whole are quite common - but in this case the discussion is irrelevant anyway.

> I'd put that more into the category of "don't use brain-damaged devices".

That's not how things work (perhaps in an ideal world...). Criteria to select microcontrollers are
- a set of integrated peripherals matching the needs of the application
- performance
- price
- power consumption
- availability
- ...

Tools simply have to be there. And they have to work as expected...

> Moreover, I don't want programmers to have to understand much more than the layout and address of the device that they're programming. Otherwise, you're still requiring "high-priests" to program anything embedded, which is silly. (I certainly didn't understand any of the stuff you're talking about when I programmed devices back in the day.)

That's maybe a philosophical question. I prefer programmers with a sound background in electrical engineering for this kind of task. They of course have the knowledge how the underlying hardware works; I would never call them 'high-priests'. If they didn't have this knowledge, this would transfer the problem to the hardware developer, who would have to give very detailed instructions what to do with any and every bit. In my experience, the interfacing between the hardware developer and the software developer works much more smoothly if both have some knowledge of 'the other side'. In smaller projects, it may be the same person anyway.

> > 2. How can the value of System.Storage_Unit be overridden for a specific object? (to guarantee that inappropriate instructions are never generated for accesses to that object, in case that the hardware doesn't allow the same access width for this object like in System.Storage_Unit globally specified)
>
> System.Storage_Unit has nothing to do with how memory is addressed. I don't know why you keep focusing on that.

Somewhere in the compiler the smallest addressable unit which can be accessed regularly must be defined. I thought that 'System.Storage_Unit' reflects this definition. If that's wrong, then it's my fault. Then forget 'System.Storage_Unit' in this context, I meant the smallest addressable unit the compiler may use over the whole address space.

> If a machine has bit instructions, they probably would be used but that isn't suddenly going to cause Storage_Unit to be 1.
> You're interested in having more control over instruction selection, and that's it.

Yes, exactly. Since the compiler doesn't know that the global definition isn't valid everywhere, I want to have a means to override it for specific memory locations.

Bit instructions are an interesting example: I have seen some architectures which have bit instructions. But none of them allows bit accesses over the whole address space. Do you know an architecture where bit instructions may be used everywhere? If there are no means to control the usage of bit instructions, a compiler (for whatever language) has only two possibilities:

1. Don't use bit instructions at all (if a programmer really wants to use them, he has to insert assembler code or some special macros)

2. Know where the bit instructions are allowed and use them there

I know of at least one example of the second variant. AVR processors allow bit-set and bit-clear instructions in a very limited address range. GCC (and as far as I know, IAR C) use some optimizer magic to convert explicitly written read-modify-write cycles to bit-set or bit-clear instructions, respectively. I've never tested it, but I assume that GNAT for AVR does this, too.

However, this second variant seems not to be an appropriate solution of the problem with byte instructions on ARM processors. There it is not a fixed, only family-dependent, address space where the byte instructions are not allowed, rather it is dependent on the specific type and its configuration of on-chip peripherals.

> (particularly as it's highly unlikely that the third-party code generators that most vendors use could be restricted more than they already are - we already had to weaken Volatile to make it closer to the C definition for this reason)

These code generators would have to be more restricted if they would follow C.6(20) with all its consequences...
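
[Editor's illustration, not part of the original message. The whole-register workaround mentioned in this exchange (copy the register into a temporary, change the fields there, then write the temporary back) might look roughly like the sketch below. The register layout, the field names, and the use of an ordinary variable in place of a memory-mapped object are assumptions made only to keep the example self-contained; whether each whole-object access becomes a single 32-bit instruction still depends on the target and the compiler.]

```ada
with Ada.Text_IO;

procedure Whole_Register_Update is

   --  Hypothetical A/D converter control register. The fields and bit
   --  positions are invented for this sketch.
   type Channel_Number is range 0 .. 7;
   type Reserved_Bits is mod 2**27;

   type ADC_Control is record
      Start      : Boolean;
      Enable_IRQ : Boolean;
      Channel    : Channel_Number;
      Reserved   : Reserved_Bits;
   end record
     with Size => 32, Alignment => 4;

   for ADC_Control use record
      Start      at 0 range 0 .. 0;
      Enable_IRQ at 0 range 1 .. 1;
      Channel    at 0 range 2 .. 4;
      Reserved   at 0 range 5 .. 31;
   end record;

   --  Stand-in for the device register. On real hardware this object
   --  would also be placed at the device address (not shown here).
   Control : ADC_Control :=
     (Start => False, Enable_IRQ => False, Channel => 0, Reserved => 0)
     with Atomic;

   Shadow : ADC_Control;  --  ordinary local copy, no special aspects

begin
   Shadow := Control;        --  read of the whole atomic object
   Shadow.Channel := 5;      --  field updates touch only the local copy
   Shadow.Start   := True;
   Control := Shadow;        --  write of the whole atomic object

   Ada.Text_IO.Put_Line
     ("Channel =" & Channel_Number'Image (Control.Channel));
end Whole_Register_Update;
```

[The point of the design is that the source never names a component of Control in an assignment target, so no partial write of the atomic object is requested.]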
****************************************************************

From: Robert Dewar
Sent: Saturday, March 21, 2015 8:57 AM

We have had continued problems with people trying to use pragma Atomic for mapping memory-mapped variables.

There are two problems

a) Atomic does not guarantee that read/write references use a single instruction to access *ALL* the bits of the object, as is often expected and required. It is allowable (and most certainly GNAT takes advantage of this) to access e.g. a single field in an atomic record with a byte load instruction even if the record is a word.

b) The atomic synchronization semantics are inappropriate in this case.

We just implemented pragma/aspect Volatile_Full_Access which is just like Volatile except that it guarantees that every read/write access to an object with this aspect reads or writes all the bits in a single instruction.

I suspect this new aspect is really what 99% of programmers trying to do memory mapped I/O really want, and may be worth considering for addition to the language.

****************************************************************

From: Randy Brukardt
Sent: Saturday, March 21, 2015 8:29 PM

> We have had continued problems with people trying to use pragma Atomic for mapping memory-mapped variables.

This problem was reported to Ada-Comment last July. AI12-0128-1 was created to (potentially) address it.

> There are two problems
>
> a) Atomic does not guarantee that read/write references use a single instruction to access *ALL* the bits of the object, as is often expected and required. It is allowable (and most certainly GNAT takes advantage of this) to access e.g. a single field in an atomic record with a byte load instruction even if the record is a word.

It's more than "allowed".
It's strongly recommended, because the components of a volatile object are also volatile, and thus the Implementation Advice C.6(22/2) applies to it:

A load or store of a volatile object whose size is a multiple of System.Storage_Unit and whose alignment is nonzero, should be implemented by accessing exactly the bits of the object and no others.

This is advice in name only; the only reason we didn't make it a requirement was that we couldn't find any way to say that normatively. In this case, we want to turn this off for components.

> b) The atomic synchronization semantics are inappropriate in this case.

Right. I wasn't able to convince the person who reported it of that, so I didn't make much effort to bring this AI to the attention of the full ARG.

> We just implemented pragma/aspect Volatile_Full_Access which is just like Volatile except that it guarantees that every read/write access to an object with this aspect reads or writes all the bits in a single instruction.

On what does it work? Only on an object, or anywhere that Volatile can be used? The latter seems to be problematical, since the definition of volatile is recursive.

I had suggested an aspect specific to volatile objects (in addition, as opposed to making another kind of volatile). Of course, I don't think anyone else ever commented on that. Either way works.

> I suspect this new aspect is really what 99% of programmers trying to do memory mapped I/O really want, and may be worth considering for addition to the language.

Something certainly should be added to the language. Your suggestion is probably better than mine (which was aimed at not making compiler writers do much work), if we can figure out some way to turn off C.6(20) and C.6(22/2) for the components of such an object.

In the case of C.6(20):

The external effect of a program (see 1.1.3) is defined to include each read and update of a volatile or atomic object.
The implementation shall not generate any memory reads or updates of atomic or volatile objects other than those specified by the program.

This doesn't make any sense for volatile bit-mapped components, whether or not exact size is being required. In such a case, a read is necessary before a write in order to figure out the bits not being changed, but that violates a strict reading of the wording. One could argue that the Dewar rule applies here (a read before a write is the only sensible way to implement such a thing, thus it must be allowed), but we clearly don't want to allow that for atomic objects, for the reason discussed in AARM C.6(20.a) -- "active" memory.

So we ought to have at a minimum a To Be Honest note on the lines of:

For a volatile-but-not-atomic object, multiple (divisible) operations may be needed to implement a single read or write in the program. That might include one or more reads in the case of writing some but not all bits of a machine scalar. OTOH, for an atomic object, all operations have to be indivisible, so the operations should do exactly what is specified in the code.

Going to be fun. ;-)

****************************************************************

From: Steve Baird
Sent: Friday, October 9, 2015 5:56 PM

This AI was part of my Madrid homework. [This is version /04 of the AI.]

This is a preliminary attempt so that we have something concrete to discuss and (most likely) modify.

See also the GNAT-defined Volatile_Full_Access pragma and aspect.

****************************************************************

From: Randy Brukardt
Sent: Friday, October 9, 2015 7:00 PM

Steve and I had an extensive private e-mail correspondence on this topic. Here are some thoughts from the correspondence not covered in the AI:

(1) Since the problem is accessing hardware registers, it seems strange to separate the solution from Volatile. In all real uses, Volatile would be required as well as Whole_Object_Reference.
(Why would anyone care how memory is read or written when the compiler can arbitrarily change it anyway??) And in that case, the requirements of Whole_Object_Reference would conflict with the requirements on a Volatile object (since components of a volatile object are also Volatile). In particular, the read-modify-write cycle violates C.6(20), and size violates IA C.6(22/2). That suggests an integrated solution (like Robert's last aspect, Volatile_Full_Access) might work better.

(2) C.6(20) is broken for components that are not aligned or whose size is not a multiple of the storage element size, regardless of this new AI. Steve views that as an issue for a different AI, but I disagree. We surely want C.6(20) to apply to objects that are both Whole_Object_Reference and Volatile, but the read before a write appears to be banned by the text. Since we have to fix it here in order to actually solve the problem, we should fix it here for all cases (especially as they are all related to memory-mapped register access).

I made a stab at updating the C.6(20) wording:

The external effect of a program (see 1.1.3) is defined to include each read and update of a volatile or atomic object. The implementation shall not generate any memory reads or updates of atomic or volatile objects {whose size is a multiple of the size of a machine scalar and is aligned properly for that machine scalar} other than those specified by the program. {A similar rule applies for reads or writes of an entire Whole_Object_Reference object.} {For other volatile objects, an update specified by the program may generate a read before an update, but no other updates or reads shall be generated other than those specified by the program.}

AARM Ramification: "Specified by the program" includes reads or updates implicitly specified by the program. For instance, the default initialization of an object without an explicit initialization is considered specified by the program.
Similarly, reads and updates caused by the (implicit) finalization of an object are considered specified by the program.

I'm not sure this covers everything needed. (Note that there is a Bairdian note applied to the end, as Steve wondered about volatile controlled types, and that led me to wonder about implicit reads and writes in general. They certainly ought to be mentioned, if the types involved aren't restricted.) Steve noted that we could restrict what types are allowed to be Volatile, but that isn't the model that was used for Ada 95 and Ada 2012, so such a restriction would be incompatible with existing practice.

[Note: I used "machine scalar" rather than "storage element" to account for machines on which the storage element is not directly addressable. The U2200 I worked on in the late 1990s was like that; while it had byte pointers (for the C compiler), they didn't translate into usual machine instructions - they implicitly used a read/modify/write set of instructions, as the actual addressable unit was a 36-bit word.]

(3) One could argue that C.6(20) is not broken because the Dewar rule says that read-before-write is implied in the source code for components that aren't directly modifiable by hardware instructions. In that case, there is no problem at all, as the only other issue is C.6(22/2), and as Implementation Advice, implementations can ignore it when it causes issues.

I reject this line of thinking both because the IA C.6(22/2) really was intended as a requirement (but we thought it was too undefined to write that way) and because it is too misleading to programmers to have the text of the Standard say something other than the actual rules.

(4) It's unclear to me whether a Whole_Object_Reference object should be allowed to be Atomic. As Steve notes in his write-up, the read-modify-write cycle would not be atomic as a whole, just the read and the write. That's misleading to a programmer, who probably would expect an indivisible update.
Making the whole cycle indivisible would almost certainly require expensive locking (especially on a multicore machine). But perhaps it is OK to leave this to the individual implementations??

****************************************************************

From: Steve Baird
Sent: Saturday, October 17, 2015 11:18 PM

In this version [this is version /05 of the AI - Editor] we no longer define a new aspect. Instead,

> All reads of or writes to any non-atomic subcomponent of an atomic object
> shall be implemented by reading and/or writing all of the nearest
> enclosing atomic object.

****************************************************************
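
[A minimal sketch of the rule quoted in the preceding message, using the four-byte record from the Implementation Note in the !wording. The names Atomic_Register_Sketch, Control_Register, Mode, Speed, Flags and Status are hypothetical, and an ordinary variable stands in for a memory-mapped register. The program only shows the source-level meaning; whether the generated code actually performs 32-bit accesses depends on the compiler implementing this rule and cannot be observed from within the program. - Editor]

```ada
with Ada.Text_IO;

procedure Atomic_Register_Sketch is

   type Byte is mod 2**8 with Size => 8;

   --  Hypothetical 32-bit device register made of four one-byte fields.
   --  The type is atomic; its components are nonatomic subcomponents.
   type Control_Register is record
      Mode   : Byte;
      Speed  : Byte;
      Flags  : Byte;
      Status : Byte;
   end record
     with Atomic, Size => 32, Alignment => 4;

   for Control_Register use record
      Mode   at 0 range 0 .. 7;
      Speed  at 1 range 0 .. 7;
      Flags  at 2 range 0 .. 7;
      Status at 3 range 0 .. 7;
   end record;

   --  Assumption: a plain variable stands in for the device register.
   --  A real program would also place it at the register's address.
   Reg : Control_Register :=
     (Mode => 1, Speed => 2, Flags => 3, Status => 4);

begin
   --  Component update as the programmer writes it.  Under the quoted
   --  rule this is implemented by reading and writing all of Reg,
   --  never by a single byte store.
   Reg.Speed := 16#7F#;

   --  The same kind of update spelled out by hand: one whole-object
   --  read and one whole-object write.  The pair is not indivisible;
   --  only the read and the write are each atomic.
   declare
      Copy : Control_Register := Reg;   --  whole-object read
   begin
      Copy.Flags := 16#80#;
      Reg := Copy;                      --  whole-object write
   end;

   --  Both forms leave the untouched fields unchanged.
   if Reg /= Control_Register'
               (Mode => 1, Speed => 16#7F#, Flags => 16#80#, Status => 4)
   then
      raise Program_Error with "unexpected register contents";
   end if;

   Ada.Text_IO.Put_Line ("register updated");
end Atomic_Register_Sketch;
```

[The explicit copy in the declare block is the idiom programmers use today to force whole-register access; the proposed rule would make the plain component assignment behave the same way. - Editor]

****************************************************************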