CVS difference for ai12s/ai12-0348-1.txt
--- ai12s/ai12-0348-1.txt 2020/01/29 02:54:25 1.4
+++ ai12s/ai12-0348-1.txt 2020/01/30 01:56:33 1.5
@@ -947,3 +947,110 @@
feel free to whack me with any typos I managed to add to the above.)
****************************************************************
+
+From: Brad Moore
+Sent: Tuesday, January 28, 2020 9:27 PM
+
+> After discussion with Steve and Tucker, we agreed that we still need a
+> Bounded Error for a non-associative reducer. We then crafted the
+> following
+> text:
+>
+> For a parallel reduction expression, it is a bounded error if the
+> reducer subprogram is not associative. That is, for any arbitrary
+> values A, B, and C of subtype Value_Type and a reducer function R, the
+> result of R (A, R (B, C)) should equal that of R (R (A,
+> B), C). The possible consequences are Program_Error, or a result that
+> does not match the equivalent sequential reduction expression due to
+> the order of calls on the reducer subprogram being unspecified in the
+> overall reduction. Analogous rules apply in the case of a reduction procedure.
+
+...
+
+> We do that with a Bounded Error similar to the original error.
+>
+> For example, the following reduction is well-defined:
+>
+> Str : String := ...
+> type Hash_Type is mod <some-prime-number>;
+>
+> function Hash (Accum, Value : in Hash_Type) return Hash_Type is
+> (2*Accum + Value);
+>
+> ... := [for I in Str'range => Character'Pos(Str(I))]'Reduce (Hash, 0);
+>
+> The reduction here creates a hash value for Str. But the similar
+> parallel reduction triggers the Bounded Error:
+>
+> ... := [parallel for I in Str'range =>
+> Character'Pos(Str(I))]'Reduce (Hash, 0);
+>
+> This makes the result (ahem) hash, since Hash is not associative:
+>
+> Hash (A, Hash (B, C)) = 2*A + 2*B + C -- Unintended result!
+> Hash (Hash (A, B), C) = 4*A + 2*B + C -- Intended result.
+>
+
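+For concreteness, a small standalone test along the lines of that example
+makes the mismatch visible; the modulus and the test values below are just
+placeholders:
+
+with Ada.Text_IO; use Ada.Text_IO;
+
+procedure Check_Hash_Assoc is
+   type Hash_Type is mod 1021;   --  stand-in for <some-prime-number>
+
+   function Hash (Accum, Value : in Hash_Type) return Hash_Type is
+      (2*Accum + Value);
+
+   A : constant Hash_Type := 3;
+   B : constant Hash_Type := 5;
+   C : constant Hash_Type := 7;
+
+   --  Left-to-right grouping, as in the sequential reduction:
+   Sequential : constant Hash_Type := Hash (Hash (A, B), C);
+   --  One grouping a parallel reduction might use instead:
+   Regrouped  : constant Hash_Type := Hash (A, Hash (B, C));
+begin
+   Put_Line ("Hash (Hash (A, B), C) =" & Hash_Type'Image (Sequential));
+   Put_Line ("Hash (A, Hash (B, C)) =" & Hash_Type'Image (Regrouped));
+end Check_Hash_Assoc;
+
+With these values the two groupings print 29 and 23, matching the
+4*A + 2*B + C and 2*A + 2*B + C expansions quoted above.
+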
+Another example that is more likely to be seen in practice:
+
+Technically, floating point operations such as addition are not
+associative either, though they are still useful for parallel execution.
+Non-associativity can sometimes be in the eye of the beholder. If you are
+looking for a precise bit pattern for a hash lookup, you might care more
+about the non-associativity, but if you are only looking for an
+approximation, the non-associativity of floating point can be overlooked.
+
+Even the approximation has tolerance limits. If I use 32 bit floating point
+addition to sum the values from 1.0 to N, the parallel output matches the
+sequential result for values of N up to about 5700.0.
+After that, errors start to appear, but the result is still a reasonable
+approximation. For values of N around 100_000_000, however, the parallel
+results are complete garbage.
+
+Using 64 bit floating point, however, gives matching output for values of N
+at least as high as 4_000_000_000.
+(I haven't hit the limit of non-associativity there yet; the processing time
+starts to get too long to wait for the results.)
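+
+To reproduce the effect without a parallel compiler, the chunking can be
+faked by hand. The following rough sketch (the names and the value of N
+are just illustrative) compares a plain left-to-right sum against one
+possible two-chunk grouping that a parallel reduction might use:
+
+with Ada.Text_IO; use Ada.Text_IO;
+
+procedure Float_Sum_Order is
+   N : constant Integer := 100_000;  --  illustrative; larger N drifts more
+
+   --  Plain left-to-right sum of Float (I) for I in First .. Last.
+   function Sum (First, Last : Integer) return Float is
+      Result : Float := 0.0;
+   begin
+      for I in First .. Last loop
+         Result := Result + Float (I);
+      end loop;
+      return Result;
+   end Sum;
+
+   Sequential : constant Float := Sum (1, N);
+   --  One possible parallel grouping: two chunks combined at the end.
+   Chunked    : constant Float := Sum (1, N / 2) + Sum (N / 2 + 1, N);
+begin
+   Put_Line ("Sequential:" & Float'Image (Sequential));
+   Put_Line ("Chunked   :" & Float'Image (Chunked));
+   Put_Line ("Difference:" & Float'Image (Sequential - Chunked));
+end Float_Sum_Order;
+
+On typical implementations where Float is 32 bits, the two results start
+to differ once the running total is large enough that adding the next
+value loses low-order bits.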
+
+This illustrates the bounded error proposed in Randy's submission.
+Whether the error is acceptable depends on the needs of the programmer and
+the bounds of the calculation.
+
+I agree that a bounded error is a better approach than a user note in the
+wording. I can't see a compiler always being able to decide where the line
+should be drawn for what is an acceptable level of associativity in cases
+such as this.
+
+****************************************************************
+
+From: Brad Moore
+Sent: Tuesday, January 28, 2020 10:06 PM
+
+> Even the approximation has tolerance limits. If I use 32 bit floating
+> point addition to sum the values from 1.0 to N, the parallel output
+> matches the sequential result for values of N up to about 5700.0.
+> After that, errors start to appear, but the result is still a
+> reasonable approximation. For values of N around 100_000_000, however,
+> the parallel results are complete garbage.
+>
+> Using 64 bit floating point, however, gives matching output for values
+> of N at least as high as 4_000_000_000.
+> (I haven't hit the limit of non-associativity there yet; the
+> processing time starts to get too long to wait for the results.)
+
+For the record,
+
+Actually, for the 4_000_000_000 test, I was using 128 bit floating point
+(GNAT's Long_Long_Float), not 64 bit (GNAT's Long_Float).
+
+****************************************************************
+
+From: Tucker Taft
+Sent: Tuesday, January 28, 2020 11:11 PM
+
+I am not sure floating point should be rolled into this. That is more of a
+"normal" order dependence, and other standards like OpenMP see this as a
+numerical analysis problem, rather than a real "bug". I would be reluctant
+to allow a compiler to raise Program_Error in this case.
+
+****************************************************************
Questions? Ask the ACAA Technical Agent