CVS difference for ais/ai-00285.txt

Differences between 1.4 and version 1.5

--- ais/ai-00285.txt	2002/10/01 03:08:54	1.4
+++ ais/ai-00285.txt	2003/01/03 00:01:36	1.5
@@ -896,4 +896,1204 @@
 
 *************************************************************
 
+From: Michael F. Yoder
+Sent: Monday, October 21, 2002  10:58 AM
+
+This is one of the items on my homework list.
+
+UTF = UCS Transformation Format. UCS = Universal Multiple-Octet Coded
+Character Set. I guess the MOC is silent.  :-)
+
+UTF-8 encodes 31-bit values as sequences of one to six 8-bit values, as follows.
+
+0xxxxxxx                     encodes itself (the coding is ASCII-compatible)
+110xxxxx 10Y                 encodes xxxxxY where Y stands for yyyyyy
+1110xxxx 10Y 10Z             encodes xxxxYZ
+11110xxx 10Y 10Z 10U         encodes xxxYZU
+111110xx 10Y 10Z 10U 10V     encodes xxYZUV
+1111110x 10Y 10Z 10U 10V 10W encodes xYZUVW
+
+The octets 11111110 and 11111111 aren't used in the encoding. So,
+excepting these 2, octets starting with 11 are headers, those starting
+with 10 are trailers, and those starting with 0 are singletons.
+
+It's forbidden to use the redundant encodings (you must use the shortest
+encoding allowed). There are security reasons for this, aside from the
+fact that doing so breaks the string search property mentioned below.
+
+The encoding is self-synchronizing: if you start in the middle of a
+string of octets, you skip octets of the form 10xxxxxx to get to the
+next start of character.
+
+If the encoding is proper, string searches for an encoded pattern within
+an encoded string will work as desired to yield all occurrences of the
+pattern. (For case-folded searches and the like this only works if the
+string is mapped before being converted to UTF-8.)
+
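Editorial illustration, not part of the archived message: a minimal Python sketch of the one-to-six-octet scheme described above, with a decoder that rejects redundant (overlong) forms. The names `encode_ucs`, `decode_ucs`, `next_boundary`, `MAX_FOR_LENGTH` and `HEADER_PREFIX` are hypothetical, and the code follows the 31-bit layout in the message rather than any particular library.

```python
# Largest value representable in 1..6 octets under the scheme above
# (7, 11, 16, 21, 26 and 31 payload bits respectively).
MAX_FOR_LENGTH = [0x7F, 0x7FF, 0xFFFF, 0x1FFFFF, 0x3FFFFFF, 0x7FFFFFFF]

# Fixed leading bits of the header octet for sequences of 2..6 octets.
HEADER_PREFIX = {2: 0xC0, 3: 0xE0, 4: 0xF0, 5: 0xF8, 6: 0xFC}


def encode_ucs(value: int) -> bytes:
    """Encode one 31-bit value using the shortest allowed form."""
    if not 0 <= value <= 0x7FFFFFFF:
        raise ValueError("value out of 31-bit range")
    if value <= 0x7F:
        return bytes([value])  # singleton: ASCII encodes itself
    length = next(n + 1 for n, top in enumerate(MAX_FOR_LENGTH) if value <= top)
    trailers = []
    for _ in range(length - 1):
        trailers.append(0x80 | (value & 0x3F))  # trailer octet: 10yyyyyy
        value >>= 6
    header = HEADER_PREFIX[length] | value
    return bytes([header] + trailers[::-1])


def decode_ucs(data: bytes) -> list:
    """Decode octets to values, rejecting malformed and redundant forms."""
    values = []
    i = 0
    while i < len(data):
        first = data[i]
        if first < 0x80:
            length, value = 1, first
        elif first < 0xC0:
            raise ValueError(f"unexpected trailer octet at offset {i}")
        elif first >= 0xFE:
            raise ValueError(f"octet {first:#x} is never used")
        else:
            # The number of leading 1 bits in the header is the length.
            length = 2
            while first & (0x80 >> length):
                length += 1
            value = first & (0xFF >> (length + 1))
        if i + length > len(data):
            raise ValueError("truncated sequence")
        for octet in data[i + 1 : i + length]:
            if octet & 0xC0 != 0x80:
                raise ValueError("missing trailer octet")
            value = (value << 6) | (octet & 0x3F)
        # A value that would have fit in a shorter sequence is redundant.
        if length > 1 and value <= MAX_FOR_LENGTH[length - 2]:
            raise ValueError("redundant (overlong) encoding")
        values.append(value)
        i += length
    return values


def next_boundary(data: bytes, i: int) -> int:
    """Self-synchronization: skip 10xxxxxx octets to the next character start."""
    while i < len(data) and data[i] & 0xC0 == 0x80:
        i += 1
    return i
```

Because every value has exactly one accepted encoding, a plain byte-level search for `encode_ucs(v)` in a properly encoded string finds exactly the occurrences of `v`.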
+*************************************************************
+
+From: Robert Dewar
+Sent: Monday, October 21, 2002  11:03 AM
+
+Is anyone using UTF-8 encoding with Ada? We have some customers using wide
+character encodings but none to our knowledge uses UTF-8.
+
+*************************************************************
+
+From: Robert A. Duff
+Sent: Monday, October 21, 2002  11:43 AM
+
+> It's forbidden to use the redundant encodings (you must use the shortest
+> encoding allowed). There are security reasons for this,...
+
+I'm curious: why is that?  (Not quite curious enough to go RTFM.  ;-))
+
+>... aside from the
+> fact that doing so breaks the string search property mentioned below.
+
+Yes, I understand that.
+
+*************************************************************
+
+From: Michael F. Yoder
+Sent: Monday, October 21, 2002  1:15 PM
+
+This problem is one my previous employer is having to deal with.
+Basically, it's that redundant encodings can be used to sneak things
+past filters if the redundant encodings aren't rejected; if redundant
+encodings are allowed, writing (say) a regular expression that will
+match exactly all possible encoded forms is a pain, is error-prone, and
+is probably significantly slower to check.
+
+Here's a contrived case. A program reads a command, and if it's the
+special command 'shazam' it checks the user's authorization; otherwise
+it passes on the command unmodified, because all other commands are
+safe. If there's a redundant encoding of 'shazam' that the filter
+misses, an unauthorized user can bypass the checking if he can arrange
+to supply that encoding.
+
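Editorial illustration, not part of the archived message: a small Python sketch of the contrived 'shazam' case. The functions `lax_decode` and `filter_allows` are hypothetical; `lax_decode` is deliberately unsafe and handles only one- and two-octet sequences, which is enough to show the bypass.

```python
def lax_decode(data: bytes) -> str:
    """Decode 1- and 2-octet sequences WITHOUT rejecting redundant forms.

    Deliberately unsafe; for illustration only.
    """
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif 0xC0 <= b <= 0xDF and i + 1 < len(data):
            # 110xxxxx 10yyyyyy, accepted even when the value is below 0x80.
            out.append(chr(((b & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        else:
            raise ValueError("sequence not handled by this sketch")
    return "".join(out)


def filter_allows(raw: bytes) -> bool:
    """Hypothetical byte-level filter: stop only the literal octets of 'shazam'."""
    return b"shazam" not in raw


# 's' is 0x73 = 1110011; its redundant two-octet form is 11000001 10110011.
sneaky = b"\xc1\xb3hazam"

print(filter_allows(sneaky))        # the filter sees no literal 'shazam'
print(lax_decode(sneaky))           # yet a lax decoder recovers the command

try:
    sneaky.decode("utf-8")          # Python's built-in decoder is strict
except UnicodeDecodeError as exc:
    print("strict decoder rejects it:", exc.reason)
```

Rejecting redundant forms at decode time, as the built-in decoder does, means the filter only ever has to match one byte sequence per command.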
+*************************************************************
+
+From: Michael F. Yoder
+Sent: Thursday, October 24, 2002  5:46 PM
+
+This is the easy part of my homework. The identifier character ranges
+are defined in terms of multiple character categories (see below), so I
+can't get the harder part without a little coding.
+
+This is using Unicode version 3.2.
+
+A "space" is itself a normative category.  It is anything in the range
+U+2000 to U+200B, plus 5 other scattered characters.
+
+A "separator" is any space plus the two characters "Line Separator"
+U+2028 and "Paragraph Separator" U+2029. These are each in a normative
+category containing just 1 value.
+
+A "decimal digit" is itself a normative category. There are 25 ranges of
+these, 23 including the digits 0 through 9 and 2 with only the digits 1
+through 9. (These two scripts use the ASCII zero rather than encoding a
+separate one.) Five of these ranges are above U+FFFF, that is, out of
+the BMP (their character descriptions all start with "mathematical").
+The digits 1 through 9 in these scripts don't in general look much like
+our 1 through 9.
+
+The rules for identifiers say (I'm condensing and interpreting) that the
+syntax for identifiers should start with their basic definition and
+fiddle it as appropriate to include extra characters (for Ada, that
+means underscore). Their basic definition is
+
+   identifier ::= id-start { id-start | id-extend }
+
+id-start is any letter (which come in 5 subcategories) or a "letter
+number." There are a lot of letters outside the BMP, including the large
+range "CJK Ideograph Extension B."
+
+id-extend is decimal digits plus nonspacing marks, spacing combining
+marks, connector punctuation, and formatting codes.
+
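Editorial illustration, not part of the archived message: a minimal Python sketch of the basic identifier definition above, built on general categories. It assumes the standard library's `unicodedata.ucd_3_2_0` object (the Unicode 3.2 database, matching the version named in the message); the names `ID_START`, `ID_EXTEND` and `is_identifier` are hypothetical, and Ada's own rules for underscores are not modelled.

```python
import unicodedata

# Unicode 3.2 character database, as shipped with the Python standard library.
UCD = unicodedata.ucd_3_2_0

# id-start: the five letter subcategories plus "letter number".
ID_START = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}

# id-extend: decimal digits, nonspacing marks, spacing combining marks,
# connector punctuation, and formatting codes.
ID_EXTEND = {"Nd", "Mn", "Mc", "Pc", "Cf"}


def is_id_start(ch: str) -> bool:
    return UCD.category(ch) in ID_START


def is_id_extend(ch: str) -> bool:
    return UCD.category(ch) in ID_EXTEND


def is_identifier(text: str) -> bool:
    """identifier ::= id-start { id-start | id-extend }"""
    if not text or not is_id_start(text[0]):
        return False
    return all(is_id_start(c) or is_id_extend(c) for c in text[1:])
```

Under this basic definition the underscore already qualifies as id-extend, since it is connector punctuation; a language-specific rule such as Ada's ban on doubled or trailing underscores would be layered on top.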
+*************************************************************
+
+From: Robert Dewar
+Sent: Thursday, October 24, 2002  7:19 PM
+
+I am completely confused. Why exactly are we discussing this? Can you
+be clear as to the goals of this discussion?
+
+*************************************************************
+
+From: Randy Brukardt
+Sent: Thursday, October 24, 2002  2:50 PM
+
+I know I don't count, :-)
+but I've had several requests to extend my spam filter to support UTF-8
+encodings. Because I'm not asking for any money for the filter, and I
+haven't had any significant amount of UTF-8 mail, I haven't done anything
+about it yet. But it seems likely that I will need to do this at some point
+(I've seen occasional UTF-8 encoded mail, but not enough good mail that
+handling it manually is a problem.)
+
+*************************************************************
+
+From: Robert Dewar
+Sent: Thursday, October 24, 2002  4:29 PM
+
+Oh sure, UTF-8 encoded spam is common indeed, but that was not what I was
+talking about (unless you have some spam messages written in Ada source code :-)
+
+*************************************************************
+
+From: Randy Brukardt
+Sent: Thursday, October 24, 2002  4:59 PM
+
+I think you misunderstand. I have written an anti-spam plugin for the IMS
+mailserver that I use. It is written in Ada, of course, and I've had
+requests for it to be able to handle UTF-8 encoded mail. For me, it's fine
+to treat such mail as all spam, but that is not true for some of the other
+users of it. (I've made it available to the community of IMS users, as they
+have made many useful plugins available that I have been using for years.)
+
+In order to properly support UTF-8 mail, I'd need at least to convert the
+search patterns (in Latin-1, of course) into UTF-8. I'd also need to verify
+that the rules that Mike noted are followed. (A common trick of spammers is
+to violate basic encoding rules, as most decoders don't check, but the
+illegal encodings tend to get ignored by filters, because they don't match
+exactly. That was one of the prime reasons I wrote the plugin in the first
+place, because a lot of spam is now coming encoded in one way or another,
+and thus is not picked up by a plain text scan.)
+
+*************************************************************
+
+From: Robert Dewar
+Sent: Thursday, October 24, 2002  7:17 PM
+
+Oh! I was confused then, I thought this was something to do with Ada.
+
+*************************************************************
+
+From: Randy Brukardt
+Sent: Thursday, October 24, 2002  7:46 PM
+
+Of course it has to do with Ada. You asked "Is anyone using UTF-8 encoding
+with Ada?" And I answered that I have an Ada program that needs to process
+UTF-8 text (but doesn't yet). And I tried to explain what the program is and
+why it needs to process UTF-8 text and what support from Ada would be
+valuable.
+
+Perhaps I should have just answered your original question "Yes"? :-)
+
+*************************************************************
+
+From: Robert Dewar
+Sent: Thursday, October 24, 2002  8:09 PM
+
+Sorry, when I said "using UTF-8 encoding with Ada", I was talking about
+language features for wide character representation.
+
+The fact that your program is in Ada does not seem to be particularly
+informative. I am completely confused here, what ARG-related language
+problem is this thread addressing?
+
+*************************************************************
+
+From: Randy Brukardt
+Sent: Thursday, October 24, 2002  8:32 PM
+
+As I recall, one of the facets of UTF-8 support in Ada would be the
+equivalent of Ada.Characters.Handling for UTF-8 represented Strings. Those
+operations would be valuable for this application, particularly
+To_Wide_String (UTF_8_String) or To_UTF_8_String (String). A UTF-8 Text_IO
+would also be valuable, although I'd find that overkill for this application
+(usually the text has to be decoded to UTF-8 from some 7-bit representation
+anyway).
+
+I'm not sure where else UTF-8 would appear in the standard. Source
+representation and external file representations are outside of the scope of
+the standard. The regular string operations seem to work for most (all?)
+operations. Everything else seems to already be covered by the existing wide
+character support.
+
+*************************************************************
+
+From: Robert Dewar
+Sent: Thursday, October 24, 2002  8:45 PM
+
+Well, harmless I suppose, but I doubt worth the effort. Again, I would
+generate packages on the basis of packages that exist, have proved useful
+and are actually widely used. It seems a mistake to get into the "here's
+a neat idea for a package that would help with something I happen to be
+doing".
+
+*************************************************************
+
+From: Michael F. Yoder
+Sent: Thursday, October 24, 2002  5:46 PM
+
+>  I am completely confused here, what ARG-related language
+>problem is this thread addressing?
+
+Kiyoshi Ishihata stated at the last meeting that there was an interest
+in some countries in being able to write programs as much as possible in
+native languages, the primary deficit in this regard being that
+identifiers are entirely in Latin-1 characters. He didn't specify which
+countries to my recollection, but Japan, Russia, China, and India are
+obvious cases where the commonly used scripts are disjoint from Latin-1.
+
+The information being supplied is exploratory in nature: the idea is to
+find out how hard it would be to extend existing compilers so as to
+satisfy all the national groups at once, and whether and to what extent
+the ARG should be involved in specifying standards for such extensions.
+
+There was a separate issue involving the fact that ISO 10646-n (I forget
+what n is) now has mapped characters outside the BMP. This had to
+happen, given that the code now maps some 70,000 Han characters.
+
+*************************************************************
+
+From: Robert Dewar
+Sent: Thursday, October 24, 2002  8:54 PM
+
+Well I would just allow arbitrary wide characters in identifiers, why not,
+it does not cause any problems. GNAT has implemented an option for this
+for ever. I would specify that there is no upper/lower case equivalence
+in this case, since otherwise you get into a huge mess that is simply not
+worth the effort.
+
+*************************************************************
+
+From: Tucker Taft
+Sent: Thursday, October 24, 2002  10:10 PM
+
+I suggest you read the ARG minutes when they are available.  Kiyoshi
+indicated specifically that they wanted to restrict usage to
+characters that "make sense" as identifier characters.  I will admit
+I was in your camp that the simplest is to just allow anything.
+However, I will leave it to Kiyoshi to explain his reasoning.
+He certainly knows more than I do about the requirements.  You
+should perhaps discuss it directly with Kiyoshi if you don't agree.
+
+Mike indicated that UTF-8 encoding makes it easy to support even
+very wide characters in identifiers, because it provides a canonical
+representation, as a stream of bytes.  We asked him to share his
+knowledge in this area, so we didn't all have to become experts in
+ISO-10646 to evaluate the implementation issues in this area.
+
+*************************************************************
+
+From: Randy Brukardt
+Sent: Thursday, October 24, 2002  10:29 PM
+
+Here are my notes on the Wide_Character in identifiers issue, which will be
+turned into the minutes.
+
+"What about full source representation of the language in Wide_Character?
+Kiyoshi reports that there is a push in SC22 to allow full wide characters
+in identifiers.
+
+How do you define which characters are letters? How do you define case
+equivalence? Mike says just use "letter" in the character standard. But this
+is likely to be very complex in the compiler and in the run-time. Tucker
+suggests that anything out of row 00 be treated as a letter. Kiyoshi says that
+would not be acceptable to Japan, which is preparing a standard for which
+characters are allowed in identifiers."
+
+*************************************************************
+
+From: Robert Dewar
+Sent: Friday, October 25, 2002  4:11 AM
+
+> I suggest you read the ARG minutes when they are available.  Kiyoshi
+> indicated specifically that they wanted to restrict usage to
+> characters that "make sense" as identifier characters.  I will admit
+> I was in your camp that the simplest is to just allow anything.
+> However, I will leave it to Kiyoshi to explain his reasoning.
+> He certainly knows more than I do about the requirements.  You
+> should perhaps discuss it directly with Kiyoshi if you don't agree.
+
+I would leave such restrictions up to either local coding standards,
+enforced e.g. by ASIS tools, or enforced by compiler restrictions.
+Getting into what makes sense in different languages is way way out
+of scope. (I speak as the former chair of the CRG; character issues
+are very difficult to deal with. In the context of the CRG work, we
+spent ages discussing the issue of whether E and E-acute should be
+equivalent in identifiers, and came to the conclusion that the answer
+might be different in different languages.)
+
+There is no point in adding a huge national dependent mess here. Indeed I
+would consider in the ISO standard saying specifically that national bodies
+are welcome to devise local sub-standards for identifiers and character
+set requirements and leave it at that.
+
+I perfectly well understand where Kiyoshi is coming from. I am sure he feels
+as strongly that only certain characters be used as Jean Ichbiah felt about
+the E/E-acute issue. But it just is not practical for the international
+standard to get into the business of deciding what are and what are not
+useful identifier names in all the languages of the world, or even just
+for the P members :-)
+
+*************************************************************
+
+From: Robert Dewar
+Sent: Friday, October 25, 2002  4:16 AM
+
+OK, so great, very appropriate, there can be a Japanese National standard
+that specifies that for Ada compilers to meet this standard, there must be
+a mode in which identifiers are only allowed to contain bla bla characters.
+Other countries in the world are free to devise similar national standards
+but I fail to see why they should be a matter for an international standard.
+
+What would be marginally useful in the international standard would be to
+devise a general framework for those national standards, and make it clear
+that it is an acceptable thing for Ada compilers to implement one or more
+of these standards. Frankly I think that the standard already does that,
+but it would be fine to make it explicit. GNAT for example allows lots
+of localization of identifier characters sets, e.g. Latin-2, Cyrillic etc.
+
+*************************************************************
+
+From: Pascal Leroy
+Sent: Friday, October 25, 2002  6:54 AM
+
+> But it just is not practical for the international
+> standard to get into the business of deciding what are and what are not
+> useful identifier names in all the languages of the world...
+
+It has certainly never been the intent to have the ARG discuss the
+identifier characters for all the languages in the world.  However, there is
+an ISO working group in charge of developing and maintaining the ISO 10646
+standard, and the intent was to piggyback on the work done there.
+
+10646 defines precisely what is a character (and so yes, E and E-acute are
+distinct, as are uppercase A and uppercase alpha, even though they really
+look the same), what is a letter, a digit, how the uppercase/lowercase
+conversions work, etc.  I see no reason why the Ada standard couldn't use
+these definitions.  (And Mike gave us a feeling of what this would look
+like, and it doesn't seem unreasonably complicated to me.)
+
+Note that Java does exactly that, and defines letters and digits in a way
+which is derived from Unicode (itself a close approximation to 10646).  I
+don't see why Ada would lag behind in this area: it would not be a big
+implementation effort, and it would improve usability of the language.
+
+I don't buy the notion that national bodies have a role to play here (except
+of course that they probably want to influence 10646).  It's already hard to
+define one language standard and ensure that it's implemented with a minimum
+of consistency, I don't see how users or implementers could live with the
+coexistence of "Japanese Ada" and "Hebrew Ada" and "Russian Ada".
+
+Pascal
+
+PS: Note that the E vs. E-acute discussion is moot, since this is already
+settled by Latin-1 and yes, they are different.
+
+*************************************************************
+
+From: Robert Dewar
+Sent: Friday, October 25, 2002  7:55 PM
+
+> I don't buy the notion that national bodies have a role to play here (except
+> of course that they probably want to influence 10646).  It's already hard to
+> define one language standard and ensure that it's implemented with a minimum
+> of consistency, I don't see how users or implementers could live with the
+> coexistence of "Japanese Ada" and "Hebrew Ada" and "Russian Ada".
+
+Well GNAT implements lots of different localized character sets, and no one
+seems to have dropped dead :-)
+
+*************************************************************
+
+From: Robert A. Duff
+Sent: Friday, October 25, 2002  9:13 AM
+
+> Kiyoshi Ishihata stated at the last meeting that there was an interest
+> in some countries in being able to write programs as much as possible in
+> native languages, the primary deficit in this regard being that
+> identifiers are entirely in Latin-1 characters.
+
+Yes, but it was also mentioned at the meeting that SC22 is trying to get
+programming languages to do something-or-other related to this.
+I.e. allow 31-bit characters in identifiers, and have some uniformity
+across programming languages about which characters are allowed in
+identifiers.  I suppose WG9 is supposed to "obey" SC22 on this point?
+
+By the way, let's mention the AI number being discussed in these
+messages, so we don't get the "What the heck are you talking about?"
+kinds of messages from Robert or others who might have missed part of
+the discussion.  ;-)  I believe Pascal raised the issue many months ago,
+and it has an AI number, and one can presumably search for that AI
+number in the meeting minutes (once Randy publishes them).
+
+*************************************************************
+
+From: Robert Dewar
+Sent: Friday, October 25, 2002  8:32 PM
+
+I tried, I could not find the AI number on this one
+
+Of course if there are uniform rules at the SC22 level, then it is fine
+to adopt them in Ada. I just think it is not something we should expend
+our own very limited resources on.
+
+*************************************************************
+
+From: Randy Brukardt
+Sent: Friday, October 25, 2002  8:59 PM
+
+This was discussed as part of AI-285, which started life as an AI about
+Latin-9. That discussion took up the entire afternoon of the third day of
+the meeting.
+
+These other issues came up since it was felt that better Wide_Character
+support would (might?) make it unnecessary for the standard to directly deal
+with Latin-9. (Implementations still would have to, in all likelihood.)
+
+There are a lot of notes in this area, and I haven't gotten that far in the
+minutes yet. So my summary might be suspect... (And I haven't posted the
+mail yet, either, but it's likely that it will all go on AI-285.)
+
+*************************************************************
+
+From: Robert Dewar
+Sent: Friday, October 25, 2002  9:12 PM
+
+> This was discussed as part of AI-285, which started life as an AI about
+> Latin-9. That discussion took up the entire afternoon of the third day of
+> the meeting.
+
+Be careful not to be eaten alive by character discussions. It was quite
+intentional that we banned discussion of these issues from the main group
+in the Ada 9X effort and shoveled them off to the CRG. Spending one of
+six sessions on this issue alone to me says that things are already getting
+out of control :-) I quite understand how this happens (remember I was chair
+of the CRG!)
+
+> These other issues came up since it was felt that better Wide_Character
+> support would (might?) make it unnecessary for the standard to directly deal
+> with Latin-9. (Implementations still would have to, in all likelihood.)
+
+Well of course in practice Latin-9 is barely interesting, it just introduces
+a different name for the Euro character. But for sure most computing with
+Ada will be done using Latin-9 whatever the Ada standard says :-)
+
+*************************************************************
+
+From: Randy Brukardt
+Sent: Friday, October 25, 2002  10:14 PM
+
+Well, it sounds worse than it is. The afternoon session of the last day is
+typically short. We didn't get back from lunch until about 2:15, and we
+adjourned at 3:28. Still, I probably would have dozed off during this
+discussion if I hadn't been taking notes...
+
+*************************************************************
+
+From: Robert A. Duff
+Sent: Friday, October 25, 2002  9:19 AM
+
+I agree that the ARG should not spend time thinking about characters.
+And we should not add all kinds of verbiage about character sets to the
+RM.  But if there is a character-set standard that can be simply
+referred to, why not.  Apparently, there *is* a definition of which
+31-bit characters are "letters".  I thought the intent was to simply
+refer to that definition (which of course changes from year to year).
+
+*************************************************************
+
+From: Robert Dewar
+Sent: Friday, October 25, 2002  8:45 PM
+
+Probably that's reasonable, although I worry that this will generate a lot
+of busy work in implementations for extraordinarily little gain.
+
+*************************************************************
+
+From: Robert A. Duff
+Sent: Saturday, October 26, 2002  9:58 AM
+
+Yes.  The purpose of Mike Yoder's "homework assignment" was to determine
+how difficult it is to write the "Is_Letter" function that the Ada lexer
+would need.  And a case conversion routine, I guess.  And how
+inefficient these would have to be.  (People at the meeting were
+concerned about huge character-set tables having to be in the compiler.)
+
+I'm not at all interested in these character set issues.  If folks can
+make an AI that is trivial to implement (efficiently), and invokes all
+character-set junk by reference to other standards, then I suppose it's
+OK with me.
+
+[ Insert my usual rant about what's important, here.  ;-) ]
+
+*************************************************************
+
+From: Robert A. Duff
+Sent: Saturday, October 26, 2002  10:14 AM
+
+I agree with Bob in all respects, including the parenthetical comment
+
+*************************************************************
+
+From: Pascal Leroy
+Sent: Wednesday, November 27, 2002  4:27 AM
+
+During the last meeting we discussed the possibility of allowing any Unicode
+character (er, I mean, ISO 10646) in Ada source.  Some people were concerned
+that the classification tables and the uppercase translation tables would be
+huge and complex to produce.
+
+Mike Y provided some input on this topic a while back, but since I (and
+probably other people) prefer to see the real tables, I spent a couple of hours
+writing a little Ada program to parse the Unicode database and spit out
+aggregates for these tables.  I am attaching to this message three
+classification tables (letters, digits, and spaces) as well as the table that
+converts to uppercase.
+
+The latter is the largest one, and it only has 419 entries, for a total of 5028
+bytes.  And that's with a representation that is not particularly compact: a
+more space-efficient representation could be obtained for instance by storing
+the ranges as (First, Length) instead of (First, Last).
+
+The tables would change slightly depending on the rules that we choose (e.g.
+for the syntax of identifiers) but their size would not be substantially
+modified.
+
+This demonstrates two things:
+
+1 - The tables are easy to produce from the Unicode database.
+2 - The tables are small.
+
+---
+
+Digits : constant Ranges :=
+   (
+    (16#30#, 16#39#), -- DIGIT ZERO .. DIGIT NINE
+    (16#B2#, 16#B3#), -- SUPERSCRIPT TWO .. SUPERSCRIPT THREE
+    (16#B9#, 16#B9#), -- SUPERSCRIPT ONE .. SUPERSCRIPT ONE
+    (16#660#, 16#669#), -- ARABIC-INDIC DIGIT ZERO .. ARABIC-INDIC DIGIT NINE
+    (16#6F0#, 16#6F9#), -- EXTENDED ARABIC-INDIC DIGIT ZERO .. EXTENDED ARABIC-INDIC DIGIT NINE
+    (16#966#, 16#96F#), -- DEVANAGARI DIGIT ZERO .. DEVANAGARI DIGIT NINE
+    (16#9E6#, 16#9EF#), -- BENGALI DIGIT ZERO .. BENGALI DIGIT NINE
+    (16#A66#, 16#A6F#), -- GURMUKHI DIGIT ZERO .. GURMUKHI DIGIT NINE
+    (16#AE6#, 16#AEF#), -- GUJARATI DIGIT ZERO .. GUJARATI DIGIT NINE
+    (16#B66#, 16#B6F#), -- ORIYA DIGIT ZERO .. ORIYA DIGIT NINE
+    (16#BE7#, 16#BEF#), -- TAMIL DIGIT ONE .. TAMIL DIGIT NINE
+    (16#C66#, 16#C6F#), -- TELUGU DIGIT ZERO .. TELUGU DIGIT NINE
+    (16#CE6#, 16#CEF#), -- KANNADA DIGIT ZERO .. KANNADA DIGIT NINE
+    (16#D66#, 16#D6F#), -- MALAYALAM DIGIT ZERO .. MALAYALAM DIGIT NINE
+    (16#E50#, 16#E59#), -- THAI DIGIT ZERO .. THAI DIGIT NINE
+    (16#ED0#, 16#ED9#), -- LAO DIGIT ZERO .. LAO DIGIT NINE
+    (16#F20#, 16#F29#), -- TIBETAN DIGIT ZERO .. TIBETAN DIGIT NINE
+    (16#1040#, 16#1049#), -- MYANMAR DIGIT ZERO .. MYANMAR DIGIT NINE
+    (16#1369#, 16#1371#), -- ETHIOPIC DIGIT ONE .. ETHIOPIC DIGIT NINE
+    (16#17E0#, 16#17E9#), -- KHMER DIGIT ZERO .. KHMER DIGIT NINE
+    (16#1810#, 16#1819#), -- MONGOLIAN DIGIT ZERO .. MONGOLIAN DIGIT NINE
+    (16#2070#, 16#2070#), -- SUPERSCRIPT ZERO .. SUPERSCRIPT ZERO
+    (16#2074#, 16#2079#), -- SUPERSCRIPT FOUR .. SUPERSCRIPT NINE
+    (16#2080#, 16#2089#), -- SUBSCRIPT ZERO .. SUBSCRIPT NINE
+    (16#FF10#, 16#FF19#), -- FULLWIDTH DIGIT ZERO .. FULLWIDTH DIGIT NINE
+    (16#1D7CE#, 16#1D7FF#) -- MATHEMATICAL BOLD DIGIT ZERO .. MATHEMATICAL MONOSPACE DIGIT NINE
+   );
+
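Editorial illustration, not part of the archived message: a minimal Python sketch of how a sorted table of (First, Last) ranges like the one above could back a classification test via binary search. Only a handful of entries are copied from the Digits table; the names `DIGIT_RANGES`, `FIRSTS` and `in_ranges` are hypothetical. As a side note on the sizes quoted in the message, 5028 bytes for 419 entries works out to 12 bytes per entry; the message does not say how an entry is laid out.

```python
from bisect import bisect_right

# A few (First, Last) pairs copied from the Digits table, kept sorted by First.
DIGIT_RANGES = [
    (0x30, 0x39),        # DIGIT ZERO .. DIGIT NINE
    (0x660, 0x669),      # ARABIC-INDIC DIGIT ZERO .. NINE
    (0x966, 0x96F),      # DEVANAGARI DIGIT ZERO .. NINE
    (0xFF10, 0xFF19),    # FULLWIDTH DIGIT ZERO .. NINE
    (0x1D7CE, 0x1D7FF),  # MATHEMATICAL BOLD DIGIT ZERO .. MONOSPACE DIGIT NINE
]

# Lower bounds only, so bisect can locate the candidate range.
FIRSTS = [first for first, _ in DIGIT_RANGES]


def in_ranges(code_point: int) -> bool:
    """Return True if code_point falls inside one of the table's ranges."""
    # Index of the last range whose First is <= code_point, or -1 if none.
    idx = bisect_right(FIRSTS, code_point) - 1
    return idx >= 0 and code_point <= DIGIT_RANGES[idx][1]
```

With a few hundred entries a lookup needs at most nine or so comparisons, so the same approach would serve for the letter and space tables as well.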
+---
+
+Letters : constant Ranges :=
+   (
+    (16#41#, 16#5A#), -- LATIN CAPITAL LETTER A .. LATIN CAPITAL LETTER Z
+    (16#61#, 16#7A#), -- LATIN SMALL LETTER A .. LATIN SMALL LETTER Z
+    (16#AA#, 16#AA#), -- FEMININE ORDINAL INDICATOR .. FEMININE ORDINAL INDICATOR
+    (16#B5#, 16#B5#), -- MICRO SIGN .. MICRO SIGN
+    (16#BA#, 16#BA#), -- MASCULINE ORDINAL INDICATOR .. MASCULINE ORDINAL INDICATOR
+    (16#C0#, 16#D6#), -- LATIN CAPITAL LETTER A WITH GRAVE .. LATIN CAPITAL LETTER O WITH DIAERESIS
+    (16#D8#, 16#F6#), -- LATIN CAPITAL LETTER O WITH STROKE .. LATIN SMALL LETTER O WITH DIAERESIS
+    (16#F8#, 16#2B8#), -- LATIN SMALL LETTER O WITH STROKE .. MODIFIER LETTER SMALL Y
+    (16#2BB#, 16#2C1#), -- MODIFIER LETTER TURNED COMMA .. MODIFIER LETTER REVERSED GLOTTAL STOP
+    (16#2D0#, 16#2D1#), -- MODIFIER LETTER TRIANGULAR COLON .. MODIFIER LETTER HALF TRIANGULAR COLON
+    (16#2E0#, 16#2E4#), -- MODIFIER LETTER SMALL GAMMA .. MODIFIER LETTER SMALL REVERSED GLOTTAL STOP
+    (16#2EE#, 16#2EE#), -- MODIFIER LETTER DOUBLE APOSTROPHE .. MODIFIER LETTER DOUBLE APOSTROPHE
+    (16#37A#, 16#37A#), -- GREEK YPOGEGRAMMENI .. GREEK YPOGEGRAMMENI
+    (16#386#, 16#386#), -- GREEK CAPITAL LETTER ALPHA WITH TONOS .. GREEK CAPITAL LETTER ALPHA WITH TONOS
+    (16#388#, 16#3F5#), -- GREEK CAPITAL LETTER EPSILON WITH TONOS .. GREEK LUNATE EPSILON SYMBOL
+    (16#400#, 16#481#), -- CYRILLIC CAPITAL LETTER IE WITH GRAVE .. CYRILLIC SMALL LETTER KOPPA
+    (16#48A#, 16#559#), -- CYRILLIC CAPITAL LETTER SHORT I WITH TAIL .. ARMENIAN MODIFIER LETTER LEFT HALF RING
+    (16#561#, 16#587#), -- ARMENIAN SMALL LETTER AYB .. ARMENIAN SMALL LIGATURE ECH YIWN
+    (16#5D0#, 16#5F2#), -- HEBREW LETTER ALEF .. HEBREW LIGATURE YIDDISH DOUBLE YOD
+    (16#621#, 16#64A#), -- ARABIC LETTER HAMZA .. ARABIC LETTER YEH
+    (16#66E#, 16#66F#), -- ARABIC LETTER DOTLESS BEH .. ARABIC LETTER DOTLESS QAF
+    (16#671#, 16#6D3#), -- ARABIC LETTER ALEF WASLA .. ARABIC LETTER YEH BARREE WITH HAMZA ABOVE
+    (16#6D5#, 16#6D5#), -- ARABIC LETTER AE .. ARABIC LETTER AE
+    (16#6E5#, 16#6E6#), -- ARABIC SMALL WAW .. ARABIC SMALL YEH
+    (16#6FA#, 16#6FC#), -- ARABIC LETTER SHEEN WITH DOT BELOW .. ARABIC LETTER GHAIN WITH DOT BELOW
+    (16#710#, 16#710#), -- SYRIAC LETTER ALAPH .. SYRIAC LETTER ALAPH
+    (16#712#, 16#72C#), -- SYRIAC LETTER BETH .. SYRIAC LETTER TAW
+    (16#780#, 16#7A5#), -- THAANA LETTER HAA .. THAANA LETTER WAAVU
+    (16#7B1#, 16#7B1#), -- THAANA LETTER NAA .. THAANA LETTER NAA
+    (16#905#, 16#939#), -- DEVANAGARI LETTER A .. DEVANAGARI LETTER HA
+    (16#93D#, 16#93D#), -- DEVANAGARI SIGN AVAGRAHA .. DEVANAGARI SIGN AVAGRAHA
+    (16#950#, 16#950#), -- DEVANAGARI OM .. DEVANAGARI OM
+    (16#958#, 16#961#), -- DEVANAGARI LETTER QA .. DEVANAGARI LETTER VOCALIC LL
+    (16#985#, 16#9B9#), -- BENGALI LETTER A .. BENGALI LETTER HA
+    (16#9DC#, 16#9E1#), -- BENGALI LETTER RRA .. BENGALI LETTER VOCALIC LL
+    (16#9F0#, 16#9F1#), -- BENGALI LETTER RA WITH MIDDLE DIAGONAL .. BENGALI LETTER RA WITH LOWER DIAGONAL
+    (16#A05#, 16#A39#), -- GURMUKHI LETTER A .. GURMUKHI LETTER HA
+    (16#A59#, 16#A5E#), -- GURMUKHI LETTER KHHA .. GURMUKHI LETTER FA
+    (16#A72#, 16#A74#), -- GURMUKHI IRI .. GURMUKHI EK ONKAR
+    (16#A85#, 16#AB9#), -- GUJARATI LETTER A .. GUJARATI LETTER HA
+    (16#ABD#, 16#ABD#), -- GUJARATI SIGN AVAGRAHA .. GUJARATI SIGN AVAGRAHA
+    (16#AD0#, 16#AE0#), -- GUJARATI OM .. GUJARATI LETTER VOCALIC RR
+    (16#B05#, 16#B39#), -- ORIYA LETTER A .. ORIYA LETTER HA
+    (16#B3D#, 16#B3D#), -- ORIYA SIGN AVAGRAHA .. ORIYA SIGN AVAGRAHA
+    (16#B5C#, 16#B61#), -- ORIYA LETTER RRA .. ORIYA LETTER VOCALIC LL
+    (16#B83#, 16#BB9#), -- TAMIL SIGN VISARGA .. TAMIL LETTER HA
+    (16#C05#, 16#C39#), -- TELUGU LETTER A .. TELUGU LETTER HA
+    (16#C60#, 16#C61#), -- TELUGU LETTER VOCALIC RR .. TELUGU LETTER VOCALIC LL
+    (16#C85#, 16#CB9#), -- KANNADA LETTER A .. KANNADA LETTER HA
+    (16#CDE#, 16#CE1#), -- KANNADA LETTER FA .. KANNADA LETTER VOCALIC LL
+    (16#D05#, 16#D39#), -- MALAYALAM LETTER A .. MALAYALAM LETTER HA
+    (16#D60#, 16#D61#), -- MALAYALAM LETTER VOCALIC RR .. MALAYALAM LETTER VOCALIC LL
+    (16#D85#, 16#DC6#), -- SINHALA LETTER AYANNA .. SINHALA LETTER FAYANNA
+    (16#E01#, 16#E30#), -- THAI CHARACTER KO KAI .. THAI CHARACTER SARA A
+    (16#E32#, 16#E33#), -- THAI CHARACTER SARA AA .. THAI CHARACTER SARA AM
+    (16#E40#, 16#E46#), -- THAI CHARACTER SARA E .. THAI CHARACTER MAIYAMOK
+    (16#E81#, 16#EB0#), -- LAO LETTER KO .. LAO VOWEL SIGN A
+    (16#EB2#, 16#EB3#), -- LAO VOWEL SIGN AA .. LAO VOWEL SIGN AM
+    (16#EBD#, 16#EC6#), -- LAO SEMIVOWEL SIGN NYO .. LAO KO LA
+    (16#EDC#, 16#F00#), -- LAO HO NO .. TIBETAN SYLLABLE OM
+    (16#F40#, 16#F6A#), -- TIBETAN LETTER KA .. TIBETAN LETTER FIXED-FORM RA
+    (16#F88#, 16#F8B#), -- TIBETAN SIGN LCE TSA CAN .. TIBETAN SIGN GRU MED RGYINGS
+    (16#1000#, 16#102A#), -- MYANMAR LETTER KA .. MYANMAR LETTER AU
+    (16#1050#, 16#1055#), -- MYANMAR LETTER SHA .. MYANMAR LETTER VOCALIC LL
+    (16#10A0#, 16#10F8#), -- GEORGIAN CAPITAL LETTER AN .. GEORGIAN LETTER ELIFI
+    (16#1100#, 16#135A#), -- HANGUL CHOSEONG KIYEOK .. ETHIOPIC SYLLABLE FYA
+    (16#13A0#, 16#166C#), -- CHEROKEE LETTER A .. CANADIAN SYLLABICS CARRIER TTSA
+    (16#166F#, 16#1676#), -- CANADIAN SYLLABICS QAI .. CANADIAN SYLLABICS NNGAA
+    (16#1681#, 16#169A#), -- OGHAM LETTER BEITH .. OGHAM LETTER PEITH
+    (16#16A0#, 16#16EA#), -- RUNIC LETTER FEHU FEOH FE F .. RUNIC LETTER X
+    (16#1700#, 16#1711#), -- TAGALOG LETTER A .. TAGALOG LETTER HA
+    (16#1720#, 16#1731#), -- HANUNOO LETTER A .. HANUNOO LETTER HA
+    (16#1740#, 16#1751#), -- BUHID LETTER A .. BUHID LETTER HA
+    (16#1760#, 16#1770#), -- TAGBANWA LETTER A .. TAGBANWA LETTER SA
+    (16#1780#, 16#17B3#), -- KHMER LETTER KA .. KHMER INDEPENDENT VOWEL QAU
+    (16#17D7#, 16#17D7#), -- KHMER SIGN LEK TOO .. KHMER SIGN LEK TOO
+    (16#17DC#, 16#17DC#), -- KHMER SIGN AVAKRAHASANYA .. KHMER SIGN AVAKRAHASANYA
+    (16#1820#, 16#18A8#), -- MONGOLIAN LETTER A .. MONGOLIAN LETTER MANCHU ALI GALI BHA
+    (16#1E00#, 16#1FBC#), -- LATIN CAPITAL LETTER A WITH RING BELOW .. GREEK CAPITAL LETTER ALPHA WITH PROSGEGRAMMENI
+    (16#1FBE#, 16#1FBE#), -- GREEK PROSGEGRAMMENI .. GREEK PROSGEGRAMMENI
+    (16#1FC2#, 16#1FCC#), -- GREEK SMALL LETTER ETA WITH VARIA AND YPOGEGRAMMENI .. GREEK CAPITAL LETTER ETA WITH PROSGEGRAMMENI
+    (16#1FD0#, 16#1FDB#), -- GREEK SMALL LETTER IOTA WITH VRACHY .. GREEK CAPITAL LETTER IOTA WITH OXIA
+    (16#1FE0#, 16#1FEC#), -- GREEK SMALL LETTER UPSILON WITH VRACHY .. GREEK CAPITAL LETTER RHO WITH DASIA
+    (16#1FF2#, 16#1FFC#), -- GREEK SMALL LETTER OMEGA WITH VARIA AND YPOGEGRAMMENI .. GREEK CAPITAL LETTER OMEGA WITH PROSGEGRAMMENI
+    (16#2071#, 16#2071#), -- SUPERSCRIPT LATIN SMALL LETTER I .. SUPERSCRIPT LATIN SMALL LETTER I
+    (16#207F#, 16#207F#), -- SUPERSCRIPT LATIN SMALL LETTER N .. SUPERSCRIPT LATIN SMALL LETTER N
+    (16#2102#, 16#2102#), -- DOUBLE-STRUCK CAPITAL C .. DOUBLE-STRUCK CAPITAL C
+    (16#2107#, 16#2107#), -- EULER CONSTANT .. EULER CONSTANT
+    (16#210A#, 16#2113#), -- SCRIPT SMALL G .. SCRIPT SMALL L
+    (16#2115#, 16#2115#), -- DOUBLE-STRUCK CAPITAL N .. DOUBLE-STRUCK CAPITAL N
+    (16#2119#, 16#211D#), -- DOUBLE-STRUCK CAPITAL P .. DOUBLE-STRUCK CAPITAL R
+    (16#2124#, 16#2124#), -- DOUBLE-STRUCK CAPITAL Z .. DOUBLE-STRUCK CAPITAL Z
+    (16#2126#, 16#2126#), -- OHM SIGN .. OHM SIGN
+    (16#2128#, 16#2128#), -- BLACK-LETTER CAPITAL Z .. BLACK-LETTER CAPITAL Z
+    (16#212A#, 16#212D#), -- KELVIN SIGN .. BLACK-LETTER CAPITAL C
+    (16#212F#, 16#2131#), -- SCRIPT SMALL E .. SCRIPT CAPITAL F
+    (16#2133#, 16#2139#), -- SCRIPT CAPITAL M .. INFORMATION SOURCE
+    (16#213D#, 16#213F#), -- DOUBLE-STRUCK SMALL GAMMA .. DOUBLE-STRUCK CAPITAL PI
+    (16#2145#, 16#2149#), -- DOUBLE-STRUCK ITALIC CAPITAL D .. DOUBLE-STRUCK ITALIC SMALL J
+    (16#3005#, 16#3006#), -- IDEOGRAPHIC ITERATION MARK .. IDEOGRAPHIC CLOSING MARK
+    (16#3031#, 16#3035#), -- VERTICAL KANA REPEAT MARK .. VERTICAL KANA REPEAT MARK LOWER HALF
+    (16#303B#, 16#303C#), -- VERTICAL IDEOGRAPHIC ITERATION MARK .. MASU MARK
+    (16#3041#, 16#3096#), -- HIRAGANA LETTER SMALL A .. HIRAGANA LETTER SMALL KE
+    (16#309D#, 16#309F#), -- HIRAGANA ITERATION MARK .. HIRAGANA DIGRAPH YORI
+    (16#30A1#, 16#30FA#), -- KATAKANA LETTER SMALL A .. KATAKANA LETTER VO
+    (16#30FC#, 16#318E#), -- KATAKANA-HIRAGANA PROLONGED SOUND MARK .. HANGUL LETTER ARAEAE
+    (16#31A0#, 16#31FF#), -- BOPOMOFO LETTER BU .. KATAKANA LETTER SMALL RO
+    (16#3400#, 16#A48C#), -- <CJK Ideograph Extension A, First> .. YI SYLLABLE YYR
+    (16#AC00#, 16#D7A3#), -- <Hangul Syllable, First> .. <Hangul Syllable, Last>
+    (16#F900#, 16#FB1D#), -- CJK COMPATIBILITY IDEOGRAPH-F900 .. HEBREW LETTER YOD WITH HIRIQ
+    (16#FB1F#, 16#FB28#), -- HEBREW LIGATURE YIDDISH YOD YOD PATAH .. HEBREW LETTER WIDE TAV
+    (16#FB2A#, 16#FD3D#), -- HEBREW LETTER SHIN WITH SHIN DOT .. ARABIC LIGATURE ALEF WITH FATHATAN ISOLATED FORM
+    (16#FD50#, 16#FDFB#), -- ARABIC LIGATURE TEH WITH JEEM WITH MEEM INITIAL FORM .. ARABIC LIGATURE JALLAJALALOUHOU
+    (16#FE70#, 16#FEFC#), -- ARABIC FATHATAN ISOLATED FORM .. ARABIC LIGATURE LAM WITH ALEF FINAL FORM
+    (16#FF21#, 16#FF3A#), -- FULLWIDTH LATIN CAPITAL LETTER A .. FULLWIDTH LATIN CAPITAL LETTER Z
+    (16#FF41#, 16#FF5A#), -- FULLWIDTH LATIN SMALL LETTER A .. FULLWIDTH LATIN SMALL LETTER Z
+    (16#FF66#, 16#FFDC#), -- HALFWIDTH KATAKANA LETTER WO .. HALFWIDTH HANGUL LETTER I
+    (16#10300#, 16#1031E#), -- OLD ITALIC LETTER A .. OLD ITALIC LETTER UU
+    (16#10330#, 16#10349#), -- GOTHIC LETTER AHSA .. GOTHIC LETTER OTHAL
+    (16#10400#, 16#1044D#), -- DESERET CAPITAL LETTER LONG I .. DESERET SMALL LETTER ENG
+    (16#1D400#, 16#1D6C0#), -- MATHEMATICAL BOLD CAPITAL A .. MATHEMATICAL BOLD CAPITAL OMEGA
+    (16#1D6C2#, 16#1D6DA#), -- MATHEMATICAL BOLD SMALL ALPHA .. MATHEMATICAL BOLD SMALL OMEGA
+    (16#1D6DC#, 16#1D6FA#), -- MATHEMATICAL BOLD EPSILON SYMBOL .. MATHEMATICAL ITALIC CAPITAL OMEGA
+    (16#1D6FC#, 16#1D714#), -- MATHEMATICAL ITALIC SMALL ALPHA .. MATHEMATICAL ITALIC SMALL OMEGA
+    (16#1D716#, 16#1D734#), -- MATHEMATICAL ITALIC EPSILON SYMBOL .. MATHEMATICAL BOLD ITALIC CAPITAL OMEGA
+    (16#1D736#, 16#1D74E#), -- MATHEMATICAL BOLD ITALIC SMALL ALPHA .. MATHEMATICAL BOLD ITALIC SMALL OMEGA
+    (16#1D750#, 16#1D76E#), -- MATHEMATICAL BOLD ITALIC EPSILON SYMBOL .. MATHEMATICAL SANS-SERIF BOLD CAPITAL OMEGA
+    (16#1D770#, 16#1D788#), -- MATHEMATICAL SANS-SERIF BOLD SMALL ALPHA .. MATHEMATICAL SANS-SERIF BOLD SMALL OMEGA
+    (16#1D78A#, 16#1D7A8#), -- MATHEMATICAL SANS-SERIF BOLD EPSILON SYMBOL .. MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL OMEGA
+    (16#1D7AA#, 16#1D7C2#), -- MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL ALPHA .. MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL OMEGA
+    (16#1D7C4#, 16#1D7C9#), -- MATHEMATICAL SANS-SERIF BOLD ITALIC EPSILON SYMBOL .. MATHEMATICAL SANS-SERIF BOLD ITALIC PI SYMBOL
+    (16#20000#, 16#2FA1D#) -- <CJK Ideograph Extension B, First> .. CJK COMPATIBILITY IDEOGRAPH-2FA1D
+   );
+
+---
+
+Spaces : constant Ranges :=
+   (
+    (16#20#, 16#20#), -- SPACE .. SPACE
+    (16#A0#, 16#A0#), -- NO-BREAK SPACE .. NO-BREAK SPACE
+    (16#1680#, 16#1680#), -- OGHAM SPACE MARK .. OGHAM SPACE MARK
+    (16#2000#, 16#200B#), -- EN QUAD .. ZERO WIDTH SPACE
+    (16#202F#, 16#202F#), -- NARROW NO-BREAK SPACE .. NARROW NO-BREAK SPACE
+    (16#205F#, 16#205F#), -- MEDIUM MATHEMATICAL SPACE .. MEDIUM MATHEMATICAL SPACE
+    (16#3000#, 16#3000#) -- IDEOGRAPHIC SPACE .. IDEOGRAPHIC SPACE
+   );
+
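+A table of this shape can be queried by binary search, since the entries
+are sorted by code point and do not overlap. The following is only a
+sketch: Code_Point, Code_Range and Is_In are hypothetical names, and the
+element type declared here stands in for the real Ranges declaration,
+which lies outside this excerpt and may differ.
+
```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Space_Lookup is

   --  Assumed declarations; the real types are not shown in this excerpt.
   type Code_Point is range 0 .. 16#10FFFF#;

   type Code_Range is record
      First, Last : Code_Point;
   end record;

   type Ranges is array (Positive range <>) of Code_Range;

   --  Copied from the Spaces table above.
   Spaces : constant Ranges :=
     ((16#20#,   16#20#),
      (16#A0#,   16#A0#),
      (16#1680#, 16#1680#),
      (16#2000#, 16#200B#),
      (16#202F#, 16#202F#),
      (16#205F#, 16#205F#),
      (16#3000#, 16#3000#));

   --  Binary search; relies on Table being sorted and non-overlapping.
   function Is_In (Table : Ranges; C : Code_Point) return Boolean is
      Lo  : Integer := Table'First;
      Hi  : Integer := Table'Last;
      Mid : Integer;
   begin
      while Lo <= Hi loop
         Mid := Lo + (Hi - Lo) / 2;
         if C < Table (Mid).First then
            Hi := Mid - 1;
         elsif C > Table (Mid).Last then
            Lo := Mid + 1;
         else
            return True;
         end if;
      end loop;
      return False;
   end Is_In;

begin
   Put_Line (Boolean'Image (Is_In (Spaces, 16#2003#)));  --  EM SPACE: TRUE
   Put_Line (Boolean'Image (Is_In (Spaces, 16#41#)));    --  'A': FALSE
end Space_Lookup;
```
+
+For a seven-entry table a linear scan would do equally well; the binary
+search matters mainly for the larger tables.
+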
+---
+
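+Each entry of the mapping table below reads as (first, last, offset):
+adding the offset to a code point in first .. last gives its uppercase
+form, for example 16#61# - 32 = 16#41# and 16#B5# + 743 = 16#39C#. The
+following sketch shows one way such a table might be applied. The names
+Code_Point, Offset_Type, Mapping_Range and To_Upper are hypothetical, the
+record layout is an assumption (the real Mapping_Ranges declaration is
+outside this excerpt), and Sample holds only four entries of the table.
+
```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Upper_Demo is

   --  Assumed declarations; the real types are not shown in this excerpt.
   type Code_Point is range 0 .. 16#10FFFF#;
   type Offset_Type is range -16#10FFFF# .. 16#10FFFF#;

   type Mapping_Range is record
      First, Last : Code_Point;
      Offset      : Offset_Type;
   end record;

   type Mapping_Ranges is array (Positive range <>) of Mapping_Range;

   --  Four entries copied from Uppercase_Mapping.
   Sample : constant Mapping_Ranges :=
     ((16#61#, 16#7A#, -32),
      (16#B5#, 16#B5#, 743),
      (16#E0#, 16#F6#, -32),
      (16#FF#, 16#FF#, 121));

   --  Linear scan for brevity; code points outside every range map to
   --  themselves.
   function To_Upper (C : Code_Point) return Code_Point is
   begin
      for I in Sample'Range loop
         if C in Sample (I).First .. Sample (I).Last then
            return Code_Point (Offset_Type (C) + Sample (I).Offset);
         end if;
      end loop;
      return C;
   end To_Upper;

begin
   Put_Line (Code_Point'Image (To_Upper (16#61#)));  --  65  (16#41#)
   Put_Line (Code_Point'Image (To_Upper (16#B5#)));  --  924 (16#39C#)
   Put_Line (Code_Point'Image (To_Upper (16#FF#)));  --  376 (16#178#)
   Put_Line (Code_Point'Image (To_Upper (16#41#)));  --  65  (unchanged)
end Upper_Demo;
```
+
+Storing an offset instead of the target code point lets a run of
+consecutive letters share one entry, as in 16#61# .. 16#7A#.
+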
+Uppercase_Mapping : constant Mapping_Ranges :=
+   (
+    (16#61#, 16#7A#, -32), -- LATIN SMALL LETTER A .. LATIN SMALL LETTER Z
+    (16#B5#, 16#B5#, 743), -- MICRO SIGN .. MICRO SIGN
+    (16#E0#, 16#F6#, -32), -- LATIN SMALL LETTER A WITH GRAVE .. LATIN SMALL LETTER O WITH DIAERESIS
+    (16#F8#, 16#FE#, -32), -- LATIN SMALL LETTER O WITH STROKE .. LATIN SMALL LETTER THORN
+    (16#FF#, 16#FF#, 121), -- LATIN SMALL LETTER Y WITH DIAERESIS .. LATIN SMALL LETTER Y WITH DIAERESIS
+    (16#101#, 16#101#, -1), -- LATIN SMALL LETTER A WITH MACRON .. LATIN SMALL LETTER A WITH MACRON
+    (16#103#, 16#103#, -1), -- LATIN SMALL LETTER A WITH BREVE .. LATIN SMALL LETTER A WITH BREVE
+    (16#105#, 16#105#, -1), -- LATIN SMALL LETTER A WITH OGONEK .. LATIN SMALL LETTER A WITH OGONEK
+    (16#107#, 16#107#, -1), -- LATIN SMALL LETTER C WITH ACUTE .. LATIN SMALL LETTER C WITH ACUTE
+    (16#109#, 16#109#, -1), -- LATIN SMALL LETTER C WITH CIRCUMFLEX .. LATIN SMALL LETTER C WITH CIRCUMFLEX
+    (16#10B#, 16#10B#, -1), -- LATIN SMALL LETTER C WITH DOT ABOVE .. LATIN SMALL LETTER C WITH DOT ABOVE
+    (16#10D#, 16#10D#, -1), -- LATIN SMALL LETTER C WITH CARON .. LATIN SMALL LETTER C WITH CARON
+    (16#10F#, 16#10F#, -1), -- LATIN SMALL LETTER D WITH CARON .. LATIN SMALL LETTER D WITH CARON
+    (16#111#, 16#111#, -1), -- LATIN SMALL LETTER D WITH STROKE .. LATIN SMALL LETTER D WITH STROKE
+    (16#113#, 16#113#, -1), -- LATIN SMALL LETTER E WITH MACRON .. LATIN SMALL LETTER E WITH MACRON
+    (16#115#, 16#115#, -1), -- LATIN SMALL LETTER E WITH BREVE .. LATIN SMALL LETTER E WITH BREVE
+    (16#117#, 16#117#, -1), -- LATIN SMALL LETTER E WITH DOT ABOVE .. LATIN SMALL LETTER E WITH DOT ABOVE
+    (16#119#, 16#119#, -1), -- LATIN SMALL LETTER E WITH OGONEK .. LATIN SMALL LETTER E WITH OGONEK
+    (16#11B#, 16#11B#, -1), -- LATIN SMALL LETTER E WITH CARON .. LATIN SMALL LETTER E WITH CARON
+    (16#11D#, 16#11D#, -1), -- LATIN SMALL LETTER G WITH CIRCUMFLEX .. LATIN SMALL LETTER G WITH CIRCUMFLEX
+    (16#11F#, 16#11F#, -1), -- LATIN SMALL LETTER G WITH BREVE .. LATIN SMALL LETTER G WITH BREVE
+    (16#121#, 16#121#, -1), -- LATIN SMALL LETTER G WITH DOT ABOVE .. LATIN SMALL LETTER G WITH DOT ABOVE
+    (16#123#, 16#123#, -1), -- LATIN SMALL LETTER G WITH CEDILLA .. LATIN SMALL LETTER G WITH CEDILLA
+    (16#125#, 16#125#, -1), -- LATIN SMALL LETTER H WITH CIRCUMFLEX .. LATIN SMALL LETTER H WITH CIRCUMFLEX
+    (16#127#, 16#127#, -1), -- LATIN SMALL LETTER H WITH STROKE .. LATIN SMALL LETTER H WITH STROKE
+    (16#129#, 16#129#, -1), -- LATIN SMALL LETTER I WITH TILDE .. LATIN SMALL LETTER I WITH TILDE
+    (16#12B#, 16#12B#, -1), -- LATIN SMALL LETTER I WITH MACRON .. LATIN SMALL LETTER I WITH MACRON
+    (16#12D#, 16#12D#, -1), -- LATIN SMALL LETTER I WITH BREVE .. LATIN SMALL LETTER I WITH BREVE
+    (16#12F#, 16#12F#, -1), -- LATIN SMALL LETTER I WITH OGONEK .. LATIN SMALL LETTER I WITH OGONEK
+    (16#131#, 16#131#, -232), -- LATIN SMALL LETTER DOTLESS I .. LATIN SMALL LETTER DOTLESS I
+    (16#133#, 16#133#, -1), -- LATIN SMALL LIGATURE IJ .. LATIN SMALL LIGATURE IJ
+    (16#135#, 16#135#, -1), -- LATIN SMALL LETTER J WITH CIRCUMFLEX .. LATIN SMALL LETTER J WITH CIRCUMFLEX
+    (16#137#, 16#137#, -1), -- LATIN SMALL LETTER K WITH CEDILLA .. LATIN SMALL LETTER K WITH CEDILLA
+    (16#13A#, 16#13A#, -1), -- LATIN SMALL LETTER L WITH ACUTE .. LATIN SMALL LETTER L WITH ACUTE
+    (16#13C#, 16#13C#, -1), -- LATIN SMALL LETTER L WITH CEDILLA .. LATIN SMALL LETTER L WITH CEDILLA
+    (16#13E#, 16#13E#, -1), -- LATIN SMALL LETTER L WITH CARON .. LATIN SMALL LETTER L WITH CARON
+    (16#140#, 16#140#, -1), -- LATIN SMALL LETTER L WITH MIDDLE DOT .. LATIN SMALL LETTER L WITH MIDDLE DOT
+    (16#142#, 16#142#, -1), -- LATIN SMALL LETTER L WITH STROKE .. LATIN SMALL LETTER L WITH STROKE
+    (16#144#, 16#144#, -1), -- LATIN SMALL LETTER N WITH ACUTE .. LATIN SMALL LETTER N WITH ACUTE
+    (16#146#, 16#146#, -1), -- LATIN SMALL LETTER N WITH CEDILLA .. LATIN SMALL LETTER N WITH CEDILLA
+    (16#148#, 16#148#, -1), -- LATIN SMALL LETTER N WITH CARON .. LATIN SMALL LETTER N WITH CARON
+    (16#14B#, 16#14B#, -1), -- LATIN SMALL LETTER ENG .. LATIN SMALL LETTER ENG
+    (16#14D#, 16#14D#, -1), -- LATIN SMALL LETTER O WITH MACRON .. LATIN SMALL LETTER O WITH MACRON
+    (16#14F#, 16#14F#, -1), -- LATIN SMALL LETTER O WITH BREVE .. LATIN SMALL LETTER O WITH BREVE
+    (16#151#, 16#151#, -1), -- LATIN SMALL LETTER O WITH DOUBLE ACUTE .. LATIN SMALL LETTER O WITH DOUBLE ACUTE
+    (16#153#, 16#153#, -1), -- LATIN SMALL LIGATURE OE .. LATIN SMALL LIGATURE OE
+    (16#155#, 16#155#, -1), -- LATIN SMALL LETTER R WITH ACUTE .. LATIN SMALL LETTER R WITH ACUTE
+    (16#157#, 16#157#, -1), -- LATIN SMALL LETTER R WITH CEDILLA .. LATIN SMALL LETTER R WITH CEDILLA
+    (16#159#, 16#159#, -1), -- LATIN SMALL LETTER R WITH CARON .. LATIN SMALL LETTER R WITH CARON
+    (16#15B#, 16#15B#, -1), -- LATIN SMALL LETTER S WITH ACUTE .. LATIN SMALL LETTER S WITH ACUTE
+    (16#15D#, 16#15D#, -1), -- LATIN SMALL LETTER S WITH CIRCUMFLEX .. LATIN SMALL LETTER S WITH CIRCUMFLEX
+    (16#15F#, 16#15F#, -1), -- LATIN SMALL LETTER S WITH CEDILLA .. LATIN SMALL LETTER S WITH CEDILLA
+    (16#161#, 16#161#, -1), -- LATIN SMALL LETTER S WITH CARON .. LATIN SMALL LETTER S WITH CARON
+    (16#163#, 16#163#, -1), -- LATIN SMALL LETTER T WITH CEDILLA .. LATIN SMALL LETTER T WITH CEDILLA
+    (16#165#, 16#165#, -1), -- LATIN SMALL LETTER T WITH CARON .. LATIN SMALL LETTER T WITH CARON
+    (16#167#, 16#167#, -1), -- LATIN SMALL LETTER T WITH STROKE .. LATIN SMALL LETTER T WITH STROKE
+    (16#169#, 16#169#, -1), -- LATIN SMALL LETTER U WITH TILDE .. LATIN SMALL LETTER U WITH TILDE
+    (16#16B#, 16#16B#, -1), -- LATIN SMALL LETTER U WITH MACRON .. LATIN SMALL LETTER U WITH MACRON
+    (16#16D#, 16#16D#, -1), -- LATIN SMALL LETTER U WITH BREVE .. LATIN SMALL LETTER U WITH BREVE
+    (16#16F#, 16#16F#, -1), -- LATIN SMALL LETTER U WITH RING ABOVE .. LATIN SMALL LETTER U WITH RING ABOVE
+    (16#171#, 16#171#, -1), -- LATIN SMALL LETTER U WITH DOUBLE ACUTE .. LATIN SMALL LETTER U WITH DOUBLE ACUTE
+    (16#173#, 16#173#, -1), -- LATIN SMALL LETTER U WITH OGONEK .. LATIN SMALL LETTER U WITH OGONEK
+    (16#175#, 16#175#, -1), -- LATIN SMALL LETTER W WITH CIRCUMFLEX .. LATIN SMALL LETTER W WITH CIRCUMFLEX
+    (16#177#, 16#177#, -1), -- LATIN SMALL LETTER Y WITH CIRCUMFLEX .. LATIN SMALL LETTER Y WITH CIRCUMFLEX
+    (16#17A#, 16#17A#, -1), -- LATIN SMALL LETTER Z WITH ACUTE .. LATIN SMALL LETTER Z WITH ACUTE
+    (16#17C#, 16#17C#, -1), -- LATIN SMALL LETTER Z WITH DOT ABOVE .. LATIN SMALL LETTER Z WITH DOT ABOVE
+    (16#17E#, 16#17E#, -1), -- LATIN SMALL LETTER Z WITH CARON .. LATIN SMALL LETTER Z WITH CARON
+    (16#17F#, 16#17F#, -300), -- LATIN SMALL LETTER LONG S .. LATIN SMALL LETTER LONG S
+    (16#183#, 16#183#, -1), -- LATIN SMALL LETTER B WITH TOPBAR .. LATIN SMALL LETTER B WITH TOPBAR
+    (16#185#, 16#185#, -1), -- LATIN SMALL LETTER TONE SIX .. LATIN SMALL LETTER TONE SIX
+    (16#188#, 16#188#, -1), -- LATIN SMALL LETTER C WITH HOOK .. LATIN SMALL LETTER C WITH HOOK
+    (16#18C#, 16#18C#, -1), -- LATIN SMALL LETTER D WITH TOPBAR .. LATIN SMALL LETTER D WITH TOPBAR
+    (16#192#, 16#192#, -1), -- LATIN SMALL LETTER F WITH HOOK .. LATIN SMALL LETTER F WITH HOOK
+    (16#195#, 16#195#, 97), -- LATIN SMALL LETTER HV .. LATIN SMALL LETTER HV
+    (16#199#, 16#199#, -1), -- LATIN SMALL LETTER K WITH HOOK .. LATIN SMALL LETTER K WITH HOOK
+    (16#19E#, 16#19E#, 130), -- LATIN SMALL LETTER N WITH LONG RIGHT LEG .. LATIN SMALL LETTER N WITH LONG RIGHT LEG
+    (16#1A1#, 16#1A1#, -1), -- LATIN SMALL LETTER O WITH HORN .. LATIN SMALL LETTER O WITH HORN
+    (16#1A3#, 16#1A3#, -1), -- LATIN SMALL LETTER OI .. LATIN SMALL LETTER OI
+    (16#1A5#, 16#1A5#, -1), -- LATIN SMALL LETTER P WITH HOOK .. LATIN SMALL LETTER P WITH HOOK
+    (16#1A8#, 16#1A8#, -1), -- LATIN SMALL LETTER TONE TWO .. LATIN SMALL LETTER TONE TWO
+    (16#1AD#, 16#1AD#, -1), -- LATIN SMALL LETTER T WITH HOOK .. LATIN SMALL LETTER T WITH HOOK
+    (16#1B0#, 16#1B0#, -1), -- LATIN SMALL LETTER U WITH HORN .. LATIN SMALL LETTER U WITH HORN
+    (16#1B4#, 16#1B4#, -1), -- LATIN SMALL LETTER Y WITH HOOK .. LATIN SMALL LETTER Y WITH HOOK
+    (16#1B6#, 16#1B6#, -1), -- LATIN SMALL LETTER Z WITH STROKE .. LATIN SMALL LETTER Z WITH STROKE
+    (16#1B9#, 16#1B9#, -1), -- LATIN SMALL LETTER EZH REVERSED .. LATIN SMALL LETTER EZH REVERSED
+    (16#1BD#, 16#1BD#, -1), -- LATIN SMALL LETTER TONE FIVE .. LATIN SMALL LETTER TONE FIVE
+    (16#1BF#, 16#1BF#, 56), -- LATIN LETTER WYNN .. LATIN LETTER WYNN
+    (16#1C5#, 16#1C5#, -1), -- LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON .. LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON
+    (16#1C6#, 16#1C6#, -2), -- LATIN SMALL LETTER DZ WITH CARON .. LATIN SMALL LETTER DZ WITH CARON
+    (16#1C8#, 16#1C8#, -1), -- LATIN CAPITAL LETTER L WITH SMALL LETTER J .. LATIN CAPITAL LETTER L WITH SMALL LETTER J
+    (16#1C9#, 16#1C9#, -2), -- LATIN SMALL LETTER LJ .. LATIN SMALL LETTER LJ
+    (16#1CB#, 16#1CB#, -1), -- LATIN CAPITAL LETTER N WITH SMALL LETTER J .. LATIN CAPITAL LETTER N WITH SMALL LETTER J
+    (16#1CC#, 16#1CC#, -2), -- LATIN SMALL LETTER NJ .. LATIN SMALL LETTER NJ
+    (16#1CE#, 16#1CE#, -1), -- LATIN SMALL LETTER A WITH CARON .. LATIN SMALL LETTER A WITH CARON
+    (16#1D0#, 16#1D0#, -1), -- LATIN SMALL LETTER I WITH CARON .. LATIN SMALL LETTER I WITH CARON
+    (16#1D2#, 16#1D2#, -1), -- LATIN SMALL LETTER O WITH CARON .. LATIN SMALL LETTER O WITH CARON
+    (16#1D4#, 16#1D4#, -1), -- LATIN SMALL LETTER U WITH CARON .. LATIN SMALL LETTER U WITH CARON
+    (16#1D6#, 16#1D6#, -1), -- LATIN SMALL LETTER U WITH DIAERESIS AND MACRON .. LATIN SMALL LETTER U WITH DIAERESIS AND MACRON
+    (16#1D8#, 16#1D8#, -1), -- LATIN SMALL LETTER U WITH DIAERESIS AND ACUTE .. LATIN SMALL LETTER U WITH DIAERESIS AND ACUTE
+    (16#1DA#, 16#1DA#, -1), -- LATIN SMALL LETTER U WITH DIAERESIS AND CARON .. LATIN SMALL LETTER U WITH DIAERESIS AND CARON
+    (16#1DC#, 16#1DC#, -1), -- LATIN SMALL LETTER U WITH DIAERESIS AND GRAVE .. LATIN SMALL LETTER U WITH DIAERESIS AND GRAVE
+    (16#1DD#, 16#1DD#, -79), -- LATIN SMALL LETTER TURNED E .. LATIN SMALL LETTER TURNED E
+    (16#1DF#, 16#1DF#, -1), -- LATIN SMALL LETTER A WITH DIAERESIS AND MACRON .. LATIN SMALL LETTER A WITH DIAERESIS AND MACRON
+    (16#1E1#, 16#1E1#, -1), -- LATIN SMALL LETTER A WITH DOT ABOVE AND MACRON .. LATIN SMALL LETTER A WITH DOT ABOVE AND MACRON
+    (16#1E3#, 16#1E3#, -1), -- LATIN SMALL LETTER AE WITH MACRON .. LATIN SMALL LETTER AE WITH MACRON
+    (16#1E5#, 16#1E5#, -1), -- LATIN SMALL LETTER G WITH STROKE .. LATIN SMALL LETTER G WITH STROKE
+    (16#1E7#, 16#1E7#, -1), -- LATIN SMALL LETTER G WITH CARON .. LATIN SMALL LETTER G WITH CARON
+    (16#1E9#, 16#1E9#, -1), -- LATIN SMALL LETTER K WITH CARON .. LATIN SMALL LETTER K WITH CARON
+    (16#1EB#, 16#1EB#, -1), -- LATIN SMALL LETTER O WITH OGONEK .. LATIN SMALL LETTER O WITH OGONEK
+    (16#1ED#, 16#1ED#, -1), -- LATIN SMALL LETTER O WITH OGONEK AND MACRON .. LATIN SMALL LETTER O WITH OGONEK AND MACRON
+    (16#1EF#, 16#1EF#, -1), -- LATIN SMALL LETTER EZH WITH CARON .. LATIN SMALL LETTER EZH WITH CARON
+    (16#1F2#, 16#1F2#, -1), -- LATIN CAPITAL LETTER D WITH SMALL LETTER Z .. LATIN CAPITAL LETTER D WITH SMALL LETTER Z
+    (16#1F3#, 16#1F3#, -2), -- LATIN SMALL LETTER DZ .. LATIN SMALL LETTER DZ
+    (16#1F5#, 16#1F5#, -1), -- LATIN SMALL LETTER G WITH ACUTE .. LATIN SMALL LETTER G WITH ACUTE
+    (16#1F9#, 16#1F9#, -1), -- LATIN SMALL LETTER N WITH GRAVE .. LATIN SMALL LETTER N WITH GRAVE
+    (16#1FB#, 16#1FB#, -1), -- LATIN SMALL LETTER A WITH RING ABOVE AND ACUTE .. LATIN SMALL LETTER A WITH RING ABOVE AND ACUTE
+    (16#1FD#, 16#1FD#, -1), -- LATIN SMALL LETTER AE WITH ACUTE .. LATIN SMALL LETTER AE WITH ACUTE
+    (16#1FF#, 16#1FF#, -1), -- LATIN SMALL LETTER O WITH STROKE AND ACUTE .. LATIN SMALL LETTER O WITH STROKE AND ACUTE
+    (16#201#, 16#201#, -1), -- LATIN SMALL LETTER A WITH DOUBLE GRAVE .. LATIN SMALL LETTER A WITH DOUBLE GRAVE
+    (16#203#, 16#203#, -1), -- LATIN SMALL LETTER A WITH INVERTED BREVE .. LATIN SMALL LETTER A WITH INVERTED BREVE
+    (16#205#, 16#205#, -1), -- LATIN SMALL LETTER E WITH DOUBLE GRAVE .. LATIN SMALL LETTER E WITH DOUBLE GRAVE
+    (16#207#, 16#207#, -1), -- LATIN SMALL LETTER E WITH INVERTED BREVE .. LATIN SMALL LETTER E WITH INVERTED BREVE
+    (16#209#, 16#209#, -1), -- LATIN SMALL LETTER I WITH DOUBLE GRAVE .. LATIN SMALL LETTER I WITH DOUBLE GRAVE
+    (16#20B#, 16#20B#, -1), -- LATIN SMALL LETTER I WITH INVERTED BREVE .. LATIN SMALL LETTER I WITH INVERTED BREVE
+    (16#20D#, 16#20D#, -1), -- LATIN SMALL LETTER O WITH DOUBLE GRAVE .. LATIN SMALL LETTER O WITH DOUBLE GRAVE
+    (16#20F#, 16#20F#, -1), -- LATIN SMALL LETTER O WITH INVERTED BREVE .. LATIN SMALL LETTER O WITH INVERTED BREVE
+    (16#211#, 16#211#, -1), -- LATIN SMALL LETTER R WITH DOUBLE GRAVE .. LATIN SMALL LETTER R WITH DOUBLE GRAVE
+    (16#213#, 16#213#, -1), -- LATIN SMALL LETTER R WITH INVERTED BREVE .. LATIN SMALL LETTER R WITH INVERTED BREVE
+    (16#215#, 16#215#, -1), -- LATIN SMALL LETTER U WITH DOUBLE GRAVE .. LATIN SMALL LETTER U WITH DOUBLE GRAVE
+    (16#217#, 16#217#, -1), -- LATIN SMALL LETTER U WITH INVERTED BREVE .. LATIN SMALL LETTER U WITH INVERTED BREVE
+    (16#219#, 16#219#, -1), -- LATIN SMALL LETTER S WITH COMMA BELOW .. LATIN SMALL LETTER S WITH COMMA BELOW
+    (16#21B#, 16#21B#, -1), -- LATIN SMALL LETTER T WITH COMMA BELOW .. LATIN SMALL LETTER T WITH COMMA BELOW
+    (16#21D#, 16#21D#, -1), -- LATIN SMALL LETTER YOGH .. LATIN SMALL LETTER YOGH
+    (16#21F#, 16#21F#, -1), -- LATIN SMALL LETTER H WITH CARON .. LATIN SMALL LETTER H WITH CARON
+    (16#223#, 16#223#, -1), -- LATIN SMALL LETTER OU .. LATIN SMALL LETTER OU
+    (16#225#, 16#225#, -1), -- LATIN SMALL LETTER Z WITH HOOK .. LATIN SMALL LETTER Z WITH HOOK
+    (16#227#, 16#227#, -1), -- LATIN SMALL LETTER A WITH DOT ABOVE .. LATIN SMALL LETTER A WITH DOT ABOVE
+    (16#229#, 16#229#, -1), -- LATIN SMALL LETTER E WITH CEDILLA .. LATIN SMALL LETTER E WITH CEDILLA
+    (16#22B#, 16#22B#, -1), -- LATIN SMALL LETTER O WITH DIAERESIS AND MACRON .. LATIN SMALL LETTER O WITH DIAERESIS AND MACRON
+    (16#22D#, 16#22D#, -1), -- LATIN SMALL LETTER O WITH TILDE AND MACRON .. LATIN SMALL LETTER O WITH TILDE AND MACRON
+    (16#22F#, 16#22F#, -1), -- LATIN SMALL LETTER O WITH DOT ABOVE .. LATIN SMALL LETTER O WITH DOT ABOVE
+    (16#231#, 16#231#, -1), -- LATIN SMALL LETTER O WITH DOT ABOVE AND MACRON .. LATIN SMALL LETTER O WITH DOT ABOVE AND MACRON
+    (16#233#, 16#233#, -1), -- LATIN SMALL LETTER Y WITH MACRON .. LATIN SMALL LETTER Y WITH MACRON
+    (16#253#, 16#253#, -210), -- LATIN SMALL LETTER B WITH HOOK .. LATIN SMALL LETTER B WITH HOOK
+    (16#254#, 16#254#, -206), -- LATIN SMALL LETTER OPEN O .. LATIN SMALL LETTER OPEN O
+    (16#256#, 16#257#, -205), -- LATIN SMALL LETTER D WITH TAIL .. LATIN SMALL LETTER D WITH HOOK
+    (16#259#, 16#259#, -202), -- LATIN SMALL LETTER SCHWA .. LATIN SMALL LETTER SCHWA
+    (16#25B#, 16#25B#, -203), -- LATIN SMALL LETTER OPEN E .. LATIN SMALL LETTER OPEN E
+    (16#260#, 16#260#, -205), -- LATIN SMALL LETTER G WITH HOOK .. LATIN SMALL LETTER G WITH HOOK
+    (16#263#, 16#263#, -207), -- LATIN SMALL LETTER GAMMA .. LATIN SMALL LETTER GAMMA
+    (16#268#, 16#268#, -209), -- LATIN SMALL LETTER I WITH STROKE .. LATIN SMALL LETTER I WITH STROKE
+    (16#269#, 16#269#, -211), -- LATIN SMALL LETTER IOTA .. LATIN SMALL LETTER IOTA
+    (16#26F#, 16#26F#, -211), -- LATIN SMALL LETTER TURNED M .. LATIN SMALL LETTER TURNED M
+    (16#272#, 16#272#, -213), -- LATIN SMALL LETTER N WITH LEFT HOOK .. LATIN SMALL LETTER N WITH LEFT HOOK
+    (16#275#, 16#275#, -214), -- LATIN SMALL LETTER BARRED O .. LATIN SMALL LETTER BARRED O
+    (16#280#, 16#280#, -218), -- LATIN LETTER SMALL CAPITAL R .. LATIN LETTER SMALL CAPITAL R
+    (16#283#, 16#283#, -218), -- LATIN SMALL LETTER ESH .. LATIN SMALL LETTER ESH
+    (16#288#, 16#288#, -218), -- LATIN SMALL LETTER T WITH RETROFLEX HOOK .. LATIN SMALL LETTER T WITH RETROFLEX HOOK
+    (16#28A#, 16#28B#, -217), -- LATIN SMALL LETTER UPSILON .. LATIN SMALL LETTER V WITH HOOK
+    (16#292#, 16#292#, -219), -- LATIN SMALL LETTER EZH .. LATIN SMALL LETTER EZH
+    (16#3AC#, 16#3AC#, -38), -- GREEK SMALL LETTER ALPHA WITH TONOS .. GREEK SMALL LETTER ALPHA WITH TONOS
+    (16#3AD#, 16#3AF#, -37), -- GREEK SMALL LETTER EPSILON WITH TONOS .. GREEK SMALL LETTER IOTA WITH TONOS
+    (16#3B1#, 16#3C1#, -32), -- GREEK SMALL LETTER ALPHA .. GREEK SMALL LETTER RHO
+    (16#3C2#, 16#3C2#, -31), -- GREEK SMALL LETTER FINAL SIGMA .. GREEK SMALL LETTER FINAL SIGMA
+    (16#3C3#, 16#3CB#, -32), -- GREEK SMALL LETTER SIGMA .. GREEK SMALL LETTER UPSILON WITH DIALYTIKA
+    (16#3CC#, 16#3CC#, -64), -- GREEK SMALL LETTER OMICRON WITH TONOS .. GREEK SMALL LETTER OMICRON WITH TONOS
+    (16#3CD#, 16#3CE#, -63), -- GREEK SMALL LETTER UPSILON WITH TONOS .. GREEK SMALL LETTER OMEGA WITH TONOS
+    (16#3D0#, 16#3D0#, -62), -- GREEK BETA SYMBOL .. GREEK BETA SYMBOL
+    (16#3D1#, 16#3D1#, -57), -- GREEK THETA SYMBOL .. GREEK THETA SYMBOL
+    (16#3D5#, 16#3D5#, -47), -- GREEK PHI SYMBOL .. GREEK PHI SYMBOL
+    (16#3D6#, 16#3D6#, -54), -- GREEK PI SYMBOL .. GREEK PI SYMBOL
+    (16#3D9#, 16#3D9#, -1), -- GREEK SMALL LETTER ARCHAIC KOPPA .. GREEK SMALL LETTER ARCHAIC KOPPA
+    (16#3DB#, 16#3DB#, -1), -- GREEK SMALL LETTER STIGMA .. GREEK SMALL LETTER STIGMA
+    (16#3DD#, 16#3DD#, -1), -- GREEK SMALL LETTER DIGAMMA .. GREEK SMALL LETTER DIGAMMA
+    (16#3DF#, 16#3DF#, -1), -- GREEK SMALL LETTER KOPPA .. GREEK SMALL LETTER KOPPA
+    (16#3E1#, 16#3E1#, -1), -- GREEK SMALL LETTER SAMPI .. GREEK SMALL LETTER SAMPI
+    (16#3E3#, 16#3E3#, -1), -- COPTIC SMALL LETTER SHEI .. COPTIC SMALL LETTER SHEI
+    (16#3E5#, 16#3E5#, -1), -- COPTIC SMALL LETTER FEI .. COPTIC SMALL LETTER FEI
+    (16#3E7#, 16#3E7#, -1), -- COPTIC SMALL LETTER KHEI .. COPTIC SMALL LETTER KHEI
+    (16#3E9#, 16#3E9#, -1), -- COPTIC SMALL LETTER HORI .. COPTIC SMALL LETTER HORI
+    (16#3EB#, 16#3EB#, -1), -- COPTIC SMALL LETTER GANGIA .. COPTIC SMALL LETTER GANGIA
+    (16#3ED#, 16#3ED#, -1), -- COPTIC SMALL LETTER SHIMA .. COPTIC SMALL LETTER SHIMA
+    (16#3EF#, 16#3EF#, -1), -- COPTIC SMALL LETTER DEI .. COPTIC SMALL LETTER DEI
+    (16#3F0#, 16#3F0#, -86), -- GREEK KAPPA SYMBOL .. GREEK KAPPA SYMBOL
+    (16#3F1#, 16#3F1#, -80), -- GREEK RHO SYMBOL .. GREEK RHO SYMBOL
+    (16#3F2#, 16#3F2#, -79), -- GREEK LUNATE SIGMA SYMBOL .. GREEK LUNATE SIGMA SYMBOL
+    (16#3F5#, 16#3F5#, -96), -- GREEK LUNATE EPSILON SYMBOL .. GREEK LUNATE EPSILON SYMBOL
+    (16#430#, 16#44F#, -32), -- CYRILLIC SMALL LETTER A .. CYRILLIC SMALL LETTER YA
+    (16#450#, 16#45F#, -80), -- CYRILLIC SMALL LETTER IE WITH GRAVE .. CYRILLIC SMALL LETTER DZHE
+    (16#461#, 16#461#, -1), -- CYRILLIC SMALL LETTER OMEGA .. CYRILLIC SMALL LETTER OMEGA
+    (16#463#, 16#463#, -1), -- CYRILLIC SMALL LETTER YAT .. CYRILLIC SMALL LETTER YAT
+    (16#465#, 16#465#, -1), -- CYRILLIC SMALL LETTER IOTIFIED E .. CYRILLIC SMALL LETTER IOTIFIED E
+    (16#467#, 16#467#, -1), -- CYRILLIC SMALL LETTER LITTLE YUS .. CYRILLIC SMALL LETTER LITTLE YUS
+    (16#469#, 16#469#, -1), -- CYRILLIC SMALL LETTER IOTIFIED LITTLE YUS .. CYRILLIC SMALL LETTER IOTIFIED LITTLE YUS
+    (16#46B#, 16#46B#, -1), -- CYRILLIC SMALL LETTER BIG YUS .. CYRILLIC SMALL LETTER BIG YUS
+    (16#46D#, 16#46D#, -1), -- CYRILLIC SMALL LETTER IOTIFIED BIG YUS .. CYRILLIC SMALL LETTER IOTIFIED BIG YUS
+    (16#46F#, 16#46F#, -1), -- CYRILLIC SMALL LETTER KSI .. CYRILLIC SMALL LETTER KSI
+    (16#471#, 16#471#, -1), -- CYRILLIC SMALL LETTER PSI .. CYRILLIC SMALL LETTER PSI
+    (16#473#, 16#473#, -1), -- CYRILLIC SMALL LETTER FITA .. CYRILLIC SMALL LETTER FITA
+    (16#475#, 16#475#, -1), -- CYRILLIC SMALL LETTER IZHITSA .. CYRILLIC SMALL LETTER IZHITSA
+    (16#477#, 16#477#, -1), -- CYRILLIC SMALL LETTER IZHITSA WITH DOUBLE GRAVE ACCENT .. CYRILLIC SMALL LETTER IZHITSA WITH DOUBLE GRAVE ACCENT
+    (16#479#, 16#479#, -1), -- CYRILLIC SMALL LETTER UK .. CYRILLIC SMALL LETTER UK
+    (16#47B#, 16#47B#, -1), -- CYRILLIC SMALL LETTER ROUND OMEGA .. CYRILLIC SMALL LETTER ROUND OMEGA
+    (16#47D#, 16#47D#, -1), -- CYRILLIC SMALL LETTER OMEGA WITH TITLO .. CYRILLIC SMALL LETTER OMEGA WITH TITLO
+    (16#47F#, 16#47F#, -1), -- CYRILLIC SMALL LETTER OT .. CYRILLIC SMALL LETTER OT
+    (16#481#, 16#481#, -1), -- CYRILLIC SMALL LETTER KOPPA .. CYRILLIC SMALL LETTER KOPPA
+    (16#48B#, 16#48B#, -1), -- CYRILLIC SMALL LETTER SHORT I WITH TAIL .. CYRILLIC SMALL LETTER SHORT I WITH TAIL
+    (16#48D#, 16#48D#, -1), -- CYRILLIC SMALL LETTER SEMISOFT SIGN .. CYRILLIC SMALL LETTER SEMISOFT SIGN
+    (16#48F#, 16#48F#, -1), -- CYRILLIC SMALL LETTER ER WITH TICK .. CYRILLIC SMALL LETTER ER WITH TICK
+    (16#491#, 16#491#, -1), -- CYRILLIC SMALL LETTER GHE WITH UPTURN .. CYRILLIC SMALL LETTER GHE WITH UPTURN
+    (16#493#, 16#493#, -1), -- CYRILLIC SMALL LETTER GHE WITH STROKE .. CYRILLIC SMALL LETTER GHE WITH STROKE
+    (16#495#, 16#495#, -1), -- CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK .. CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+    (16#497#, 16#497#, -1), -- CYRILLIC SMALL LETTER ZHE WITH DESCENDER .. CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+    (16#499#, 16#499#, -1), -- CYRILLIC SMALL LETTER ZE WITH DESCENDER .. CYRILLIC SMALL LETTER ZE WITH DESCENDER
+    (16#49B#, 16#49B#, -1), -- CYRILLIC SMALL LETTER KA WITH DESCENDER .. CYRILLIC SMALL LETTER KA WITH DESCENDER
+    (16#49D#, 16#49D#, -1), -- CYRILLIC SMALL LETTER KA WITH VERTICAL STROKE .. CYRILLIC SMALL LETTER KA WITH VERTICAL STROKE
+    (16#49F#, 16#49F#, -1), -- CYRILLIC SMALL LETTER KA WITH STROKE .. CYRILLIC SMALL LETTER KA WITH STROKE
+    (16#4A1#, 16#4A1#, -1), -- CYRILLIC SMALL LETTER BASHKIR KA .. CYRILLIC SMALL LETTER BASHKIR KA
+    (16#4A3#, 16#4A3#, -1), -- CYRILLIC SMALL LETTER EN WITH DESCENDER .. CYRILLIC SMALL LETTER EN WITH DESCENDER
+    (16#4A5#, 16#4A5#, -1), -- CYRILLIC SMALL LIGATURE EN GHE .. CYRILLIC SMALL LIGATURE EN GHE
+    (16#4A7#, 16#4A7#, -1), -- CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK .. CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+    (16#4A9#, 16#4A9#, -1), -- CYRILLIC SMALL LETTER ABKHASIAN HA .. CYRILLIC SMALL LETTER ABKHASIAN HA
+    (16#4AB#, 16#4AB#, -1), -- CYRILLIC SMALL LETTER ES WITH DESCENDER .. CYRILLIC SMALL LETTER ES WITH DESCENDER
+    (16#4AD#, 16#4AD#, -1), -- CYRILLIC SMALL LETTER TE WITH DESCENDER .. CYRILLIC SMALL LETTER TE WITH DESCENDER
+    (16#4AF#, 16#4AF#, -1), -- CYRILLIC SMALL LETTER STRAIGHT U .. CYRILLIC SMALL LETTER STRAIGHT U
+    (16#4B1#, 16#4B1#, -1), -- CYRILLIC SMALL LETTER STRAIGHT U WITH STROKE .. CYRILLIC SMALL LETTER STRAIGHT U WITH STROKE
+    (16#4B3#, 16#4B3#, -1), -- CYRILLIC SMALL LETTER HA WITH DESCENDER .. CYRILLIC SMALL LETTER HA WITH DESCENDER
+    (16#4B5#, 16#4B5#, -1), -- CYRILLIC SMALL LIGATURE TE TSE .. CYRILLIC SMALL LIGATURE TE TSE
+    (16#4B7#, 16#4B7#, -1), -- CYRILLIC SMALL LETTER CHE WITH DESCENDER .. CYRILLIC SMALL LETTER CHE WITH DESCENDER
+    (16#4B9#, 16#4B9#, -1), -- CYRILLIC SMALL LETTER CHE WITH VERTICAL STROKE .. CYRILLIC SMALL LETTER CHE WITH VERTICAL STROKE
+    (16#4BB#, 16#4BB#, -1), -- CYRILLIC SMALL LETTER SHHA .. CYRILLIC SMALL LETTER SHHA
+    (16#4BD#, 16#4BD#, -1), -- CYRILLIC SMALL LETTER ABKHASIAN CHE .. CYRILLIC SMALL LETTER ABKHASIAN CHE
+    (16#4BF#, 16#4BF#, -1), -- CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER .. CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+    (16#4C2#, 16#4C2#, -1), -- CYRILLIC SMALL LETTER ZHE WITH BREVE .. CYRILLIC SMALL LETTER ZHE WITH BREVE
+    (16#4C4#, 16#4C4#, -1), -- CYRILLIC SMALL LETTER KA WITH HOOK .. CYRILLIC SMALL LETTER KA WITH HOOK
+    (16#4C6#, 16#4C6#, -1), -- CYRILLIC SMALL LETTER EL WITH TAIL .. CYRILLIC SMALL LETTER EL WITH TAIL
+    (16#4C8#, 16#4C8#, -1), -- CYRILLIC SMALL LETTER EN WITH HOOK .. CYRILLIC SMALL LETTER EN WITH HOOK
+    (16#4CA#, 16#4CA#, -1), -- CYRILLIC SMALL LETTER EN WITH TAIL .. CYRILLIC SMALL LETTER EN WITH TAIL
+    (16#4CC#, 16#4CC#, -1), -- CYRILLIC SMALL LETTER KHAKASSIAN CHE .. CYRILLIC SMALL LETTER KHAKASSIAN CHE
+    (16#4CE#, 16#4CE#, -1), -- CYRILLIC SMALL LETTER EM WITH TAIL .. CYRILLIC SMALL LETTER EM WITH TAIL
+    (16#4D1#, 16#4D1#, -1), -- CYRILLIC SMALL LETTER A WITH BREVE .. CYRILLIC SMALL LETTER A WITH BREVE
+    (16#4D3#, 16#4D3#, -1), -- CYRILLIC SMALL LETTER A WITH DIAERESIS .. CYRILLIC SMALL LETTER A WITH DIAERESIS
+    (16#4D5#, 16#4D5#, -1), -- CYRILLIC SMALL LIGATURE A IE .. CYRILLIC SMALL LIGATURE A IE
+    (16#4D7#, 16#4D7#, -1), -- CYRILLIC SMALL LETTER IE WITH BREVE .. CYRILLIC SMALL LETTER IE WITH BREVE
+    (16#4D9#, 16#4D9#, -1), -- CYRILLIC SMALL LETTER SCHWA .. CYRILLIC SMALL LETTER SCHWA
+    (16#4DB#, 16#4DB#, -1), -- CYRILLIC SMALL LETTER SCHWA WITH DIAERESIS .. CYRILLIC SMALL LETTER SCHWA WITH DIAERESIS
+    (16#4DD#, 16#4DD#, -1), -- CYRILLIC SMALL LETTER ZHE WITH DIAERESIS .. CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+    (16#4DF#, 16#4DF#, -1), -- CYRILLIC SMALL LETTER ZE WITH DIAERESIS .. CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+    (16#4E1#, 16#4E1#, -1), -- CYRILLIC SMALL LETTER ABKHASIAN DZE .. CYRILLIC SMALL LETTER ABKHASIAN DZE
+    (16#4E3#, 16#4E3#, -1), -- CYRILLIC SMALL LETTER I WITH MACRON .. CYRILLIC SMALL LETTER I WITH MACRON
+    (16#4E5#, 16#4E5#, -1), -- CYRILLIC SMALL LETTER I WITH DIAERESIS .. CYRILLIC SMALL LETTER I WITH DIAERESIS
+    (16#4E7#, 16#4E7#, -1), -- CYRILLIC SMALL LETTER O WITH DIAERESIS .. CYRILLIC SMALL LETTER O WITH DIAERESIS
+    (16#4E9#, 16#4E9#, -1), -- CYRILLIC SMALL LETTER BARRED O .. CYRILLIC SMALL LETTER BARRED O
+    (16#4EB#, 16#4EB#, -1), -- CYRILLIC SMALL LETTER BARRED O WITH DIAERESIS .. CYRILLIC SMALL LETTER BARRED O WITH DIAERESIS
+    (16#4ED#, 16#4ED#, -1), -- CYRILLIC SMALL LETTER E WITH DIAERESIS .. CYRILLIC SMALL LETTER E WITH DIAERESIS
+    (16#4EF#, 16#4EF#, -1), -- CYRILLIC SMALL LETTER U WITH MACRON .. CYRILLIC SMALL LETTER U WITH MACRON
+    (16#4F1#, 16#4F1#, -1), -- CYRILLIC SMALL LETTER U WITH DIAERESIS .. CYRILLIC SMALL LETTER U WITH DIAERESIS
+    (16#4F3#, 16#4F3#, -1), -- CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE .. CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+    (16#4F5#, 16#4F5#, -1), -- CYRILLIC SMALL LETTER CHE WITH DIAERESIS .. CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+    (16#4F9#, 16#4F9#, -1), -- CYRILLIC SMALL LETTER YERU WITH DIAERESIS .. CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+    (16#501#, 16#501#, -1), -- CYRILLIC SMALL LETTER KOMI DE .. CYRILLIC SMALL LETTER KOMI DE
+    (16#503#, 16#503#, -1), -- CYRILLIC SMALL LETTER KOMI DJE .. CYRILLIC SMALL LETTER KOMI DJE
+    (16#505#, 16#505#, -1), -- CYRILLIC SMALL LETTER KOMI ZJE .. CYRILLIC SMALL LETTER KOMI ZJE
+    (16#507#, 16#507#, -1), -- CYRILLIC SMALL LETTER KOMI DZJE .. CYRILLIC SMALL LETTER KOMI DZJE
+    (16#509#, 16#509#, -1), -- CYRILLIC SMALL LETTER KOMI LJE .. CYRILLIC SMALL LETTER KOMI LJE
+    (16#50B#, 16#50B#, -1), -- CYRILLIC SMALL LETTER KOMI NJE .. CYRILLIC SMALL LETTER KOMI NJE
+    (16#50D#, 16#50D#, -1), -- CYRILLIC SMALL LETTER KOMI SJE .. CYRILLIC SMALL LETTER KOMI SJE
+    (16#50F#, 16#50F#, -1), -- CYRILLIC SMALL LETTER KOMI TJE .. CYRILLIC SMALL LETTER KOMI TJE
+    (16#561#, 16#586#, -48), -- ARMENIAN SMALL LETTER AYB .. ARMENIAN SMALL LETTER FEH
+    (16#1E01#, 16#1E01#, -1), -- LATIN SMALL LETTER A WITH RING BELOW .. LATIN SMALL LETTER A WITH RING BELOW
+    (16#1E03#, 16#1E03#, -1), -- LATIN SMALL LETTER B WITH DOT ABOVE .. LATIN SMALL LETTER B WITH DOT ABOVE
+    (16#1E05#, 16#1E05#, -1), -- LATIN SMALL LETTER B WITH DOT BELOW .. LATIN SMALL LETTER B WITH DOT BELOW
+    (16#1E07#, 16#1E07#, -1), -- LATIN SMALL LETTER B WITH LINE BELOW .. LATIN SMALL LETTER B WITH LINE BELOW
+    (16#1E09#, 16#1E09#, -1), -- LATIN SMALL LETTER C WITH CEDILLA AND ACUTE .. LATIN SMALL LETTER C WITH CEDILLA AND ACUTE
+    (16#1E0B#, 16#1E0B#, -1), -- LATIN SMALL LETTER D WITH DOT ABOVE .. LATIN SMALL LETTER D WITH DOT ABOVE
+    (16#1E0D#, 16#1E0D#, -1), -- LATIN SMALL LETTER D WITH DOT BELOW .. LATIN SMALL LETTER D WITH DOT BELOW
+    (16#1E0F#, 16#1E0F#, -1), -- LATIN SMALL LETTER D WITH LINE BELOW .. LATIN SMALL LETTER D WITH LINE BELOW
+    (16#1E11#, 16#1E11#, -1), -- LATIN SMALL LETTER D WITH CEDILLA .. LATIN SMALL LETTER D WITH CEDILLA
+    (16#1E13#, 16#1E13#, -1), -- LATIN SMALL LETTER D WITH CIRCUMFLEX BELOW .. LATIN SMALL LETTER D WITH CIRCUMFLEX BELOW
+    (16#1E15#, 16#1E15#, -1), -- LATIN SMALL LETTER E WITH MACRON AND GRAVE .. LATIN SMALL LETTER E WITH MACRON AND GRAVE
+    (16#1E17#, 16#1E17#, -1), -- LATIN SMALL LETTER E WITH MACRON AND ACUTE .. LATIN SMALL LETTER E WITH MACRON AND ACUTE
+    (16#1E19#, 16#1E19#, -1), -- LATIN SMALL LETTER E WITH CIRCUMFLEX BELOW .. LATIN SMALL LETTER E WITH CIRCUMFLEX BELOW
+    (16#1E1B#, 16#1E1B#, -1), -- LATIN SMALL LETTER E WITH TILDE BELOW .. LATIN SMALL LETTER E WITH TILDE BELOW
+    (16#1E1D#, 16#1E1D#, -1), -- LATIN SMALL LETTER E WITH CEDILLA AND BREVE .. LATIN SMALL LETTER E WITH CEDILLA AND BREVE
+    (16#1E1F#, 16#1E1F#, -1), -- LATIN SMALL LETTER F WITH DOT ABOVE .. LATIN SMALL LETTER F WITH DOT ABOVE
+    (16#1E21#, 16#1E21#, -1), -- LATIN SMALL LETTER G WITH MACRON .. LATIN SMALL LETTER G WITH MACRON
+    (16#1E23#, 16#1E23#, -1), -- LATIN SMALL LETTER H WITH DOT ABOVE .. LATIN SMALL LETTER H WITH DOT ABOVE
+    (16#1E25#, 16#1E25#, -1), -- LATIN SMALL LETTER H WITH DOT BELOW .. LATIN SMALL LETTER H WITH DOT BELOW
+    (16#1E27#, 16#1E27#, -1), -- LATIN SMALL LETTER H WITH DIAERESIS .. LATIN SMALL LETTER H WITH DIAERESIS
+    (16#1E29#, 16#1E29#, -1), -- LATIN SMALL LETTER H WITH CEDILLA .. LATIN SMALL LETTER H WITH CEDILLA
+    (16#1E2B#, 16#1E2B#, -1), -- LATIN SMALL LETTER H WITH BREVE BELOW .. LATIN SMALL LETTER H WITH BREVE BELOW
+    (16#1E2D#, 16#1E2D#, -1), -- LATIN SMALL LETTER I WITH TILDE BELOW .. LATIN SMALL LETTER I WITH TILDE BELOW
+    (16#1E2F#, 16#1E2F#, -1), -- LATIN SMALL LETTER I WITH DIAERESIS AND ACUTE .. LATIN SMALL LETTER I WITH DIAERESIS AND ACUTE
+    (16#1E31#, 16#1E31#, -1), -- LATIN SMALL LETTER K WITH ACUTE .. LATIN SMALL LETTER K WITH ACUTE
+    (16#1E33#, 16#1E33#, -1), -- LATIN SMALL LETTER K WITH DOT BELOW .. LATIN SMALL LETTER K WITH DOT BELOW
+    (16#1E35#, 16#1E35#, -1), -- LATIN SMALL LETTER K WITH LINE BELOW .. LATIN SMALL LETTER K WITH LINE BELOW
+    (16#1E37#, 16#1E37#, -1), -- LATIN SMALL LETTER L WITH DOT BELOW .. LATIN SMALL LETTER L WITH DOT BELOW
+    (16#1E39#, 16#1E39#, -1), -- LATIN SMALL LETTER L WITH DOT BELOW AND MACRON .. LATIN SMALL LETTER L WITH DOT BELOW AND MACRON
+    (16#1E3B#, 16#1E3B#, -1), -- LATIN SMALL LETTER L WITH LINE BELOW .. LATIN SMALL LETTER L WITH LINE BELOW
+    (16#1E3D#, 16#1E3D#, -1), -- LATIN SMALL LETTER L WITH CIRCUMFLEX BELOW .. LATIN SMALL LETTER L WITH CIRCUMFLEX BELOW
+    (16#1E3F#, 16#1E3F#, -1), -- LATIN SMALL LETTER M WITH ACUTE .. LATIN SMALL LETTER M WITH ACUTE
+    (16#1E41#, 16#1E41#, -1), -- LATIN SMALL LETTER M WITH DOT ABOVE .. LATIN SMALL LETTER M WITH DOT ABOVE
+    (16#1E43#, 16#1E43#, -1), -- LATIN SMALL LETTER M WITH DOT BELOW .. LATIN SMALL LETTER M WITH DOT BELOW
+    (16#1E45#, 16#1E45#, -1), -- LATIN SMALL LETTER N WITH DOT ABOVE .. LATIN SMALL LETTER N WITH DOT ABOVE
+    (16#1E47#, 16#1E47#, -1), -- LATIN SMALL LETTER N WITH DOT BELOW .. LATIN SMALL LETTER N WITH DOT BELOW
+    (16#1E49#, 16#1E49#, -1), -- LATIN SMALL LETTER N WITH LINE BELOW .. LATIN SMALL LETTER N WITH LINE BELOW
+    (16#1E4B#, 16#1E4B#, -1), -- LATIN SMALL LETTER N WITH CIRCUMFLEX BELOW .. LATIN SMALL LETTER N WITH CIRCUMFLEX BELOW
+    (16#1E4D#, 16#1E4D#, -1), -- LATIN SMALL LETTER O WITH TILDE AND ACUTE .. LATIN SMALL LETTER O WITH TILDE AND ACUTE
+    (16#1E4F#, 16#1E4F#, -1), -- LATIN SMALL LETTER O WITH TILDE AND DIAERESIS .. LATIN SMALL LETTER O WITH TILDE AND DIAERESIS
+    (16#1E51#, 16#1E51#, -1), -- LATIN SMALL LETTER O WITH MACRON AND GRAVE .. LATIN SMALL LETTER O WITH MACRON AND GRAVE
+    (16#1E53#, 16#1E53#, -1), -- LATIN SMALL LETTER O WITH MACRON AND ACUTE .. LATIN SMALL LETTER O WITH MACRON AND ACUTE
+    (16#1E55#, 16#1E55#, -1), -- LATIN SMALL LETTER P WITH ACUTE .. LATIN SMALL LETTER P WITH ACUTE
+    (16#1E57#, 16#1E57#, -1), -- LATIN SMALL LETTER P WITH DOT ABOVE .. LATIN SMALL LETTER P WITH DOT ABOVE
+    (16#1E59#, 16#1E59#, -1), -- LATIN SMALL LETTER R WITH DOT ABOVE .. LATIN SMALL LETTER R WITH DOT ABOVE
+    (16#1E5B#, 16#1E5B#, -1), -- LATIN SMALL LETTER R WITH DOT BELOW .. LATIN SMALL LETTER R WITH DOT BELOW
+    (16#1E5D#, 16#1E5D#, -1), -- LATIN SMALL LETTER R WITH DOT BELOW AND MACRON .. LATIN SMALL LETTER R WITH DOT BELOW AND MACRON
+    (16#1E5F#, 16#1E5F#, -1), -- LATIN SMALL LETTER R WITH LINE BELOW .. LATIN SMALL LETTER R WITH LINE BELOW
+    (16#1E61#, 16#1E61#, -1), -- LATIN SMALL LETTER S WITH DOT ABOVE .. LATIN SMALL LETTER S WITH DOT ABOVE
+    (16#1E63#, 16#1E63#, -1), -- LATIN SMALL LETTER S WITH DOT BELOW .. LATIN SMALL LETTER S WITH DOT BELOW
+    (16#1E65#, 16#1E65#, -1), -- LATIN SMALL LETTER S WITH ACUTE AND DOT ABOVE .. LATIN SMALL LETTER S WITH ACUTE AND DOT ABOVE
+    (16#1E67#, 16#1E67#, -1), -- LATIN SMALL LETTER S WITH CARON AND DOT ABOVE .. LATIN SMALL LETTER S WITH CARON AND DOT ABOVE
+    (16#1E69#, 16#1E69#, -1), -- LATIN SMALL LETTER S WITH DOT BELOW AND DOT ABOVE .. LATIN SMALL LETTER S WITH DOT BELOW AND DOT ABOVE
+    (16#1E6B#, 16#1E6B#, -1), -- LATIN SMALL LETTER T WITH DOT ABOVE .. LATIN SMALL LETTER T WITH DOT ABOVE
+    (16#1E6D#, 16#1E6D#, -1), -- LATIN SMALL LETTER T WITH DOT BELOW .. LATIN SMALL LETTER T WITH DOT BELOW
+    (16#1E6F#, 16#1E6F#, -1), -- LATIN SMALL LETTER T WITH LINE BELOW .. LATIN SMALL LETTER T WITH LINE BELOW
+    (16#1E71#, 16#1E71#, -1), -- LATIN SMALL LETTER T WITH CIRCUMFLEX BELOW .. LATIN SMALL LETTER T WITH CIRCUMFLEX BELOW
+    (16#1E73#, 16#1E73#, -1), -- LATIN SMALL LETTER U WITH DIAERESIS BELOW .. LATIN SMALL LETTER U WITH DIAERESIS BELOW
+    (16#1E75#, 16#1E75#, -1), -- LATIN SMALL LETTER U WITH TILDE BELOW .. LATIN SMALL LETTER U WITH TILDE BELOW
+    (16#1E77#, 16#1E77#, -1), -- LATIN SMALL LETTER U WITH CIRCUMFLEX BELOW .. LATIN SMALL LETTER U WITH CIRCUMFLEX BELOW
+    (16#1E79#, 16#1E79#, -1), -- LATIN SMALL LETTER U WITH TILDE AND ACUTE .. LATIN SMALL LETTER U WITH TILDE AND ACUTE
+    (16#1E7B#, 16#1E7B#, -1), -- LATIN SMALL LETTER U WITH MACRON AND DIAERESIS .. LATIN SMALL LETTER U WITH MACRON AND DIAERESIS
+    (16#1E7D#, 16#1E7D#, -1), -- LATIN SMALL LETTER V WITH TILDE .. LATIN SMALL LETTER V WITH TILDE
+    (16#1E7F#, 16#1E7F#, -1), -- LATIN SMALL LETTER V WITH DOT BELOW .. LATIN SMALL LETTER V WITH DOT BELOW
+    (16#1E81#, 16#1E81#, -1), -- LATIN SMALL LETTER W WITH GRAVE .. LATIN SMALL LETTER W WITH GRAVE
+    (16#1E83#, 16#1E83#, -1), -- LATIN SMALL LETTER W WITH ACUTE .. LATIN SMALL LETTER W WITH ACUTE
+    (16#1E85#, 16#1E85#, -1), -- LATIN SMALL LETTER W WITH DIAERESIS .. LATIN SMALL LETTER W WITH DIAERESIS
+    (16#1E87#, 16#1E87#, -1), -- LATIN SMALL LETTER W WITH DOT ABOVE .. LATIN SMALL LETTER W WITH DOT ABOVE
+    (16#1E89#, 16#1E89#, -1), -- LATIN SMALL LETTER W WITH DOT BELOW .. LATIN SMALL LETTER W WITH DOT BELOW
+    (16#1E8B#, 16#1E8B#, -1), -- LATIN SMALL LETTER X WITH DOT ABOVE .. LATIN SMALL LETTER X WITH DOT ABOVE
+    (16#1E8D#, 16#1E8D#, -1), -- LATIN SMALL LETTER X WITH DIAERESIS .. LATIN SMALL LETTER X WITH DIAERESIS
+    (16#1E8F#, 16#1E8F#, -1), -- LATIN SMALL LETTER Y WITH DOT ABOVE .. LATIN SMALL LETTER Y WITH DOT ABOVE
+    (16#1E91#, 16#1E91#, -1), -- LATIN SMALL LETTER Z WITH CIRCUMFLEX .. LATIN SMALL LETTER Z WITH CIRCUMFLEX
+    (16#1E93#, 16#1E93#, -1), -- LATIN SMALL LETTER Z WITH DOT BELOW .. LATIN SMALL LETTER Z WITH DOT BELOW
+    (16#1E95#, 16#1E95#, -1), -- LATIN SMALL LETTER Z WITH LINE BELOW .. LATIN SMALL LETTER Z WITH LINE BELOW
+    (16#1E9B#, 16#1E9B#, -59), -- LATIN SMALL LETTER LONG S WITH DOT ABOVE .. LATIN SMALL LETTER LONG S WITH DOT ABOVE
+    (16#1EA1#, 16#1EA1#, -1), -- LATIN SMALL LETTER A WITH DOT BELOW .. LATIN SMALL LETTER A WITH DOT BELOW
+    (16#1EA3#, 16#1EA3#, -1), -- LATIN SMALL LETTER A WITH HOOK ABOVE .. LATIN SMALL LETTER A WITH HOOK ABOVE
+    (16#1EA5#, 16#1EA5#, -1), -- LATIN SMALL LETTER A WITH CIRCUMFLEX AND ACUTE .. LATIN SMALL LETTER A WITH CIRCUMFLEX AND ACUTE
+    (16#1EA7#, 16#1EA7#, -1), -- LATIN SMALL LETTER A WITH CIRCUMFLEX AND GRAVE .. LATIN SMALL LETTER A WITH CIRCUMFLEX AND GRAVE
+    (16#1EA9#, 16#1EA9#, -1), -- LATIN SMALL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE .. LATIN SMALL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE
+    (16#1EAB#, 16#1EAB#, -1), -- LATIN SMALL LETTER A WITH CIRCUMFLEX AND TILDE .. LATIN SMALL LETTER A WITH CIRCUMFLEX AND TILDE
+    (16#1EAD#, 16#1EAD#, -1), -- LATIN SMALL LETTER A WITH CIRCUMFLEX AND DOT BELOW .. LATIN SMALL LETTER A WITH CIRCUMFLEX AND DOT BELOW
+    (16#1EAF#, 16#1EAF#, -1), -- LATIN SMALL LETTER A WITH BREVE AND ACUTE .. LATIN SMALL LETTER A WITH BREVE AND ACUTE
+    (16#1EB1#, 16#1EB1#, -1), -- LATIN SMALL LETTER A WITH BREVE AND GRAVE .. LATIN SMALL LETTER A WITH BREVE AND GRAVE
+    (16#1EB3#, 16#1EB3#, -1), -- LATIN SMALL LETTER A WITH BREVE AND HOOK ABOVE .. LATIN SMALL LETTER A WITH BREVE AND HOOK ABOVE
+    (16#1EB5#, 16#1EB5#, -1), -- LATIN SMALL LETTER A WITH BREVE AND TILDE .. LATIN SMALL LETTER A WITH BREVE AND TILDE
+    (16#1EB7#, 16#1EB7#, -1), -- LATIN SMALL LETTER A WITH BREVE AND DOT BELOW .. LATIN SMALL LETTER A WITH BREVE AND DOT BELOW
+    (16#1EB9#, 16#1EB9#, -1), -- LATIN SMALL LETTER E WITH DOT BELOW .. LATIN SMALL LETTER E WITH DOT BELOW
+    (16#1EBB#, 16#1EBB#, -1), -- LATIN SMALL LETTER E WITH HOOK ABOVE .. LATIN SMALL LETTER E WITH HOOK ABOVE
+    (16#1EBD#, 16#1EBD#, -1), -- LATIN SMALL LETTER E WITH TILDE .. LATIN SMALL LETTER E WITH TILDE
+    (16#1EBF#, 16#1EBF#, -1), -- LATIN SMALL LETTER E WITH CIRCUMFLEX AND ACUTE .. LATIN SMALL LETTER E WITH CIRCUMFLEX AND ACUTE
+    (16#1EC1#, 16#1EC1#, -1), -- LATIN SMALL LETTER E WITH CIRCUMFLEX AND GRAVE .. LATIN SMALL LETTER E WITH CIRCUMFLEX AND GRAVE
+    (16#1EC3#, 16#1EC3#, -1), -- LATIN SMALL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE .. LATIN SMALL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE
+    (16#1EC5#, 16#1EC5#, -1), -- LATIN SMALL LETTER E WITH CIRCUMFLEX AND TILDE .. LATIN SMALL LETTER E WITH CIRCUMFLEX AND TILDE
+    (16#1EC7#, 16#1EC7#, -1), -- LATIN SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW .. LATIN SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW
+    (16#1EC9#, 16#1EC9#, -1), -- LATIN SMALL LETTER I WITH HOOK ABOVE .. LATIN SMALL LETTER I WITH HOOK ABOVE
+    (16#1ECB#, 16#1ECB#, -1), -- LATIN SMALL LETTER I WITH DOT BELOW .. LATIN SMALL LETTER I WITH DOT BELOW
+    (16#1ECD#, 16#1ECD#, -1), -- LATIN SMALL LETTER O WITH DOT BELOW .. LATIN SMALL LETTER O WITH DOT BELOW
+    (16#1ECF#, 16#1ECF#, -1), -- LATIN SMALL LETTER O WITH HOOK ABOVE .. LATIN SMALL LETTER O WITH HOOK ABOVE
+    (16#1ED1#, 16#1ED1#, -1), -- LATIN SMALL LETTER O WITH CIRCUMFLEX AND ACUTE .. LATIN SMALL LETTER O WITH CIRCUMFLEX AND ACUTE
+    (16#1ED3#, 16#1ED3#, -1), -- LATIN SMALL LETTER O WITH CIRCUMFLEX AND GRAVE .. LATIN SMALL LETTER O WITH CIRCUMFLEX AND GRAVE
+    (16#1ED5#, 16#1ED5#, -1), -- LATIN SMALL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE .. LATIN SMALL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE
+    (16#1ED7#, 16#1ED7#, -1), -- LATIN SMALL LETTER O WITH CIRCUMFLEX AND TILDE .. LATIN SMALL LETTER O WITH CIRCUMFLEX AND TILDE
+    (16#1ED9#, 16#1ED9#, -1), -- LATIN SMALL LETTER O WITH CIRCUMFLEX AND DOT BELOW .. LATIN SMALL LETTER O WITH CIRCUMFLEX AND DOT BELOW
+    (16#1EDB#, 16#1EDB#, -1), -- LATIN SMALL LETTER O WITH HORN AND ACUTE .. LATIN SMALL LETTER O WITH HORN AND ACUTE
+    (16#1EDD#, 16#1EDD#, -1), -- LATIN SMALL LETTER O WITH HORN AND GRAVE .. LATIN SMALL LETTER O WITH HORN AND GRAVE
+    (16#1EDF#, 16#1EDF#, -1), -- LATIN SMALL LETTER O WITH HORN AND HOOK ABOVE .. LATIN SMALL LETTER O WITH HORN AND HOOK ABOVE
+    (16#1EE1#, 16#1EE1#, -1), -- LATIN SMALL LETTER O WITH HORN AND TILDE .. LATIN SMALL LETTER O WITH HORN AND TILDE
+    (16#1EE3#, 16#1EE3#, -1), -- LATIN SMALL LETTER O WITH HORN AND DOT BELOW .. LATIN SMALL LETTER O WITH HORN AND DOT BELOW
+    (16#1EE5#, 16#1EE5#, -1), -- LATIN SMALL LETTER U WITH DOT BELOW .. LATIN SMALL LETTER U WITH DOT BELOW
+    (16#1EE7#, 16#1EE7#, -1), -- LATIN SMALL LETTER U WITH HOOK ABOVE .. LATIN SMALL LETTER U WITH HOOK ABOVE
+    (16#1EE9#, 16#1EE9#, -1), -- LATIN SMALL LETTER U WITH HORN AND ACUTE .. LATIN SMALL LETTER U WITH HORN AND ACUTE
+    (16#1EEB#, 16#1EEB#, -1), -- LATIN SMALL LETTER U WITH HORN AND GRAVE .. LATIN SMALL LETTER U WITH HORN AND GRAVE
+    (16#1EED#, 16#1EED#, -1), -- LATIN SMALL LETTER U WITH HORN AND HOOK ABOVE .. LATIN SMALL LETTER U WITH HORN AND HOOK ABOVE
+    (16#1EEF#, 16#1EEF#, -1), -- LATIN SMALL LETTER U WITH HORN AND TILDE .. LATIN SMALL LETTER U WITH HORN AND TILDE
+    (16#1EF1#, 16#1EF1#, -1), -- LATIN SMALL LETTER U WITH HORN AND DOT BELOW .. LATIN SMALL LETTER U WITH HORN AND DOT BELOW
+    (16#1EF3#, 16#1EF3#, -1), -- LATIN SMALL LETTER Y WITH GRAVE .. LATIN SMALL LETTER Y WITH GRAVE
+    (16#1EF5#, 16#1EF5#, -1), -- LATIN SMALL LETTER Y WITH DOT BELOW .. LATIN SMALL LETTER Y WITH DOT BELOW
+    (16#1EF7#, 16#1EF7#, -1), -- LATIN SMALL LETTER Y WITH HOOK ABOVE .. LATIN SMALL LETTER Y WITH HOOK ABOVE
+    (16#1EF9#, 16#1EF9#, -1), -- LATIN SMALL LETTER Y WITH TILDE .. LATIN SMALL LETTER Y WITH TILDE
+    (16#1F00#, 16#1F07#, 8), -- GREEK SMALL LETTER ALPHA WITH PSILI .. GREEK SMALL LETTER ALPHA WITH DASIA AND PERISPOMENI
+    (16#1F10#, 16#1F15#, 8), -- GREEK SMALL LETTER EPSILON WITH PSILI .. GREEK SMALL LETTER EPSILON WITH DASIA AND OXIA
+    (16#1F20#, 16#1F27#, 8), -- GREEK SMALL LETTER ETA WITH PSILI .. GREEK SMALL LETTER ETA WITH DASIA AND PERISPOMENI
+    (16#1F30#, 16#1F37#, 8), -- GREEK SMALL LETTER IOTA WITH PSILI .. GREEK SMALL LETTER IOTA WITH DASIA AND PERISPOMENI
+    (16#1F40#, 16#1F45#, 8), -- GREEK SMALL LETTER OMICRON WITH PSILI .. GREEK SMALL LETTER OMICRON WITH DASIA AND OXIA
+    (16#1F51#, 16#1F51#, 8), -- GREEK SMALL LETTER UPSILON WITH DASIA .. GREEK SMALL LETTER UPSILON WITH DASIA
+    (16#1F53#, 16#1F53#, 8), -- GREEK SMALL LETTER UPSILON WITH DASIA AND VARIA .. GREEK SMALL LETTER UPSILON WITH DASIA AND VARIA
+    (16#1F55#, 16#1F55#, 8), -- GREEK SMALL LETTER UPSILON WITH DASIA AND OXIA .. GREEK SMALL LETTER UPSILON WITH DASIA AND OXIA
+    (16#1F57#, 16#1F57#, 8), -- GREEK SMALL LETTER UPSILON WITH DASIA AND PERISPOMENI .. GREEK SMALL LETTER UPSILON WITH DASIA AND PERISPOMENI
+    (16#1F60#, 16#1F67#, 8), -- GREEK SMALL LETTER OMEGA WITH PSILI .. GREEK SMALL LETTER OMEGA WITH DASIA AND PERISPOMENI
+    (16#1F70#, 16#1F71#, 74), -- GREEK SMALL LETTER ALPHA WITH VARIA .. GREEK SMALL LETTER ALPHA WITH OXIA
+    (16#1F72#, 16#1F75#, 86), -- GREEK SMALL LETTER EPSILON WITH VARIA .. GREEK SMALL LETTER ETA WITH OXIA
+    (16#1F76#, 16#1F77#, 100), -- GREEK SMALL LETTER IOTA WITH VARIA .. GREEK SMALL LETTER IOTA WITH OXIA
+    (16#1F78#, 16#1F79#, 128), -- GREEK SMALL LETTER OMICRON WITH VARIA .. GREEK SMALL LETTER OMICRON WITH OXIA
+    (16#1F7A#, 16#1F7B#, 112), -- GREEK SMALL LETTER UPSILON WITH VARIA .. GREEK SMALL LETTER UPSILON WITH OXIA
+    (16#1F7C#, 16#1F7D#, 126), -- GREEK SMALL LETTER OMEGA WITH VARIA .. GREEK SMALL LETTER OMEGA WITH OXIA
+    (16#1F80#, 16#1F87#, 8), -- GREEK SMALL LETTER ALPHA WITH PSILI AND YPOGEGRAMMENI .. GREEK SMALL LETTER ALPHA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI
+    (16#1F90#, 16#1F97#, 8), -- GREEK SMALL LETTER ETA WITH PSILI AND YPOGEGRAMMENI .. GREEK SMALL LETTER ETA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI
+    (16#1FA0#, 16#1FA7#, 8), -- GREEK SMALL LETTER OMEGA WITH PSILI AND YPOGEGRAMMENI .. GREEK SMALL LETTER OMEGA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI
+    (16#1FB0#, 16#1FB1#, 8), -- GREEK SMALL LETTER ALPHA WITH VRACHY .. GREEK SMALL LETTER ALPHA WITH MACRON
+    (16#1FB3#, 16#1FB3#, 9), -- GREEK SMALL LETTER ALPHA WITH YPOGEGRAMMENI .. GREEK SMALL LETTER ALPHA WITH YPOGEGRAMMENI
+    (16#1FBE#, 16#1FBE#, -7205), -- GREEK PROSGEGRAMMENI .. GREEK PROSGEGRAMMENI
+    (16#1FC3#, 16#1FC3#, 9), -- GREEK SMALL LETTER ETA WITH YPOGEGRAMMENI .. GREEK SMALL LETTER ETA WITH YPOGEGRAMMENI
+    (16#1FD0#, 16#1FD1#, 8), -- GREEK SMALL LETTER IOTA WITH VRACHY .. GREEK SMALL LETTER IOTA WITH MACRON
+    (16#1FE0#, 16#1FE1#, 8), -- GREEK SMALL LETTER UPSILON WITH VRACHY .. GREEK SMALL LETTER UPSILON WITH MACRON
+    (16#1FE5#, 16#1FE5#, 7), -- GREEK SMALL LETTER RHO WITH DASIA .. GREEK SMALL LETTER RHO WITH DASIA
+    (16#1FF3#, 16#1FF3#, 9), -- GREEK SMALL LETTER OMEGA WITH YPOGEGRAMMENI .. GREEK SMALL LETTER OMEGA WITH YPOGEGRAMMENI
+    (16#FF41#, 16#FF5A#, -32), -- FULLWIDTH LATIN SMALL LETTER A .. FULLWIDTH LATIN SMALL LETTER Z
+    (16#10428#, 16#1044D#, -40) -- DESERET SMALL LETTER LONG I .. DESERET SMALL LETTER ENG
+   );
+
+*************************************************************
+
+From: Randy Brukardt
+Sent: Wednesday, November 27, 2002  11:01 AM
+
+Thanks for doing this. Where are you finding the information that you are
+using to do this? A quick search of the net didn't turn up anything
+machine-readable...
+
+*************************************************************
+
+From: Michael F. Yoder
+Sent: Wednesday, November 27, 2002  12:20 PM
+
+The root link is www.unicode.org and the "latest version" link goes to
+
+http://www.unicode.org/unicode/reports/tr28/
+
+The "this version" link at the top goes to a page with some relevant
+stuff. The machine-readable files for V3.2 are at
+http://www.unicode.org/Public/UNIDATA/. The current organization seems
+to be harder to navigate than it used to be; I'm unsure why.
+
+N.B. version 3.2 of Unicode claims to be "fully synchronized" with ISO
+10646, so it is strongly preferable to earlier versions.
+
+*************************************************************
 
