1 files changed, 0 insertions, 344 deletions
diff --git a/libjava/classpath/doc/unicode/ReadMe-2.1.1.txt b/libjava/classpath/doc/unicode/ReadMe-2.1.1.txt
deleted file mode 100644
index 506f155a762..00000000000
--- a/libjava/classpath/doc/unicode/ReadMe-2.1.1.txt
+++ /dev/null
@@ -1,344 +0,0 @@
-
-UNICODE 2.1 CHARACTER DATABASE
-
-Copyright (c) 1991-1998 Unicode, Inc.
-All Rights reserved.
-
-DISCLAIMER
-
-The Unicode Character Database "UNIDAT21.TXT" is provided as-is by
-Unicode, Inc. (The Unicode Consortium). No claims are made as to fitness for any
-particular purpose. No warranties of any kind are expressed or implied. The
-recipient agrees to determine applicability of information provided. If this
-file has been purchased on magnetic or optical media from Unicode, Inc.,
-the sole remedy for any claim will be exchange of defective media within
-90 days of receipt.
-
-This disclaimer is applicable for all other data files accompanying the
-Unicode Character Database, some of which have been compiled by the
-Unicode Consortium, and some of which have been supplied by other vendors.
-
-LIMITATIONS ON RIGHTS TO REDISTRIBUTE THIS DATA
-
-Recipient is granted the right to make copies in any form for internal
-distribution and to freely use the information supplied in the creation of
-products supporting the Unicode (TM) Standard. This file can be redistributed
-to third parties or other organizations (whether for profit or not) as long
-as this notice and the disclaimer notice are retained.
-
-EXPLANATORY INFORMATION
-
-The Unicode Character Database defines the default Unicode character
-properties, and internal mappings. Particular implementations may choose to
-override the properties and mappings that are not normative. If that is done,
-it is up to the implementer to establish a protocol to convey that
-information. For more information about character properties and mappings,
-see "The Unicode Standard, Worldwide Character Encoding, Version 2.0",
-published by Addison-Wesley. For information about other data files
-accompanying the Unicode Character Database, see the section of the
-Unicode Standard they were extracted from, or the explanatory readme
-files and/or header sections with those files.
-
-The Unicode Character Database has been updated to reflect Version 2.1
-of the Unicode Standard, with two additional characters added to those
-published in Version 2.0:
-
-   U+20AC EURO SIGN
-   U+FFFC OBJECT REPLACEMENT CHARACTER
-
-A number of corrections have also been made to case mappings or other
-errors in the database noted since the publication of Version 2.0. And
-a few normative bidirectional properties have been modified to reflect
-decisions of the Unicode Technical Committee.
-
-The Unicode Character Database is a plain ASCII text file consisting of lines
-containing fields terminated by semicolons. Each line represents the data for
-one encoded character in the Unicode Standard, Version 2.1. Every encoded
-character has a data entry, with the exception of certain special ranges, as
-detailed below.
-
-There are five special ranges of characters that are represented only by
-their start and end characters, since the properties in the file are uniform,
-except for code values (which are all sequential and assigned). The names of CJK
-ideograph characters and Hangul syllable characters are algorithmically
-derivable. (See the Unicode Standard for more information). Surrogate
-characters and private use characters have no names.
-
-The exact ranges represented by start and end characters are:
-
-   The CJK Ideographs Area (U+4E00 - U+9FFF)
-   The Hangul Syllables Area (U+AC00 - U+D7A3)
-   The Surrogates Area (U+D800 - U+DFFF)
-   The Private Use Area (U+E000 - U+F8FF)
-   CJK Compatibility Ideographs (U+F900 - U+FAFF)
-
-The following table describes the format and meaning of each field in a
-data entry in the Unicode Character Database. Fields which contain
-normative information are so indicated.
-
-Field	Explanation
------	-----------
-
-  0	Code value in 4-digit hexadecimal format.
-  	This field is normative.
-
-  1	Unicode 2.1 Character Name. These names match exactly the
-	names published in Chapter 7 of the Unicode Standard, Version
-	2.0, except for the two additional characters.
-  	This field is normative.
-
-  2	General Category. This is a useful breakdown into various "character
-	types" which can be used as a default categorization in implementations.
- 	Some of the values are normative, and some are informative.
- 	See below for a brief explanation.
-
-  3	Canonical Combining Classes. The classes used for the
-	Canonical Ordering Algorithm in the Unicode Standard. These
-	classes are also printed in Chapter 4 of the Unicode Standard.
-        This field is normative. See below for a brief explanation.
-
-  4	Bidirectional Category. See the list below for an explanation of the
-	abbreviations used in this field. These are the categories required
-	by the Bidirectional Behavior Algorithm in the Unicode Standard.
-	These categories are summarized in Chapter 4 of the Unicode Standard.
-	This field is normative.
-
-  5	Character Decomposition. In the Unicode Standard, not all of
-	the decompositions are full decompositions. Recursive
-	application of look-up for decompositions will, in all cases, lead to
-	a maximal decomposition. The decompositions match exactly the
-	decompositions published with the character names in Chapter 7
-	of the Unicode Standard. This field is normative.
-
-  6	Decimal digit value. This is a numeric field. If the character
-	has the decimal digit property, as specified in Chapter 4 of
-	the Unicode Standard, the value of that digit is represented
-	with an integer value in this field. This field is normative.
-
-  7	Digit value. This is a numeric field. If the character represents a
-	digit, not necessarily a decimal digit, the value is here. This
-	covers digits which do not form decimal radix forms, such as the
-	compatibility superscript digits. This field is informative.
-
-  8	Numeric value. This is a numeric field. If the character has the
-	numeric property, as specified in Chapter 4 of the Unicode
-	Standard, the value of that character is represented with an
-	integer or rational number in this field. This includes fractions as,
-	e.g., "1/5" for U+2155 VULGAR FRACTION ONE FIFTH.
-	Also included are numerical values for compatibility characters
-	such as circled numbers. This field is normative.
-
-  9	If the characters has been identified as a "mirrored" character in
-        bidirectional text, this field has the value "Y"; otherwise "N".
-	The list of mirrored characters is also printed in Chapter 4 of
-	the Unicode Standard. This field is normative.
-
- 10	Unicode 1.0 Name. This is the old name as published in Unicode 1.0.
-	This name is only provided when it is significantly different from
-	the Unicode 2.1 name for the character. This field is informative.
-
- 11	10646 Comment field. This field is informative.
-
- 12	Upper case equivalent mapping. If a character is part of an
-	alphabet with case distinctions, and has an upper case equivalent,
-	then the upper case equivalent is in this field. See the explanation
-	below on case distinctions. These mappings are always one-to-one,
-	not one-to-many or many-to-one. This field is informative.
-
- 13	Lower case equivalent mapping. Similar to 12. This field is informative.
-
- 14	Title case equivalent mapping. Similar to 12. This field is informative.
-
-GENERAL CATEGORY
-
-The values in this field are abbreviations for the following. Some of the
-values are normative, and some are informative. For more information, see
-the Unicode Standard. Note: the standard does not assign information to
-control characters (except for TAB in the Bidirectonal Algorithm).
-Implementations will generally also assign categories to certain control
-characters, notably CR and LF, according to platform conventions.
-
-
-Normative
-    Mn = Mark, Non-Spacing
-    Mc = Mark, Spacing Combining
-    Me = Mark, Enclosing
-
-    Nd = Number, Decimal Digit
-    Nl = Number, Letter
-    No = Number, Other
-
-    Zs = Separator, Space
-    Zl = Separator, Line
-    Zp = Separator, Paragraph
-
-    Cc = Other, Control
-    Cf = Other, Format
-    Cs = Other, Surrogate
-    Co = Other, Private Use
-    Cn = Other, Not Assigned
-
-Informative
-    Lu = Letter, Uppercase
-    Ll = Letter, Lowercase
-    Lt = Letter, Titlecase
-    Lm = Letter, Modifier
-    Lo = Letter, Other
-
-    Pc = Punctuation, Connector
-    Pd = Punctuation, Dash
-    Ps = Punctuation, Open
-    Pe = Punctuation, Close
-    Po = Punctuation, Other
-
-    Sm = Symbol, Math
-    Sc = Symbol, Currency
-    Sk = Symbol, Modifier
-    So = Symbol, Other
-
-BIDIRECTIONAL PROPERTIES
-
-Please refer to the Unicode Standard for an explanation of the algorithm for
-Bidirectional Behavior and an explanation of the sigificance of these categories.
-These values are normative.
-
-Strong types:
-	L	Left-Right; Most alphabetic, syllabic, and logographic
-			characters (e.g., CJK ideographs)
-	R	Right-Left; Arabic, Hebrew, and
-			punctuation specific to those scripts
-Weak types:
-	EN	European Number
-	ES	European Number Separator
-	ET	European Number Terminator
-	AN	Arabic Number
-	CS	Common Number Separator
-
-Separators:
-	B	Block Separator
-	S	Segment Separator
-
-Neutrals:
-	WS	Whitespace
-	ON	Other Neutrals ; All other characters: punctuation, symbols
-
-CHARACTER DECOMPOSITION TAGS
-
-The decomposition is a normative property of a character. The tags supplied
-with certain decompositions generally indicate formatting information.
-Where no such tag is given, the decomposition is designated as canonical.
-Conversely, the presence of a formatting tag also indicates
-that the decomposition is a compatibility decomposition and not a canonical
-decomposition. In the absence of other formatting information in a
-compatibility decomposition, the tag <compat> is used to distinguish it from
-canonical decompositions.
-
-In some instances a canonical decomposition or a compatibility decomposition
-may consist of a single character. For a canonical decomposition, this
-indicates that the character is a canonical equivalent of another single
-character. For a compatibility decomposition, this indicates that the
-character is a compatibility equivalent of another single character.
-
-The compatibility formatting tags used are:
-
-	<font>		A font variant (e.g. a blackletter form).
-	<noBreak>	A no-break version of a space or hyphen.
-	<initial>	An initial presentation form (Arabic).
-	<medial>	A medial presentation form (Arabic).
-	<final>		A final presentation form (Arabic).
-	<isolated>	An isolated presentation form (Arabic).
-	<circle>	An encircled form.
-	<super>		A superscript form.
-	<sub>		A subscript form.
-	<vertical>	A vertical layout presentation form.
-	<wide>		A wide (or zenkaku) compatibility character.
-	<narrow>	A narrow (or hankaku) compatibility character.
-	<small>		A small variant form (CNS compatibility).
-	<square>	A CJK squared font variant.
-	<fraction>	A vulgar fraction form.
-	<compat>	Otherwise unspecified compatibility character.
-
-CANONICAL COMBINING CLASSES
-
-  0: Spacing, enclosing, reordrant, and surrounding
-  1: Overlays and interior
-  6: Tibetan subjoined Letters
-  7: Nuktas
-  8: Hiragana/Katakana voiced marks
-  9: Viramas
- 10: Start of fixed position classes
-199: End of fixed position classes
-200: Below left attached
-202: Below attached
-204: Below right attached
-208: Left attached (reordrant around single base character)
-210: Right attached
-212: Above left attached
-214: Above attached
-216: Above right attached
-218: Below left
-220: Below
-222: Below right
-224: Left (reordrant around single base character)
-226: Right
-228: Above left
-230: Above
-232: Above right
-234: Double above
-
-Note: some of the combining classes in this list do not currently have
-members but are specified here for completeness.
-
-CASE MAPPINGS
-
-In addition to uppercase and lowercase, because of the inclusion of certain
-composite characters for compatibility, such as "01F1;LATIN CAPITAL LETTER
-DZ", there is a third case, called titlecase, which is used where the first
-character of a word is to be capitalized (e.g. UPPERCASE, Titlecase,
-lowercase). An example of such a character is "01F2;LATIN CAPITAL LETTER D
-WITH SMALL LETTER Z".
-
-The uppercase, titlecase and lowercase fields are only included for characters
-that have a single corresponding character of that type. Composite characters
-(such as "339D;SQUARE CM") that do not have a single corresponding character
-of that type can be cased by decomposition.
-
-The case mapping is an informative, default mapping. Certain languages, such
-as Turkish, German, French, or Greek may have small deviations from the
-default mappings listed in the Unicode Character Database.
-
-MODIFICATION HISTORY
-
-Modifications made in updating the Unicode Character Database for
-the Unicode Standard, Version 2.1 (from Version 2.0) are:
-* Added two characters (U+20AC and U+FFFC).
-* Amended bidi properties for U+0026, U+002E, U+0040, U+2007.
-* Corrected case mappings for U+018E, U+019F, U+01DD, U+0258, U+0275,
-	U+03C2, U+1E9B.
-* Changed combining order class for U+0F71.
-* Corrected canonical decompositions for U+0F73, U+1FBE.
-* Changed decomposition for U+FB1F from compatibility to canonical.
-* Added compatibility decompositions for U+FBE8, U+FBE9, U+FBF9..U+FBFB.
-* Corrected compatibility decompositions for U+2469, U+246A, U+3358.
-
-
-Some of the modifications made in updating the Unicode Character Database
-for the Unicode Standard, Version 2.0 are:
-* Fixed decompositions with TONOS to use correct NSM: 030D.
-* Removed old Hangul Syllables; mapping to new characters are
-	in a separate table.
-* Marked compability decompositions with additional tags.
-* Changed old tag names for clarity.
-* Revision of decompositions to use first-level decomposition, instead
-	of maximal decomposition.
-* Correction of all known errors in decompositions from earlier versions.
-* Added control code names (as old Unicode names).
-* Added Hangul Jamo decompositions.
-* Added Number category to match properties list in book.
-* Fixed categories of Koranic Arabic marks.
-* Fixed categories of precomposed characters to match decomposition where possible.
-* Added Hebrew cantillation marks and the Tibetan script.
-* Added place holders for ranges such as CJK Ideographic Area and the
-	Private Use Area.
-* Added categories Me, Sk, Pc, Nl, Cs, Cf, and rectified a number of mistakes in the
-	database.