UProperty

public interface UProperty

Selection constants for Unicode properties.

These constants are used in functions like UCharacter.hasBinaryProperty(int) to select one of the Unicode properties.

The properties APIs are intended to reflect Unicode properties as defined in the Unicode Character Database (UCD) and Unicode Technical Reports (UTR).

For details about the properties see http://www.unicode.org.

For names of Unicode properties see the UCD file PropertyAliases.txt.

Important: If ICU is built with UCD files from Unicode versions below 3.2, then properties marked with "new" are either unavailable or only partially available. Check UCharacter.getUnicodeVersion() to be sure.
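
As a rough illustration of how an integer selector picks a property, the sketch below dispatches on a few of the constant values listed on this page. It is not ICU code: SelectorSketch and its hasBinaryProperty method are hypothetical stand-ins backed by java.lang.Character, whose Unicode data may come from a different Unicode version than ICU's.

```java
/** Hypothetical stand-in for a selector-driven lookup; not part of ICU. */
final class SelectorSketch {
    // Values copied from the constant listings on this page.
    static final int ALPHABETIC = 0;
    static final int IDEOGRAPHIC = 17;
    static final int LOWERCASE = 22;
    static final int UPPERCASE = 30;

    /** Answers a binary-property query for the few selectors handled here. */
    static boolean hasBinaryProperty(int codePoint, int property) {
        switch (property) {
            case ALPHABETIC:  return Character.isAlphabetic(codePoint);
            case IDEOGRAPHIC: return Character.isIdeographic(codePoint);
            case LOWERCASE:   return Character.isLowerCase(codePoint);
            case UPPERCASE:   return Character.isUpperCase(codePoint);
            default:
                throw new IllegalArgumentException(
                    "selector not covered by this sketch: " + property);
        }
    }

    public static void main(String[] args) {
        System.out.println(hasBinaryProperty('a', LOWERCASE));
        System.out.println(hasBinaryProperty(0x4E2D, IDEOGRAPHIC));
    }
}
```

The caller passes the same kind of int selector that the real API expects, so switching to the ICU method should mostly be a matter of replacing the class name.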

Nested Class Summary

interface UProperty.NameChoice Selector constants for UCharacter.getPropertyName() and UCharacter.getPropertyValueName(). 

Constant Summary

int AGE String property Age.
int ALPHABETIC Binary property Alphabetic.
int ASCII_HEX_DIGIT Binary property ASCII_Hex_Digit (0-9 A-F a-f).
int BIDI_CLASS Enumerated property Bidi_Class.
int BIDI_CONTROL Binary property Bidi_Control.
int BIDI_MIRRORED Binary property Bidi_Mirrored.
int BIDI_MIRRORING_GLYPH String property Bidi_Mirroring_Glyph.
int BIDI_PAIRED_BRACKET String property Bidi_Paired_Bracket (new in Unicode 6.3).
int BIDI_PAIRED_BRACKET_TYPE Enumerated property Bidi_Paired_Bracket_Type (new in Unicode 6.3).
int BINARY_START First constant for binary Unicode properties.
int BLOCK Enumerated property Block.
int CANONICAL_COMBINING_CLASS Enumerated property Canonical_Combining_Class.
int CASED Binary property Cased.
int CASE_FOLDING String property Case_Folding.
int CASE_IGNORABLE Binary property Case_Ignorable.
int CASE_SENSITIVE Binary property Case_Sensitive.
int CHANGES_WHEN_CASEFOLDED Binary property Changes_When_Casefolded.
int CHANGES_WHEN_CASEMAPPED Binary property Changes_When_Casemapped.
int CHANGES_WHEN_LOWERCASED Binary property Changes_When_Lowercased.
int CHANGES_WHEN_NFKC_CASEFOLDED Binary property Changes_When_NFKC_Casefolded.
int CHANGES_WHEN_TITLECASED Binary property Changes_When_Titlecased.
int CHANGES_WHEN_UPPERCASED Binary property Changes_When_Uppercased.
int DASH Binary property Dash.
int DECOMPOSITION_TYPE Enumerated property Decomposition_Type.
int DEFAULT_IGNORABLE_CODE_POINT Binary property Default_Ignorable_Code_Point (new).
int DEPRECATED Binary property Deprecated (new).
int DIACRITIC Binary property Diacritic.
int DOUBLE_START First constant for double Unicode properties.
int EAST_ASIAN_WIDTH Enumerated property East_Asian_Width.
int EXTENDER Binary property Extender.
int FULL_COMPOSITION_EXCLUSION Binary property Full_Composition_Exclusion.
int GENERAL_CATEGORY Enumerated property General_Category.
int GENERAL_CATEGORY_MASK Bitmask property General_Category_Mask.
int GRAPHEME_BASE Binary property Grapheme_Base (new).
int GRAPHEME_CLUSTER_BREAK Enumerated property Grapheme_Cluster_Break (new in Unicode 4.1).
int GRAPHEME_EXTEND Binary property Grapheme_Extend (new).
int GRAPHEME_LINK Binary property Grapheme_Link (new).
int HANGUL_SYLLABLE_TYPE Enumerated property Hangul_Syllable_Type (new in Unicode 4).
int HEX_DIGIT Binary property Hex_Digit.
int HYPHEN Binary property Hyphen.
int IDEOGRAPHIC Binary property Ideographic.
int IDS_BINARY_OPERATOR Binary property IDS_Binary_Operator (new).
int IDS_TRINARY_OPERATOR Binary property IDS_Trinary_Operator (new).
int ID_CONTINUE Binary property ID_Continue.
int ID_START Binary property ID_Start.
int INT_START First constant for enumerated/integer Unicode properties.
int JOINING_GROUP Enumerated property Joining_Group.
int JOINING_TYPE Enumerated property Joining_Type.
int JOIN_CONTROL Binary property Join_Control.
int LEAD_CANONICAL_COMBINING_CLASS Enumerated property Lead_Canonical_Combining_Class.
int LINE_BREAK Enumerated property Line_Break.
int LOGICAL_ORDER_EXCEPTION Binary property Logical_Order_Exception (new).
int LOWERCASE Binary property Lowercase.
int LOWERCASE_MAPPING String property Lowercase_Mapping.
int MASK_START First constant for bit-mask Unicode properties.
int MATH Binary property Math.
int NAME String property Name.
int NFC_INERT Binary property NFC_Inert.
int NFC_QUICK_CHECK Enumerated property NFC_Quick_Check.
int NFD_INERT Binary property NFD_Inert.
int NFD_QUICK_CHECK Enumerated property NFD_Quick_Check.
int NFKC_INERT Binary property NFKC_Inert.
int NFKC_QUICK_CHECK Enumerated property NFKC_Quick_Check.
int NFKD_INERT Binary property NFKD_Inert.
int NFKD_QUICK_CHECK Enumerated property NFKD_Quick_Check.
int NONCHARACTER_CODE_POINT Binary property Noncharacter_Code_Point.
int NUMERIC_TYPE Enumerated property Numeric_Type.
int NUMERIC_VALUE Double property Numeric_Value.
int OTHER_PROPERTY_START First constant for Unicode properties with unusual value types.
int PATTERN_SYNTAX Binary property Pattern_Syntax (new in Unicode 4.1).
int PATTERN_WHITE_SPACE Binary property Pattern_White_Space (new in Unicode 4.1).
int POSIX_ALNUM Binary property alnum (a C/POSIX character class).
int POSIX_BLANK Binary property blank (a C/POSIX character class).
int POSIX_GRAPH Binary property graph (a C/POSIX character class).
int POSIX_PRINT Binary property print (a C/POSIX character class).
int POSIX_XDIGIT Binary property xdigit (a C/POSIX character class).
int QUOTATION_MARK Binary property Quotation_Mark.
int RADICAL Binary property Radical (new).
int SCRIPT Enumerated property Script.
int SCRIPT_EXTENSIONS Miscellaneous property Script_Extensions (new in Unicode 6.0).
int SEGMENT_STARTER Binary property Segment_Starter.
int SENTENCE_BREAK Enumerated property Sentence_Break (new in Unicode 4.1).
int SIMPLE_CASE_FOLDING String property Simple_Case_Folding.
int SIMPLE_LOWERCASE_MAPPING String property Simple_Lowercase_Mapping.
int SIMPLE_TITLECASE_MAPPING String property Simple_Titlecase_Mapping.
int SIMPLE_UPPERCASE_MAPPING String property Simple_Uppercase_Mapping.
int SOFT_DOTTED Binary property Soft_Dotted (new).
int STRING_START First constant for string Unicode properties.
int S_TERM Binary property STerm (new in Unicode 4.0.1).
int TERMINAL_PUNCTUATION Binary property Terminal_Punctuation.
int TITLECASE_MAPPING String property Titlecase_Mapping.
int TRAIL_CANONICAL_COMBINING_CLASS Enumerated property Trail_Canonical_Combining_Class.
int UNIFIED_IDEOGRAPH Binary property Unified_Ideograph (new).
int UPPERCASE Binary property Uppercase.
int UPPERCASE_MAPPING String property Uppercase_Mapping.
int VARIATION_SELECTOR Binary property Variation_Selector (new in Unicode 4.0.1).
int WHITE_SPACE Binary property White_Space.
int WORD_BREAK Enumerated property Word_Break (new in Unicode 4.1).
int XID_CONTINUE Binary property XID_Continue.
int XID_START Binary property XID_Start.

Constants

public static final int AGE

String property Age. Corresponds to UCharacter.getAge(int).

Constant Value: 16384

public static final int ALPHABETIC

Binary property Alphabetic.

Property for UCharacter.isUAlphabetic(), different from the property in UCharacter.isLetter().

Lu + Ll + Lt + Lm + Lo + Nl + Other_Alphabetic.

Constant Value: 0
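
The difference between Alphabetic and a plain letter test can be seen with JDK calls alone. The sketch below assumes that Character.isAlphabetic and Character.isLetter follow their documented definitions. AlphabeticSketch is a hypothetical helper, not an ICU class, and the JDK may use a different Unicode version than ICU.

```java
/** Hypothetical helper contrasting Alphabetic with the letter categories. */
final class AlphabeticSketch {
    /**
     * True for code points that are Alphabetic without being in one of the
     * letter categories Lu, Ll, Lt, Lm or Lo (for example Nl characters).
     */
    static boolean alphabeticButNotLetter(int codePoint) {
        return Character.isAlphabetic(codePoint) && !Character.isLetter(codePoint);
    }

    public static void main(String[] args) {
        // U+2160 ROMAN NUMERAL ONE has general category Nl.
        System.out.println(alphabeticButNotLetter(0x2160));
        System.out.println(alphabeticButNotLetter('A'));
    }
}
```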

public static final int ASCII_HEX_DIGIT

Binary property ASCII_Hex_Digit (0-9 A-F a-f).

Constant Value: 1
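
Because the set is spelled out in the description (0-9 A-F a-f), a direct range check is enough to sketch it. AsciiHexDigitSketch is a hypothetical name used only for illustration. The wider Hex_Digit property also covers characters outside ASCII, such as fullwidth forms, which this check deliberately rejects.

```java
/** Hypothetical range check for the ASCII_Hex_Digit set named above. */
final class AsciiHexDigitSketch {
    /** True only for 0-9, A-F and a-f. */
    static boolean isAsciiHexDigit(int codePoint) {
        return (codePoint >= '0' && codePoint <= '9')
            || (codePoint >= 'A' && codePoint <= 'F')
            || (codePoint >= 'a' && codePoint <= 'f');
    }
}
```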

public static final int BIDI_CLASS

Enumerated property Bidi_Class. Same as UCharacter.getDirection(int), returns UCharacterDirection values.

Constant Value: 4096

public static final int BIDI_CONTROL

Binary property Bidi_Control.

Format controls which have specific functions in the Bidi Algorithm.

Constant Value: 2

public static final int BIDI_MIRRORED

Binary property Bidi_Mirrored.

Characters that may change display in RTL text.

Property for UCharacter.isMirrored().

See the Unicode Bidirectional Algorithm, UAX #9.

Constant Value: 3

public static final int BIDI_MIRRORING_GLYPH

String property Bidi_Mirroring_Glyph. Corresponds to UCharacter.getMirror(int).

Constant Value: 16385

public static final int BIDI_PAIRED_BRACKET

String property Bidi_Paired_Bracket (new in Unicode 6.3). Corresponds to UCharacter.getBidiPairedBracket.

Constant Value: 16397

public static final int BIDI_PAIRED_BRACKET_TYPE

Enumerated property Bidi_Paired_Bracket_Type (new in Unicode 6.3). Used in UAX #9: Unicode Bidirectional Algorithm (http://www.unicode.org/reports/tr9/). Returns UCharacter.BidiPairedBracketType values.

Constant Value: 4117

public static final int BINARY_START

First constant for binary Unicode properties.

Constant Value: 0

public static final int BLOCK

Enumerated property Block. Same as UCharacter.UnicodeBlock.of(int), returns UCharacter.UnicodeBlock values.

Constant Value: 4097

public static final int CANONICAL_COMBINING_CLASS

Enumerated property Canonical_Combining_Class. Same as UCharacter.getCombiningClass(int), returns 8-bit numeric values.

Constant Value: 4098

public static final int CASED

Binary property Cased. For Lowercase, Uppercase and Titlecase characters.

Constant Value: 49

public static final int CASE_FOLDING

String property Case_Folding. Corresponds to UCharacter.foldCase(String, boolean).

Constant Value: 16386

public static final int CASE_IGNORABLE

Binary property Case_Ignorable. Used in context-sensitive case mappings.

Constant Value: 50

public static final int CASE_SENSITIVE

Binary property Case_Sensitive.

Either the source of a case mapping or in the target of a case mapping. Not the same as the general category Cased_Letter.

Constant Value: 34

public static final int CHANGES_WHEN_CASEFOLDED

Binary property Changes_When_Casefolded.

Constant Value: 54

public static final int CHANGES_WHEN_CASEMAPPED

Binary property Changes_When_Casemapped.

Constant Value: 55

public static final int CHANGES_WHEN_LOWERCASED

Binary property Changes_When_Lowercased.

Constant Value: 51

public static final int CHANGES_WHEN_NFKC_CASEFOLDED

Binary property Changes_When_NFKC_Casefolded.

Constant Value: 56

public static final int CHANGES_WHEN_TITLECASED

Binary property Changes_When_Titlecased.

Constant Value: 53

public static final int CHANGES_WHEN_UPPERCASED

Binary property Changes_When_Uppercased.

Constant Value: 52

public static final int DASH

Binary property Dash.

Variations of dashes.

Constant Value: 4

public static final int DECOMPOSITION_TYPE

Enumerated property Decomposition_Type. Returns UCharacter.DecompositionType values.

Constant Value: 4099

public static final int DEFAULT_IGNORABLE_CODE_POINT

Binary property Default_Ignorable_Code_Point (new).

Indicates that a code point is ignorable in most processing.

Codepoints (2060..206F, FFF0..FFFB, E0000..E0FFF) + Other_Default_Ignorable_Code_Point + (Cf + Cc + Cs - White_Space)

Constant Value: 5

public static final int DEPRECATED

Binary property Deprecated (new).

The usage of deprecated characters is strongly discouraged.

Constant Value: 6

public static final int DIACRITIC

Binary property Diacritic.

Characters that linguistically modify the meaning of another character to which they apply.

Constant Value: 7

public static final int DOUBLE_START

First constant for double Unicode properties.

Constant Value: 12288

public static final int EAST_ASIAN_WIDTH

Enumerated property East_Asian_Width. See UAX #11 (http://www.unicode.org/reports/tr11/). Returns UCharacter.EastAsianWidth values.

Constant Value: 4100

public static final int EXTENDER

Binary property Extender.

Extend the value or shape of a preceding alphabetic character, e.g. length and iteration marks.

Constant Value: 8

public static final int FULL_COMPOSITION_EXCLUSION

Binary property Full_Composition_Exclusion.

CompositionExclusions.txt + Singleton Decompositions + Non-Starter Decompositions.

Constant Value: 9

public static final int GENERAL_CATEGORY

Enumerated property General_Category. Same as UCharacter.getType(int), returns UCharacterCategory values.

Constant Value: 4101

public static final int GENERAL_CATEGORY_MASK

Bitmask property General_Category_Mask. This is the General_Category property returned as a bit mask. When used in UCharacter.getIntPropertyValue(c), returns bit masks for UCharacterCategory values where exactly one bit is set. When used with UCharacter.getPropertyValueName() and UCharacter.getPropertyValueEnum(), a multi-bit mask is used for sets of categories like "Letters".

Constant Value: 8192
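
The single-bit versus multi-bit distinction can be sketched with plain bit arithmetic. The example below uses java.lang.Character category numbers as a stand-in. That numbering is an assumption of the sketch and is not guaranteed to match UCharacterCategory for every category. CategoryMaskSketch is a hypothetical helper.

```java
/** Hypothetical helper showing category bit masks with JDK category numbers. */
final class CategoryMaskSketch {
    /** Multi-bit mask covering the letter categories Lu, Ll, Lt, Lm and Lo. */
    static final int LETTER_MASK =
          (1 << Character.UPPERCASE_LETTER)
        | (1 << Character.LOWERCASE_LETTER)
        | (1 << Character.TITLECASE_LETTER)
        | (1 << Character.MODIFIER_LETTER)
        | (1 << Character.OTHER_LETTER);

    /** Mask with exactly one bit set: the bit of the code point's category. */
    static int categoryMask(int codePoint) {
        return 1 << Character.getType(codePoint);
    }

    /** Tests the single-bit mask of a code point against a set of categories. */
    static boolean isLetter(int codePoint) {
        return (categoryMask(codePoint) & LETTER_MASK) != 0;
    }
}
```

Testing a code point against a set of categories becomes a single AND, which is the usual reason to expose a category as a mask.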

public static final int GRAPHEME_BASE

Binary property Grapheme_Base (new).

For programmatic determination of grapheme cluster boundaries.

[0..10FFFF]-Cc-Cf-Cs-Co-Cn-Zl-Zp-Grapheme_Link-Grapheme_Extend-CGJ

Constant Value: 10

public static final int GRAPHEME_CLUSTER_BREAK

Enumerated property Grapheme_Cluster_Break (new in Unicode 4.1). Used in UAX #29: Text Boundaries (http://www.unicode.org/reports/tr29/). Returns UCharacter.GraphemeClusterBreak values.

Constant Value: 4114

public static final int GRAPHEME_EXTEND

Binary property Grapheme_Extend (new).

For programmatic determination of grapheme cluster boundaries.

Me+Mn+Mc+Other_Grapheme_Extend-Grapheme_Link-CGJ

Constant Value: 11

public static final int GRAPHEME_LINK

Binary property Grapheme_Link (new).

For programmatic determination of grapheme cluster boundaries.

Constant Value: 12

public static final int HANGUL_SYLLABLE_TYPE

Enumerated property Hangul_Syllable_Type (new in Unicode 4). Returns UCharacter.HangulSyllableType values.

Constant Value: 4107

public static final int HEX_DIGIT

Binary property Hex_Digit.

Characters commonly used for hexadecimal numbers.

Constant Value: 13

public static final int HYPHEN

Binary property Hyphen.

Dashes used to mark connections between pieces of words, plus the Katakana middle dot.

Constant Value: 14

public static final int IDEOGRAPHIC

Binary property Ideographic.

CJKV ideographs.

Constant Value: 17

public static final int IDS_BINARY_OPERATOR

Binary property IDS_Binary_Operator (new).

For programmatic determination of Ideographic Description Sequences.

Constant Value: 18

public static final int IDS_TRINARY_OPERATOR

Binary property IDS_Trinary_Operator (new).

For programmatic determination of Ideographic Description Sequences.

Constant Value: 19

public static final int ID_CONTINUE

Binary property ID_Continue.

Characters that can continue an identifier.

ID_Start+Mn+Mc+Nd+Pc

Constant Value: 15

public static final int ID_START

Binary property ID_Start.

Characters that can start an identifier.

Lu+Ll+Lt+Lm+Lo+Nl

Constant Value: 16
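
The two category formulas given for ID_Start (Lu+Ll+Lt+Lm+Lo+Nl) and ID_Continue (ID_Start+Mn+Mc+Nd+Pc) translate directly into general-category checks. The sketch below implements only the formulas as written on this page, using java.lang.Character. The property data that ICU actually ships may include adjustments that these formulas do not show. IdentifierSketch is a hypothetical name.

```java
/** Hypothetical helper implementing the category formulas quoted above. */
final class IdentifierSketch {
    /** Lu + Ll + Lt + Lm + Lo + Nl. */
    static boolean isIdStart(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.UPPERCASE_LETTER:
            case Character.LOWERCASE_LETTER:
            case Character.TITLECASE_LETTER:
            case Character.MODIFIER_LETTER:
            case Character.OTHER_LETTER:
            case Character.LETTER_NUMBER:
                return true;
            default:
                return false;
        }
    }

    /** ID_Start + Mn + Mc + Nd + Pc. */
    static boolean isIdContinue(int codePoint) {
        if (isIdStart(codePoint)) {
            return true;
        }
        switch (Character.getType(codePoint)) {
            case Character.NON_SPACING_MARK:
            case Character.COMBINING_SPACING_MARK:
            case Character.DECIMAL_DIGIT_NUMBER:
            case Character.CONNECTOR_PUNCTUATION:
                return true;
            default:
                return false;
        }
    }
}
```

With these definitions a digit or an underscore can continue an identifier but cannot start one.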

public static final int INT_START

First constant for enumerated/integer Unicode properties.

Constant Value: 4096

public static final int JOINING_GROUP

Enumerated property Joining_Group. Returns UCharacter.JoiningGroup values.

Constant Value: 4102

public static final int JOINING_TYPE

Enumerated property Joining_Type. Returns UCharacter.JoiningType values.

Constant Value: 4103

public static final int JOIN_CONTROL

Binary property Join_Control.

Format controls for cursive joining and ligation.

Constant Value: 20

public static final int LEAD_CANONICAL_COMBINING_CLASS

Enumerated property Lead_Canonical_Combining_Class. ICU-specific property for the ccc of the first code point of the decomposition, or lccc(c)=ccc(NFD(c)[0]). Useful for checking for canonically ordered text; see Normalizer.FCD and http://www.unicode.org/notes/tn5/#FCD . Returns 8-bit numeric values like CANONICAL_COMBINING_CLASS.

Constant Value: 4112

public static final int LINE_BREAK

Enumerated property Line_Break. Returns UCharacter.LineBreak values.

Constant Value: 4104

public static final int LOGICAL_ORDER_EXCEPTION

Binary property Logical_Order_Exception (new).

Characters that do not use logical order and require special handling in most processing.

Constant Value: 21

public static final int LOWERCASE

Binary property Lowercase.

Same as UCharacter.isULowercase(), different from UCharacter.isLowerCase().

Ll+Other_Lowercase

Constant Value: 22
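
The Other_Lowercase part of the formula is what separates this property from a pure general-category test. As a sketch, the JDK documents Character.isLowerCase as covering both Ll and Other_Lowercase, so it can be used to find code points that are Lowercase without being Ll. LowercaseSketch is a hypothetical helper, and the JDK's Unicode version may differ from ICU's.

```java
/** Hypothetical helper isolating the Other_Lowercase contribution. */
final class LowercaseSketch {
    /** True when the code point is lowercase but its general category is not Ll. */
    static boolean lowercaseViaOtherLowercase(int codePoint) {
        return Character.isLowerCase(codePoint)
            && Character.getType(codePoint) != Character.LOWERCASE_LETTER;
    }

    public static void main(String[] args) {
        // U+02B0 MODIFIER LETTER SMALL H has general category Lm.
        System.out.println(lowercaseViaOtherLowercase(0x02B0));
        System.out.println(lowercaseViaOtherLowercase('a'));
    }
}
```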

public static final int LOWERCASE_MAPPING

String property Lowercase_Mapping. Corresponds to UCharacter.toLowerCase(String).

Constant Value: 16388

public static final int MASK_START

First constant for bit-mask Unicode properties.

Constant Value: 8192

public static final int MATH

Binary property Math.

Sm+Other_Math

Constant Value: 23

public static final int NAME

String property Name. Corresponds to UCharacter.getName(int).

Constant Value: 16389

public static final int NFC_INERT

Binary property NFC_Inert. ICU-specific property for characters that are inert under NFC, i.e., they do not interact with adjacent characters. Used for example in normalizing transforms in incremental mode to find the boundary of safely normalizable text despite possible text additions.

See also NFD_INERT.

Constant Value: 39

public static final int NFC_QUICK_CHECK

Enumerated property NFC_Quick_Check. Returns numeric values compatible with Normalizer.QuickCheckResult.

Constant Value: 4110

public static final int NFD_INERT

Binary property NFD_Inert. ICU-specific property for characters that are inert under NFD, i.e., they do not interact with adjacent characters. Used for example in normalizing transforms in incremental mode to find the boundary of safely normalizable text despite possible text additions.

There is one such property per normalization form. These properties are computed as follows. An inert character is:

a) unassigned, or ALL of the following:
b) of combining class 0.
c) not decomposed by this normalization form.
AND if NFC or NFKC,
d) can never compose with a previous character.
e) can never compose with a following character.
f) can never change if another character is added.

Example: a-breve might satisfy all but f, but if you add an ogonek it changes to a-ogonek + breve.

See also com.ibm.text.UCD.NFSkippable in the ICU4J repository, and icu/source/common/unormimp.h.

Constant Value: 37
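
The a-breve example can be reproduced with java.text.Normalizer. The sketch below is only a probe for one base string and one suffix. It does not compute the inert property, which would have to hold for every possible neighbor. InertProbe is a hypothetical helper.

```java
import java.text.Normalizer;

/** Hypothetical probe showing that a character can change when text is appended. */
final class InertProbe {
    /**
     * Returns true if appending suffix changes how base itself is normalized,
     * i.e. normalize(base + suffix) no longer starts with normalize(base).
     */
    static boolean changesWhenFollowedBy(String base, String suffix, Normalizer.Form form) {
        String alone = Normalizer.normalize(base, form);
        String combined = Normalizer.normalize(base + suffix, form);
        return !combined.startsWith(alone);
    }

    public static void main(String[] args) {
        // U+0103 is a-breve, U+0328 is the combining ogonek.
        System.out.println(changesWhenFollowedBy("\u0103", "\u0328", Normalizer.Form.NFC));
    }
}
```

Under NFC, a-breve followed by a combining ogonek becomes a-ogonek (U+0105) followed by a combining breve (U+0306), so the a-breve itself did not survive the addition. This is the failure of condition f described above.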

public static final int NFD_QUICK_CHECK

Enumerated property NFD_Quick_Check. Returns numeric values compatible with Normalizer.QuickCheckResult.

Constant Value: 4108

public static final int NFKC_INERT

Binary property NFKC_Inert. ICU-specific property for characters that are inert under NFKC, i.e., they do not interact with adjacent characters. Used for example in normalizing transforms in incremental mode to find the boundary of safely normalizable text despite possible text additions.

See also NFD_INERT.

Constant Value: 40

public static final int NFKC_QUICK_CHECK

Enumerated property NFKC_Quick_Check. Returns numeric values compatible with Normalizer.QuickCheckResult.

Constant Value: 4111

public static final int NFKD_INERT

Binary property NFKD_Inert. ICU-specific property for characters that are inert under NFKD, i.e., they do not interact with adjacent characters. Used for example in normalizing transforms in incremental mode to find the boundary of safely normalizable text despite possible text additions.

See also NFD_INERT.

Constant Value: 38

public static final int NFKD_QUICK_CHECK

Enumerated property NFKD_Quick_Check. Returns numeric values compatible with Normalizer.QuickCheckResult.

Constant Value: 4109

public static final int NONCHARACTER_CODE_POINT

Binary property Noncharacter_Code_Point.

Code points that are explicitly defined as illegal for the encoding of characters.

Constant Value: 24
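
The description does not list the code points, but the Unicode Standard defines a fixed set of 66 noncharacters: U+FDD0..U+FDEF plus the last two code points of every plane. A minimal sketch of that definition follows. NoncharacterSketch is a hypothetical name and not an ICU class.

```java
/** Hypothetical check for the fixed set of Unicode noncharacters. */
final class NoncharacterSketch {
    static boolean isNoncharacter(int codePoint) {
        if (codePoint < 0 || codePoint > 0x10FFFF) {
            return false; // outside the Unicode code space
        }
        // U+FDD0..U+FDEF, or low 16 bits equal to FFFE or FFFF in any plane.
        return (codePoint >= 0xFDD0 && codePoint <= 0xFDEF)
            || (codePoint & 0xFFFE) == 0xFFFE;
    }
}
```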

public static final int NUMERIC_TYPE

Enumerated property Numeric_Type. Returns UCharacter.NumericType values.

Constant Value: 4105

public static final int NUMERIC_VALUE

Double property Numeric_Value. Corresponds to UCharacter.getUnicodeNumericValue(int).

Constant Value: 12288

public static final int OTHER_PROPERTY_START

First constant for Unicode properties with unusual value types.

Constant Value: 28672
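
The *_START constants on this page partition the selector values by value type. The sketch below classifies a selector by the nearest start value at or below it, using the values listed here. That rule is an assumption of the sketch. This page does not give range limits, so a value that no property uses would still receive a label. PropertyKindSketch is a hypothetical helper.

```java
/** Hypothetical classifier based on the *_START values listed on this page. */
final class PropertyKindSketch {
    static final int BINARY_START = 0;
    static final int INT_START = 4096;
    static final int MASK_START = 8192;
    static final int DOUBLE_START = 12288;
    static final int STRING_START = 16384;
    static final int OTHER_PROPERTY_START = 28672;

    /** Names the value type of a selector by the range it falls into. */
    static String kindOf(int property) {
        if (property < BINARY_START) {
            throw new IllegalArgumentException("negative selector: " + property);
        }
        if (property >= OTHER_PROPERTY_START) return "other";
        if (property >= STRING_START) return "string";
        if (property >= DOUBLE_START) return "double";
        if (property >= MASK_START) return "mask";
        if (property >= INT_START) return "int";
        return "binary";
    }
}
```

For example, SCRIPT (4106) falls in the enumerated range and NAME (16389) in the string range.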

public static final int PATTERN_SYNTAX

Binary property Pattern_Syntax (new in Unicode 4.1). See UAX #31 Identifier and Pattern Syntax (http://www.unicode.org/reports/tr31/).

Constant Value: 42

public static final int PATTERN_WHITE_SPACE

Binary property Pattern_White_Space (new in Unicode 4.1). See UAX #31 Identifier and Pattern Syntax (http://www.unicode.org/reports/tr31/).

Constant Value: 43

public static final int POSIX_ALNUM

Binary property alnum (a C/POSIX character class). Implemented according to the UTS #18 Annex C Standard Recommendation. See the UCharacter class documentation.

Constant Value: 44

public static final int POSIX_BLANK

Binary property blank (a C/POSIX character class). Implemented according to the UTS #18 Annex C Standard Recommendation. See the UCharacter class documentation.

Constant Value: 45

public static final int POSIX_GRAPH

Binary property graph (a C/POSIX character class). Implemented according to the UTS #18 Annex C Standard Recommendation. See the UCharacter class documentation.

Constant Value: 46

public static final int POSIX_PRINT

Binary property print (a C/POSIX character class). Implemented according to the UTS #18 Annex C Standard Recommendation. See the UCharacter class documentation.

Constant Value: 47

public static final int POSIX_XDIGIT

Binary property xdigit (a C/POSIX character class). Implemented according to the UTS #18 Annex C Standard Recommendation. See the UCharacter class documentation.

Constant Value: 48

public static final int QUOTATION_MARK

Binary property Quotation_Mark.

Constant Value: 25

public static final int RADICAL

Binary property Radical (new).

For programmatic determination of Ideographic Description Sequences.

Constant Value: 26

public static final int SCRIPT

Enumerated property Script. Same as UScript.getScript(int), returns UScript values.

Constant Value: 4106

public static final int SCRIPT_EXTENSIONS

Miscellaneous property Script_Extensions (new in Unicode 6.0). Some characters are commonly used in multiple scripts. For more information, see UAX #24: http://www.unicode.org/reports/tr24/. Corresponds to UScript.hasScript and UScript.getScriptExtensions.

Constant Value: 28672

public static final int SEGMENT_STARTER

Binary property Segment_Starter. ICU-specific property for characters that are starters in terms of Unicode normalization and combining character sequences. They have ccc=0 and do not occur in non-initial position of the canonical decomposition of any character (like the combining diaeresis in NFD(a-umlaut) and a Jamo T in an NFD(Hangul LVT)). ICU uses this property for segmenting a string for generating a set of canonically equivalent strings, e.g. for canonical closure while processing collation tailoring rules.

Constant Value: 41
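
The two decompositions mentioned in the description can be inspected with java.text.Normalizer. The sketch below only lists the code points of NFD(c). It does not look up combining classes or compute the property itself. DecompositionSketch is a hypothetical helper.

```java
import java.text.Normalizer;

/** Hypothetical helper listing the code points of a canonical decomposition. */
final class DecompositionSketch {
    /** Code points of NFD(c), in order. */
    static int[] nfdCodePoints(int codePoint) {
        String source = new String(Character.toChars(codePoint));
        String nfd = Normalizer.normalize(source, Normalizer.Form.NFD);
        return nfd.codePoints().toArray();
    }
}
```

For a-umlaut (U+00E4) the result is U+0061 followed by U+0308, so U+0308 occurs in a non-initial position. For the Hangul LVT syllable U+AC01 the result is U+1100, U+1161, U+11A8, where the last code point is the trailing Jamo T.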

public static final int SENTENCE_BREAK

Enumerated property Sentence_Break (new in Unicode 4.1). Used in UAX #29: Text Boundaries (http://www.unicode.org/reports/tr29/). Returns UCharacter.SentenceBreak values.

Constant Value: 4115

public static final int SIMPLE_CASE_FOLDING

String property Simple_Case_Folding. Corresponds to UCharacter.foldCase(int, boolean).

Constant Value: 16390

public static final int SIMPLE_LOWERCASE_MAPPING

String property Simple_Lowercase_Mapping. Corresponds to UCharacter.toLowerCase(int).

Constant Value: 16391

public static final int SIMPLE_TITLECASE_MAPPING

String property Simple_Titlecase_Mapping. Corresponds to UCharacter.toTitleCase(int).

Constant Value: 16392

public static final int SIMPLE_UPPERCASE_MAPPING

String property Simple_Uppercase_Mapping. Corresponds to UCharacter.toUpperCase(int).

Constant Value: 16393

public static final int SOFT_DOTTED

Binary property Soft_Dotted (new).

Characters with a "soft dot", like i or j.

An accent placed on these characters causes the dot to disappear.

Constant Value: 27

public static final int STRING_START

First constant for string Unicode properties.

Constant Value: 16384

public static final int S_TERM

Binary property STerm (new in Unicode 4.0.1). Sentence Terminal. Used in UAX #29: Text Boundaries (http://www.unicode.org/reports/tr29/).

Constant Value: 35

public static final int TERMINAL_PUNCTUATION

Binary property Terminal_Punctuation.

Punctuation characters that generally mark the end of textual units.

Constant Value: 28

public static final int TITLECASE_MAPPING

String property Titlecase_Mapping. Corresponds to UCharacter.toTitleCase(String).

Constant Value: 16394

public static final int TRAIL_CANONICAL_COMBINING_CLASS

Enumerated property Trail_Canonical_Combining_Class. ICU-specific property for the ccc of the last code point of the decomposition, or tccc(c)=ccc(NFD(c)[last]). Useful for checking for canonically ordered text; see Normalizer.FCD and http://www.unicode.org/notes/tn5/#FCD . Returns 8-bit numeric values like CANONICAL_COMBINING_CLASS.

Constant Value: 4113

public static final int UNIFIED_IDEOGRAPH

Binary property Unified_Ideograph (new).

For programmatic determination of Ideographic Description Sequences.

Constant Value: 29

public static final int UPPERCASE

Binary property Uppercase.

Same as UCharacter.isUUppercase(), different from UCharacter.isUpperCase().

Lu+Other_Uppercase

Constant Value: 30

public static final int UPPERCASE_MAPPING

String property Uppercase_Mapping. Corresponds to UCharacter.toUpperCase(String).

Constant Value: 16396
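
The SIMPLE_* mapping entries point at int overloads and the full mapping entries at String overloads, and the two can give different results. A minimal sketch with JDK calls, used here as stand-ins for the UCharacter methods, shows the difference for the sharp s (U+00DF). CaseMappingSketch is a hypothetical helper.

```java
import java.util.Locale;

/** Hypothetical helper contrasting simple and full uppercase mappings. */
final class CaseMappingSketch {
    /** Simple mapping: one code point in, one code point out. */
    static int simpleUpper(int codePoint) {
        return Character.toUpperCase(codePoint);
    }

    /** Full mapping: the result may be longer than the input. */
    static String fullUpper(String text) {
        return text.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(simpleUpper(0x00DF)));
        System.out.println(fullUpper("\u00DF"));
    }
}
```

The simple mapping leaves U+00DF unchanged because no single uppercase code point is assigned to it, while the full mapping expands it to "SS".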

public static final int VARIATION_SELECTOR

Binary property Variation_Selector (new in Unicode 4.0.1). Indicates all those characters that qualify as Variation Selectors. For details on the behavior of these characters, see StandardizedVariants.html in the UCD and section 15.6, Variation Selectors, of the Unicode Standard.

Constant Value: 36

public static final int WHITE_SPACE

Binary property White_Space.

Same as UCharacter.isUWhiteSpace(), different from UCharacter.isSpace() and UCharacter.isWhitespace().

Space characters+TAB+CR+LF-ZWSP-ZWNBSP

Constant Value: 31
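
The gap between the White_Space property and a Java-style whitespace test can be sketched with the JDK alone. java.util.regex documents White_Space as a supported binary property, and java.lang.Character.isWhitespace uses a different definition. WhiteSpaceSketch is a hypothetical helper, and the JDK results are assumed here to be only an approximation of what ICU reports.

```java
import java.util.regex.Pattern;

/** Hypothetical helper testing the White_Space property through java.util.regex. */
final class WhiteSpaceSketch {
    private static final Pattern WHITE_SPACE = Pattern.compile("\\p{IsWhite_Space}");

    static boolean hasWhiteSpaceProperty(int codePoint) {
        String text = new String(Character.toChars(codePoint));
        return WHITE_SPACE.matcher(text).matches();
    }

    public static void main(String[] args) {
        // U+00A0 NO-BREAK SPACE: White_Space, but not Java whitespace.
        System.out.println(hasWhiteSpaceProperty(0x00A0));
        System.out.println(Character.isWhitespace(0x00A0));
    }
}
```

The two tests disagree in both directions: U+00A0 has the property but is not Java whitespace, while U+001C is Java whitespace without having the property. U+200B (ZWSP) is excluded by both.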

public static final int WORD_BREAK

Enumerated property Word_Break (new in Unicode 4.1). Used in UAX #29: Text Boundaries (http://www.unicode.org/reports/tr29/). Returns UCharacter.WordBreak values.

Constant Value: 4116

public static final int XID_CONTINUE

Binary property XID_Continue.

ID_Continue modified to allow closure under normalization forms NFKC and NFKD.

Constant Value: 32

public static final int XID_START

Binary property XID_Start.

ID_Start modified to allow closure under normalization forms NFKC and NFKD.

Constant Value: 33