UScript

public final class UScript extends Object

Constants for ISO 15924 script codes, and related functions.

The current set of script code constants supports at least all scripts that are encoded in the version of Unicode which ICU currently supports. The names of the constants are usually derived from the Unicode script property value aliases. See UAX #24 Unicode Script Property (http://www.unicode.org/reports/tr24/) and http://www.unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt .

In addition, constants for many ISO 15924 script codes are included, for use with language tags, CLDR data, and similar. Some of those codes are not used in the Unicode Character Database (UCD). For example, there are no characters that have a UCD script property value of Hans or Hant. All Han ideographs have the Hani script property value in Unicode.

Private-use codes Qaaa..Qabx are not included, except as used in the UCD or in CLDR.
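The private-use range can be checked with a plain string comparison. The following is a minimal sketch, not part of UScript: the class and method names are hypothetical, and it assumes the usual ISO 15924 code shape of one uppercase ASCII letter followed by three lowercase ASCII letters.

```java
/** Hypothetical helper (not part of UScript) for recognizing ISO 15924 private-use codes. */
final class PrivateUseScriptCodes {
    private PrivateUseScriptCodes() {}

    /** True if code has the assumed ISO 15924 shape: one uppercase letter, then three lowercase. */
    static boolean isWellFormed(String code) {
        if (code == null || code.length() != 4) {
            return false;
        }
        char first = code.charAt(0);
        if (first < 'A' || first > 'Z') {
            return false;
        }
        for (int i = 1; i < 4; i++) {
            char ch = code.charAt(i);
            if (ch < 'a' || ch > 'z') {
                return false;
            }
        }
        return true;
    }

    /** True if code lies in the private-use range Qaaa..Qabx, inclusive. */
    static boolean isPrivateUse(String code) {
        // For well-formed codes, lexicographic order matches the order of the code range.
        return isWellFormed(code)
                && code.compareTo("Qaaa") >= 0
                && code.compareTo("Qabx") <= 0;
    }
}
```

A caller could use such a check to decide whether a four-letter code from a language tag is likely to have a UScript constant at all.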

Starting with ICU 55, script codes are only added when their scripts have been or will certainly be encoded in Unicode, and have been assigned Unicode script property value aliases, to ensure that their script names are stable and match the names of the constants. Script codes like Latf and Aran that are not subject to separate encoding may be added at any time.

Nested Class Summary

enum UScript.ScriptUsage Script usage constants. 

Constant Summary

int ADLAM ISO 15924 script code
int AFAKA ISO 15924 script code
int AHOM ISO 15924 script code
int ANATOLIAN_HIEROGLYPHS ISO 15924 script code
int ARABIC Arabic
int ARMENIAN Armenian
int AVESTAN ISO 15924 script code
int BALINESE ISO 15924 script code
int BAMUM ISO 15924 script code
int BASSA_VAH ISO 15924 script code
int BATAK ISO 15924 script code
int BENGALI Bengali
int BHAIKSUKI ISO 15924 script code
int BLISSYMBOLS ISO 15924 script code
int BOOK_PAHLAVI ISO 15924 script code
int BOPOMOFO Bopomofo
int BRAHMI ISO 15924 script code
int BRAILLE Braille Script in Unicode 4
int BUGINESE Script in Unicode 4.1
int BUHID Buhid
int CANADIAN_ABORIGINAL Unified Canadian Aboriginal Syllabics
int CARIAN ISO 15924 script code
int CAUCASIAN_ALBANIAN ISO 15924 script code
int CHAKMA ISO 15924 script code
int CHAM ISO 15924 script code
int CHEROKEE Cherokee
int CIRTH ISO 15924 script code
int COMMON Common
int COPTIC Coptic
int CUNEIFORM ISO 15924 script code
int CYPRIOT Cypriot Script in Unicode 4
int CYRILLIC Cyrillic
int DEMOTIC_EGYPTIAN ISO 15924 script code
int DESERET Deseret
int DEVANAGARI Devanagari
int DUPLOYAN ISO 15924 script code
int EASTERN_SYRIAC ISO 15924 script code
int EGYPTIAN_HIEROGLYPHS ISO 15924 script code
int ELBASAN ISO 15924 script code
int ESTRANGELO_SYRIAC ISO 15924 script code
int ETHIOPIC Ethiopic
int GEORGIAN Georgian
int GLAGOLITIC Script in Unicode 4.1
int GOTHIC Gothic
int GRANTHA ISO 15924 script code
int GREEK Greek
int GUJARATI Gujarati
int GURMUKHI Gurmukhi
int HAN Han
int HANGUL Hangul
int HANUNOO Hanunoo
int HAN_WITH_BOPOMOFO ISO 15924 script code
int HARAPPAN_INDUS ISO 15924 script code
int HATRAN ISO 15924 script code
int HEBREW Hebrew
int HIERATIC_EGYPTIAN ISO 15924 script code
int HIRAGANA Hiragana
int IMPERIAL_ARAMAIC ISO 15924 script code
int INHERITED Inherited
int INSCRIPTIONAL_PAHLAVI ISO 15924 script code
int INSCRIPTIONAL_PARTHIAN ISO 15924 script code
int INVALID_CODE Invalid code
int JAMO ISO 15924 script code
int JAPANESE ISO 15924 script code
int JAVANESE ISO 15924 script code
int JURCHEN ISO 15924 script code
int KAITHI ISO 15924 script code
int KANNADA Kannada
int KATAKANA Katakana
int KATAKANA_OR_HIRAGANA Script in Unicode 4.0.1
int KAYAH_LI ISO 15924 script code
int KHAROSHTHI Script in Unicode 4.1
int KHMER Khmer
int KHOJKI ISO 15924 script code
int KHUDAWADI ISO 15924 script code
int KHUTSURI ISO 15924 script code
int KOREAN ISO 15924 script code
int KPELLE ISO 15924 script code
int LANNA ISO 15924 script code
int LAO Lao
int LATIN Latin
int LATIN_FRAKTUR ISO 15924 script code
int LATIN_GAELIC ISO 15924 script code
int LEPCHA ISO 15924 script code
int LIMBU Limbu Script in Unicode 4
int LINEAR_A ISO 15924 script code
int LINEAR_B Linear B Script in Unicode 4
int LISU ISO 15924 script code
int LOMA ISO 15924 script code
int LYCIAN ISO 15924 script code
int LYDIAN ISO 15924 script code
int MAHAJANI ISO 15924 script code
int MALAYALAM Malayalam
int MANDAEAN ISO 15924 script code
int MANDAIC ISO 15924 script code
int MANICHAEAN ISO 15924 script code
int MARCHEN ISO 15924 script code
int MATHEMATICAL_NOTATION ISO 15924 script code
int MAYAN_HIEROGLYPHS ISO 15924 script code
int MEITEI_MAYEK ISO 15924 script code
int MENDE Mende Kikakui ISO 15924 script code
int MEROITIC ISO 15924 script code
int MEROITIC_CURSIVE ISO 15924 script code
int MEROITIC_HIEROGLYPHS ISO 15924 script code
int MIAO ISO 15924 script code
int MODI ISO 15924 script code
int MONGOLIAN Mongolian
int MOON ISO 15924 script code
int MRO ISO 15924 script code
int MULTANI ISO 15924 script code
int MYANMAR Myanmar
int NABATAEAN ISO 15924 script code
int NAKHI_GEBA ISO 15924 script code
int NEWA ISO 15924 script code
int NEW_TAI_LUE Script in Unicode 4.1
int NKO ISO 15924 script code
int NUSHU ISO 15924 script code
int OGHAM Ogham
int OLD_CHURCH_SLAVONIC_CYRILLIC ISO 15924 script code
int OLD_HUNGARIAN ISO 15924 script code
int OLD_ITALIC Old Italic
int OLD_NORTH_ARABIAN ISO 15924 script code
int OLD_PERMIC ISO 15924 script code
int OLD_PERSIAN Script in Unicode 4.1
int OLD_SOUTH_ARABIAN ISO 15924 script code
int OL_CHIKI ISO 15924 script code
int ORIYA Oriya
int ORKHON ISO 15924 script code
int OSAGE ISO 15924 script code
int OSMANYA Osmanya Script in Unicode 4
int PAHAWH_HMONG ISO 15924 script code
int PALMYRENE ISO 15924 script code
int PAU_CIN_HAU ISO 15924 script code
int PHAGS_PA ISO 15924 script code
int PHOENICIAN ISO 15924 script code
int PHONETIC_POLLARD ISO 15924 script code
int PSALTER_PAHLAVI ISO 15924 script code
int REJANG ISO 15924 script code
int RONGORONGO ISO 15924 script code
int RUNIC Runic
int SAMARITAN ISO 15924 script code
int SARATI ISO 15924 script code
int SAURASHTRA ISO 15924 script code
int SHARADA ISO 15924 script code
int SHAVIAN Shavian Script in Unicode 4
int SIDDHAM ISO 15924 script code
int SIGN_WRITING ISO 15924 script code for Sutton SignWriting
int SIMPLIFIED_HAN ISO 15924 script code
int SINDHI ISO 15924 script code
int SINHALA Sinhala
int SORA_SOMPENG ISO 15924 script code
int SUNDANESE ISO 15924 script code
int SYLOTI_NAGRI Script in Unicode 4.1
int SYMBOLS ISO 15924 script code
int SYMBOLS_EMOJI ISO 15924 script code
int SYRIAC Syriac
int TAGALOG Tagalog
int TAGBANWA Tagbanwa
int TAI_LE Tai Le Script in Unicode 4
int TAI_VIET ISO 15924 script code
int TAKRI ISO 15924 script code
int TAMIL Tamil
int TANGUT ISO 15924 script code
int TELUGU Telugu
int TENGWAR ISO 15924 script code
int THAANA Thaana
int THAI Thai
int TIBETAN Tibetan
int TIFINAGH Script in Unicode 4.1
int TIRHUTA ISO 15924 script code
int TRADITIONAL_HAN ISO 15924 script code
int UCAS Unified Canadian Aboriginal Syllabics (alias)
int UGARITIC Ugaritic Script in Unicode 4
int UNKNOWN ISO 15924 script code
int UNWRITTEN_LANGUAGES ISO 15924 script code
int VAI ISO 15924 script code
int VISIBLE_SPEECH ISO 15924 script code
int WARANG_CITI ISO 15924 script code
int WESTERN_SYRIAC ISO 15924 script code
int WOLEAI ISO 15924 script code
int YI Yi syllables

Public Method Summary

static boolean
breaksBetweenLetters(int script)
Returns true if the script allows line breaks between letters (excluding hyphenation).
static int[]
getCode(ULocale locale)
Gets the script codes associated with the given locale or ISO 15924 abbreviation or name.
static int[]
getCode(String nameOrAbbrOrLocale)
Gets the script codes associated with the given locale or ISO 15924 abbreviation or name.
static int[]
getCode(Locale locale)
Gets the script codes associated with the given locale or ISO 15924 abbreviation or name.
static int
getCodeFromName(String nameOrAbbr)
Returns the script code associated with the given Unicode script property alias (name or abbreviation).
static String
getName(int scriptCode)
Returns the long Unicode script name, if there is one.
static String
getSampleString(int script)
Returns the script sample character string.
static int
getScript(int codepoint)
Gets the script code associated with the given codepoint.
static int
getScriptExtensions(int c, BitSet set)
Sets code point c's Script_Extensions as script code integers into the output BitSet.
static String
getShortName(int scriptCode)
Returns the 4-letter ISO 15924 script code, which is the same as the short Unicode script name if Unicode has names for the script.
static UScript.ScriptUsage
getUsage(int script)
Returns the script usage according to UAX #31 Unicode Identifier and Pattern Syntax.
static boolean
hasScript(int c, int sc)
Do the Script_Extensions of code point c contain script sc? If c does not have explicit Script_Extensions, then this tests whether c has the Script property value sc.
static boolean
isCased(int script)
Returns true if, in modern (or most recent) usage of the script, case distinctions are customary.
static boolean
isRightToLeft(int script)
Returns true if the script is written right-to-left.

Constants

public static final int ADLAM

ISO 15924 script code

Constant Value: 167

public static final int AFAKA

ISO 15924 script code

Constant Value: 147

public static final int AHOM

ISO 15924 script code

Constant Value: 161

public static final int ANATOLIAN_HIEROGLYPHS

ISO 15924 script code

Constant Value: 156

public static final int ARABIC

Arabic

Constant Value: 2

public static final int ARMENIAN

Armenian

Constant Value: 3

public static final int AVESTAN

ISO 15924 script code

Constant Value: 117

public static final int BALINESE

ISO 15924 script code

Constant Value: 62

public static final int BAMUM

ISO 15924 script code

Constant Value: 130

public static final int BASSA_VAH

ISO 15924 script code

Constant Value: 134

public static final int BATAK

ISO 15924 script code

Constant Value: 63

public static final int BENGALI

Bengali

Constant Value: 4

public static final int BHAIKSUKI

ISO 15924 script code

Constant Value: 168

public static final int BLISSYMBOLS

ISO 15924 script code

Constant Value: 64

public static final int BOOK_PAHLAVI

ISO 15924 script code

Constant Value: 124

public static final int BOPOMOFO

Bopomofo

Constant Value: 5

public static final int BRAHMI

ISO 15924 script code

Constant Value: 65

public static final int BRAILLE

Braille Script in Unicode 4

Constant Value: 46

public static final int BUGINESE

Script in Unicode 4.1

Constant Value: 55

public static final int BUHID

Buhid

Constant Value: 44

public static final int CANADIAN_ABORIGINAL

Unified Canadian Aboriginal Syllabics

Constant Value: 40

public static final int CARIAN

ISO 15924 script code

Constant Value: 104

public static final int CAUCASIAN_ALBANIAN

ISO 15924 script code

Constant Value: 159

public static final int CHAKMA

ISO 15924 script code

Constant Value: 118

public static final int CHAM

ISO 15924 script code

Constant Value: 66

public static final int CHEROKEE

Cherokee

Constant Value: 6

public static final int CIRTH

ISO 15924 script code

Constant Value: 67

public static final int COMMON

Common

Constant Value: 0

public static final int COPTIC

Coptic

Constant Value: 7

public static final int CUNEIFORM

ISO 15924 script code

Constant Value: 101

public static final int CYPRIOT

Cypriot Script in Unicode 4

Constant Value: 47

public static final int CYRILLIC

Cyrillic

Constant Value: 8

public static final int DEMOTIC_EGYPTIAN

ISO 15924 script code

Constant Value: 69

public static final int DESERET

Deseret

Constant Value: 9

public static final int DEVANAGARI

Devanagari

Constant Value: 10

public static final int DUPLOYAN

ISO 15924 script code

Constant Value: 135

public static final int EASTERN_SYRIAC

ISO 15924 script code

Constant Value: 97

public static final int EGYPTIAN_HIEROGLYPHS

ISO 15924 script code

Constant Value: 71

public static final int ELBASAN

ISO 15924 script code

Constant Value: 136

public static final int ESTRANGELO_SYRIAC

ISO 15924 script code

Constant Value: 95

public static final int ETHIOPIC

Ethiopic

Constant Value: 11

public static final int GEORGIAN

Georgian

Constant Value: 12

public static final int GLAGOLITIC

Script in Unicode 4.1

Constant Value: 56

public static final int GOTHIC

Gothic

Constant Value: 13

public static final int GRANTHA

ISO 15924 script code

Constant Value: 137

public static final int GREEK

Greek

Constant Value: 14

public static final int GUJARATI

Gujarati

Constant Value: 15

public static final int GURMUKHI

Gurmukhi

Constant Value: 16

public static final int HAN

Han

Constant Value: 17

public static final int HANGUL

Hangul

Constant Value: 18

public static final int HANUNOO

Hanunoo

Constant Value: 43

public static final int HAN_WITH_BOPOMOFO

ISO 15924 script code

Constant Value: 172

public static final int HARAPPAN_INDUS

ISO 15924 script code

Constant Value: 77

public static final int HATRAN

ISO 15924 script code

Constant Value: 162

public static final int HEBREW

Hebrew

Constant Value: 19

public static final int HIERATIC_EGYPTIAN

ISO 15924 script code

Constant Value: 70

public static final int HIRAGANA

Hiragana

Constant Value: 20

public static final int IMPERIAL_ARAMAIC

ISO 15924 script code

Constant Value: 116

public static final int INHERITED

Inherited

Constant Value: 1

public static final int INSCRIPTIONAL_PAHLAVI

ISO 15924 script code

Constant Value: 122

public static final int INSCRIPTIONAL_PARTHIAN

ISO 15924 script code

Constant Value: 125

public static final int INVALID_CODE

Invalid code

Constant Value: -1

public static final int JAMO

ISO 15924 script code

Constant Value: 173

public static final int JAPANESE

ISO 15924 script code

Constant Value: 105

public static final int JAVANESE

ISO 15924 script code

Constant Value: 78

public static final int JURCHEN

ISO 15924 script code

Constant Value: 148

public static final int KAITHI

ISO 15924 script code

Constant Value: 120

public static final int KANNADA

Kannada

Constant Value: 21

public static final int KATAKANA

Katakana

Constant Value: 22

public static final int KATAKANA_OR_HIRAGANA

Script in Unicode 4.0.1

Constant Value: 54

public static final int KAYAH_LI

ISO 15924 script code

Constant Value: 79

public static final int KHAROSHTHI

Script in Unicode 4.1

Constant Value: 57

public static final int KHMER

Khmer

Constant Value: 23

public static final int KHOJKI

ISO 15924 script code

Constant Value: 157

public static final int KHUDAWADI

ISO 15924 script code

Constant Value: 145

public static final int KHUTSURI

ISO 15924 script code

Constant Value: 72

public static final int KOREAN

ISO 15924 script code

Constant Value: 119

public static final int KPELLE

ISO 15924 script code

Constant Value: 138

public static final int LANNA

ISO 15924 script code

Constant Value: 106

public static final int LAO

Lao

Constant Value: 24

public static final int LATIN

Latin

Constant Value: 25

public static final int LATIN_FRAKTUR

ISO 15924 script code

Constant Value: 80

public static final int LATIN_GAELIC

ISO 15924 script code

Constant Value: 81

public static final int LEPCHA

ISO 15924 script code

Constant Value: 82

public static final int LIMBU

Limbu Script in Unicode 4

Constant Value: 48

public static final int LINEAR_A

ISO 15924 script code

Constant Value: 83

public static final int LINEAR_B

Linear B Script in Unicode 4

Constant Value: 49

public static final int LISU

ISO 15924 script code

Constant Value: 131

public static final int LOMA

ISO 15924 script code

Constant Value: 139

public static final int LYCIAN

ISO 15924 script code

Constant Value: 107

public static final int LYDIAN

ISO 15924 script code

Constant Value: 108

public static final int MAHAJANI

ISO 15924 script code

Constant Value: 160

public static final int MALAYALAM

Malayalam

Constant Value: 26

public static final int MANDAEAN

ISO 15924 script code

Constant Value: 84

public static final int MANDAIC

ISO 15924 script code

Constant Value: 84

public static final int MANICHAEAN

ISO 15924 script code

Constant Value: 121

public static final int MARCHEN

ISO 15924 script code

Constant Value: 169

public static final int MATHEMATICAL_NOTATION

ISO 15924 script code

Constant Value: 128

public static final int MAYAN_HIEROGLYPHS

ISO 15924 script code

Constant Value: 85

public static final int MEITEI_MAYEK

ISO 15924 script code

Constant Value: 115

public static final int MENDE

Mende Kikakui ISO 15924 script code

Constant Value: 140

public static final int MEROITIC

ISO 15924 script code

Constant Value: 86

public static final int MEROITIC_CURSIVE

ISO 15924 script code

Constant Value: 141

public static final int MEROITIC_HIEROGLYPHS

ISO 15924 script code

Constant Value: 86

public static final int MIAO

ISO 15924 script code

Constant Value: 92

public static final int MODI

ISO 15924 script code

Constant Value: 163

public static final int MONGOLIAN

Mongolian

Constant Value: 27

public static final int MOON

ISO 15924 script code

Constant Value: 114

public static final int MRO

ISO 15924 script code

Constant Value: 149

public static final int MULTANI

ISO 15924 script code

Constant Value: 164

public static final int MYANMAR

Myanmar

Constant Value: 28

public static final int NABATAEAN

ISO 15924 script code

Constant Value: 143

public static final int NAKHI_GEBA

ISO 15924 script code

Constant Value: 132

public static final int NEWA

ISO 15924 script code

Constant Value: 170

public static final int NEW_TAI_LUE

Script in Unicode 4.1

Constant Value: 59

public static final int NKO

ISO 15924 script code

Constant Value: 87

public static final int NUSHU

ISO 15924 script code

Constant Value: 150

public static final int OGHAM

Ogham

Constant Value: 29

public static final int OLD_CHURCH_SLAVONIC_CYRILLIC

ISO 15924 script code

Constant Value: 68

public static final int OLD_HUNGARIAN

ISO 15924 script code

Constant Value: 76

public static final int OLD_ITALIC

Old Italic

Constant Value: 30

public static final int OLD_NORTH_ARABIAN

ISO 15924 script code

Constant Value: 142

public static final int OLD_PERMIC

ISO 15924 script code

Constant Value: 89

public static final int OLD_PERSIAN

Script in Unicode 4.1

Constant Value: 61

public static final int OLD_SOUTH_ARABIAN

ISO 15924 script code

Constant Value: 133

public static final int OL_CHIKI

ISO 15924 script code

Constant Value: 109

public static final int ORIYA

Oriya

Constant Value: 31

public static final int ORKHON

ISO 15924 script code

Constant Value: 88

public static final int OSAGE

ISO 15924 script code

Constant Value: 171

public static final int OSMANYA

Osmanya Script in Unicode 4

Constant Value: 50

public static final int PAHAWH_HMONG

ISO 15924 script code

Constant Value: 75

public static final int PALMYRENE

ISO 15924 script code

Constant Value: 144

public static final int PAU_CIN_HAU

ISO 15924 script code

Constant Value: 165

public static final int PHAGS_PA

ISO 15924 script code

Constant Value: 90

public static final int PHOENICIAN

ISO 15924 script code

Constant Value: 91

public static final int PHONETIC_POLLARD

ISO 15924 script code

Constant Value: 92

public static final int PSALTER_PAHLAVI

ISO 15924 script code

Constant Value: 123

public static final int REJANG

ISO 15924 script code

Constant Value: 110

public static final int RONGORONGO

ISO 15924 script code

Constant Value: 93

public static final int RUNIC

Runic

Constant Value: 32

public static final int SAMARITAN

ISO 15924 script code

Constant Value: 126

public static final int SARATI

ISO 15924 script code

Constant Value: 94

public static final int SAURASHTRA

ISO 15924 script code

Constant Value: 111

public static final int SHARADA

ISO 15924 script code

Constant Value: 151

public static final int SHAVIAN

Shavian Script in Unicode 4

Constant Value: 51

public static final int SIDDHAM

ISO 15924 script code

Constant Value: 166

public static final int SIGN_WRITING

ISO 15924 script code for Sutton SignWriting

Constant Value: 112

public static final int SIMPLIFIED_HAN

ISO 15924 script code

Constant Value: 73

public static final int SINDHI

ISO 15924 script code

Constant Value: 145

public static final int SINHALA

Sinhala

Constant Value: 33

public static final int SORA_SOMPENG

ISO 15924 script code

Constant Value: 152

public static final int SUNDANESE

ISO 15924 script code

Constant Value: 113

public static final int SYLOTI_NAGRI

Script in Unicode 4.1

Constant Value: 58

public static final int SYMBOLS

ISO 15924 script code

Constant Value: 129

public static final int SYMBOLS_EMOJI

ISO 15924 script code

Constant Value: 174

public static final int SYRIAC

Syriac

Constant Value: 34

public static final int TAGALOG

Tagalog

Constant Value: 42

public static final int TAGBANWA

Tagbanwa

Constant Value: 45

public static final int TAI_LE

Tai Le Script in Unicode 4

Constant Value: 52

public static final int TAI_VIET

ISO 15924 script code

Constant Value: 127

public static final int TAKRI

ISO 15924 script code

Constant Value: 153

public static final int TAMIL

Tamil

Constant Value: 35

public static final int TANGUT

ISO 15924 script code

Constant Value: 154

public static final int TELUGU

Telugu

Constant Value: 36

public static final int TENGWAR

ISO 15924 script code

Constant Value: 98

public static final int THAANA

Thaana

Constant Value: 37

public static final int THAI

Thai

Constant Value: 38

public static final int TIBETAN

Tibetan

Constant Value: 39

public static final int TIFINAGH

Script in Unicode 4.1

Constant Value: 60

public static final int TIRHUTA

ISO 15924 script code

Constant Value: 158

public static final int TRADITIONAL_HAN

ISO 15924 script code

Constant Value: 74

public static final int UCAS

Unified Canadian Aboriginal Syllabics (alias)

Constant Value: 40

public static final int UGARITIC

Ugaritic Script in Unicode 4

Constant Value: 53

public static final int UNKNOWN

ISO 15924 script code

Constant Value: 103

public static final int UNWRITTEN_LANGUAGES

ISO 15924 script code

Constant Value: 102

public static final int VAI

ISO 15924 script code

Constant Value: 99

public static final int VISIBLE_SPEECH

ISO 15924 script code

Constant Value: 100

public static final int WARANG_CITI

ISO 15924 script code

Constant Value: 146

public static final int WESTERN_SYRIAC

ISO 15924 script code

Constant Value: 96

public static final int WOLEAI

ISO 15924 script code

Constant Value: 155

public static final int YI

Yi syllables

Constant Value: 41
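Several constants in the listing above share a value, for example CANADIAN_ABORIGINAL and UCAS (40), MANDAEAN and MANDAIC (84), MEROITIC and MEROITIC_HIEROGLYPHS (86), MIAO and PHONETIC_POLLARD (92), and KHUDAWADI and SINDHI (145). The sketch below groups a hand-copied sample of name/value pairs by value to make such aliases visible. It does not call UScript; the class and method names are hypothetical, and the sample table is only a small excerpt of the constants on this page.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not part of UScript) that finds constants sharing one value. */
final class ScriptConstantAliases {
    private ScriptConstantAliases() {}

    /** A few name/value pairs copied by hand from the constant listing; not the full set. */
    static Map<String, Integer> sampleConstants() {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put("LATIN", 25);
        m.put("CANADIAN_ABORIGINAL", 40);
        m.put("UCAS", 40);
        m.put("MANDAEAN", 84);
        m.put("MANDAIC", 84);
        m.put("MEROITIC", 86);
        m.put("MEROITIC_HIEROGLYPHS", 86);
        m.put("MIAO", 92);
        m.put("PHONETIC_POLLARD", 92);
        m.put("KHUDAWADI", 145);
        m.put("SINDHI", 145);
        return m;
    }

    /** Groups names by value and keeps only the values used by two or more names. */
    static Map<Integer, List<String>> sharedValues(Map<String, Integer> constants) {
        Map<Integer, List<String>> byValue = new TreeMap<>();
        for (Map.Entry<String, Integer> e : constants.entrySet()) {
            byValue.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        // Drop every value that belongs to a single constant name.
        byValue.values().removeIf(names -> names.size() < 2);
        return byValue;
    }
}
```

One practical consequence: because the aliased constants are compile-time constants with equal values, a switch statement that lists both names of a pair as case labels does not compile (duplicate case label).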

Public Methods

public static boolean breaksBetweenLetters (int script)

Returns true if the script allows line breaks between letters (excluding hyphenation). Such a script typically requires dictionary-based line breaking. For example, Hani and Thai.

Parameters
script script code
Returns
  • true if the script allows line breaks between letters

public static int[] getCode (ULocale locale)

Gets the script codes associated with the given locale or ISO 15924 abbreviation or name. Returns MALAYALAM given "Malayalam" or "Mlym". Returns LATIN given "en" or "en_US".

Parameters
locale ULocale
Returns
  • The script codes array, or null if the code cannot be found.

public static int[] getCode (String nameOrAbbrOrLocale)

Gets the script codes associated with the given locale or ISO 15924 abbreviation or name. Returns MALAYALAM given "Malayalam" or "Mlym". Returns LATIN given "en" or "en_US".

Note: To search by short or long script alias only, use getCodeFromName(String) instead. That does a fast lookup with no access of the locale data.

Parameters
nameOrAbbrOrLocale name of the script or ISO 15924 code or locale
Returns
  • The script codes array, or null if the code cannot be found.

public static int[] getCode (Locale locale)

Gets the script codes associated with the given locale or ISO 15924 abbreviation or name. Returns MALAYALAM given "Malayalam" or "Mlym". Returns LATIN given "en" or "en_US".

Parameters
locale Locale
Returns
  • The script codes array, or null if the code cannot be found.
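Because every getCode overload may return null, callers usually need a guard before indexing the array. The following is a minimal sketch of such a guard. It does not call UScript: the class and method names are hypothetical, INVALID_CODE is mirrored locally from the constant listing (-1), and treating the first array element as the primary script is an assumption of this sketch, not something the documentation above promises.

```java
/** Hypothetical helper (not part of UScript) for handling getCode-style results. */
final class ScriptCodeResults {
    /** Local mirror of UScript.INVALID_CODE as listed on this page. */
    static final int INVALID_CODE = -1;

    private ScriptCodeResults() {}

    /**
     * Returns the first code of a getCode-style result, or INVALID_CODE
     * when the array is null or empty. "First means primary" is an assumption.
     */
    static int primaryOrInvalid(int[] codes) {
        if (codes == null || codes.length == 0) {
            return INVALID_CODE;
        }
        return codes[0];
    }
}
```

With ICU on the classpath, a call might look like primaryOrInvalid(UScript.getCode("en_US")); the helper itself only depends on the array it is given.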

public static int getCodeFromName (String nameOrAbbr)

Returns the script code associated with the given Unicode script property alias (name or abbreviation). Short aliases are ISO 15924 script codes. Returns MALAYALAM given "Malayalam" or "Mlym".

Parameters
nameOrAbbr name of the script or ISO 15924 code
Returns
  • The script code value, or INVALID_CODE if the code cannot be found.

public static String getName (int scriptCode)

Returns the long Unicode script name, if there is one. Otherwise returns the 4-letter ISO 15924 script code. Returns "Malayalam" given MALAYALAM.

Parameters
scriptCode int script code
Returns
  • long script name as given in PropertyValueAliases.txt, or the 4-letter code
Throws
IllegalArgumentException if the script code is not valid

public static String getSampleString (int script)

Returns the script sample character string. This string normally consists of one code point but might be longer. The string is empty if the script is not encoded.

Parameters
script script code
Returns
  • the sample character string

public static int getScript (int codepoint)

Gets the script code associated with the given codepoint. Returns UScript.MALAYALAM given 0x0D02.

Parameters
codepoint int code point
Returns
  • The script code
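For readers without ICU at hand, the same kind of code point to script lookup can be tried with the JDK's own Character.UnicodeScript. The sketch below is an analogue only: it uses the JDK enum rather than UScript's int constants, the wrapper class name is hypothetical, and the JDK's Unicode version may differ from the one ICU supports. Unlike the UNKNOWN handling described for UScript elsewhere on this page, Character.UnicodeScript.of throws IllegalArgumentException for an invalid code point.

```java
/** Hypothetical wrapper around the JDK's script lookup; an analogue of UScript.getScript, not ICU itself. */
final class JdkScriptLookup {
    private JdkScriptLookup() {}

    /** Returns the JDK's Script property value for the given code point. */
    static Character.UnicodeScript scriptOf(int codePoint) {
        return Character.UnicodeScript.of(codePoint);
    }

    public static void main(String[] args) {
        // U+0D02 MALAYALAM SIGN ANUSVARA, the code point used in the description above.
        System.out.println(scriptOf(0x0D02));
        // A Han ideograph: its script is Han (Hani), never Hans or Hant.
        System.out.println(scriptOf(0x4E2D));
    }
}
```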

public static int getScriptExtensions (int c, BitSet set)

Sets code point c's Script_Extensions as script code integers into the output BitSet.

  • If c does have Script_Extensions, then the return value is the negative number of Script_Extensions codes (= -set.cardinality()); in this case, the Script property value (normally Common or Inherited) is not included in the set.
  • If c does not have Script_Extensions, then the one Script code is put into the set and also returned.
  • If c is not a valid code point, then the one UNKNOWN code is put into the set and also returned.
In other words, if the return value is non-negative, it is c's single Script code and the set contains exactly this Script code. If the return value is -n, then the set contains c's n>=2 Script_Extensions script codes.

Some characters are commonly used in multiple scripts. For more information, see UAX #24: http://www.unicode.org/reports/tr24/.

Parameters
c code point
set set of script code integers; will be cleared, then bits are set corresponding to c's Script_Extensions
Returns
  • negative number of script codes in c's Script_Extensions, or the non-negative single Script value
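The return-value convention above can be captured in a small decoder. The following is a minimal sketch that works on a return value and BitSet the caller has already obtained, for example from int r = UScript.getScriptExtensions(c, set). It does not call UScript itself, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical helper (not part of UScript) for interpreting getScriptExtensions results. */
final class ScriptExtensionsDecoder {
    private ScriptExtensionsDecoder() {}

    /** True if the return value signals explicit Script_Extensions (a negative count). */
    static boolean hasExplicitExtensions(int returnValue) {
        return returnValue < 0;
    }

    /** Lists the script codes held in the output BitSet, in ascending order. */
    static List<Integer> codes(BitSet set) {
        List<Integer> out = new ArrayList<>();
        for (int i = set.nextSetBit(0); i >= 0; i = set.nextSetBit(i + 1)) {
            out.add(i);
        }
        return out;
    }

    /** Checks the documented relationship between the return value and the set. */
    static boolean isConsistent(int returnValue, BitSet set) {
        if (returnValue < 0) {
            // -n means the set holds exactly n >= 2 Script_Extensions codes.
            return set.cardinality() == -returnValue && set.cardinality() >= 2;
        }
        // Non-negative: the set holds exactly the single Script code that was returned.
        return set.cardinality() == 1 && set.get(returnValue);
    }
}
```

In both cases the set contains every script the code point is used with, so testing set.get(sc) after the call gives the same kind of answer as hasScript(c, sc).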

public static String getShortName (int scriptCode)

Returns the 4-letter ISO 15924 script code, which is the same as the short Unicode script name if Unicode has names for the script. Returns "Mlym" given MALAYALAM.

Parameters
scriptCode int script code
Returns
  • short script name (4-letter code)
Throws
IllegalArgumentException if the script code is not valid

public static UScript.ScriptUsage getUsage (int script)

Returns the script usage according to UAX #31 Unicode Identifier and Pattern Syntax. Returns NOT_ENCODED if the script is not encoded in Unicode.

Parameters
script script code
Returns
  • script usage

public static boolean hasScript (int c, int sc)

Do the Script_Extensions of code point c contain script sc? If c does not have explicit Script_Extensions, then this tests whether c has the Script property value sc.

Some characters are commonly used in multiple scripts. For more information, see UAX #24: http://www.unicode.org/reports/tr24/.

Parameters
c code point
sc script code
Returns
  • true if sc is in Script_Extensions(c)

public static boolean isCased (int script)

Returns true if, in modern (or most recent) usage of the script, case distinctions are customary. For example, Latn and Cyrl.

Parameters
script script code
Returns
  • true if the script is cased

public static boolean isRightToLeft (int script)

Returns true if the script is written right-to-left. For example, Arab and Hebr.

Parameters
script script code
Returns
  • true if the script is right-to-left