NumericShaper

public final class NumericShaper extends Object
implements Serializable

The NumericShaper class is used to convert Latin-1 (European) digits to other Unicode decimal digits. Users of this class will primarily be people who wish to present data using national digit shapes, but find it more convenient to represent the data internally using Latin-1 (European) digits. This does not interpret the deprecated numeric shape selector character (U+206E).

Instances of NumericShaper are typically applied as attributes to text with the NUMERIC_SHAPING attribute of the TextAttribute class. For example, this code snippet causes a TextLayout to shape European digits to Arabic in an Arabic context:

 Map map = new HashMap();
 map.put(TextAttribute.NUMERIC_SHAPING,
     NumericShaper.getContextualShaper(NumericShaper.ARABIC));
 FontRenderContext frc = ...;
 TextLayout layout = new TextLayout(text, map, frc);
 layout.draw(g2d, x, y);
 

It is also possible to perform numeric shaping explicitly using instances of NumericShaper, as this code snippet demonstrates:
 char[] text = ...;
 // shape all EUROPEAN digits (except zero) to ARABIC digits
 NumericShaper shaper = NumericShaper.getShaper(NumericShaper.ARABIC);
 shaper.shape(text, start, count);

 // shape European digits to ARABIC digits if preceding text is Arabic, or
 // shape European digits to TAMIL digits if preceding text is Tamil, or
 // leave European digits alone if there is no preceding text, or
 // preceding text is neither Arabic nor Tamil
 NumericShaper shaper =
     NumericShaper.getContextualShaper(NumericShaper.ARABIC |
                                         NumericShaper.TAMIL,
                                       NumericShaper.EUROPEAN);
 shaper.shape(text, start, count);
 

Bit mask- and enum-based Unicode ranges

This class supports two different programming interfaces to represent Unicode ranges for script-specific digits: bit mask-based ones, such as NumericShaper.ARABIC, and enum-based ones, such as NumericShaper.Range.ARABIC. Multiple ranges can be specified by ORing bit mask-based constants, such as:

 NumericShaper.ARABIC | NumericShaper.TAMIL
 
or creating a Set with the NumericShaper.Range constants, such as:
 EnumSet.of(NumericShaper.Scirpt.ARABIC, NumericShaper.Range.TAMIL)
 
The enum-based ranges are a super set of the bit mask-based ones.

If the two interfaces are mixed (including serialization), Unicode range values are mapped to their counterparts where such mapping is possible, such as NumericShaper.Range.ARABIC from/to NumericShaper.ARABIC. If any unmappable range values are specified, such as NumericShaper.Range.BALINESE, those ranges are ignored.

Decimal Digits Precedence

A Unicode range may have more than one set of decimal digits. If multiple decimal digits sets are specified for the same Unicode range, one of the sets will take precedence as follows.

Unicode Range NumericShaper Constants Precedence
Arabic NumericShaper.ARABIC
NumericShaper.EASTERN_ARABIC
NumericShaper.EASTERN_ARABIC
NumericShaper.Range.ARABIC
NumericShaper.Range.EASTERN_ARABIC
NumericShaper.Range.EASTERN_ARABIC
Tai Tham NumericShaper.Range.TAI_THAM_HORA
NumericShaper.Range.TAI_THAM_THAM
NumericShaper.Range.TAI_THAM_THAM

Nested Class Summary

enum NumericShaper.Range A NumericShaper.Range represents a Unicode range of a script having its own decimal digits. 

Constant Summary

int ALL_RANGES Identifies all ranges, for full contextual shaping.
int ARABIC Identifies the ARABIC range and decimal base.
int BENGALI Identifies the BENGALI range and decimal base.
int DEVANAGARI Identifies the DEVANAGARI range and decimal base.
int EASTERN_ARABIC Identifies the ARABIC range and ARABIC_EXTENDED decimal base.
int ETHIOPIC Identifies the ETHIOPIC range and decimal base.
int EUROPEAN Identifies the Latin-1 (European) and extended range, and Latin-1 (European) decimal base.
int GUJARATI Identifies the GUJARATI range and decimal base.
int GURMUKHI Identifies the GURMUKHI range and decimal base.
int KANNADA Identifies the KANNADA range and decimal base.
int KHMER Identifies the KHMER range and decimal base.
int LAO Identifies the LAO range and decimal base.
int MALAYALAM Identifies the MALAYALAM range and decimal base.
int MONGOLIAN Identifies the MONGOLIAN range and decimal base.
int MYANMAR Identifies the MYANMAR range and decimal base.
int ORIYA Identifies the ORIYA range and decimal base.
int TAMIL Identifies the TAMIL range and decimal base.
int TELUGU Identifies the TELUGU range and decimal base.
int THAI Identifies the THAI range and decimal base.
int TIBETAN Identifies the TIBETAN range and decimal base.

Public Method Summary

boolean
equals(Object o)
Returns true if the specified object is an instance of NumericShaper and shapes identically to this one, regardless of the range representations, the bit mask or the enum.
static NumericShaper
getContextualShaper(Set<NumericShaper.Range> ranges)
Returns a contextual shaper for the provided Unicode range(s).
static NumericShaper
getContextualShaper(int ranges, int defaultContext)
Returns a contextual shaper for the provided unicode range(s).
static NumericShaper
getContextualShaper(Set<NumericShaper.Range> ranges, NumericShaper.Range defaultContext)
Returns a contextual shaper for the provided Unicode range(s).
static NumericShaper
getContextualShaper(int ranges)
Returns a contextual shaper for the provided unicode range(s).
Set<NumericShaper.Range>
getRangeSet()
Returns a Set representing all the Unicode ranges in this NumericShaper that will be shaped.
int
getRanges()
Returns an int that ORs together the values for all the ranges that will be shaped.
static NumericShaper
getShaper(int singleRange)
Returns a shaper for the provided unicode range.
static NumericShaper
getShaper(NumericShaper.Range singleRange)
Returns a shaper for the provided Unicode range.
int
hashCode()
Returns a hash code for this shaper.
boolean
isContextual()
Returns a boolean indicating whether or not this shaper shapes contextually.
void
shape(char[] text, int start, int count)
Converts the digits in the text that occur between start and start + count.
void
shape(char[] text, int start, int count, NumericShaper.Range context)
Converts the digits in the text that occur between start and start + count, using the provided context.
void
shape(char[] text, int start, int count, int context)
Converts the digits in the text that occur between start and start + count, using the provided context.
String
toString()
Returns a String that describes this shaper.

Inherited Method Summary

Constants

public static final int ALL_RANGES

Identifies all ranges, for full contextual shaping.

This constant specifies all of the bit mask-based ranges. Use EmunSet.allOf(NumericShaper.Range.class) to specify all of the enum-based ranges.

Constant Value: 524287

public static final int ARABIC

Identifies the ARABIC range and decimal base.

Constant Value: 2

public static final int BENGALI

Identifies the BENGALI range and decimal base.

Constant Value: 16

public static final int DEVANAGARI

Identifies the DEVANAGARI range and decimal base.

Constant Value: 8

public static final int EASTERN_ARABIC

Identifies the ARABIC range and ARABIC_EXTENDED decimal base.

Constant Value: 4

public static final int ETHIOPIC

Identifies the ETHIOPIC range and decimal base.

Constant Value: 65536

public static final int EUROPEAN

Identifies the Latin-1 (European) and extended range, and Latin-1 (European) decimal base.

Constant Value: 1

public static final int GUJARATI

Identifies the GUJARATI range and decimal base.

Constant Value: 64

public static final int GURMUKHI

Identifies the GURMUKHI range and decimal base.

Constant Value: 32

public static final int KANNADA

Identifies the KANNADA range and decimal base.

Constant Value: 1024

public static final int KHMER

Identifies the KHMER range and decimal base.

Constant Value: 131072

public static final int LAO

Identifies the LAO range and decimal base.

Constant Value: 8192

public static final int MALAYALAM

Identifies the MALAYALAM range and decimal base.

Constant Value: 2048

public static final int MONGOLIAN

Identifies the MONGOLIAN range and decimal base.

Constant Value: 262144

public static final int MYANMAR

Identifies the MYANMAR range and decimal base.

Constant Value: 32768

public static final int ORIYA

Identifies the ORIYA range and decimal base.

Constant Value: 128

public static final int TAMIL

Identifies the TAMIL range and decimal base.

Constant Value: 256

public static final int TELUGU

Identifies the TELUGU range and decimal base.

Constant Value: 512

public static final int THAI

Identifies the THAI range and decimal base.

Constant Value: 4096

public static final int TIBETAN

Identifies the TIBETAN range and decimal base.

Constant Value: 16384

Public Methods

public boolean equals (Object o)

Returns true if the specified object is an instance of NumericShaper and shapes identically to this one, regardless of the range representations, the bit mask or the enum. For example, the following code produces "true".

 NumericShaper ns1 = NumericShaper.getShaper(NumericShaper.ARABIC);
 NumericShaper ns2 = NumericShaper.getShaper(NumericShaper.Range.ARABIC);
 System.out.println(ns1.equals(ns2));
 

Parameters
o the specified object to compare to this NumericShaper
Returns
  • true if o is an instance of NumericShaper and shapes in the same way; false otherwise.

public static NumericShaper getContextualShaper (Set<NumericShaper.Range> ranges)

Returns a contextual shaper for the provided Unicode range(s). The Latin-1 (EUROPEAN) digits are converted to the decimal digits corresponding to the range of the preceding text, if the range is one of the provided ranges.

The shaper assumes EUROPEAN as the starting context, that is, if EUROPEAN digits are encountered before any strong directional text in the string, the context is presumed to be EUROPEAN, and so the digits will not shape.

Parameters
ranges the specified Unicode ranges
Returns
  • a contextual shaper for the specified ranges
Throws
NullPointerException if ranges is null.

public static NumericShaper getContextualShaper (int ranges, int defaultContext)

Returns a contextual shaper for the provided unicode range(s). Latin-1 (EUROPEAN) digits will be converted to the decimal digits corresponding to the range of the preceding text, if the range is one of the provided ranges. Multiple ranges are represented by or-ing the values together, for example, NumericShaper.ARABIC | NumericShaper.THAI. The shaper uses defaultContext as the starting context.

Parameters
ranges the specified Unicode ranges
defaultContext the starting context, such as NumericShaper.EUROPEAN
Returns
  • a shaper for the specified Unicode ranges.
Throws
IllegalArgumentException if the specified defaultContext is not a single valid range.

public static NumericShaper getContextualShaper (Set<NumericShaper.Range> ranges, NumericShaper.Range defaultContext)

Returns a contextual shaper for the provided Unicode range(s). The Latin-1 (EUROPEAN) digits will be converted to the decimal digits corresponding to the range of the preceding text, if the range is one of the provided ranges. The shaper uses defaultContext as the starting context.

Parameters
ranges the specified Unicode ranges
defaultContext the starting context, such as NumericShaper.Range.EUROPEAN
Returns
  • a contextual shaper for the specified Unicode ranges.
Throws
NullPointerException if ranges or defaultContext is null

public static NumericShaper getContextualShaper (int ranges)

Returns a contextual shaper for the provided unicode range(s). Latin-1 (EUROPEAN) digits are converted to the decimal digits corresponding to the range of the preceding text, if the range is one of the provided ranges. Multiple ranges are represented by or-ing the values together, such as, NumericShaper.ARABIC | NumericShaper.THAI. The shaper assumes EUROPEAN as the starting context, that is, if EUROPEAN digits are encountered before any strong directional text in the string, the context is presumed to be EUROPEAN, and so the digits will not shape.

Parameters
ranges the specified Unicode ranges
Returns
  • a shaper for the specified ranges

public Set<NumericShaper.Range> getRangeSet ()

Returns a Set representing all the Unicode ranges in this NumericShaper that will be shaped.

Returns
  • all the Unicode ranges to be shaped.

public int getRanges ()

Returns an int that ORs together the values for all the ranges that will be shaped.

For example, to check if a shaper shapes to Arabic, you would use the following:

if ((shaper.getRanges() & shaper.ARABIC) != 0) &#123; ...

Note that this method supports only the bit mask-based ranges. Call getRangeSet() for the enum-based ranges.

Returns
  • the values for all the ranges to be shaped.

public static NumericShaper getShaper (int singleRange)

Returns a shaper for the provided unicode range. All Latin-1 (EUROPEAN) digits are converted to the corresponding decimal unicode digits.

Parameters
singleRange the specified Unicode range
Returns
  • a non-contextual numeric shaper
Throws
IllegalArgumentException if the range is not a single range

public static NumericShaper getShaper (NumericShaper.Range singleRange)

Returns a shaper for the provided Unicode range. All Latin-1 (EUROPEAN) digits are converted to the corresponding decimal digits of the specified Unicode range.

Parameters
singleRange the Unicode range given by a NumericShaper.Range constant.
Returns
  • a non-contextual NumericShaper.
Throws
NullPointerException if singleRange is null

public int hashCode ()

Returns a hash code for this shaper.

Returns
  • this shaper's hash code.

public boolean isContextual ()

Returns a boolean indicating whether or not this shaper shapes contextually.

Returns
  • true if this shaper is contextual; false otherwise.

public void shape (char[] text, int start, int count)

Converts the digits in the text that occur between start and start + count.

Parameters
text an array of characters to convert
start the index into text to start converting
count the number of characters in text to convert
Throws
IndexOutOfBoundsException if start or start + count is out of bounds
NullPointerException if text is null

public void shape (char[] text, int start, int count, NumericShaper.Range context)

Converts the digits in the text that occur between start and start + count, using the provided context. Context is ignored if the shaper is not a contextual shaper.

Parameters
text a char array
start the index into text to start converting
count the number of chars in text to convert
context the context to which to convert the characters, such as NumericShaper.Range.EUROPEAN
Throws
IndexOutOfBoundsException if start or start + count is out of bounds
NullPointerException if text or context is null

public void shape (char[] text, int start, int count, int context)

Converts the digits in the text that occur between start and start + count, using the provided context. Context is ignored if the shaper is not a contextual shaper.

Parameters
text an array of characters
start the index into text to start converting
count the number of characters in text to convert
context the context to which to convert the characters, such as NumericShaper.EUROPEAN
Throws
IndexOutOfBoundsException if start or start + count is out of bounds
NullPointerException if text is null
IllegalArgumentException if this is a contextual shaper and the specified context is not a single valid range.

public String toString ()

Returns a String that describes this shaper. This method is used for debugging purposes only.

Returns
  • a String describing this shaper.