Abstract Class Collator in Java

The Collator class compares strings in a manner that is appropriate for a particular locale. Although Collator is an abstract class, the getInstance() factory methods can be used to get a usable instance of a Collator subclass that implements a particular collation strategy. One subclass, RuleBasedCollator, is provided as part of the JDK.

A Collator object has a strength property that controls the level of difference that is considered significant for comparison purposes. The Collator class provides four strength values: PRIMARY, SECONDARY, TERTIARY, and IDENTICAL. Although the interpretation of these strengths is locale-dependent, they generally have the following meanings:

PRIMARY – The comparison considers letter differences but ignores case and diacriticals.
SECONDARY – The comparison considers letter differences and diacriticals but ignores case.
TERTIARY – The comparison considers letter differences, cases, and diacriticals.
IDENTICAL – The comparison considers all differences.

The default comparison strength is TERTIARY.
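
The effect of the strength levels can be seen with a small sketch that compares the same strings at each setting. The class name StrengthDemo, the helper compareAt(), and the choice of Locale.US are assumptions made for this illustration only; the results for other locales may differ, because the interpretation of each strength is locale-dependent.

```java
import java.text.Collator;
import java.util.Locale;

// Illustrative sketch: the class and method names here are hypothetical.
class StrengthDemo {

    // Compares a and b with a US-English collator set to the given strength.
    static int compareAt(int strength, String a, String b) {
        Collator collator = Collator.getInstance(Locale.US);
        collator.setStrength(strength);
        return collator.compare(a, b);
    }

    public static void main(String[] args) {
        String plain = "resume";
        String accented = "r\u00e9sum\u00e9"; // "resume" written with acute accents
        String upper = "RESUME";

        // A result of 0 means the collator treats the two strings as equal.
        System.out.println("PRIMARY,   case:   " + compareAt(Collator.PRIMARY, plain, upper));
        System.out.println("PRIMARY,   accent: " + compareAt(Collator.PRIMARY, plain, accented));
        System.out.println("SECONDARY, case:   " + compareAt(Collator.SECONDARY, plain, upper));
        System.out.println("SECONDARY, accent: " + compareAt(Collator.SECONDARY, plain, accented));
        System.out.println("TERTIARY,  case:   " + compareAt(Collator.TERTIARY, plain, upper));
    }
}
```

With the US-English rules, only the TERTIARY comparison should distinguish "resume" from "RESUME", while the accented form is already distinguished at SECONDARY.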

If we only need to compare two String objects once, the compare() method of the Collator class provides the best performance. However, if we need to compare the same String objects multiple times, such as when we are sorting, we should use CollationKey objects instead.

A CollationKey object contains a String that has been converted into a series of bits that can be compared in a bitwise fashion against other CollationKey objects. We use a Collator object to create a CollationKey for a given String.
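
As a rough sketch of the sorting case described above, the following code builds one CollationKey per string and sorts the keys instead of the strings. The class name CollationKeySort and the helper sortWithKeys() are hypothetical names chosen for this example.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustrative sketch: the class and method names here are hypothetical.
class CollationKeySort {

    // Builds one CollationKey per word up front, so each string is converted
    // only once no matter how many comparisons the sort performs.
    static List<String> sortWithKeys(List<String> words, Collator collator) {
        CollationKey[] keys = new CollationKey[words.size()];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = collator.getCollationKey(words.get(i));
        }
        Arrays.sort(keys); // CollationKey implements Comparable<CollationKey>

        List<String> sorted = new ArrayList<String>();
        for (CollationKey key : keys) {
            sorted.add(key.getSourceString()); // recover the original string
        }
        return sorted;
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.US);
        System.out.println(sortWithKeys(Arrays.asList("banana", "Cherry", "apple"), collator));
    }
}
```

Keys are only comparable with keys produced by the same Collator, so the sketch creates all of them from the single collator passed in.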

The structure of the abstract class Collator is as follows:

public abstract class java.text.Collator extends java.lang.Object
implements java.io.Serializable,
java.lang.Cloneable {
// Constants
public static final int CANONICAL_DECOMPOSITION;
public static final int FULL_DECOMPOSITION;
public static final int IDENTICAL;
public static final int NO_DECOMPOSITION;
public static final int PRIMARY;
public static final int SECONDARY;
public static final int TERTIARY;
// Constructors
protected Collator();
// Class Methods
public static synchronized Locale[] getAvailableLocales();
public static synchronized Collator getInstance();
public static synchronized Collator getInstance(Locale desiredLocale);
// Instance Methods
public Object clone();
public abstract int compare(String source, String target);
public boolean equals(Object that);
public boolean equals(String source, String target);
public abstract CollationKey getCollationKey(String source);
public synchronized int getDecomposition();
public synchronized int getStrength();
public abstract synchronized int hashCode();
public synchronized void setDecomposition(int decompositionMode);
public synchronized void setStrength(int newStrength);
}

The details of the class structure are given as follows:

public static final int CANONICAL_DECOMPOSITION;

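public static final int CANONICAL_DECOMPOSITION represents a decomposition constant that specifies that Unicode 2.0 characters which are canonical variants are decomposed for collation. This is the decomposition mode set by the protected Collator() constructor. Instances returned by getInstance() may be configured with a different mode, so call getDecomposition() when the mode matters.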

public static final int FULL_DECOMPOSITION;

public static final int FULL_DECOMPOSITION represents a decomposition constant that specifies that Unicode 2.0 canonical variants and compatibility variants are decomposed for collation. This is the most complete decomposition setting, and thus the slowest setting.

public static final int IDENTICAL;

public static final int IDENTICAL represents a strength constant that specifies that all differences are considered significant for comparison purposes.

public static final int NO_DECOMPOSITION;

public static final int NO_DECOMPOSITION represents a decomposition setting that specifies that no Unicode characters are decomposed for collation. This is the least complete decomposition setting, and thus the fastest setting. It only works correctly for languages that do not use diacriticals.

public static final int PRIMARY;

public static final int PRIMARY represents a strength constant that specifies that only primary differences are considered significant for comparison purposes. Primary differences are typically letter differences.

public static final int SECONDARY;

public static final int SECONDARY represents a strength constant that specifies that only secondary differences and above are considered significant for comparison purposes. Secondary differences are typically differences in diacriticals or accents.

public static final int TERTIARY;

public static final int TERTIARY represents a strength constant that specifies that only tertiary differences and above are considered significant for comparison purposes. Tertiary differences are typically differences in case. This is the default strength setting.

protected Collator();

protected Collator() constructor creates a Collator with the default strength of TERTIARY and default decomposition mode of CANONICAL_DECOMPOSITION.

public static synchronized Locale[] getAvailableLocales();

public static synchronized Locale[] getAvailableLocales() method returns an array of the Locale objects for which this class can create Collator objects.

This method returns an array of Locale objects.
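
A minimal sketch of how the returned array might be used to check for collation support before requesting a Collator. The class name AvailableCollatorLocales and the helper isSupported() are hypothetical; the set of locales reported depends on the Java runtime in use.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

// Illustrative sketch: the class and method names here are hypothetical.
class AvailableCollatorLocales {

    // Reports whether the runtime lists the given locale among those
    // for which it can create Collator objects.
    static boolean isSupported(Locale locale) {
        return Arrays.asList(Collator.getAvailableLocales()).contains(locale);
    }

    public static void main(String[] args) {
        Locale[] locales = Collator.getAvailableLocales();
        System.out.println("Locales with collation support: " + locales.length);
        System.out.println("US English supported: " + isSupported(Locale.US));
    }
}
```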

public static synchronized Collator getInstance();

public static synchronized Collator getInstance() method creates a Collator that compares strings in the default Locale.

This method returns a Collator appropriate for the default Locale.

public static synchronized Collator getInstance(Locale desiredLocale);

public static synchronized Collator getInstance(Locale desiredLocale) method creates a Collator that compares strings in the given Locale.

This method returns a Collator appropriate for the given Locale.

Parameter
desiredLocale – The Locale to use.
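
The sketch below requests a Collator for a specific Locale and uses it to sort a list. It assumes Java 8 or later, where List.sort() is available and Collator implements Comparator (the interface is not shown in the class structure listed above). The class name LocaleSort and the helper sortFor() are hypothetical.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Illustrative sketch: the class and method names here are hypothetical.
class LocaleSort {

    // Returns a sorted copy of words, ordered by the rules of the given locale.
    static List<String> sortFor(Locale locale, List<String> words) {
        Collator collator = Collator.getInstance(locale);
        List<String> sorted = new ArrayList<String>(words);
        sorted.sort(collator); // assumes Java 8+; Collator acts as the Comparator
        return sorted;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("banana", "Cherry", "apple");

        // Natural String ordering compares character codes, so uppercase sorts first.
        List<String> natural = new ArrayList<String>(words);
        Collections.sort(natural);
        System.out.println("Natural order:  " + natural);

        System.out.println("Collated order: " + sortFor(Locale.US, words));
    }
}
```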

public Object clone();

public Object clone() method creates a copy of this Collator and returns it.

This method returns a copy of this Collator.
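
One possible use of clone() is to derive a differently configured Collator without disturbing the original. The following sketch assumes that use case; the class name CloneDemo and the helper primaryCopyOf() are hypothetical.

```java
import java.text.Collator;
import java.util.Locale;

// Illustrative sketch: the class and method names here are hypothetical.
class CloneDemo {

    // Returns a PRIMARY-strength copy; the original keeps its own settings.
    static Collator primaryCopyOf(Collator original) {
        Collator copy = (Collator) original.clone(); // clone() is declared to return Object
        copy.setStrength(Collator.PRIMARY);
        return copy;
    }

    public static void main(String[] args) {
        Collator base = Collator.getInstance(Locale.US);
        base.setStrength(Collator.TERTIARY);
        Collator relaxed = primaryCopyOf(base);

        System.out.println("Original still TERTIARY: " + (base.getStrength() == Collator.TERTIARY));
        System.out.println("Copy is PRIMARY: " + (relaxed.getStrength() == Collator.PRIMARY));
    }
}
```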

public abstract int compare(String source, String target);

public abstract int compare(String source, String target) method compares the given strings according to the collation rules for this Collator and returns a value that indicates their relationship. If either of the strings is compared more than once, a CollationKey should be used instead.

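This method returns a negative value if the source is less than the target, 0 if the strings are equal, or a positive value if the source is greater than the target. It is safest to test the sign of the result rather than compare it with -1 or 1.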

Parameter
source – The source string.
target – The target string.
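
A short sketch of interpreting the value returned by compare(). It checks only the sign of the result. The class name CompareDemo and the helper describe() are hypothetical.

```java
import java.text.Collator;
import java.util.Locale;

// Illustrative sketch: the class and method names here are hypothetical.
class CompareDemo {

    // Turns the result of compare() into a readable description.
    static String describe(Collator collator, String source, String target) {
        int result = collator.compare(source, target);
        if (result < 0) {
            return source + " sorts before " + target;
        }
        if (result > 0) {
            return source + " sorts after " + target;
        }
        return source + " and " + target + " are equal for this collator";
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.US);
        System.out.println(describe(collator, "apple", "banana"));
        System.out.println(describe(collator, "banana", "apple"));
    }
}
```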

public boolean equals(Object that);

public boolean equals(Object that) method returns true if that is an instance of Collator and is equivalent to this Collator.

This method returns true if the objects are equal; false if they are not.

Parameter
that – The object to be compared with this object.
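
The sketch below illustrates what "equivalent" can mean in practice: two collators obtained for the same locale, and how a changed strength affects the result. The behavior shown is what we would expect from the RuleBasedCollator instances that getInstance() returns in the JDK; the class name CollatorEqualsDemo and its helper methods are hypothetical.

```java
import java.text.Collator;
import java.util.Locale;

// Illustrative sketch: the class and method names here are hypothetical.
class CollatorEqualsDemo {

    // Compares two freshly requested collators for the same locale.
    static boolean freshInstancesEqual(Locale locale) {
        Collator first = Collator.getInstance(locale);
        Collator second = Collator.getInstance(locale);
        return first.equals(second);
    }

    // Compares two collators for the same locale after giving the
    // second one a strength that differs from the first.
    static boolean equalAfterStrengthChange(Locale locale) {
        Collator first = Collator.getInstance(locale);
        Collator second = Collator.getInstance(locale);
        int other = (first.getStrength() == Collator.PRIMARY)
                ? Collator.SECONDARY
                : Collator.PRIMARY;
        second.setStrength(other);
        return first.equals(second);
    }

    public static void main(String[] args) {
        System.out.println("Fresh instances equal: " + freshInstancesEqual(Locale.US));
        System.out.println("Equal after strength change: " + equalAfterStrengthChange(Locale.US));
    }
}
```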

public boolean equals(String source, String target);

public boolean equals(String source, String target) method compares the given strings for equality using the collation rules for this Collator.

Note: This method applies locale-specific rules and is thus not the same as String.equals().

This method returns true if the given strings are equal; false otherwise.

Parameter
source – The source string.
target – The target string.
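
To illustrate the difference from String.equals(), the following sketch compares two strings that differ only in case, using a collator set to PRIMARY strength. The class name CollatorStringEquals, the helper looselyEqual(), and the use of Locale.US are assumptions made for this example.

```java
import java.text.Collator;
import java.util.Locale;

// Illustrative sketch: the class and method names here are hypothetical.
class CollatorStringEquals {

    // Compares two strings for equality at PRIMARY strength, so that
    // differences in case and diacriticals are ignored.
    static boolean looselyEqual(String a, String b, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(Collator.PRIMARY);
        return collator.equals(a, b);
    }

    public static void main(String[] args) {
        System.out.println("String.equals():   " + "hello".equals("HELLO"));
        System.out.println("Collator.equals(): " + looselyEqual("hello", "HELLO", Locale.US));
    }
}
```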

public abstract CollationKey getCollationKey(String source);

public abstract CollationKey getCollationKey(String source) method generates a CollationKey for the given string. The returned object can be compared with other CollationKey objects using CollationKey.compareTo(). This comparison is faster than using Collator.compare(), so if the same string is used for many comparisons, we should use CollationKey objects.

This method returns a CollationKey for the given string.

Parameter
source – The string to use when generating the CollationKey.
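
A small sketch that checks that comparing two CollationKey objects gives the same ordering as calling compare() on the original strings. The class name CollationKeyDemo and the helper keysAgreeWithCompare() are hypothetical.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.Locale;

// Illustrative sketch: the class and method names here are hypothetical.
class CollationKeyDemo {

    // Returns true if the keys order the two strings the same way
    // that Collator.compare() does (same sign, or both zero).
    static boolean keysAgreeWithCompare(Collator collator, String a, String b) {
        CollationKey keyA = collator.getCollationKey(a);
        CollationKey keyB = collator.getCollationKey(b);
        int byKey = Integer.signum(keyA.compareTo(keyB));
        int byCollator = Integer.signum(collator.compare(a, b));
        return byKey == byCollator;
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.US);
        System.out.println(keysAgreeWithCompare(collator, "apple", "banana"));
        System.out.println(keysAgreeWithCompare(collator, "abc", "ABC"));
    }
}
```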

public synchronized int getDecomposition();

public synchronized int getDecomposition() method returns the current decomposition mode for this Collator. The decomposition mode specifies how composed Unicode characters are handled during collation. We can adjust the decomposition mode to choose between faster and more complete collation. The returned value is one of the following values: NO_DECOMPOSITION, CANONICAL_DECOMPOSITION, or FULL_DECOMPOSITION.

This method returns the decomposition mode for this Collator.

public synchronized int getStrength();

public synchronized int getStrength() method returns the current strength setting for this Collator. The strength specifies the minimum level of difference that is considered significant during collation. The returned value is one of the following values: PRIMARY, SECONDARY, TERTIARY, or IDENTICAL.

This method returns the strength setting for this Collator.

public abstract synchronized int hashCode();

public abstract synchronized int hashCode() method returns a hashcode for this Collator.

This method returns a hashcode for this object.

public synchronized void setDecomposition(int decompositionMode);

public synchronized void setDecomposition(int decompositionMode) method sets the decomposition mode for this collator. The decomposition mode specifies how composed Unicode characters are handled during collation. We can adjust the decomposition mode to choose between faster and more complete collation.

Parameter
decompositionMode – The decomposition mode: NO_DECOMPOSITION, CANONICAL_DECOMPOSITION, or FULL_DECOMPOSITION.
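
As a sketch of what canonical decomposition provides, the code below compares a precomposed accented letter with the same letter written as a base character followed by a combining accent. The class name DecompositionDemo, its helpers, and the use of Locale.US with TERTIARY strength are assumptions made for this example.

```java
import java.text.Collator;
import java.util.Locale;

// Illustrative sketch: the class and method names here are hypothetical.
class DecompositionDemo {

    // Creates a US-English collator with the given decomposition mode.
    static Collator withMode(int decompositionMode) {
        Collator collator = Collator.getInstance(Locale.US);
        collator.setStrength(Collator.TERTIARY);
        collator.setDecomposition(decompositionMode);
        return collator;
    }

    // Returns true if the two strings compare as equal under canonical decomposition.
    static boolean canonicallyEqual(String a, String b) {
        return withMode(Collator.CANONICAL_DECOMPOSITION).compare(a, b) == 0;
    }

    public static void main(String[] args) {
        String precomposed = "\u00e9"; // single code point: e with acute accent
        String combining = "e\u0301";  // letter e followed by a combining acute accent
        System.out.println("Equal under canonical decomposition: "
                + canonicallyEqual(precomposed, combining));
    }
}
```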

public synchronized void setStrength(int newStrength);

public synchronized void setStrength(int newStrength) method sets the strength of this Collator. The strength specifies the minimum level of difference that is considered significant during collation.

Parameter
newStrength – The new strength setting: PRIMARY, SECONDARY, TERTIARY, or IDENTICAL.
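
One way setStrength() might be used is to build a collection that ignores case and diacriticals. The sketch below passes a PRIMARY-strength collator to a TreeSet as its comparator, which assumes a Java version where Collator implements Comparator (Java 1.2 or later). The class name CaseInsensitiveSet and the helper create() are hypothetical.

```java
import java.text.Collator;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch: the class and method names here are hypothetical.
class CaseInsensitiveSet {

    // Creates a sorted set whose notion of "same element" follows a
    // PRIMARY-strength collator for the given locale.
    static Set<String> create(Locale locale) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(Collator.PRIMARY);
        return new TreeSet<String>(collator);
    }

    public static void main(String[] args) {
        Set<String> names = create(Locale.US);
        names.add("hello");
        System.out.println("Contains HELLO: " + names.contains("HELLO"));
        System.out.println("Added Hello as new element: " + names.add("Hello"));
    }
}
```

The collator handed to the set should not be reconfigured afterwards, since changing its strength would change the ordering the set relies on.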

Apart from these, the Collator class also inherits the following methods from class Object:

  • finalize()
  • notifyAll()
  • wait()
  • wait(long, int)
  • getClass()
  • notify()
  • toString()
  • wait(long)
