uk.org.blankaspect.nlf
Class NlfUtilities

java.lang.Object
  extended by uk.org.blankaspect.nlf.NlfUtilities

public class NlfUtilities
extends java.lang.Object

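This class contains utility methods for encoding and decoding UTF-8 sequences, for measuring the encoded length of a string, and for testing whether a Unicode code point may appear in an identifier or an attribute name.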

Since:
1.0

Method Summary
static int getUtf8Length(java.lang.String str)
          Returns the length of the UTF-8 sequence that results from encoding the specified string.
static boolean isNameChar(int codePoint)
          Determines whether the specified Unicode code point is allowed to be the second or subsequent character in an identifier or an attribute name.
static boolean isNameStartChar(int codePoint)
          Determines whether the specified Unicode code point is allowed to be the first character in an identifier or an attribute name.
static boolean isUtf8LengthWithinBounds(java.lang.String str, int minLength, int maxLength)
          Tests whether the length of the UTF-8 sequence that results from encoding the specified string is within the specified bounds.
static byte[] stringToUtf8(java.lang.String str)
          Encodes the specified string as a UTF-8 sequence.
static java.lang.String utf8ToString(byte[] data)
          Decodes the specified UTF-8 sequence to a string.
static java.lang.String utf8ToString(byte[] data, int offset, int length)
          Decodes the specified UTF-8 sequence to a string.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

utf8ToString

public static java.lang.String utf8ToString(byte[] data)
                                     throws java.lang.IllegalArgumentException,
                                            NlfUncheckedException
Decodes the specified UTF-8 sequence to a string. An IllegalArgumentException is thrown if the UTF-8 sequence is malformed or contains bytes that cannot be mapped to a character. This method is equivalent to utf8ToString(data, 0, data.length).

Parameters:
data - an array that contains the UTF-8 sequence that is to be decoded.
Returns:
the string that results from decoding the input sequence.
Throws:
java.lang.IllegalArgumentException - if
  • the UTF-8 sequence is malformed, or
  • the UTF-8 sequence contains bytes that cannot be mapped to a character.
NlfUncheckedException - if the Java implementation does not support the UTF-8 character encoding. Because support for UTF-8 is required of all Java implementations, this exception is not expected in practice.
Since:
1.0
See Also:
utf8ToString(byte[], int, int), stringToUtf8(java.lang.String)

utf8ToString

public static java.lang.String utf8ToString(byte[] data,
                                            int offset,
                                            int length)
                                     throws java.lang.IllegalArgumentException,
                                            NlfUncheckedException
Decodes the specified UTF-8 sequence to a string. An IllegalArgumentException is thrown if the UTF-8 sequence is malformed or if it contains bytes that cannot be mapped to a character.

Parameters:
data - an array that contains the UTF-8 sequence that is to be decoded.
offset - the offset into data at which the input sequence begins.
length - the length of the input sequence, in bytes.
Returns:
the string that results from decoding the input sequence.
Throws:
java.lang.IllegalArgumentException - if
  • the UTF-8 sequence is malformed, or
  • the UTF-8 sequence contains bytes that cannot be mapped to a character.
NlfUncheckedException - if the Java implementation does not support the UTF-8 character encoding. Because support for UTF-8 is required of all Java implementations, this exception is not expected in practice.
Since:
1.0
See Also:
utf8ToString(byte[]), stringToUtf8(java.lang.String)
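
The strict decoding behaviour described for the two utf8ToString methods can be approximated with the standard java.nio.charset API. The following is a minimal sketch, not the NlfUtilities source: the class name Utf8DecodeSketch and the method name decode are hypothetical, and the mapping of decoding errors to IllegalArgumentException follows the description above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for illustration; not the NlfUtilities implementation.
class Utf8DecodeSketch {

    // Decodes 'length' bytes of 'data' starting at 'offset'.
    // Malformed or unmappable input is reported instead of being replaced.
    static String decode(byte[] data, int offset, int length) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data, offset, length))
                    .toString();
        } catch (CharacterCodingException e) {
            // Assumption: decoding failures surface as IllegalArgumentException.
            throw new IllegalArgumentException("Malformed or unmappable UTF-8 input", e);
        }
    }

    // Whole-array variant, expressed in terms of the three-argument form.
    static String decode(byte[] data) {
        return decode(data, 0, data.length);
    }
}
```

A decoder configured with CodingErrorAction.REPORT is used here because new String(byte[], Charset) silently substitutes a replacement character for invalid input, which would hide the error conditions listed above.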

stringToUtf8

public static byte[] stringToUtf8(java.lang.String str)
                           throws NlfUncheckedException
Encodes the specified string as a UTF-8 sequence.

Parameters:
str - the string that is to be encoded.
Returns:
an array containing the UTF-8 sequence that results from encoding the input string.
Throws:
NlfUncheckedException - if the Java implementation does not support the UTF-8 character encoding. Because support for UTF-8 is required of all Java implementations, this exception is not expected in practice.
Since:
1.0
See Also:
utf8ToString(byte[], int, int)
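
For comparison, encoding a string as UTF-8 with the standard library alone might look like the sketch below. The class name Utf8EncodeSketch and the method name encode are hypothetical, and nothing here is taken from the NlfUtilities source. Note that String.getBytes(Charset) does not declare a checked exception, so no wrapper exception is needed in this form.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for illustration; not the NlfUtilities implementation.
class Utf8EncodeSketch {

    // Returns the UTF-8 encoding of the given string.
    // Caveat: getBytes() replaces unpaired surrogates with '?' rather than failing.
    static byte[] encode(String str) {
        return str.getBytes(StandardCharsets.UTF_8);
    }
}
```

Under this sketch, the two-character string "A\u00E9" encodes to the three bytes 0x41, 0xC3, 0xA9.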

getUtf8Length

public static int getUtf8Length(java.lang.String str)
Returns the length of the UTF-8 sequence that results from encoding the specified string.

Parameters:
str - the string whose encoded length is to be determined.
Returns:
the length of the UTF-8 sequence that results from encoding the specified string.
Since:
1.0
See Also:
stringToUtf8(java.lang.String)
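
The encoded length can be computed without allocating the encoded array by summing the number of bytes that each code point requires. The sketch below illustrates that idea; the class name Utf8LengthSketch and the method name utf8Length are hypothetical, and whether NlfUtilities works this way is not stated in this documentation.

```java
// Hypothetical stand-in for illustration; not the NlfUtilities implementation.
class Utf8LengthSketch {

    // Sums the UTF-8 byte count of each code point in the string.
    // Assumption: an unpaired surrogate is counted as three bytes.
    static int utf8Length(String str) {
        int length = 0;
        for (int i = 0; i < str.length(); ) {
            int codePoint = str.codePointAt(i);
            if (codePoint < 0x80) {
                length += 1;          // U+0000..U+007F
            } else if (codePoint < 0x800) {
                length += 2;          // U+0080..U+07FF
            } else if (codePoint < 0x10000) {
                length += 3;          // U+0800..U+FFFF
            } else {
                length += 4;          // U+10000..U+10FFFF
            }
            i += Character.charCount(codePoint);
        }
        return length;
    }
}
```

The result differs from String.length() whenever the string contains characters outside the ASCII range, which is why length limits expressed in bytes need a separate check.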

isUtf8LengthWithinBounds

public static boolean isUtf8LengthWithinBounds(java.lang.String str,
                                               int minLength,
                                               int maxLength)
Tests whether the length of the UTF-8 sequence that results from encoding the specified string is within the specified bounds.

Parameters:
str - the string whose encoded length is to be tested.
minLength - the minimum length of the encoded string.
maxLength - the maximum length of the encoded string.
Returns:
true if the length of the UTF-8 sequence that results from encoding the specified string is within the specified bounds; false otherwise.
Since:
1.0
See Also:
getUtf8Length(java.lang.String)
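
A bounds test on the encoded length can be sketched as follows. The class name Utf8BoundsSketch and the method name isWithinBounds are hypothetical, and treating both bounds as inclusive is an assumption: the description above does not say whether minLength and maxLength are inclusive.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for illustration; not the NlfUtilities implementation.
class Utf8BoundsSketch {

    // Tests the UTF-8 encoded length of 'str' against the given bounds.
    // Assumption: both minLength and maxLength are inclusive.
    static boolean isWithinBounds(String str, int minLength, int maxLength) {
        int length = str.getBytes(StandardCharsets.UTF_8).length;
        return length >= minLength && length <= maxLength;
    }
}
```

For example, the one-character string "\u00E9" occupies two bytes when encoded, so under this sketch it fails a bound of exactly one byte but passes a bound of one to two bytes.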

isNameStartChar

public static boolean isNameStartChar(int codePoint)
Determines whether the specified Unicode code point is allowed to be the first character in an identifier or an attribute name.

The set of allowable characters is the same as that for XML names according to the XML 1.1 specification, but without the ':' (colon) character. Note that XML 1.1 names are less restrictive than those of XML 1.0, so that a name that is valid under XML 1.1 might not be valid under XML 1.0.

Parameters:
codePoint - the Unicode code point that is to be tested.
Returns:
true if the code point is allowed to be the first character in an identifier or an attribute name; false otherwise.
Since:
1.0
See Also:
isNameChar(int)

isNameChar

public static boolean isNameChar(int codePoint)
Determines whether the specified Unicode code point is allowed to be the second or subsequent character in an identifier or an attribute name.

The set of allowable characters is the same as that for XML names according to the XML 1.1 specification, but without the ':' (colon) character. Note that XML 1.1 names are less restrictive than those of XML 1.0, so that a name that is valid under XML 1.1 might not be valid under XML 1.0.

Parameters:
codePoint - the Unicode code point that is to be tested.
Returns:
true if the code point is allowed to be the second or subsequent character in an identifier or an attribute name; false otherwise.
Since:
1.0
See Also:
isNameStartChar(int)
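
The character sets described for isNameStartChar and isNameChar can be written out from the NameStartChar and NameChar productions of the XML 1.1 specification, with the colon removed. The sketch below is an illustration under that reading and is not the NlfUtilities source; the class name NameCharSketch and the helper isName are hypothetical.

```java
// Hypothetical stand-in for illustration; not the NlfUtilities implementation.
class NameCharSketch {

    // XML 1.1 NameStartChar production, without ':'.
    static boolean isNameStartChar(int cp) {
        return (cp >= 'A' && cp <= 'Z') || cp == '_' || (cp >= 'a' && cp <= 'z')
                || (cp >= 0xC0 && cp <= 0xD6) || (cp >= 0xD8 && cp <= 0xF6)
                || (cp >= 0xF8 && cp <= 0x2FF) || (cp >= 0x370 && cp <= 0x37D)
                || (cp >= 0x37F && cp <= 0x1FFF) || (cp >= 0x200C && cp <= 0x200D)
                || (cp >= 0x2070 && cp <= 0x218F) || (cp >= 0x2C00 && cp <= 0x2FEF)
                || (cp >= 0x3001 && cp <= 0xD7FF) || (cp >= 0xF900 && cp <= 0xFDCF)
                || (cp >= 0xFDF0 && cp <= 0xFFFD) || (cp >= 0x10000 && cp <= 0xEFFFF);
    }

    // XML 1.1 NameChar production, without ':'.
    static boolean isNameChar(int cp) {
        return isNameStartChar(cp) || cp == '-' || cp == '.'
                || (cp >= '0' && cp <= '9') || cp == 0xB7
                || (cp >= 0x300 && cp <= 0x36F) || (cp >= 0x203F && cp <= 0x2040);
    }

    // Hypothetical helper: tests a whole string, code point by code point.
    static boolean isName(String s) {
        if (s.isEmpty() || !isNameStartChar(s.codePointAt(0))) {
            return false;
        }
        for (int i = Character.charCount(s.codePointAt(0)); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!isNameChar(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }
}
```

Iterating by code point rather than by char matters here because the last NameStartChar range lies outside the Basic Multilingual Plane and is represented in a Java string by surrogate pairs.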