java.nio.charset

About

The java.nio.charset package provides classes and interfaces for character encoding and decoding. It is part of the Java NIO (New I/O) package introduced in Java 1.4 to provide non-blocking I/O operations, among other features. Understanding character sets is crucial for applications dealing with text processing, file I/O, network communication, and internationalization.

Charset (java.nio.charset.Charset)

Represents a named mapping between sequences of sixteen-bit Unicode characters and sequences of bytes. It defines methods for encoding, decoding, and working with character sets.

Static Methods

  • static Charset forName(String charsetName): Returns a charset object for the given charset name.

  • static Charset defaultCharset(): Returns the default charset of this Java virtual machine.

  • static SortedMap<String, Charset> availableCharsets(): Returns a map of available charsets.

  • static boolean isSupported(String charsetName): Tells whether the named charset is supported.

Methods

  • boolean contains(Charset cs): Tells whether or not this charset contains the given charset.

  • boolean canEncode(): Tells whether or not this charset supports encoding.

  • String name(): Returns the canonical name of this charset.

  • Set<String> aliases(): Returns a set containing this charset's aliases.

  • String displayName(): Returns a string that is the canonical name of this charset.

  • String displayName(Locale locale): Returns a localized string that is the canonical name of this charset.

  • boolean isRegistered(): Tells whether or not this charset is known by its canonical name.

  • CharsetEncoder newEncoder(): Constructs a new encoder for this charset.

  • CharsetDecoder newDecoder(): Constructs a new decoder for this charset.

  • ByteBuffer encode(CharBuffer cb): Convenience method that encodes Unicode characters into bytes.

  • ByteBuffer encode(String str): Convenience method that encodes a string into bytes.

  • CharBuffer decode(ByteBuffer bb): Convenience method that decodes bytes into Unicode characters.

  • int compareTo(Charset that): Compares this charset to another.

CharsetDecoder (java.nio.charset.CharsetDecoder)

Converts a byte sequence into a sequence of sixteen-bit Unicode characters.

Methods

  • Charset charset(): Returns the charset that created this decoder.

  • CharsetDecoder onMalformedInput(CodingErrorAction newAction): Changes the action for malformed-input errors.

  • CharsetDecoder onUnmappableCharacter(CodingErrorAction newAction): Changes the action for unmappable-character errors.

  • CodingErrorAction malformedInputAction(): Returns the current malformed-input action.

  • CodingErrorAction unmappableCharacterAction(): Returns the current unmappable-character action.

  • CharsetDecoder replaceWith(String newReplacement): Changes this decoder's replacement value.

  • String replacement(): Returns this decoder's replacement value.

  • boolean isAutoDetecting(): Tells whether or not this decoder implements an auto-detecting charset.

  • boolean isCharsetDetected(): Tells whether or not a charset has yet been detected by this decoder.

  • Charset detectedCharset(): Returns the charset detected by this decoder, if any.

  • CoderResult decode(ByteBuffer in, CharBuffer out, boolean endOfInput): Decodes one or more bytes into one or more characters.

  • CoderResult flush(CharBuffer out): Flushes this decoder.

  • CharsetDecoder reset(): Resets this decoder, clearing any internal state.

CharsetEncoder (java.nio.charset.CharsetEncoder)

Converts a sequence of sixteen-bit Unicode characters into a byte sequence.

Methods

  • Charset charset(): Returns the charset that created this encoder.

  • CharsetEncoder onMalformedInput(CodingErrorAction newAction): Changes the action for malformed-input errors.

  • CharsetEncoder onUnmappableCharacter(CodingErrorAction newAction): Changes the action for unmappable-character errors.

  • CodingErrorAction malformedInputAction(): Returns the current malformed-input action.

  • CodingErrorAction unmappableCharacterAction(): Returns the current unmappable-character action.

  • CharsetEncoder replaceWith(byte[] newReplacement): Changes this encoder's replacement value.

  • byte[] replacement(): Returns this encoder's replacement value.

  • float averageBytesPerChar(): Returns this encoder's average bytes-per-character.

  • float maxBytesPerChar(): Returns this encoder's maximum bytes-per-character.

  • CoderResult encode(CharBuffer in, ByteBuffer out, boolean endOfInput): Encodes one or more characters into one or more bytes.

  • CoderResult flush(ByteBuffer out): Flushes this encoder.

  • CharsetEncoder reset(): Resets this encoder, clearing any internal state.

  • boolean canEncode(char c): Tells whether or not this encoder can encode the given character.

  • boolean canEncode(CharSequence cs): Tells whether or not this encoder can encode the given character sequence.

CodingErrorAction (java.nio.charset.CodingErrorAction)

Describes actions to be taken when encoding/decoding errors occur. Actions include IGNORE, REPLACE, and REPORT.

Static Variables

  • static CodingErrorAction IGNORE: Action indicating that a coding error is to be ignored.

  • static CodingErrorAction REPLACE: Action indicating that a coding error is to be handled by dropping the erroneous input and replacing it with the current replacement byte or character sequence.

  • static CodingErrorAction REPORT: Action indicating that a coding error is to be reported, either by returning a CoderResult object or by throwing a CharacterCodingException, depending upon the method implementing the coding process.

StandardCharsets (java.nio.charset.StandardCharsets)

Defines constants for standard character sets such as UTF-8, UTF-16, ISO-8859-1, etc.

Static Variables

  • static Charset US_ASCII: Seven-bit ASCII, a.k.a. ISO646-US, a.k.a. the Basic Latin block of the Unicode character set.

  • static Charset ISO_8859_1: ISO Latin Alphabet No. 1, a.k.a. ISO-LATIN-1.

  • static Charset UTF_8: Eight-bit UCS Transformation Format.

  • static Charset UTF_16BE: Sixteen-bit UCS Transformation Format, big-endian byte order.

  • static Charset UTF_16LE: Sixteen-bit UCS Transformation Format, little-endian byte order.

  • static Charset UTF_16: Sixteen-bit UCS Transformation Format, byte order identified by an optional byte-order mark.

UnsupportedCharsetException (java.nio.charset.UnsupportedCharsetException)

Constructor

  • UnsupportedCharsetException(String charsetName): Constructs an instance of this class.

Methods

  • String getCharsetName(): Retrieves the name of the unsupported charset.

Examples

Encoding and Decoding Strings

Handling Encoding or Decoding Errors

Listing Available Charsets

Last updated