The java.io.Reader and java.io.Writer classes are abstract superclasses for classes that read and write character-based data.
The subclasses are notable for handling the conversion between different character sets.
Input and output streams are fundamentally byte-based, whereas readers and writers are based on characters. In Java, a char is a two-byte (16-bit) UTF-16 code unit, and characters outside the Basic Multilingual Plane take two chars. In other character sets that you may have to read or write, characters can have varying widths.
ASCII and ISO Latin-1 use one-byte characters. UTF-16 uses two bytes for most characters and four bytes for the rest. UTF-8 uses characters of varying width between one and four bytes. Readers and writers know how to handle all these character sets and many more.
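The width differences are easy to check directly. The following is a minimal sketch, not part of the original notes: the class name EncodingWidths and the helper encodedLength are made up for illustration, and it assumes Java 7 or later for java.nio.charset.StandardCharsets. It encodes a few sample characters and reports how many bytes each one needs; note that a supplementary character such as U+1F600 needs four bytes in UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, introduced only for this illustration.
class EncodingWidths {

    /** Returns the number of bytes the text occupies when encoded in the given charset. */
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // A (U+0041), e-acute (U+00E9), euro sign (U+20AC), and an emoji (U+1F600)
        // written as a surrogate pair of two Java chars.
        String[] samples = {"A", "\u00E9", "\u20AC", "\uD83D\uDE00"};
        for (String s : samples) {
            System.out.printf("U+%04X: %d char(s), UTF-8 = %d byte(s), UTF-16BE = %d byte(s)%n",
                    s.codePointAt(0),
                    s.length(),
                    encodedLength(s, StandardCharsets.UTF_8),
                    encodedLength(s, StandardCharsets.UTF_16BE));
        }
    }
}
```

UTF_16BE is used rather than UTF_16 because encoding with the latter prepends a two-byte byte order mark, which would distort the counts.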
The encoding and decoding implementations themselves are hidden in the sun packages (since Java 1.4, a public interface to them is available in java.nio.charset). They are used internally by the InputStreamReader and OutputStreamWriter classes to convert the bytes used by the streams into the chars used by the readers and writers, and vice versa.
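As a rough illustration of that bridging role, the sketch below pushes a string through an OutputStreamWriter into an in-memory byte stream and reads it back through an InputStreamReader. The class name StreamBridge and its two helper methods are invented for this example, and it assumes Java 7 or later (try-with-resources and StandardCharsets).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, introduced only for this illustration.
class StreamBridge {

    /** Writes chars through an OutputStreamWriter and returns the bytes it produced. */
    static byte[] encode(String text, Charset charset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        } // closing the writer flushes any buffered bytes
        return bytes.toByteArray();
    }

    /** Reads bytes through an InputStreamReader and returns the decoded chars. */
    static String decode(byte[] data, Charset charset) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        String original = "caf\u00E9";
        byte[] latin1 = encode(original, StandardCharsets.ISO_8859_1);
        byte[] utf8 = encode(original, StandardCharsets.UTF_8);
        System.out.println("ISO-8859-1: " + latin1.length + " bytes, UTF-8: " + utf8.length + " bytes");
        System.out.println("Round trip OK: " + decode(utf8, StandardCharsets.UTF_8).equals(original));
    }
}
```

The same four chars come out as different byte sequences depending on the encoding handed to the writer, and decoding only recovers the original text when the reader is given the matching encoding.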
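You can also convert byte arrays in a particular character set into Unicode strings using two String constructors: String(byte[] bytes, String charsetName) and String(byte[] bytes, int offset, int length, String charsetName).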
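A short sketch of the String constructors that take an encoding name follows. The class name BytesToString and its wrapper methods are hypothetical, and the byte values are hand-written encodings of "café" chosen for the example. Both constructors throw UnsupportedEncodingException when the named encoding is not available.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical helper class, introduced only for this illustration.
class BytesToString {

    /** Decodes the whole array using the named encoding. */
    static String decodeAll(byte[] data, String encoding) throws UnsupportedEncodingException {
        return new String(data, encoding);
    }

    /** Decodes only data[offset] .. data[offset + length - 1] using the named encoding. */
    static String decodeRange(byte[] data, int offset, int length, String encoding)
            throws UnsupportedEncodingException {
        return new String(data, offset, length, encoding);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // "café" in ISO Latin-1: one byte per character.
        byte[] latin1 = {0x63, 0x61, 0x66, (byte) 0xE9};
        // "café" in UTF-8: the e-acute takes two bytes.
        byte[] utf8 = {0x63, 0x61, 0x66, (byte) 0xC3, (byte) 0xA9};

        System.out.println(decodeAll(latin1, "ISO-8859-1"));
        System.out.println(decodeAll(utf8, "UTF-8"));
        System.out.println(decodeRange(utf8, 3, 2, "UTF-8")); // just the last character
    }
}
```

When slicing a multi-byte encoding with the offset and length form, the range has to fall on character boundaries; otherwise the partial sequence is decoded as a replacement character.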
The exact list of encodings available varies a little from platform to platform. Here are most of the important Character Set Encodings.
This same set of encodings is used by readers that convert all bytes they read into Unicode chars.
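Because the available encodings differ between platforms, it can help to ask the running JVM what it supports. The sketch below does that through java.nio.charset.Charset, which assumes Java 1.4 or later (and Java 5 for the for-each loop and Charset.defaultCharset). The class name ListEncodings and the method isAvailable are made up for the example.

```java
import java.nio.charset.Charset;
import java.util.SortedMap;

// Hypothetical helper class, introduced only for this illustration.
class ListEncodings {

    /** Reports whether this JVM can encode and decode the named character set. */
    static boolean isAvailable(String name) {
        return Charset.isSupported(name);
    }

    public static void main(String[] args) {
        // Canonical names of every charset this JVM supports, in sorted order.
        SortedMap<String, Charset> charsets = Charset.availableCharsets();
        for (String name : charsets.keySet()) {
            System.out.println(name);
        }
        System.out.println(charsets.size() + " encodings available");
        System.out.println("Platform default: " + Charset.defaultCharset().name());
    }
}
```

The output depends on the platform and JVM, so no particular list should be expected; only a small core set (US-ASCII, ISO-8859-1, UTF-8, UTF-16BE, UTF-16LE, and UTF-16) is required to be present on every Java implementation.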