You must rewind your incoming buffer when you fail to encode a character in a CharsetEncoder or you’ll get an IllegalArgumentException

I am writing a CharsetEncoder in Java, which is my kind of fun.

I was getting a mysterious error when I identified that I could not encode certain characters:

Exception in thread "main" java.lang.IllegalArgumentException
	at java.nio.Buffer.position(Buffer.java:244)
	at java.nio.charset.CharsetEncoder.encode(CharsetEncoder.java:618)
	at java.nio.charset.CharsetEncoder.encode(CharsetEncoder.java:802)
	at java.nio.charset.Charset.encode(Charset.java:843)
	at java.nio.charset.Charset.encode(Charset.java:863)

After some investigation I realised the library code in Charset.encode was expecting me not to have consumed any characters of my incoming CharBuffer if I rejected the input by returning something like CoderResult.unmappableForLength.

Of course, in order to discover the input was unmappable, I did have to read it, but I made this problem go away by stepping back one char when I found an error like this:

@Override
public CoderResult encodeLoop(CharBuffer in, ByteBuffer out) {
    char ch = in.get();
    if(isUnmappable(ch)) {
        in.position(in.position() - 1);
        return CoderResult.unmappableForLength(2);
    }
    // ... rest of method ...

I hope this helps someone.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.