First of all, I just want to point out that "Where to post what" (viewtopic.php?f=4&t=2) is out of date and should point to https://bitbucket.org/chromiumembedded/cef/issues instead.
Issue
Chrome's auto-detection of character encoding appears to check only the first X characters of the HTML string. If the only non-ASCII characters (for example, Chinese text) are at the end of a long HTML string, they will not render properly in Chrome/CEF. The same characters render properly if the HTML is shortened.
Example
Take this 180,000-character HTML file: https://pastebin.com/LjtHdDs2. If you open it in the Chrome browser or render it via CEF, the Chinese text at the end of the HTML string is garbled. The Chinese characters render properly if any one of the following is done:
1. Many of the <rect></rect> elements are removed, resulting in a shorter HTML string overall.
2. A single non-ASCII character is added somewhere towards the beginning of the HTML string.
3. <meta charset="utf-16"/> is added at the beginning of the file.
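For anyone who wants to reproduce this without the pastebin file, here is a rough Python sketch that generates a similar document: a long run of ASCII-only <rect> elements followed by Chinese text. The function names, the filler markup, and the counts are all made up for illustration, and it assumes the file is saved as UTF-8 with no BOM and no charset declaration, which I believe matches the original but have not verified.

```python
# Hypothetical reproduction helper; names, markup and sizes are illustrative only.
FILLER = '<rect x="0" y="0" width="1" height="1"></rect>\n'  # 47 ASCII bytes
CHINESE = "\u4e2d\u6587\u6d4b\u8bd5"  # four Chinese characters


def build_html(filler_count: int) -> str:
    """Return HTML with `filler_count` ASCII-only <rect> elements, then Chinese text."""
    return (
        "<html><body><svg>\n"
        + FILLER * filler_count
        + "</svg>\n<p>" + CHINESE + "</p></body></html>\n"
    )


def write_case(path: str, filler_count: int) -> int:
    """Write the document as UTF-8 (assumed encoding) and return its size in bytes."""
    data = build_html(filler_count).encode("utf-8")
    with open(path, "wb") as f:
        f.write(data)
    return len(data)
```

Calling write_case("long.html", 4000) gives a file of roughly 188,000 bytes, and write_case("short.html", 10) gives a short one. If the theory holds, only the long file should show garbled text when opened from disk.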
Theory
Chrome checks the first X characters of the HTML string to auto-detect the encoding. X seems to be somewhere around the maximum value of an unsigned short (65535). If no special characters are found in that range, it defaults to UTF-8 (?).
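To make the theory concrete, here is a small Python sketch of what a prefix-limited check would "see". This is not Chrome's actual detection code. PREFIX_LIMIT is just my 65535 guess from above, and both function names are hypothetical.

```python
# Sketch of the theory only; NOT Chrome's real algorithm.
PREFIX_LIMIT = 65535  # assumed limit (my estimate), not a confirmed Chrome constant


def prefix_is_ascii(data: bytes, limit: int = PREFIX_LIMIT) -> bool:
    """True if the first `limit` bytes contain no byte >= 0x80."""
    return data[:limit].isascii()


def first_non_ascii_offset(data: bytes) -> int:
    """Byte offset of the first non-ASCII byte, or -1 if there is none."""
    for i, b in enumerate(data):
        if b >= 0x80:
            return i
    return -1
```

If first_non_ascii_offset() returns a value at or beyond the limit, a detector that only inspects the prefix has nothing but ASCII to work with, which would explain why shortening the HTML or adding one non-ASCII character near the top changes the result.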
I'm not sure this is really a "bug" per se, but it is somewhat strange behavior, and it happens without any warning, which can cause confusion for developers.