Fix garbled text with Unicode display names obtained from HTTP headers

This is evil -- the semantics of non-ASCII data in header fields have
changed several times, and RFC 7230 from June 2014 says this [1]:

   Historically, HTTP has allowed field content with text in the
   ISO-8859-1 charset [ISO-8859-1], supporting other charsets only
   through use of [RFC2047] encoding.  In practice, most HTTP header
   field values use only a subset of the US-ASCII charset [USASCII].
   Newly defined header fields SHOULD limit their field values to
   US-ASCII octets.  A recipient SHOULD treat other octets in field
   content (obs-text) as opaque data.

...which means that there's no universal standard for using Unicode
characters in HTTP headers. In the absence of one, we can get creative
and support arguably broken schemes.

Our SSO solution (Shibboleth/SAML with eduid.cz) appears to work with
UTF-8. UTF-8 is certainly no worse than ISO-8859-1, and I support the
view that decoding ASCII as UTF-8 is a reasonable approach in these
not-so-well-defined cases. This at least prevents the mojibake of
showing an ISO-8859-1 rendering of some UTF-8 octets.
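
To illustrate the round trip described above, here is a minimal sketch.
It assumes the servlet container decodes header octets as ISO-8859-1,
which is what the patch relies on. The class and method names are
hypothetical and are not part of the change.

```java
import static java.nio.charset.StandardCharsets.ISO_8859_1;
import static java.nio.charset.StandardCharsets.UTF_8;

class HeaderRedecodeSketch {
  /** Simulates a container that decodes UTF-8 header octets as ISO-8859-1 (assumption). */
  static String asContainerSeesIt(String sent) {
    return new String(sent.getBytes(UTF_8), ISO_8859_1);
  }

  /** Recovers the original octets and decodes them as UTF-8, as the patch does. */
  static String redecode(String fromContainer) {
    return new String(fromContainer.getBytes(ISO_8859_1), UTF_8);
  }

  public static void main(String[] args) {
    String sent = "Jan Kundr\u00e1t";          // U+00E1 is UTF-8 octets C3 A1
    String garbled = asContainerSeesIt(sent);  // each octet becomes one Latin-1 char
    // The mojibake form contains U+00C3 U+00A1 in place of U+00E1.
    System.out.println(garbled.equals("Jan Kundr\u00c3\u00a1t"));
    // Re-decoding restores the original name.
    System.out.println(redecode(garbled).equals(sent));
    // Pure ASCII is unaffected, since ASCII is a subset of both charsets.
    System.out.println(redecode("plain ASCII").equals("plain ASCII"));
  }
}
```

Note the flip side: a header that really carries non-ASCII ISO-8859-1
octets is usually not valid UTF-8, so this decoding would turn those
octets into U+FFFD replacement characters.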

[1] https://tools.ietf.org/html/rfc7230#section-3.2.4

Change-Id: I8b674550e34b0c6fed4cc0af2f069aaeadbae6cc
This commit is contained in:
Jan Kundrát
2017-01-20 13:09:51 +01:00
parent 04c96a8ab9
commit 7059cbd8da

@@ -18,6 +18,8 @@ import static com.google.common.base.MoreObjects.firstNonNull;
 import static com.google.common.base.Strings.emptyToNull;
 import static com.google.common.net.HttpHeaders.AUTHORIZATION;
 import static com.google.gerrit.reviewdb.client.AccountExternalId.SCHEME_GERRIT;
+import static java.nio.charset.StandardCharsets.ISO_8859_1;
+import static java.nio.charset.StandardCharsets.UTF_8;
 
 import com.google.gerrit.extensions.registration.DynamicItem;
 import com.google.gerrit.httpd.HtmlDomUtil;
@@ -143,7 +145,8 @@ class HttpAuthFilter implements Filter {
   String getRemoteDisplayname(HttpServletRequest req) {
     if (displaynameHeader != null) {
-      return emptyToNull(req.getHeader(displaynameHeader));
+      String raw = req.getHeader(displaynameHeader);
+      return emptyToNull(new String(raw.getBytes(ISO_8859_1), UTF_8));
     }
     return null;
   }
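
One thing to watch in the new code: HttpServletRequest.getHeader()
returns null when the request lacks the header, and raw.getBytes() would
then throw a NullPointerException. A possible null-safe variant is
sketched below. The helper name decodeDisplayname is hypothetical, not
part of this change, and it checks for emptiness inline instead of
calling Guava's emptyToNull so that the sketch stays self-contained.

```java
import static java.nio.charset.StandardCharsets.ISO_8859_1;
import static java.nio.charset.StandardCharsets.UTF_8;

class NullSafeDisplaynameSketch {
  /**
   * Hypothetical helper: re-decodes a header value as UTF-8, returning
   * null when the header is missing or empty.
   */
  static String decodeDisplayname(String raw) {
    if (raw == null || raw.isEmpty()) {
      return null;
    }
    // Assumes the container decoded the header octets as ISO-8859-1.
    return new String(raw.getBytes(ISO_8859_1), UTF_8);
  }

  public static void main(String[] args) {
    System.out.println(decodeDisplayname(null) == null);
    System.out.println(decodeDisplayname("") == null);
    System.out.println("Jan Kundr\u00e1t".equals(decodeDisplayname("Jan Kundr\u00c3\u00a1t")));
  }
}
```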