Prevent unicode object error from zero-byte read

During large file uploads under py3, we are occasionally seeing a
"unicode objects must be encoded before hashing" error even though
we are reading from a byte stream.  From what I can tell, it looks
like it's happening when a zero-byte read is requested, so we handle
that case explicitly.  This is a band-aid fix; we still need to track
down the source.
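The failure mode and the guard described above can be sketched as follows. This is a minimal illustration, not the store's actual code: `do_read` stands in for the reader's underlying stream, and `read_chunk` is a hypothetical helper showing the zero-byte guard:

```python
import hashlib

def read_chunk(do_read, i):
    # Guard against the py3 quirk described above: a zero-byte read
    # may come back as a native str, which hashlib refuses to hash,
    # so fake the zero-byte read and skip do_read entirely.
    if i == 0:
        return b''
    return do_read(i)

# A str (unicode) value cannot be fed to a hash object in py3:
checksum = hashlib.md5()
try:
    checksum.update('')  # simulates the bad zero-byte read
except TypeError as e:
    print(e)  # e.g. "Unicode-objects must be encoded before hashing"
              # (exact wording varies by Python version)

# With the guard in place, a zero-byte read hashes cleanly:
checksum.update(read_chunk(lambda n: b'x' * n, 0))
```

Since `hashlib` only accepts bytes-like objects, forcing the zero-byte case to `b''` keeps the hash computation on the happy path regardless of what the stream returns.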

Co-authored-by: wangxiyuan <>
Co-authored-by: Brian Rosmaita <>

Related-bug: #1805332
Change-Id: Ia7653f9fcbe902abc203c10c80ab44a641a4d8f9
wangxiyuan 2018-11-27 14:50:50 +08:00 committed by Brian Rosmaita
parent 5b68597400
commit 1d25a2b7a2
1 changed file with 14 additions and 1 deletion


@@ -1631,7 +1631,20 @@ class ChunkReader(object):
             if i > left:
                 i = left
-            result = self.do_read(i)
+            # Note(rosmaita): under some circumstances in py3, a zero-byte
+            # read results in a non-byte value that then causes a "unicode
+            # objects must be encoded before hashing" error when we do the
+            # hash computations below. (At least that seems to be what's
+            # happening in testing.) So just fake a zero-byte read and let
+            # the current execution path continue.
+            # See
+            # TODO(rosmaita): find what in the execution path is returning
+            # a native string instead of bytes and fix it.
+            if i == 0:
+                result = b''
+            else:
+                result = self.do_read(i)
             self.bytes_read += len(result)