Prevent unicode object error from zero-byte read

During large file uploads under py3, we are occasionally seeing a "unicode objects must be encoded before hashing" error even though we are reading from a byte stream. From what I can tell, it looks like it's happening when a zero-byte read is requested, so we handle that case explicitly. This is a band-aid fix; we still need to track down the source.

Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>
Co-authored-by: Brian Rosmaita <rosmaita.fossdev@gmail.com>

Related-bug: #1805332
Change-Id: Ia7653f9fcbe902abc203c10c80ab44a641a4d8f9
2018-11-27 14:50:50 +08:00 · 1d25a2b7a2
parent 5b68597400
commit 1d25a2b7a2
1 changed file with 14 additions and 1 deletion
--- a/glance_store/_drivers/swift/store.py
+++ b/glance_store/_drivers/swift/store.py
@@ -1631,7 +1631,20 @@ class ChunkReader(object):
        if i > left:
            i = left

-        result = self.do_read(i)
+        # Note(rosmaita): under some circumstances in py3, a zero-byte
+        # read results in a non-byte value that then causes a "unicode
+        # objects must be encoded before hashing" error when we do the
+        # hash computations below.  (At least that seems to be what's
+        # happening in testing.)  So just fake a zero-byte read and let
+        # the current execution path continue.
+        # See https://bugs.launchpad.net/glance-store/+bug/1805332
+        # TODO(rosmaita): find what in the execution path is returning
+        # a native string instead of bytes and fix it.
+        if i == 0:
+            result = b''
+        else:
+            result = self.do_read(i)
+
        self.bytes_read += len(result)
        self.checksum.update(result)
        self.os_hash_value.update(result)
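
The guard in the hunk above can be exercised in isolation. The following is a minimal sketch, not the glance_store code: `MiniChunkReader` and `TextOnEmptyRead` are hypothetical stand-ins, and the assumption that the underlying source returns a native `str` for a zero-byte read is only a guess at the failure mode, since the commit message says the real source is still unknown. SHA-256 is used here purely for illustration.

```python
import hashlib
import io


class TextOnEmptyRead(io.BytesIO):
    """Hypothetical misbehaving source: returns a native str for read(0).

    This is an assumption made for illustration; the actual origin of the
    non-bytes value has not been identified.
    """

    def read(self, size=-1):
        if size == 0:
            return ''
        return super().read(size)


class MiniChunkReader:
    """Hypothetical, stripped-down stand-in for ChunkReader."""

    def __init__(self, fd, chunk_size):
        self.fd = fd
        self.chunk_size = chunk_size
        self.bytes_read = 0
        self.checksum = hashlib.sha256()

    def read(self, i):
        left = self.chunk_size - self.bytes_read
        if i > left:
            i = left

        # Same guard as the patch: never delegate a zero-byte read,
        # so the hash object only ever sees bytes.
        if i == 0:
            result = b''
        else:
            result = self.fd.read(i)

        self.bytes_read += len(result)
        self.checksum.update(result)
        return result


def hashing_str_fails():
    """Return True if updating a hash with a native str raises TypeError."""
    try:
        hashlib.sha256().update('')
    except TypeError:
        return True
    return False


reader = MiniChunkReader(TextOnEmptyRead(b'abcd'), chunk_size=4)
first = reader.read(10)   # clamped to the 4 bytes left in the chunk
second = reader.read(10)  # nothing left, so i == 0 and the guard applies
```

Without the `if i == 0` branch, the second call would pass `''` to `checksum.update()` and raise a `TypeError` under py3, which is consistent with the error described in the commit message.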