Stop randomly sorting error lines

When we receive an HTML-formatted error page, e.g. from Apache, we
convert the HTML to plain text that we can output. This conversion
effectively amounts to stripping all the HTML tags and newline
characters from the body of the page and then removing duplicated
lines. This last step is proving problematic. Way back when, we did
this the "dumb" way, by having a list to store lines and only adding
lines to this list if they were not already present. This changed in
change I7b46e263a76d84573bdfbbece57b1048764ed939 when we switched to
calling set() on the generated list. However, sets are unordered, which
means the lines come back in an arbitrary order and we end up with a
confusing, nonsensical error message. For example, given the following
error page:

  <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
  <html><head>
  <title>502 Bad Gateway</title>
  <h1>Bad Gateway</h1>
  <p>The proxy server received an invalid
  response from an upstream server.<br />
  </p>
  <p>Additionally, a 201 Created
  error was encountered while trying to use an ErrorDocument to handle the request.</p>
  <hr>
  <address>Apache/2.4.52 (Ubuntu) Server at 10.0.110.85 Port 80</address>
  </body></html>

We would expect to see the following:

  502 Bad Gateway: Bad Gateway: The proxy server received an invalid:
  response from an upstream server.: Additionally, a 201 Created: error
  was encountered while trying to use an ErrorDocument to handle the
  request.: Apache/2.4.52 (Ubuntu) Server at 10.0.110.85 Port 80

Instead, we're getting:

  Apache/2.4.52 (Ubuntu) Server at 10.0.110.85 Port 80: error was
  encountered while trying to use an ErrorDocument to handle the
  request.: Additionally, a 201 Created: The proxy server received an
  invalid: response from an upstream server.: 502 Bad Gateway: Bad
  Gateway

Which is total nonsense.
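
For anyone wondering why the order looks random rather than merely
wrong: a set of strings iterates in an order derived from the string
hashes, and CPython salts those hashes per process unless PYTHONHASHSEED
is set. The following is a rough sketch, not part of this change, that
runs the old list(set(...)) step in child interpreters with different
seeds. LINES, CHILD and dedupe_with_seed are hypothetical names made up
for the illustration.

```python
import json
import os
import subprocess
import sys

# Hypothetical sample: lines of an error page after the HTML is stripped.
LINES = [
    "502 Bad Gateway",
    "Bad Gateway",
    "The proxy server received an invalid",
    "response from an upstream server.",
]

# One-liner run in the child: dedupe via set() and print the result.
CHILD = (
    "import json, sys; "
    "print(json.dumps(list(set(json.loads(sys.argv[1])))))"
)


def dedupe_with_seed(seed):
    """Run list(set(LINES)) in a fresh interpreter with a fixed hash seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", CHILD, json.dumps(LINES)],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


# Same input every time; only the hash seed differs between runs.
orders = [dedupe_with_seed(seed) for seed in range(1, 6)]
```

Every entry in orders holds the same four lines, but printing them will
typically show more than one ordering, which is what users were seeing
in their error messages.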

Fix this by iterating over the lines and building the list ourselves,
as we used to, rather than relying on unordered sets.
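
A minimal standalone sketch of the approach, run against the error page
above. html_to_details and ERROR_PAGE are hypothetical names used only
here; the real logic lives inline in raise_from_response. The check
that skips empty lines mirrors the old "if msg" filter and is an
assumption of this sketch.

```python
import re

# The example 502 page from above, verbatim.
ERROR_PAGE = """\
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>502 Bad Gateway</title>
<h1>Bad Gateway</h1>
<p>The proxy server received an invalid
response from an upstream server.<br />
</p>
<p>Additionally, a 201 Created
error was encountered while trying to use an ErrorDocument to handle the request.</p>
<hr>
<address>Apache/2.4.52 (Ubuntu) Server at 10.0.110.85 Port 80</address>
</body></html>
"""


def html_to_details(text):
    """Strip inline HTML and drop duplicate lines, keeping first-seen order."""
    messages = []
    for line in text.splitlines():
        message = re.sub(r'<.+?>', '', line.strip())
        # Assumption: skip lines that are empty once the tags are removed.
        if message and message not in messages:
            messages.append(message)
    # Return joined string separated by colons.
    return ': '.join(messages)


details = html_to_details(ERROR_PAGE)
```

The "not in" check on a list is linear per line, so this is quadratic
in the number of lines, but error pages are short enough that keeping
the order matters far more than the lookup cost.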

PS: We also change variable names to keep mypy happy, since we'd like to
integrate that soon.

Change-Id: I52c1321e00aff1b2dfeaad2adfd4c02455b6eda7
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Stephen Finucane 2023-08-03 17:36:47 +01:00
parent 88fc0c2cf6
commit 8ee8f57a43

@@ -246,13 +246,13 @@ def raise_from_response(response, error_message=None):
         except Exception:
             details = response.text
     elif response.content and 'text/html' in content_type:
         # Split the lines, strip whitespace and inline HTML from the response.
-        details = [
-            re.sub(r'<.+?>', '', i.strip()) for i in response.text.splitlines()
-        ]
-        details = list(set([msg for msg in details if msg]))
+        messages = []
+        for line in response.text.splitlines():
+            message = re.sub(r'<.+?>', '', line.strip())
+            if message not in messages:
+                messages.append(message)
         # Return joined string separated by colons.
-        details = ': '.join(details)
+        details = ': '.join(messages)
     if not details:
         details = response.reason if response.reason else response.text