cliff/releasenotes/notes/handle-none-values-when-sorting-de40e36c66ad95ca.yaml
Stephen Finucane 4f45f9a30e Handle null values when sorting
One unfortunate change (or fortunate, depending on how you look at
types) in Python 3 is the inability to sort iterables that contain
values of different, non-comparable types. For example:

  >>> x = ['foo', 'bar', None, 'qux']
  >>> sorted(x)
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  TypeError: '<' not supported between instances of 'NoneType' and 'str'

Fortunately, we can take advantage of the fact that by providing a
function for the 'key' that returns a tuple, we can sort on multiple
conditions. In this case, "when the first key returns that two elements
are equal, the second key is used to compare." [1] We can use this to
first separate the values by whether they are None or not, punting those
that are None to the end, before sorting the non-None values normally.
For example:

  >>> x = ['foo', 'bar', None, 'qux']
  >>> sorted(x, key=lambda k: (k is None, k))
  ['bar', 'foo', 'qux', None]

We were already using this feature implicitly through our use of
'operator.itemgetter(*indexes)', which will return a tuple if there is
more than one item in 'indexes', and now we simply make that explicit,
fixing the case where we're attempting to compare a comparable type
with None. For all other cases, such as comparing a value that isn't
comparable, we surround things with a try-except and a debug logging
statement to allow things to continue.
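
The per-column approach described above might look roughly like the
following. This is a minimal sketch, not the actual cliff code: the
function name 'sort_rows', its parameters, and the logger are all
assumptions made for illustration.

```python
import logging

LOG = logging.getLogger(__name__)


def sort_rows(rows, indexes):
    """Sort rows on each column index in turn, on a best-effort basis.

    Hypothetical helper: the name and signature are illustrative only.
    """
    rows = list(rows)
    # Sort on the least significant column first. sorted() is stable, so
    # the final order is by indexes[0], then indexes[1], and so on.
    for index in reversed(indexes):
        try:
            # (is None, value) pushes None to the end of the column.
            rows = sorted(rows, key=lambda row: (row[index] is None, row[index]))
        except TypeError:
            # Mixed or non-comparable types: skip this column only and
            # keep whatever ordering the other columns produced.
            LOG.debug('Could not sort on column index %d; skipping', index)
    return rows
```

Using sorted() rather than list.sort() inside the try block means a
failed comparison leaves the previous ordering untouched.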

Note that we could optimize what we've done further by building a key
value that covers all indexes, rather than using a for loop to do so.
For example:

  >>> import itertools
  >>> x = [('baz', 2), (None, 0), ('bar', 3), ('baz', 4), ('qux', 0)]
  >>> sorted(x, key=lambda k: list(
  ...     itertools.chain((k[i] is None, k[i]) for i in (0, 1)))
  ... )
  [('bar', 3), ('baz', 2), ('baz', 4), ('qux', 0), (None, 0)]

However, this would be harder to grok and would also mean we're unable
to handle exceptions on a single column where e.g. there are mixed types
or types that are not comparable while still sorting on the other
columns. Perhaps this would be desirable for some users, but sorting on
a best-effort basis does seem wiser and generally more user friendly.
Anyone that wants to sort on such columns should ensure their types are
comparable or implement their own sorting implementation.
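
To illustrate the trade-off, a single key covering every column fails
outright as soon as one column holds mixed types. A small sketch, using
made-up data and a hypothetical 'combined_key' helper:

```python
def combined_key(row, indexes=(0, 1)):
    # Hypothetical helper: one key covering every sort column.
    return [(row[i] is None, row[i]) for i in indexes]


# Column 0 ties on 'baz', so the comparison falls through to column 1,
# which mixes int and str.
rows = [('baz', 2), ('baz', 'two'), (None, 0)]

try:
    result = sorted(rows, key=combined_key)
except TypeError:
    # The whole sort is abandoned, including the sortable column 0.
    result = None
```

Here 'result' ends up as None: nothing is sorted at all, whereas
sorting column by column would still order the rows on column 0.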

[1] https://www.kite.com/python/answers/how-to-sort-by-two-keys-in-python

Change-Id: I4803051a6dd05c143a15923254af97e32cd39693
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
Story: 2008456
Task: 41466
2021-01-29 15:40:31 +00:00

---
fixes:
  - |
    Sorting output using the ``--sort-column`` option will now handle ``None``
    values. This was supported implicitly in Python 2 but was broken in the
    move to Python 3. In addition, requests to sort a column containing
    non-comparable types will now be ignored. Previously, these requests would
    result in a ``TypeError``.