01 June, 2018

Python List sorting with key argument explored




Everybody uses lists as arrays to store values. Lists provide a lot of built-in features.

  • Sorting
  • Membership
  • Indexing
  • Iteration
  • Reversing
  • Adding/Removing
  • Popping
  • Count

There are lots of reasons to use lists. I think one of the most used features is sorting. Internally, Python sorts lists with Timsort, a hybrid of merge sort and insertion sort. But the sort method can be used in many other ways to have more control.




The Python help says (this is Python 2 output; Python 3 dropped the cmp argument):


>>> help(list.sort)
Help on method_descriptor:

sort(...)
    L.sort(cmp=None, key=None, reverse=False) -- stable sort *IN PLACE*;
    cmp(x, y) -> -1, 0, 1


Things to note: The sort method returns None. It changes the list in place, so the original order is lost. Because it does not have to build a new list, it is also slightly faster than the built-in function sorted, which returns a sorted copy.
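
To make the difference concrete, here is a minimal sketch that runs both on the same data. The variable names are mine, picked only for illustration.

```python
letters = ['c', 'b', 'd', 'a']

# sorted() builds and returns a new list; the original is untouched.
new_list = sorted(letters)
print(new_list)  # ['a', 'b', 'c', 'd']
print(letters)   # ['c', 'b', 'd', 'a']

# list.sort() reorders the list itself and returns None.
result = letters.sort()
print(result)    # None
print(letters)   # ['a', 'b', 'c', 'd']
```

A common slip is writing l = l.sort(), which replaces the list with None.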

SIMPLE SORT


>>> l = ['c', 'b', 'd', 'a']
>>> l.sort()
>>> l
['a', 'b', 'c', 'd']

If the elements are strings, they get sorted in alphabetical order (strictly speaking, by character code, so uppercase letters come before lowercase ones).

SORT IN REVERSE ORDER


>>> l = ['c', 'b', 'd', 'a']
>>> l.sort(reverse=True)
>>> l
['d', 'c', 'b', 'a']


Using the KEY argument.

Perhaps the most versatile of all is the "key" argument. The value of the key parameter should be a function that takes a single argument and returns a key to use for sorting purposes. This technique is fast because the key function is called exactly once for each input record.
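
One way to check the "exactly once" claim is to record every call the sort makes to the key function. This is only a sketch; length_key and calls are hypothetical names I made up for the experiment.

```python
calls = []

def length_key(word):
    # Remember each argument so the calls can be counted afterwards.
    calls.append(word)
    return len(word)

words = ['pear', 'fig', 'banana', 'kiwi', 'apple']
words.sort(key=length_key)

print(words)       # ['fig', 'pear', 'kiwi', 'apple', 'banana']
print(len(calls))  # 5, one call per element
```

The sort itself needs more than five comparisons, but it compares the stored keys and never calls the function again.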


Simple Sort


>>> l = ['c', 'b', 'd', 'a']
>>> l.sort(key=str, reverse=True)
>>> l
['d', 'c', 'b', 'a']


The code above sorts alphabetically, in reverse order. Since str returns a string unchanged, the key adds nothing here, so this is not very convincing. Here is a more interesting example.

Sort by Length of string


>>> l = ['abc', 'b', 'ab']
>>> l.sort(key=len)
>>> l
['b', 'ab', 'abc']

In the above case, we are sorting by the length of each string.

So the keys that actually get compared are: [3, 1, 2].

Hence the result,

['b', 'ab', 'abc']


Corresponding lengths:

[1, 2, 3]

Sort by case (Upper)



>>> l = ['abc', 'A', 'AB']
>>> l.sort(key=str.upper)
>>> l
['A', 'AB', 'abc']

When we specify str.upper, we are saying: treat every element as upper case, then sort.

So the keys being sorted are ['ABC', 'A', 'AB'].

In sorted order the keys are ['A', 'AB', 'ABC'], and the original strings come out in that same order.

Sort by case (Lower)



>>> l = ['abc', 'A', 'B', 'b', 'AB']
>>> l.sort(key=str.lower)
>>> l
['A', 'AB', 'abc', 'B', 'b']
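
Notice that 'B' and 'b' have the same key, 'b'. The help text calls sort a stable sort, which means elements with equal keys keep their original relative order. A small sketch to show it, where the second list is my own variation of the first:

```python
first = ['abc', 'A', 'B', 'b', 'AB']
second = ['abc', 'A', 'b', 'B', 'AB']  # same items, but 'b' now comes before 'B'

# Equal keys keep the order they had in the input.
first_sorted = sorted(first, key=str.lower)
second_sorted = sorted(second, key=str.lower)

print(first_sorted)   # ['A', 'AB', 'abc', 'B', 'b']
print(second_sorted)  # ['A', 'AB', 'abc', 'b', 'B']
```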

Sort by last letter



>>> strs = ['xc', 'zb', 'yd' ,'wa']
>>> strs.sort(key=lambda x: x[-1])
>>> strs
['wa', 'zb', 'xc', 'yd']

I want to sort by the last letter of each word. This is essentially how we would do it manually.

Step 1 : Extract the last letter of each word and keep it in a list
Step 2 : Sort that list
Step 3 : Display the original words in the order of their sorted last letters.

For Step 1, we have used a function created with lambda. This function accepts a string and returns its last letter.
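
The three manual steps can be written out by hand, roughly like this. It is a sketch of the idea only, not how list.sort is implemented, and decorated is just a name I chose.

```python
strs = ['xc', 'zb', 'yd', 'wa']

# Step 1: pair each word with its last letter.
decorated = [(word[-1], word) for word in strs]

# Step 2: sort the pairs; tuples are compared by their first item first.
decorated.sort()

# Step 3: throw the keys away and keep the words in their new order.
result = [word for _, word in decorated]
print(result)  # ['wa', 'zb', 'xc', 'yd']
```

One difference from key=: when two words end in the same letter, this version breaks the tie by comparing the words themselves, while key= keeps them in their original order.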

Zeros as least preference


I have a list of numbers with some zeros in it. I want the list sorted, but I also want all the zeros at the end.

Original list : l = [4,0,3,0,8,0,1]

Expected list after sorting: [1, 3, 4, 8, 0, 0, 0]

There are many ways to do it. I'll use what comes to my mind first.


>>> sorted(l, key=lambda x:str(x) if x == 0 else x)
[1, 3, 4, 8, 0, 0, 0]


Guess what's happening here? I am exploiting the fact that Python 2 places integers before strings when it compares mixed types. So I converted 0 into '0'. This works only in Python 2; Python 3 raises a TypeError when it compares an int with a str.
Here's the proof:

>>> ll = [3,2,3, '1', '3', '0']
>>> sorted(ll)
[2, 3, 3, '0', '1', '3']
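
Since Python 3 refuses to compare an int with a str, here is a sketch of another way that does not mix types. It uses a tuple as the key and should behave the same on Python 2 and Python 3. The name numbers is mine.

```python
numbers = [4, 0, 3, 0, 8, 0, 1]

# (x == 0) is False for non-zero values and True for zeros.
# False sorts before True, so zeros go last; the second item
# orders the remaining values normally.
result = sorted(numbers, key=lambda x: (x == 0, x))
print(result)  # [1, 3, 4, 8, 0, 0, 0]
```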

Sorting a list of big string integers


I have a list of very large integers in string form and I want to sort it. The idea we can use here is:
A number with more digits is always bigger than a number with fewer digits, as long as the numbers are non-negative and have no leading zeros.


>>> l = ['1','2','3','56235434245634562345','556532', '84']
>>> sorted(l, key=lambda x: (len(x), x))
['1', '2', '3', '84', '556532', '56235434245634562345']

Essentially we are sorting a list of tuples. Tuples are compared item by item, so the length decides first and the string itself breaks ties.

>>> sorted([(1, 5), (1,2), (2, 0), (2, 4)])
[(1, 2), (1, 5), (2, 0), (2, 4)]

So the shortest numbers get sorted first, then the next length follows, and so on.
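
For comparison, Python integers have arbitrary precision, so key=int gives the same order on this data. The sketch below also shows where the length trick stops working; the padded list is an example I made up.

```python
numbers = ['1', '2', '3', '56235434245634562345', '556532', '84']

by_length = sorted(numbers, key=lambda x: (len(x), x))
by_value = sorted(numbers, key=int)
print(by_length == by_value)  # True

# Leading zeros make a string longer without making the number bigger.
padded = ['007', '12', '3']
padded_by_length = sorted(padded, key=lambda x: (len(x), x))
padded_by_value = sorted(padded, key=int)
print(padded_by_length)  # ['3', '12', '007']
print(padded_by_value)   # ['3', '007', '12']
```

The length-based key never converts the strings to numbers; key=int converts each string once and is safe with leading zeros and negative numbers.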


Sorting a list of dictionaries by a specific key


We have a list that contains many dictionaries. All the dicts have the same key. We want to sort the list by the value of this key.


>>> l = [{'a': 60}, {'a': 10}, {'a': 20}]
>>> sorted(l, key=lambda x: x['a'])
[{'a': 10}, {'a': 20}, {'a': 60}]
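
The standard library also has operator.itemgetter, which builds the same kind of key function without a lambda. A small sketch, with records as my own name for the list:

```python
from operator import itemgetter

records = [{'a': 60}, {'a': 10}, {'a': 20}]

# itemgetter('a') returns a function that does d['a'] for each dict d.
result = sorted(records, key=itemgetter('a'))
print(result)  # [{'a': 10}, {'a': 20}, {'a': 60}]
```

Both versions raise a KeyError if one of the dictionaries is missing the key.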


Need more guidance? I found the Google Developers guide to be very good. Have a look.

