Entry
Is the sort() method on lists stable?
How do I perform a "stable" sort on a list?
What algorithm is used for the sort() method on lists?
Sep 23rd, 2004 14:47
Armin Rigo, Michael Chermside, Tim Peters, Jarno Virtanen
A sort algorithm is said to be "stable" if multiple items which compare
as equal will stay in the same order they were in after a sort.
The sort() method of lists IS STABLE starting from Python 2.3. The
upcoming docs for Python 2.4 guarantee that it will remain stable in the
future. The algorithm, by Tim Peters, is motivated by Jarno at:
http://pythonowns.blogspot.com/2002_07_28_pythonowns_archive.html
With older versions, a quick experiment might suggest that the sort()
method on lists was always stable:
Python 2.2.1 (#34, Apr 9 2002, 19:34:33) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from __future__ import generators
>>> a = [1, 1.0, 3.0, 3, 2.0, 2, 0, 0.0]
>>> b = a[:] # make a duplicate for a subsequent test
>>> a.sort()
>>> a
[0, 0.0, 1, 1.0, 2.0, 2, 3.0, 3]
But this would be MISLEADING.
In fact, according to Tim Peters:
"Small lists are handled by binary insertion sort, for peak speed, and
the way that was coded just so happens to be stable. That's why small
examples will always lead you to believe .sort() is stable. The
full-blown samplesort used for non-trivial lists isn't stable and can't
be made stable efficiently."
I believe that the primary reason for using a samplesort is its n*ln(n)
behavior combined with minimizing comparisons (which are slow since they
have to be done in Python not C).
So if you REQUIRE a stable sort and upgrading to 2.3 is not an option,
another technique is needed. Decorating the existing values is probably
the best and simplest, although it requires making an extra copy of the
list and may use a good deal of memory:
>>> def enumerate(seq):
... 'Generates an indexed series: (0,seq[0]), (1,seq[1]), ...'
... i = 0
... it = iter(seq)
... while 1:
... yield (i, it.next())
... i += 1
...
>>> def stableSort(seq):
... decoratedSeq = [(item,i) for i, item in enumerate(seq)]
... decoratedSeq.sort()
... return [item for item, i in decoratedSeq]
...
>>> b = stableSort(b)
>>> b
[0, 0.0, 1, 1.0, 2.0, 2, 3.0, 3]
Proving that this sort is actually stable on lists long enough to invoke
the samplesort is left as an exercise for the reader <wink>.