Manipulating Data

This section covers tools in the Python standard library for sorting, storing, and searching data.

sorted() function

The built-in sorted() function accepts any iterable and returns a new sorted list, leaving the original unchanged. For example:

>>> sorted([3, 1, 2])
[1, 2, 3]

sorted() also accepts a key function, which is called on each element to produce the value used for comparison. For example, negating each number sorts in descending order:

>>> sorted([3, 1, 2], key=lambda x: -x)
[3, 2, 1]

sorted() also accepts a reverse parameter. If reverse is true, the result is sorted in descending order. For example:

>>> sorted([3, 1, 2], reverse=True)
[3, 2, 1]
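
The key and reverse parameters can be combined. A minimal sketch, using a made-up list of words, that sorts by length from longest to shortest:

```python
# Hypothetical sample data, chosen only for illustration.
words = ['pear', 'fig', 'banana', 'kiwi']

# key=len compares the words by length; reverse=True puts the longest first.
# Words of equal length keep their original relative order.
by_length_desc = sorted(words, key=len, reverse=True)
print(by_length_desc)
```

Note that 'pear' and 'kiwi' have the same length, so they stay in the order in which they appeared in the input.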

The key function can also sort objects by one of their attributes. For example:

>>> class Person:
...     def __init__(self, name, age):
...         self.name = name
...         self.age = age
...     def __repr__(self):
...         return 'Person(name={!r}, age={})'.format(self.name, self.age)
...     def wage(self):
...         return self.age * 10
...
>>> persons = [Person('Bob', 20), Person('Alice', 30), Person('Carol', 40)]
>>> sorted(persons, key=lambda p: p.name)
[Person(name='Alice', age=30), Person(name='Bob', age=20), Person(name='Carol', age=40)]
>>> sorted(persons, key=lambda p: p.age)
[Person(name='Bob', age=20), Person(name='Alice', age=30), Person(name='Carol', age=40)]

The built-in sorted() function is guaranteed to be stable. A sort is stable if it guarantees not to change the relative order of elements that compare equal. This is helpful for sorting in multiple passes, for example by one field and then by another.
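
Stability is what makes multi-pass sorting work. The sketch below uses hypothetical (name, grade, age) records and sorts by the secondary key first, then by the primary key, so that ties in the primary key keep the order set by the first pass:

```python
# Hypothetical records: (name, grade, age).
records = [('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15), ('amy', 'A', 12)]

# Pass 1: secondary key, age from oldest to youngest.
by_age = sorted(records, key=lambda r: r[2], reverse=True)

# Pass 2: primary key, grade ascending. Because the sort is stable,
# records with the same grade stay ordered by age from pass 1.
by_grade_then_age = sorted(by_age, key=lambda r: r[1])
print(by_grade_then_age)
```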

Using operator module functions

The operator module provides helpers that build common key functions. Import them first:

>>> from operator import itemgetter, attrgetter, methodcaller

attrgetter() builds a key function that sorts by the named attribute. For example:

>>> persons = [Person('Bob', 20), Person('Alice', 30), Person('Carol', 40)]
>>> sorted(persons, key=attrgetter('name'))
[Person(name='Alice', age=30), Person(name='Bob', age=20), Person(name='Carol', age=40)]
>>> sorted(persons, key=attrgetter('age'))
[Person(name='Bob', age=20), Person(name='Alice', age=30), Person(name='Carol', age=40)]

methodcaller() builds a key function that calls the named method on each element and sorts by the result. For example:

>>> persons = [Person('Bob', 20), Person('Alice', 30), Person('Carol', 40)]
>>> sorted(persons, key=methodcaller('wage'))
[Person(name='Bob', age=20), Person(name='Alice', age=30), Person(name='Carol', age=40)]

itemgetter() builds a key function that sorts by the element at the given index. For example:

>>> inventory = [('apple', 3), ('banana', 2), ('pear', 5), ('orange', 1)]
>>> sorted(inventory, key=itemgetter(1))
[('orange', 1), ('banana', 2), ('apple', 3), ('pear', 5)]
>>> sorted(inventory, key=itemgetter(0))
[('apple', 3), ('banana', 2), ('orange', 1), ('pear', 5)]
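
itemgetter() also accepts several indexes, in which case it returns a tuple that can serve as a multi-level sort key. A small sketch with made-up inventory data, sorting by quantity and then by name:

```python
from operator import itemgetter

# Hypothetical inventory with repeated quantities.
inventory = [('apple', 3), ('banana', 2), ('pear', 3), ('orange', 2)]

# itemgetter(1, 0) maps ('apple', 3) to (3, 'apple'), so items are
# compared by quantity first and by name when quantities are equal.
sample_key = itemgetter(1, 0)(('apple', 3))
by_count_then_name = sorted(inventory, key=itemgetter(1, 0))
print(sample_key)
print(by_count_then_name)
```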

Using efficient arrays

The array type stores elements of a single numeric type in a compact form, so it typically uses less memory than a list holding the same values.

First import the array module and create an array object:

>>> from array import array
>>> a = array('i', [1, 4, 9, 16, 25, 36, 49, 64, 81, 100])

Inspect the array's type code and the size of each element:

>>> a.typecode
'i'
>>> a.itemsize
4

An itemsize of 4 means that each element occupies 4 bytes. The size of the 'i' type code depends on the platform; 4 bytes is typical.
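
To get a feel for the memory difference, the following sketch compares an array with a list of the same values. It assumes CPython, where sys.getsizeof() is available, and the list figure is only a rough estimate because it adds up the size of every int object the list refers to:

```python
import sys
from array import array

values = list(range(1000))   # sample data, chosen only for illustration
packed = array('i', values)

# Bytes used by the raw element data of the array.
payload_bytes = len(packed) * packed.itemsize

# Size of the array object, including its element buffer.
array_bytes = sys.getsizeof(packed)

# Rough estimate for the list: the list object itself plus each int object.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

print(payload_bytes, array_bytes, list_bytes)
```

The exact numbers depend on the platform and Python version, so they are not shown here.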

Add elements to the array object:

>>> a.append(121)
>>> a
array('i', [1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121])
>>> a.insert(0, 0)
>>> a
array('i', [0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121])
>>> a.extend([-2, -3, -4, -5, -6])
>>> a
array('i', [0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, -2, -3, -4, -5, -6])

Iterate over the elements of the array:

>>> for i, element in enumerate(a):
...     print(i, element)
...
0 0
1 1
2 4
3 9
4 16
5 25
6 36
7 49
8 64
9 81
10 100
11 121
12 -2
13 -3
14 -4
15 -5
16 -6

Adding an element of a different type raises a TypeError (the exact message varies between Python versions):

>>> a.append(3.14)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: integer argument expected, got float

Adding a value outside the range of the array's type code raises an OverflowError. The 'B' type code holds unsigned bytes, so values must be between 0 and 255:

>>> a = array('B', [1, 2, 3, 4, 5])
>>> a.append(300)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: unsigned byte integer is greater than maximum
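
The valid range for an unsigned type code follows from its item size. The helper below, unsigned_range(), is a hypothetical name introduced here for illustration and is not part of the array module:

```python
from array import array

def unsigned_range(typecode):
    """Return (min, max) for an unsigned integer type code such as 'B' or 'H'."""
    bits = 8 * array(typecode).itemsize
    return 0, 2 ** bits - 1

low, high = unsigned_range('B')
print(low, high)

# The largest valid value is accepted; the next one is rejected.
b = array('B')
b.append(high)
try:
    b.append(high + 1)
    overflowed = False
except OverflowError:
    overflowed = True
print(overflowed)
```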

Array objects can be converted to a list:

>>> a.tolist()
[1, 2, 3, 4, 5]

Array bisection

If a list is already sorted, the bisect module can quickly find the position of an element with a binary search, using its left and right bisection functions.

>>> import bisect
>>> a = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
>>> bisect.bisect(a, 5)
5
>>> bisect.bisect_left(a, 5)
4
>>> bisect.bisect_right(a, 5)
5

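bisect.bisect() and bisect.bisect_right() return the index just after any existing entries equal to the element, while bisect.bisect_left() returns the index of the first such entry. If the element is not in the list, all three return the position where it would be inserted.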
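
Because these functions return an insertion point rather than a match, an exact lookup needs one extra check. A minimal sketch, where find_index() is a hypothetical helper name:

```python
import bisect

def find_index(sorted_list, x):
    """Return the index of the leftmost item equal to x, or raise ValueError."""
    i = bisect.bisect_left(sorted_list, x)
    if i != len(sorted_list) and sorted_list[i] == x:
        return i
    raise ValueError('{!r} is not in the list'.format(x))

fib = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
position = find_index(fib, 5)

# The gap between the right and left insertion points counts the duplicates.
ones = bisect.bisect_right(fib, 1) - bisect.bisect_left(fib, 1)
print(position, ones)
```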

Use bisect.insort() to insert an element into a sorted list while keeping the list sorted. bisect.insort() is equivalent to bisect.insort_right(). For example:

>>> a = [1, 4, 9, 16, 25, 36, 49, 64, 81]
>>> bisect.insort(a, 25)
>>> a
[1, 4, 9, 16, 25, 25, 36, 49, 64, 81]
>>> bisect.insort_left(a, 25)
>>> a
[1, 4, 9, 16, 25, 25, 25, 36, 49, 64, 81]
>>> bisect.insort_right(a, 25)
>>> a
[1, 4, 9, 16, 25, 25, 25, 25, 36, 49, 64, 81]
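
As an illustration, insort() can keep a list ordered while values arrive one at a time. The sketch below uses a hypothetical helper, insert_all(). Keep in mind that only the search step is a binary search: each insertion still shifts the following elements, so for a large batch of values a single call to sorted() is usually the better choice.

```python
import bisect

def insert_all(values):
    """Build a sorted list by inserting each value at its sorted position."""
    result = []
    for value in values:
        bisect.insort(result, value)
    return result

incoming = [49, 1, 25, 9, 25, 4]   # sample data, chosen only for illustration
ordered = insert_all(incoming)
print(ordered)
```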

bisect can also be used for table lookups based on breakpoints. For example, to convert a numeric score to a letter grade, find the position of the score among the breakpoints and use that index to select the grade:

>>> breakpoints = [60, 70, 80, 90]
>>> grades = 'FDCBA'
>>> grades[bisect.bisect(breakpoints, 66)]
'D'
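
The same lookup can be wrapped in a function and applied to several scores. In this sketch, grade() is an illustrative name and the sample scores are made up:

```python
import bisect

BREAKPOINTS = [60, 70, 80, 90]
GRADES = 'FDCBA'

def grade(score):
    """Map a numeric score to a letter grade using the breakpoints."""
    # bisect() returns how many breakpoints are <= score,
    # which is exactly the index of the matching grade.
    return GRADES[bisect.bisect(BREAKPOINTS, score)]

results = [grade(score) for score in [33, 99, 77, 70, 89, 90, 100]]
print(results)
```

A score equal to a breakpoint, such as 70 or 90, falls into the higher grade, because bisect() places it to the right of the equal entry.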
