Python Programming/Lists


A list in Python is an ordered group of items (or elements). It is a very general structure, and list elements don't have to be of the same type: you can put numbers, letters, strings and nested lists all in the same list.

Overview

Lists in Python at a glance:

list1 = []                      # A new empty list
list2 = [1, 2, 3, "cat"]        # A new non-empty list with mixed item types
list1.append("cat")             # Add a single member, at the end of the list
list1.extend(["dog", "mouse"])  # Add several members
list1.insert(0, "fly")          # Insert at the beginning
list1[0:0] = ["cow", "doe"]     # Add members at the beginning
doe = list1.pop(1)              # Remove item at index
if "cat" in list1:              # Membership test
  list1.remove("cat")           # Remove AKA delete
#list1.remove("elephant")      # Would raise ValueError: the item is not in the list
for item in list1:              # Iteration AKA for each item
  print(item)
print("Item count:", len(list1))# Length AKA size AKA item count
list3 = [6, 7, 8, 9]
for i in range(0, len(list3)):  # Read-write iteration AKA for each item
  list3[i] += 1                 # Item access AKA element access by index
last = list3[-1]                # Last item
nextToLast = list3[-2]          # Next-to-last item
isempty = len(list3) == 0       # Test for emptiness
set1 = set(["cat", "dog"])      # Initialize set from a list
list4 = list(set1)              # Get a list from a set
list5 = list4[:]                # A shallow list copy
list4equal5 = list4==list5      # True: same by value
list4refEqual5 = list4 is list5 # False: not same by reference
list6 = list4[:]
del list6[:]                    # Clear AKA empty AKA erase
list7 = [1, 2] + [2, 3, 4]      # Concatenation
print(list1, list2, list3, list4, list5, list6, list7)
print(list4equal5, list4refEqual5)
print(list3[1:3], list3[1:], list3[:2]) # Slices
print(max(list3), min(list3), sum(list3)) # Aggregates

print([x for x in range(10)])   # List comprehension
print([x for x in range(10) if x % 2 == 1])
print([x for x in range(10) if x % 2 == 1 if x < 5])
print([x + 1 for x in range(10) if x % 2 == 1])
print([x + y for x in '123' for y in 'abc'])

List creation

There are two common ways to make a list in Python. The first is to write the items out directly ("statically"); the second is to use a list comprehension ("actively").

Plain creation

To make a static list of items, write them between square brackets. For example:

[ 1,2,3,"This is a list",'c',Donkey("kong") ]

Observations:

  1. The list contains items of different data types: integer, string, and Donkey class.
  2. Objects can be created 'on the fly' and added to lists. The last item is a new instance of Donkey class.

Creation of a new list whose members are constructed from non-literal expressions:

a = 2
b = 3
myList = [a+b, b+a, len(["a","b"])]

List comprehensions

With a list comprehension, you describe the process by which the list should be created. The comprehension has two parts: the first is an expression showing what each element will look like, and the second is the loop (with optional conditions) that produces the values for it.

For instance, let's say we have a list of words:

listOfWords = ["this","is","a","list","of","words"]

To take the first letter of each word and make a list out of it using list comprehension, we can do this:

>>> listOfWords = ["this","is","a","list","of","words"]
>>> items = [ word[0] for word in listOfWords ]
>>> print(items)
['t', 'i', 'a', 'l', 'o', 'w']

A list comprehension supports more than one for clause. The clauses are nested: for each item of the first sequence, the comprehension loops over all items of the second, so the result contains every combination, with the rightmost for varying fastest.

>>> item = [x+y for x in 'cat' for y in 'pot']
>>> print(item)
['cp', 'co', 'ct', 'ap', 'ao', 'at', 'tp', 'to', 'tt']

A list comprehension supports an if clause, so that only members fulfilling a certain condition are included in the list:

>>> print([x+y for x in 'cat' for y in 'pot'])
['cp', 'co', 'ct', 'ap', 'ao', 'at', 'tp', 'to', 'tt']
>>> print([x+y for x in 'cat' for y in 'pot' if x != 't' and y != 'o' ])
['cp', 'ct', 'ap', 'at']
>>> print([x+y for x in 'cat' for y in 'pot' if x != 't' or y != 'o' ])
['cp', 'co', 'ct', 'ap', 'ao', 'at', 'tp', 'tt']

In version 2.x, Python's list comprehension does not define a scope. Any variables bound during the evaluation remain bound to their last values when the evaluation completes. In version 3.x, the loop variables of a list comprehension are local to it:

>>> print(x, y)                        #Input to python version 2
t t                                    #Output using python 2

>>> print(x, y)                        #Input to python version 3
NameError: name 'x' is not defined     #Python 3 returns an error because x and y were not leaked

The Python 2 behavior is exactly the same as if the comprehension had been expanded into an explicitly nested group of one or more 'for' statements and zero or more 'if' statements.
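
As a minimal sketch of that expansion, the filtered comprehension from the earlier example can be written out as nested for statements. The names expanded and comprehension are illustrative only.

```python
# Expand [x+y for x in 'cat' for y in 'pot' if x != 't' and y != 'o']
# into explicit nested loops; the leftmost for becomes the outermost loop.
expanded = []
for x in 'cat':
    for y in 'pot':
        if x != 't' and y != 'o':
            expanded.append(x + y)

comprehension = [x + y for x in 'cat' for y in 'pot' if x != 't' and y != 'o']
print(expanded == comprehension)  # True
print(expanded)                   # ['cp', 'ct', 'ap', 'at']
```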

List creation shortcuts

You can initialize a list to a size, with an initial value for each element:

>>> zeros=[0]*5
>>> print(zeros)
[0, 0, 0, 0, 0]

This works for any data type:

>>> foos=['foo']*3
>>> print(foos)
['foo', 'foo', 'foo']

But there is a caveat. When building a new list by multiplying, Python copies each item by reference. This poses a problem for mutable items, for instance in a multidimensional array where each element is itself a list. You'd guess that the easy way to generate a two dimensional array would be:

listoflists=[ [0]*4 ] *5

and this works, but probably doesn't do what you expect:

>>> listoflists=[ [0]*4 ] *5
>>> print(listoflists)
[[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
>>> listoflists[0][2]=1
>>> print(listoflists)
[[0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0]]

What's happening here is that Python is using the same reference to the inner list as the elements of the outer list. Another way of looking at this issue is to examine how Python sees the above definition:

>>> innerlist=[0]*4
>>> listoflists=[innerlist]*5
>>> print(listoflists)
[[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
>>> innerlist[2]=1
>>> print(listoflists)
[[0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0]]

Assuming the above effect is not what you intend, one way around this issue is to use list comprehensions:

>>> listoflists=[[0]*4 for i in range(5)]
>>> print(listoflists)
[[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
>>> listoflists[0][2]=1
>>> print(listoflists)
[[0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
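
One way to check which situation you are in is to compare the identities of the rows. The following is a small sketch; the names shared_rows and independent_rows are made up for illustration.

```python
# Multiplying repeats the same inner list object five times.
shared_rows = [[0] * 4] * 5
# The comprehension evaluates [0] * 4 anew on every iteration.
independent_rows = [[0] * 4 for _ in range(5)]

print(shared_rows[0] is shared_rows[1])            # True: same object
print(independent_rows[0] is independent_rows[1])  # False: distinct objects

# Counting distinct object identities makes the difference explicit.
shared_count = len({id(row) for row in shared_rows})
independent_count = len({id(row) for row in independent_rows})
print(shared_count, independent_count)             # 1 5
```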

List size

To find the length of a list, use the built-in len() function.

>>> len([1,2,3])
3
>>> a = [1,2,3,4]
>>> len( a )
4

Combining lists

Lists can be combined in several ways. The easiest is just to 'add' them. For instance:

>>> [1,2] + [3,4]
[1, 2, 3, 4]

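Another way to combine lists is with extend, which adds the items to an existing list in place and returns None. Because it is a method call rather than a statement, it can also be used where only expressions are allowed, such as inside a lambda.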

>>> a = [1,2,3]
>>> b = [4,5,6]
>>> a.extend(b)
>>> print(a)
[1, 2, 3, 4, 5, 6]

To add a single value to the end of a list, use append. For example:

>>> p=[1,2]
>>> p.append([3,4])
>>> p
[1, 2, [3, 4]]
>>> # or
>>> print(p)
[1, 2, [3, 4]]

However, [3,4] is now a single element of the list, not a continuation of it. append always adds exactly one element to the end of a list. So if the intention is to concatenate two lists, use extend.
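
The difference can be seen side by side in the following sketch, which also shows that the += operator on a list behaves like extend and modifies the list in place. The variable names are illustrative only.

```python
appended = [1, 2]
appended.append("ab")   # The whole string is added as one element

extended = [1, 2]
extended.extend("ab")   # extend accepts any iterable; each character is added

augmented = [1, 2]
alias = augmented       # A second name for the same list object
augmented += [3, 4]     # In place, like extend, so alias sees the change

print(appended)  # [1, 2, 'ab']
print(extended)  # [1, 2, 'a', 'b']
print(alias)     # [1, 2, 3, 4]
```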

Getting pieces of lists (slices)

Continuous slices

Like strings, lists can be indexed and sliced:

>>> list = [2, 4, "usurp", 9.0, "n"]
>>> list[2]
'usurp'
>>> list[3:]
[9.0, 'n']

Much like the slice of a string is a substring, the slice of a list is a list. However, lists differ from strings in that we can assign new values to the items in a list:

>>> list[1] = 17
>>> list
[2, 17, 'usurp', 9.0, 'n']

We can assign new values to slices of the lists, which don't even have to be the same length:

>>> list[1:4] = ["opportunistic", "elk"]
>>> list
[2, 'opportunistic', 'elk', 'n']

It's even possible to insert items at the start of a list by assigning to an empty slice:

>>> list[:0] = [3.14, 2.71]
>>> list
[3.14, 2.71, 2, 'opportunistic', 'elk', 'n']

Similarly, you can append to the end of the list by specifying an empty slice after the end:

>>> list[len(list):] = ['four', 'score']
>>> list
[3.14, 2.71, 2, 'opportunistic', 'elk', 'n', 'four', 'score']

You can also completely change the contents of a list:

>>> list[:] = ['new', 'list', 'contents']
>>> list
['new', 'list', 'contents']

The right-hand side of a list assignment statement can be any iterable type:

>>> list[:2] = ('element',('t',),[])
>>> list
['element', ('t',), [], 'contents']

With slicing you can create a copy of a list, since a slice returns a new list:

>>> original = [1, 'element', []]
>>> list_copy = original[:]
>>> list_copy
[1, 'element', []]
>>> list_copy.append('new element')
>>> list_copy
[1, 'element', [], 'new element']
>>> original
[1, 'element', []]

Note, however, that this is a shallow copy and contains references to elements from the original list, so be careful with mutable types:

>>> list_copy[2].append('something')
>>> original
[1, 'element', ['something']]

Non-Continuous slices

It is also possible to get non-continuous parts of a list. To get every n-th item of a list, use the extended slice syntax a:b:n, where a and b are the start and end of the slice to be operated upon and n is the step.

>>> list = [i for i in range(10) ]
>>> list
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list[::2]
[0, 2, 4, 6, 8]
>>> list[1:7:2]
[1, 3, 5]
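
The step may also be negative, in which case the slice runs from the end toward the beginning. A short sketch, with illustrative variable names:

```python
numbers = list(range(10))

every_third = numbers[::3]     # Start at index 0, take every 3rd item
reversed_copy = numbers[::-1]  # A reversed shallow copy of the whole list
downward = numbers[8:2:-2]     # From index 8 down to (not including) index 2

print(every_third)    # [0, 3, 6, 9]
print(reversed_copy)  # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
print(downward)       # [8, 6, 4]
```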

Comparing lists

Lists can be compared for equality.

>>> [1,2] == [1,2]
True
>>> [1,2] == [3,4]
False

Lists can be compared using a less-than operator, which uses lexicographical order:

>>> [1,2] < [2,1]
True
>>> [2,2] < [2,1]
False
>>> ["a","b"] < ["b","a"]
True
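
Lexicographical order also covers lists of different lengths: the first differing item decides, and if one list is a prefix of the other, the shorter one is smaller. A brief sketch with illustrative names:

```python
prefix_is_smaller = [1, 2] < [1, 2, 3]     # [1, 2] is a proper prefix
first_difference = [1, 3] < [1, 2, 3]      # Decided at index 1: 3 > 2
empty_is_smallest = [] < [0]               # The empty list precedes any non-empty list

print(prefix_is_smaller, first_difference, empty_is_smallest)  # True False True
```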

Sorting lists

Sorting at a glance:

# In Python 3, items must be mutually comparable: mixing numbers and strings raises TypeError
list1 = ['b', 'a', 'C']
list1.sort()                                   # list1 gets modified, case sensitive
list2 = sorted(list1)                          # list1 is unmodified; since Python 2.4
list3 = sorted(list1, key=lambda x: x.lower()) # Case insensitive; requires all items to be strings
list4 = sorted(list1, reverse=True)            # Reverse sorting order: descending
print(list1, list2, list3, list4)

Sorting lists is easy with the sort method. In Python 3, the items must be mutually comparable; a list mixing numbers and strings raises TypeError.

>>> list1 = [2, 3, 1, 5, 4]
>>> list1.sort()
>>> list1
[1, 2, 3, 4, 5]

Note that the list is sorted in place, and the sort() method returns None to emphasize this side effect.

If you use Python 2.4 or higher there are some more sort parameters:

  • sort(cmp,key,reverse)
    • cmp : function to determine the relative order between any two elements (Python 2 only; removed in Python 3, where functools.cmp_to_key converts such a function into a key)
    • key : function to obtain the value against which to sort for any element.
    • reverse : sort(reverse=True) or sort(reverse=False)
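
The following sketch illustrates the key parameter and, for Python 3, one way to reuse an old-style comparison function through functools.cmp_to_key. The word list and the function longer_first are made up for this example.

```python
from functools import cmp_to_key

words = ["banana", "apple", "Cherry"]

case_sensitive = sorted(words)                   # Uppercase letters sort before lowercase
case_insensitive = sorted(words, key=str.lower)  # Compare lowercased copies instead

def longer_first(a, b):
    # Old-style comparison: negative if a should come first, positive if b should
    return len(b) - len(a)

by_length_desc = sorted(words, key=cmp_to_key(longer_first))

print(case_sensitive)    # ['Cherry', 'apple', 'banana']
print(case_insensitive)  # ['apple', 'banana', 'Cherry']
print(by_length_desc)    # ['banana', 'Cherry', 'apple']
```

Because Python's sort is stable, "banana" and "Cherry", which have equal length, keep their original relative order in the last result.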

Python also includes a sorted() function.

>>> list1 = [5, 2, 3, 4, 1]
>>> sorted(list1)
[1, 2, 3, 4, 5]
>>> list1
[5, 2, 3, 4, 1]

Note that unlike the sort() method, sorted(list) does not sort the list in place, but instead returns the sorted list. The sorted() function, like the sort() method also accepts the reverse parameter.

Iteration

Iteration over lists:

Read-only iteration over a list, AKA for each element of the list:

list1 = [1, 2, 3, 4]
for item in list1:
  print(item)

Writable iteration over a list:

list1 = [1, 2, 3, 4]
for i in range(0, len(list1)):
  list1[i]+=1 # Modify the item at an index as you see fit
print(list1)

From a number to a number with a step:

for i in range(1, 13+1, 3): # For i=1 to 13 step 3
  print(i)
for i in range(10, 5-1, -1): # For i=10 to 5 step -1
  print(i)

For each element of a list satisfying a condition (filtering):

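list1 = [1, 2, 3, 4]
def condition(item): # An example condition: keep even items
  return item % 2 == 0
for item in list1:
  if not condition(item):
    continue
  print(item)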

See also Python Programming/Loops#For_Loops.
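
As an alternative sketch of the writable iteration and the filtering shown above, the built-in enumerate yields each index together with its item, and a list comprehension can collect the filtered items. The name evens is illustrative.

```python
list1 = [1, 2, 3, 4]
for i, item in enumerate(list1):  # i is the index, item is list1[i]
    list1[i] = item + 1           # Replacing an item by index is safe while iterating
print(list1)                      # [2, 3, 4, 5]

# Filtering into a new list instead of printing inside the loop
evens = [item for item in list1 if item % 2 == 0]
print(evens)                      # [2, 4]
```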

Removing

Removing aka deleting an item at an index (see also #pop(i)):

list1 = [1, 2, 3, 4]
list1.pop() # Remove the last item
list1.pop(0) # Remove the first item , which is the item at index 0
print(list1)

list1 = [1, 2, 3, 4]
del list1[1] # Remove the 2nd element; an alternative to list1.pop(1)
print(list1)

Removing an element by value:

list1 = ["a", "a", "b"]
list1.remove("a") # Removes only the 1st occurrence of "a"
print(list1)

Creating a new list by copying a filtered selection of items from the old list:

list1 = [1, 2, 3, 4]
newlist = [item for item in list1 if item > 2]
print(newlist)

This uses a list comprehension.

Update a list by retaining a filtered selection of items in it by using "[:]":

list1 = [1, 2, 3, 4]
sameList = list1
list1[:] = [item for item in list1 if item > 2]
print(sameList, sameList is list1)

For a more complex condition, a separate function can be used to define the criteria:

list1 = [1, 2, 3, 4]
def keepingCondition(item):
  return item > 2
sameList = list1
list1[:] = [item for item in list1 if keepingCondition(item)]
print(sameList, sameList is list1)

Removing items while iterating a list usually leads to unintended outcomes unless you do it carefully by using an index:

list1 = [1, 2, 3, 4]
index = len(list1)
while index > 0:
  index -= 1
  if not list1[index] < 2:
    list1.pop(index)
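
To make the unintended outcome concrete, here is a small sketch of the pitfall together with another common workaround, iterating over a copy. The lists unsafe and safe are made up for illustration.

```python
# Pitfall: removing from the list being iterated makes the loop skip
# the item that slides into the removed item's position.
unsafe = [1, 2, 2, 3]
for item in unsafe:
    if item == 2:
        unsafe.remove(item)
print(unsafe)  # [1, 2, 3]: one of the 2s survived

# Workaround: iterate over a shallow copy and remove from the original.
safe = [1, 2, 2, 3]
for item in safe[:]:
    if item == 2:
        safe.remove(item)
print(safe)    # [1, 3]
```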

Aggregates

There are some built-in functions for arithmetic aggregates over lists. These include minimum, maximum, and sum:

list = [1, 2, 3, 4]
print(max(list), min(list), sum(list))
average = sum(list) / float(len(list)) # Provided the list is non-empty
# The float above ensures true division in Python 2; in Python 3, / already returns a float.
print(average)

The max and min functions also apply to lists of strings, returning maximum and minimum with respect to alphabetical order:

list = ["aa", "ab"]
print(max(list), min(list)) # Prints "ab aa"
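
The aggregates behave differently on an empty list: sum returns 0, while max and min raise ValueError unless a default is supplied. The sketch below assumes Python 3.4 or later, where max and min accept the default keyword; the variable names are illustrative.

```python
empty = []

total = sum(empty)                   # 0 for an empty list
largest = max(empty, default=None)   # default is returned instead of raising ValueError
smallest = min(empty, default=None)

# Guard the division so an empty list does not raise ZeroDivisionError
average = sum(empty) / len(empty) if empty else None

print(total, largest, smallest, average)  # 0 None None None
```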

Copying

Copying AKA cloning of lists:

Making a shallow copy:

list1= [1, 'element']
list2 = list1[:] # Copy using "[:]"
list2[0] = 2 # Only affects list2, not list1
print(list1[0]) # Displays 1

# By contrast
list1 = [1, 'element']
list2 = list1
list2[0] = 2 # Modifies the original list
print(list1[0]) # Displays 2

The above does not make a deep copy, which has the following consequence:

list1 = [1, [2, 3]] # Notice the second item being a nested list
list2 = list1[:] # A shallow copy
list2[1][0] = 4 # Modifies the 2nd item of list1 as well
print(list1[1][0]) # Displays 4 rather than 2

Making a deep copy:

import copy
list1 = [1, [2, 3]] # Notice the second item being a nested list
list2 = copy.deepcopy(list1) # A deep copy
list2[1][0] = 4 # Leaves the 2nd item of list1 unmodified
print(list1[1][0]) # Displays 2

See also #Continuous slices.
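
Besides "[:]", a shallow copy can also be made with the list constructor or, assuming Python 3.3 or later, with the copy method. A brief sketch with illustrative names; all three copies share the nested list with the original:

```python
original = [1, [2, 3]]

by_slice = original[:]
by_constructor = list(original)
by_method = original.copy()        # Available since Python 3.3

by_method[0] = 99                  # Only affects the copy
by_method[1].append(4)             # The nested list is shared by all of them

print(original)        # [1, [2, 3, 4]]
print(by_slice)        # [1, [2, 3, 4]]
print(by_constructor)  # [1, [2, 3, 4]]
print(by_method)       # [99, [2, 3, 4]]
```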

Clearing

Clearing a list:

del list1[:] # Clear a list
list1 = []   # Not really clear but rather assign to a new empty list

Clearing using a proper approach makes a difference when the list is passed as an argument:

def workingClear(ilist):
  del ilist[:]
def brokenClear(ilist):
  ilist = [] # Lets ilist point to a new list, losing the reference to the argument list
list1=[1, 2]; workingClear(list1); print(list1)
list1=[1, 2]; brokenClear(list1); print(list1)

Keywords: emptying a list, erasing a list, clear a list, empty a list, erase a list.
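
In Python 3.3 and later, lists also have a clear method, which empties the list in place just like "del ilist[:]". A minimal sketch; the function name clear_in_place and the variable alias are made up for this example.

```python
def clear_in_place(ilist):
    ilist.clear()          # Empties the very list object that was passed in

list1 = [1, 2]
alias = list1              # A second name for the same list object
clear_in_place(list1)
print(list1, alias)        # [] []
```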

Removing duplicate items

Removing duplicate items from a list (keeping only unique items) can be achieved as follows.

If each item in the list is hashable, using list comprehension, which is fast:

list1 = [1, 4, 4, 5, 3, 2, 3, 2, 1]
seen = {}
list1[:] = [seen.setdefault(e, e) for e in list1 if e not in seen]

If each item in the list is hashable, using index iteration, much slower (because scanning runs from the end, the last occurrence of each item is the one kept):

list1 = [1, 4, 4, 5, 3, 2, 3, 2, 1]
seen = set()
for i in range(len(list1) - 1, -1, -1):
  if list1[i] in seen:
    list1.pop(i)
  else:
    seen.add(list1[i])

If some items are not hashable, the set of visited items can be kept in a list:

list1 = [1, 4, 4, ["a", "b"], 5, ["a", "b"], 3, 2, 3, 2, 1]
seen = []
for i in range(len(list1) - 1, -1, -1):
  if list1[i] in seen:
    list1.pop(i)
  else:
    seen.append(list1[i])

If each item in the list is hashable and preserving element order does not matter:

list1 = [1, 4, 4, 5, 3, 2, 3, 2, 1]
list1[:] = list(set(list1))  # Modify list1
list2 = list(set(list1))

In the above examples where index iteration is used, scanning happens from the end to the beginning. Rewriting them to scan from the beginning to the end appears to make them considerably slower.
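
For hashable items, another order-preserving approach is to go through a dictionary. This sketch assumes Python 3.7 or later, where dictionaries are guaranteed to keep insertion order; the name unique is illustrative.

```python
list1 = [1, 4, 4, 5, 3, 2, 3, 2, 1]

# dict.fromkeys keeps one key per distinct item, in order of first occurrence
unique = list(dict.fromkeys(list1))
print(unique)  # [1, 4, 5, 3, 2]
```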

List methods

append(x)

Add item x onto the end of the list.

>>> list = [1, 2, 3]
>>> list.append(4)
>>> list
[1, 2, 3, 4]

See pop(i)

pop(i)

Remove the item in the list at the index i and return it. If i is not given, remove the last item in the list and return it.

>>> list = [1, 2, 3, 4]
>>> a = list.pop(0)
>>> list
[2, 3, 4]
>>> a
1
>>> b = list.pop()
>>> list
[2, 3]
>>> b
4

Operators

The + operator concatenates two lists.

The * operator creates a new list by concatenating the given list the given number of times, i.e. list1 * 0 == [] and list1 * 3 == list1 + list1 + list1.

The operator 'in' is used for two purposes: either to iterate over every item in a list in a for loop, or to check whether a value is in a list, returning True or False.

>>> list = [1, 2, 3, 4]
>>> if 3 in list:
...     pass  # Your code here
...
>>> l = [0, 1, 2, 3, 4]
>>> 3 in l
True
>>> 18 in l
False
>>> for x in l:
...     print(x)
...
0
1
2
3
4

Difference

To get the difference between two lists, just iterate:

a = [0, 1, 2, 3, 4, 4]
b = [1, 2, 3, 4, 4, 5]
print([item for item in a if item not in b])
# [0]

Intersection

To get the intersection of two lists (preserving the order of the elements and their duplicates), apply the difference to the difference:

a = [0, 1, 2, 3, 4, 4]
b = [1, 2, 3, 4, 4, 5]
dif = [item for item in a if item not in b]
print([item for item in a if item not in dif])
# [1, 2, 3, 4, 4]

# Note that using the above on:
a = [1, 1]; b = [1]
# will result in [1, 1]

# Similarly
a = [1]; b = [1, 1]
# will result in [1]
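
The expression "item not in b" scans the list b for every item of a. When all items are hashable (an assumption of this sketch), converting b to a set first gives the same results with faster membership tests. The names b_set, difference and intersection are illustrative.

```python
a = [0, 1, 2, 3, 4, 4]
b = [1, 2, 3, 4, 4, 5]

b_set = set(b)  # Set membership tests do not scan the whole collection

difference = [item for item in a if item not in b_set]
intersection = [item for item in a if item in b_set]

print(difference)    # [0]
print(intersection)  # [1, 2, 3, 4, 4]
```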

Exercises

  1. Use a list comprehension to construct the list ['ab', 'ac', 'ad', 'bb', 'bc', 'bd'].
  2. Use a slice on the above list to construct the list ['ab', 'ad', 'bc'].
  3. Use a list comprehension to construct the list ['1a', '2a', '3a', '4a'].
  4. Simultaneously remove the element '2a' from the above list and print it.
  5. Copy the above list and add '2a' back into the list such that the original is still missing it.
  6. Use a list comprehension to construct the list ['abe', 'abf', 'ace', 'acf', 'ade', 'adf', 'bbe', 'bbf', 'bce', 'bcf', 'bde', 'bdf']


Solutions

Question 1 :

List1 = [a + b for a in 'ab' for b in 'bcd']
print(List1)
>>> ['ab', 'ac', 'ad', 'bb', 'bc', 'bd']

Question 2 :

List2 = List1[::2]
print(List2)
>>> ['ab', 'ad', 'bc']

Question 3 :

List3 = [a + b for a in '1234' for b in 'a']
print(List3)
>>> ['1a', '2a', '3a', '4a']

Question 4 :

print(List3.pop(List3.index('2a')))
print(List3)
>>> 2a
>>> ['1a', '3a', '4a']

Question 5 :

List4 = List3[:]
List4.insert(1, '2a')
print(List4)
>>> ['1a', '2a', '3a', '4a']

Question 6 :

List5 = [a + b + c for a in 'ab' for b in 'bcd' for c in 'ef']
print(List5)
>>> ['abe', 'abf', 'ace', 'acf', 'ade', 'adf', 'bbe', 'bbf', 'bce', 'bcf', 'bde', 'bdf']