
Python Programming

The current, editable version of this book is available in Wikibooks, the open-content textbooks collection, at
http://en.wikibooks.org/wiki/Python_Programming

Permission is granted to copy, distribute, and/or modify this document under the terms of the Creative Commons Attribution-ShareAlike 3.0 License.


Overview

Python is a high-level, structured, open-source programming language that can be used for a wide variety of programming tasks. Python was created by Guido van Rossum in the early 1990s; its following has grown steadily, and interest has increased markedly in recent years. It is named after the comedy program Monty Python's Flying Circus.

Python is used extensively for system administration (many vital components of Linux distributions are written in it), and it is a great language for teaching programming to novices. NASA has used Python for its software systems and has adopted it as the standard scripting language for its Integrated Planning System. Python is also used extensively by Google to implement many components of its web crawler and search engine, and by Yahoo! for managing its discussion groups.

Python is an interpreted programming language that is automatically compiled into bytecode before execution (the bytecode is then normally saved to disk, just as automatically, so that compilation need not happen again unless the source changes). It is also a dynamically typed language that includes (but does not require one to use) object-oriented features and constructs.
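
As a rough illustration of the compile-then-run step described above, the built-in compile() function can turn source text into a code object by hand. This is only a sketch for Python 3; the names source, code_object and namespace are made up for the example, and normally the interpreter does all of this for you.

```python
# Source text for a tiny one-line program.
source = "total = 1 + 1"

# compile() turns the text into a code object that holds the bytecode.
code_object = compile(source, "<example>", "exec")

print(type(code_object).__name__)          # code
print(type(code_object.co_code).__name__)  # bytes: the raw bytecode

# exec() runs the compiled code; its variables end up in the dictionary.
namespace = {}
exec(code_object, namespace)
print(namespace["total"])                  # 2
```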

The most unusual aspect of Python is that whitespace is significant; instead of block delimiters (such as the braces "{}" in the C family of languages), indentation is used to indicate where blocks begin and end.
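
A small sketch of what indentation-based blocks look like in practice. The function name describe is invented for this illustration.

```python
def describe(number):
    # The indented lines below form the body of the function.
    if number % 2 == 0:
        # One level deeper: the body of the "if".
        label = "even"
    else:
        label = "odd"
    # Back at the function's indentation level, so the if/else has ended.
    return label


print(describe(4))  # even
print(describe(7))  # odd
```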

For example, the following Python code, typed interactively at an interpreter prompt, displays the famous "Hello World!" on the screen:

>>> print("Hello World!")
Hello World!

Another great feature of Python is its availability on many platforms. Python runs on Microsoft Windows, Macintosh, and Linux distributions. This makes programs very portable: a program written for one platform can usually be used on another with little or no change.

Python provides a powerful assortment of built-in types (e.g., lists, dictionaries and strings), a number of built-in functions, and a few constructs, mostly statements. For example, its loop constructs can iterate over the items in a collection instead of being limited to a simple range of integer values. Python also comes with a powerful standard library, which includes hundreds of modules that provide routines for a wide variety of services, including regular expressions and TCP/IP sessions.
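
To give a feel for these features, here is a minimal sketch that uses a list, a dictionary, a loop over a collection, and the standard library's re module. The fruit names and prices are invented for the example.

```python
import re

fruits = ["apple", "banana", "cherry"]            # a list
prices = {"apple": 3, "banana": 1, "cherry": 7}   # a dictionary

# The loop walks over the items themselves, not over integer positions.
for fruit in fruits:
    print(fruit, prices[fruit])

total = sum(prices.values())
print(total)  # 11

# A regular expression from the standard library: find the first run of digits.
match = re.search(r"\d+", "Order 66 shipped")
print(match.group())  # 66
```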

Python is used and supported by a large community that exists on the Internet. Mailing lists and newsgroups such as the tutor list actively support and help new Python programmers. While they discourage doing homework for you, they are quite helpful and are populated by the authors of many of the Python textbooks currently available on the market.

Note:

Python 2 vs Python 3: Several years ago, the Python developers made the decision to come up with a major new version of Python. Initially called “Python 3000”, this became the 3.x series of versions of Python. What was radical about this was that the new version is backward-incompatible with Python 2.x: certain old features (like the handling of Unicode strings) were deemed to be too unwieldy or broken to be worth carrying forward. Instead, new, cleaner ways of achieving the same results were added.
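
As one hedged illustration of the cleaner string handling mentioned in the note, Python 3 keeps text (str) and encoded data (bytes) strictly apart. The sketch below assumes Python 3; the sample word is arbitrary.

```python
text = "caf\u00e9"            # str: four Unicode characters, the last one is e-acute
data = text.encode("utf-8")   # bytes: the same text encoded as UTF-8

print(len(text))  # 4 characters
print(len(data))  # 5 bytes, because the accented letter needs two bytes in UTF-8

# Decoding the bytes gives back the original text.
print(data.decode("utf-8") == text)  # True
```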

 




Getting Python

In order to program in Python you need the Python interpreter. If it is not already installed or if the version you are using is obsolete, you will need to obtain and install Python using the methods below:

Python 2 vs Python 3

In 2008, a new version of Python (version 3) was published that was not entirely backward compatible. Developers were asked to switch to the new version as soon as possible, but many of the common external modules were not yet available for Python 3 as of August 2010. A program called 2to3 converts the source code of a Python 2 program to the source code of a Python 3 program. Consider this before you start working with Python. We are now in the era of Python 3.6.

Installing Python in Windows

Go to the Python Homepage or the ActiveState website and get the proper version for your platform. Download it, read the instructions and get it installed.

In order to run Python from the command line, you will need to have the python directory in your PATH. Alternatively, you could use an Integrated Development Environment (IDE) for Python like DrPython[1], eric[2], PyScripter[3], or Python's own IDLE (which ships with every version of Python since 2.3).

The PATH variable can be modified from the Windows System control panel. To add Python to the PATH in Windows 7:

  1. Go to Start.
  2. Right-click on Computer.
  3. Click on Properties.
  4. Click on 'Advanced system settings'.
  5. Click on 'Environment Variables'.
  6. Under System variables, select Path and edit it by appending ';C:\python27' (without the quotes).

If you prefer having a temporary environment, you can create a new command prompt short-cut that automatically executes the following statement:

PATH %PATH%;c:\python27

If you downloaded a different version (such as Python 3.1), replace the "27" with the version of Python you have (27 stands for 2.7.x, the current version of Python 2).

Cygwin

By default, the Cygwin installer for Windows does not include Python in the downloads. However, it can be selected from the list of packages.

Installing Python on Mac

Users of Apple Mac OS X will find that it already ships with Python 2.3 (OS X 10.4 Tiger) or Python 2.6.1 (OS X Snow Leopard). If you want a more recent version, head to the Python download page and follow the instructions on the page and in the installers. As a bonus you will also install the Python IDE.

Installing Python on Unix environments

Python is available as a package for some Linux distributions. In some cases, the distribution CD will contain the python package for installation, while other distributions require downloading the source code and using the compilation scripts.

Gentoo Linux

Gentoo is an example of a distribution that installs Python by default — the package management system Portage depends on Python.

Ubuntu Linux

Users of Ubuntu will notice that Python comes installed by default, although it is not always the latest version. To check which version of Python is installed, type
python -V
into the terminal.
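
The same check can also be made from inside Python. The following is a small sketch using the standard sys module; the variable names major and minor are chosen just for this example.

```python
import sys

# sys.version_info holds the version of the interpreter running this code.
major, minor = sys.version_info[:2]
print("Running Python {}.{}".format(major, minor))

if major < 3:
    print("This is Python 2; some examples written for Python 3 may need changes.")
```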

Arch Linux

Arch Linux does not come with Python pre-installed by default, but it is easily available for installation through the package manager, pacman. As root (or using sudo if you've installed and configured it), type:

pacman -S python

This will install Python 3. Python 2 can be installed with:

pacman -S python2

Other versions can be built from source from the Arch User Repository.

Source code installations

Some platforms do not have a version of Python installed, and do not have pre-compiled binaries. In these cases, you will need to download the source code from the official site. Once the download is complete, you will need to unpack the compressed archive into a folder.

To build Python, simply run the configure script (requires the Bash shell) and compile using make.

Other Distributions

Python, whose standard implementation is also referred to as CPython, is written in the C programming language. The C source code is generally portable, which means CPython can run on various platforms. More precisely, CPython can be made available on all platforms that provide a compiler to translate the C source code to binary code for that platform.

Apart from CPython there are also other implementations that run on top of a virtual machine, for example on Java's JRE (Java Runtime Environment) or Microsoft's .NET CLR (Common Language Runtime). Both can access and use the libraries available on their platform. Specifically, they make use of reflection, which allows complete inspection and use of all classes and objects of their respective platform.

Python Implementations (Platforms)

Environment   Description              Get From
Jython        Java version of Python   Jython
IronPython    C# version of Python     IronPython

Integrated Development Environments (IDE)

CPython ships with IDLE; however, IDLE is not considered user-friendly.[1] For Linux, KDevelop and Spyder are popular. For Windows, PyScripter is free, quick to install, and comes included with PortablePython.

Some Integrated Development Environments (IDEs) for Python

Environment              Description                               Get From
ActivePython             Highly flexible, Pythonwin IDE            ActivePython
Anjuta                   IDE for Linux/Unix                        Anjuta
Eclipse (PyDev plugin)   Open-source IDE                           Eclipse
Eric                     Open-source Linux/Windows IDE             Eric
KDevelop                 Cross-language IDE for KDE                KDevelop
Ninja-IDE                Cross-platform open-source IDE            Ninja-IDE
PyScripter               Free Windows IDE (portable)               PyScripter
Pythonwin                Windows-oriented environment              Pythonwin
Spyder                   Free cross-platform IDE (math-oriented)   Spyder
VisualWx                 Free GUI builder                          VisualWx

The Python official wiki has a complete list of IDEs.

There are several commercial IDEs such as Komodo, BlackAdder, Code Crusader, Code Forge, and PyCharm. However, for beginners learning to program, purchasing a commercial IDE is unnecessary.

Trying Python online

You can try Python online in a browser-based REPL (read-eval-print loop), thereby avoiding the need to install anything.

Keeping Up to Date

Python has a very active community and the language itself is evolving continuously. Make sure to check python.org for recent releases and relevant tools. The website is an invaluable asset.

Public Python-related mailing lists are hosted at mail.python.org. Two examples of such mailing lists are the Python-announce-list, which helps you keep up with newly released third-party modules or software for Python, and the general discussion list Python-list. These lists are mirrored to the Usenet newsgroups comp.lang.python.announce and comp.lang.python.




Setting it up

There are several IDEs available for Python. A full list can be found on the Python wiki.

Installing Python PyDev Plug-in for Eclipse IDE

You can use the Eclipse IDE as your Python IDE. The only requirements are Eclipse and the Eclipse PyDev Plug-in.

Go to http://www.eclipse.org/downloads/ and get the proper Eclipse IDE version for your OS platform. There are also updates on the site, but just look for the basic program, then download and install it. The install just requires you to unpack the downloaded Eclipse install file onto your system.

You can install the PyDev Plug-in in two ways:

  • Suggested: Use Eclipse's update manager, found in the menu bar under "Help" -> "Install New Software". Enter http://pydev.org/updates/ in the "Work with" field, click "Add", select PyDev, click "Next", and let Eclipse do the rest. Eclipse will now check for updates to PyDev whenever it searches for updates.
    • If you get an error stating a requirement for the plugin "org.eclipse.mylyn", expand the PyDev tree, and deselect the optional mylyn components.
  • Or install PyDev manually: go to http://pydev.sourceforge.net, download the latest PyDev Plug-in version, and install it by unpacking it into the Eclipse base folder.

Python Mode for Emacs

There is also a python mode for Emacs which provides features such as running pieces of code, and changing the tab level for blocks. You can download the mode at https://launchpad.net/python-mode

Installing new modules

Although many applications and modules have searchable webpages, there is a central repository for searching packages for installation: the Python Package Index (PyPI), formerly nicknamed the "Cheese Shop".





Interactive mode

Python has two basic modes: script and interactive. Script mode is the normal mode, in which finished .py files are run by the Python interpreter. Interactive mode is a command-line shell that gives immediate feedback for each statement while keeping the results of previously entered statements in memory. As new lines are fed into the interpreter, the program so far is evaluated both in part and in whole.

Interactive mode is a good way to play around and try variations on syntax.

On macOS or Linux, open a terminal and simply type "python". On Windows, bring up the command prompt and type "py", or start an interactive Python session by selecting "Python (command line)", "IDLE", or a similar program from the task bar or app menu. IDLE is a GUI which includes both an interactive mode and options to edit and run files.

Python should print something like this:

$ python
Python 3.0b3 (r30b3:66303, Sep  8 2008, 14:01:02) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>

(If Python doesn't run, make sure it is installed and your path is set correctly. See Getting Python.)

The >>> is Python's way of telling you that you are in interactive mode. In interactive mode what you type is immediately run. Try typing 1+1 in. Python will respond with 2. Interactive mode allows you to test out and see what Python will do. If you ever feel the need to play with new Python statements, go into interactive mode and try them out.

A sample interactive session:

>>> 5
5
>>> print(5*7)
35
>>> "hello" * 4
'hellohellohellohello'
>>> "hello".__class__
<class 'str'>

However, you need to be careful in the interactive environment to avoid confusion. For example, the following is a valid Python script:

if 1:
  print("True")
print("Done")

If you try to enter this as written in the interactive environment, you might be surprised by the result:

>>> if 1:
...   print("True")
... print("Done")
  File "<stdin>", line 3
    print("Done")
        ^
SyntaxError: invalid syntax

What the interpreter is saying is that the indentation of the second print was unexpected. You should have entered a blank line to end the first (i.e., "if") statement, before you started writing the next print statement. For example, you should have entered the statements as though they were written:

if 1:
  print("True")
 
print("Done")

Which would have resulted in the following:

>>> if 1:
...   print("True")
...
True
>>> print("Done")
Done
>>>

Interactive mode

Instead of Python exiting when the program is finished, you can use the -i flag to start an interactive session. This can be very useful for debugging and prototyping.

python -i hello.py




Self Help

This book is useful for learning Python, but there might be a topic that the book does not cover. You might want to search for modules in the standard library, or inspect an unknown object's functions, or perhaps you know there is a function that you have to call inside an object but you don't know its name. This is where the interactive help comes into play.

Navigating Help

To start Python's interactive help, type "help()" at the prompt.

>>> help()

You will be presented with a greeting and a quick introduction to the help system. For Python 2.6, the greeting will look something like this:

Welcome to Python 2.6!  This is the online help utility.

If this is your first time using Python, you should definitely check out
the tutorial on the Internet at http://docs.python.org/tutorial/.

Enter the name of any module, keyword, or topic to get help on writing
Python programs and using Python modules.  To quit this help utility and
return to the interpreter, just type "quit".

To get a list of available modules, keywords, or topics, type "modules",
"keywords", or "topics".  Each module also comes with a one-line summary
of what it does; to list the modules whose summaries contain a given word
such as "spam", type "modules spam".

Notice also that the prompt changes from ">>>" (three right angle brackets) to "help>". You can access the different portions of help simply by typing modules, keywords, or topics.

Typing the name of a module, keyword, or topic prints the help page associated with the item in question.

You can exit the help system by typing "quit" or by entering a blank line to return to the interpreter.

Help Parameter

You can obtain information on a specific command without entering interactive help.

For example, you can obtain help on a given topic simply by adding a string in quotes, such as help("object"). You may also obtain help on a given object as well, by passing it as a parameter to the help function.
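
Both forms described above can be tried in a short sketch. At the interactive prompt you would simply type help(len) or help("keywords"); the capturing of the output into a string here is only an addition for this example, so that the text can be inspected inside a script. It assumes Python 3.

```python
import contextlib
import io

# Collect everything help() prints into a string instead of the screen.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    help(len)          # pass the object itself ...
    help("keywords")   # ... or pass a topic name as a string

captured = buffer.getvalue()
print(captured)
```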



Creating Python programs


Welcome to Python! This tutorial will show you how to start writing programs.

Python programs are nothing more than text files, and they may be edited with a standard text editor program.[1] What text editor you use will probably depend on your operating system: any text editor can create Python programs. However, it is easier to use a text editor that includes Python syntax highlighting.


Hello World!

The first program that beginning programmers usually write is the "Hello, World!" program. This program simply outputs the phrase "Hello, World!" then terminates itself. Let's write "Hello, World!" in Python!

Open up your text editor and create a new file called hello.py containing just this line (you can copy-paste if you want):

print('Hello, world!')

In Python 3.x the line is the same; double quotes work just as well as single quotes:

print("Hello, world!")

You can also add the line below to pause the program at the end until you press Enter.

input()

This program uses the print function, which simply outputs its parameters to the terminal. By default, print appends a newline character to its output, which simply moves the cursor to the next line.
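
To see the appended newline for yourself, one option is to let print write into an in-memory buffer and then look at the exact characters. This is only a sketch for Python 3; the names buffer and written are made up for the example.

```python
import io

# Send the output to an in-memory buffer instead of the terminal.
buffer = io.StringIO()
print("Hello, world!", file=buffer)

written = buffer.getvalue()
print(repr(written))  # 'Hello, world!\n' : note the newline added by print
```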

Note:
In Python 2.x, print is a statement rather than a function. As such, it can be used without parentheses, in which case it prints everything until the end of the line and accepts a standalone comma after the final item on the line to indicate a multi-line statement. In Python 3.x, print is a proper function expecting its arguments inside parentheses. Using print with parentheses around a single argument (as above) also works in Python 2.x, so this style keeps simple programs version-independent.


Now that you've written your first program, let's run it in Python! This process differs slightly depending on your operating system.

Windows

  • Create a folder on your computer to use for your Python programs, such as C:\pythonpractice, and save your hello.py program in that folder.
  • In the Start menu, select "Run...", and type in cmd. This will cause the Windows terminal to open.
  • Type cd \pythonpractice to change directory to your pythonpractice folder, and hit Enter.
  • Type hello.py to run your program!

If it didn't work, make sure your PATH contains the python directory. See Getting Python.

Mac

  • Create a folder on your computer to use for your Python programs. A good suggestion would be to name it pythonpractice and place it in your Home folder (the one that contains folders for Documents, Movies, Music, Pictures, etc). Save your hello.py program into this folder.
  • Open the Applications folder, go into the Utilities folder, and open the Terminal program.
  • Type cd pythonpractice to change directory to your pythonpractice folder, and hit Enter.
  • Type python ./hello.py to run your program!

Note:
If you have both Python 2 and Python 3 installed (your machine comes with a version of Python 2, but you can install Python 3 as well), you should run python3 hello.py

Linux

  • Create a folder on your computer to use for your Python programs, such as ~/pythonpractice, and save your hello.py program in that folder.
  • Open up the terminal program. In KDE, open the main menu and select "Run Command..." to open Konsole. In GNOME, open the main menu, open the Applications folder, open the Accessories folder, and select Terminal.
  • Type cd ~/pythonpractice to change directory to your pythonpractice folder, and hit Enter.
  • Type python ./hello.py to run your program!

Note:
If you have both Python version 2.6.1 and Python 3.0 installed (very possible if you are using Ubuntu and ran sudo apt-get install python3 to have python3 installed), you should run python3 hello.py

Linux (advanced)

  • Create a folder on your computer to use for your Python programs, such as ~/pythonpractice.
  • Open up your favorite text editor and create a new file called hello.py containing just the following 2 lines (you can copy-paste if you want):[2]
#! /usr/bin/python
print('Hello, world!')

Note:
If you have both Python version 2.6.1 and version 3.0 installed (very possible if you are using Debian or a Debian-based distro such as *buntu or Mint, and ran sudo apt-get install python3 to have python3 installed), use

#! /usr/bin/python3
print('Hello, world!')
  • Save your hello.py program in the ~/pythonpractice folder.
  • Open up the terminal program. In KDE, open the main menu and select "Run Command..." to open Konsole. In GNOME, open the main menu, open the Applications folder, open the Accessories folder, and select Terminal.
  • Type cd ~/pythonpractice to change directory to your pythonpractice folder, and hit Enter.
  • Type chmod a+x hello.py to tell Linux that it is an executable program.
  • Type ./hello.py to run your program!
  • In addition, you can use ln -s "$PWD/hello.py" /usr/bin/hello (as root) to make a symbolic link to hello.py in /usr/bin under the name hello, then run it by simply executing hello. The link target must be an absolute path, otherwise the link will not resolve from /usr/bin.

Note that this should mainly be done for complete programs. If you have a script that you made and use frequently, it might be a good idea to put it somewhere in your home directory and put a link to it in /usr/bin. If you want a playground, a good idea is to invoke mkdir -p ~/.local/bin and then put scripts in there. To make the contents of ~/.local/bin executable the same way as those of /usr/bin, type export PATH="$PATH:$HOME/.local/bin" (you can add this line to your shell rc file, for example ~/.bashrc).

Note:
File extensions aren't necessary in UNIX-like file-systems. To Linux, hello.py means the exact same thing as hello.txt, hello.mp3, or just hello. Linux mostly uses the contents of the file to determine what type it is.

johndoe@linuxbox ~ $ file /usr/bin/hello
/usr/bin/hello: Python script, ASCII text executable

Result

The program should print:

Hello, world!

Congratulations! You're well on your way to becoming a Python programmer.

Exercises

  1. Modify the hello.py program to say hello to someone from your family or your friends (or to Ada Lovelace).
  2. Change the program so that after the greeting, it asks, "How did you get here?".
  3. Re-write the original program to use two print statements: one for "Hello" and one for "world". The program should still only print out on one line.

Solutions

Notes

  1. Sometimes, Python programs are distributed in compiled form. We won't have to worry about that for quite a while.
  2. A Quick Introduction to Unix/My First Shell Script explains what a hash bang line does.




Variables and Strings


In this section, you will be introduced to two basic building blocks of Python programs: variables and strings. Please follow along by running the included programs and examining their output.

Variables

A variable is something that holds a value that may change. In simplest terms, a variable is just a box that you can put stuff in. You can use variables to store all kinds of stuff, but for now, we are just going to look at storing numbers in variables.

lucky = 7
print (lucky)
7

This code creates a variable called lucky, and assigns to it the integer number 7. When we ask Python to tell us what is stored in the variable lucky, it returns that number again.

We can also change what is inside a variable. For example:

 

changing = 3                                   
print (changing)
3 

changing = 9
print (changing)
9

different = 12
print (different)
12
print (changing)
9

changing = 15
print (changing)
15

We declare a variable called changing, put the integer 3 in it, and verify that the assignment was properly done. Then, we assign the integer 9 to changing, and ask again what is stored in changing. Python has thrown away the 3, and has replaced it with 9. Next, we create a second variable, which we call different, and put 12 in it. Now we have two independent variables, different and changing, that hold different information, i.e., assigning a new value to one of them is not affecting the other.

You can also assign the value of a variable to be the value of another variable. For example:

red = 5
blue = 10
print (red, blue)
5 10

yellow = red
print (yellow, red, blue)
5 5 10

red = blue
print (yellow, red, blue)
5 10 10

To understand this code, keep in mind that the name of the variable is always on the left side of the equals sign (the assignment operator), and the value of the variable on the right side of the equals sign. First the name, then the value.

We start out declaring that red is 5, and blue is 10. As you can see, you can pass several arguments to print to tell it to print multiple items on one line, separating them by spaces. As expected, Python reports that red stores 5, and blue holds 10.

Now we create a third variable, called yellow. To set its value, we tell Python that we want yellow to be whatever red is. (Remember: name to the left, value to the right.) Python knows that red is 5, so it also sets yellow to be 5.

Now we're going to take the red variable, and set it to the value of the blue variable. Don't get confused — name on the left, value on the right. Python looks up the value of blue, and finds that it is 10. So, Python throws away red's old value (5), and replaces it with 10. After this assignment Python reports that yellow is 5, red is 10, and blue is 10.

But didn't we say that yellow should be whatever value red is? The reason that yellow is still 5 when red is 10, is because we only said that yellow should be whatever red is at the moment of the assignment. After Python has figured out what red is and assigned that value to yellow, yellow doesn't care about red any more. yellow has a value now, and that value is going to stay the same no matter what happens to red.

Note:
The interplay between different variables in Python is, in fact, more complex than explained here. The above example works with integer numbers and with all other basic data types built into Python; the behavior of lists and dictionaries (you will encounter these complex data types later) is entirely different, though. You may read the chapter on data types for a more detailed explanation of what variables really are in Python and how their type affects their behavior. However, it is probably sufficient for now if you just keep this in mind: whenever you are declaring variables or changing their values, you always write the name of the variable on the left of the equals sign (the assignment operator), and the value you wish to assign to it on the right.
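
As a brief preview of the difference mentioned in the note, here is a small sketch contrasting an integer with a list. The variable names are invented for this example; the chapter on data types explains the details.

```python
# With numbers, assignment behaves as described above.
number = 5
copy_of_number = number
number = 10
print(copy_of_number)  # 5: unaffected by the later assignment to number

# With a list, both names refer to the same list object.
first = [1, 2, 3]
second = first
first.append(4)        # changes the list itself, not the name
print(second)          # [1, 2, 3, 4]
```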

String

A 'string' is simply a sequence of characters in order. A character is anything you can type on the keyboard in one keystroke, like a letter, a digit, or a backslash. For example, "hello" is a string. It is five characters long: h, e, l, l, o. Strings can also have spaces: "hello world" contains 11 characters, including the space between "hello" and "world". There are no limits to the number of characters you can have in a string: you can have anywhere from one to a million or more. You can even have a string that has 0 characters, which is usually called "the empty string."
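
The character counts above can be checked with the built-in len() function, which is introduced properly later in this section. A minimal sketch; the variable names are chosen just for the example.

```python
short = len("hello")         # five letters
longer = len("hello world")  # the space counts as a character too
empty = len("")              # the empty string has no characters

print(short, longer, empty)  # 5 11 0
```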

There are three ways you can declare a string in Python: single quotes ('), double quotes ("), and triple quotes ("""). In all cases, you start and end the string with your chosen string declaration. For example:

>>> print ('I am a single quoted string')
I am a single quoted string
>>> print ("I am a double quoted string")
I am a double quoted string
>>> print ("""I am a triple quoted string""")
I am a triple quoted string

You can use quotation marks within strings by placing a backslash directly before them, so that Python knows you want to include the quotation marks in the string, instead of ending the string there. Placing a backslash directly before another symbol like this is known as escaping the symbol. Note that if you want to put a backslash into the string, you also have to escape the backslash, to tell Python that you want to include the backslash, rather than using it as an escape character.

>>> print ("So I said, \"You don't know me! You'll never understand me!\"")
So I said, "You don't know me! You'll never understand me!"
>>> print ('So I said, "You don\'t know me! You\'ll never understand me!"')
So I said, "You don't know me! You'll never understand me!"
>>> print ("This will result in only three backslashes: \\ \\ \\")
This will result in only three backslashes: \ \ \
>>> print ("""The double quotation mark (\") is used to indicate direct quotations.""")
The double quotation mark (") is used to indicate direct quotations.

As you can see from the above examples, only the specific character used to quote the string needs to be escaped. This makes for more readable code.

To see how to use strings, let's go back for a moment to an old, familiar program:

>>> print("Hello, world!")
Hello, world!

Look at that! You've been using strings since the beginning! You can also add two strings together using the + operator: this is called concatenating them.

>>> print ("Hello, " + "world!")
Hello, world!

Notice that there is a space at the end of the first string. If you don't put that in, the two words will run together, and you'll end up with Hello,world!

You can also repeat strings by using the * operator, like so:

>>> print ("bouncy, " * 10)
bouncy, bouncy, bouncy, bouncy, bouncy, bouncy, bouncy, bouncy, bouncy, bouncy,

If you want to find out how long a string is, you use the len() function, which simply takes a string and counts the number of characters in it. (len stands for "length.") Just put the string that you want to find the length of, inside the parentheses of the function. For example:

>>> print (len("Hello, world!"))
13

Strings and Variables

Now that you've learned about variables and strings separately, let's see how they work together.

Variables can store much more than just numbers. You can also use them to store strings! Here's how:

question = "What did you have for lunch?"
print (question)

In this program, we are creating a variable called question, and storing the string "What did you have for lunch?" in it. Then, we just tell Python to print out whatever is inside the question variable. Notice that when we tell Python to print out question, there are no quotation marks around the word question: this is to signify that we are using a variable, instead of a string. If we put in quotation marks around question, Python would treat it as a string, and simply print out question instead of What did you have for lunch?.
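
The difference between the variable and the quoted word can be seen side by side in this short sketch:

```python
question = "What did you have for lunch?"

print(question)    # no quotes: prints the text stored in the variable
print("question")  # quotes: prints the literal word question
```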

Let's try something different. Sure, it's all fine and dandy to ask the user what they had for lunch, but it doesn't make much difference if they can't respond! Let's edit this program so that the user can type in what they ate.

question = "What did you have for lunch?"
print (question)
answer = raw_input()  # In Python 3.x use input() instead; raw_input() no longer exists there.

print ("You had " + answer + "! That sounds delicious!")

To ask the user to write something, we used a function called raw_input(), which waits until the user writes something and presses enter, and then returns what the user wrote. Don't forget the parentheses! Even though there's nothing inside of them, they're still important, and Python will give you an error if you don't put them in. You can also use a different function called input(), which works in nearly the same way. We will learn the differences between these two functions later.

Note:
In Python 3.x raw_input() was renamed to input(). That is, the new input() function reads a line from sys.stdin and returns it without the trailing newline. It raises EOFError if the input is terminated prematurely (e.g. by pressing Ctrl+D). To get the old behavior of input(), use eval(input()).

In this program, we created a variable called answer, and put whatever the user wrote into it. Then, we print out a new string, which contains whatever the user wrote. Notice the extra space at the end of the "You had " string, and the exclamation mark at the start of the "! That sounds delicious!" string. They help format the output and make it look nice, so that the strings don't all run together.

Combining Numbers and Strings

Take a look at this program, and see if you can figure out what it's supposed to do.

print ("Please give me a number: ")
number = raw_input()

plusTen = number + 10
print ("If we add 10 to your number, we get " + plusTen)

This program should take a number from the user, add 10 to it, and print out the result. But if you try running it, it won't work! You'll get an error that looks like this:

Traceback (most recent call last):
  File "test.py", line 4, in <module>
    plusTen = number + 10
TypeError: cannot concatenate 'str' and 'int' objects

What's going on here? Python is telling us that there is a TypeError, which means there is a problem with the types of information being used. Specifically, Python can't figure out how to reconcile the two types of data that are being used simultaneously: integers and strings. For example, Python thinks that the number variable is holding a string, instead of a number. If the user enters 15, then number will contain a string that is two characters long: a 1, followed by a 5. So how can we tell Python that 15 should be a number, instead of a string?

Also, when printing out the answer, we are telling Python to concatenate together a string ("If we add 10 to your number, we get ") and a number (plusTen). Python doesn't know how to do that -- it can only concatenate strings together. How do we tell Python to treat a number as a string, so that we can print it out with another string?

Luckily, there are two functions that are perfect solutions for these problems. The int() function will take a string and turn it into an integer, while the str() function will take an integer and turn it into a string. In both cases, we put what we want to change inside the parentheses. Therefore, our modified program will look like this:

print ("Please give me a number: ")
response = raw_input()

number = int(response) 
plusTen = number + 10

print ("If we add 10 to your number, we get " + str(plusTen))

Note:
Another way of doing the same is to add a comma after the string part and then the number variable, like this:

print ("If we add 10 to your number, we get ", plusTen)

or use special print formatting like this:

print ("If we add 10 to your number, we get %s" % plusTen)

With multiple values, the same formatting can be written this way:

plusTwenty = number + 20
print ("If we add 10 and 20 to your number, we get %s and %s" % (plusTen, plusTwenty))

or use format()

print ("If we add 10 to your number, we get {0}".format(plusTen))

That's all you need to know about strings and variables! We'll learn more about types later.

List of Learned Functions

  • print(): Prints its parameter to the console.
  • input() or raw_input(): asks the user for a response, and returns that response. (Note that in version 3.x raw_input() does not exist and has been replaced by input())
  • len(): returns the length of a string (number of characters)
  • str(): returns the string representation of an object
  • int(): given a string or number, returns an integer

Note:

  1. Both input() and raw_input() accept a string as a parameter. This string is displayed as the prompt while waiting for the user input.
  2. In Python 2.x, the difference between the two is that raw_input() returns the data coming from the input device as a raw string, while input() evaluates the data as Python code. This is why using input() to read a string causes an error unless the user types the string in quotes.

In Python 2.x it is recommended to always use raw_input() and to convert the raw string into an integer with the int() function. That way you do not have to deal with evaluation errors until the error handling chapter, and you do not introduce a security vulnerability into your code.
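The recommendation above can be sketched in Python 3, where input() already returns a raw string. The helper name add_ten is hypothetical (it is not part of the original program), and the string passed to it stands in for whatever the user would type.

```python
# Sketch (Python 3): convert the raw string to an integer explicitly.
# add_ten is a hypothetical helper used only for illustration.
def add_ten(response):
    number = int(response)       # "15" -> 15
    plus_ten = number + 10
    return "If we add 10 to your number, we get " + str(plus_ten)

print(add_ten("15"))  # If we add 10 to your number, we get 25
```

In a real program the argument would come from input(); keeping the conversion in a function makes it easy to try with fixed strings first.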

Exercises

  1. Write a program that asks the user to type in a string, and then tells the user how long that string was.
  2. Ask the user for a string, and then for a number. Print out that string, that many times. (For example, if the string is hello and the number is 3 you should print out hellohellohello.)
  3. What would happen if a mischievous user typed in a word when you ask for a number? Try it.




Basic syntax


There are five fundamental concepts in Python.

Case Sensitivity

All variables are case-sensitive. Python treats 'number' and 'Number' as separate, unrelated entities.
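A minimal sketch of what this means in practice, using the two names from the sentence above (the values are arbitrary):

```python
# Names that differ only in case are separate, unrelated variables.
number = 10
Number = 20

print(number)  # 10
print(Number)  # 20
```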

Spaces and tabs don't mix

Instead of block delimiters (braces → "{}" in the C family of languages), indentation is used to indicate where blocks begin and end. Because whitespace is significant, spaces and tabs don't mix: use only one or the other when indenting your programs. Mixing them is a common error. While they may look the same in an editor, the interpreter reads them differently, which results in either an error or unexpected behavior. Most decent text editors can be configured to make the tab key emit spaces instead.

Python's style guide (PEP 8) recommends using 4 spaces per indentation level.

Tip: If you invoke Python 2.x from the command line, you can give the -t or -tt argument to make it issue a warning or an error, respectively, on inconsistent tab usage. (Python 3.x raises a TabError on inconsistent mixing of tabs and spaces.)

pythonprogrammer@wikibook:~$ python -tt myscript.py

This will issue an error if you have mixed spaces and tabs.


Objects

In Python, like all object-oriented languages, there are aggregations of code and data called objects, which typically represent the pieces in a conceptual model of a system.

Objects in Python are created (i.e., instantiated) from templates called classes (which are covered later, as much of the language can be used without understanding classes). They have attributes, which represent the various pieces of code and data which make up the object. To access attributes, one writes the name of the object followed by a period (henceforth called a dot), followed by the name of the attribute.

An example is the 'upper' attribute of strings, which refers to the code that returns a copy of the string in which all the letters are uppercase. To get to this, it is necessary to have a way to refer to the object (in the following example, the way is the literal string that constructs the object).

'bob'.upper

Code attributes are called methods. So in this example, upper is a method of 'bob' (as it is of all strings). To execute the code in a method, use a matched pair of parentheses surrounding a comma separated list of whatever arguments the method accepts (upper doesn't accept any arguments). So to find an uppercase version of the string 'bob', one could use the following:

'bob'.upper()
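The difference between referring to a method and executing it can be seen in a short sketch. The variable names here are chosen only for illustration.

```python
name = 'bob'

method = name.upper       # refers to the method without running it
print(callable(method))   # True

shouted = name.upper()    # the parentheses execute the method
print(shouted)            # BOB
print(name)               # bob (the original string is unchanged)
```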

Scope

In a large system, it is important that one piece of code does not affect another in difficult to predict ways. One of the simplest ways to further this goal is to prevent one programmer's choice of a name from blocking another's use of that name. The concept of scope was invented to do this. A scope is a "region" of code in which a name can be used and outside of which the name cannot be easily accessed. There are two ways of delimiting regions in Python: with functions or with modules. They each have different ways of accessing from outside the scope useful data that was produced within the scope. With functions, that way is to return the data. The way to access names from other modules leads us to another concept.
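As a rough illustration of function scope, the sketch below uses a hypothetical function compute_total. The name total exists only inside the function, and returning it is how the value gets out.

```python
def compute_total():
    total = 2 + 3     # 'total' exists only inside this function
    return total      # returning is how the value leaves the scope

result = compute_total()
print(result)         # 5

try:
    total             # the name is not defined out here
    leaked = True
except NameError:
    leaked = False
print(leaked)         # False
```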

Namespaces

It would be possible to teach Python without the concept of namespaces because they are so similar to attributes, which we have already mentioned, but the concept of namespaces is one that transcends any particular programming language, and so it is important to teach. To begin with, there is a built-in function dir() that can be used to help one understand the concept of namespaces. When you first start the Python interpreter (i.e., in interactive mode), you can list the objects in the current (or default) namespace using this function.

Python 2.3.4 (#53, Oct 18 2004, 20:35:07) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> dir()
['__builtins__', '__doc__', '__name__']

This function can also be used to show the names available within a module's namespace. To demonstrate this, first we can use the type() function to show what kind of object __builtins__ is:

>>> type(__builtins__)
<type 'module'>

Since it is a module, it has a namespace. We can list the names within the __builtins__ namespace, again using the dir() function (note that the complete list of names has been abbreviated):

>>> dir(__builtins__)
['ArithmeticError', ... 'copyright', 'credits', ... 'help', ... 'license', ... 'zip']
>>>

Namespaces are a simple concept. A namespace is a particular place in which names specific to a module reside. Each name within a namespace is distinct from names outside of that namespace. This layering of namespaces is called scope. A name is placed within a namespace when that name is given a value. For example:

>>> dir()
['__builtins__', '__doc__', '__name__']
>>> name = "Bob"
>>> import math
>>> dir()
['__builtins__', '__doc__', '__name__', 'math', 'name']

Note that I was able to add the "name" variable to the namespace using a simple assignment statement. The import statement was used to add the "math" name to the current namespace. To see what math is, we can simply evaluate it:

>>> math
<module 'math' (built-in)>

Since it is a module, it also has a namespace. To display the names within this namespace, we again use dir():

>>> dir(math)
['__doc__', '__name__', 'acos', 'asin', 'atan', 'atan2', 'ceil', 'cos', 'cosh', 'degrees', 'e',
'exp', 'fabs', 'floor', 'fmod', 'frexp', 'hypot', 'ldexp', 'log', 'log10', 'modf', 'pi', 'pow',
'radians', 'sin', 'sinh', 'sqrt', 'tan', 'tanh']
>>>

If you look closely, you will notice that both the default namespace and the math module namespace have a '__name__' object. The fact that each layer can contain an object with the same name is what scope is all about. To access objects inside a namespace, simply use the name of the module, followed by a dot, followed by the name of the object. This allows us to differentiate between the __name__ object within the current namespace, and that of the object with the same name within the math module. For example:

>>> print (__name__)
__main__
>>> print (math.__name__)
math
>>> print (math.__doc__)
This module is always available.  It provides access to the
mathematical functions defined by the C standard.
>>> math.pi
3.1415926535897931
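The same idea applies to any name, not just __name__. In this sketch a deliberately crude pi is defined in the current namespace; it does not disturb the pi that lives in the math module's namespace.

```python
import math

pi = 3    # a deliberately crude value in the current namespace

print(pi)                  # 3
print(math.pi)             # the math module's own 'pi', untouched
print('pi' in dir(math))   # True
```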



Sequences


Sequences allow you to store multiple values in an organized and efficient fashion. Python 2.x has seven sequence types: strings, Unicode strings, lists, tuples, bytearrays, buffers, and xrange objects. Dictionaries and sets are containers as well, although their elements are not accessed by position. There are more container types than these, but these are the most commonly used; see the section on sequence types in the official Python documentation.

Strings

We already covered strings, but that was before you knew what a sequence is. In other languages, the elements in arrays and sometimes the characters in strings may be accessed with the square brackets, or subscript operator. This works in Python too:

>>> "Hello, world!"[0]
'H'
>>> "Hello, world!"[1]
'e'
>>> "Hello, world!"[2]
'l'
>>> "Hello, world!"[3]
'l'
>>> "Hello, world!"[4]
'o'

Indexes are numbered from 0 to n-1 where n is the number of items (or characters), and they are positioned between the items:

 H  e  l  l  o  ,  _  w  o  r  l  d  !
 0  1  2  3  4  5  6  7  8  9 10 11 12

The item which comes immediately after an index is the one selected by that index. Negative indexes are counted from the end of the string:

>>> "Hello, world!"[-2]
'd'
>>> "Hello, world!"[-9]
'o'
>>> "Hello, world!"[-13]
'H'
>>> "Hello, world!"[-1]
'!'

But in Python, the colon : allows the square brackets to take as many as two numbers. For any sequence which only uses numeric indexes, this will return the portion which is between the specified indexes. This is known as "slicing," and the result of slicing a string is often called a "substring."

>>> "Hello, world!"[3:9]
'lo, wo'
>>> string = "Hello, world!"
>>> string[:5]
'Hello'
>>> string[-6:-1]
'world'
>>> string[-9:]
'o, world!'
>>> string[:-8]
'Hello'
>>> string[:]
'Hello, world!'

As demonstrated above, if either number is omitted it is assumed to be the beginning or end of the sequence. Note also that the brackets are inclusive on the left but exclusive on the right: in the first example above with [3:9] the position 3, 'l', is included while position 9, 'r', is excluded.
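One consequence of the "inclusive on the left, exclusive on the right" rule is that a slice [start:stop] that lies inside the string contains exactly stop - start characters. A small sketch, reusing the string from the examples above:

```python
string = "Hello, world!"

start, stop = 3, 9
piece = string[start:stop]

print(piece)                        # lo, wo
print(len(piece) == stop - start)   # True: the right index is excluded

# Splitting at any index and rejoining gives back the whole string.
print(string[:5] + string[5:] == string)   # True
```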

Lists

A list is just what it sounds like: a list of values, organized in order. A list is created using square brackets. For example, an empty list would be initialized like this:

spam = []

The values of the list are separated by commas. For example:

spam = ["bacon", "eggs", 42]

Lists may contain objects of varying types. The list above holds both the strings "bacon" and "eggs" as well as the number 42.

Like characters in a string, items in a list can be accessed by indexes starting at 0. To access a specific item in a list, you refer to it by the name of the list, followed by the item's number in the list inside brackets. For example:

>>> spam
['bacon', 'eggs', 42]
>>> spam[0]
'bacon'
>>> spam[1]
'eggs'
>>> spam[2]
42

You can also use negative numbers, which count backwards from the end of the list:

>>> spam[-1]
42
>>> spam[-2]
'eggs'
>>> spam[-3]
'bacon'

The len() function also works on lists, returning the number of items in the list:

>>> len(spam)
3

Note that the len() function counts the number of items inside a list, so the last item in spam (42) has the index (len(spam) - 1).

The items in a list can also be changed, just like the contents of an ordinary variable:

>>> spam = ["bacon", "eggs", 42]
>>> spam
['bacon', 'eggs', 42]
>>> spam[1]
'eggs'
>>> spam[1] = "ketchup"
>>> spam
['bacon', 'ketchup', 42]

(Strings, being immutable, cannot be modified in this way.) As with strings, lists may be sliced:

>>> spam = ["bacon", "eggs", 42]
>>> spam[1:]
['eggs', 42]
>>> spam[:-1]
['bacon', 'eggs']

It is also possible to add items to a list. There are many ways to do it; the easiest is to use the append() method of list:

>>> spam.append(10)
>>> spam
['bacon', 'eggs', 42, 10]

Note that you cannot manually insert an element by specifying the index outside of its range. The following code would fail:

>>> spam[4] = 10
IndexError: list assignment index out of range

To add an item at the end of a list, use append() as shown above. To insert an item at a certain index, use the insert() method of list, for example:

>>> spam.insert(1, 'and')
>>> spam
['bacon', 'and', 'eggs', 42, 10]


You can also delete items from a list using the del statement:

>>> spam
['bacon', 'and', 'eggs', 42, 10]
>>> del spam[1]
>>> spam
['bacon', 'eggs', 42, 10]
>>> spam[0]
'bacon'
>>> spam[1]
'eggs'
>>> spam[2]
42
>>> spam[3]
10

As you can see, the remaining items shift down to fill the gap, so there are no gaps in the numbering.

Assigning a list to another name does not copy it. Given a list a, if you set b to a and then change a, b will also appear changed, because both names refer to the same list object.

>>> a=[2, 3, 4, 5]
>>> b=a
>>> del a[3]
>>> print(a)
[2, 3, 4]
>>> print(b)
[2, 3, 4]

This can easily be worked around by using b=a[:] instead, which makes b a copy of the list.
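The two cases can be compared side by side. In this sketch, the is operator (which tests whether two names refer to the same object) is used to show the difference; the name c is introduced only for illustration.

```python
a = [2, 3, 4, 5]
b = a         # b is another name for the same list
c = a[:]      # c is a new list with the same items

del a[3]

print(b)        # [2, 3, 4]
print(c)        # [2, 3, 4, 5]
print(b is a)   # True
print(c is a)   # False
```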

For further explanation on lists, or to find out how to make 2D arrays, see Data Structure/Lists

Tuples

Tuples are similar to lists, except they are immutable. Once you have set a tuple, there is no way to change it whatsoever: you cannot add, change, or remove elements of a tuple. Otherwise, tuples work identically to lists.

To declare a tuple, you use commas:

unchanging = "rocks", 0, "the universe"

It is often necessary to use parentheses to differentiate between different tuples, such as when doing multiple assignments on the same line:

foo, bar = "rocks", 0, "the universe" # 3 elements here but only 2 names: raises ValueError
foo, bar = "rocks", (0, "the universe") # 2 elements here because the second element is a tuple

Unnecessary parentheses can be used without harm, but nested parentheses denote nested tuples:

>>> var = "me", "you", "us", "them"
>>> var = ("me", "you", "us", "them")

both produce:

>>> print(var)
('me', 'you', 'us', 'them')

but:

>>> var = ("me", "you", ("us", "them"))
>>> print(var)
('me', 'you', ('us', 'them')) # A tuple of 3 elements, the last of which is itself a tuple.
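The immutability described at the start of this section can be checked directly. The following sketch assumes Python 3; it tries to change an element and catches the resulting TypeError, then shows that reading, slicing, and len() work as they do for lists.

```python
unchanging = ("rocks", 0, "the universe")

try:
    unchanging[1] = 42           # item assignment is not allowed
    changed = True
except TypeError:
    changed = False

print(changed)           # False
print(unchanging[0])     # rocks
print(unchanging[1:])    # (0, 'the universe')
print(len(unchanging))   # 3
```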

For further explanation on tuple, see Data Structure/Tuples

Dictionaries

Dictionaries are also like lists, and they are mutable -- you can add, change, and remove elements from a dictionary. However, the elements in a dictionary are not bound to numbers, the way the elements of a list are. Every element in a dictionary has two parts: a key, and a value. Looking up a key in a dictionary returns the value linked to that key. You could consider a list to be a special kind of dictionary, in which the key of every element is a number, in numerical order.

Dictionaries are declared using curly braces, and each element is declared first by its key, then a colon, and then its value. For example:

>>> definitions = {"guava": "a tropical fruit", "python": "a programming language", "the answer": 42}
>>> definitions
{'python': 'a programming language', 'the answer': 42, 'guava': 'a tropical fruit'}
>>> definitions["the answer"]
42
>>> definitions["guava"]
'a tropical fruit'
>>> len(definitions)    
3

Also, adding an element to a dictionary is much simpler: simply assign to the new key as you would to a variable.

>>> definitions["new key"] = "new value"
>>> definitions
{'python': 'a programming language', 'the answer': 42, 'guava': 'a tropical fruit', 'new key': 'new value'}

The following Python 3 program (saved in a file rather than typed at the interactive prompt) searches a dictionary for a value and returns the matching key:

d = {1: 'one', 2: 'two', 3: 'three'}

def search(d, v):
    # Return the first key whose value equals v
    for key in d:
        if d[key] == v:
            return key
    return 'the value not found'

v1 = input('Enter a value to search for: ')
print(search(d, v1))
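Changing and removing elements, mentioned at the start of this section, work much like adding. A short sketch, assuming Python 3 (the replacement text for "python" is made up for the example):

```python
definitions = {"guava": "a tropical fruit", "python": "a programming language"}

definitions["the answer"] = 42               # add a new key
definitions["python"] = "a kind of snake"    # change an existing value
del definitions["guava"]                     # remove a key

print("guava" in definitions)    # False
print(definitions["python"])     # a kind of snake
print(len(definitions))          # 2
```

The in operator tests for the presence of a key, not a value.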

For further explanation on dictionary, see Data Structure/Dictionaries

Sets

Sets are just like lists except that they are unordered and they do not allow duplicate values. Elements of a set are neither bound to a number (like list and tuple) nor to a key (like dictionary). The reason for using a set over other data types is that sets provide fast insertion, deletion, and membership testing, which for a large number of items is much faster than with a list or tuple. Sets also support mathematical set operations such as testing for subsets and finding the union or intersection of two sets.

>>> mind = set([42, 'a string', (23, 4)])
>>> mind
set([(23, 4), 42, 'a string'])


>>> mind = set([42, 'a string', 40, 41])
>>> mind
set([40, 41, 42, 'a string'])
>>> mind = set([42, 'a string', 40, 0])
>>> mind
set([40, 0, 42, 'a string'])
>>> mind.add('hello')
>>> mind
set([40, 0, 42, 'a string', 'hello'])

Note that sets are unordered: items you add to a set end up in an indeterminate position, and that position may change from time to time.

>>> mind.add('duplicate value')
>>> mind.add('duplicate value')
>>> mind
set([0, 'a string', 40, 42, 'hello', 'duplicate value'])

Sets cannot contain a single value more than once. Unlike lists, which can contain anything, the types of data that can be included in sets are restricted. A set can only contain hashable, immutable data types. Integers, strings, and tuples are hashable; lists, dictionaries, and other sets (except frozensets, see below) are not.
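The set operations and the hashability restriction can both be tried in a few lines. This sketch assumes Python 3 and uses the {...} set literal syntax rather than set([...]); the set names are arbitrary.

```python
evens = {0, 2, 4, 6}
small = {0, 1, 2, 3}

print((evens | small) == {0, 1, 2, 3, 4, 6})   # union: True
print((evens & small) == {0, 2})               # intersection: True
print({0, 2} <= evens)                         # subset test: True

try:
    bad = {[1, 2]}          # a list is mutable, so it is not hashable
    accepted = True
except TypeError:
    accepted = False
print(accepted)             # False
```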

Frozenset

The relationship between frozenset and set is like the relationship between tuple and list. Frozenset is an immutable version of set. An example:

>>> frozen=frozenset(['life','universe','everything'])
>>> frozen
frozenset(['universe', 'life', 'everything'])
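Because a frozenset is immutable and hashable, it can be stored inside a set, which an ordinary set cannot. A minimal sketch, assuming Python 3 (the name groups is hypothetical):

```python
frozen = frozenset(['life', 'universe', 'everything'])

try:
    frozen.add('42')         # frozenset has no add() method
    grew = True
except AttributeError:
    grew = False
print(grew)                  # False

# Being hashable, a frozenset can be an element of a set.
groups = {frozen, frozenset(['life'])}
print(len(groups))           # 2
print(frozen in groups)      # True
```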

Other data types

Python also has other types of sequences, though these are used less frequently and need to be imported from the standard library before being used. We will only brush over them here.

  • array: a typed list; an array may only contain homogeneous values.
  • collections.defaultdict: a dictionary that returns a default value instead of raising an error when a key is not found.
  • collections.deque: a double-ended queue that allows fast manipulation at both ends.
  • heapq: functions that maintain a list as a priority queue.
  • Queue: a thread-safe multi-producer, multi-consumer queue for use with multi-threaded programs (the module is named queue in Python 3.x). Note that a list can also be used as a queue in single-threaded code.
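To give a feel for three of these types, here is a brief sketch assuming Python 3. It only touches the most basic operations of collections.defaultdict, collections.deque, and heapq; the variable names are arbitrary.

```python
from collections import defaultdict, deque
import heapq

counts = defaultdict(int)        # missing keys start at int() == 0
for letter in "banana":
    counts[letter] += 1
print(counts["a"])               # 3
print(counts["z"])               # 0

queue = deque([1, 2, 3])
queue.appendleft(0)              # add on the left
queue.append(4)                  # add on the right
print(list(queue))               # [0, 1, 2, 3, 4]

heap = []
for value in (5, 1, 3):
    heapq.heappush(heap, value)
smallest = heapq.heappop(heap)   # always removes the smallest item
print(smallest)                  # 1
```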

For further explanation on set, see Data Structure/Sets

3rd party data structure

Some useful data types in Python do not come in the standard library. Some of these are very specialized in their use. We will mention some of the more well known 3rd party types.

  • numpy.array: useful for heavy number crunching (see the numpy section)
  • sorteddict: like the name says, a sorted dictionary

This list isn't comprehensive.

Exercises

  1. Write a program that puts 5, 10, and "twenty" into a list. Then remove 10 from the list.
  2. Write a program that puts 5, 10, and "twenty" into a tuple.
  3. Write a program that puts 5, 10, and "twenty" into a set. Put "twenty", 10, and 5 into another set purposefully in a different order. Print both of them out and notice the ordering.
  4. Write a program that constructs a tuple, one element of which is a frozenset.
  5. Write a program that creates a dictionary mapping 1 to "Monday," 2 to "Tuesday," etc.



Data types


Data types determine whether an object can do something, or whether it just would not make sense. Other programming languages often determine whether an operation makes sense for an object by making sure the object can never be stored somewhere where the operation will be performed on the object (this type system is called static typing). Python does not do that. Instead it stores the type of an object with the object, and checks when the operation is performed whether that operation makes sense for that object (this is called dynamic typing).
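Dynamic typing can be observed with a function that declares no types at all. In the sketch below (Python 3 assumed, function name hypothetical), the same function works for several types, and the type check only happens when the * operation is actually executed.

```python
def double(value):
    return value * 2     # no type is declared for 'value'

print(double(21))        # 42
print(double("ab"))      # abab
print(double([1, 2]))    # [1, 2, 1, 2]

try:
    double(None)         # the check happens only when '*' is executed
    worked = True
except TypeError:
    worked = False
print(worked)            # False
```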

Built-in Data types

Python's built-in (or standard) data types can be grouped into several classes. Sticking to the hierarchy scheme used in the official Python documentation these are numeric types, sequences, sets and mappings (and a few more not discussed further here). Some of the types are only available in certain versions of the language as noted below.

  • boolean: the type of the built-in values True and False. Useful in conditional expressions, and anywhere else you want to represent the truth or falsity of some condition. Mostly interchangeable with the integers 1 and 0. In fact, conditional expressions will accept values of any type, treating special ones like boolean False, integer 0 and the empty string "" as equivalent to False, and all other values as equivalent to True. But for safety’s sake, it is best to only use boolean values in these places.

Numeric types:

  • int: Integers; equivalent to C longs in Python 2.x, non-limited length in Python 3.x
  • long: Long integers of non-limited length; exists only in Python 2.x
  • float: Floating-Point numbers, equivalent to C doubles
  • complex: Complex Numbers

Sequences:

  • str: String; represented as a sequence of 8-bit characters in Python 2.x, but as a sequence of Unicode characters (in the range of U+0000 - U+10FFFF) in Python 3.x
  • bytes: a sequence of integers in the range of 0-255; a distinct type only in Python 3.x (in Python 2.6 and 2.7, bytes is an alias for str)
  • bytearray: like bytes, but mutable (see below); available since Python 2.6
  • list
  • tuple

Sets:

  • set: an unordered collection of unique objects; available as a built-in type since Python 2.4
  • frozenset: like set, but immutable (see below); available as a built-in type since Python 2.4

Mappings:

  • dict: Python dictionaries, also called hashmaps or associative arrays, in which each key is associated with a value, rather like a Map in Java

Some others, such as type and callables

Mutable vs Immutable Objects

In general, data types in Python can be distinguished based on whether objects of the type are mutable or immutable. The content of objects of immutable types cannot be changed after they are created.

Some immutable types:

  • int, float, complex
  • str
  • bytes
  • tuple
  • frozenset
  • bool

Some mutable types:

  • array
  • bytearray
  • list
  • set
  • dict

Only mutable objects support methods that change the object in place, such as reassignment of a sequence slice, which will work for lists, but raise an error for tuples and strings.

It is important to understand that variables in Python are really just references to objects in memory. If you assign an object to a variable as below,

a = 1
s = 'abc'
l = ['a string', 456, ('a', 'tuple', 'inside', 'a', 'list')]

all you really do is make this variable (a, s, or l) point to the object (1, 'abc', ['a string', 456, ('a', 'tuple', 'inside', 'a', 'list')]), which is kept somewhere in memory, as a convenient way of accessing it. If you reassign a variable as below

a = 7
s = 'xyz'
l = ['a simpler list', 99, 10]

you make the variable point to a different object (newly created ones in our examples). As stated above, only mutable objects can be changed in place (l[0] = 1 is ok in our example, but s[0] = 'a' raises an error).

This becomes tricky when an operation is not explicitly asking for a change to happen in place, as is the case for the += (increment) operator, for example. When used on an immutable object (as in a += 1 or in s += 'qwertz'), Python will silently create a new object and make the variable point to it. However, when used on a mutable object (as in l += [1,2,3]), the object pointed to by the variable will be changed in place.

While in most situations you do not have to know about this different behavior, it is of relevance when several variables are pointing to the same object. In our example, assume you set p = s and m = l, then s += 'etc' and l += [9,8,7]. This will change s and leave p unaffected, but will change both m and l since both point to the same list object. Python's built-in id() function, which returns a unique identifier for the object that a given variable name refers to, can be used to trace what is happening under the hood.

Typically, this behavior of Python causes confusion in functions. As an illustration, consider this code:

def append_to_sequence (myseq):
    myseq += (9,9,9)
    return myseq

tuple1 = (1,2,3) # tuples are immutable
list1 = [1,2,3] # lists are mutable

tuple2 = append_to_sequence(tuple1)
list2 = append_to_sequence(list1)

print 'tuple1 = ', tuple1 # outputs (1, 2, 3)
print 'tuple2 = ', tuple2 # outputs (1, 2, 3, 9, 9, 9)
print 'list1 = ', list1 # outputs [1, 2, 3, 9, 9, 9]
print 'list2 = ', list2 # outputs [1, 2, 3, 9, 9, 9]

This will give the above indicated, and usually unintended, output. myseq is a local variable of the append_to_sequence function, but when this function gets called, myseq will nevertheless point to the same object as the variable that we pass in (tuple1 or list1 in our example). If that object is immutable (like a tuple), there is no problem. The += operator will cause the creation of a new tuple, and myseq will be set to point to it. However, if we pass in a reference to a mutable object, that object will be manipulated in place (so myseq and list1, in our case, end up pointing to the same list object).
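The p = s and m = l scenario from the text can be traced with id(), as in the sketch below (Python 3 assumed). The function appended_copy is a hypothetical variant of append_to_sequence, not part of the original example: it uses + instead of +=, which always builds a new object and so leaves the caller's list alone.

```python
s = 'abc'
l = [1, 2, 3]
p, m = s, l             # p and m refer to the same objects as s and l

s += 'etc'              # str is immutable: s now names a new object
l += [9, 8, 7]          # list is mutable: the shared list changes in place

print(p)                # abc
print(m)                # [1, 2, 3, 9, 8, 7]
print(id(m) == id(l))   # True
print(id(p) == id(s))   # False

def appended_copy(myseq):
    # Hypothetical variant: 'myseq + ...' always builds a new object.
    return myseq + type(myseq)((9, 9, 9))

list1 = [1, 2, 3]
list2 = appended_copy(list1)
print(list1)            # [1, 2, 3]
print(list2)            # [1, 2, 3, 9, 9, 9]
```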

Creating Objects of Defined Types

Literal integers can be entered in three ways:

  • decimal numbers can be entered directly
  • hexadecimal numbers can be entered by prepending a 0x or 0X (0xff is hex FF, or 255 in decimal)
  • the format of octal literals depends on the version of Python:
      • Python 2.x: octals can be entered by prepending a 0 (0732 is octal 732, or 474 in decimal)
      • Python 3.x: octals can be entered by prepending a 0o or 0O (0o732 is octal 732, or 474 in decimal)
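The literal forms listed above can be checked against each other. This sketch assumes Python 3, so it uses the 0o prefix for octal; int(), hex(), and oct() are built-in functions for converting between the notations.

```python
print(0xff)              # 255
print(0o732)             # 474
print(0xff == 255)       # True
print(int("732", 8))     # 474, parsing octal digits from a string
print(hex(255))          # 0xff
print(oct(474))          # 0o732
```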

Floating point numbers can be entered directly.

In Python 2.x, long integers are entered either directly (1234567891011121314151617181920 is a long integer) or by appending an L (0L is a long integer). Computations involving short integers that overflow are automatically turned into long integers. In Python 3.x there is no separate long type, and the L suffix is not accepted.

Complex numbers are entered by adding a real number and an imaginary one, which is entered by appending a j (e.g. 10+5j is a complex number; so is 10j). Note that j by itself does not constitute a number. If this is desired, use 1j.

Strings can be either single or triple quoted strings. The difference is in the starting and ending delimiters, and in that single quoted strings cannot span more than one line. Single quoted strings are entered by entering either a single quote (') or a double quote (") followed by its match. So therefore

'foo' works, and
"moo" works as well,
     but
'bar" does not work, and
"baz' does not work either.
"quux'' is right out.

Triple quoted strings are like single quoted strings, but can span more than one line. Their starting and ending delimiters must also match. They are entered with three consecutive single or double quotes, so

'''foo''' works, and
"""moo""" works as well,
     but
'"'bar'"' does not work, and
"""baz''' does not work either.
'"'quux"'" is right out.

Tuples are entered in parentheses, with commas between the entries:

(10, 'Mary had a little lamb')

Also, the parentheses can be left out when it's not ambiguous to do so:

10, 'whose fleece was as white as snow'

Note that one-element tuples can be entered by surrounding the entry with parentheses and adding a comma like so:

('this is a singleton tuple',)

Lists are similar, but with brackets:

['abc', 1,2,3]

Dicts are created by surrounding with curly braces a list of key/value pairs separated from each other by a colon and from the other entries with commas:

{ 'hello': 'world', 'weight': 'African or European?' }

Any of these composite types can contain any other, to any depth:

((((((((('bob',),['Mary', 'had', 'a', 'little', 'lamb']), { 'hello' : 'world' } ),),),),),),)

Null object

The Python analogue of null pointer known from other programming languages is None. None is not a null pointer or a null reference but an actual object of which there is only one instance. One of the uses of None is in default argument values of functions, for which see Python Programming/Functions#Default_Argument_Values. Comparisons to None are usually made using is rather than ==.

Testing for None and assignment:

if item is None:
  ...
  another = None

if not item is None:
  ...

if item is not None: # Also possible
  ...

Using None in a default argument value:

def log(message, type = None):
  ...

PEP8 states that "Comparisons to singletons like None should always be done with is or is not, never the equality operators." Therefore, "if item == None:" is inadvisable. A class can redefine the equality operator (==) such that instances of it will equal None.
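The last point can be demonstrated with a deliberately contrived class. AlwaysEqual is hypothetical and exists only to show why is gives a reliable answer where == may not.

```python
class AlwaysEqual:
    # Hypothetical class whose instances claim to equal everything.
    def __eq__(self, other):
        return True

item = AlwaysEqual()

print(item == None)    # True, although item is not None
print(item is None)    # False: identity cannot be redefined
print(None is None)    # True
```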

See also Operators#Identity chapter.

Type conversion

Type conversion in Python by example:

v1 = int(2.7) # 2
v2 = int(-3.9) # -3
v3 = int("2") # 2
v4 = int("11", 16) # 17, base 16
v5 = long(2) # Python 2.x only; long does not exist in Python 3.x
v6 = float(2) # 2.0
v7 = float("2.7") # 2.7
v8 = float("2.7E-2") # 0.027
v9 = float(False) # 0.0
vA = float(True) # 1.0
vB = str(4.5) # "4.5"
vC = str([1, 3, 5]) # "[1, 3, 5]"
vD = bool(0) # False; bool fn since Python 2.2.1
vE = bool(3) # True
vF = bool([]) # False - empty list
vG = bool([False]) # True - non-empty list
vH = bool({}) # False - empty dict; same for empty tuple
vI = bool("") # False - empty string
vJ = bool(" ") # True - non-empty string
vK = bool(None) # False
vL = bool(len) # True
vM = set([1, 2])
vN = list(vM)
vO = list({1: "a", 2: "b"}) # dict -> list of keys
vP = tuple(vN)
vQ = list("abc") # ['a', 'b', 'c']
print v1, v2, v3, type(v1), type(v2), type(v3)

Implicit type conversion:

int1 = 4
float1 = int1 + 2.1 # 4 converted to float
# str1 = "My int:" + int1 # Error: no implicit type conversion from int to string
str1 = "My int:" + str(int1)
int2 = 4 + True # 5: bool is implicitly converted to int
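
Conversions can also fail. The following is a minimal sketch, written for Python 3, of guarding a string-to-int conversion; the helper name to_int is an assumption made for this example and is not part of Python.

```python
def to_int(text, default=None):
    """Return int(text), or default when text is not a valid integer literal."""
    try:
        return int(text)
    except ValueError:
        return default


a = to_int("42")            # 42
b = to_int("2.7")           # None: int() does not parse a decimal point
c = int(float("2.7"))       # 2: go through float first, then truncate
d = isinstance(True, int)   # True: bool is a subclass of int
```

The last line shows why 4 + True works without an explicit conversion.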

Keywords: type casting.

Exercises

  1. Write a program that instantiates a single object, adds [1,2] to the object, and returns the result.
    1. Find an object that returns an output of the same length (if one exists?).
    2. Find an object that returns an output length 2 greater than it started.
    3. Find an object that causes an error.
  2. Find two data types X and Y such that X = X + Y will cause an error, but X += Y will not.



Numbers


Python 2.x supports 4 built-in numeric types - int, long, float and complex. Of these, the long type has been dropped in Python 3.x - the int type is now of unlimited length by default. You don’t have to specify what type of variable you want; Python does that automatically.

  • Int: The basic integer type in Python. In Python 2.x it is equivalent to the platform's C long; in Python 3.x it is unlimited in length.
  • Long: Integer type with unlimited length. In Python 2.2 and later, ints are automatically turned into long ints when they overflow. Dropped in Python 3.0; use the int type instead.
  • Float: This is a binary floating-point number. Longs and ints are automatically converted to floats when a float is used in an expression, and with the true-division / operator. In CPython, floats are usually implemented using the C language's double, which often yields 52 bits of significand, 11 bits of exponent, and 1 sign bit, but this is machine dependent.
  • Complex: This is a complex number consisting of two floats. Complex literals are written as a + bj where a and b are floating-point numbers denoting the real and imaginary parts respectively.

In general, the number types are automatically 'up cast' in this order:

Int → Long → Float → Complex. When an operation mixes two of these types, the operand whose type is further to the left is converted to the type further to the right.
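
As a rough sketch of this up-casting, the following mixes types in a few expressions. It is written for Python 3, where the separate long type no longer exists.

```python
int_plus_float = 1 + 2.0        # the int is converted to float
float_plus_complex = 1.5 + 2j   # the float is converted to complex
big = 2 ** 100                  # Python 3 ints never overflow

kinds = (type(int_plus_float), type(float_plus_complex), type(big))
```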

>>> x = 5
>>> type(x)
<type 'int'>
>>> x = 187687654564658970978909869576453
>>> type(x)
<type 'long'>
>>> x = 1.34763
>>> type(x)
<type 'float'>
>>> x = 5 + 2j
>>> type(x)
<type 'complex'>

Division can be confusing. In Python 2.x, using the / operator on two integers returns another integer, using floor division; for example, 5/2 gives 2. To get true division you have to make one of the operands a float: 5/2. or 5./2 (the trailing dot makes the literal a float) yields 2.5. Starting with Python 2.2, this behavior can be changed to true division with the statement from __future__ import division. In Python 3.x, the / operator always performs true division. Floor division can be requested explicitly with the // operator, available since Python 2.2.

This illustrates the behavior of the / operator in Python 2.2+:

>>> 5/2
2
>>> 5/2.
2.5
>>> 5./2
2.5
>>> from __future__ import division
>>> 5/2
2.5
>>> 5//2
2
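
For comparison, here is a short sketch of the same operators under Python 3, where no __future__ import is needed:

```python
true_quotient = 5 / 2       # 2.5 in Python 3
floor_quotient = 5 // 2     # 2
negative_floor = -5 // 2    # -3: floor division rounds toward negative infinity
float_floor = 5.0 // 2      # 2.0: still floored, but the result is a float
quotient, remainder = divmod(5, 2)  # (2, 1)
```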

For operations on numbers, see chapters Basic Math and Math.



Strings


Overview

Strings in Python at a glance:

str1 = "Hello"                # A new string using double quotes
str2 = 'Hello'                # Single quotes do the same
str3 = "Hello\tworld\n"       # One with a tab and a newline
str4 = str1 + " world"        # Concatenation
str5 = str1 + str(4)          # Concatenation with a number
str6 = str1[2]                # 3rd character
str6a = str1[-1]              # Last character
#str1[0] = "M"                # No way; strings are immutable
for char in str1: print char  # For each character
str7 = str1[1:]               # Without the 1st character
str8 = str1[:-1]              # Without the last character
str9 = str1[1:4]              # Substring: 2nd to 4th character
str10 = str1 * 3              # Repetition
str11 = str1.lower()          # Lowercase
str12 = str1.upper()          # Uppercase
str13 = str1.rstrip()         # Strip right (trailing) whitespace
str14 = str1.replace('l','h') # Replacement
list15 = str1.split('l')      # Splitting
if str1 == str2: print "Equ"  # Equality test
if "el" in str1: print "In"   # Substring test
length = len(str1)            # Length
pos1 = str1.find('llo')       # Index of substring or -1
pos2 = str1.rfind('l')        # Index of substring, from the right
count = str1.count('l')       # Number of occurrences of a substring

print str1, str2, str3, str4, str5, str6, str7, str8, str9, str10 
print str11, str12, str13, str14, list15
print length, pos1, pos2, count

See also chapter Regular Expression for advanced pattern matching on strings in Python.

String operations

Equality

Two strings are equal if they have exactly the same contents, meaning that they are both the same length and each character has a one-to-one positional correspondence. Many other languages compare strings by identity instead; that is, two strings are considered equal only if they occupy the same space in memory. Python uses the is operator to test the identity of strings and any two objects in general.

Examples:

>>> a = 'hello'; b = 'hello' # Assign 'hello' to a and b.
>>> a == b                   # check for equality
True
>>> a == 'hello'             #
True
>>> a == "hello"             # (choice of delimiter is unimportant)
True
>>> a == 'hello '            # (extra space)
False
>>> a == 'Hello'             # (wrong case)
False

Numerical

There are two quasi-numerical operations which can be done on strings: addition and multiplication. String addition is just another name for concatenation. String multiplication is repeated concatenation. So:

>>> c = 'a'
>>> c + 'b'
'ab'
>>> c * 5
'aaaaa'

Containment

There is a simple operator 'in' that returns True if the first operand is contained in the second. This also works on substrings:

>>> x = 'hello'
>>> y = 'ell'
>>> x in y
False
>>> y in x
True

Note that 'print x in y' would have also returned the same value.

Indexing and Slicing

Much like arrays in other languages, the individual characters in a string can be accessed by an integer representing its position in the string. The first character in string s would be s[0] and the nth character would be at s[n-1].

>>> s = "Xanadu"
>>> s[1]
'a'

Unlike arrays in many other languages, Python sequences can also be indexed backwards, using negative numbers. The last character has index -1, the second to last character has index -2, and so on.

>>> s[-4]
'n'

We can also use "slices" to access a substring of s. s[a:b] will give us a string starting with s[a] and ending with s[b-1].

>>> s[1:4]
'ana'

Because strings are immutable, neither an index nor a slice can be assigned to.

>>> print s
Xanadu
>>> s[0] = 'J'
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: object does not support item assignment
>>> s[1:3] = "up"
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: object does not support slice assignment
>>> print s
Xanadu

The string is unchanged after both failed assignments.

Another feature of slices is that if the beginning or end is left empty, it will default to the first or last index, depending on context:

>>> s[2:]
'nadu'
>>> s[:3]
'Xan'
>>> s[:]
'Xanadu'

You can also use negative numbers in slices:

>>> s[-2:]
'du'

To understand slices, it's easiest not to count the elements themselves. It is a bit like counting not on your fingers, but in the spaces between them. A four-element sequence is indexed like this:

Element:     1     2     3     4
Index:    0     1     2     3     4
         -4    -3    -2    -1

So, when we ask for the [1:3] slice, that means we start at index 1, and end at index 3, and take everything in between them. If you are used to indexes in C or Java, this can be a bit disconcerting until you get used to it.
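
The "spaces between elements" view can be checked with a few slices. This is a small sketch using the string from the earlier examples; it runs on Python 2 and Python 3.

```python
s = "Xanadu"

middle = s[1:3]            # 'an': from boundary 1 up to boundary 3
tail = s[-2:]              # 'du': from two before the end
rejoined = s[:4] + s[4:]   # 'Xanadu': a split at any boundary loses nothing
width = len(s[2:5])        # 3: for in-range bounds, the length is 5 - 2
```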

String constants

String constants can be found in the standard string module. An example is string.digits, which equals '0123456789'.

String methods

Strings have a number of built-in methods:

  • capitalize
  • center
  • count
  • decode
  • encode
  • endswith
  • expandtabs
  • find
  • index
  • isalnum
  • isalpha
  • isdigit
  • islower
  • isspace
  • istitle
  • isupper
  • join
  • ljust
  • lower
  • lstrip
  • replace
  • rfind
  • rindex
  • rjust
  • rstrip
  • split
  • splitlines
  • startswith
  • strip
  • swapcase
  • title
  • translate
  • upper
  • zfill

Only some of these methods are covered below.

is*

isalnum(), isalpha(), isdigit(), islower(), isupper(), isspace(), and istitle() fit into this category.

The string being tested must contain at least one character, or the is* methods return False. In other words, every is* method returns False for the empty string.

  • isalnum returns True if the string is entirely composed of alphabetic and/or numeric characters (i.e. no punctuation).
  • isalpha and isdigit work similarly for alphabetic characters or numeric characters only.
  • isspace returns True if the string is composed entirely of whitespace.
  • islower, isupper, and istitle return True if the string is in lowercase, uppercase, or titlecase respectively. Uncased characters are "allowed", such as digits, but there must be at least one cased character in the string object in order to return True. Titlecase means the first cased character of each word is uppercase, and any immediately following cased characters are lowercase. Curiously, 'Y2K'.istitle() returns True. That is because uppercase characters can only follow uncased characters. Likewise, lowercase characters can only follow uppercase or lowercase characters. Hint: whitespace is uncased.

Example:

>>> '2YK'.istitle()
False
>>> 'Y2K'.istitle()
True
>>> '2Y K'.istitle()
True
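
A few more checks, as a sketch of the rules described above (the sample strings are chosen only for illustration):

```python
# Every is* method returns False for the empty string.
empty_results = [
    "".isalnum(), "".isalpha(), "".isdigit(),
    "".islower(), "".isupper(), "".isspace(), "".istitle(),
]

alnum = "abc123".isalnum()          # True: letters and digits only
alnum_punct = "abc-123".isalnum()   # False: '-' is punctuation
space = " \t\n".isspace()           # True: whitespace only
lower_uncased = "abc123".islower()  # True: digits are uncased and allowed
no_cased = "123".islower()          # False: no cased character at all
```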

Title, Upper, Lower, Swapcase, Capitalize

These methods return a copy of the string converted to title case, upper case, or lower case, with its case inverted, or capitalized, respectively.

The title method capitalizes the first letter of each word in the string (and makes the rest lower case). Words are identified as substrings of alphabetic characters that are separated by non-alphabetic characters, such as digits, or whitespace. This can lead to some unexpected behavior. For example, the string "x1x" will be converted to "X1X" instead of "X1x".

The swapcase method makes all uppercase letters lowercase and vice versa.

The capitalize method is like title except that it considers the entire string to be a word (i.e. it makes the first character upper case and the rest lower case).

Example:

s = 'Hello, wOrLD'
print s              # 'Hello, wOrLD'
print s.title()      # 'Hello, World'
print s.swapcase()   # 'hELLO, WoRld'
print s.upper()      # 'HELLO, WORLD'
print s.lower()      # 'hello, world'
print s.capitalize() # 'Hello, world'
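
The word-splitting rule of title is easy to trip over. A short sketch of two surprising cases, next to capitalize for contrast:

```python
digit_split = "x1x".title()                # 'X1X': the digit starts a new word
apostrophe = "they're here".title()        # "They'Re Here": so does the apostrophe
capitalized = "hello, wOrLD".capitalize()  # 'Hello, world'
```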

Keywords: to lower case, to upper case, lcase, ucase, downcase, upcase.

count

Returns the number of non-overlapping occurrences of the specified substring in the string. For example:

>>> s = 'Hello, world'
>>> s.count('o') # print the number of 'o's in 'Hello, world' (2)
2

Hint: .count() is case-sensitive, so this example will only count the number of lowercase letter 'o's. For example, if you ran:

>>> s = 'HELLO, WORLD'
>>> s.count('o') # print the number of lowercase 'o's in 'HELLO, WORLD' (0)
0

strip, rstrip, lstrip

Return a copy of the string with leading (lstrip) or trailing (rstrip) whitespace removed; strip removes both.

>>> s = '\t Hello, world\n\t '
>>> print s
         Hello, world

>>> print s.strip()
Hello, world
>>> print s.lstrip()
Hello, world
        # ends here
>>> print s.rstrip()
         Hello, world

Note the leading and trailing tabs and newlines.

Strip methods can also be used to remove other types of characters.

import string
s = 'www.wikibooks.org'
print s
print s.strip('w')                 # Removes all w's from outside
print s.strip(string.lowercase)    # Removes all lowercase letters from outside
print s.strip(string.printable)    # Removes all printable characters

Outputs:

www.wikibooks.org
.wikibooks.org
.wikibooks.
 

Note that string.lowercase and string.printable require an import string statement. In Python 3, string.lowercase is no longer available; use string.ascii_lowercase instead.
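
The argument to the strip methods is treated as a set of characters, not as a prefix or suffix. A small sketch, using an example URL chosen for illustration and string.ascii_lowercase so that it runs on Python 2 and Python 3:

```python
import string

url = "www.example.com"

stripped = url.strip("cmowz.")                  # 'example'
no_letters = url.strip(string.ascii_lowercase)  # '.example.'
left_only = url.lstrip("w.")                    # 'example.com'
right_only = url.rstrip("moc.")                 # 'www.example'
```

Stripping stops at the first character, from either end, that is not in the given set.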

ljust, rjust, center

Left-justifies, right-justifies, or centers a string in a field of a given width (the rest is padded with spaces).

>>> s = 'foo'
>>> s
'foo'
>>> s.ljust(7)
'foo    '
>>> s.rjust(7)
'    foo'
>>> s.center(7)
'  foo  '
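
These methods also accept an optional fill character, and they never truncate. A brief sketch; zfill from the method list is included for comparison:

```python
s = "foo"

dotted = s.ljust(7, ".")     # 'foo....'
starred = s.center(7, "*")   # '**foo**'
too_narrow = s.rjust(2)      # 'foo': the string is never truncated
zero_padded = "42".zfill(5)  # '00042'
```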

join

Joins together the given sequence with the string as separator:

>>> seq = ['1', '2', '3', '4', '5']
>>> ' '.join(seq)
'1 2 3 4 5'
>>> '+'.join(seq)
'1+2+3+4+5'

map may be helpful here; it converts the numbers in seq into strings:

>>> seq = [1,2,3,4,5]
>>> ' '.join(map(str, seq))
'1 2 3 4 5'

Now arbitrary objects may be in seq instead of just strings.
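
Without such a conversion, join raises a TypeError for non-string items. A minimal sketch of both the failure and two ways around it:

```python
seq = [1, 2, 3, 4, 5]

try:
    " ".join(seq)               # join accepts only strings
    failed = False
except TypeError:
    failed = True

spaced = " ".join(str(n) for n in seq)        # '1 2 3 4 5'
csv_row = ",".join(map(str, [1, 2.5, None]))  # '1,2.5,None'
```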

find, index, rfind, rindex

The find and index methods return the index of the first occurrence of the given substring. If it is not found, find returns -1 but index raises a ValueError. rfind and rindex are the same as find and index except that they search through the string from right to left (i.e. they find the last occurrence).

>>> s = 'Hello, world'
>>> s.find('l')
2
>>> s[s.index('l'):]
'llo, world'
>>> s.rfind('l')
10
>>> s[:s.rindex('l')]
'Hello, wor'
>>> s[s.index('l'):s.rindex('l')]
'llo, wor'

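Because Python strings accept negative subscripts, index is probably the better choice in situations like the one shown: if the substring were missing, find would return -1, which is a valid subscript referring to the last character, so the slice would silently yield an unintended value.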
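
The pitfall can be sketched by searching for a character that does not occur in the string:

```python
s = "Hello, world"

position = s.find("z")       # -1: not found
wrong_slice = s[position:]   # 'd': -1 silently means "the last character"

try:
    s[s.index("z"):]
    raised = False
except ValueError:
    raised = True            # index refuses to return a bogus position
```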

replace

Replace works just like it sounds. It returns a copy of the string with all occurrences of the first parameter replaced with the second parameter.

>>> 'Hello, world'.replace('o', 'X')
'HellX, wXrld'

Or, using variable assignment:

string = 'Hello, world'
newString = string.replace('o', 'X')
print string
print newString

Outputs:

Hello, world
HellX, wXrld

Notice, the original variable (string) remains unchanged after the call to replace.
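
replace also takes an optional third argument limiting the number of replacements. A short sketch (the variable is named text here to avoid shadowing the string module):

```python
text = "Hello, world"

all_replaced = text.replace("o", "X")     # 'HellX, wXrld'
first_only = text.replace("o", "X", 1)    # 'HellX, world'
longer = text.replace("l", "LL")          # 'HeLLLLo, worLLd'
```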

expandtabs

Replaces tabs with the appropriate number of spaces (default number of spaces per tab = 8; this can be changed by passing the tab size as an argument).

s = 'abcdefg\tabc\ta'
print s
print len(s)
t = s.expandtabs()
print t
print len(t)

Outputs:

abcdefg abc     a
13
abcdefg abc     a
17

Notice how (although these both look the same) the second string (t) has a different length because each tab is represented by spaces not tab characters.

To use a tab size of 4 instead of 8:

v = s.expandtabs(4)
print v
print len(v)

Outputs:

abcdefg abc a
13

Please note each tab is not always counted as eight spaces. Rather a tab "pushes" the count to the next multiple of eight. For example:

s = '\t\t'
print s.expandtabs().replace(' ', '*')
print len(s.expandtabs())

Output:

****************
16

Another example:

s = 'abc\tabc\tabc'
print s.expandtabs().replace(' ', '*')
print len(s.expandtabs())

Outputs:

abc*****abc*****abc
19
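
The "push to the next multiple" rule can be sketched as a hand-written function. The name expand_tabs is made up for this illustration; it assumes a positive tab size and is meant only to mirror what the built-in method does, not to replace it.

```python
def expand_tabs(text, tabsize=8):
    """Hypothetical re-implementation of str.expandtabs for illustration."""
    pieces = []
    column = 0
    for char in text:
        if char == "\t":
            # Pad with spaces up to the next multiple of tabsize.
            padding = tabsize - (column % tabsize)
            pieces.append(" " * padding)
            column += padding
        elif char in "\r\n":
            # A line break starts a new line at column 0.
            pieces.append(char)
            column = 0
        else:
            pieces.append(char)
            column += 1
    return "".join(pieces)
```

For the strings used in this section, expand_tabs(s) gives the same result as s.expandtabs().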

split, splitlines

The split method returns a list of the words in the string. It can take a separator argument to use instead of whitespace.

>>> s = 'Hello, world'
>>> s.split()
['Hello,', 'world']
>>> s.split('l')
['He', '', 'o, wor', 'd']

Note that in neither case is the separator included in the split strings, but empty strings are allowed.

The splitlines method breaks a multiline string into many single line strings. It is analogous to split('\n') (but accepts '\r' and '\r\n' as delimiters as well) except that if the string ends in a newline character, splitlines ignores that final character (see example).

>>> s = """
... One line
... Two lines
... Red lines
... Blue lines
... Green lines
... """
>>> s.split('\n')
['', 'One line', 'Two lines', 'Red lines', 'Blue lines', 'Green lines', '']
>>> s.splitlines()
['', 'One line', 'Two lines', 'Red lines', 'Blue lines', 'Green lines']
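
The difference matters most with Windows-style line endings. A small sketch with a made-up three-line string:

```python
text = "one\r\ntwo\nthree\n"

by_newline = text.split("\n")       # ['one\r', 'two', 'three', '']
by_lines = text.splitlines()        # ['one', 'two', 'three']
with_ends = text.splitlines(True)   # ['one\r\n', 'two\n', 'three\n']
```

Passing True to splitlines keeps the line endings in the result.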

Exercises

  1. Write a program that takes a string, (1) capitalizes the first letter, (2) creates a list containing each word, and (3) searches for the last occurrence of "a" in the first word.
  2. Run the program on the string "Bananas are yellow."
  3. Write a program that replaces all instances of "one" with "one (1)". For this exercise capitalization does not matter, so it should treat "one", "One", and "oNE" identically.
  4. Run the program on the string "One banana was brown, but one was green."



Lists


A list in Python is an ordered group of items (or elements). It is a very general structure, and list elements don't have to be of the same type: you can put numbers, letters, strings and nested lists all in the same list.

Overview

Lists in Python at a glance:

list1 = []                      # A new empty list
list2 = [1, 2, 3, "cat"]        # A new non-empty list with mixed item types
list1.append("cat")             # Add a single member, at the end of the list
list1.extend(["dog", "mouse"])  # Add several members
list1.insert(0, "fly")          # Insert at the beginning
list1[0:0] = ["cow", "doe"]     # Add members at the beginning
doe = list1.pop(1)              # Remove item at index
if "cat" in list1:              # Membership test
  list1.remove("cat")           # Remove AKA delete
#list1.remove("elephant") - throws an error
for item in list1:              # Iteration AKA for each item
  print item
print "Item count:", len(list1) # Length AKA size AKA item count
list3 = [6, 7, 8, 9]
for i in range(0, len(list3)):  # Read-write iteration AKA for each item
  list3[i] += 1                 # Item access AKA element access by index
last = list3[-1]                # Last item
nextToLast = list3[-2]          # Next-to-last item
isempty = len(list3) == 0       # Test for emptiness
set1 = set(["cat", "dog"])      # Initialize set from a list
list4 = list(set1)              # Get a list from a set
list5 = list4[:]                # A shallow list copy
list4equal5 = list4==list5      # True: same by value
list4refEqual5 = list4 is list5 # False: not same by reference
list6 = list4[:]
del list6[:]                    # Clear AKA empty AKA erase
list7 = [1, 2] + [2, 3, 4]      # Concatenation
print list1, list2, list3, list4, list5, list6, list7
print list4equal5, list4refEqual5
print list3[1:3], list3[1:], list3[:2] # Slices
print max(list3 ), min(list3 ), sum(list3) # Aggregates

print [x for x in range(10)]    # List comprehension
print [x for x in range(10) if x % 2 == 1]
print [x for x in range(10) if x % 2 == 1 if x < 5]
print [x + 1 for x in range(10) if x % 2 == 1]
print [x + y for x in '123' for y in 'abc']

List creation

There are two different ways to make a list in Python. The first is through assignment ("statically"), the second is using list comprehensions ("actively").

Plain creation

To make a static list of items, write them between square brackets. For example:

[ 1,2,3,"This is a list",'c',Donkey("kong") ]

Observations:

  1. The list contains items of different data types: integers, strings, and an instance of a Donkey class (assumed to be defined elsewhere).
  2. Objects can be created 'on the fly' and added to lists. The last item is a new instance of Donkey class.

Creation of a new list whose members are constructed from non-literal expressions:

a = 2
b = 3
myList = [a+b, b+a, len(["a","b"])]

List comprehensions

With a list comprehension, you describe the process by which the list should be created. To do that, the comprehension is written in two parts. The first is an expression for what each element will look like, and the second is what you iterate over to get it.

For instance, let's say we have a list of words:

listOfWords = ["this","is","a","list","of","words"]

To take the first letter of each word and make a list out of it using list comprehension, we can do this:

>>> listOfWords = ["this","is","a","list","of","words"]
>>> items = [ word[0] for word in listOfWords ]
>>> print items
['t', 'i', 'a', 'l', 'o', 'w']

A list comprehension supports more than one for clause. The clauses are nested from left to right, so the last for clause varies fastest and every combination of items is produced, just as with nested for loops:

>>> item = [x+y for x in 'cat' for y in 'pot']
>>> print item
['cp', 'co', 'ct', 'ap', 'ao', 'at', 'tp', 'to', 'tt']

List comprehension supports an if statement, to only include members into the list that fulfill a certain condition:

>>> print [x+y for x in 'cat' for y in 'pot']
['cp', 'co', 'ct', 'ap', 'ao', 'at', 'tp', 'to', 'tt']
>>> print [x+y for x in 'cat' for y in 'pot' if x != 't' and y != 'o' ]
['cp', 'ct', 'ap', 'at']
>>> print [x+y for x in 'cat' for y in 'pot' if x != 't' or y != 'o' ]
['cp', 'co', 'ct', 'ap', 'ao', 'at', 'tp', 'tt']

In version 2.x, Python's list comprehension does not define a scope. Any variables that are bound in an evaluation remain bound to whatever they were last bound to when the evaluation was completed. In version 3.x, Python's list comprehension uses local variables:

>>> print x, y                         #Input to python version 2
t t                                    #Output using python 2

>>> print(x, y)                        #Input to python version 3
NameError: name 'x' is not defined     #Python 3 returns an error because x and y were not leaked

The Python 2 behavior is exactly the same as if the comprehension had been expanded into an explicitly nested group of one or more 'for' statements and 0 or more 'if' statements.
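
As a sketch of that expansion, the filtered comprehension from the example above can be written out as nested statements:

```python
# The comprehension ...
pairs = [x + y for x in "cat" for y in "pot" if x != "t" and y != "o"]

# ... behaves like these explicitly nested statements.
expanded = []
for x in "cat":
    for y in "pot":
        if x != "t" and y != "o":
            expanded.append(x + y)
```

Both produce the same list; the comprehension is simply the more compact form.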

List creation shortcuts

You can initialize a list to a size, with an initial value for each element:

>>> zeros=[0]*5
>>> print zeros
[0, 0, 0, 0, 0]

This works for any data type:

>>> foos=['foo']*3
>>> print foos
['foo', 'foo', 'foo']

But there is a caveat. When building a new list by multiplying, Python copies each item by reference. This poses a problem for mutable items, for instance in a multidimensional array where each element is itself a list. You'd guess that the easy way to generate a two dimensional array would be:

listoflists=[ [0]*4 ] *5

and this works, but probably doesn't do what you expect:

>>> listoflists=[ [0]*4 ] *5
>>> print listoflists
[[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
>>> listoflists[0][2]=1
>>> print listoflists
[[0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0]]

What's happening here is that Python is using the same reference to the inner list as the elements of the outer list. Another way of looking at this issue is to examine how Python sees the above definition:

>>> innerlist=[0]*4
>>> listoflists=[innerlist]*5
>>> print listoflists
[[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
>>> innerlist[2]=1
>>> print listoflists
[[0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0]]

Assuming the above effect is not what you intend, one way around this issue is to use list comprehensions:

>>> listoflists=[[0]*4 for i in range(5)]
>>> print listoflists
[[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
>>> listoflists[0][2]=1
>>> print listoflists
[[0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
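
The sharing can be made visible with the is operator. A minimal sketch comparing the two ways of building the nested list (the variable names are chosen for this example):

```python
shared = [[0] * 4] * 5                      # five references to ONE inner list
independent = [[0] * 4 for _ in range(5)]   # five separate inner lists

shared_same = all(row is shared[0] for row in shared)   # True
independent_same = independent[0] is independent[1]     # False

shared[0][2] = 1        # shows up in every row
independent[0][2] = 1   # shows up in the first row only
```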

List size

To find the length of a list use the built-in len() function.

>>> len([1,2,3])
3
>>> a = [1,2,3,4]
>>> len( a )
4

Combining lists

Lists can be combined in several ways. The easiest is just to 'add' them. For instance:

>>> [1,2] + [3,4]
[1, 2, 3, 4]

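Another way to combine lists is with extend, which modifies the list in place instead of creating a new one. Because extend is an ordinary method call, it can also be used inside a lambda, where statements such as += are not allowed.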

>>> a = [1,2,3]
>>> b = [4,5,6]
>>> a.extend(b)
>>> print a
[1, 2, 3, 4, 5, 6]

To add a single value to the end of a list, use append. For example:

>>> p=[1,2]
>>> p.append([3,4])
>>> p
[1, 2, [3, 4]]
>>> # or
>>> print p
[1, 2, [3, 4]]

However, [3,4] has become a single element of the list; its items were not merged into the list. append always adds exactly one element to the end of a list. So if the intention is to concatenate two lists, use extend.
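
Whether a combination happens in place or creates a new list matters as soon as a second name refers to the same list. A small sketch with made-up variable names:

```python
a = [1, 2]
alias = a                  # a second name for the same list object

a += [3]                   # in place, like extend: alias sees the change
after_iadd = list(alias)   # [1, 2, 3]

a = a + [4]                # builds a new list and rebinds only the name a
after_add = list(alias)    # still [1, 2, 3]
```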

Getting pieces of lists (slices)

Continuous slices

Like strings, lists can be indexed and sliced:

>>> list = [2, 4, "usurp", 9.0, "n"]
>>> list[2]
'usurp'
>>> list[3:]
[9.0, 'n']

Much like the slice of a string is a substring, the slice of a list is a list. However, lists differ from strings in that we can assign new values to the items in a list:

>>> list[1] = 17
>>> list
[2, 17, 'usurp', 9.0, 'n']

We can assign new values to slices of the lists, which don't even have to be the same length:

>>> list[1:4] = ["opportunistic", "elk"]
>>> list
[2, 'opportunistic', 'elk', 'n']

It's even possible to prepend items to the start of a list by assigning to an empty slice:

>>> list[:0] = [3.14, 2.71]
>>> list
[3.14, 2.71, 2, 'opportunistic', 'elk', 'n']

Similarly, you can append to the end of the list by specifying an empty slice after the end:

>>> list[len(list):] = ['four', 'score']
>>> list
[3.14, 2.71, 2, 'opportunistic', 'elk', 'n', 'four', 'score']

You can also completely change the contents of a list:

>>> list[:] = ['new', 'list', 'contents']
>>> list
['new', 'list', 'contents']

The right-hand side of a list assignment statement can be any iterable type:

>>> list[:2] = ('element',('t',),[])
>>> list
['element', ('t',), [], 'contents']

With slicing you can create a copy of a list, since a slice returns a new list:

>>> original = [1, 'element', []]
>>> list_copy = original[:]
>>> list_copy
[1, 'element', []]
>>> list_copy.append('new element')
>>> list_copy
[1, 'element', [], 'new element']
>>> original
[1, 'element', []]

Note, however, that this is a shallow copy and contains references to elements from the original list, so be careful with mutable types:

>>> list_copy[2].append('something')
>>> original
[1, 'element', ['something']]

Non-Continuous slices

It is also possible to get non-continuous parts of a list. To get every n-th element of a list, use the extended slice syntax a:b:n, where a and b are the start and end of the slice to be operated upon and n is the step.

>>> list = [i for i in range(10) ]
>>> list
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list[::2]
[0, 2, 4, 6, 8]
>>> list[1:7:2]
[1, 3, 5]
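
The step may also be negative, in which case the slice walks from right to left. A short sketch:

```python
numbers = list(range(10))

evens = numbers[::2]          # [0, 2, 4, 6, 8]
odds = numbers[1::2]          # [1, 3, 5, 7, 9]
backwards = numbers[::-1]     # the whole list in reverse order
countdown = numbers[7:2:-2]   # [7, 5, 3]: start at 7, stop before 2
```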

Comparing lists

Lists can be compared for equality.

>>> [1,2] == [1,2]
True
>>> [1,2] == [3,4]
False

Lists can be compared using a less-than operator, which uses lexicographical order:

>>> [1,2] < [2,1]
True
>>> [2,2] < [2,1]
False
>>> ["a","b"] < ["b","a"]
True
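
Lexicographical order means the first differing position decides, and a list that is a proper prefix of another sorts first. A few more cases as a sketch:

```python
first_difference = [1, 2, 99] < [1, 3]   # True: decided at index 1 (2 < 3)
prefix_is_smaller = [1, 2] < [1, 2, 3]   # True: a proper prefix sorts first
equal_lists = [1, 2] <= [1, 2]           # True
nested = [[1, 2], [3]] < [[1, 2], [4]]   # True: elements are compared recursively
```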

Sorting lists

Sorting at a glance:

list1 = [2, 3, 1, 'a', 'B']
list1.sort()                                   # list1 gets modified, case sensitive
list2 = sorted(list1)                          # list1 is unmodified; since Python 2.4
list3 = sorted(list1, key=lambda x: str(x).lower()) # case insensitive; str() lets the key handle the numbers too
list4 = sorted(list1, reverse=True)            # Reverse sorting order: descending
print list1, list2, list3, list4

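Sorting lists is easy with a sort method. (Sorting a list that mixes numbers and strings, as in these examples, works only in Python 2; Python 3 raises a TypeError.)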

>>> list1 = [2, 3, 1, 'a', 'b']
>>> list1.sort()
>>> list1
[1, 2, 3, 'a', 'b']

Note that the list is sorted in place, and the sort() method returns None to emphasize this side effect.

If you use Python 2.4 or higher there are some more sort parameters (cmp exists only in Python 2; it was removed in Python 3):

  • sort(cmp,key,reverse)
    • cmp : comparison function of two arguments to be used for sorting
    • key : function called on each element; the list is sorted by the function's return values
    • reverse : sort(reverse=True) or sort(reverse=False)

Python also includes a sorted() function.

>>> list1 = [5, 2, 3, 'q', 'p']
>>> sorted(list1)
[2, 3, 5, 'p', 'q']
>>> list1
[5, 2, 3, 'q', 'p']

Note that unlike the sort() method, sorted(list) does not sort the list in place, but instead returns a new sorted list. The sorted() function, like the sort() method, also accepts the reverse parameter.
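
The following sketch shows the key and reverse parameters under Python 3, where the list has to contain mutually comparable items. The word list and the function compare_length are invented for this example; functools.cmp_to_key is the standard-library adapter for old-style comparison functions.

```python
from functools import cmp_to_key

words = ["pear", "apple", "Fig", "Banana"]

case_sensitive = sorted(words)                    # uppercase sorts before lowercase
case_insensitive = sorted(words, key=str.lower)
by_length_desc = sorted(words, key=len, reverse=True)


def compare_length(a, b):
    """Old-style comparison: negative, zero, or positive."""
    return len(a) - len(b)


# Python 3 has no cmp parameter; cmp_to_key adapts such a function.
by_length = sorted(words, key=cmp_to_key(compare_length))

in_place_result = words.sort()   # sorts words itself and returns None
```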

Iteration

Iteration over lists:

Read-only iteration over a list, AKA for each element of the list:

list1 = [1, 2, 3, 4]
for item in list1:
  print item

Writable iteration over a list:

list1 = [1, 2, 3, 4]
for i in range(0, len(list1)):
  list1[i]+=1 # Modify the item at an index as you see fit
print list1

From a number to a number with a step:

for i in range(1, 13+1, 3): # For i=1 to 13 step 3
  print i
for i in range(10, 5-1, -1): # For i=10 to 5 step -1
  print i

For each element of a list satisfying a condition (filtering):

for item in list1:
  if not condition(item): # condition stands for any test function of your own
    continue
  print item

See also Python Programming/Loops#For_Loops.

Removing

Removing aka deleting an item at an index (see also #pop(i)):

list1 = [1, 2, 3, 4]
list1.pop() # Remove the last item
list1.pop(0) # Remove the first item , which is the item at index 0
print list1

list1 = [1, 2, 3, 4]
del list1[1] # Remove the 2nd element; an alternative to list1.pop(1)
print list1

Removing an element by value:

list1 = ["a", "a", "b"]
list1.remove("a") # Removes only the 1st occurrence of "a"
print list1

Keeping only items in a list satisfying a condition, and thus removing the items that do not satisfy it:

list1 = [1, 2, 3, 4]
newlist = [item for item in list1 if item > 2]
print newlist

This uses a list comprehension.

Removing items failing a condition can be done without losing the identity of the list being made shorter, by using "[:]":

list1 = [1, 2, 3, 4]
sameList = list1
list1[:] = [item for item in list1 if item > 2]
print sameList, sameList is list1

Removing items failing a condition can be done by having the condition in a separate function:

list1 = [1, 2, 3, 4]
def keepingCondition(item):
  return item > 2
sameList = list1
list1[:] = [item for item in list1 if keepingCondition(item)]
print sameList, sameList is list1

Removing items while iterating a list usually leads to unintended outcomes unless you do it carefully by using an index:

list1 = [1, 2, 3, 4]
index = len(list1)
while index > 0:
  index -= 1
  if not list1[index] < 2:
    list1.pop(index)
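
To see what "unintended outcomes" means in practice, here is a small sketch of the naive approach next to one that iterates over a copy. The list contents are chosen so that the bug shows.

```python
items = [1, 2, 2, 3]
for item in items:          # BUGGY: the list shrinks under the loop
    if item == 2:
        items.remove(item)
buggy_result = items        # [1, 2, 3]: one 2 was skipped

items = [1, 2, 2, 3]
for item in items[:]:       # iterate over a copy, modify the original
    if item == 2:
        items.remove(item)
copy_result = items         # [1, 3]
```

In the buggy loop, removing an item shifts the remaining items left, so the loop steps over the element that moved into the current position.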

Aggregates

There are some built-in functions for arithmetic aggregates over lists. These include minimum, maximum, and sum:

list = [1, 2, 3, 4]
print max(list), min(list), sum(list)
average = sum(list) / float(len(list)) # Provided the list is non-empty
# The float above ensures the division is a float one rather than integer one.
print average

The max and min functions also apply to lists of strings, returning maximum and minimum with respect to alphabetical order:

list = ["aa", "ab"]
print max(list), min(list) # Prints "ab aa"

Copying

Copying AKA cloning of lists:

Making a shallow copy:

list1= [1, 'element']
list2 = list1[:] # Copy using "[:]"
list2[0] = 2 # Only affects list2, not list1
print list1[0] # Displays 1

# By contrast
list1 = [1, 'element']
list2 = list1
list2[0] = 2 # Modifies the original list
print list1[0] # Displays 2

The above does not make a deep copy, which has the following consequence:

list1 = [1, [2, 3]] # Notice the second item being a nested list
list2 = list1[:] # A shallow copy
list2[1][0] = 4 # Modifies the 2nd item of list1 as well
print list1[1][0] # Displays 4 rather than 2

Making a deep copy:

import copy
list1 = [1, [2, 3]] # Notice the second item being a nested list
list2 = copy.deepcopy(list1) # A deep copy
list2[1][0] = 4 # Leaves the 2nd item of list1 unmodified
print list1[1][0] # Displays 2

See also #Continuous slices.

Clearing

Clearing a list:

del list1[:] # Clear a list
list1 = []   # Not really clear but rather assign to a new empty list

Clearing using a proper approach makes a difference when the list is passed as an argument:

def workingClear(ilist):
  del ilist[:]
def brokenClear(ilist):
  ilist = [] # Lets ilist point to a new list, losing the reference to the argument list
list1=[1, 2]; workingClear(list1); print list1
list1=[1, 2]; brokenClear(list1); print list1
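
The same comparison as a sketch for Python 3, with the results kept in variables instead of printed. The function names are adapted from the example above; list.clear() is available since Python 3.3.

```python
def working_clear(ilist):
    del ilist[:]        # empties the caller's list object


def broken_clear(ilist):
    ilist = []          # rebinds the local name only


first = [1, 2]
working_clear(first)    # first is now []

second = [1, 2]
broken_clear(second)    # second is still [1, 2]

third = [1, 2]
third.clear()           # list.clear() exists since Python 3.3
```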

Keywords: emptying a list, erasing a list, clear a list, empty a list, erase a list.

Removing duplicate items

Removing duplicate items from a list (keeping only unique items) can be achieved as follows.

If each item in the list is hashable, using a list comprehension, which is fast:

list1 = [1, 4, 4, 5, 3, 2, 3, 2, 1]
seen = {}
list1[:] = [seen.setdefault(e, e) for e in list1 if e not in seen]

If each item in the list is hashable, using index iteration, which is much slower:

list1 = [1, 4, 4, 5, 3, 2, 3, 2, 1]
seen = set()
for i in range(len(list1) - 1, -1, -1):
  if list1[i] in seen:
    list1.pop(i)        # Already seen further right: remove this duplicate
  else:
    seen.add(list1[i])  # First time seen: remember it

If some items are not hashable, the set of visited items can be kept in a list:

list1 = [1, 4, 4, ["a", "b"], 5, ["a", "b"], 3, 2, 3, 2, 1]
seen = []
for i in range(len(list1) - 1, -1, -1):
  if list1[i] in seen:
    list1.pop(i)           # Already seen further right: remove this duplicate
  else:
    seen.append(list1[i])  # First time seen: remember it

If each item in the list is hashable and preserving element order does not matter:

list1 = [1, 4, 4, 5, 3, 2, 3, 2, 1]
list1[:] = list(set(list1))  # Modify list1
list2 = list(set(list1))

In the above examples where index iteration is used, scanning happens from the end to the beginning, so that removing an item does not shift the indices that are still to be visited. Informal timing suggests that versions rewritten to scan from the beginning to the end are much slower.
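
For hashable items, a single forward pass that builds a new list can also preserve the order of first occurrences. The following is a minimal sketch; the function name unique_in_order is hypothetical and not part of the standard library, and the code is written so that it runs under both Python 2 and Python 3:

```python
def unique_in_order(items):
    """Return a new list with duplicates removed, keeping first occurrences."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:   # Requires hashable items
            seen.add(item)
            result.append(item)
    return result

list1 = [1, 4, 4, 5, 3, 2, 3, 2, 1]
list1[:] = unique_in_order(list1)  # Modify list1 in place
print(list1)                       # [1, 4, 5, 3, 2]
```

Unlike the index-based loops, this keeps the first occurrence of each item rather than the last.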

List methods

append(x)

Add item x onto the end of the list.

>>> list = [1, 2, 3]
>>> list.append(4)
>>> list
[1, 2, 3, 4]

See pop(i)

pop(i)

Remove the item in the list at the index i and return it. If i is not given, remove the last item in the list and return it.

>>> list = [1, 2, 3, 4]
>>> a = list.pop(0)
>>> list
[2, 3, 4]
>>> a
1
>>> b = list.pop()
>>> list
[2, 3]
>>> b
4

Operators

in

The operator 'in' is used for two purposes: to iterate over every item in a list in a for loop, or to check whether a value is in a list, returning True or False.

>>> list = [1, 2, 3, 4]
>>> if 3 in list:
...     print "3 is in the list"
...
3 is in the list
>>> l = [0, 1, 2, 3, 4]
>>> 3 in l
True
>>> 18 in l
False
>>> for x in l:
...     print x
...
0
1
2
3
4

Subclassing

Since Python 2.2, the built-in list is a class that can be subclassed. You can write your own subclass of it to define list behaviour that differs from the default.
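
As an illustration, here is a minimal sketch of a list subclass that counts how often append is called. The class name CountingList and the attribute append_count are hypothetical names chosen for this example; the code is written so that it runs under both Python 2 and Python 3:

```python
class CountingList(list):
    """A list that counts calls to append (illustrative only)."""
    def __init__(self, *args):
        list.__init__(self, *args)
        self.append_count = 0

    def append(self, item):
        self.append_count += 1      # Extra behaviour added by the subclass
        list.append(self, item)     # Delegate the actual work to list

cl = CountingList([1, 2])
cl.append(3)
print(cl)               # [1, 2, 3]
print(cl.append_count)  # 1
print(len(cl))          # 3
```

Everything that is not overridden, such as indexing, slicing and len(), is inherited unchanged from list.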

Exercises

  1. Use a list comprehension to construct the list ['ab', 'ac', 'ad', 'bb', 'bc', 'bd'].
  2. Use a slice on the above list to construct the list ['ab', 'ad', 'bc'].
  3. Use a list comprehension to construct the list ['1a', '2a', '3a', '4a'].
  4. Simultaneously remove the element '2a' from the above list and print it.
  5. Copy the above list and add '2a' back into the list such that the original is still missing it.
  6. Use a list comprehension to construct the list ['abe', 'abf', 'ace', 'acf', 'ade', 'adf', 'bbe', 'bbf', 'bce', 'bcf', 'bde', 'bdf']



Tuples


A tuple in Python is much like a list except that it is immutable (unchangeable) once created. A tuple of hashable objects is hashable and thus suitable as a key in a dictionary and as a member of a set.

Overview

Tuples in Python at a glance:

tup1  = (1, 'a')
tup2  = 1, 'a'                # Parentheses not needed
tup3  = (1,)                  # Singleton
tup4  = 1,                    # Singleton without parentheses
tup5 = ()                     # Empty tuple
list1 = [1, 'a']
it1, it2 = tup1               # Assign items
print tup1 == tup2            # True
print tup1 == list1           # False
print tup1 == tuple(list1)    # True
print list(tup1) == list1     # True
print tup1[0]                 # First member
for item in tup1: print item  # Iteration
print (1, 2) + (3, 4)         # (1, 2, 3, 4)
tup1 += (3,)                  # Creates a new tuple and rebinds the name tup1
print tup1                    # (1, 'a', 3); the original tuple was not modified
print len(tup1)               # Length AKA size AKA item count
print 3 in tup1               # Membership - true
def myfun(): return 1, 2      # Return multiple values
r1, r2 = myfun()              # Receive multiple values
tup6 = ([1,2],)
tup6[0][0]=3
print tup6                    # The list within is mutable
set1 = set( [(1,2)] )         # A tuple can be placed into a set
#set1 = set( [([1,2], 2)] )   # Error: The list within makes it unhashable

Tuple notation

Tuples may be created directly or converted from lists. Generally, tuples are enclosed in parentheses.

>>> l = [1, 'a', [6, 3.14]]
>>> t = (1, 'a', [6, 3.14])
>>> t
(1, 'a', [6, 3.14])
>>> tuple(l)
(1, 'a', [6, 3.14])
>>> t == tuple(l)
True
>>> t == l
False

A one-item tuple is created by an item in parentheses followed by a comma:

>>> t = ('A single item tuple',)
>>> t
('A single item tuple',)

Also, tuples will be created from items separated by commas.

>>> t = 'A', 'tuple', 'needs', 'no', 'parens'
>>> t
('A', 'tuple', 'needs', 'no', 'parens')

Packing and Unpacking

You can also perform multiple assignment using tuples.

>>> article, noun, verb, adjective, direct_object = t   #t is defined above
>>> noun
'tuple'

Note that either side, or both sides, of an assignment operator can consist of tuples.

>>> a, b = 1, 2
>>> b
2

The example above: article, noun, verb, adjective, direct_object = t is called "tuple unpacking" because the tuple t was unpacked and its values assigned to each of the variables on the left. "Tuple packing" is the reverse: t=article, noun, verb, adjective, direct_object. When unpacking a tuple, or performing multiple assignment, you must have the same number of variables being assigned to as values being assigned.
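
The requirement that both sides have the same number of items can be checked directly. The sketch below, which runs under both Python 2 and Python 3, shows the ValueError raised on a mismatch, and uses packing and unpacking together to swap two variables; the variable name unpack_failed is introduced only for this example:

```python
t = ('A', 'tuple', 'needs', 'no', 'parens')

try:
    first, second = t          # Five values but only two names
except ValueError:
    unpack_failed = True       # Too many values to unpack
else:
    unpack_failed = False
print(unpack_failed)           # True

a, b = 1, 2
a, b = b, a                    # Pack (b, a), then unpack: swaps the values
print((a, b))                  # (2, 1)
```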

Operations on tuples

These are the same as for lists except that we may not assign to indices or slices, and there is no "append" method.

>>> a = (1, 2)
>>> b = (3, 4)
>>> a + b
(1, 2, 3, 4)
>>> a
(1, 2)
>>> b
(3, 4)
>>> a.append(3)
Traceback (most recent call last):
File "<stdin>", line 1, in ?
AttributeError: 'tuple' object has no attribute 'append'
>>> a
(1, 2)
>>> a[0] = 0
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: object does not support item assignment
>>> a
(1, 2)

For lists we would have had:

>>> a = [1, 2]
>>> b = [3, 4]
>>> a + b
[1, 2, 3, 4]
>>> a
[1, 2]
>>> b
[3, 4]
>>> a.append(3)
>>> a
[1, 2, 3]
>>> a[0] = 0
>>> a
[0, 2, 3]

Tuple Attributes

Length: Finding the length of a tuple is the same as with lists; use the built-in len() function.

>>> len( ( 1, 2, 3) )
3
>>> a = ( 1, 2, 3, 4 )
>>> len( a )
4

Conversions

Convert a list to a tuple using the built-in tuple() function.

>>> l = [4, 5, 6]
>>> tuple(l)
(4, 5, 6)

Convert a tuple to a list using the built-in list() function:

>>> t = (4, 5, 6)
>>> list(t)
[4, 5, 6]

Dictionaries can also be converted to tuples of tuples using the items method of dictionaries:

>>> d = {'a': 1, 'b': 2}
>>> tuple(d.items())
(('a', 1), ('b', 2))

Uses of Tuples

Tuples can be used in place of lists where the number of items is known and small, for example when returning multiple values from a function. Many other languages require creating an object or container to return, but with Python's tuple assignment, multiple-value returns are easy:

def func(x, y):
    # code to compute x and y
    return x, y

This resulting tuple can be easily unpacked with the tuple assignment technique explained above:

x, y = func(1, 2)

Using List Comprehension to process Tuple elements

Occasionally, there is a need to manipulate the values contained within a tuple in order to create a new tuple. For example, if we wanted a way to double all of the values within a tuple, we can combine some of the above information in addition to list comprehension like this:

def double(T):
    'double() - return a tuple with each tuple element (e) doubled.'
    return tuple( [ e * 2 for e in T ] )

Exercises

  1. Create the list ['a', 'b', 'c'], then create a tuple from that list.
  2. Create the tuple ('a', 'b', 'c'), then create a list from that tuple. (Hint: the material needed to do this has been covered, but it's not entirely obvious)
  3. Make the following instantiations simultaneously: a = 'a', b=2, c='gamma'. (That is, in one line of code).
  4. Create a tuple containing just a single element which in turn contains the three elements 'a', 'b', and 'c'. Verify that the length is actually 1 by using the len() function.



Dictionaries


A dictionary in Python is a collection of values accessed by key rather than by index. In the Python 2 versions described here the items are unordered; since Python 3.7, dictionaries preserve insertion order. The keys have to be hashable: integers, floating point numbers, strings, frozensets, and tuples of hashable items are hashable, while lists, dictionaries, and sets other than frozensets are not. Dictionaries were available as early as in Python 1.4.
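
The hashability requirement can be observed directly. In the following sketch, which runs under both Python 2 and Python 3, a tuple and a frozenset are accepted as keys while a list is rejected with a TypeError; the variable name list_key_rejected is introduced only for this example:

```python
d = {}
d[(1, 2)] = "tuple key"                 # A tuple of hashable items is hashable
d[frozenset([1, 2])] = "frozenset key"  # So is a frozenset

try:
    d[[1, 2]] = "list key"              # A list is not hashable
except TypeError:
    list_key_rejected = True
else:
    list_key_rejected = False

print(list_key_rejected)                # True
print(len(d))                           # 2
```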

Overview

Dictionaries in Python at a glance:

dict1 = {}                     # Create an empty dictionary
dict2 = dict()                 # Another way to create an empty dictionary
dict2 = {"r": 34, "i": 56}     # Initialize to non-empty value
dict3 = dict([("r", 34), ("i", 56)])      # Initialize from a list of tuples
dict4 = dict(r=34, i=56)       # Initialize from keyword arguments
dict1["temperature"] = 32      # Assign value to a key
if "temperature" in dict1:     # Membership test of a key AKA key exists
  del dict1["temperature"]     # Delete AKA remove
equalbyvalue = dict2 == dict3
itemcount2 = len(dict2)        # Length AKA size AKA item count
isempty2 = len(dict2) == 0     # Emptiness test
for key in dict2:              # Iterate via keys
  print key, dict2[key]        # Print key and the associated value
  dict2[key] += 10             # Modify-access to the key-value pair
for key in sorted(dict2):      # Iterate via keys in sorted order of the keys
  print key, dict2[key]        # Print key and the associated value
for value in dict2.values():   # Iterate via values
  print value
dict5 = {x: dict2[x] + 1 for x in dict2}  # Dictionary comprehension; since Python 2.7
dict6 = dict2.copy()             # A shallow copy
dict6.update({"i": 60, "j": 30}) # Add or overwrite; a bit like list's extend
dict7 = dict2.copy()
dict7.clear()                  # Clear AKA empty AKA erase
sixty = dict6.pop("i")         # Remove key i, returning its value
print dict1, dict2, dict3, dict4, dict5, dict6, dict7, equalbyvalue, itemcount2, sixty

Dictionary notation

Dictionaries may be created directly or converted from sequences of key-value pairs. Dictionaries are enclosed in curly braces, {}.

>>> d = {'city':'Paris', 'age':38, (102,1650,1601):'A matrix coordinate'}
>>> seq = [('city','Paris'), ('age', 38), ((102,1650,1601),'A matrix coordinate')]
>>> d
{'city': 'Paris', 'age': 38, (102, 1650, 1601): 'A matrix coordinate'}
>>> dict(seq)
{'city': 'Paris', 'age': 38, (102, 1650, 1601): 'A matrix coordinate'}
>>> d == dict(seq)
True

Also, dictionaries can be easily created by zipping two sequences.

>>> seq1 = ('a','b','c','d')
>>> seq2 = [1,2,3,4]
>>> d = dict(zip(seq1,seq2))
>>> d
{'a': 1, 'c': 3, 'b': 2, 'd': 4}

Operations on Dictionaries

The operations on dictionaries differ from those on sequences. Items are accessed by key, and slicing is not supported, since the items have no intrinsic order.

>>> d = {'a':1,'b':2, 'cat':'Fluffers'}
>>> d.keys()
['a', 'b', 'cat']
>>> d.values()
[1, 2, 'Fluffers']
>>> d['a']
1
>>> d['cat'] = 'Mr. Whiskers'
>>> d['cat']
'Mr. Whiskers'
>>> 'cat' in d
True
>>> 'dog' in d
False

Combining two Dictionaries

You can combine two dictionaries by using the update method of the primary dictionary. Note that where both dictionaries contain the same key, update overwrites the existing value with the new one.

>>> d = {'apples': 1, 'oranges': 3, 'pears': 2}
>>> ud = {'pears': 4, 'grapes': 5, 'lemons': 6}
>>> d.update(ud)
>>> d
{'grapes': 5, 'pears': 4, 'lemons': 6, 'apples': 1, 'oranges': 3}

Deleting from dictionary

del dictionaryName[membername]
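
A short sketch of the del statement in use, written so that it runs under both Python 2 and Python 3. It also shows that deleting a missing key raises KeyError, and that the pop method with a default value can be used when the key may be absent; the variable names are chosen only for this example:

```python
d = {'apples': 1, 'oranges': 3}

del d['apples']                  # Remove the key 'apples' and its value
print('apples' in d)             # False

try:
    del d['grapes']              # Deleting a missing key raises KeyError
except KeyError:
    missing_key_error = True
else:
    missing_key_error = False
print(missing_key_error)         # True

removed = d.pop('grapes', None)  # pop with a default does not raise
print(removed)                   # None
```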

Exercises

Write a program that:

  1. Asks the user for a string, then creates the following dictionary. The values are the letters in the string, with the corresponding key being the place in the string. (See https://docs.python.org/2/tutorial/datastructures.html#looping-techniques)
  2. Replaces the entry whose key is the integer 3, with the value "Pie".
  3. Asks the user for a string of digits, then prints out the values corresponding to those digits.



Sets


Starting with version 2.3, Python comes with an implementation of the mathematical set. Initially this implementation had to be imported from the standard module sets, but with Python 2.4 the types set and frozenset became built-in types. A set is an unordered collection of objects, unlike sequence objects such as lists and tuples, in which each element is indexed. Sets cannot have duplicate members: a given object appears in a set 0 or 1 times. All members of a set have to be hashable, just like dictionary keys. Integers, floating point numbers, strings, and tuples of hashable items are hashable; dictionaries, lists, and other sets (except frozensets) are not.

Overview

Sets in Python at a glance:

set1 = set()                   # A new empty set
set1.add("cat")                # Add a single member
set1.update(["dog", "mouse"])  # Add several members, like list's extend
set1 |= set(["doe", "horse"])  # Add several members 2, like list's extend
if "cat" in set1:              # Membership test
  set1.remove("cat")
#set1.remove("elephant")       # Error: raises KeyError
set1.discard("elephant")       # No error thrown
print set1
for item in set1:              # Iteration AKA for each element
  print item
print "Item count:", len(set1) # Length AKA size AKA item count
#1stitem = set1[0]             # Error: no indexing for sets
isempty = len(set1) == 0       # Test for emptiness
set1 = {"cat", "dog"}          # Initialize set using braces; since Python 2.7
#set1 = {}                     # Not a set: {} creates an empty dict
set1 = set(["cat", "dog"])     # Initialize set from a list
set2 = set(["dog", "mouse"])
set3 = set1 & set2             # Intersection
set4 = set1 | set2             # Union
set5 = set1 - set3             # Set difference
set6 = set1 ^ set2             # Symmetric difference
issubset = set1 <= set2        # Subset test
issuperset = set1 >= set2      # Superset test
set7 = set1.copy()             # A shallow copy
set7.remove("cat")
print set7.pop()               # Remove an arbitrary element
set8 = set1.copy()
set8.clear()                   # Clear AKA empty AKA erase
set9 = {x for x in range(10) if x % 2} # Set comprehension; since Python 2.7
print set1, set2, set3, set4, set5, set6, set7, set8, set9, issubset, issuperset

Constructing Sets

One way to construct sets is by passing any iterable object, such as a list or a string, to the "set" constructor.

>>> set([0, 1, 2, 3])
set([0, 1, 2, 3])
>>> set("obtuse")
set(['b', 'e', 'o', 's', 'u', 't'])

We can also add elements to sets one by one, using the "add" method.

>>> s = set([12, 26, 54])
>>> s.add(32)
>>> s
set([32, 26, 12, 54])

Note that since a set does not contain duplicate elements, if we add one of the members of s to s again, the add function will have no effect. This same behavior occurs in the "update" function, which adds a group of elements to a set.

>>> s.update([26, 12, 9, 14])
>>> s
set([32, 9, 12, 14, 54, 26])

Note that you can give any iterable, or even another set, to the update method, regardless of what structure was used to initialize the set.

Sets also provide a copy method. However, remember that copy makes a shallow copy: it copies the set, but not the individual elements.

>>> s2 = s.copy()
>>> s2
set([32, 9, 12, 14, 54, 26])

Membership Testing

We can check if an object is in the set using the same "in" operator as with sequential data types.

>>> 32 in s
True
>>> 6 in s
False
>>> 6 not in s
True

We can also test the membership of entire sets. Given two sets s1 and s2, we check if s1 is a subset or a superset of s2.

>>> s.issubset(set([32, 8, 9, 12, 14, -4, 54, 26, 19]))
True
>>> s.issuperset(set([9, 12]))
True

Note that "issubset" and "issuperset" can also accept sequential data types as arguments:

>>> s.issuperset([32, 9])
True

Note that the <= and >= operators are equivalent to the issubset and issuperset methods respectively, except that both operands must be sets.

>>> set([4, 5, 7]) <= set([4, 5, 7, 9])
True
>>> set([9, 12, 15]) >= set([9, 12])
True

As with lists, tuples, and strings, we can use the "len" function to find the number of items in a set.

Removing Items

There are three methods which remove individual items from a set, called pop, remove, and discard. The first, pop, removes an item from the set and returns it. Note that there is no defined behavior as to which element it chooses to remove.

>>> s = set([1,2,3,4,5,6])
>>> s.pop()
1
>>> s
set([2,3,4,5,6])

We also have the "remove" method to remove a specified element.

>>> s.remove(3)
>>> s
set([2,4,5,6])

However, removing an item which isn't in the set causes an error.

>>> s.remove(9)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: 9

If you wish to avoid this error, use "discard". It has the same functionality as remove, but will simply do nothing if the element isn't in the set.
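
A brief sketch of discard, written so that it runs under both Python 2 and Python 3; sorted() is used only to make the printed order predictable:

```python
s = set([2, 4, 5, 6])
s.discard(9)        # 9 is not a member; nothing happens and no error is raised
s.discard(4)        # 4 is a member; it is removed
print(sorted(s))    # [2, 5, 6]
```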

We also have another operation for removing elements from a set, clear, which simply removes all elements from the set.

>>> s.clear()
>>> s
set([])

Iteration Over Sets

We can also have a loop move over each of the items in a set. However, since sets are unordered, it is undefined which order the iteration will follow.

>>> s = set("blerg")
>>> for n in s:
...     print n,
...
r b e l g

Set Operations

Python allows us to perform all the standard mathematical set operations, using methods of set. Note that each of these set operations has several forms. One of these forms, s1.function(s2), will return another set which is created by "function" applied to s1 and s2. The other form, s1.function_update(s2), will change s1 to be the set created by "function" of s1 and s2. Finally, some functions have equivalent special operators. For example, s1 & s2 is equivalent to s1.intersection(s2).

Intersection

Any element which is in both s1 and s2 will appear in their intersection.

>>> s1 = set([4, 6, 9])
>>> s2 = set([1, 6, 8])
>>> s1.intersection(s2)
set([6])
>>> s1 & s2
set([6])
>>> s1.intersection_update(s2)
>>> s1
set([6])

Union

The union is the merger of two sets. Any element in s1 or s2 will appear in their union.

>>> s1 = set([4, 6, 9])
>>> s2 = set([1, 6, 8])
>>> s1.union(s2)
set([1, 4, 6, 8, 9])
>>> s1 | s2
set([1, 4, 6, 8, 9])

Note that union's update function is simply "update" above.
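
For completeness, a sketch of the in-place union using update, written so that it runs under both Python 2 and Python 3; sorted() is used only to make the printed order predictable:

```python
s1 = set([4, 6, 9])
s2 = set([1, 6, 8])
s1.update(s2)       # In-place union; has the same effect as s1 |= s2
print(sorted(s1))   # [1, 4, 6, 8, 9]
print(sorted(s2))   # [1, 6, 8]: s2 is unchanged
```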

Symmetric Difference

The symmetric difference of two sets is the set of elements which are in exactly one of the two sets, but not in both.

>>> s1 = set([4, 6, 9])
>>> s2 = set([1, 6, 8])
>>> s1.symmetric_difference(s2)
set([8, 1, 4, 9])
>>> s1 ^ s2
set([8, 1, 4, 9])
>>> s1.symmetric_difference_update(s2)
>>> s1
set([8, 1, 4, 9])

Set Difference

Python can also find the set difference of s1 and s2, which is the set of elements that are in s1 but not in s2.

>>> s1 = set([4, 6, 9])
>>> s2 = set([1, 6, 8])
>>> s1.difference(s2)
set([9, 4])
>>> s1 - s2
set([9, 4])
>>> s1.difference_update(s2)
>>> s1
set([9, 4])

Multiple sets

Starting with Python 2.6, "union", "intersection", and "difference" can take multiple inputs. They can also be called on the set type itself, as in "set.intersection()":

>>> s1 = set([3, 6, 7, 9])
>>> s2 = set([6, 7, 9, 10])
>>> s3 = set([7, 9, 10, 11])
>>> set.intersection(s1, s2, s3)
set([9, 7])

frozenset

A frozenset is basically the same as a set, except that it is immutable - once it is created, its members cannot be changed. Since they are immutable, they are also hashable, which means that frozensets can be used as members in other sets and as dictionary keys. frozensets have the same functions as normal sets, except none of the functions that change the contents (update, remove, pop, etc.) are available.

>>> fs = frozenset([2, 3, 4])
>>> s1 = set([fs, 4, 5, 6])
>>> s1
set([4, frozenset([2, 3, 4]), 6, 5])
>>> fs.intersection(s1)
frozenset([4])
>>> fs.add(6)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'frozenset' object has no attribute 'add'

Exercises

  1. Create the set {'cat', 1, 2, 3}, call it s.
  2. Create the set {'c', 'a', 't', '1', '2', '3'}.
  3. Create the frozen set {'cat', 1, 2, 3}, call it fs.
  4. Create a set containing the frozenset fs, it should look like {frozenset({'cat', 2, 3, 1})}.



Basic Math


Now that we know how to work with numbers and strings, let's write a program that might actually be useful! Let's say you want to find out how much you weigh in stone. A concise program can make short work of this task. Since a stone is 14 pounds, and there are about 2.2 pounds in a kilogram, the following formula should do the trick:

$$m_{\text{stone}} = \frac{m_{\text{kg}} \times 2.2}{14}$$

So, let's turn this formula into a program!

mass_kg = int(input("What is your mass in kilograms? "))
mass_stone = mass_kg * 2.2 / 14
print("You weigh", mass_stone, "stone.")

Run this program and get your weight in stone! Notice that applying the formula was as simple as putting in a few mathematical statements:

mass_stone = mass_kg * 2.2 / 14

Mathematical Operators

Here are some commonly used mathematical operators:

Syntax Operation Name
a+b addition
a-b subtraction
a*b multiplication
a/b division (see note below)
a//b floor division (e.g. 5//2=2) - Available in Python 2.2 and later
a%b modulo
-a negation
abs(a) absolute value
a**b exponent
math.sqrt(a) square root

Note:
In order to use the math.sqrt() function, you must explicitly tell Python that you want it to load the math module. To do that, write

import math

at the top of your file. For other functions made available by this statement, see the documentation of the math module.

Beware that due to the limitations of floating point arithmetic, rounding errors can cause unexpected results. For example:

 >>> print(0.6/0.2)
 3.0
 >>> print(0.6//0.2)
 2.0

For the Python 2.x series, / does "floor division" for integers and longs (e.g. 5/2=2) but "true division" for floats and complex (e.g. 5.0/2.0=2.5). For Python 3.x, / does "true division" for all types.[1][2]

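When the quotient is known to be a whole number, the rounding problem shown above can be worked around by rounding the result of true division to the nearest integer, as in round(0.6/0.2). Note that in Python 3, round() rounds exact halves to the nearest even integer, so round(0.5) is 0; this is documented behavior rather than an error.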
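
The floating point rounding issue with floor division can also be avoided by using exact rational arithmetic. The following is a sketch using the standard fractions module, under the assumption that the inputs are available as decimal strings so that no binary rounding takes place before the division:

```python
from fractions import Fraction

# With binary floats, 0.6 and 0.2 are stored inexactly, so the floor
# of their quotient comes out as 2.0 rather than 3.0.
print(0.6 // 0.2)                       # 2.0

# Fractions built from strings represent 6/10 and 2/10 exactly.
q = Fraction("0.6") // Fraction("0.2")
print(q == 3)                           # True
```

Exact arithmetic is slower than floats, so this is mainly of interest where the exact quotient matters.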

Order of Operations

Python uses the standard order of operations as taught in Algebra and Geometry classes at high school or secondary school. That is, mathematical expressions are evaluated in the following order (memorized by many as PEMDAS), which is also applied to parentheticals.

(Note that operations which share a table row are performed from left to right. That is, a division to the left of a multiplication, with no parentheses between them, is performed before the multiplication simply because it is to the left.)

Name Syntax Description PEMDAS Mnemonic
Parentheses ( ... ) Before operating on anything else, Python must evaluate all parentheticals starting at the innermost level. (This includes functions.) Please
Exponents ** As an exponent is simply short multiplication or division, it should be evaluated before them. Excuse
Multiplication and Division * / // % Again, multiplication is rapid addition and must, therefore, happen first. My Dear
Addition and Subtraction + - Aunt Sally
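
The precedence rules in the table can be checked with a few expressions. This is a small illustrative sketch that runs under both Python 2 and Python 3:

```python
print(2 + 3 * 4)        # 14: multiplication before addition
print((2 + 3) * 4)      # 20: parentheses are evaluated first
print(2 * 3 ** 2)       # 18: exponent before multiplication
print(8 // 4 * 2)       # 4: same precedence, evaluated left to right
print(-2 ** 2)          # -4: exponent binds tighter than unary minus
```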

Formatting output

Wouldn't it be nice if we always worked with nice round numbers while doing math? Unfortunately, the real world is not quite so neat and tidy as we would like it to be. Sometimes, we end up with long, ugly numbers like the following:

What is your mass in kilograms? 65
You weigh 10.2142857143 stone.

By default, the print statement of Python 2 shows floating point numbers with up to 12 significant digits, and Python 3 prints the shortest string that converts back to the same value, which is often longer. But what if you only want one or two decimal places? We can use the round() function, which rounds a number to the number of decimal places you choose. round() takes two arguments: the number you want to round, and the number of decimal places to round it to. For example:

>>> print (round(3.14159265, 2))
3.14

Now, let's change our program to only print the result to two decimal places.

print ("You weigh", round(mass_stone, 2), "stone.")

This also demonstrates the concept of nesting functions. As you can see, you can place one function inside another function, and everything will still work exactly the way you would expect. If you don't like this, you can always use multiple variables, instead:

twoDecimals = round(mass_stone, 2)
numToString = str(twoDecimals)
print ("You weigh " + numToString + " stone.")
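
As an alternative to round(), the number of decimal places can be fixed when the string is built. The sketch below uses the % string formatting operator, which is available in both Python 2 and Python 3; the mass of 65 kg is taken from the sample run above, and the variable name text is chosen only for this example:

```python
mass_stone = 65 * 2.2 / 14
text = "You weigh %.2f stone." % mass_stone  # Always two decimal places
print(text)                                  # You weigh 10.21 stone.
```

Unlike round(), this keeps trailing zeros, so a value of 10.2 would be shown as 10.20.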

Exercises

  1. Ask the user to specify the number of sides on a polygon and find the number of diagonals within the polygon.
  2. Take the lengths of two sides of a right-angle triangle from the user and apply the Pythagorean Theorem to find the hypotenuse.

Notes

  1. What's New in Python 2.2
  2. PEP 238 -- Changing the Division Operator



Operators


Basics

Python math works like you would expect.

>>> x = 2
>>> y = 3
>>> z = 5
>>> x * y
6
>>> x + y
5
>>> x * y + z
11
>>> (x + y) * z
25

Note that Python adheres to the PEMDAS order of operations.

Powers

There is a built in exponentiation operator **, which can take either integers, floating point or complex numbers. This occupies its proper place in the order of operations.

>>> 2**8
256

Division and Type Conversion

For Python 2.x, dividing two integers or longs uses integer division, also known as "floor division" (applying the floor function after division). So, for example, 5 / 2 is 2. Using "/" to do division this way is deprecated; if you want floor division, use "//" (available in Python 2.2 and later).

"/" does "true division" for floats and complex numbers; for example, 5.0/2.0 is 2.5.

For Python 3.x, "/" does "true division" for all types.[1][2]

Dividing by or into a floating point number will cause Python to use true division. (Python has no built-in fraction literal, although the standard fractions module provides a Fraction type.) To coerce an integer to become a float, call float() with the integer as its argument:

>>> x = 5
>>> float(x)
5.0

This can be generalized for other numeric types: int(), complex(), and long(). Note that long() exists only in Python 2.

Beware that due to the limitations of floating point arithmetic, rounding errors can cause unexpected results. For example:

>>> print 0.6/0.2
3.0
>>> print 0.6//0.2
2.0

Modulo

The modulus (remainder of the division of the two operands, rather than the quotient) can be found using the % operator, or by the divmod builtin function. The divmod function returns a tuple containing the quotient and remainder.

>>> 10%7
3
>>> -10%7
4
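
The divmod function mentioned above returns the quotient and the remainder together. A short sketch, which runs under both Python 2 and Python 3:

```python
quotient, remainder = divmod(10, 7)
print(quotient)     # 1
print(remainder)    # 3

# For a negative dividend the quotient is floored, so the remainder
# takes the sign of the divisor, matching the result of -10 % 7.
quotient, remainder = divmod(-10, 7)
print(quotient)     # -2
print(remainder)    # 4
```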

Negation

Unlike some other languages, variables can be negated directly:

>>> x = 5
>>> -x
-5

Comparison

Operation Means
< Less than
> Greater than
<= Less than or equal to
>= Greater than or equal to
== Equal to
!= Not equal to

Numbers, strings and other types can be compared for equality/inequality and ordering:

>>> 2 == 3
False
>>> 3 == 3
True
>>> 2 < 3
True
>>> "a" < "aa"
True

Identity

The operators is and is not test for object identity and stand in contrast to == (equals): x is y is true if and only if x and y are references to the same object in memory. x is not y yields the inverse truth value. Note that an identity test is more stringent than an equality test since two distinct objects may have the same value.

>>> [1,2,3] == [1,2,3]
True
>>> [1,2,3] is [1,2,3]
False

For the built-in immutable data types (like int, str and tuple) Python uses caching mechanisms to improve performance, i.e., the interpreter may decide to reuse an existing immutable object instead of generating a new one with the same value. The details of object caching are subject to changes between different Python versions and are not guaranteed to be system-independent, so identity checks on immutable objects like 'hello' is 'hello', (1,2,3) is (1,2,3), 4 is 2**2 may give different results on different machines.

In some Python implementations, the following results are applicable:

print 8 is 8            # True
print "str" is "str"    # True
print (1, 2) is (1, 2)  # False - distinct objects, even though tuples are immutable
print [1, 2] is [1, 2]  # False
print id(8) == id(8)    # True
int1 = 8
print int1 is 8         # True
oldid = id(int1)
int1 += 2
print id(int1) == oldid # False

Augmented Assignment

There is shorthand for assigning the output of an operation to one of the inputs:

>>> x = 2
>>> x # 2
2
>>> x *= 3
>>> x # 2 * 3
6
>>> x += 4
>>> x # 2 * 3 + 4
10
>>> x //= 5
>>> x # (2 * 3 + 4) // 5
2
>>> x **= 2
>>> x # ((2 * 3 + 4) // 5) ** 2
4
>>> x %= 3
>>> x # ((2 * 3 + 4) // 5) ** 2 % 3
1

>>> x = 'repeat this  '
>>> x
'repeat this  '
>>> x *= 3  # fill with x repeated three times
>>> x
'repeat this  repeat this  repeat this  '

Boolean

or

if a or b:
    do_this
else:
    do_that

and

if a and b:
    do_this
else:
    do_that

not

if not a:
    do_this
else:
    do_that

The order of operations here is: not first, and second, or third. In particular, "True or True and False or False" becomes "True or False or False" which is True.

Caution: Boolean operators are valid on things other than Booleans; for instance "1 and 6" will return 6. Specifically, "and" returns either the first value considered to be false, or the last value if all are considered true. "or" returns the first true value, or the last value if all are considered false.
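
The return values described above can be observed directly. A small sketch that runs under both Python 2 and Python 3; the variable names name and display_name are chosen only for this example:

```python
print(1 and 6)          # 6: all operands are true, so the last is returned
print(0 and 6)          # 0: the first false operand is returned
print(repr(0 or ""))    # '': all operands are false, so the last is returned
print("" or "default")  # default: the first true operand is returned

# A common use: fall back to a default when a value is empty
name = ""
display_name = name or "anonymous"
print(display_name)     # anonymous
```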

Exercises

  1. Use Python to calculate .
  2. Use Python to calculate .
  3. Use Python to calculate 11111111111111111111+22222222222222222222, but in one line of code with at most 15 characters. (Hint: each of those numbers is 20 digits long, so you have to find some other way to input those numbers)
  4. Exactly one of the following expressions evaluates to "cat"; the other evaluates to "dog". Trace the logic to determine which one is which, then check your answer using Python.
1 and "cat" or "dog"
0 and "cat" or "dog"



Control Flow


As with most imperative languages, there are three main categories of program control flow:

  • loops
  • branches
  • function calls

Function calls are covered in the next section.

Generators and list comprehensions are advanced forms of program control flow, but they are not covered here.

Overview

Control flow in Python at a glance:

x = -6                              # Branching
if x > 0:                           # If
  print "Positive"
elif x == 0:                        # Else if AKA elseif
  print "Zero"
else:                               # Else
  print "Negative"
list1 = [100, 200, 300]
for i in list1: print i             # A for loop
for i in range(0, 5): print i       # A for loop from 0 to 4
for i in range(5, 0, -1): print i   # A for loop from 5 to 1
for i in range(0, 5, 2): print i    # A for loop from 0 to 4, step 2
list2 = [(1, 1), (2, 4), (3, 9)]
for x, xsq in list2: print x, xsq   # A for loop unpacking each two-tuple
l1 = [1, 2]; l2 = ['a', 'b']
for i1, i2 in zip(l1, l2): print i1, i2 # A for loop iterating two lists at once.
i = 5
while i > 0:                        # A while loop
  i -= 1
list1 = ["cat", "dog", "mouse"]
i = -1 # -1 if not found
for item in list1:
  i += 1
  if item=="dog":
    break                           # Break; also usable with while loop
print "Index of dog:",i             
for i in range(1,6):
  if i <= 4:
    continue                        # Continue; also usable with while loop
  print "Greater than 4:", i

Loops

In Python, there are two kinds of loops, 'for' loops and 'while' loops.

For loops

A for loop iterates over the elements of a sequence, such as a tuple or a list. A variable is created to represent the current object in the sequence. For example,

x = [100,200,300]
for i in x:
    print (i)   # The parentheses are required in Python 3

This will output

100
200
300

The for loop loops over each of the elements of a list or iterator, assigning the current element to the variable name given. In the example above, each of the elements in x is assigned to i.

A built-in function called range exists to make creating sequential lists such as the one above easier (in Python 3, range returns a lazy sequence object rather than a list). The loop above is equivalent to:

l = range(100, 301,100)
for i in l:
    print (i)

The next example uses a negative step (the third argument for the built-in range function):

for i in range(5, 0, -1):
    print (i)

This will output

5
4
3
2
1

The negative step can be -2:

for i in range(10, 0, -2):
    print (i)

This will output

10
8
6
4
2
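The three range calls used above can be checked by converting each one to a list. This is a small sketch that assumes Python 3, where range produces its values lazily and list() is needed to see them all at once; the variable names are only illustrative.

```python
# range(start, stop, step): start is included, stop is never produced.
ascending = list(range(100, 301, 100))
descending = list(range(5, 0, -1))
by_twos = list(range(10, 0, -2))

print(ascending)   # [100, 200, 300]
print(descending)  # [5, 4, 3, 2, 1]
print(by_twos)     # [10, 8, 6, 4, 2]
```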

A for loop can bind a separate name to each element of a tuple when it loops over a sequence of tuples:

l = [(1, 1), (2, 4), (3, 9), (4, 16), (5, 25)]
for x, xsquared in l:
    print x, ':', xsquared

This will output

1 : 1
2 : 4
3 : 9
4 : 16
5 : 25
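The same unpacking works with any iterable that yields tuples. As a hedged illustration (Python 3 syntax assumed), the built-in enumerate function yields (index, element) pairs, which a for loop can unpack into two names; the list animals is just sample data.

```python
animals = ["cat", "dog", "mouse"]

# enumerate() yields (index, element) pairs, unpacked into two names.
for index, animal in enumerate(animals):
    print(index, ':', animal)

# Collecting the pairs shows exactly what the loop iterates over.
pairs = list(enumerate(animals))
print(pairs)  # [(0, 'cat'), (1, 'dog'), (2, 'mouse')]
```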

While loops

A while loop repeats a sequence of statements until some condition becomes false. For example:

x = 5
while x > 0:
    print (x)  # print requires parentheses in Python 3
    x = x - 1  # the arithmetic does not need parentheses

Will output:

5
4
3
2
1

Python's while loops can also have an 'else' clause, a block of statements that is executed once when the loop condition evaluates to false. If the loop is left through a break statement, the else clause is skipped. For example:

x = 5
y = x
while y > 0:
    print (y)
    y = y - 1
else:
    print (x)

This will output:

5
4
3
2
1
5

Unlike some languages, there is no post-condition loop.
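A post-condition loop can still be imitated with while True and a break placed at the end of the body. The sketch below is one possible way to do it, assuming Python 3; the function name digits_of is made up for this illustration.

```python
def digits_of(n):
    """Return the decimal digits of a non-negative integer, lowest first."""
    digits = []
    while True:
        digits.append(n % 10)   # the body runs at least once, even for n == 0
        n //= 10
        if n == 0:              # the condition is tested after the body
            break
    return digits

print(digits_of(0))    # [0]
print(digits_of(472))  # [2, 7, 4]
```

With an ordinary while n > 0 loop, digits_of(0) would return an empty list, because the body would never run.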

Breaking and continuing

Python includes statements to exit a loop (either a for loop or a while loop) prematurely. To exit a loop, use the break statement:

x = 5
while x > 0:
    print x
    break
    x -= 1
    print x

This will output

5

The continue statement begins the next iteration of the loop immediately, skipping whatever remains of the current iteration.

l = [5,6,7]
for x in l:
    continue
    print x

This will not produce any output.
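In practice continue is usually placed inside an if, so that only some iterations are cut short. A minimal sketch, assuming Python 3, with illustrative names:

```python
odd_numbers = []
for x in range(1, 8):
    if x % 2 == 0:
        continue            # skip the rest of the body for even numbers
    odd_numbers.append(x)   # reached only for odd numbers

print(odd_numbers)  # [1, 3, 5, 7]
```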

Else clause of loops

The else clause of a loop is executed only if the loop finishes without meeting a break statement.

l = range(1,100)
for x in l:
    if x == 100:
        print x
        break
    else:
        print x," is not 100"
else:
    print "100 not found in range"
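A common use of the loop else clause is a search that needs a fallback when nothing is found. The following sketch assumes Python 3; find_index is a hypothetical helper written for this illustration, not a built-in.

```python
def find_index(items, target):
    """Return the position of target in items, or -1 if it is absent."""
    for index, item in enumerate(items):
        if item == target:
            break           # leaving via break skips the else clause
    else:
        index = -1          # runs only when the loop ended without break
    return index

print(find_index(["cat", "dog", "mouse"], "dog"))   # 1
print(find_index(["cat", "dog", "mouse"], "bird"))  # -1
```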


Another example of a while loop using the break statement and the else statement:

expected_str = "melon"
received_str = "apple"
basket = ["banana", "grapes", "strawberry", "melon", "orange"]
x = 0
step = int(raw_input("Input iteration step: "))
 
while(received_str != expected_str):
    if(x >= len(basket)): print "No more fruits left on the basket."; break
    received_str = basket[x]
    x += step # Change this to 3 to make the while statement
              # evaluate to false, avoiding the break statement, using the else clause.
    if(received_str==basket[2]): print "I hate",basket[2],"!"; break
    if(received_str != expected_str): print "I am waiting for my ",expected_str,"."
else:
    print "Finally got what I wanted! my precious ",expected_str,"!"
print "Going back home now !"

This will output:


Input iteration step: 2
I am waiting for my  melon .
I hate strawberry !
Going back home now !

White Space

Python determines which statements belong to a loop by their indentation. Every statement indented under the loop header is part of the loop body; the first following statement that is not indented ends it. For example, the code below prints "1 1 2 1 1 2" (one value per line):

for i in [0, 1]:
    for j in ["a","b"]:
        print("1")
    print("2")

On the other hand, the code below prints "1 2 1 2 1 2 1 2"

for i in [0, 1]:
    for j in ["a","b"]:
        print("1")
        print("2")

Branches

There is basically only one kind of branch in Python, the 'if' statement. The simplest form of the if statement simply executes a block of code if a given predicate is true, and skips over it if the predicate is false.

For instance,

>>> x = 10
>>> if x > 0:
...    print "Positive"
...
Positive
>>> if x < 0:
...    print "Negative"
...

You can also add "elif" (short for "else if") branches onto the if statement. If the predicate on the first “if” is false, it will test the predicate on the first elif, and run that branch if it’s true. If the first elif is false, it tries the second one, and so on. Note, however, that it will stop checking branches as soon as it finds a true predicate, and skip the rest of the if statement. You can also end your if statements with an "else" branch. If none of the other branches are executed, then Python will run this branch.

>>> x = -6
>>> if x > 0:
...    print "Positive"
... elif x == 0:
...    print "Zero"
... else:
...    print "Negative"
...
Negative
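The same chain can be wrapped in a function so that it can be tried with several values. This is only a sketch, assuming Python 3; describe_sign is a name invented here.

```python
def describe_sign(x):
    """Classify a number using an if / elif / else chain."""
    if x > 0:
        return "Positive"
    elif x == 0:            # tested only when x > 0 was false
        return "Zero"
    else:                   # reached only when every test above was false
        return "Negative"

for value in (10, 0, -6):
    print(value, describe_sign(value))
```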

Conclusion

Any of these loops, branches, and function calls can be nested in any way desired. A loop can loop over a loop, a branch can branch again, and a function can call other functions, or even call itself.

Exercises

  1. Print the numbers from 0 to 1000 (including both 0 and 1000).
  2. Print the numbers from 0 to 1000 that are multiples of 5.
  3. Print the numbers from 1 to 1000 that are multiples of 5.
  4. Use a nested for-loop to print the 3x3 multiplication table below.
1 2 3
2 4 6
3 6 9
  5. Print the 3x3 multiplication table below.
  1 2 3
 ------
1|1 2 3
2|2 4 6
3|3 6 9



Decision Control


Python, like many other computer programming languages, uses Boolean logic for its decision control. That is, the Python interpreter compares one or more values in order to decide whether to execute a piece of code or not, given the proper syntax and instructions.

Decision control is then divided into two major categories, conditional and repetition. Conditional logic simply uses the keyword if and a Boolean expression to decide whether or not to execute a code block. Repetition builds on the conditional constructs by giving us a simple method in which to repeat a block of code while a Boolean expression evaluates to true.

Boolean Expressions

Here is a little example of boolean expressions (you don't have to type it in):

a = 6
b = 7
c = 42
print (1, a == 6)
print (2, a == 7)
print (3, a == 6 and b == 7)
print (4, a == 7 and b == 7)
print (5, not a == 7 and b == 7)
print (6, a == 7 or b == 7)
print (7, a == 7 or b == 6)
print (8, not (a == 7 and b == 6))
print (9, not a == 7 and b == 6)

With the output being:

1 True
2 False
3 True
4 False
5 True
6 True
7 False
8 True
9 False

What is going on? The program consists of a bunch of funny looking print statements. Each print statement prints a number and an expression. The number is to help keep track of which statement I am dealing with. Notice how each expression ends up being either True or False; these are built-in Python values. The lines:

print (1, a == 6)
print (2, a == 7)

print out True and False respectively, just as expected, since the first is true and the second is false. The third print, print (3, a == 6 and b == 7), is a little different. The operator and means if both the statement before and the statement after are true then the whole expression is true otherwise the whole expression is false. The next line, print (4, a == 7 and b == 7), shows how if part of an and expression is false, the whole thing is false. The behavior of and can be summarized as follows:

expression         result
true and true      true
true and false     false
false and true     false
false and false    false

Note that if the first expression is false Python does not check the second expression since it knows the whole expression is false.

The next line, print (5, not a == 7 and b == 7), uses the not operator. not just gives the opposite of the expression (The expression could be rewritten as print (5, a != 7 and b == 7)). Here's the table:

expression    result
not true      false
not false     true

The two following lines, print (6, a == 7 or b == 7) and print (7, a == 7 or b == 6), use the or operator. The or operator returns true if the first expression is true, if the second expression is true, or if both are true. If neither is true it returns false. Here's the table:

expression        result
true or true      true
true or false     true
false or true     true
false or false    false

Note that if the first expression is true Python doesn't check the second expression since it knows the whole expression is true. This works since or is true if at least one of the expressions is true. The first part is true so the second part could be either false or true, but the whole expression is still true.

The next two lines, print (8, not (a == 7 and b == 6)) and print (9, not a == 7 and b == 6), show that parentheses can be used to group expressions and force one part to be evaluated first. Notice that the parentheses changed the expression from false to true. This occurred since the parentheses forced the not to apply to the whole expression instead of just the a == 7 portion.
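The effect of the parentheses can be seen by storing both forms in variables. A short sketch, assuming Python 3 and the same values of a and b as above:

```python
a = 6
b = 7

# not binds more tightly than and, so this means (not a == 7) and (b == 6).
without_parens = not a == 7 and b == 6
# Parentheses make not apply to the whole and expression.
with_parens = not (a == 7 and b == 6)

print(without_parens)  # False
print(with_parens)     # True
```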

Here is an example of using a boolean expression:

list = ["Life","The Universe","Everything","Jack","Jill","Life","Jill"]

# Make a copy of the list.
copy = list[:]
# Sort the copy
copy.sort()
prev = copy[0]
del copy[0]

count = 0

# Go through the list searching for a match
while count < len(copy) and copy[count] != prev:
    prev = copy[count]
    count = count + 1

# If a match was not found then count can't be < len
# since the while loop continues while count is < len
# and no match is found
if count < len(copy):
    print ("First Match:",prev)

See the Lists chapter for an explanation of what [:] means in the line copy = list[:].

Here is the output:

First Match: Jill

This program works by continuing to check for a match while count < len(copy) and copy[count] != prev. When count goes past the last index of copy, or when a match has been found, the and expression is no longer true, so the loop exits. The if simply checks that the while loop exited because a match was found.

The other 'trick' of and is used in this example. As noted under the table for and, when the first expression is false Python does not check the second one. If count >= len(copy) (in other words count < len(copy) is false) then copy[count] is never looked at. This is because Python knows that if the first is false then they can't both be true. This is known as a short circuit and is useful if the second half of the and will cause an error if something is wrong. I used the first expression (count < len(copy)) to check and see if count was a valid index for copy. (If you don't believe me, remove the matches 'Jill' and 'Life', check that it still works, and then reverse the order of count < len(copy) and copy[count] != prev to copy[count] != prev and count < len(copy).)
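The short circuit can be observed in isolation with a pair of small functions. This is a sketch assuming Python 3; first_is and first_is_unsafe are hypothetical names chosen for the illustration.

```python
def first_is(items, wanted):
    # len(items) > 0 is checked first; items[0] is evaluated only if it is true.
    return len(items) > 0 and items[0] == wanted

def first_is_unsafe(items, wanted):
    # Reversed order: items[0] is evaluated first and fails on an empty list.
    return items[0] == wanted and len(items) > 0

print(first_is([], "Jill"))          # False
print(first_is(["Jill"], "Jill"))    # True
try:
    first_is_unsafe([], "Jill")
except IndexError:
    print("IndexError: the index was used before it was checked")
```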

Boolean expressions can be used when you need to check two or more different things at once.

Examples

password1.py

# This program asks a user for a name and a password.
# It then checks them to make sure that the user is allowed in.
# Note that this is a simple and insecure example,
# real password code should never be implemented this way.

name = raw_input("What is your name? ")
password = raw_input("What is the password? ")
if name == "Josh" and password == "Friday":
    print ("Welcome Josh")
elif name == "Fred" and password == "Rock":
    print ("Welcome Fred")
else:
    print ("I don't know you.")

Sample runs

What is your name? Josh
What is the password? Friday
Welcome Josh

What is your name? Bill
What is the password? Saturday
I don't know you.

Exercises

  1. Write a program that has a user guess your name, but they only get 3 chances to do so until the program quits.

Solutions



Conditional Statements


Decisions

A decision is made when a program has more than one choice of action depending on a variable's value. Think of a traffic light. When it is green, we continue our drive. When we see the light turn yellow, we reduce our speed, and when it is red, we stop. These are logical decisions that depend on the value of the traffic light. Luckily, Python has a decision statement to help us when our application needs to make such decisions for the user.

If statements

Here is a warm-up exercise - a short program to compute the absolute value of a number:
absoval.py

n = input("Integer? ")  # Pick an integer. In Python 2, use raw_input() instead of input().
n = int(n)  # Convert the text that was typed into an integer. (Alternatively, you can define n yourself.)
if n < 0:
    print ("The absolute value of",n,"is",-n)
else:
    print ("The absolute value of",n,"is",n)

Here is the output from the two times that I ran this program:

Integer? -34
The absolute value of -34 is 34

Integer? 1
The absolute value of 1 is 1

What does the computer do when it sees this piece of code? First it prompts the user for a number and converts the reply to an integer with the first two statements. Next it reads the line "if n < 0:". If n is less than zero Python runs the line "print ("The absolute value of",n,"is",-n)". Otherwise Python runs the line "print ("The absolute value of",n,"is",n)".

More formally, Python looks at whether the expression n < 0 is true or false. An if statement is followed by an indented block of statements that are run when the expression is true. After the if statement is an optional else statement and another indented block of statements. This second block of statements is run if the expression is false.

Expressions can be tested several different ways. Here is a table of all of them:

operator    function
<           less than
<=          less than or equal to
>           greater than
>=          greater than or equal to
==          equal
!=          not equal


Another feature of the if command is the elif statement. It stands for "else if," which means that if the original if statement is false and the elif statement is true, execute the block of code following the elif statement. Here's an example:
ifloop.py

a = 0
while a < 10:
    a = a + 1
    if a > 5:
        print (a,">",5)
    elif a <= 7:
        print (a,"<=",7)
    else:
        print ("Neither test was true")

and the output:

1 <= 7
2 <= 7
3 <= 7
4 <= 7
5 <= 7
6 > 5
7 > 5
8 > 5
9 > 5
10 > 5

Notice how the elif a <= 7 is only tested when the if statement fails to be true. elif allows multiple tests to be done in a single if statement.

If Example

High_low.py

# Plays the guessing game higher or lower 
# (originally written by Josh Cogliati, improved by Quique, now improved
# by Sanjith, further improved by VorDd, with continued improvement from
# the various Wikibooks contributors.)
 
# This should actually be something that is semi random like the
# last digits of the time or something else, but that will have to
# wait till a later chapter.  (Extra Credit, modify it to be random
# after the Modules chapter)
 
# This is for demonstration purposes only. 
# It is not written to handle invalid input like a full program would.
 
answer = 23
question = 'What number am I thinking of?  '
print ('Let\'s play the guessing game!')

while True:
    guess = int(input(question))

    if guess < answer:
        print ('Little higher')
    elif guess > answer:
        print ('Little lower')
    else: # guess == answer
        print ('MINDREADER!!!')
        break

Sample run:

Let's play the guessing game!
What number am I thinking of?  22
Little higher
What number am I thinking of?  25
Little lower
What number am I thinking of?  23
MINDREADER!!!

As it states in its comments, this code is not prepared to handle invalid input (i.e., strings instead of numbers). If you are wondering how you would implement such functionality in Python, you are referred to the Errors Chapter of this book, where you will learn about error handling. For the above code you may try this slight modification of the while loop:

while True:
	inp = input(question)
	try:
		guess = int(inp)
	except ValueError:
		print('Your guess should be a number')
	else:
		if guess < answer:
			print ('Little higher')
		elif guess > answer:
			print ('Little lower')
		else: # guess == answer
			print ('MINDREADER!!!')
			break

even.py

#Asks for a number.
#Prints if it is even or odd

print ("Input [x] for exit.")

while True:
	inp = input("Tell me a number: ")
	if inp == 'x':
		break
	# catch any resulting ValueError during the conversion to float
	try:
		number = float(inp)
	except ValueError:
		print('I said: Tell me a NUMBER!')
	else:
		test = number % 2
		if test == 0:
			print (int(number),"is even.")
		elif test == 1:
			print (int(number),"is odd.")
		else:
			print (number,"is very strange.")

Sample runs.

Tell me a number: 3
3 is odd.

Tell me a number: 2
2 is even.

Tell me a number: 3.14159
3.14159 is very strange.

average1.py

#Prints the average value.
 
print ("Welcome to the average calculator program")
print ("NOTE- THIS PROGRAM ONLY CALCULATES AVERAGES FOR 3 NUMBERS")
x = int(input("Please enter the first number "))
y = int(input("Please enter the second number "))
z = int(input("Please enter the third number "))
total = x+y+z  # named total rather than str, which would hide the built-in str
print (total/3.0)
#MADE BY SANJITH sanrubik@gmail.com

Sample runs

Welcome to the average calculator program
NOTE- THIS PROGRAM ONLY CALCULATES AVERAGES FOR 3 NUMBERS
Please enter the first number 7
Please enter the second number 6
Please enter the third number 4
5.66666666667

average2.py

#Keeps asking for numbers until count numbers have been entered.
#Prints the average value.

sum = 0.0

print ("This program will take several numbers, then average them.")
count = int(input("How many numbers would you like to sum:  "))
current_count = 0
 
while current_count < count:
	print ("Number",current_count)
	number = float(input("Enter a number:  "))
	sum = sum + number
	current_count += 1
 
print("The average was:",sum/count)

Sample runs

This program will take several numbers, then average them.
How many numbers would you like to sum:  2
Number 0
Enter a number:  3
Number 1
Enter a number:  5
The average was: 4.0

This program will take several numbers, then average them.
How many numbers would you like to sum:  3
Number 0
Enter a number:  1
Number 1
Enter a number:  4
Number 2
Enter a number:  3
The average was: 2.66666666667

average3.py

#Continuously updates the average as new numbers are entered.

print "Welcome to the Average Calculator, please insert a number"
currentaverage = 0
numofnums = 0
while True:
    newnumber = int(raw_input("New number "))
    numofnums = numofnums + 1
    currentaverage = (round((((currentaverage * (numofnums - 1)) + newnumber) / numofnums), 3))
    print ("The current average is " + str((round(currentaverage, 3))))

Sample runs

Welcome to the Average Calculator, please insert a number
New number 1
The current average is 1.0
New number 3
The current average is 2.0
New number 6
The current average is 3.333
New number 6
The current average is 4.0
New number
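average3.py rounds the stored average on every pass, so a small rounding error can be carried into later results. As a hedged alternative, the sketch below keeps an unrounded running mean and rounds only the values it reports. It assumes Python 3 and takes its numbers from a list instead of the keyboard so that it can be run unattended; running_averages is a name invented for this illustration.

```python
def running_averages(numbers):
    """Return the average after each new number, rounded only for display."""
    averages = []
    current_average = 0.0
    count = 0
    for new_number in numbers:
        count += 1
        # Incremental update: move the average toward the new number by 1/count.
        current_average += (new_number - current_average) / count
        averages.append(round(current_average, 3))
    return averages

print(running_averages([1, 3, 6, 6]))  # [1.0, 2.0, 3.333, 4.0]
```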


If Exercises

  1. Write a password guessing program to keep track of how many times the user has entered the password wrong. If it is more than 3 times, print You have been denied access. and terminate the program. If the password is correct, print You have successfully logged in. and terminate the program.
  2. Write a program that asks for two numbers. If the sum of the numbers is greater than 100, print That is a big number and terminate the program.
  3. Write a program that asks the user their name. If they enter your name, say "That is a nice name." If they enter "John Cleese" or "Michael Palin", tell them how you feel about them ;), otherwise tell them "You have a nice name."
  4. Ask the user to enter the password. If the password is correct print "You have successfully logged in" and exit the program. If the password is wrong print "Sorry the password is wrong" and ask the user to enter the password 3 times. If the password is wrong print "You have been denied access" and exit the program.
##   Password guessing program using if statement and while statement only
###  source by zain


guess_count = 0

correct_pass = 'dee234'

while True:
	pass_guess = str(input("Please enter your password: "))

	guess_count += 1

	if pass_guess == correct_pass:
		print ('You have successfully logged in.')
		break

	elif pass_guess != correct_pass:
		if guess_count >= 3:
			print ("You have been denied access.")
			break





def mard():
    for i in range(1,4):
        a = raw_input("enter a password:  ") # to ask password
        b = "sefinew" # the password
        if a == b: # if the password entered and the password are the same to print.
            print("You have successfully logged in")
            exit()# to terminate the program.  Using 'break' instead of 'exit()' will allow your shell or idle to dump the block and continue to run.
        else: # if the password entered and the password are not the same to print.
            print("Sorry the password is wrong ")
            if i == 3:
                print("You have been denied access")
                exit() # to terminate the program

mard()


#Source by Vanchi
import time
import getpass

password = getpass.getpass("Please enter your password")

print "Waiting for 3 seconds"
time.sleep(3)
got_it_right = False
for number_of_tries in range(1,4):
    reenter_password = getpass.getpass("Please reenter your password")
    if password == reenter_password:
        print "You are Logged in! Welcome User :)"
        got_it_right = True
        break

if not got_it_right:
    print "Access Denied!!"

Conditional Statements

Many languages (like Java and PHP) have the concept of a one-line conditional (called the ternary operator), often used to simplify conditionally accessing a value. For instance (in Java):

int number; // read from program input

// a normal conditional assignment
int res;
if(number < 0)
  res = -number;
else
  res = number;

// the same assignment written with the ternary operator
int res2 = (number < 0) ? -number : number;

For many years Python did not have the same construct natively. However, you could replicate it by constructing a tuple of results and using the test as the index into the tuple, like so:

number = int(raw_input("Enter a number to get its absolute value:"))
res = (-number, number)[number > 0]

It is important to note that, unlike a built in conditional statement, both the true and false branches are evaluated before returning, which can lead to unexpected results and slower executions if you're not careful. To resolve this issue, and as a better practice, wrap whatever you put in the tuple in anonymous function calls (lambda notation) to prevent them from being evaluated until the desired branch is called:

number = int(raw_input("Enter a number to get its absolute value:"))
res = (lambda: number, lambda: -number)[number < 0]()

Since Python 2.5, however, there has been an equivalent to the ternary operator (though not called such, and with a totally different syntax):

number = int(raw_input("Enter a number to get its absolute value:"))
res = -number if number < 0 else number
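The difference in evaluation order between the tuple form and the conditional expression can be demonstrated with a division that must be avoided. This is a sketch assuming Python 3 (where / is true division); absolute_value and safe_ratio are hypothetical helper names.

```python
def absolute_value(number):
    # Only the selected branch of a conditional expression is evaluated.
    return -number if number < 0 else number

def safe_ratio(numerator, denominator):
    # The division is skipped entirely when denominator is 0.
    return numerator / denominator if denominator != 0 else 0.0

print(absolute_value(-34))  # 34
print(safe_ratio(1, 4))     # 0.25
print(safe_ratio(1, 0))     # 0.0

# The tuple form evaluates both entries before indexing, so it fails here.
try:
    (0.0, 1 / 0)[True]
except ZeroDivisionError:
    print("the tuple form evaluated 1 / 0 before choosing")
```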

Switch

A switch is a control statement present in most computer programming languages that replaces a long chain of if - elif statements. Python has traditionally had no switch statement (Python 3.10 added a related match statement), but with the clever use of a list or dictionary we can recreate a switch that depends on a value.

x = 1

def hello():
  print ("Hello")

def bye():
  print ("Bye")

def hola():
  print ("Hola is Spanish for Hello")

def adios():
  print ("Adios is Spanish for Bye")

# Notice that our switch statement is a regular variable, only that we added the function's name inside
# and there are no quotes
menu = [hello,bye,hola,adios]

# To call our switch statement, we simply make reference to the array with a pair of parentheses
# at the end to call the function
menu[3]()   # calls the adios function since is number 3 in our array.

menu[0]()   # Calls the hello function being our first element in our array.

menu[x]()   # Calls the bye function as is the second element on the array x = 1

This works because Python stores a reference to each function in the list at its particular index, and by adding a pair of parentheses we are actually calling the function. Here the last line, menu[x](), calls bye() because x is 1.

Another way: using functions through user input

go = "y"
x = 0
def hello():
  print ("Hello")
 
def bye():
  print ("Bye")
 
def hola():
  print ("Hola is Spanish for Hello")
 
def adios():
  print ("Adios is Spanish for Bye")

menu = [hello, bye, hola, adios]
 

while x < len(menu) :
    print "function", menu[x].__name__, ", press " , "[" + str(x) + "]"
    x += 1
    
while go != "n":
    c = input("Select Function: ")
    menu[c]()
    go = raw_input("Try again? [y/n]: ")

print "\nBye!"
   

#end

Another way

if x == 0:
    hello()
elif x == 1:
    bye()
elif x == 2:
    hola()
else:
    adios()

Another way

Another way is to use lambdas. Code pasted here with permissions[4].

result = {
  'a': lambda x: x * 5,
  'b': lambda x: x + 7,
  'c': lambda x: x - 2
}[value](x)

For more information on lambda see anonymous functions in the function section.
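The fragment above leaves value and x undefined and raises KeyError for an unknown key. A self-contained variant might look like the following sketch, which assumes Python 3; apply_operation is an invented name, and the fallback that returns its argument unchanged is an assumption made for the example rather than part of the original code.

```python
def apply_operation(value, x):
    """Pick a function by key, in the manner of a switch, and apply it to x."""
    operations = {
        'a': lambda n: n * 5,
        'b': lambda n: n + 7,
        'c': lambda n: n - 2,
    }
    # dict.get supplies a fallback, playing the role of a default case.
    operation = operations.get(value, lambda n: n)
    return operation(x)

print(apply_operation('a', 3))  # 15
print(apply_operation('b', 3))  # 10
print(apply_operation('z', 3))  # 3
```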



Loops


While loops

This is our first control structure. Ordinarily the computer starts with the first line and then goes down from there. Control structures change the order that statements are executed or decide if a certain statement will be run. As a side note, decision statements (e.g., if statements) also influence whether or not a certain statement will run. Here's the source for a program that uses the while control structure:

a = 0
while a < 5:
    a += 1 # Same as a = a + 1 
    print (a)

And here is the output:

1
2
3
4
5

So what does the program do? First it sees the line a = 0 and sets the variable a to zero. Then it sees while a < 5: and so the computer checks to see if a < 5. The first time the computer sees this statement, a is zero, and zero is less than 5. In other words, while a is less than five, the computer will run the indented statements.

Here is another example of the use of while:

a = 1
s = 0
print ('Enter Numbers to add to the sum.')
print ('Enter 0 to quit.')
while a != 0:
    print ('Current Sum: ', s)
    a = raw_input('Number? ')
    a = float(a)
    s += a
print ('Total Sum = ',s)

Output:

Enter Numbers to add to the sum.
Enter 0 to quit.
Current Sum: 0
Number? 200
Current Sum: 200
Number? -15.25
Current Sum: 184.75
Number? -151.85
Current Sum: 32.9
Number? 10.00
Current Sum: 42.9
Number? 0
Total Sum = 42.9

Notice how print ('Total Sum = ',s) is only run at the end. The while statement only affects the lines that are indented. The != means "does not equal", so while a != 0: means: until a is zero, run the indented statements that follow.

Now that we have while loops, it is possible to have programs that run forever. An easy way to do this is to write a program like this:

while 1 == 1:
    print ("Help, I'm stuck in a loop.")

This program will output Help, I'm stuck in a loop. until the heat death of the universe or you stop it. The way to stop it is to hit the Control (or Ctrl) button and `c' (the letter) at the same time. This will kill the program. (Note: sometimes you will have to hit enter after the Control C.)

Examples

Fibonacci.py

#This program calculates the Fibonacci sequence
a = 0
b = 1
count = 0
max_count = 20
while count < max_count:
    count = count + 1
    #we need to keep track of a since we change it
    old_a = a
    old_b = b
    a = old_b
    b = old_a + old_b
    #Notice that the , at the end of a print statement keeps it
    # from switching to a new line
    print (old_a),

Output:

0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181
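The temporary variables old_a and old_b can be avoided with tuple assignment, which evaluates the whole right-hand side before assigning. The following is a sketch in Python 3 syntax; the function name fibonacci and the choice to return a list rather than print inside the loop are assumptions made for the example.

```python
def fibonacci(max_count):
    """Return the first max_count Fibonacci numbers, starting from 0."""
    numbers = []
    a, b = 0, 1
    for _ in range(max_count):
        numbers.append(a)
        a, b = b, a + b   # the right-hand side is evaluated before assignment
    return numbers

# Print the numbers on one line, separated by spaces.
print(*fibonacci(20))
```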

Password.py

# Waits until a password has been entered.  Use control-C to break out without
# the password.

# Note that this must not be the password so that the 
# while loop runs at least once.
password = "foobar"

#note that != means not equal
while password != "unicorn":
    password = raw_input("Password: ")
print ("Welcome in")

Sample run:

Password: auo
Password: y22
Password: password
Password: open sesame
Password: unicorn
Welcome in

For Loops

The next type of loop in Python is the for loop. Unlike in most languages, for requires some iterable object, such as a set or list, to work.

onetoten = range(1,11)
for count in onetoten:
    print (count)

The output:

1
2
3
4
5
6
7
8
9
10

The output looks very familiar, but the program code looks different. The first line uses the range function. The range function uses two arguments like this range(start,finish). start is the first number that is produced. finish is one larger than the last number. Note that this program could have been done in a shorter way:

for count in range(1,11):
    print (count)

Here are some examples to show what happens with the range function:

>>> range(1,10)
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> range(-32, -20)
[-32, -31, -30, -29, -28, -27, -26, -25, -24, -23, -22, -21]
>>> range(5,21)
[5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
>>> range(21,5)
[]

Another way to use the range() function in a for loop is to supply only one argument:

for a in range(10):
    print (a)

The above code acts exactly the same as:

for a in range(0, 10):
    print (a)

with 0 implied as the starting point. The output is

0 1 2 3 4 5 6 7 8 9

The code would cycle through the for loop 10 times as expected, but starting with 0 instead of 1.

In the first example, the line for count in onetoten: uses the for control structure. A for control structure looks like for variable in list:. The list is gone through from its first element to its last. As for goes through each element in the list, it puts each one into variable, so that variable can be used on each pass through the loop. Here is another example to demonstrate:

demolist = ['life',42, 'the universe', 6,'and',7,'everything']
for item in demolist:
    print ("The Current item is: %s" % item)

The output is:

The Current item is: life
The Current item is: 42
The Current item is: the universe
The Current item is: 6
The Current item is: and
The Current item is: 7
The Current item is: everything

Notice how the for loop goes through and sets item to each element in the list. (In Python 2, if you don't want print to go to the next line, add a comma at the end of the statement, for example when you want to print something else on that line.) So, what is for good for? The first use is to go through all the elements of a list and do something with each of them. Here is a quick way to add up all the elements:

list = [2,4,6,8]
sum = 0
for num in list:
    sum = sum + num
print ("The sum is: %d" % sum)

with the output simply being:

The sum is: 20

Or you could write a program to find out if there are any duplicates in a list like this program does:

list = [4, 5, 7, 8, 9, 1,0,7,10]
list.sort()
prev = list[0]
del list[0]
for item in list:
    if prev == item:
        print ("Duplicate of ",prev," Found")
    prev = item

and for good measure:

Duplicate of  7  Found

How does it work? Here is a special debugging version:

l = [4, 5, 7, 8, 9, 1,0,7,10]
print ("l = [4, 5, 7, 8, 9, 1,0,7,10]","\tl:",l)
l.sort()
print ("l.sort()","\tl:",l)
prev = l[0]
print ("prev = l[0]","\tprev:",prev)
del l[0]
print ("del l[0]","\tl:",l)
for item in l:
    if prev == item:
        print ("Duplicate of ",prev," Found")
    print ("if prev == item:","\tprev:",prev,"\titem:",item)
    prev = item
    print ("prev = item","\t\tprev:",prev,"\titem:",item)

with the output being:

l = [4, 5, 7, 8, 9, 1,0,7,10]   l: [4, 5, 7, 8, 9, 1, 0, 7, 10]
l.sort()        l: [0, 1, 4, 5, 7, 7, 8, 9, 10]
prev = l[0]     prev: 0
del l[0]        l: [1, 4, 5, 7, 7, 8, 9, 10]
if prev == item:        prev: 0         item: 1
prev = item             prev: 1         item: 1
if prev == item:        prev: 1         item: 4
prev = item             prev: 4         item: 4
if prev == item:        prev: 4         item: 5
prev = item             prev: 5         item: 5
if prev == item:        prev: 5         item: 7
prev = item             prev: 7         item: 7
if prev == item:        prev: 7         item: 7
Duplicate of  7  Found
prev = item             prev: 7         item: 7
if prev == item:        prev: 7         item: 8
prev = item             prev: 8         item: 8
if prev == item:        prev: 8         item: 9
prev = item             prev: 9         item: 9
if prev == item:        prev: 9         item: 10
prev = item             prev: 10        item: 10

Note: The reason there are so many print statements is that they show the value of each variable at different times and help debug the program. First the program starts with the original list. Next the program sorts the list, so that any duplicates are put next to each other. The program then initializes a prev(ious) variable. Next the first element of the list is deleted so that the first item is not incorrectly thought to be a duplicate. Then the for loop is entered. Each item of the list is checked to see if it is the same as the previous one. If it is, a duplicate was found. The value of prev is then changed so that the next time through the for loop, prev is the item before the current one. Sure enough, the 7 is found to be a duplicate.
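For comparison, duplicates can also be found without sorting by remembering which items have already been seen. Note that this is a different technique from the sort-and-compare program above: it uses a set and leaves the original list untouched. A sketch assuming Python 3, with find_duplicates as an invented name:

```python
def find_duplicates(items):
    """Return each value that occurs more than once, in order of discovery."""
    seen = set()
    duplicates = []
    for item in items:
        # An item already in seen has appeared earlier in the list.
        if item in seen and item not in duplicates:
            duplicates.append(item)
        seen.add(item)
    return duplicates

print(find_duplicates([4, 5, 7, 8, 9, 1, 0, 7, 10]))  # [7]
```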

The other way to use for loops is to do something a certain number of times. Here is some code to print out the first 9 numbers of the Fibonacci series:

a = 1
b = 1
for c in range(1,10):
    print (a)
    n = a + b
    a = b
    b = n
print ("")

with the surprising output:

1
1
2
3
5
8
13
21
34

Everything that can be done with for loops can also be done with while loops but for loops give an easy way to go through all the elements in a list or to do something a certain number of times.
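
As a rough sketch of that equivalence (Python 3 syntax), the Fibonacci loop above could be written with while instead. The function name first_fibonacci and the list it returns are made up for this illustration; the point is only that the counter must now be managed by hand:

```python
def first_fibonacci(count):
    """Return the first `count` Fibonacci numbers, using a while loop."""
    numbers = []
    a, b = 1, 1
    c = 0                 # the counter that the for loop managed for us
    while c < count:
        numbers.append(a)
        a, b = b, a + b   # same update as n = a + b; a = b; b = n
        c += 1            # forgetting this line would loop forever
    return numbers

print(first_fibonacci(9))   # [1, 1, 2, 3, 5, 8, 13, 21, 34]
```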


range Versus xrange

Above, you were introduced to the range function, which returns a list of all the integers in a specified range. Supposing you were to write an expression like range(0, 1000000): that would construct a list consisting of a million integers! That can take up a lot of memory.

Often, you do indeed need to process all the numbers over a very wide range. But you might only need to do so one at a time; as each number is processed, it can be discarded from memory before the next one is obtained.

To do this, you can use the xrange function instead of range. For example, the following simple loop

for i in xrange(0, 1000000) :
    print(i)
#end for

will print the million integers from 0 to 999999, but it will get them one at a time from the xrange call, instead of getting them all at once as a single list and going through that.

This is an example of an iterator, which yields values one at a time as they are needed, rather than all at once. As you learn more about Python, you will see a lot more examples of iterators in use, and you will learn how to write your own.

Python 3 Note: In Python 3.x, there is no function named xrange. Instead, the range function has the behaviour described above for xrange.
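
A small Python 3 sketch of what this means in practice. The size comparison relies on sys.getsizeof as implemented in CPython, so treat the exact behaviour as an assumption about that implementation:

```python
import sys

big = range(0, 1000000)      # no million-element list is built here
print(len(big))              # 1000000: a range still knows its length
print(big[999999])           # 999999: and it supports indexing

small_list = list(range(0, 1000))
# The range object stays small no matter how many numbers it covers,
# so it takes less memory than even a 1000-element list.
print(sys.getsizeof(big) < sys.getsizeof(small_list))   # True
```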

 

The break Statement

A while-loop checks its termination condition before each entry into the loop body, and terminates if the condition has gone False. Thus, the loop body will normally iterate zero, one or more complete times.

A for-loop iterates its body once for each value returned from the iterator expression. Again, each iteration is normally of the complete loop body.

Sometimes you want to conditionally stop the loop in the middle of the loop body. An example situation might look like this:

  1. Obtain the next value to check
  2. Is there in fact another value to check? If not, exit the loop with failure.
  3. Is the value what I’m looking for? If so, exit the loop with success.
  4. Otherwise, go back to step 1.

As you can see, there are two possible exits from this loop. You can exit from the middle of a Python while- or for-loop with the break-statement. Here is one way to write such a loop:

found = False # initial assumption
for value in values_to_check():
    if is_what_im_looking_for(value):
        found = True
        break
    #end if
#end for
# ... found is True on success, False on failure

The trouble with this is the asymmetry between the two ways out of the loop: one through normal for-loop termination, the other through the break. As a stylistic matter, it would be more consistent to follow this principle:

If one exit from a loop is represented by a break, then all exits from the loop should be represented by breaks.

In particular, this means that the loop construct itself becomes a “loop-forever” loop; the only way out of it is via a break statement.

We can do this by explicitly dealing with an iterator yielding the values to be checked. Perhaps the values_to_check() call above already yields an iterator; if not, it can be converted to one by wrapping it in a call to the iter() built-in function, and then using the next() built-in function to obtain successive values from this iterator. Then the loop becomes:

values = iter(values_to_check())
while True:
    value = next(values, None)
    if value is None:
        found = False
        break
    #end if
    if is_what_im_looking_for(value):
        found = True
        break
    #end if
#end while
# ... found is True on success, False on failure

This uses the special Python value None to indicate that the iterator has reached its end. This is a common Python convention. But if you need to use None as a valid item in your sequence of values, then it is easy enough to define some other unique value (e.g. a custom dummy class) to represent the end of the list.
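
One possible way to define such a unique end marker is a bare object() instance compared with "is". This is only a sketch in Python 3 syntax; the names _END and contains are hypothetical and do not come from the text above:

```python
_END = object()   # hypothetical sentinel: no other value is identical to it

def contains(values, predicate):
    """Return True if some item of values satisfies predicate."""
    iterator = iter(values)
    while True:
        value = next(iterator, _END)
        if value is _END:        # iterator exhausted: exit with failure
            found = False
            break
        if predicate(value):     # exit with success
            found = True
            break
    return found

# None is an ordinary item here and no longer ends the loop early.
print(contains([3, None, 8], lambda v: v is None))   # True
print(contains([3, 5, 8], lambda v: v is None))      # False
```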

Exercises

  1. Create a program to count by prime numbers. Ask the user to input a number, then print each prime number up to that number.
  2. Instruct the user to pick an arbitrary number from 1 to 100 and proceed to guess it correctly within seven tries. After each guess, the user must tell whether their number is higher than, lower than, or equal to your guess.

Solutions



Functions


Function Calls

A callable object is an object that can accept some arguments (also called parameters) and possibly return an object (often a tuple containing multiple objects).

A function is the simplest callable object in Python, but there are others, such as classes or certain class instances.

Defining Functions

A function is defined in Python by the following format:

def functionname(arg1, arg2, ...):
    statement1
    statement2
    ...

For example:

>>> def functionname(arg1, arg2):
...     return arg1+arg2
...
>>> t = functionname(24,24) # Result: 48

If a function takes no arguments, it must still include the parentheses, but without anything in them:

def functionname():
    statement1
    statement2
    ...

The arguments in the function definition bind the arguments passed at function invocation (i.e. when the function is called), which are called actual parameters, to the names given when the function is defined, which are called formal parameters. The interior of the function has no knowledge of the names given to the actual parameters; the names of the actual parameters may not even be accessible (they could be inside another function).

A function can 'return' a value, for example:

def square(x):
    return x*x

A function can define variables within the function body, which are considered 'local' to the function. The locals together with the arguments comprise all the variables within the scope of the function. Any names within the function are unbound when the function returns or reaches the end of the function body.

You can return multiple values as follows:

def first2items(list1):
  return list1[0], list1[1]
a, b = first2items(["Hello", "world", "hi", "universe"])
print a + " " + b

Keywords: returning multiple values, multiple return values.

Declaring Arguments

When a function needs some values to work with, we pass those values as function arguments when calling it. For example:

>>> def find_max(a, b):
...     if a > b:
...         print a, "is greater than", b
...     else:
...         print b, "is greater than", a
...
>>> find_max(30, 45)  # Here 30 and 45 are the arguments passed to find_max
45 is greater than 30

Default Argument Values

If any of the formal parameters in the function definition are declared with the format "arg = value," then you will have the option of not specifying a value for those arguments when calling the function. If you do not specify a value, then that parameter will have the default value given when the function executes.

>>> def display_message(message, truncate_after=4):
...     print message[:truncate_after]
...
>>> display_message("message")
mess
>>> display_message("message", 6)
messag

Variable-Length Argument Lists

Python allows you to declare two special arguments which allow you to create arbitrary-length argument lists. This means that each time you call the function, you can specify any number of arguments above a certain number.

def function(first,second,*remaining):
    statement1
    statement2
    ...

When calling the above function, you must provide a value for each of the first two arguments. However, since the third parameter is marked with an asterisk, any actual parameters after the first two will be packed into a tuple and bound to "remaining."

>>> def print_tail(first,*tail):
...     print tail
...
>>> print_tail(1, 5, 2, "omega")
(5, 2, 'omega')

If we declare a formal parameter prefixed with two asterisks, then it will be bound to a dictionary containing any keyword arguments in the actual parameters which do not correspond to any formal parameters. For example, consider the function:

def make_dictionary(max_length=10, **entries):
    return dict([(key, entries[key]) for i, key in enumerate(entries.keys()) if i < max_length])

If we call this function with any keyword arguments other than max_length, they will be placed in the dictionary "entries." If we include the keyword argument of max_length, it will be bound to the formal parameter max_length, as usual.

>>> make_dictionary(max_length=2, key1=5, key2=7, key3=9)
{'key3': 9, 'key2': 7}
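
Both forms can be combined in one definition. The following Python 3 sketch uses a made-up function, describe_call, that simply hands back what it received. Note that the order of the dictionary shown above is arbitrary in Python 2, whereas Python 3.7 and later keep keyword arguments in the order they were passed:

```python
def describe_call(first, *remaining, **options):
    """Return the pieces of the call so they can be inspected."""
    return first, remaining, options

first, remaining, options = describe_call(1, 5, 2, "omega", key1=5, key2=7)
print(first)       # 1
print(remaining)   # (5, 2, 'omega')  extra positional arguments, as a tuple
print(options)     # {'key1': 5, 'key2': 7}  extra keyword arguments, as a dict
```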

By Value and by Reference

Objects passed as arguments to functions are passed by reference; they are not being copied around. Thus, passing a large list as an argument does not involve copying all its members to a new location in memory. Note that even integers are objects. However, the distinction of by value and by reference present in some other programming languages often serves to distinguish whether the passed arguments can be actually changed by the called function and whether the calling function can see the changes.

Passed objects of mutable types such as lists and dictionaries can be changed by the called function and the changes are visible to the calling function. Passed objects of immutable types such as integers and strings cannot be changed by the called function; the calling function can be certain that the called function will not change them. For mutability, see also Data Types chapter.

An example:

def appendItem(ilist, item):
  ilist.append(item) # Modifies ilist in a way visible to the caller

def replaceItems(ilist, newcontentlist):
  del ilist[:]                 # Modification visible to the caller
  ilist.extend(newcontentlist) # Modification visible to the caller
  ilist = [5, 6] # No outside effect; lets the local ilist point to a new list object,
                 # losing the reference to the list object passed as an argument

def clearSet(iset):
  iset.clear()

def tryToTouchAnInteger(iint):
  iint += 1 # No outside effect; lets the local iint point to a new int object,
            # losing the reference to the int object passed as an argument
  print "iint inside:",iint # 4 if iint was 3 on function entry 

list1 = [1, 2]
appendItem(list1, 3)
print list1 # [1, 2, 3]
replaceItems(list1, [3, 4])
print list1 # [3, 4]
set1 = set([1, 2])
clearSet(set1)
print set1 # set([])
int1 = 3
tryToTouchAnInteger(int1)
print int1 # 3

Preventing Argument Change

Python has no way to declare an argument constant, that is, protected from change by the called function. If an argument is of an immutable type, it cannot be changed anyway, but if it is of a mutable type such as list, the calling function is at the mercy of the called function. Thus, if the calling function wants to make sure a passed list does not get changed, it has to pass a copy of the list.

An example:

def evilGetLength(ilist):
  length = len(ilist)
  del ilist[:] # Muhaha: clear the list
  return length

list1 = [1, 2]
print evilGetLength(list1[:]) # Pass a copy of list1; prints 2
print list1                   # [1, 2]: the original is untouched
list1 = [1, 2]
print evilGetLength(list1)    # list1 gets cleared; prints 2
print list1                   # []
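
One caveat worth sketching: a slice copy such as list1[:] is shallow, so it protects only the outer list. The Python 3 example below is an illustration under that assumption; evil_get_length is a hypothetical variant of the function above, and copy.deepcopy from the standard library is one way to protect nested lists as well:

```python
import copy

def evil_get_length(rows):
    """Return len(rows), but also tamper with the argument."""
    length = len(rows)
    rows[0].append("tampered")   # mutates an inner list
    del rows[:]                  # clears the outer list
    return length

table = [[1, 2], [3, 4]]
evil_get_length(table[:])               # shallow copy: the outer list survives...
print(table)                            # [[1, 2, 'tampered'], [3, 4]]

table = [[1, 2], [3, 4]]
evil_get_length(copy.deepcopy(table))   # deep copy: nothing leaks back
print(table)                            # [[1, 2], [3, 4]]
```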

Calling Functions

A function can be called by appending the arguments in parentheses to the function name, or an empty matched set of parentheses if the function takes no arguments.

foo()
square(3)
bar(5, x)

A function's return value can be used by assigning it to a variable, like so:

x = foo()
y = bar(5,x)

When calling a function, you can specify the parameters by name, and named parameters can be given in any order:

def display_message(message, start=0, end=4):
   print message[start:end]

display_message("message", end=3)

The call above is valid, and start will have the default value of 0. One restriction applies: after the first named argument, all arguments that follow must also be named. The following is not valid

display_message(end=5, start=1, "my message")

because the third argument ("my message") is an unnamed argument.
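
The rule can be checked without crashing a script by compiling the offending call from a string. This is a sketch in Python 3 syntax; display_message returns its slice here (instead of printing it) purely so that the result is easy to inspect:

```python
def display_message(message, start=0, end=4):
    return message[start:end]

print(display_message("message", end=3))                       # mes
# When every argument is named, any order works.
print(display_message(end=5, start=1, message="my message"))   # y me

# A positional argument after a keyword argument is rejected when the
# code is compiled, before anything is executed.
bad_call = 'display_message(end=5, start=1, "my message")'
try:
    compile(bad_call, "<example>", "eval")
except SyntaxError as error:
    print("SyntaxError:", error.msg)
```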

Nested functions

Nested functions are functions defined within other functions. Arbitrary level of nesting is possible.

Nested functions can read variables declared in the immediately enclosing function. If such a variable refers to a mutable object, a nested function can even modify that object through its methods. Assigning to the variable inside the nested function, however, creates a new local variable instead; an attempt such as x += 1 on an integer from the enclosing function raises UnboundLocalError. In Python 3, a variable of the immediately enclosing function can be declared nonlocal in the nested function, in analogy to global. Once this is done, the nested function can assign a new value to that variable and the modification is seen outside of the nested function.

Nested functions can be used in closures, on which see below. Furthermore, they can be used to reduce repetition of code that pertains only to a single function, often with a reduced argument list because they see the variables of the enclosing function.

An example of a nested function that modifies an immediately outside variable that is a list and therefore mutable:

def outside():
  outsideList = [1, 2]
  def nested():
    outsideList.append(3)
  nested()
  print outsideList

An example in which the outside variable is first accessed below the nested function definition and it still works:

def outside():
  def nested():
    outsideList.append(3)
  outsideList = [1, 2]
  nested()
  print outsideList
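
For contrast, here is a sketch (Python 3 syntax) of what happens when the nested function assigns to an integer of the enclosing function, with and without a nonlocal declaration. The function names are invented for the illustration:

```python
def outside_rebind_fails():
    counter = 0
    def nested():
        counter += 1          # the assignment makes counter local to nested()
    try:
        nested()
    except UnboundLocalError:
        return "UnboundLocalError"
    return counter

def outside_nonlocal():
    counter = 0
    def nested():
        nonlocal counter      # Python 3 only
        counter += 1          # now rebinds the enclosing function's counter
    nested()
    return counter

print(outside_rebind_fails())   # UnboundLocalError
print(outside_nonlocal())       # 1
```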

Keywords: inner functions, internal functions, local functions.

Closures

A closure is a nested function with an after-return access to the data of the outer function, where the nested function is returned by the outer function as a function object. Thus, even when the outer function has finished its execution after being called, the closure function returned by it can refer to the values of the variables that the outer function had when it defined the closure function.

An example:

def adder(outer_argument): # outer function
  def adder_inner(inner_argument): # inner function, nested function
    return outer_argument + inner_argument # Notice outer_argument
  return adder_inner
add5 = adder(5) # a function that adds 5 to its argument
add7 = adder(7) # a function that adds 7 to its argument
print add5(3) # prints 8
print add7(3) # prints 10

Closures are possible in Python because functions are first-class objects. A function is merely an object of type function. Being an object means it is possible to pass a function object (an uncalled function) around as argument or as return value or to assign another name to the function object. A unique feature that makes closure useful is that the enclosed function may use the names defined in the parent function's scope.
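
The "function as object" idea can be made concrete with a short Python 3 sketch. The helper apply_twice is hypothetical, and the __closure__ attribute is shown only to peek at the captured value; ordinary code rarely needs it:

```python
def adder(outer_argument):
    def adder_inner(inner_argument):
        return outer_argument + inner_argument
    return adder_inner

def apply_twice(function, value):
    """Call the function object that was passed in, twice."""
    return function(function(value))

add5 = adder(5)
plus_five = add5                            # a second name for the same function object
print(type(add5).__name__)                  # function
print(apply_twice(plus_five, 1))            # 11
print(add5.__closure__[0].cell_contents)    # 5, the captured outer_argument
```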

Lambda Expressions

A lambda is an anonymous (unnamed) function. It is used primarily to write very short functions that are a hassle to define in the normal way. A function like this:

>>> def add(a, b):
...    return a + b
...
>>> add(4, 3)
7

may also be defined using lambda

>>> print (lambda a, b: a + b)(4, 3)
7

Lambda is often used as an argument to other functions that expect a function object, such as the key argument of sorted().

>>> sorted([[3, 4], [3, 5], [1, 2], [7, 3]], key=lambda x: x[1])
[[1, 2], [7, 3], [3, 4], [3, 5]]

The lambda form is often useful as a closure, such as illustrated in the following example:

>>> def attribution(name):
...    return lambda x: x + ' -- ' + name
...
>>> pp = attribution('John')
>>> pp('Dinner is in the fridge')
'Dinner is in the fridge -- John'

Note that the lambda function can use the values of variables from the scope in which it was created (like name). This is the essence of closure.

Generator Functions

When discussing loops, you came across the concept of an iterator. An iterator yields in turn each element of some sequence, rather than the entire sequence at once, allowing you to deal with sequences much larger than might be able to fit in memory at once.

You can create your own iterators, by defining what is known as a generator function. To illustrate the usefulness of this, let us start by considering a simple function to return the concatenation of two lists:

def concat(a, b):
    return a + b

print concat([5, 4, 3], ["a", "b", "c"])
# prints [5, 4, 3, 'a', 'b', 'c']

Imagine wanting to do something like concat(range(0, 1000000), range(1000000, 2000000))

That would work, but it would consume a lot of memory.

Consider an alternative definition, which takes two iterators as arguments:

def concat(a, b):
    for i in a:
        yield i
    for i in b:
        yield i

Notice the use of the yield statement, instead of return. We can now use this something like

for i in concat(xrange(0, 1000000), xrange(1000000, 2000000)):
    print i

and print out an awful lot of numbers, without using a lot of memory at all.

Note: You can still pass a list or other sequence type wherever Python expects an iterator (like to an argument of your concat function); this will still work, and makes it easy not to have to worry about the difference where you don’t need to.
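
To see the laziness without printing two million lines, the following Python 3 sketch takes only a few values from the generator. It uses range (the Python 3 counterpart of xrange) and itertools.islice from the standard library:

```python
from itertools import islice

def concat(a, b):
    for i in a:
        yield i
    for i in b:
        yield i

numbers = concat(range(0, 1000000), range(1000000, 2000000))
print(list(islice(numbers, 3)))      # [0, 1, 2]: only three values were produced
print(next(numbers))                 # 3: the generator remembers where it stopped
print(list(concat([5, 4], "ab")))    # [5, 4, 'a', 'b']: any iterable is accepted
```

The standard library's itertools.chain offers the same behaviour ready-made, so a hand-written concat is mainly useful as a learning exercise.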




Scoping


Variables

Variables in Python are automatically declared by assignment. Variables are always references to objects, and are never typed. Variables exist only in the current scope or global scope. When they go out of scope, the variables are destroyed, but the objects to which they refer are not (unless the number of references to the object drops to zero).

Scope is delineated by function and class blocks. Both functions and their scopes can be nested. For example:

def foo():
    def bar():
        x = 5 # x is now in scope
        return x + y # y is defined in the enclosing scope later
    y = 10
    return bar() # now that y is defined, bar's scope includes y

Now when this code is tested,

>>> foo()
15
>>> bar()
Traceback (most recent call last):
  File "<pyshell#26>", line 1, in -toplevel-
    bar()
NameError: name 'bar' is not defined

The name 'bar' is not found because a higher scope does not have access to the names lower in the hierarchy.

A common pitfall is to look up an attribute (such as a method) of an object (such as a container) through a variable before any object has been assigned to that variable. In its most common form:

>>> for x in range(10):
         y.append(x) # append is an attribute of lists

Traceback (most recent call last):
  File "<pyshell#46>", line 2, in -toplevel-
    y.append(x)
NameError: name 'y' is not defined

Here, to correct this problem, one must add y = [] before the for loop.

A loop does not create its own scope:

for x in [1, 2, 3]:
  inner = x
print inner # 3 rather than an error

Keyword global

Global variables of a Python module are read-accessible from functions in that module. In fact, if they are mutable, they can also be modified via a method call. However, they cannot be modified via a plain assignment unless declared global in the function.

An example to clarify:

count1 = 1
count2 = 1
list1 = []
list2 = []

def test1():
  print count1    # Read access is unproblematic, referring to the global

def test2():
  try:
    print count1  # This print would be unproblematic, but it throws an error ...
    count1 += 1   # ... since count1 += 1 causes count1 to be local.
  except UnboundLocalError as error:
    print "Error caught:", error

def test3():
  list1 = [2]     # No outside effect; this rebinds list1 to be a local variable

def test4():
  global count2, list2
  print count1    # Read access is unproblematic, referring to the global
  count2 += 1     # We can modify count2 via assignment
  list1.append(1) # Impacts the global list1 even without global declaration
  list2 = [2]     # Impacts the global list2

test1()
test2()
test3()
test4()

print "count1:", count1  # 1
print "count2:", count2  # 2
print "list1:", list1    # [1]
print "list2:", list2    # [2]

Keyword nonlocal

Keyword nonlocal, available since Python 3.0, is an analogue of global for nested scopes. It enables a nested function to assign to (rebind) a variable that is local to the outer function.

An example:

# Requires Python 3
def outer():
  outerint = 0
  outerint2 = 10
  def inner():
    nonlocal outerint
    outerint = 1 # Impacts outer's outerint only because of the nonlocal declaration
    outerint2 = 1 # No impact
  inner()
  print(outerint)
  print(outerint2)

outer()

Simulation of nonlocal in Python 2 via a mutable object:

def outer():
  outerint = [1]           # Technique 1: Store int in a list
  class outerNL: pass      # Technique 2: Store int in a class
  outerNL.outerint2 = 11
  def inner():
    outerint[0] = 2        # List members can be modified
    outerNL.outerint2 = 12 # Class members can be modified
  inner()
  print outerint[0]
  print outerNL.outerint2

outer()

globals and locals

To find out which variables exist in the global and local scopes, you can use locals() and globals() functions, which return dictionaries:

int1 = 1
def test1():
  int1 = 2
  globals()["int1"] = 3  # Write access seems possible
  print locals()["int1"] # 2
  
test1()

print int1               # 3

Write access to locals() dictionary is discouraged by the Python documentation.
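
A short sketch of why such writes are discouraged (Python 3 syntax; the behaviour shown is what CPython does inside a function, and other implementations are not guaranteed to match). The function name try_to_write is made up for the example:

```python
def try_to_write():
    value = 1
    snapshot = locals()          # a dictionary describing the local variables
    snapshot["value"] = 2        # changes the dictionary, not the variable
    return value, snapshot["value"]

print(try_to_write())   # (1, 2): the real local variable is still 1
```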




Input and output


Input

Note on Python version: The following uses the syntax of Python 2.x. Some of the following is not going to work with Python 3.x.

Python has two functions designed for accepting data directly from the user:

  • input()
  • raw_input()

There are also very simple ways of reading a file and, for stricter control over input, reading from stdin if necessary.

raw_input()

raw_input() asks the user for a string of data (ended with a newline), and simply returns the string. It can also take an argument, which is displayed as a prompt before the user enters the data. E.g.

print raw_input('What is your name? ')

prints out

What is your name? <user input data here>

Example: in order to assign the user's name, i.e. string data, to a variable "x" you would type

x = raw_input('What is your name?')

Once the user inputs a name, e.g. Simon, you can refer to it as x

print 'Your name is ' + x

prints out

Your name is Simon

Note: In Python 3.x, "...raw_input() was renamed to input(). That is, the new input() function reads a line from sys.stdin and returns it with the trailing newline stripped. It raises EOFError if the input is terminated prematurely. To get the old behavior of input(), use eval(input())."

input()

input() uses raw_input to read a string of data, and then attempts to evaluate it as if it were a Python program, and then returns the value that results. So entering

[1,2,3]

would return a list containing those numbers, just as if it were assigned directly in the Python script.

More complicated expressions are possible. For example, if a script says:

x = input('What are the first 10 perfect squares? ')

it is possible for a user to input:

map(lambda x: x*x, range(10))

which yields the correct answer in list form. Note that no inputted statement can span more than one line.

input() should not be used for anything but the most trivial program. Turning the strings returned from raw_input() into python types using an idiom such as:

x = None
while x is None:   # "while not x" would also reject a valid answer of 0
    try:
        x = int(raw_input())
    except ValueError:
        print 'Invalid Number'

is preferable, as input() uses eval() to turn a literal into a python type. This will allow a malicious person to run arbitrary code from inside your program trivially.
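
The same idiom in Python 3 syntax might look like the sketch below. The read_line parameter is a hypothetical hook, not part of any library; it defaults to the built-in input() and exists only so the loop can be exercised without a keyboard:

```python
def read_int(read_line=input):
    """Keep asking until the text that was read parses as an int."""
    while True:
        try:
            return int(read_line())   # int() never evaluates code, unlike eval()
        except ValueError:
            print("Invalid Number")

# Simulate a user who first types "abc" and then "0".
answers = iter(["abc", "0"])
print(read_int(lambda: next(answers)))   # prints "Invalid Number", then 0
```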

File Input

File Objects

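Python 2 includes a built-in file type. Files can be opened by using the file type's constructor (Python 3 has no file constructor; use the built-in open() function, which also works in Python 2):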

f = file('test.txt', 'r')

This means f is open for reading. The first argument is the filename and the second is the mode, which can be 'r' (read), 'w' (write), 'a' (append), or 'r+' (read and write), among some others.

The most common way to read from a file is simply to iterate over the lines of the file:

f = open('test.txt', 'r')
for line in f:
    print line[0]
f.close()

This will print the first character of each line. Note that a newline is attached to the end of each line read this way.

The newer and better way to read from a file:

with open("test.txt", "r") as txt:
    for line in txt:
        print line

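The advantage is that the file is closed automatically when the with block is left, even if an error occurs while reading.

In CPython, a file is also closed when the file object is garbage-collected, which usually happens as soon as it goes out of scope, so simple scripts often get away without closing files explicitly. Other Python implementations may close the file later, so this should not be relied upon. With that caveat, the loop in the previous code can also be written as: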

for line in open('test.txt', 'r'):
    print line[0]

You can read limited numbers of characters at a time like this:

c = f.read(1)
while len(c) > 0:
    if len(c.strip()) > 0: print c,
    c = f.read(1)

This will read the characters from f one at a time, and then print them if they're not whitespace.

A file object implicitly contains a marker to represent the current position. If the file marker should be moved back to the beginning, one can either close the file object and reopen it or just move the marker back to the beginning with:

f.seek(0)
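
The pieces above (iteration, the position marker, seek, and automatic closing) can be tried together in one self-contained Python 3 sketch. It writes its own throwaway file in a temporary directory, so the path and contents are assumptions of the example:

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(path, "w") as out:
    out.write("alpha\nbeta\n")

with open(path, "r") as f:
    first_pass = [line.rstrip("\n") for line in f]   # strip the attached newline
    at_end = f.read()        # '' because the marker is at the end of the file
    f.seek(0)                # move the marker back to the beginning
    second_pass = f.read()

print(first_pass)            # ['alpha', 'beta']
print(repr(at_end))          # ''
print(repr(second_pass))     # 'alpha\nbeta\n'
print(f.closed)              # True: leaving the with block closed the file
```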

Standard File Objects

Like many other languages, Python has built-in file objects representing standard input, output, and error. These are in the sys module and are called stdin, stdout, and stderr. The original objects are also kept in sys.__stdin__, sys.__stdout__, and sys.__stderr__, which is useful in IDLE and other tools that replace the standard files.

You must import the sys module to use the special stdin, stdout, stderr I/O handles.

import sys

For finer control over input, use sys.stdin.read(). In order to implement the UNIX 'cat' program in Python, you could do something like this:

import sys
for line in sys.stdin:
    print line,

Note that sys.stdin.read() will read from standard input until EOF (which is usually signalled with Ctrl+D on Unix-like systems).

Parsing command line

Command-line arguments passed to a Python program are stored in the sys.argv list. The first item in the list is the name of the Python program, which may or may not contain the full path depending on the manner of invocation. The sys.argv list is modifiable.

Printing all passed arguments except for the program name itself:

import sys
for arg in sys.argv[1:]:
  print arg

Parsing passed arguments for passed minus options:

import sys
option_f = False
option_p = False
option_p_argument = ""
i = 1
while i < len(sys.argv):
  if sys.argv[i] == "-f":
    option_f = True
    sys.argv.pop(i)
  elif sys.argv[i] == "-p":
    option_p = True
    sys.argv.pop(i)
    option_p_argument = sys.argv.pop(i)
  else:
    i += 1

Above, the arguments at which options are found are removed so that sys.argv can be looped for all remaining arguments.

Parsing of command-line arguments is further supported by library modules optparse (deprecated), argparse (since Python 2.7) and getopt (to make life easy for C programmers).
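
As a sketch of how the hand-written loop above might look with argparse (Python 3 syntax): the option names mirror the earlier example, while the dest names and the "rest" positional are choices made for this illustration. An explicit list is passed to parse_args so the example does not depend on how the script was started:

```python
import argparse

parser = argparse.ArgumentParser(description="Option parsing sketch")
parser.add_argument("-f", action="store_true", dest="option_f")   # a flag
parser.add_argument("-p", dest="option_p_argument", default="")   # takes a value
parser.add_argument("rest", nargs="*")                            # everything else

# With no argument, parse_args() would read sys.argv[1:] instead.
args = parser.parse_args(["-f", "-p", "value", "one", "two"])
print(args.option_f)            # True
print(args.option_p_argument)   # value
print(args.rest)                # ['one', 'two']
```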

Output

Note on Python version: The following uses the syntax of Python 2.x. Much of the following is not going to work with Python 3.x. In particular, Python 3.x requires round brackets around arguments to "print".

The basic way to do output is the print statement.

print 'Hello, world'

To print multiple things on the same line separated by spaces, use commas between them, like this:

print 'Hello,', 'World'

This will print out the following:

Hello, World

While neither string contained a space, a space was added by the print statement because of the comma between the two objects. Arbitrary data types can be printed this way:

print 1,2,0xff,0777,(10+5j),-0.999,map,sys

This will output the following:

1 2 255 511 (10+5j) -0.999 <built-in function map> <module 'sys' (built-in)>

Output from several print statements can be placed on the same line if each print statement ends with a comma:

for i in range(10):
    print i,

This will output the following:

0 1 2 3 4 5 6 7 8 9

To end the printed line with a newline, add a print statement without any objects.

for i in range(10):
    print i,
print
for i in range(10,20):
    print i,

This will output the following:

0 1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17 18 19

If the bare print statement were not present, the above output would look like:

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19

You can use similar syntax when writing to a file instead of to standard output, like this:

print >> f, 'Hello, world'

This will print to any object that implements write(), which includes file objects.

Omitting newlines

To avoid adding spaces and newlines between objects' output with subsequent print statements, you can do one of the following:

Concatenation: Concatenate the string representations of each object, then later print the whole thing at once.

print str(1)+str(2)+str(0xff)+str(0777)+str(10+5j)+str(-0.999)+str(map)+str(sys)

This will output the following:

12255511(10+5j)-0.999<built-in function map><module 'sys' (built-in)>

Write function: You can make a shorthand for sys.stdout.write and use that for output.

import sys
write = sys.stdout.write
write('20')
write('05\n')

This will output the following:

2005

You may need sys.stdout.flush() to get that text on the screen quickly.

Examples

Examples of output with Python 2.x:

  • print "Hello"
  • print "Hello", "world"
    • Separates the two words with a space.
  • print "Hello", 34
    • Prints elements of various data types, separating them by a space.
  • print "Hello " + 34
    • Throws an error as a result of trying to concatenate a string and an integer.
  • print "Hello " + str(34)
    • Uses "+" to concatenate strings, after converting a number to a string.
  • print "Hello",
    • Prints "Hello " without a newline, with a space at the end.
  • sys.stdout.write("Hello")
    • Prints "Hello" without a newline. Doing "import sys" is a prerequisite. Needs a subsequent "sys.stdout.flush()" in order to display immediately on the user's screen.
  • sys.stdout.write("Hello\n")
    • Prints "Hello" with a newline.
  • print >> sys.stderr, "An error occurred."
    • Prints to standard error stream.
  • sys.stderr.write("Hello\n")
    • Prints to standard error stream.
  • sum=2+2; print "The sum: %i" % sum
    • Prints a string that has been formatted with the use of an integer passed as an argument.
  • formatted_string = "The sum: %i" % (2+2); print formatted_string
    • Like the previous, just that the formatting happens outside of the print statement.
  • print "Float: %6.3f" % 1.23456
    • Outputs "Float:  1.235". The number 3 after the period specifies the number of decimal digits to be displayed after the decimal point (the value is rounded), while 6 before the period specifies the minimum total number of characters the displayed number should take, padded with spaces on the left if needed.
  • print "%s is %i years old" % ("John", 23)
    • Passes two arguments to the formatter.

Examples of output with Python 3.x:

  • from __future__ import print_function
    • Ensures Python 2.6 and later Python 2.x can use Python 3.x print function.
  • print ("Hello", "world")
    • Prints the two words separated with a space. Notice the surrounding brackets, unused in Python 2.x.
  • print ("Hello world", end="")
    • Prints without the ending newline.
  • print ("Hello", "world", sep="-")
    • Prints the two words separated with a dash.

File Output

Printing numbers from 1 to 10 to a file, one per line:

file1 = open("TestFile.txt","w")
for i in range(1,10+1):
  print >>file1, i
file1.close()

With "w", the file is opened for writing. With ">>file1", print sends its output to the file rather than to standard output.

Printing numbers from 1 to 10 to a file, separated with a dash:

file1 = open("TestFile.txt","w")
for i in range(1,10+1):
  if i>1:
    file1.write("-")
  file1.write(str(i))
file1.close()

Opening a file for appending rather than overwriting:

file1 = open("TestFile.txt","a")

See also the Files chapter.

Formatting

Formatting numbers and other values as strings using the string percent operator:

v1 = "Int: %i" % 4               # 4
v2 = "Int zero padded: %03i" % 4 # 004
v3 = "Int space padded: %3i" % 4 #   4
v4 = "Hex: %x" % 31              # 1f
v5 = "Hex 2: %X" % 31            # 1F - capitalized F
v6 = "Oct: %o" % 8               # 10
v7 = "Float: %f" % 2.4           # 2.400000
v8 = "Float: %.2f" % 2.4         # 2.40
v9 = "Float in exp: %e" % 2.4    # 2.400000e+00
vA = "Float in exp: %E" % 2.4    # 2.400000E+00
vB = "List as string: %s" % [1, 2, 3]
vC = "Left padded str: %10s" % "cat"
vD = "Right padded str: %-10s" % "cat"
vE = "Truncated str: %.2s" % "cat"
vF = "Dict value str: %(age)s" % {"age": 20}
vG = "Char: %c" % 65             # A
vH = "Char: %c" % "A"            # A

Formatting numbers and other values as strings using the format() string method, since Python 2.6:

v1 = "Arg 0: {0}".format(31)     # 31
v2 = "Args 0 and 1: {0}, {1}".format(31, 65)
v3 = "Args 0 and 1: {}, {}".format(31, 65)
v4 = "Arg indexed: {0[0]}".format(["e1", "e2"])
v5 = "Arg named: {a}".format(a=31)
v6 = "Hex: {0:x}".format(31)     # 1f
v7 = "Hex: {:x}".format(31)      # 1f - arg 0 is implied
v8 = "Char: {0:c}".format(65)    # A
v9 = "Hex: {:{h}}".format(31, h="x") # 1f - nested evaluation

Formatting numbers and other values as strings using literal string interpolation, since Python 3.6:

int1 = 31; int2 = 41; str1="aaa"; myhex = "x"
v1 = f"Two ints: {int1} {int2}"
v2 = f"Int plus 1: {int1+1}"      # 32 - expression evaluation
v3 = f"Str len: {len(str1)}"      # 3 - expression evaluation
v4 = f"Hex: {int1:x}"             # 1f
v5 = f"Hex: {int1:{myhex}}"       # 1f - nested evaluation
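
The three formatting mechanisms shown above can express the same result. The following sketch, assuming Python 3.6 or later (needed for the f-string), builds one string in each style; the variable names are illustrative:

```python
value = 31
ratio = 2.4

# Percent operator, format() method, and f-string producing the same text
percent_style = "Hex: %x, float: %.2f" % (value, ratio)
format_style = "Hex: {0:x}, float: {1:.2f}".format(value, ratio)
fstring_style = f"Hex: {value:x}, float: {ratio:.2f}"

print(percent_style)
print(format_style)
print(fstring_style)
```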




Files


File I/O

Read entire file:

inputFileText = open("testit.txt", "r").read()
print(inputFileText)

In this case the "r" parameter means the file will be opened in read-only mode.

Read a certain number of bytes from a file:

inputFileText = open("testit.txt", "r").read(123)
print(inputFileText)

Reading starts at the beginning of a newly opened file. For more random access, use seek() to change the current position in the file and tell() to find out the current position. This is illustrated in the following example:

>>> f=open("/proc/cpuinfo","r")
>>> f.tell()
0L
>>> f.read(10)
'processor\t'
>>> f.read(10)
': 0\nvendor'
>>> f.tell()
20L
>>> f.seek(10)
>>> f.tell()
10L
>>> f.read(10)
': 0\nvendor'
>>> f.close()
>>> f
<closed file '/proc/cpuinfo', mode 'r' at 0xb7d79770>

Here a file is opened and ten bytes are read twice; tell() then shows that the current offset is at position 20. Next, seek() is used to go back to position 10 (the position where the second read started), and ten bytes are read and printed again. When no more operations on a file are needed, the close() method is used to close it.
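
The same sequence can be tried on any system with a small temporary file. This is a sketch assuming Python 3; the file is opened in binary mode ("rb") here so that the offsets reported by tell() are plain byte counts, and the file name and contents are made up for the example:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "sample.bin")
with open(path, "wb") as f:
    f.write(b"0123456789abcdefghij")

with open(path, "rb") as f:
    first = f.read(10)        # bytes 0..9
    second = f.read(10)       # bytes 10..19
    offset_after = f.tell()   # position after two reads
    f.seek(10)                # go back to where the second read started
    again = f.read(10)        # the same ten bytes as "second"

print(first, second, offset_after, again)
```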

Read one line at a time:

for line in open("testit.txt", "r"):
    print line

Iterating over the file object yields one line at a time. Alternatively, readlines() returns a list containing the individual lines of the file, and readline() reads a single line and returns it as a string. The example above outputs an additional blank line between the individual lines of the file, because each line read from the file keeps its trailing newline and print adds another one.
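
The different ways of reading lines can be compared side by side. A minimal sketch, assuming Python 3 and a temporary file created only for the demonstration; stripping the trailing newline with rstrip() is one common way to avoid the doubled line breaks:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "testit.txt")
with open(path, "w") as f:
    f.write("first\nsecond\nthird\n")

with open(path) as f:
    one_line = f.readline()    # a single line, newline included
    rest = f.readlines()       # the remaining lines as a list

with open(path) as f:
    # Iterate line by line and drop the trailing newline of each line
    stripped = [line.rstrip("\n") for line in f]

print(repr(one_line), rest, stripped)
```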

Writing to a file requires the second parameter of open() to be "w"; this overwrites the existing contents of the file if it already exists:

outputFileText = "Here's some text to save in a file"
open("testit.txt", "w").write(outputFileText)

Appending to a file requires the second parameter of open() to be "a" (for append):

outputFileText = "Here's some text to add to the existing file."
open("testit.txt", "a").write(outputFileText)

Note that this does not add a line break between the existing file content and the string to be added.

Since Python 2.5, you can use the with keyword to ensure the file handle is released as soon as possible and to make the code exception-safe (in Python 2.5 itself, this requires "from __future__ import with_statement"):

with open("input.txt") as file1:
  data = file1.read()
  # process the data

Or one line at a time:

with open("input.txt") as file1:
  for line in file1:
    print line

Related to the with keyword is the Context Managers chapter.

Testing Files

Determine whether path exists:

import os
os.path.exists('<path string>')

On systems such as Microsoft Windows, the backslash used as the directory separator is also the escape character in Python string literals, so it conflicts with the path string. To get around this, double each backslash:

import os
os.path.exists('C:\\windows\\example\\path')

A better way, however, is to use a raw string, marked with the r prefix:

import os
os.path.exists(r'C:\windows\example\path')

os.path.exists() only confirms whether or not a path exists. Other convenient functions in os.path let you know whether the path is a file, a directory, a mount point or a symlink. There is even a function, os.path.realpath(), which reveals the true destination of a symlink:

>>> import os
>>> os.path.isfile("/")
False
>>> os.path.isfile("/proc/cpuinfo")
True
>>> os.path.isdir("/")
True
>>> os.path.isdir("/proc/cpuinfo")
False
>>> os.path.ismount("/")
True
>>> os.path.islink("/")
False
>>> os.path.islink("/vmlinuz")
True
>>> os.path.realpath("/vmlinuz")
'/boot/vmlinuz-2.6.24-21-generic'
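
The paths above exist only on particular Linux systems. The following sketch, assuming Python 3, runs the same kinds of tests against a temporary directory so that it behaves the same everywhere; the file names are made up for the example:

```python
import os
import tempfile

base = tempfile.mkdtemp()
file_path = os.path.join(base, "example.txt")
with open(file_path, "w") as f:
    f.write("data")

checks = {
    "dir_exists": os.path.exists(base),
    "dir_is_dir": os.path.isdir(base),
    "file_is_file": os.path.isfile(file_path),
    "file_is_dir": os.path.isdir(file_path),
    "missing_exists": os.path.exists(os.path.join(base, "missing.txt")),
}

for name, value in checks.items():
    print(name, value)
```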

Common File Operations

To copy or move a file, use the shutil library.

import shutil
shutil.move("originallocation.txt","newlocation.txt")
shutil.copy("original.txt","copy.txt")

To perform a recursive copy, use copytree(); to perform a recursive remove, use rmtree():

import shutil
shutil.copytree("dir1","dir2")
shutil.rmtree("dir1")

To remove an individual file there exists the remove() function in the os module:

import os
os.remove("file.txt")
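
The operations above can be combined into one self-contained run. This is a sketch assuming Python 3; everything happens inside a temporary directory (an assumption made here so that no real files are touched), and the file names are illustrative:

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()
original = os.path.join(base, "original.txt")
with open(original, "w") as f:
    f.write("content")

copy_path = os.path.join(base, "copy.txt")
moved_path = os.path.join(base, "moved.txt")

shutil.copy(original, copy_path)      # original.txt and copy.txt now exist
shutil.move(copy_path, moved_path)    # copy.txt is renamed to moved.txt
os.remove(original)                   # only moved.txt is left

remaining = sorted(os.listdir(base))

shutil.rmtree(base)                   # remove the directory and its contents
base_exists_after = os.path.exists(base)

print(remaining, base_exists_after)
```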

Finding Files

Files can be found using glob:

import glob
glob.glob('*.txt') # Finds files in the current directory ending in dot txt
glob.glob('*\\*.txt') # Finds files in any of the direct subdirectories
                      # of the current directory ending in dot txt
glob.glob('C:\\Windows\\*.exe')
for fileName in glob.glob('C:\\Windows\\*.exe'):
  print fileName

The content of a directory can be listed using listdir:

import os
filesAndDirectories=os.listdir('.')
for item in filesAndDirectories:
  if os.path.isfile(item) and item.endswith('.txt'):
    print "Text file: " + item
  if os.path.isdir(item):
    print "Directory: " + item

Getting a list of all items in a directory, including the nested ones:

for root, directories, files in os.walk('/user/Joe Hoe'):
  print "Root: " + root                          # e.g. /user/Joe Hoe/Docs
  for dir1 in directories:
    print "Dir.: " + dir1                        # e.g. Fin
    print "Dir. 2: " + os.path.join(root, dir1)  # e.g. /user/Joe Hoe/Docs/Fin
  for file1 in files:
    print "File: " + file1                       # e.g. MyFile.txt
    print "File 2: " + os.path.join(root, file1) # e.g. /user/Joe Hoe/Docs/MyFile.txt

Above, root takes the value of each directory under /user/Joe Hoe, including /user/Joe Hoe itself, while directories and files contain only the items directly present in each root.
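
A typical use of os.walk is to collect all files matching some criterion. A minimal sketch, assuming Python 3; find_by_suffix is a hypothetical helper written for this example, and the small directory tree is created in a temporary location:

```python
import os
import tempfile

# Build a small tree: a.txt, Docs/b.txt, Docs/Fin/c.log
root_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(root_dir, "Docs", "Fin"))
for rel in ("a.txt",
            os.path.join("Docs", "b.txt"),
            os.path.join("Docs", "Fin", "c.log")):
    with open(os.path.join(root_dir, rel), "w") as f:
        f.write("x")

def find_by_suffix(top, suffix):
    """Return paths (relative to top) of all files ending in suffix."""
    matches = []
    for root, directories, files in os.walk(top):
        for name in files:
            if name.endswith(suffix):
                full_path = os.path.join(root, name)
                matches.append(os.path.relpath(full_path, top))
    return sorted(matches)

txt_files = find_by_suffix(root_dir, ".txt")
print(txt_files)
```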

Current Directory

Getting current working directory:

os.getcwd()

Changing current working directory:

os.chdir('C:\\')




Text


To get the length of a string, we use the len() function:

>>> len("Hello Wikibooks!")
16

You can slice strings just like lists and any other sequences:

>>> "Hello Wikibooks!"[0:5]
'Hello'
>>> "Hello Wikibooks!"[5:11]
' Wikib'
>>> "Hello Wikibooks!"[:5] #equivalent of [0:5]
'Hello'

To get the ASCII code of a character, use the ord() function.

>>> ord('h')
104
>>> ord('a')
97
>>> ord('^')
94

To get the character encoded by an ASCII code number, use the chr() function.

>>> chr(104)
'h'
>>> chr(97)
'a'
>>> chr(94)
'^'

To check whether all the characters in a string are alphanumeric, i.e. letters or digits, use the isalnum() method. It returns True if the string contains at least one character and all of its characters are alphanumeric.

To check whether all the characters in a string are alphabetic, use the isalpha() method. It returns True if the string contains at least one character and all of its characters are alphabetic.
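
The behaviour of both methods, including the empty-string case, can be checked with a few sample strings. A short sketch assuming Python 3; the sample values are arbitrary:

```python
samples = ["abc123", "abc", "abc 123", "", "123"]

# Map each sample to the pair (isalnum(), isalpha())
results = {s: (s.isalnum(), s.isalpha()) for s in samples}

for text, (alnum, alpha) in results.items():
    print(repr(text), alnum, alpha)
```

Note that a space is neither a letter nor a digit, so "abc 123" fails both tests.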

Example

stringparser.py

# Add each character, and its ordinal, of user's text input, to two lists
s = input("Enter value: ")  # this line requires Python 3.x, use raw_input() instead of input() in Python 2.x
l1 = [] 
l2 = []
for c in s:   # in Python, a string is just a sequence, so we can iterate over it!
    l1.append(c) 
    l2.append(ord(c))
print(l1)
print(l2)

Or shorter (using list comprehension instead of the for block):

# Add each character, and its ordinal, of user's text input, to two lists
s = input("Enter value: ")  # this line requires Python 3.x, use raw_input() instead of input() in Python 2.x

l1=[c for c in s]   # in Python, a string is just a sequence, so we can iterate over it!
l2=[ord(c) for c in s]

print(l1)
print(l2)


Output:

Enter value: string
['s', 't', 'r', 'i', 'n', 'g']
[115, 116, 114, 105, 110, 103]

Or

Enter value: Hello, Wikibooks!
['H', 'e', 'l', 'l', 'o', ',', ' ', 'W', 'i', 'k', 'i', 'b', 'o', 'o', 'k', 's', '!']
[72, 101, 108, 108, 111, 44, 32, 87, 105, 107, 105, 98, 111, 111, 107, 115, 33]


Exercises

  1. Use Python to determine the difference in ASCII code between lowercase and upper case letters.
  2. Write a program that converts a lowercase letter to an upper case letter using the ASCII code. (Note that there are better ways to do this, but you should do it once using the ASCII code to get a feel for how the language works)



Modules


Modules are a way to structure a program and create reusable libraries. A module is usually stored in and corresponds to a separate .py file. Many modules are available from the standard library. You can create your own modules. Python searches for modules in the current directory and other locations; the list of module search locations can be extended by setting the PYTHONPATH environment variable and by other means.

Importing a Module

To use the functions and classes offered by a module, you have to import the module:

import math
print math.sqrt(10)

The above imports the math standard module, making all of the functions in that module namespaced by the module name. It imports all functions and all classes, if any.

You can import the module under a different name:

import math as Mathematics
print Mathematics.sqrt(10)

You can import a single function, making it available without the module name namespace:

from math import sqrt
print sqrt(10)

You can import a single function and make it available under a different name:

from math import cos as cosine
print cosine(10)

You can import multiple modules in a row:

import os, sys, re

You can make an import as late as in a function definition:

def sqrtTen():
  import math
  print math.sqrt(10)

Such an import only takes place when the function is called.

You can import all functions from the module without the module namespace, using an asterisk notation:

from math import *
print sqrt(10)

However, if you do this inside a function, you get a warning in Python 2 (and an error in Python 3):

def sqrtTen():
  from math import *
  print sqrt(10)

You can guard for a module not found:

try:
  import custommodule
except ImportError:
  pass
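
Instead of silently ignoring the failure, a program can remember that the module is missing and fall back to other behaviour. A minimal sketch, assuming Python 3; custommodule is the hypothetical module name from the text above, and describe is a made-up helper:

```python
try:
    import custommodule          # hypothetical optional module
except ImportError:
    custommodule = None          # remember that it is unavailable

def describe():
    """Report which code path would be taken."""
    if custommodule is None:
        return "custommodule not installed; using fallback"
    return "custommodule available"

print(describe())
```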

Modules can be different kinds of things:

  • Python files
  • Shared Objects (under Unix and Linux) with the .so suffix
  • DLLs (under Windows) with the .pyd suffix
  • Directories

Modules are loaded in the order they're found, which is controlled by sys.path. The current directory is always on the path.

Directories should include a file called __init__.py, which typically imports the other files in the directory.

Creating a DLL that interfaces with Python is covered in another section.

Imported Check

You can check whether a module has been imported as follows:

import sys
if "re" in sys.modules:
  print "Regular expression module is ready for use."

Creating a Module

From a File

The easiest way to create a module is by having a file called mymod.py either in a directory recognized by the PYTHONPATH variable or (even easier) in the same directory where you are working. If you have the following file mymod.py

class Object1:
        def __init__(self):
                self.name = 'object 1'

you can already import this "module" and create instances of the object Object1.

import mymod
myobject = mymod.Object1()
from mymod import *
myobject = Object1()

From a Directory

It is not feasible for larger projects to keep all classes in a single file. It is often easier to store all files in directories and load all files with one command. Each directory needs to have an __init__.py file, which contains Python commands that are executed upon loading the directory.

Suppose we have two more objects called Object2 and Object3 and we want to load all three objects with one command. We then create a directory called mymod and we store three files called Object1.py, Object2.py and Object3.py in it. These files would then contain one object per file, but this is not required (although it adds clarity). We would then write the following __init__.py file:

from Object1 import *
from Object2 import *
from Object3 import *

__all__ = ["Object1", "Object2", "Object3"]

The first three commands tell Python what to do when somebody loads the module. The last statement, defining __all__, tells Python what to do when somebody executes from mymod import *. Usually we want to use parts of a module in other parts of a module, e.g. we want to use Object1 in Object2. We can do this easily with a from . import * command, as the following file Object2.py shows:

from . import *

class Object2:
        def __init__(self):
                self.name = 'object 2'
                self.otherObject = Object1()

We can now start Python and import mymod as we have in the previous section.

Making a program usable as a module

In order to make a program usable both as a standalone program to be called from a command line and as a module, it is advisable that you place all code in functions and methods, designate one function as the main one, and call the main function only when the built-in __name__ variable equals '__main__'. The purpose of doing so is to make sure that the code you have placed in the main function is not called when your program is imported as a module; the code would be called upon import if it were placed outside of functions and methods.

Your program, stored in mymodule.py, can look as follows:

def reusable_function(x, y):
  return x + y

def main():
  pass
  # Any code you like

if __name__ == '__main__':
  main()

Using the above program as a module can look as follows:

from mymodule import reusable_function
my_result = reusable_function(4, 5)

Extending Module Path

When import is requested, modules are searched for in the directories (and zip files) in the module path, accessible via sys.path, a Python list. The module path can be extended as follows:

import sys
sys.path.append("/My/Path/To/Module/Directory")
from ModuleFileName import my_function

Above, if ModuleFileName.py is located at /My/Path/To/Module/Directory and contains a definition of my_function, the 2nd line ensures the 3rd line actually works.
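
The mechanism can be tried end to end by generating a small module on the fly. This is a sketch assuming Python 3; demo_path_module and its my_function are made-up names, and the module is written to a temporary directory purely for demonstration:

```python
import os
import sys
import tempfile

# Create a directory that is not yet on the module path
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "demo_path_module.py"), "w") as f:
    f.write("def my_function():\n    return 'found via sys.path'\n")

# Without this line the import below would fail with ImportError
sys.path.append(module_dir)

from demo_path_module import my_function

result = my_function()
print(result)
```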

Module Names

Module names used in an import statement must be valid Python identifiers, so they are limited to alphanumeric characters and underscores; a dash cannot be used. While my-module.py can be created and run, "import my-module" fails. The name of a module is the name of the module file minus the .py suffix.

Module names are case sensitive. If the module file is called MyModule.py, doing "import mymodule" fails while "import MyModule" is fine.

PEP 8 recommends that module names be all lowercase, with underscores where they improve readability.

Examples of module names from the standard library include math, sys, io, re, urllib, difflib, and unicodedata.

Built-in Modules

For a module to be built-in is not the same as to be part of the standard library. For instance, re is not a built-in module but rather a module written in Python. By contrast, _sre is a built-in module.

Obtaining a list of built-in module names:

import sys
print sys.builtin_module_names
print "_sre" in sys.builtin_module_names # True
print "math" in sys.builtin_module_names # True on many builds; depends on how Python was compiled




Classes


Classes are a way of aggregating similar data and functions. A class is basically a scope inside which various code (especially function definitions) is executed, and the locals to this scope become attributes of the class, and of any objects constructed by this class. An object constructed by a class is called an instance of that class.

Overview

Classes in Python at a glance:

import math
class MyComplex:
  """A complex number"""       # Class documentation
  classvar = 0.0               # A class attribute, not an instance one
  def phase(self):             # A method
    return math.atan2(self.imaginary, self.real)
  def __init__(self):          # A constructor
    """A constructor"""
    self.real = 0.0            # An instance attribute
    self.imaginary = 0.0
c1 = MyComplex()
c1.real = 3.14                 # No access protection
c1.imaginary = 2.71
phase = c1.phase()             # Method call
c1.undeclared = 9.99           # Add an instance attribute
del c1.undeclared              # Delete an instance attribute

print vars(c1)                 # Attributes as a dictionary
vars(c1)["undeclared2"] = 7.77 # Write access to an attribute
print c1.undeclared2           # 7.77, indeed

MyComplex.classvar = 1         # Class attribute access
print c1.classvar == 1         # True; class attribute access, not an instance one
print "classvar" in vars(c1)   # False
c1.classvar = -1               # An instance attribute overshadowing the class one
MyComplex.classvar = 2         # Class attribute access
print c1.classvar == -1        # True; instance attribute access
print "classvar" in vars(c1)   # True

class MyComplex2(MyComplex):   # Class derivation or inheritance
  def __init__(self, re = 0, im = 0):
    self.real = re             # A constructor with multiple arguments with defaults
    self.imaginary = im
  def phase(self):
    print "Derived phase"
    return MyComplex.phase(self) # Call to a base class; "super"
c3 = MyComplex2()
c4 = MyComplex2(1, 1)
c4.phase()                     # Call to the method in the derived class

class Record: pass             # Class as a record/struct with arbitrary attributes
record = Record()
record.name = "Joe"
record.surname = "Hoe"

Defining a Class

To define a class, use the following format:

class ClassName:
    "Here is an explanation about your class"
    pass

The capitalization in this class definition follows convention but is not required by the language. It's usually good to add at least a short explanation of what your class is supposed to do. The pass statement in the code above simply tells the Python interpreter to do nothing. You can remove it as soon as you add your first statement.

Instance Construction

The class is a callable object that constructs an instance of the class when called. Let's say we create a class Foo.

class Foo:
    "Foo is our new toy."
    pass

To construct an instance of the class, Foo, "call" the class object:

f = Foo()

This constructs an instance of class Foo and creates a reference to it in f.

Class Members

In order to access the member of an instance of a class, use the syntax <class instance>.<member>. It is also possible to access the members of the class definition with <class name>.<member>.

Methods

A method is a function within a class. The first argument (methods must always take at least one argument) is always the instance of the class on which the function is invoked. For example

>>> class Foo:
...     def setx(self, x):
...         self.x = x
...     def bar(self):
...         print self.x

If this code were executed, nothing would happen, at least until an instance of Foo were constructed, and then bar were called on that instance.

Invoking Methods

Calling a method is much like calling a function, but instead of passing the instance as the first parameter, as the list of formal parameters suggests, use the function as an attribute of the instance.

>>> f = Foo()
>>> f.setx(5)
>>> f.bar()

This will output

5

It is possible to call the method on an arbitrary object, by using it as an attribute of the defining class instead of an instance of that class, like so:

>>> Foo.setx(f,5)
>>> Foo.bar(f)

This will have the same output.

Dynamic Class Structure

As shown by the method setx above, the members of a Python class can change during runtime, not just their values, unlike classes in languages like C++ or Java. We can even delete f.x after running the code above.

>>> del f.x
>>> f.bar()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 5, in bar
AttributeError: Foo instance has no attribute 'x'

Another effect of this is that we can change the definition of the Foo class during program execution. In the code below, we create a member of the Foo class definition named y. If we then create a new instance of Foo, it will now have this new member.

>>> Foo.y = 10
>>> g = Foo()
>>> g.y
10

Viewing Class Dictionaries

At the heart of all this is a dictionary that can be accessed by "vars(ClassName)"

>>> vars(g)
{}

At first, this output makes no sense. We just saw that g had the member y, so why isn't it in the member dictionary? If you remember, though, we put y in the class definition, Foo, not g.

>>> vars(Foo)
{'y': 10, 'bar': <function bar at 0x4d6a3c>, '__module__': '__main__',
 'setx': <function setx at 0x4d6a04>, '__doc__': None}

And there we have all the members of the Foo class definition. When Python checks for g.member, it first checks g's vars dictionary for "member," then Foo. If we create a new member of g, it will be added to g's dictionary, but not Foo's.

>>> g.setx(5)
>>> vars(g)
{'x': 5}

Note that if we now assign a value to g.y, we are not assigning that value to Foo.y. Foo.y will still be 10, but g.y will now override Foo.y

>>> g.y = 9
>>> vars(g)
{'y': 9, 'x': 5}
>>> vars(Foo)
{'y': 10, 'bar': <function bar at 0x4d6a3c>, '__module__': '__main__',
 'setx': <function setx at 0x4d6a04>, '__doc__': None}

Sure enough, if we check the values:

>>> g.y
9
>>> Foo.y
10

Note that f.y will also be 10, as Python won't find 'y' in vars(f), so it will get the value of 'y' from vars(Foo).
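
The lookup order described above (instance dictionary first, then the class) can be verified directly. A minimal sketch, assuming Python 3; the names mirror the Foo, f and g used in the text:

```python
class Foo:
    y = 10            # class attribute, stored in vars(Foo)

f = Foo()
g = Foo()
g.y = 9               # instance attribute, stored in vars(g); shadows Foo.y

lookup = {
    "g.y": g.y,
    "f.y": f.y,
    "Foo.y": Foo.y,
    "y in vars(g)": "y" in vars(g),
    "y in vars(f)": "y" in vars(f),
}

del g.y               # remove the instance attribute again
after_delete = g.y    # the lookup falls back to the class attribute

print(lookup, after_delete)
```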

Some may have also noticed that the methods in Foo appear in the class dictionary along with the x and y. If you remember from the section on lambda functions, we can treat functions just like variables. This means that we can assign methods to a class during runtime in the same way we assigned variables. If you do this, though, remember that if we call a method of a class instance, the first parameter passed to the method will always be the class instance itself.

Changing Class Dictionaries

We can also access the members dictionary of a class using the __dict__ member of the class.

>>> g.__dict__
{'y': 9, 'x': 5}

If we add, remove, or change key-value pairs from g.__dict__, this has the same effect as if we had made those changes to the members of g.

>>> g.__dict__['z'] = -4
>>> g.z
-4

New Style Classes

New-style classes were introduced in Python 2.2. A new-style class is a class that has a built-in as its base, most commonly object. At a low level, a major difference between old and new classes is their type. Old class instances were all of type instance. For an instance x of a new-style class, type(x) returns the same thing as x.__class__. This puts user-defined classes on a level playing field with built-ins. Old/classic classes were removed in Python 3, where every class is a new-style class. With this in mind, all development should use new-style classes. New-style classes also add constructs like properties and static methods familiar to Java programmers.

Old/Classic Class

>>> class ClassicFoo:
...     def __init__(self):
...         pass

New Style Class

>>> class NewStyleFoo(object):
...     def __init__(self):
...         pass

Properties

Properties are attributes with getter and setter methods.

>>> class SpamWithProperties(object):
...     def __init__(self):
...         self.__egg = "MyEgg"
...     def get_egg(self):
...         return self.__egg
...     def set_egg(self, egg):
...         self.__egg = egg
...     egg = property(get_egg, set_egg)

>>> sp = SpamWithProperties()
>>> sp.egg
'MyEgg'
>>> sp.egg = "Eggs With Spam"
>>> sp.egg
'Eggs With Spam'
>>>

and since Python 2.6, with @property decorator

>>> class SpamWithProperties(object):
...     def __init__(self):
...         self.__egg = "MyEgg"
...     @property
...     def egg(self):
...         return self.__egg
...     @egg.setter
...     def egg(self, egg):
...         self.__egg = egg
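
Because a property routes attribute access through methods, the setter can validate values and a property without a setter is read-only. The following is a sketch assuming Python 3; the Temperature class is a hypothetical example and not part of the code above:

```python
class Temperature:
    def __init__(self, celsius=0.0):
        self.celsius = celsius            # goes through the setter below

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        if value < -273.15:
            raise ValueError("temperature below absolute zero")
        self._celsius = float(value)

    @property
    def fahrenheit(self):                 # no setter: read-only
        return self._celsius * 9 / 5 + 32

t = Temperature(100)
boiling_f = t.fahrenheit

try:
    t.celsius = -300                      # rejected by the setter
    rejected = False
except ValueError:
    rejected = True

print(boiling_f, rejected, t.celsius)
```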

Static Methods

Static methods in Python are just like their counterparts in C++ or Java. Static methods have no "self" argument and don't require you to instantiate the class before using them. They can be defined using staticmethod()

>>> class StaticSpam(object):
...     def StaticNoSpam():
...         print "You can't have the spam, spam, eggs and spam without any spam... that's disgusting"
...     NoSpam = staticmethod(StaticNoSpam)

>>> StaticSpam.NoSpam()
You can't have the spam, spam, eggs and spam without any spam... that's disgusting

They can also be defined using the function decorator @staticmethod.

>>> class StaticSpam(object):
...     @staticmethod
...     def StaticNoSpam():
...         print "You can't have the spam, spam, eggs and spam without any spam... that's disgusting"
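
A static method can be called through the class or through an instance; in both cases no instance is passed to it. A short sketch assuming Python 3, with a made-up method name and return value:

```python
class StaticSpam:
    @staticmethod
    def no_spam():
        # No "self" parameter: the method receives no instance
        return "no spam"

via_class = StaticSpam.no_spam()        # called on the class
via_instance = StaticSpam().no_spam()   # called on an instance

print(via_class, via_instance)
```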

Inheritance

Like most object-oriented languages, Python provides support for inheritance. Inheritance is a simple concept by which a class can extend the facilities of another class, or in Python's case, multiple other classes. Use the following format for this:

class ClassName(BaseClass1, BaseClass2, BaseClass3,...):
    ...

ClassName is what is known as the derived class, that is, derived from the base classes. The derived class will then have all the members of its base classes. If a method is defined in the derived class and in the base class, the member in the derived class will override the one in the base class. In order to use the method defined in the base class, it is necessary to call the method as an attribute on the defining class, as in Foo.setx(f,5) above:

>>> class Foo:
...     def bar(self):
...         print "I'm doing Foo.bar()"
...     x = 10
...
>>> class Bar(Foo):
...     def bar(self):
...         print "I'm doing Bar.bar()"
...         Foo.bar(self)
...     y = 9
...
>>> g = Bar()
>>> Bar.bar(g)
I'm doing Bar.bar()
I'm doing Foo.bar()
>>> g.y
9
>>> g.x
10

Once again, we can see what's going on under the hood by looking at the class dictionaries.

>>> vars(g)
{}
>>> vars(Bar)
{'y': 9, '__module__': '__main__', 'bar': <function bar at 0x4d6a04>,
 '__doc__': None}
>>> vars(Foo)
{'x': 10, '__module__': '__main__', 'bar': <function bar at 0x4d6994>,
 '__doc__': None}

When we access g.x, Python first looks in the vars(g) dictionary, as usual. Also as above, it checks vars(Bar) next, since g is an instance of Bar. However, thanks to inheritance, Python will check vars(Foo) if it doesn't find x in vars(Bar).

Multiple inheritance

As shown in the Inheritance section, a class can be derived from multiple classes:

class ClassName(BaseClass1, BaseClass2, BaseClass3):
    pass

A tricky part about multiple inheritance is method resolution: upon a method call, if the method name is available from multiple base classes or their base classes, which base class method should be called?

The method resolution order depends on whether the class is an old-style class or a new-style class. For old-style classes, base classes are considered from left to right, and base classes of base classes are considered before moving to the right. Thus, above, BaseClass1 is considered first, and if the method is not found there, the base classes of BaseClass1 are considered. If that fails, BaseClass2 is considered, then its base classes, and so on. For new-style classes, the order is computed by the C3 linearization algorithm; see the Python documentation online.
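
For new-style classes, the resolution order of a class can be inspected through its __mro__ attribute. The following sketch, assuming Python 3 (where all classes are new-style), uses a hypothetical diamond-shaped hierarchy in which the old depth-first rule and the new-style rule would give different answers:

```python
class Base:
    def who(self):
        return "Base"

class Left(Base):
    pass                      # does not override who()

class Right(Base):
    def who(self):
        return "Right"

class Child(Left, Right):
    pass

# The order in which classes are searched for an attribute
mro_names = [cls.__name__ for cls in Child.__mro__]

# Right.who is found before Base.who, although Left derives from Base
result = Child().who()

print(mro_names, result)
```

Under the old-style depth-first rule, Base would have been reached through Left before Right was considered.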

Special Methods

There are a number of methods which have reserved names which are used for special purposes like mimicking numerical or container operations, among other things. All of these names begin and end with two underscores. It is convention that methods beginning with a single underscore are 'private' to the scope they are introduced within.

Initialization and Deletion

__init__

One of these purposes is constructing an instance, and the special name for this is '__init__'. __init__() is called before an instance is returned (it is not necessary to return the instance manually). As an example,

class A:
    def __init__(self):
        print 'A.__init__()'
a = A()

outputs

A.__init__()

__init__() can take arguments, in which case it is necessary to pass arguments to the class in order to create an instance. For example,

class Foo:
    def __init__ (self, printme):
        print printme
foo = Foo('Hi!')

outputs

Hi!

Here is an example showing the difference between using __init__() and not using __init__():

class Foo:
    def __init__ (self, x):
        print x
foo = Foo('Hi!')
class Foo2:
    def setx(self, x):
        print x
f = Foo2()
Foo2.setx(f,'Hi!')

outputs

Hi!
Hi!

__del__

Similarly, '__del__' is called when an instance is destroyed; e.g. when it is no longer referenced.

__enter__ and __exit__

These methods also act like a constructor and a destructor, but they are only executed when an instance is used in a with statement. Example:

class ConstructorsDestructors:
    def __init__(self):
        print 'init'

    def __del__(self):
        print 'del'

    def __enter__(self):
        print 'enter'

    def __exit__(self, exc_type, exc_value, traceback):
        print 'exit'

with ConstructorsDestructors():
    pass

outputs

init
enter
exit
del
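
In practice, __enter__ usually returns the object that the with statement binds after "as", and __exit__ runs even when the body raises an exception. A minimal sketch, assuming Python 3; the Resource class and its events list are made up to record the order of calls:

```python
class Resource:
    def __init__(self):
        self.events = ["init"]

    def __enter__(self):
        self.events.append("enter")
        return self                 # the value bound by "as"

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")
        return False                # do not suppress exceptions

with Resource() as res:
    res.events.append("body")

events = res.events

# __exit__ also runs when the body raises
failing = Resource()
try:
    with failing:
        raise ValueError("boom")
except ValueError:
    pass

print(events, failing.events)
```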

__new__

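__new__ creates and returns the new instance and is called before __init__. It is most often overridden in metaclasses and in subclasses of immutable built-in types.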

Representation

__str__

Converting an object to a string, as with the print statement or with the str() conversion function, can be overridden by overriding __str__. Usually, __str__ returns a formatted version of the object's content. This will NOT usually be something that can be executed.

For example:

class Bar:
    def __init__ (self, iamthis):
        self.iamthis = iamthis
    def __str__ (self):
        return self.iamthis
bar = Bar('apple')
print bar

outputs

apple

__repr__

This function is much like __str__(). If __str__ is not present but this one is, this function's output is used instead for printing. __repr__ is used to return a representation of the object in string form. In general, it can be executed to get back the original object.

For example:

class Bar:
    def __init__ (self, iamthis):
        self.iamthis = iamthis
    def __repr__(self):
        return "Bar('%s')" % self.iamthis
bar = Bar('apple')
bar

outputs the following at the interactive prompt, where no print is needed (in a script, a bare expression displays nothing, so use print bar or print repr(bar) there):

Bar('apple')

String Representation Override Functions

Function      Operator
__str__       str(A)
__repr__      repr(A)
__unicode__   unicode(x) (2.x only)
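
When a class defines both methods, str() and repr() give different results, and containers use the repr() of their elements. A short sketch assuming Python 3, reusing the Bar class from the examples above with both methods defined:

```python
class Bar:
    def __init__(self, iamthis):
        self.iamthis = iamthis

    def __str__(self):
        return self.iamthis                 # readable form

    def __repr__(self):
        return "Bar(%r)" % self.iamthis     # form resembling source code

bar = Bar("apple")
as_str = str(bar)
as_repr = repr(bar)
in_list = str([bar])    # a list shows the repr() of its elements

print(as_str, as_repr, in_list)
```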

Attributes

__setattr__

This is the method in charge of setting attributes of an instance. It is provided with the name and value of the attribute being assigned. Each class, of course, comes with a default __setattr__ which simply sets the value of the attribute, but we can override it.

>>> class Unchangable:
...    def __setattr__(self, name, value):
...        print "Nice try"
...
>>> u = Unchangable()
>>> u.x = 9
Nice try
>>> u.x
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: Unchangable instance has no attribute 'x'
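
__getattr__

Similar to __setattr__, except that this method is called when we read an attribute. Unlike __setattr__, it is called only when the attribute cannot be found in the usual places; attributes that already exist are returned directly without calling it.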

>>> class HiddenMembers:
...     def __getattr__(self, name):
...         return "You don't get to see " + name
...
>>> h = HiddenMembers()
>>> h.anything
"You don't get to see anything"

__delattr__

This function is called to delete an attribute.

>>> class Permanent:
...     def __delattr__(self, name):
...         print name, "cannot be deleted"
...
>>> p = Permanent()
>>> p.x = 9
>>> del p.x
x cannot be deleted
>>> p.x
9

Attribute Override Functions

Function      Indirect form       Direct form
__getattr__   getattr(A, B)       A.B
__setattr__   setattr(A, B, C)    A.B = C
__delattr__   delattr(A, B)       del A.B
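
A practical override usually delegates to the default behavior for the cases it does not want to change. The following Python 3 sketch assumes a made-up Guarded class and a made-up "locked_" naming rule, and it also shows that __getattr__ is consulted only when normal lookup fails.

```python
class Guarded:
    def __init__(self):
        self.allowed = 1  # goes through __setattr__ below

    def __setattr__(self, name, value):
        if name.startswith("locked_"):
            raise AttributeError("cannot set " + name)
        # Delegate to the default implementation to really store the value.
        object.__setattr__(self, name, value)

    def __getattr__(self, name):
        # Reached only when normal attribute lookup fails.
        return "no attribute named " + name


g = Guarded()
g.size = 3
try:
    g.locked_id = 7
    rejected = False
except AttributeError:
    rejected = True
```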

Operator Overloading

Operator overloading allows us to use the built-in Python syntax and operators to call functions which we define.

Binary Operators

If a class has the __add__ function, we can use the '+' operator to add instances of the class. This will call __add__ with the two instances of the class passed as parameters, and the return value will be the result of the addition.

>>> class FakeNumber:
...     n = 5
...     def __add__(A,B):
...         return A.n + B.n
...
>>> c = FakeNumber()
>>> d = FakeNumber()
>>> d.n = 7
>>> c + d
12

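To override the augmented assignment operators, merely add 'i' in front of the normal binary operator name, i.e. for '+=' use '__iadd__' instead of '__add__'. The function is given the object on the right side of the augmented assignment operator (in addition to self when it is defined as an ordinary method), and its return value is then bound to the name on the left of the operator. The example below attaches the function to the instance c, which works only for Python 2 old-style classes such as FakeNumber; new-style classes and Python 3 look up special methods on the class, not on the instance.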

>>> c.__imul__ = lambda B: B.n - 6
>>> c *= d
>>> c
1

It is important to note that the augmented assignment operators will also use the normal operator functions if the augmented operator function hasn't been set directly. This will work as expected, with "__add__" being called for "+=" and so on.

>>> c = FakeNumber()
>>> c += d
>>> c
12

Binary Operator Override Functions

Function       Operator
__add__        A + B
__sub__        A - B
__mul__        A * B
__div__        A / B (Python 2 without from __future__ import division)
__truediv__    A / B (Python 3, or Python 2 with from __future__ import division)
__floordiv__   A // B
__mod__        A % B
__pow__        A ** B
__and__        A & B
__or__         A | B
__xor__        A ^ B
__eq__         A == B
__ne__         A != B
__gt__         A > B
__lt__         A < B
__ge__         A >= B
__le__         A <= B
__lshift__     A << B
__rshift__     A >> B
__contains__   A in B, A not in B (called on B)
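
For comparison with the instance-level trick shown earlier, here is a sketch of the more common style, with the operator methods defined in the class body. It is written for Python 3, and the Vector class is an illustrative assumption.

```python
class Vector:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        # A + B builds and returns a new object.
        return Vector(self.x + other.x, self.y + other.y)

    def __iadd__(self, other):
        # A += B modifies A in place ...
        self.x += other.x
        self.y += other.y
        return self  # ... and the result is rebound to the left-hand name

    def __eq__(self, other):
        return (self.x, self.y) == (other.x, other.y)


a = Vector(1, 2)
b = Vector(3, 4)
total = a + b
alias = a
a += b
```

Because __iadd__ returns self, the names a and alias still refer to the same object after the augmented assignment.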

Unary Operators

Unary operators will be passed simply the instance of the class that they are called on.

>>> FakeNumber.__neg__ = lambda A : A.n + 6
>>> -d
13

Unary Operator Override Functions

Function     Operator
__pos__      +A
__neg__      -A
__invert__   ~A
__abs__      abs(A)
__len__      len(A)

Item Operators

It is also possible in Python to override the indexing and slicing operators. This allows us to use the class[i] and class[a:b] syntax on our own objects.

The simplest form of item operator is __getitem__. This takes as a parameter the instance of the class, then the value of the index.

>>> class FakeList:
...     def __getitem__(self,index):
...         return index * 2
...
>>> f = FakeList()
>>> f['a']
'aa'

We can also define a function for the syntax associated with assigning a value to an item. The parameters for this function include the value being assigned, in addition to the parameters from __getitem__

>>> class FakeList:
...     def __setitem__(self,index,value):
...         self.string = index + " is now " + value
...
>>> f = FakeList()
>>> f['a'] = 'gone'
>>> f.string
'a is now gone'

We can do the same thing with slices. Once again, each syntax has a different parameter list associated with it. (The slice methods shown here exist only in Python 2, where they have been deprecated since version 2.0; in Python 3 a slice object is passed to __getitem__, __setitem__ and __delitem__ instead.)

>>> class FakeList:
...     def __getslice__(self,start,end):
...         return str(start) + " to " + str(end)
...
>>> f = FakeList()
>>> f[1:4]
'1 to 4'

Keep in mind that one or both of the start and end parameters can be blank in slice syntax. Here, Python supplies default values for both the start and the end, as shown below.

>>> f[:]
'0 to 2147483647'

Note that the default value for the end of the slice shown here is simply the largest possible signed integer on a 32-bit system, and may vary depending on your system and C compiler.

  • __setslice__ has the parameters (self,start,end,value)

We also have operators for deleting items and slices.

  • __delitem__ has the parameters (self,index)
  • __delslice__ has the parameters (self,start,end)

Note that these parameter lists are the same as those of __getitem__ and __getslice__.

Item Operator Override Functions

Function       Operator
__getitem__    C[i]
__setitem__    C[i] = v
__delitem__    del C[i]
__getslice__   C[s:e]
__setslice__   C[s:e] = v
__delslice__   del C[s:e]
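
In Python 3 the slice methods no longer exist, and C[s:e] reaches __getitem__ with a slice object as the index. The sketch below assumes a made-up read-only Evens class to show one way of handling both kinds of index.

```python
class Evens:
    """Hypothetical read-only sequence of the even numbers 0, 2, 4, ..."""

    def __init__(self, count):
        self.count = count

    def __len__(self):
        return self.count

    def __getitem__(self, index):
        if isinstance(index, slice):
            # slice.indices() fills in omitted bounds and clips to the length.
            return [2 * i for i in range(*index.indices(self.count))]
        if index < 0:
            index += self.count
        if not 0 <= index < self.count:
            raise IndexError("index out of range")
        return 2 * index


e = Evens(5)
```

With this definition, e[1:4] gives a list of three values, and e[:] covers the whole sequence without the large default end value seen in the Python 2 example.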

Other Overrides

Other Override Functions

Function       Operator
__cmp__        cmp(x, y) (2.x only)
__hash__       hash(x)
__nonzero__    bool(x) (named __bool__ in Python 3)
__call__       f(x)
__iter__       iter(x)
__reversed__   reversed(x) (2.6+)
__divmod__     divmod(x, y)
__int__        int(x)
__long__       long(x) (2.x only)
__float__      float(x)
__complex__    complex(x)
__hex__        hex(x) (2.x only)
__oct__        oct(x) (2.x only)
__index__      operator.index(x)
__copy__       copy.copy(x)
__deepcopy__   copy.deepcopy(x)
__sizeof__     sys.getsizeof(x) (2.6+)
__trunc__      math.trunc(x) (2.6+)
__format__     format(x, ...) (2.6+)

Programming Practices

The flexibility of Python classes means that classes can adopt a varied set of behaviors. For the sake of understandability, however, it's best to use many of Python's tools sparingly. Try to declare all methods in the class definition, and always use the <class>.<member> syntax instead of __dict__ whenever possible. Look at classes in C++ and Java to see what most programmers will expect from a class.

Encapsulation

Since all members of a Python class are accessible by functions and methods outside the class, there is no way to enforce encapsulation short of overriding __getattr__, __setattr__ and __delattr__. General practice, however, is for the creator of a class or module to simply trust that users will use only the intended interface, and to avoid limiting access to the workings of the module for the sake of users who do need to access it. When using parts of a class or module other than the intended interface, keep in mind that those parts may change in later versions of the module, and that you may even cause errors or undefined behavior in the module.

Doc Strings

When defining a class, it is convention to document the class using a string literal at the start of the class definition. This string will then be placed in the __doc__ attribute of the class definition.

>>> class Documented:
...     """This is a docstring"""
...     def explode(self):
...         """
...         This method is documented, too! The coder is really serious about
...         making this class usable by others who don't know the code as well
...         as he does.
...
...         """
...         print "boom"
...
>>> d = Documented()
>>> d.__doc__
'This is a docstring'

Docstrings are a very useful way to document your code. Even if you never write a single piece of separate documentation (and let's admit it, doing so is the lowest priority for many coders), including informative docstrings in your classes will go a long way toward making them usable.

Several tools exist for turning the docstrings in Python code into readable API documentation, e.g., EpyDoc.

Don't just stop at documenting the class definition, either. Each method in the class should have its own docstring as well. Note that the method explode in the example class Documented above has a fairly lengthy docstring that spans several lines. Its formatting follows the docstring conventions of PEP 257, which PEP 8, the style guide co-written by Python's creator Guido van Rossum, refers to.

Adding methods at runtime

To a class

It is fairly easy to add methods to a class at runtime. Let's assume that we have a class called Spam and a function cook. We want to be able to use the function cook on all instances of the class Spam:

class Spam:
  def __init__(self):
    self.myeggs = 5

def cook(self):
  print "cooking %s eggs" % self.myeggs

Spam.cook = cook   #add the function to the class Spam
eggs = Spam()      #NOW create a new instance of Spam
eggs.cook()        #and we are ready to cook!

This will output

cooking 5 eggs

To an instance of a class

It is a bit more tricky to add methods to an instance of a class that has already been created. Let's assume again that we have a class called Spam and that we have already created eggs. Then we notice that we want to cook those eggs, but rather than creating a new instance we want to use the one already created:

class Spam:
  def __init__(self):
    self.myeggs = 5

eggs = Spam()

def cook(self):
  print "cooking %s eggs" % self.myeggs

import types
f = types.MethodType(cook, eggs, Spam)
eggs.cook = f

eggs.cook()

Now we can cook our eggs and the last statement will output:

cooking 5 eggs

Using a function

We can also write a function that will make the process of adding methods to an instance of a class easier.

import types

def attach_method(fxn, instance, myclass):
  f = types.MethodType(fxn, instance, myclass)
  setattr(instance, fxn.__name__, f)

All we now need to do is call the attach_method with the arguments of the function we want to attach, the instance we want to attach it to and the class the instance is derived from. Thus our function call might look like this:

attach_method(cook, eggs, Spam)

Note that in the function attach_method we cannot write instance.fxn = f, since this would add an attribute literally named fxn to the instance.
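
The three-argument form of types.MethodType used above is specific to Python 2. As a hedged sketch of the same idea for Python 3, where MethodType takes only the function and the instance, the helper could look like this (cook returns its message here instead of printing it, purely so the result is easy to check):

```python
import types


class Spam:
    def __init__(self):
        self.myeggs = 5


def cook(self):
    return "cooking %s eggs" % self.myeggs


def attach_method(fxn, instance):
    # Python 3: MethodType binds the function to this one instance.
    setattr(instance, fxn.__name__, types.MethodType(fxn, instance))


eggs = Spam()
other = Spam()
attach_method(cook, eggs)
message = eggs.cook()
```

Only eggs gains the method; other instances of Spam, such as other, are unaffected.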



Exceptions


Python 2 handles all errors with exceptions.

An exception is a signal that an error or other unusual condition has occurred. There are a number of built-in exceptions, which indicate conditions like reading past the end of a file, or dividing by zero. You can also define your own exceptions.

Overview

Exceptions in Python at a glance:

import random
try:
  ri = random.randint(0, 2)
  if ri == 0:
    infinity = 1/0
  elif ri == 1:
    raise ValueError("Message")
    #raise ValueError, "Message" # Deprecated
  elif ri == 2:
    raise ValueError # Without message
except ZeroDivisionError:
  pass
except ValueError as valerr:
# except ValueError, valerr: # Deprecated?
  print valerr
  raise # Raises the exception just caught
except: # Any other exception
  pass
finally: # Optional
  pass # Clean up

class CustomValueError(ValueError): pass # Custom exception
try:
  raise CustomValueError
  raise TypeError
except (ValueError, TypeError): # Value error catches custom, a derived class, as well
  pass                          # A tuple catches multiple exception classes

Raising exceptions

Whenever your program attempts to do something erroneous or meaningless, Python raises an exception in response:

>>> 1 / 0
Traceback (most recent call last):
    File "<stdin>", line 1, in ?
ZeroDivisionError: integer division or modulo by zero

This traceback indicates that the ZeroDivisionError exception is being raised. This is a built-in exception -- see below for a list of all the other ones.

Catching exceptions

In order to handle errors, you can set up exception handling blocks in your code. The keywords try and except are used to catch exceptions. When an error occurs within the try block, Python looks for a matching except block to handle it. If there is one, execution jumps there.

If you execute this code:

try:
    print 1/0
except ZeroDivisionError:
    print "You can't divide by zero, you're silly."

Then Python will print this:

You can't divide by zero, you're silly.

If you don't specify an exception type on the except line, it will cheerfully catch all exceptions. This is generally a bad idea in production code, since it means your program will blissfully ignore unexpected errors as well as ones which the except block is actually prepared to handle.

Exceptions can propagate up the call stack:

def f(x):
    return g(x) + 1

def g(x):
    if x < 0: raise ValueError("I can't cope with a negative number here.")
    else: return 5

try:
    print f(-6)
except ValueError:
    print "That value was invalid."

In this code, the print statement calls the function f. That function calls the function g, which will raise an exception of type ValueError. Neither f nor g has a try/except block to handle ValueError. So the exception raised propagates out to the main code, where there is an exception-handling block waiting for it. This code prints:

That value was invalid.

Sometimes it is useful to find out exactly what went wrong, or to print the Python error text yourself. For example:

try:
    the_file = open("the_parrot")
except IOError as e:
    if e.errno == 2: # file not found
        print "Sorry, 'the_parrot' has apparently joined the choir invisible."
    else:
        print "Congratulations! You have managed to trip a #%d error" % e.errno
        print e.strerror

If no file named the_parrot exists, this will print:

Sorry, 'the_parrot' has apparently joined the choir invisible.

Custom Exceptions

Code similar to that seen above can be used to create custom exceptions and pass information along with them. This can be extremely useful when trying to debug complicated projects. Here is how that code would look; first creating the custom exception class:

class CustomException(Exception):
    def __init__(self, value):
        self.parameter = value
    def __str__(self):
        return repr(self.parameter)

And then using that exception:

try:
    raise CustomException("My Useful Error Message")
except CustomException as instance:
    print "Caught: " + instance.parameter
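
A custom exception can also carry structured data for the handler to use. The sketch below is written for Python 3; InsufficientFunds and withdraw are made-up names for illustration.

```python
class InsufficientFunds(Exception):
    def __init__(self, needed, available):
        # Passing a message to Exception keeps str(exc) and tracebacks useful.
        super().__init__("need %d but only %d available" % (needed, available))
        self.needed = needed
        self.available = available


def withdraw(balance, amount):
    if amount > balance:
        raise InsufficientFunds(amount, balance)
    return balance - amount


try:
    withdraw(10, 25)
except InsufficientFunds as exc:
    shortfall = exc.needed - exc.available
    message = str(exc)
```

The handler reads the attributes directly instead of parsing the message text.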

Trying over and over again

Recovering and continuing with finally

Exceptions could lead to a situation where, after raising an exception, the code block where the exception occurred might not be revisited. In some cases this might leave external resources used by the program in an unknown state.

The finally clause allows programmers to release such resources whether or not an exception occurs. The syntax changed between Python 2.4 and 2.5: up to 2.4, finally could not be combined with except in the same try statement, so the two had to be nested; from 2.5 on, a single try statement may have except, else and finally clauses.

  • Python 2.4
try:
    result = None
    try:
        result = x/y
    except ZeroDivisionError:
        print "division by zero!"
    print "result is ", result
finally:
    print "executing finally clause"
  • Python 2.5
try:
    result = x / y
except ZeroDivisionError:
    print "division by zero!"
else:
    print "result is", result
finally:
    print "executing finally clause"
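
To make the order of the clauses visible, the following Python 3 sketch records which clauses ran instead of printing. The function name divide and the steps list are illustrative only.

```python
def divide(x, y):
    steps = []
    try:
        result = x / y
    except ZeroDivisionError:
        steps.append("except")
        result = None
    else:
        steps.append("else")      # runs only if no exception was raised
    finally:
        steps.append("finally")   # always runs, last
    return result, steps


ok = divide(6, 3)
failed = divide(1, 0)
```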

Built-in exception classes

All built-in Python exceptions

Exotic uses of exceptions

Exceptions are good for more than just error handling. If you have a complicated piece of code to choose which of several courses of action to take, it can be useful to use exceptions to jump out of the code as soon as the decision can be made. The Python-based mailing list software Mailman does this in deciding how a message should be handled. Using exceptions like this may seem like it's a sort of GOTO -- and indeed it is, but a limited one called an escape continuation. Continuations are a powerful functional-programming tool and it can be useful to learn them.

As a simple example of how exceptions can make programming easier, say you want to add items to a list that may not have been created yet. Without exceptions, an if statement has to check for the list first:

if hasattr(self, 'items'):
    self.items.extend(new_items)
else:
    self.items = list(new_items)

Using exceptions, we can emphasize the normal program flow—that usually we just extend the list—rather than emphasizing the unusual case:

try:
    self.items.extend(new_items)
except AttributeError:
    self.items = list(new_items)



Errors


In Python there are three types of errors: syntax errors, logic errors and exceptions.

Syntax errors

Syntax errors are the most basic type of error. They arise when the Python parser is unable to understand a line of code. Syntax errors are almost always fatal, i.e. there is almost never a way to successfully execute a piece of code containing syntax errors. Some syntax errors can be caught and handled, such as the one raised by eval(""), but these are rare.

IDLE highlights where the syntax error is. Most syntax errors are typos, incorrect indentation, or incorrect arguments. If you get this error, try looking at your code for any of these.

Logic errors

These are the most difficult type of error to find, because the program runs but gives wrong or unpredictable results, and it may or may not crash. A lot of different things can happen if you have a logic error. A debugger helps here: it lets you step through the program and inspect its state to locate the problem, although fixing it is still up to you.

Exceptions

Exceptions arise when the Python interpreter knows what to do with a piece of code but is unable to perform the action. An example would be trying to access the internet with Python without an internet connection; the interpreter knows what to do with that command but is unable to perform it.

Dealing with exceptions

Unlike syntax errors, exceptions are not always fatal. Exceptions can be handled with the use of a try statement.

Consider the following code to display the HTML of the website 'example.com'. When execution reaches the try statement, Python attempts to run the indented code that follows. If an error occurs for some reason (for instance, the computer is not connected to the internet), the interpreter jumps to the indented code below the 'except:' line.

import urllib2
url = 'http://www.example.com'
try:
    req = urllib2.Request(url)
    response = urllib2.urlopen(req)
    the_page = response.read()
    print the_page
except:
    print "We have a problem."

Another way to handle an error is to catch only a specific exception type.

try:
    age = int(raw_input("Enter your age: "))
    print "You must be {0} years old.".format(age)
except ValueError:
    print "Your age must be numeric."

If the user enters a numeric value as his/her age, the output should look like this:

Enter your age: 5
You must be 5 years old.

However, if the user enters a non-numeric value as his/her age, a ValueError is raised when the int() function is called on the non-numeric string, and the code under the except clause is executed:

Enter your age: five
Your age must be numeric.

You can also use a try block with a while loop to validate input:

valid = False
while not valid:
    try:
        age = int(raw_input("Enter your age: "))
        valid = True     # This statement will only execute if the above statement executes without error.
        print "You must be {0} years old.".format(age)
    except ValueError:
        print "Your age must be numeric."

The program will prompt you for your age until you enter a valid age:

Enter your age: five
Your age must be numeric.
Enter your age: abc10
Your age must be numeric.
Enter your age: 15
You must be 15 years old.
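
The same retry loop can be wrapped in a function so that it is reusable and cannot loop forever. This is a sketch for Python 3 under stated assumptions: read_int, its read parameter and the max_tries limit are all hypothetical names, and the user replies are simulated so the example runs without a keyboard.

```python
def read_int(prompt, read=input, max_tries=3):
    """Ask until the reply parses as an int, giving up after max_tries."""
    for _ in range(max_tries):
        try:
            return int(read(prompt))
        except ValueError:
            continue  # invalid reply, ask again
    raise ValueError("no valid integer after %d tries" % max_tries)


# Simulated user replies in place of real keyboard input.
replies = iter(["five", "abc10", "15"])
age = read_int("Enter your age: ", read=lambda prompt: next(replies))
```

Called without the read argument, the function falls back to the built-in input().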

In certain other cases, it might be necessary to get more information about the exception and deal with it appropriately. In such situations the except as construct can be used.

import os

f=raw_input("enter the name of the file:")
l=raw_input("enter the name of the link:")
try:
    os.symlink(f,l)
except OSError as e:
    print "an error occurred linking %s to %s: %s\n error no %d"%(f,l,e.args[1],e.args[0])

Two sample runs:

enter the name of the file:file1.txt
enter the name of the link:AlreadyExists.txt
an error occurred linking file1.txt to AlreadyExists.txt: File exists
 error no 17

enter the name of the file:file1.txt
enter the name of the link:/Cant/Write/Here/file1.txt
an error occurred linking file1.txt to /Cant/Write/Here/file1.txt: Permission denied
 error no 13



Source Documentation and Comments


Documentation is the process of leaving information about your code. The two mechanisms for doing this in Python are comments and documentation strings.

Comments

There will always be a time in which you have to return to your code. Perhaps it is to fix a bug, or to add a new feature. Regardless, looking at your own code after six months is almost as bad as looking at someone else's code. What you need is a means to leave reminders to yourself as to what you were doing.

For this purpose, you leave comments. Comments are little snippets of text embedded inside your code that are ignored by the Python interpreter. A comment is denoted by the hash character (#) and extends to the end of the line. For example:

#!/usr/bin/env python
# commentexample.py

# Display the knights that come after Scene 24
print("The Knights Who Say Ni!")
# print("I will never see the light of day!")

As you can see, you can also use comments to temporarily remove segments of your code, like the second print statement.

Comment Guidelines

The following guidelines are from PEP 8, written by Guido van Rossum.

  • General
    • Comments that contradict the code are worse than no comments. Always make a priority of keeping the comments up-to-date when the code changes!
    • Comments should be complete sentences. If a comment is a phrase or sentence, its first word should be capitalized, unless it is an identifier that begins with a lower case letter (never alter the case of identifiers!).
    • If a comment is short, the period at the end can be omitted. Block comments generally consist of one or more paragraphs built out of complete sentences, and each sentence should end in a period.
    • You should use two spaces after a sentence-ending period.
    • When writing English, Strunk and White applies.
    • Python coders from non-English speaking countries: please write your comments in English, unless you are 120% sure that the code will never be read by people who don't speak your language.
  • Inline Comments
    • An inline comment is a comment on the same line as a statement. Inline comments should be separated by at least two spaces from the statement. They should start with a # and a single space.
    • Inline comments are unnecessary and in fact distracting if they state the obvious. Don't do this:
      x = x + 1  # Increment x
      
      But sometimes, this is useful:
      x = x + 1  # Compensate for border
      

Documentation Strings

But what if you just want to know how to use a function, class, or method? You could add comments before the function, but comments are inside the code, so you would have to pull up a text editor and view them that way. But you can't pull up comments from a C extension, so that is less than ideal. You could always write a separate text file with how to call the functions, but that would mean that you would have to remember to update that file. If only there was a mechanism for being able to embed the documentation and get at it easily...

Fortunately, Python has such a capability. Documentation strings (or docstrings) are used to create easily-accessible documentation. You can add a docstring to a function, class, or module by adding a string as the first indented statement. For example:

#!/usr/bin/env python
# docstringexample.py

"""Example of using documentation strings."""

class Knight:
    """
    An example class.
    
    Call spam to get bacon.
    """
    
    def spam(self, eggs="bacon"):
        """Prints the argument given."""
        print(eggs)

The convention is to use triple-quoted strings, because it makes it easier to add more documentation spanning multiple lines.

To access the documentation, you can use the help function inside a Python shell with the object you want help on, or you can use the pydoc command from your system's shell. If we were in the directory where docstringexample.py lives, one could enter pydoc docstringexample to get documentation on that module.
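
Docstrings can also be read from code. The sketch below (Python 3) repeats a variant of the Knight class and reads its documentation through the __doc__ attribute and through inspect.getdoc from the standard library; spam returns its argument here instead of printing it.

```python
import inspect


class Knight:
    """
    An example class.

    Call spam to get bacon.
    """

    def spam(self, eggs="bacon"):
        """Return the argument given."""
        return eggs


raw = Knight.__doc__             # the string exactly as written, indentation included
clean = inspect.getdoc(Knight)   # same text with indentation and outer blank lines removed
method_doc = Knight.spam.__doc__
```

The help function and pydoc show the cleaned-up form.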



Idioms

Python is a strongly idiomatic language: there is generally a single optimal way of doing something (a programming idiom), rather than many ways: “There’s more than one way to do it” is not a Python motto.

This section starts with some general principles, then goes through the language, highlighting how to idiomatically use operations, data types, and modules in the standard library.

Principles

Use exceptions for error-checking, following EAFP (It’s Easier to Ask Forgiveness than Permission) instead of LBYL (Look Before You Leap): put an action that may fail inside a try...except block.

Use context managers for managing resources, like files. Use finally for ad hoc cleanup, but prefer to write a context manager to encapsulate this.
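
One way to encapsulate cleanup in a context manager is contextlib.contextmanager from the standard library, which wraps a try...finally around a yield. In this Python 3 sketch, tracked and the log list are illustrative names standing in for a real resource.

```python
from contextlib import contextmanager


@contextmanager
def tracked(log, name):
    log.append("acquire " + name)
    try:
        yield name                     # the body of the with block runs here
    finally:
        log.append("release " + name)  # runs even if the body raises


log = []
try:
    with tracked(log, "resource") as res:
        log.append("use " + res)
        raise RuntimeError("failure inside the block")
except RuntimeError:
    log.append("handled")
```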

Use properties, not getter/setter pairs.
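
As a brief Python 3 sketch of this principle, the made-up Temperature class below validates assignments and exposes a derived read-only value, while callers keep using plain attribute syntax.

```python
class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius  # goes through the setter below

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        if value < -273.15:
            raise ValueError("below absolute zero")
        self._celsius = value

    @property
    def fahrenheit(self):
        # Read-only derived value: there is no setter for it.
        return self._celsius * 9 / 5 + 32


temp = Temperature(100)
```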

Use dictionaries for dynamic records, classes for static records (for simple classes, use collections.namedtuple): if a record always has the same fields, make this explicit in a class; if the fields may vary (be present or not), use a dictionary.

Use _ for throwaway variables, like discarding a return value when a tuple is returned, or to indicate that a parameter is being ignored (when required for an interface, say). You can use *_, **__ to discard positional or keyword arguments passed to a function: these correspond to the usual *args, **kwargs parameters, but explicitly discarded. You can also use these in addition to positional or named parameters (following the ones you use), allowing you to use some and discard any excess ones.
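
A few of these uses, sketched in Python 3 (the starred assignment is not available in Python 2); first_and_rest and on_event are hypothetical names.

```python
def first_and_rest(seq):
    first, *_ = seq          # keep the first item, discard the others
    return first


def on_event(event, *_, **__):
    # Hypothetical callback: the interface passes extra arguments we ignore.
    return "got " + event


quotient, _ = divmod(17, 5)  # only the quotient is needed
```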

Use implicit True/False (truthy/falsy values), except when needing to distinguish between falsy values, like None, 0, and [], in which case use an explicit check like is None or == 0.

Use the optional else clause after try, for, while not just if.
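
For loops, the else clause runs only when the loop ends without break, which suits searches. A minimal sketch, with first_negative as an illustrative name:

```python
def first_negative(numbers):
    for n in numbers:
        if n < 0:
            found = n
            break
    else:
        # Runs only when the loop finished without hitting break.
        found = None
    return found
```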

Imports

For very robust code, only import modules, not names (like functions or classes), as this creates a new (name) binding, which is not necessarily in sync with the existing binding.[1] For example, given a module m which defines a function f, importing the function with from m import f means that m.f and f can differ if either is assigned to (creating a new binding).

In practice, this is frequently ignored, particularly for small-scale code, as changing a module post-import is rare, so this is rarely a problem, and both classes and functions are imported from modules so they can be referred to without a prefix. However, for robust, large-scale code, this is an important rule, as it risks creating very subtle bugs.
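
The effect can be reproduced without any files by building a throwaway module in memory. In this sketch, the plain assignment f = m.f stands in for what from m import f does (it binds a second name to the same object); m and f are placeholder names.

```python
import types

# Build a throwaway module in memory.
m = types.ModuleType("m")
m.f = lambda: "original"

f = m.f                    # equivalent binding to "from m import f"
m.f = lambda: "patched"    # later rebinding inside the module

via_module = m.f()
via_name = f()
```

After the rebinding, the two names refer to different functions, which is the out-of-sync situation described above.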

For robust code with low typing, one can use a renaming import to abbreviate a long module name:

import module_with_very_long_name as vl
vl.f()  # easier than module_with_very_long_name.f, but still robust

Note that importing submodules (or subpackages) from a package using from is completely fine:

from p import sm  # completely fine
sm.f()

Operations

Swap values

b, a = a, b

Attribute access on nullable value

To access an attribute (esp. to call a method) on a value that might be an object, or might be None, use the boolean short-circuiting of and:

a and a.x
a and a.f()

Particularly useful for regex matches:

match and match.group(0)

in

in can be used for substring checking.

Data types

All sequence types

Indexing during iteration

Use enumerate() if you need to keep track of iteration cycles over an iterable:

for i, x in enumerate(l):
    # ...

Anti-idiom:

for i in range(len(l)):
    x = l[i]  # why did you go from list to numbers back to the list?
    # ...

Finding first matching element

Python sequences do have an index method, but this returns the index of the first occurrence of a specific value in the sequence. To find the first occurrence of a value that satisfies a condition, instead, use next and a generator expression:

try:
    x = next(i for i, n in enumerate(l) if n > 0)
except StopIteration:
    print('No positive numbers')
else:
    print('The index of the first positive number is', x)

If you need the value, not the index of its occurrence, you can get it directly through:

try:
    x = next(n for n in l if n > 0)
except StopIteration:
    print('No positive numbers')
else:
    print('The first positive number is', x)

The reason for this construct is twofold:

  • Exceptions let you signal “no match found” (they solve the semipredicate problem): since you’re returning a single value (not an index), this can’t be returned in the value.
  • Generator expressions let you use an expression without needing a lambda or introducing new grammar.

Truncating

For mutable sequences, use del, instead of reassigning to a slice:

del l[j:]
del l[:i]

Anti-idiom:

l = l[:j]
l = l[i:]

The simplest reason is that del makes your intention clear: you’re truncating.

More subtly, slicing builds a new list holding the selected elements and rebinds the variable to it; the original list is left untouched and is reclaimed only once nothing else refers to it. Deleting instead modifies the existing list in place, so no copy is made, every other reference to the list sees the truncation, and Python can release the deleted elements right away.

In some cases you do want 2 slices of the same list – though this is rare in basic programming, other than iterating once over a slice in a for loop – but it’s rare that you’ll want to make a slice of a whole list, then replace the original list variable with a slice (but not change the other slice!), as in the following funny-looking code:

m = l
l = l[i:j]  # why not m = l[i:j] ?
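
The difference between the two forms shows up as soon as a second name refers to the list. A short sketch with illustrative variable names:

```python
original = [0, 1, 2, 3, 4, 5]
alias = original

original = original[:3]   # builds a new list and rebinds the name
rebind_result = (original, alias)

items = [0, 1, 2, 3, 4, 5]
view = items
del items[3:]             # truncates the one shared list in place
```

After the slice assignment, alias still holds all six elements, whereas after del both items and view see the shortened list.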

Sorted list from an iterable

You can create a sorted list directly from any iterable, without needing to first make a list and then sort it. These include sets and dictionaries (iterate on the keys):

s = {1, 'a', ...}
l = sorted(s)
d = {'a': 1, ...}
l = sorted(d)

Tuples

Use tuples for constant sequences. This is rarely necessary (primarily when using as keys in a dictionary), but makes intention clear.

Strings

Substring

Use in for substring checking.

However, do not use in to check if a string is a single-character match, since it matches substrings and will return spurious matches – instead use a tuple of valid values. For example, the following is wrong:

def valid_sign(sign):
    return sign in '+-'  # wrong, returns true for sign == '+-'

Instead, use a tuple:

def valid_sign(sign):
    return sign in ('+', '-')

Building a string

To make a long string incrementally, build a list and then join it with '' – or with newlines, if building a text file (don’t forget the final newline in this case!). This is faster and clearer than appending to a string, which is often slow. (In principle, appending can cost O(nk) for a string of overall length n built from k additions, which is O(n²) if the pieces are of similar sizes.)

However, there are some optimizations in some versions of CPython that make simple string appending fast – string appending in CPython 2.5+, and bytestring appending in CPython 3.0+ are fast, but for building Unicode strings (unicode in Python 2, string in Python 3), joining is faster. If doing extensive string manipulation, be aware of this and profile your code. See Performance Tips: String Concatenation and Concatenation Test Code for details.

Don’t do this:

s = ''
for x in l:
    # this makes a new string every iteration, because strings are immutable
    s += x

Instead:

# ...
# l.append(x)
s = ''.join(l)

You can even use generator expressions, which are extremely efficient:

s = ''.join(f(x) for x in l)

If you do want a mutable string-like object, you can use StringIO (the StringIO module in Python 2, io.StringIO in Python 3).
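
Both approaches, joining a list and writing to a StringIO buffer, are sketched below for Python 3, using a made-up list of lines and adding the final newline mentioned earlier.

```python
import io

pieces = ["alpha", "beta", "gamma"]

# Join a list of lines, adding the final newline by hand.
joined = "\n".join(pieces) + "\n"

# Write the same lines into an in-memory text buffer.
buf = io.StringIO()
for p in pieces:
    buf.write(p)
    buf.write("\n")
buffered = buf.getvalue()
```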

Dictionaries

To iterate through a dictionary, either keys, values, or both:

# Iterate over keys
for k in d:
    ...

# Iterate over values, Python 3
for v in d.values():
    ...

# Iterate over values, Python 2
# In Python 2, dict.values() returns a copy
for v in d.itervalues():
    ...

# Iterate over keys and values, Python 3
for k, v in d.items():
    ...

# Iterate over keys and values, Python 2
# In Python 2, dict.items() returns a copy
for k, v in d.iteritems():
    ...

Anti-patterns:

for k, _ in d.items():  # instead: for k in d:
    ...
for _, v in d.items():  # instead: for v in d.values()
    ...

To supply a default value for a missing key, dict.setdefault can be used, but it is usually better to use collections.defaultdict.
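
The two approaches named above, setdefault and collections.defaultdict, can be sketched side by side. This is only an illustration; the word list and variable names are made up for the example:

```python
from collections import defaultdict

words = ["apple", "avocado", "banana", "blueberry", "cherry"]

# With setdefault: insert an empty list the first time a key is seen.
by_letter = {}
for w in words:
    by_letter.setdefault(w[0], []).append(w)

# With defaultdict: the factory (list) is called for each missing key.
by_letter_dd = defaultdict(list)
for w in words:
    by_letter_dd[w[0]].append(w)
```

Both loops build the same mapping; the defaultdict version avoids constructing a throwaway empty list on every iteration.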

dict.get is useful, but using dict.get and then checking if it is None as a way of testing if the key is in the dictionary is an anti-idiom, as None is a potential value, and whether the key is in the dictionary can be checked directly. It’s ok to use get and compare with None if this is not a potential value, however.

Simple:

if 'k' in d:
    # ... d['k']

Anti-idiom (unless None is not a potential value):

v = d.get('k')
if v is not None:
    # ... v
Dict from parallel sequences of keys and values

Use zip as: dict(zip(keys, values))
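
A short sketch of this idiom, with illustrative data. Note that zip stops at the end of the shorter sequence, so surplus keys are silently dropped:

```python
keys = ["a", "b", "c"]
values = [1, 2, 3]

d = dict(zip(keys, values))        # pairs are taken in order
short = dict(zip(keys, [10, 20]))  # "c" has no partner and is dropped
```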

Modules

re

Match if found, else None:

match = re.match(r, s)
return match and match.group(0)

…returns None if no match, and the match contents if there is one.
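
Wrapped in a function, the idiom might look like the following sketch; the function name first_match is hypothetical and not part of the re module:

```python
import re

def first_match(pattern, s):
    """Return the text matched at the start of s, or None."""
    match = re.match(pattern, s)
    # `and` short-circuits: None is returned as-is when there is no match.
    return match and match.group(0)

found = first_match(r"\d+", "123abc")
missing = first_match(r"\d+", "abc")
```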



Decorators


Duplicated code is recognized as bad practice in software for lots of reasons, not least of which is that it requires more work to maintain. If you have the same algorithm operating twice on different pieces of data you can put the algorithm in a function and pass in the data to avoid having to duplicate the code. However, sometimes you find cases where the code itself changes, but two or more places still have significant chunks of duplicated boilerplate code. A typical example might be logging:

def multiply(a, b):
    result = a * b
    log("multiply has been called")
    return result

def add(a, b):
    result = a + b
    log("add has been called")
    return result

In a case like this, it's not obvious how to factor out the duplication. We can follow our earlier pattern of moving the common code to a function, but calling the function with different data is not enough to produce the different behavior we want (add or multiply). Instead, we have to pass a function to the common function. This involves a function that operates on a function, known as a higher-order function.

A decorator in Python is syntactic sugar for applying a higher-order function to a function or method definition.

Minimal example of property decorator:

>>> class Foo(object):
...     @property
...     def bar(self):
...         return 'baz'
...
>>> F = Foo()
>>> print F.bar
baz

The above example is really just syntactic sugar for code like this:

>>> class Foo(object):
...     def bar(self):
...         return 'baz'
...     bar = property(bar)
...
>>> F = Foo()
>>> print F.bar
baz

Minimal Example of generic decorator:

>>> def decorator(f):
...     def called(*args, **kargs):
...         print 'A function is called somewhere'
...         return f(*args, **kargs)
...     return called
...
>>> class Foo(object):
...     @decorator
...     def bar(self):
...         return 'baz'
...
>>> F = Foo()
>>> print F.bar()
A function is called somewhere
baz

A good use for the decorators is to allow you to refactor your code so that common features can be moved into decorators. Consider for example, that you would like to trace all calls to some functions and print out the values of all the parameters of the functions for each invocation. Now you can implement this in a decorator as follows:

# Define the Trace class that will be
# invoked using decorators
class Trace(object):
    def __init__(self, f):
        self.f = f

    def __call__(self, *args, **kwargs):
        print "entering function " + self.f.__name__
        for i, arg in enumerate(args):
            print "arg {0}: {1}".format(i, arg)

        return self.f(*args, **kwargs)

Then you can use the decorator on any function that you defined by:

@Trace
def sum(a, b):
    print "inside sum"
    return a + b

On running this code you would see output like

>>> sum(3,2)
entering function sum
arg 0: 3
arg 1: 2
inside sum
5

Alternately, instead of creating the decorator as a class, you could have used a function as well.

def Trace(f):
    def my_f(*args, **kwargs):
        print "entering " +  f.__name__
        result= f(*args, **kwargs)
        print "exiting " +  f.__name__
        return result
    my_f.__name__ = f.__name__
    my_f.__doc__ = f.__doc__
    return my_f

#An example of the trace decorator
@Trace
def sum(a, b):
    print "inside sum"
    return a + b

#if you run this you should see
>>> sum(3,2)
entering sum
inside sum
exiting sum
5

Remember it is good practice to return the function or a sensible decorated replacement for the function so that decorators can be chained.
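
As a sketch of chaining, the example below stacks two decorators on one function. It uses functools.wraps from the standard library to copy the name and docstring onto each wrapper rather than assigning them by hand; the decorator names trace and double_result and the calls list are illustrative only:

```python
import functools

calls = []   # illustrative log of traced calls

def trace(f):
    @functools.wraps(f)          # copy __name__, __doc__, etc. onto the wrapper
    def wrapper(*args, **kwargs):
        calls.append(f.__name__)
        return f(*args, **kwargs)
    return wrapper

def double_result(f):
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        return 2 * f(*args, **kwargs)
    return wrapper

@trace             # outermost: applied last
@double_result     # innermost: applied first
def add(a, b):
    """Return a + b."""
    return a + b

result = add(3, 2)
```

Because each decorator returns a callable with the original metadata, the outer decorator still sees the name add even though it actually wraps the inner wrapper.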



Context Managers


A basic issue in programming is resource management: a resource is anything in limited supply, notably file handles, network sockets, locks, etc., and a key problem is making sure these are released after they are acquired. If they are not released, you have a resource leak, and the system may slow down or crash. More generally, you may want cleanup actions to always be done, other than simply releasing resources.

Python provides special syntax for this in the with statement, which automatically manages resources encapsulated within context manager types, or more generally performs startup and cleanup actions around a block of code. You should always use a with statement for resource management. There are many built-in context manager types, including the basic example of File, and it is easy to write your own. The code is not hard, but the concepts are slightly subtle, and it is easy to make mistakes.

Basic resource management

Basic resource management uses an explicit pair of open()...close() functions, as in basic file opening and closing. Don’t do this, for the reasons we are about to explain:

f = open(filename)
# ...
f.close()

The key problem with this simple code is that it fails if there is an early return, either due to a return statement or an exception, possibly raised by called code. To fix this, ensuring that the cleanup code is called when the block is exited, one uses a try...finally clause:

f = open(filename)
try:
    # ...
finally:
    f.close()

However, this still requires manually releasing the resource, which might be forgotten, and the release code is distant from the acquisition code. The release can be done automatically by instead using with, which works because File is a context manager type:

with open(filename) as f:
    # ...

This assigns the value of open(filename) to f (this point is subtle and varies between context managers), and then automatically releases the resource, in this case calling f.close(), when the block exits.

Technical details

Newer objects are context managers (formally context manager types: subtypes, as they implement the context manager interface, which consists of __enter__(), __exit__()), and thus can be used in with statements easily (see With Statement Context Managers).

For older file-like objects that have a close method but not __exit__(), you can use the contextlib.closing wrapper. If you need to roll your own, this is very easy, particularly using the @contextlib.contextmanager decorator.[1]

Context managers work by calling __enter__() when the with context is entered, binding the return value to the target of as, and calling __exit__() when the context is exited. There’s some subtlety about handling exceptions during exit, but you can ignore it for simple use.
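
A minimal sketch of the protocol follows. The Resource class is hypothetical and merely records what happens to it, so that the order of the calls is visible:

```python
class Resource(object):
    """Hypothetical resource that records its own lifecycle."""

    def __init__(self, name):
        self.name = name
        self.events = []

    def __enter__(self):
        self.events.append("acquired")
        return self                      # bound to the target of `as`

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("released")
        return False                     # do not suppress exceptions

with Resource("demo") as r:
    r.events.append("in use")
```

After the block, r.events holds "acquired", "in use" and "released" in that order. Returning a false value from __exit__() lets any exception raised in the block propagate normally.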

More subtly, __init__() is called when an object is created, but __enter__() is called when a with context is entered.

The __init__()/__enter__() distinction is important to distinguish between single use, reusable and reentrant context managers. It’s not a meaningful distinction for the common use case of instantiating an object in the with clause, as follows:

with A() as a:
    ...

…in which case any single use context manager is fine.

However, in general the distinction matters, notably when distinguishing a reusable context manager from the resource it is managing, as here:

a_cm = A()
with a_cm as a:
   ...

Putting resource acquisition in __enter__() instead of __init__() gives a reusable context manager.

Notably, File() objects do the initialization in __init__() and then just return themselves when entering a context, as in def __enter__(self): return self. This is fine if you want the target of the as to be bound to an object (and allows you to use factories like open as the source of the with clause), but if you want it to be bound to something else, notably a handle (file name or file handle/file descriptor), you want to wrap the actual object in a separate context manager. For example:

from contextlib import contextmanager

@contextmanager
def FileName(*args, **kwargs):
    # open() is used here as the factory for File objects
    with open(*args, **kwargs) as f:
        yield f.name

For simple uses you don’t need to do any __init__() code, and only need to pair __enter__()/__exit__(). For more complicated uses you can have reentrant context managers, but that’s not necessary for simple use.

Caveats

try...finally

Note that a try...finally clause is necessary with @contextlib.contextmanager: an exception raised in the with block is re-raised inside the generator at the yield, so cleanup code placed after the yield is skipped unless it is in a finally clause. It is not necessary in __exit__(), which is called even if an exception is raised.
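
This can be sketched with a generator-based context manager whose cleanup sits in a finally clause. The name managed and the log list are illustrative only:

```python
from contextlib import contextmanager

log = []

@contextmanager
def managed(name):
    log.append("acquire " + name)
    try:
        yield name
    finally:
        # Runs even when the with block raises, because the exception
        # is re-raised inside the generator at the yield.
        log.append("release " + name)

try:
    with managed("a") as handle:
        raise RuntimeError("failure inside the block")
except RuntimeError:
    pass
```

Without the try...finally around the yield, the "release" entry would never be logged in this case.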

Context, not scope

The term context manager is carefully chosen, particularly in contrast to “scope”. Local variables in Python have function scope, and thus the target of a with statement, if any, is still visible after the block has exited, though __exit__() has already been called on the context manager (the argument of the with statement), and thus is often not useful or valid. This is a technical point, but it’s worth distinguishing the with statement context from the overall function scope.
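
A small sketch of this point, using a temporary file created only for the demonstration:

```python
import os
import tempfile

# Create a small temporary file to read back (illustrative only).
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "w") as out:
    out.write("hello\n")

with open(path) as f:
    first = f.readline()

# The name f is still bound here, but the file it refers to is closed.
is_closed = f.closed
os.remove(path)
```

Any further read on f at this point raises ValueError, since __exit__() has already closed the file.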

Generators

Generators that hold or use resources are a bit tricky.

Beware that creating generators within a with statement and then using them outside the block does not work, because generators have deferred evaluation, and thus when they are evaluated, the resource has already been released. This is most easily seen using a file, as in this generator expression to convert a file to a list of lines, stripping the end-of-line character:

with open(filename) as f:
    lines = (line.rstrip('\n') for line in f)

When lines is then used – evaluation can be forced with list(lines) – this fails with ValueError: I/O operation on closed file. This is because the file is closed at the end of the with statement, but the lines are not read until the generator is evaluated.

The simplest solution is to avoid generators, and instead use lists, such as list comprehensions. This is generally appropriate in this case (reading a file) since one wishes to minimize system calls and just read the file all at once (unless the file is very large):

with open(filename) as f:
    lines = [line.rstrip('\n') for line in f]

In case that one does wish to use a resource in a generator, the resource must be held within the generator, as in this generator function:

def stripped_lines(filename):
    with open(filename) as f:
        for line in f:
            yield line.rstrip('\n')

As the nesting makes clear, the file is kept open while iterating through it.

To release the resource, the generator must be explicitly closed, using generator.close(), just as with other objects that hold resources (this is the dispose pattern). This can in turn be automated by wrapping the generator in a context manager, using contextlib.closing, as:

from contextlib import closing

with closing(stripped_lines(filename)) as lines:
    # ...

Not RAII

Resource Acquisition Is Initialization is an alternative form of resource management, particularly used in C++. In RAII, resources are acquired during object construction, and released during object destruction. In Python the analogous functions are __init__() and __del__() (finalizer), but RAII does not work in Python, and releasing resources in __del__() does not work. This is because there is no guarantee that __del__() will be called: it’s just for memory manager use, not for resource handling.

In more detail, Python object construction is two-phase, consisting of (memory) allocation in __new__() and (attribute) initialization in __init__(). CPython manages memory primarily via reference counting, with objects being finalized (not destructed) by __del__(). However, finalization is non-deterministic (objects have non-deterministic lifetimes), and the finalizer may be called much later or not at all, particularly if the program crashes. Thus using __del__() for resource management will generally leak resources.

It is possible to use finalizers for resource management, but the resulting code is implementation-dependent (generally working in CPython but not other implementations, such as PyPy) and fragile to version changes. Even if this is done, it requires great care to ensure references drop to zero in all circumstances, including: exceptions, which contain references in tracebacks if caught or if running interactively; and references in global variables, which last until program termination. Prior to Python 3.4, finalizers on objects in cycles were also a serious problem, but this is no longer a problem; however, finalization of objects in cycles is not done in a deterministic order.

References



Reflection


A Python script can find out about the type, class, attributes and methods of an object. This is referred to as reflection or introspection. See also Metaclasses.

Reflection-enabling functions include type(), isinstance(), callable(), dir() and getattr().

Type

The type function returns the type of an object. The following tests return True:

  • type(3) is int
  • type(3.0) is float
  • type(10**10) is long # Python 2
  • type(1 + 1j) is complex
  • type('Hello') is str
  • type([1, 2]) is list
  • type([1, [2, 'Hello']]) is list
  • type({'city': 'Paris'}) is dict
  • type((1,2)) is tuple
  • type(set()) is set
  • type(frozenset()) is frozenset
  • ----
  • type(3).__name__ == "int"
  • type('Hello').__name__ == "str"
  • ----
  • import types, re, Tkinter # For the following examples
  • type(re) is types.ModuleType
  • type(re.sub) is types.FunctionType
  • type(Tkinter.Frame) is types.ClassType
  • type(Tkinter.Frame).__name__ == "classobj"
  • type(Tkinter.Frame()).__name__ == "instance"
  • type(re.compile('myregex')).__name__ == "SRE_Pattern"
  • type(type(3)) is types.TypeType

The type function disregards class inheritance: "type(3) is object" yields False while "isinstance(3, object)" yields True.

Isinstance

Determines whether an object is an instance of a class.

The following tests return True:

  • isinstance(3, int)
  • isinstance([1, 2], list)
  • isinstance(3, object)
  • isinstance([1, 2], object)
  • import Tkinter; isinstance(Tkinter.Frame(), Tkinter.Frame)
  • import Tkinter; Tkinter.Frame().__class__.__name__ == "Frame"

Note that isinstance provides a weaker condition than a comparison using type.

Function isinstance and a user-defined class (in Python 2, where Plant and Tree are old-style classes, which is why type(tree) is not Tree):

class Plant: pass                        # Dummy class
class Tree(Plant): pass                  # Dummy class derived from Plant
tree = Tree()                            # A new instance of Tree class
print isinstance(tree, Tree)             # True
print isinstance(tree, Plant)            # True
print isinstance(tree, object)           # True
print type(tree) is Tree                 # False
print type(tree).__name__ == "instance"  # True
print tree.__class__.__name__ == "Tree"  # True

Issubclass

Determines whether a class is a subclass of another class. Pertains to classes, not their instances.

class Plant: pass                        # Dummy class
class Tree(Plant): pass                  # Dummy class derived from Plant
tree = Tree()                            # A new instance of Tree class
print issubclass(Tree, Plant)            # True
print issubclass(Tree, object)           # False in Python 2
print issubclass(int, object)            # True
print issubclass(tree, Plant)            # Error - tree is not a class

Duck typing

Duck typing provides an indirect means of reflection. It is a technique that consists of using an object as if it were of the requested type, while catching exceptions that result from the object not supporting some of the features of the class or type.
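
As a rough sketch of the technique, the hypothetical function below simply tries len() on each item and handles the failure, rather than checking types up front:

```python
def total_length(items):
    """Sum len() of each item, skipping items that have no length."""
    total = 0
    for item in items:
        try:
            total += len(item)       # just try it; no isinstance() check
        except TypeError:
            pass                     # the object does not support len()
    return total

n = total_length(["abc", [1, 2], 5, {"k": 1}])
```

Here the string, list and dictionary contribute 3, 2 and 1 respectively, while the integer 5 is skipped because len() raises TypeError for it.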

Callable

For an object, determines whether it can be called. Instances of a class can be made callable by providing a __call__() method in the class.

Examples:

  • callable(2)
    • Returns False. Ditto for callable("Hello") and callable([1, 2]).
  • callable([1,2].pop)
    • Returns True, as pop without "()" returns a function object.
  • callable([1,2].pop())
    • Returns False, as [1,2].pop() returns 2 rather than a function object.
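
A sketch of a class whose instances are callable; the Greeter class is hypothetical and exists only for this illustration:

```python
class Greeter(object):
    def __init__(self, greeting):
        self.greeting = greeting

    def __call__(self, name):
        # Invoked when an instance is called like a function.
        return "%s, %s!" % (self.greeting, name)

hello = Greeter("Hello")
message = hello("World")
instance_is_callable = callable(hello)
string_is_callable = callable("Hello")
```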

Dir

Returns the list of names of attributes of an object, which includes methods. The result is somewhat heuristic and possibly incomplete, as per python.org.

Examples:

  • dir(3)
  • dir("Hello")
  • dir([1, 2])
  • import re; dir(re)
    • Lists names of functions and other objects available in the re module for regular expressions.

Getattr

Returns the value of an attribute of an object, given the attribute name passed as a string.

An example:

  • getattr(3, "imag")

The list of attributes of an object can be obtained using dir.
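
A few further uses of getattr can be sketched as follows, including the optional third argument that supplies a default for a missing attribute. The Point class is made up for the example:

```python
class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(3, 4)
coords = [getattr(p, name) for name in ("x", "y")]  # attribute names as data
z = getattr(p, "z", 0)               # default returned for a missing attribute
upper = getattr("hello", "upper")()  # look up a method by name, then call it
```

Without the default, getattr(p, "z") raises AttributeError, just as p.z would.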

Keywords

A list of Python keywords can be obtained from Python:

import keyword
pykeywords = keyword.kwlist
print keyword.iskeyword("if")      # True
print keyword.iskeyword("True")    # False in Python 2 (True in Python 3)

Built-ins

A list of Python built-in objects and functions can be obtained from Python:

print dir(__builtins__)           # Output the list
print type(__builtins__.list)     # = <type 'type'>
print type(__builtins__.open)     # = <type 'builtin_function_or_method'>
print list is __builtins__.list   # True
print open is __builtins__.open   # True



Metaclasses


In Python, classes are themselves objects. Just as other objects are instances of a particular class, classes themselves are instances of a metaclass.

Python 3

PEP 3115 defines the changes to metaclasses in Python 3. In Python 3, a metaclass can define a __prepare__ method, which is called to create the dictionary (or other mapping object) that stores the class members.[1] The metaclass's __new__ method is then called to create the new class.[2]
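
The order of these two calls can be sketched as below. This is Python 3 only syntax; the names Meta, Example, prepared_by and created_by are illustrative and not part of any API:

```python
class Meta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # Called first; returns the mapping used as the class namespace.
        return {"prepared_by": mcls.__name__}

    def __new__(mcls, name, bases, namespace, **kwargs):
        # Called after the class body has run, to build the class object.
        cls = super().__new__(mcls, name, bases, dict(namespace))
        cls.created_by = mcls.__name__
        return cls

class Example(metaclass=Meta):
    value = 42
```

The entry placed in the namespace by __prepare__ ends up as a class attribute alongside the ones defined in the class body.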

Class Factories

The simplest use of Python metaclasses is a class factory. This concept makes use of the fact that class definitions in Python are first-class objects. Such a function can create or modify a class definition, using the same syntax one would normally use in declaring a class definition. Once again, it is useful to use the model of classes as dictionaries. First, let's look at a basic class factory:

>>> def StringContainer():
...     # define a class
...     class String:
...         def __init__(self):
...             self.content_string = ""
...         def len(self):
...             return len(self.content_string)
...     # return the class definition
...     return String
...
>>> # create the class definition
... container_class = StringContainer()
>>>
>>> # create an instance of the class
... wrapped_string = container_class()
>>>
>>> # take it for a test drive
... wrapped_string.content_string = 'emu emissary'
>>> wrapped_string.len()
12

Of course, just like any other data in Python, class definitions can also be modified. Any modifications to attributes in a class definition will be seen in any instances of that definition, so long as that instance hasn't overridden the attribute that you're modifying.

>>> def DeAbbreviate(sequence_container):
...     sequence_container.length = sequence_container.len
...     del sequence_container.len
...
>>> DeAbbreviate(container_class)
>>> wrapped_string.length()
12
>>> wrapped_string.len()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: String instance has no attribute 'len'

You can also delete class definitions, but that will not affect instances of the class.

>>> del container_class
>>> wrapped_string2 = container_class()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
NameError: name 'container_class' is not defined
>>> wrapped_string.length()
12

The type Metaclass

The metaclass for all standard Python types is the "type" object.

>>> type(object)
<type 'type'>
>>> type(int)
<type 'type'>
>>> type(list)
<type 'type'>

Just like list, int and object, "type" is itself a normal Python object, and is itself an instance of a class. In this case, it is in fact an instance of itself.

>>> type(type)
<type 'type'>

It can be instantiated to create new class objects similarly to the class factory example above by passing the name of the new class, the base classes to inherit from, and a dictionary defining the namespace to use.

For instance, the code:

>>> class MyClass(BaseClass):
...     attribute = 42

Could also be written as:

>>> MyClass = type("MyClass", (BaseClass,), {'attribute' : 42})

Metaclasses

It is possible to create a class with a metaclass other than type by setting its __metaclass__ attribute when defining it (this is the Python 2 syntax). When this is done, the class and its subclasses will be created using your custom metaclass. For example:

class CustomMetaclass(type):
    def __init__(cls, name, bases, dct):
        print "Creating class %s using CustomMetaclass" % name
        super(CustomMetaclass, cls).__init__(name, bases, dct)

class BaseClass(object):
    __metaclass__ = CustomMetaclass

class Subclass1(BaseClass):
    pass

This will print

Creating class BaseClass using CustomMetaclass
Creating class Subclass1 using CustomMetaclass

By creating a custom metaclass in this way, it is possible to change how the class is constructed. This allows you to add or remove attributes and methods, register the creation of classes and subclasses, and perform various other manipulations when the class is created.
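
Registering classes as they are created might look like the following sketch. It uses the Python 3 metaclass= keyword rather than the __metaclass__ attribute, and the names RegisteringMeta, PluginBase and registry are hypothetical:

```python
registry = []

class RegisteringMeta(type):
    def __init__(cls, name, bases, dct):
        super().__init__(name, bases, dct)
        registry.append(name)          # record every class created

class PluginBase(metaclass=RegisteringMeta):
    pass

class PluginA(PluginBase):
    pass

class PluginB(PluginBase):
    pass
```

Subclasses inherit the metaclass, so defining PluginA and PluginB is enough to add them to the registry without any explicit registration call.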

References



Namespace




Performance

Since Python is an interpreted language in its most commonly used CPython implementation, it is many times slower in a variety of tasks than the most commonly used compiled non-managed languages such as C and C++; for some tasks, it is more than 100 times slower. CPython seems to be somewhat slower than Perl, another interpreted language, in multiple tasks.

Performance can be measured using benchmarks. Benchmarks are often far from representative of real-world usage and have to be taken with a grain of salt. Some benchmarks are outright wrong in that non-idiomatic code is used for a language, yielding avoidably low performance for the language.

PyPy is a just-in-time (JIT) compiler that often runs faster than CPython. Another compiler that can lead to greater speeds is Numba, which works for a subset of Python. Yet another compiler is Cython, not to be confused with CPython.
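
For small measurements of your own, the standard timeit module can be used. The sketch below times two ways of building a string; the data size and repetition count are arbitrary choices, and the resulting numbers depend on the machine and the interpreter:

```python
import timeit

setup = "data = [str(i) for i in range(1000)]"

# Time two ways of building one string; number= sets the repetitions.
t_join = timeit.timeit("''.join(data)", setup=setup, number=200)
t_loop = timeit.timeit(
    "s = ''\nfor x in data:\n    s += x", setup=setup, number=200)
```

Each call returns the total elapsed time in seconds for all repetitions, so the two values can be compared directly.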



Tips and Tricks


There are many tips and tricks you can learn in Python:

Strings

  • Triple quotes are an easy way to define a string with both single and double quotes.
  • String concatenation is expensive. Use percent formatting and str.join() for concatenation (but don't worry about this unless your resulting string is more than 500-1000 characters long):[1]

print "Spam" + " eggs" + " and" + " spam"               # DON'T DO THIS
print " ".join(["Spam","eggs","and","spam"])            # Much faster/more
                                                        # common Python idiom
print "%s %s %s %s" % ("Spam", "eggs", "and", "spam")   # Also a pythonic way of
                                                        # doing it - very fast

Optimized C modules

Several modules have optimized versions written in C, which provide an almost-identical interface and are frequently much faster or more memory-efficient than the pure Python implementations. Module behavior does differ in some respects, but the differences are often minor, and thus the C versions are frequently used.

This is primarily a Python 2.x feature, which has been largely removed in Python 3, with modules automatically using optimized implementations if available.[2] However, the cProfile/profile pair still exists (as of Python 3.4).

Importing

The C version of a module named module or Module is called cModule, and frequently imported using import...as to strip off the prefix, as:

import cPickle as pickle

For compatibility, one can try to import the C version and fall back to the Python version if the C version is not available; in this case using import...as is required, so the code does not depend on which module was imported:

try:
    import cPickle as pickle
except ImportError:
    import pickle

Examples

Notable examples include:

  • (Python 2.x) cPickle for pickle, up to 1000× faster.
  • (Python 2.x) cStringIO for StringIO, replaced by io.StringIO in Python 3
  • cProfile for profile – the Python profile adds significant overhead, and thus cProfile is recommended for most use.
  • cElementTree for ElementTree, 15–20 times faster and uses 2–5 times less memory;[3] not needed in Python 3.3+, which automatically uses a fast implementation if possible.

List comprehension and generators

  • List comprehensions and generator expressions are very useful for working with small, compact loops. Additionally, a list comprehension is usually faster than the equivalent for loop.
import os
directory = os.listdir(os.getcwd())       # Gets a list of files in the
                                          # directory the program runs from
filesInDir = [item for item in directory] # Normal for-loop rules apply; you
                                          # can add "if condition" to make a
                                          # more narrow search.
  • List comprehension and generator expression can be used to work with two (or more) lists with zip or itertools.izip
[a - b for (a,b) in zip((1,2,3), (1,2,3))]  # will return [0, 0, 0]

Data type choice

Choosing the correct data type can be critical to the performance of an application. For example, say you have 2 lists:

list1 = [{'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'e': 5, 'f': 6}]
list2 = [{'e': 5, 'f': 6}, {'g': 7, 'h': 8}, {'i': 9, 'j': 10}]

and you want to find the entries common to both lists. You could iterate over one list, checking for common items in the other:

common = []
for entry in list1:
    if entry in list2:
        common.append(entry)

For such small lists, this will work fine, but for larger lists, for example if each contains thousands of entries, the following will be more efficient, and produces the same result:

# Sort the items so that equal dictionaries always give equal tuples
set1 = set(tuple(sorted(entry.items())) for entry in list1)
set2 = set(tuple(sorted(entry.items())) for entry in list2)
common = set1.intersection(set2)
common = [dict(entry) for entry in common]

Sets are optimized for speed in such functions. Dictionaries themselves cannot be used as members of a set as they are mutable, but tuples can. If one needs to do set operations on a list of dictionaries, one can convert the items to tuples and the list to a set, perform the operation, then convert back. This is often much faster than trying to replicate set operations using string functions.

Other

  • Decorators can be used for handling common concerns like logging, db access, etc.
  • While Python has no built-in function to flatten a list you can use a recursive function to do the job quickly.
def flatten(seq, result=None):
    """flatten(seq, result=None) -> list

    Return a flat version of the iterable `seq` appended to `result`
    """
    if result is None:
        result = []
    if isinstance(seq, str):         # Treat a string as a single item;
        result.append(seq)           # iterating it would recurse forever.
        return result
    try:                             # Can `seq` be iterated over?
        for item in seq:             # If so then iterate over `seq`
            flatten(item, result)    # and make the same check on each item.
    except TypeError:                # If seq isn't iterable
        result.append(seq)           # append it to the new list.
    return result
  • To stop a Python script from closing right after you launch one independently, add this code:
print 'Hit Enter to exit'
raw_input()
  • Python already has a GUI built in: Tkinter, based on Tcl's Tk. More are available, such as PyQt4, pygtk3, and wxPython.
  • Ternary Operators:
[on_true] if [expression] else [on_false]

x, y = 50, 25

small = x if x < y else y
  • Booleans as indexes:
b = 1==1
name = "I am %s" % ["John","Doe"][b]
#returns I am Doe

References



Standard Library


The Python Standard Library is a collection of script modules accessible to a Python program to simplify the programming process and remove the need to rewrite commonly used commands. They can be used by importing them at the beginning of a script.

A list of the Standard Library modules can be found at http://www.python.org/doc/.

The following are among the most important:

  • time
  • sys
  • os
  • math
  • random
  • pickle
  • urllib
  • re
  • cgi
  • socket




Regular Expression


Python includes a module for working with regular expressions on strings. For more information about writing regular expressions and syntax not specific to Python, see the regular expressions wikibook. Python's regular expression syntax is similar to Perl's.

To start using regular expressions in your Python scripts, import the "re" module:

import re

Overview

Regular expression functions in Python at a glance:

import re
if re.search("l+","Hello"):        print 1  # Substring match suffices
if not re.match("ell.","Hello"):   print 2  # The beginning of the string has to match
if re.match(".el","Hello"):        print 3
if re.match("he..o","Hello",re.I): print 4  # Case-insensitive match
print re.sub("l+", "l", "Hello")            # Prints "Helo"; replacement AKA substitution
print re.sub(r"(.*)\1", r"\1", "HeyHey")    # Prints "Hey"; backreference
print re.sub("EY", "ey", "HEy", flags=re.I) # Prints "Hey"; case-insensitive sub
print re.sub(r"(?i)EY", r"ey", "HEy")       # Prints "Hey"; case-insensitive sub
for match in re.findall("l+.", "Hello Dolly"):
  print match                               # Prints "llo" and then "lly"
for match in re.findall("e(l+.)", "Hello Dolly"):
  print match                               # Prints "llo"; match picks group 1
for match in re.findall("(l+)(.)", "Hello Dolly"):
  print match[0], match[1]                  # The groups end up as items in a tuple
matchObj = re.match("(Hello|Hi) (Tom|Thom)","Hello Tom Bombadil")
if matchObj is not None:
  print matchObj.group(0)                   # Prints the whole match disregarding groups
  print matchObj.group(1) + matchObj.group(2) # Prints "HelloTom"

Matching and searching

One of the most common uses for regular expressions is extracting a part of a string or testing for the existence of a pattern in a string. Python offers several functions to do this.

The match and search functions do mostly the same thing, except that the match function will only return a result if the pattern matches at the beginning of the string being searched, while search will find a match anywhere in the string.

>>> import re
>>> foo = re.compile(r'foo(.{,5})bar', re.I+re.S)
>>> st1 = 'Foo, Bar, Baz'
>>> st2 = '2. foo is bar'
>>> search1 = foo.search(st1)
>>> search2 = foo.search(st2)
>>> match1 = foo.match(st1)
>>> match2 = foo.match(st2)

In this example, match2 will be None, because the string st2 does not start with the given pattern. The other 3 results will be Match objects (see below).

You can also match and search without compiling a regexp:

>>> search3 = re.search('oo.*ba', st1, re.I)

Here we use the search function of the re module, rather than of the pattern object. For most cases, it's best to compile the expression first. Not all of the re module functions support the flags argument, and if the expression is used more than once, compiling first is more efficient and leads to cleaner-looking code.

The compiled pattern object functions also have parameters for starting and ending the search, to search in a substring of the given string. In the first example in this section, match2 returns no result because the pattern does not start at the beginning of the string, but if we do:

>>> match3 = foo.match(st2, 3)

it works, because we tell it to start searching at character number 3 in the string.
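
As a small sketch of the difference, the following repeats the pattern and st2 from above and checks what match returns with and without a start position. The print calls use parentheses so that the snippet runs on both Python 2.7 and Python 3; the sliced variant is an addition, shown only for comparison.

```python
import re

foo = re.compile(r'foo(.{,5})bar', re.I | re.S)
st2 = '2. foo is bar'

no_match = foo.match(st2)     # None: st2 does not start with the pattern
match3 = foo.match(st2, 3)    # begin matching at index 3 of st2
sliced = foo.match(st2[3:])   # match against a slice of st2 instead

print(no_match)
print(match3.group())
print(match3.start())         # position is relative to the original string
print(sliced.start())         # position is relative to the slice
```

With the start position, the reported offsets still refer to the original string, which is convenient when the positions are needed later.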

What if we want to search for multiple instances of the pattern? Then we have two options. We can use the start and end position parameters of the search and match functions in a loop, getting the position to start at from the previous match object (see below), or we can use the findall and finditer functions. The findall function returns a list of matching strings (or of the captured groups, if the pattern contains any), which is useful for simple searching. For anything slightly more complex, the finditer function should be used. It returns an iterator that yields Match objects when used in a loop. For example:

>>> str3 = 'foo, Bar Foo. BAR FoO: bar'
>>> foo.findall(str3)
[', ', '. ', ': ']
>>> for match in foo.finditer(str3):
...     match.group(1)
...
', '
'. '
': '

If you're going to be iterating over the results of the search, using the finditer function is almost always a better choice.
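
The first option mentioned above, calling search in a loop and restarting from the end of the previous match, can be sketched as follows. The variable names groups and pos are illustrative only, and the loop assumes that the pattern never matches an empty string (otherwise pos would not advance).

```python
import re

foo = re.compile(r'foo(.{,5})bar', re.I | re.S)
str3 = 'foo, Bar Foo. BAR FoO: bar'

groups = []
pos = 0
while True:
    m = foo.search(str3, pos)   # search from the current position onwards
    if m is None:
        break
    groups.append(m.group(1))
    pos = m.end()               # continue after the end of this match

print(groups)
```

This collects the same groups that findall and finditer return for this pattern.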

Match objects

Match objects are returned by the search and match functions, and include information about the pattern match.

The group function returns a string corresponding to a capture group (part of a regexp wrapped in ()) of the expression, or if no group number is given, the entire match. Using the search1 variable we defined above:

>>> search1.group()
'Foo, Bar'
>>> search1.group(1)
', '

Capture groups can also be given string names using the (?P<name>...) syntax and referred to by matchobj.group('name'). For simple expressions this is unnecessary, but for more complex expressions it can be very useful.
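
A minimal sketch of named groups follows. The date pattern and the sample sentence are made up for illustration; named groups are written as (?P<name>...) inside the pattern.

```python
import re

# Hypothetical pattern: an ISO-style date with three named groups.
date_re = re.compile(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})')
m = date_re.search('Released on 2010-07-03.')

print(m.group('year'))    # look a group up by name
print(m.group(2))         # named groups still have a number
print(m.groupdict())      # all named groups as a dictionary
```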

You can also get the position of a match or a group in a string, using the start and end functions:

>>> search1.start()
0
>>> search1.end()
8
>>> search1.start(1)
3
>>> search1.end(1)
5

This returns the start and end locations of the entire match, and the start and end of the first (and in this case only) capture group, respectively.

Replacing

Another use for regular expressions is replacing text in a string. To do this in Python, use the sub function.

When called on a pattern object, sub takes up to 3 arguments: the text to replace with, the text to replace in, and, optionally, the maximum number of substitutions to make. (The module-level re.sub function takes the pattern as an additional first argument.) Unlike the matching and searching functions, sub returns a string, consisting of the given text with the substitution(s) made.

>>> import re
>>> mystring = 'This string has a q in it'
>>> pattern = re.compile(r'(a[n]? )(\w) ')
>>> newstring = pattern.sub(r"\1'\2' ", mystring)
>>> newstring
"This string has a 'q' in it"

This takes any single alphanumeric character (\w in regular expression syntax, which also matches the underscore) preceded by "a" or "an" and wraps it in single quotes. The \1 and \2 in the replacement string are backreferences to the 2 capture groups in the expression; these would be group(1) and group(2) on a Match object from a search.

The subn function is similar to sub, except it returns a tuple, consisting of the result string and the number of replacements made. Using the string and expression from before:

>>> subresult = pattern.subn(r"\1'\2' ", mystring)
>>> subresult
("This string has a 'q' in it", 1)

Replacing without constructing and compiling a pattern object:

>>> result = re.sub(r"b.*d","z","abccde")
>>> result
'aze'
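
The optional limit on the number of substitutions can be sketched as below. The sample strings are made up. The second call goes slightly beyond the text above: sub also accepts a function as the replacement, which is called with each Match object.

```python
import re

# Replace only the first two digits.
limited = re.sub(r'\d', '#', 'a1b2c3', count=2)
print(limited)

# Use a function to compute each replacement from the match.
doubled = re.sub(r'\d+', lambda m: str(int(m.group()) * 2), 'x10 y20')
print(doubled)
```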

Splitting

The split function splits a string based on a given regular expression:

>>> import re
>>> mystring = '1. First part 2. Second part 3. Third part'
>>> re.split(r'\d\.', mystring)
['', ' First part ', ' Second part ', ' Third part']
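
The result above contains an empty first item and padded strings. A possible clean-up, together with the optional maxsplit argument of split, might look like this (the names parts and first_only are illustrative):

```python
import re

mystring = '1. First part 2. Second part 3. Third part'

# Drop empty pieces and surrounding whitespace.
parts = [p.strip() for p in re.split(r'\d\.', mystring) if p.strip()]
print(parts)

# Split at most once; the rest of the string is left untouched.
first_only = re.split(r'\d\.', mystring, maxsplit=1)
print(first_only)
```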

Escaping

The escape function escapes all non-alphanumeric characters in a string. This is useful if you need to take an unknown string that may contain regexp metacharacters like ( and . and create a regular expression from it.

>>> re.escape(r'This text (and this) must be escaped with a "\" to use in a regexp.')
'This\\ text\\ \\(and\\ this\\)\\ must\\ be\\ escaped\\ with\\ a\\ \\"\\\\\\"\\ to\\ use\\ in\\ a\\ regexp\\.'
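
As an illustration of why escaping matters, the sketch below builds one pattern from an unescaped string and one from its escaped form. The string user_text stands in for arbitrary input and is made up for this example.

```python
import re

user_text = 'a.b (c)'   # hypothetical input containing metacharacters

literal_re = re.compile(re.escape(user_text))

found_exact = literal_re.search('xx a.b (c) yy') is not None
found_other = literal_re.search('xx aXb (c) yy') is not None
# Without escaping, the dot matches any character.
unescaped_hit = re.search('a.b', 'xx aXb yy') is not None

print(found_exact)
print(found_other)
print(unescaped_hit)
```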

Flags

The different flags used with regular expressions:

Abbreviation  Full name      Description
re.I          re.IGNORECASE  Makes the regexp case-insensitive
re.L          re.LOCALE      Makes the behavior of some special sequences (\w, \W, \b, \B, \s, \S) dependent on the current locale
re.M          re.MULTILINE   Makes the ^ and $ characters match at the beginning and end of each line, rather than just the beginning and end of the string
re.S          re.DOTALL      Makes the . character match every character including newlines
re.U          re.UNICODE     Makes \w, \W, \b, \B, \d, \D, \s, \S dependent on Unicode character properties
re.X          re.VERBOSE     Ignores whitespace except when in a character class or preceded by a non-escaped backslash, and ignores # (except when in a character class or preceded by a non-escaped backslash) and everything after it to the end of a line, so it can be used as a comment. This allows for cleaner-looking regexps.
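
A short sketch of three of these flags in use follows. The patterns and sample text are made up; flags are combined here with the | operator, which has the same effect as the + used elsewhere in this chapter as long as each flag appears only once.

```python
import re

# re.X allows the pattern to be spread over lines and commented.
number_re = re.compile(r"""
    (\d+)    # integer part
    \.       # a literal decimal point
    (\d+)    # fractional part
    """, re.X)
number_groups = number_re.search('pi is about 3.14159').groups()
print(number_groups)

text = 'one two\nTHREE four'
first_words_plain = re.findall(r'^\w+', text)          # ^ matches only at the start of the string
first_words_multi = re.findall(r'^\w+', text, re.M)    # ^ matches at the start of each line
t_words = re.findall(r'^t\w+', text, re.I | re.M)      # two flags combined
print(first_words_plain)
print(first_words_multi)
print(t_words)
```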

Pattern objects

If you're going to be using the same regexp more than once in a program, or if you just want to keep the regexps separated somehow, you should create a pattern object, and refer to it later when searching/replacing.

To create a pattern object, use the compile function.

import re
foo = re.compile(r'foo(.{,5})bar', re.I+re.S)

The first argument is the pattern, which matches the string "foo", followed by up to 5 of any character, then the string "bar", storing the middle characters in a group (see the section on Match objects). The second, optional, argument is the flag or flags to modify the regexp's behavior. The flags themselves are simply variables referring to an integer used by the regular expression engine. In other languages, these would be constants, but Python does not have constants. In older Python versions, some of the regular expression functions do not support adding flags as a parameter when defining the pattern directly in the function; if you need any of the flags there, it is best to use the compile function to create a pattern object.

The r preceding the expression string indicates that it should be treated as a raw string. This should normally be used when writing regexps, so that backslashes are interpreted literally rather than having to be escaped.
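
The effect of the r prefix can be seen in a small sketch. The sample text is made up; the key point is that in an ordinary string literal "\b" is a backspace character, while in a raw string it reaches the regexp engine as the word-boundary sequence.

```python
import re

plain_len = len('\n')      # one character: a newline
raw_len = len(r'\n')       # two characters: a backslash and the letter n
same = ('\\d+' == r'\d+')  # the raw string avoids doubling the backslash

with_raw = re.findall(r'\bcat\b', 'cat concat cat')
without_raw = re.findall('\bcat\b', 'cat concat cat')   # looks for backspace characters

print(plain_len)
print(raw_len)
print(same)
print(with_raw)
print(without_raw)
```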


External commands

The traditional way of executing external commands is using os.system():

import os
os.system("dir")
os.system("echo Hello")
exitCode = os.system("echotypo")

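The modern way, since Python 2.4, is using the subprocess module:

import subprocess
# On Windows, echo and dir are shell built-ins, so these calls need shell=True there
subprocess.call(["echo", "Hello"])
exitCode = subprocess.call(["dir", "nonexistent"])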
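
To see the exit code without depending on which commands a platform provides, one possible sketch runs the Python interpreter itself as the external command. Using sys.executable this way is a choice made for this example, not something the text above prescribes.

```python
import subprocess
import sys

# A child process that exits normally.
ok_code = subprocess.call([sys.executable, "-c", "pass"])
# A child process that exits with status 3.
bad_code = subprocess.call([sys.executable, "-c", "import sys; sys.exit(3)"])

print(ok_code)
print(bad_code)
```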

The traditional way of executing external commands and reading their output is via the popen2 module (deprecated since Python 2.6 and removed in Python 3):

import popen2
readStream, writeStream, errorStream = popen2.popen3("dir")
# allLines = readStream.readlines()
for line in readStream:
  print line.rstrip()
readStream.close()
writeStream.close()
errorStream.close()

The modern way, since Python 2.4, is using the subprocess module:

import subprocess
process = subprocess.Popen(["echo","Hello"], stdout=subprocess.PIPE)
for line in process.stdout:
  print line.rstrip()
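
When the whole output is wanted at once, the communicate method of the Popen object can be used instead of iterating over stdout. The sketch below again runs the Python interpreter as the child process, which is an assumption made to keep the example portable; on Python 3 the captured output is a bytes object.

```python
import subprocess
import sys

process = subprocess.Popen(
    [sys.executable, "-c", "print('Hello')"],
    stdout=subprocess.PIPE)
output, _ = process.communicate()   # waits for the child and reads all of its output

print(output.strip())
print(process.returncode)
```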

Keywords: system commands, shell commands, processes, backtick, pipe.


XML Tools


Introduction

Python includes several modules for manipulating XML.


xml.sax.handler

import xml.sax.handler as saxhandler
import xml.sax as saxparser

class MyReport:
    def __init__(self):
        self.Y = 1


class MyCH(saxhandler.ContentHandler):
    def __init__(self, report):
        saxhandler.ContentHandler.__init__(self)   # initialize the base handler
        self.X = 1
        self.report = report

    def startDocument(self):
        print 'startDocument'

    def startElement(self, name, attrs):
        print 'Element:', name

report = MyReport()          #for future use
ch = MyCH(report)

xml = """\
<collection>
  <comic title=\"Sandman\" number='62'>
     <writer>Neil Gaiman</writer>
     <penciller pages='1-9,18-24'>Glyn Dillon</penciller>
     <penciller pages="10-17">Charles Vess</penciller>
  </comic>
</collection>
"""

print xml

saxparser.parseString(xml, ch)
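
A handler usually collects data rather than just printing element names. The following sketch gathers the text of every penciller element from a shortened copy of the document above. The class name PencillerCollector and its attributes are hypothetical; startElement, characters and endElement are the standard ContentHandler callbacks.

```python
import xml.sax
import xml.sax.handler

class PencillerCollector(xml.sax.handler.ContentHandler):
    """Collects the text content of <penciller> elements."""

    def __init__(self):
        xml.sax.handler.ContentHandler.__init__(self)
        self.pencillers = []
        self._buffer = None          # a list while inside <penciller>, else None

    def startElement(self, name, attrs):
        if name == 'penciller':
            self._buffer = []

    def characters(self, content):
        # May be called several times for one run of text.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == 'penciller' and self._buffer is not None:
            self.pencillers.append(''.join(self._buffer))
            self._buffer = None

document = b"""<collection>
  <comic title="Sandman" number="62">
     <writer>Neil Gaiman</writer>
     <penciller pages="1-9,18-24">Glyn Dillon</penciller>
     <penciller pages="10-17">Charles Vess</penciller>
  </comic>
</collection>
"""

collector = PencillerCollector()
xml.sax.parseString(document, collector)
print(collector.pencillers)
```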

xml.dom.minidom

An example of doing RSS feed parsing with DOM

from xml.dom import minidom as dom
import urllib2

def fetchPage(url):
    a = urllib2.urlopen(url)
    return a.read()              # read the whole response body as one string

def extract(page):
    a = dom.parseString(page)
    item = a.getElementsByTagName('item')
    for i in item:
        if i.hasChildNodes():
            t = i.getElementsByTagName('title')[0].firstChild.wholeText
            l = i.getElementsByTagName('link')[0].firstChild.wholeText
            d = i.getElementsByTagName('description')[0].firstChild.wholeText
            print t, l, d

if __name__=='__main__':
    page = fetchPage("http://rss.slashdot.org/Slashdot/slashdot")
    extract(page)
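
The same DOM calls can be tried without a network connection by parsing a string. The feed below is a made-up, minimal RSS fragment, and item_titles is a hypothetical helper; it also guards against items that lack a title, which the example above does not.

```python
from xml.dom import minidom

# Made-up feed content for illustration.
rss = """<rss version="2.0"><channel>
<title>Example feed</title>
<item><title>First</title><link>http://example.com/1</link></item>
<item><title>Second</title><link>http://example.com/2</link></item>
<item><link>http://example.com/3</link></item>
</channel></rss>"""

def item_titles(text):
    """Return the title text of every <item> that has one."""
    doc = minidom.parseString(text)
    titles = []
    for item in doc.getElementsByTagName('item'):
        title_nodes = item.getElementsByTagName('title')
        if title_nodes and title_nodes[0].firstChild is not None:
            titles.append(title_nodes[0].firstChild.data)
    return titles

titles = item_titles(rss)
print(titles)
```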

XML document provided by pyxml documentation.



Email


Python includes several modules in the standard library for working with emails and email servers.

Sending mail

Sending mail is done with Python's smtplib using an SMTP (Simple Mail Transfer Protocol) server. Actual usage varies depending on the complexity of the email and the settings of the email server; the instructions here are based on sending email through Google's Gmail.

The first step is to create an SMTP object; each object is used for a connection with one server.

import smtplib
server = smtplib.SMTP('smtp.gmail.com', 587)

The first argument is the server's hostname, the second is the port. The port used varies depending on the server.

Next, we need to do a few steps to set up the proper connection for sending mail.

server.ehlo()
server.starttls()
server.ehlo()

These steps may not be necessary depending on the server you connect to. ehlo() is used for ESMTP servers; for non-ESMTP servers, use helo() instead. See Wikipedia's article about the SMTP protocol for more information about this. The starttls() function starts Transport Layer Security mode, which is required by Gmail. Other mail systems may not use this, or it may not be available.

Next, log in to the server:

server.login("youremailusername", "password")

Then, send the mail:

msg = "\nHello!" # The \n separates the message from the headers (which we ignore for this example)
server.sendmail("you@gmail.com", "target@example.com", msg)

Note that this is a rather crude example; it doesn't include a subject or any other headers. For that, one should use the email package.

The email package

Python's email package contains many classes and functions for composing and parsing email messages; this section only covers a small subset useful for sending emails.

We start by importing only the classes we need; this also saves us from having to use the full module name later.

from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart

Then we compose some of the basic message headers:

fromaddr = "you@gmail.com"
toaddr = "target@example.com"
msg = MIMEMultipart()
msg['From'] = fromaddr
msg['To'] = toaddr
msg['Subject'] = "Python email"

Next, we attach the body of the email to the MIME message:

body = "Python test mail"
msg.attach(MIMEText(body, 'plain'))

To send the mail, we have to convert the object to a string and then use the same procedure as above to send it using the SMTP server.

import smtplib
server = smtplib.SMTP('smtp.gmail.com', 587)
server.ehlo()
server.starttls()
server.ehlo()
server.login("youremailusername", "password")
text = msg.as_string()
server.sendmail(fromaddr, toaddr, text)

If we look at the text, we can see that it has added all the headers and structure necessary for a MIME-formatted email. See MIME for more details on the standard:

The full text of our example message
>>> print text
Content-Type: multipart/mixed; boundary="===============1893313573=="
MIME-Version: 1.0
From: you@gmail.com
To: target@example.com
Subject: Python email

--===============1893313573==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit

Python test mail
--===============1893313573==--
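
The generated text can also be checked without contacting a mail server by parsing it back with the email package. The sketch below rebuilds the message from this section and reads the subject and body from the parsed copy; the names parsed and body are illustrative.

```python
import email
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart

msg = MIMEMultipart()
msg['From'] = "you@gmail.com"
msg['To'] = "target@example.com"
msg['Subject'] = "Python email"
msg.attach(MIMEText("Python test mail", 'plain'))

text = msg.as_string()

# Parse the string back into a message object and inspect it.
parsed = email.message_from_string(text)
body = parsed.get_payload()[0].get_payload()   # first part of the multipart message

print(parsed['Subject'])
print(parsed.is_multipart())
print(body.strip())
```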




Threading


Threading in Python is used to run multiple threads (tasks, function calls) at the same time. Note that this does not mean that they are executed on different CPUs. Python threads will NOT make your program faster if it already uses 100% CPU time. In that case, you probably want to look into parallel programming.

Python threads are used in cases where the execution of a task involves some waiting. One example would be interaction with a service hosted on another computer, such as a webserver. Threading allows Python to execute other code while waiting; this is easily simulated with the sleep function.

Examples

A Minimal Example with Function Call

Make a thread that prints the numbers from 1 to 10, waiting 1 second before each:

import threading
import time

def loop1_10():
    for i in range(1, 11):
        time.sleep(1)
        print(i)

threading.Thread(target=loop1_10).start()
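
The example above starts the thread and never waits for it. When the main program needs the results, it can call join on each thread. The following sketch assumes a made-up worker function that squares a number after a short pause; the shared list is protected with a Lock.

```python
import threading
import time

results = []
results_lock = threading.Lock()

def worker(n):
    time.sleep(0.05)              # pretend to wait for something
    with results_lock:            # only one thread appends at a time
        results.append(n * n)

threads = [threading.Thread(target=worker, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()                      # block until this thread has finished

print(sorted(results))
```

The order in which the threads append is not guaranteed, which is why the list is sorted before printing.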

A Minimal Example with Object

#!/usr/bin/env python
from __future__ import print_function       #should be on the top
import threading
import time


class MyThread(threading.Thread):
    def run(self):
        print("{} started!".format(self.getName()))              # "Thread-x started!"
        time.sleep(1)                                      # Pretend to work for a second
        print("{} finished!".format(self.getName()))             # "Thread-x finished!"

if __name__ == '__main__':
    for x in range(4):                                     # Four times...
        mythread = MyThread(name = "Thread-{}".format(x + 1))  # ...Instantiate a thread and pass a unique ID to it
        mythread.start()                                   # ...Start the thread
        time.sleep(.9)                                     # ...Wait 0.9 seconds before starting another

This should output:

Thread-1 started!
Thread-2 started!
Thread-1 finished!
Thread-3 started!
Thread-2 finished!
Thread-4 started!
Thread-3 finished!
Thread-4 finished! 


Note: there seems to be a problem with this if you replace sleep(1) with sleep(2) and change range(4) to range(10): "Thread-2 finished!" is then printed as the first line, before the thread has even started. In Wing IDE, NetBeans and Eclipse, the output is fine.



Sockets


HTTP Client

Make a very simple HTTP client

import socket
s = socket.socket()
s.connect(('localhost', 80))
s.send(b'GET / HTTP/1.1\r\nHost: localhost\r\n\r\n')  # HTTP header lines end with \r\n
s.recv(40000) # receive up to 40000 bytes
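
The client above needs a web server listening on port 80. As a self-contained sketch, the following starts a throwaway server on a free local port in a background thread and then talks to it with the same kind of client. The helper serve_once and its canned response are made up for this example, and the server assumes the whole request arrives in a single recv call.

```python
import socket
import threading

def serve_once(listener):
    """Accept one connection, read the request, send a fixed response."""
    conn, _ = listener.accept()
    try:
        conn.recv(4096)
        conn.sendall(b'HTTP/1.1 200 OK\r\n'
                     b'Content-Length: 2\r\n'
                     b'Connection: close\r\n\r\nOK')
    finally:
        conn.close()

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))          # port 0: let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]
server_thread = threading.Thread(target=serve_once, args=(listener,))
server_thread.start()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(('127.0.0.1', port))
client.sendall(b'GET / HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n')

chunks = []
while True:                               # read until the server closes the connection
    data = client.recv(4096)
    if not data:
        break
    chunks.append(data)
client.close()
server_thread.join()
listener.close()

response = b''.join(chunks)
status_line = response.split(b'\r\n', 1)[0]
print(status_line)
```

Reading in a loop matters because a single recv call may return only part of the response.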

NTP/Sockets

Connecting to and reading an NTP time server, returning the time as follows

ntpps       picoseconds portion of time
ntps        seconds portion of time
ntpms       milliseconds portion of time
ntpt        64-bit ntp time, seconds in upper 32-bits, picoseconds in lower 32-bits
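
The following sketch does not contact a server; it only shows how an 8-byte NTP timestamp could be split into the values listed above. It assumes the standard NTP layout, in which the upper 32 bits hold the seconds since 1900 and the lower 32 bits hold a binary fraction of a second (so the sub-second values are derived from that fraction). The function split_ntp_timestamp and the sample value are hypothetical.

```python
import struct

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_DELTA = 2208988800

def split_ntp_timestamp(raw8):
    """Split 8 bytes in network byte order into (ntpt, ntps, ntpms, ntpps)."""
    seconds, fraction = struct.unpack('!II', raw8)
    ntpt = (seconds << 32) | fraction        # full 64-bit timestamp
    ntps = seconds                           # seconds portion
    ntpms = fraction * 1000 // 2**32         # fraction expressed in milliseconds
    ntpps = fraction * 10**12 // 2**32       # fraction expressed in picoseconds
    return ntpt, ntps, ntpms, ntpps

# Sample value: the Unix epoch plus half a second.
sample = struct.pack('!II', NTP_UNIX_DELTA, 2**31)
ntpt, ntps, ntpms, ntpps = split_ntp_timestamp(sample)

print(ntps - NTP_UNIX_DELTA)   # seconds since the Unix epoch
print(ntpms)
```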



GUI Programming


There are various GUI toolkits usable from Python.

Tkinter

Tkinter is a Python wrapper for Tcl/Tk providing a cross-platform GUI toolkit. On Windows, it comes bundled with Python; on other operating systems, it can be installed. The set of available widgets is smaller than in some other toolkits, but since Tkinter widgets are extensible, many of the missing compound widgets, such as a combo box and a scrolling pane, can be built from the existing ones.

A minimal example:

from Tkinter import *
root = Tk()
frame = Frame(root)
frame.pack()
label = Label(frame, text="Hey there.")
label.pack()
quitButton = Button(