Language basics¶

This chapter starts with a short tutorial to get you familiar with Python. You will quickly see the similarities with whatever programming language you already know. After this introduction we will formalize things and name them (semantics). As we discussed last week, using clear semantics is essential to understanding software documentation and to "asking questions the right way" in search engines.

Python objects have a type (synonym: data type). In the previous tutorial, you used built-in types exclusively. Built-in data types are directly available in the interpreter, as opposed to other data types, which may be obtained either by importing them (e.g. from collections import OrderedDict) or by creating new data types yourself.

Asking for the type of an object¶

type(1)

int

a = 'Hello'
type(a)

str

Exercise: add a print call in the statement above to see the difference with ipython's simplified print. What is the type of type, by the way?

Numeric types¶

There are three distinct numeric types: integers (int), floating point numbers (float), and complex numbers (complex). We will talk about these in more detail in the numerics chapter.
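As a quick illustration of the three numeric types, here is a minimal sketch (the variable names are made up for this example):

```python
# Hypothetical variable names, chosen for illustration only
n_students = 42        # int
temperature = 21.5     # float
z = 1 + 2j             # complex: "j" marks the imaginary part

print(type(n_students), type(temperature), type(z))

# Mixing an int and a float in an operation gives a float
total = n_students + temperature
print(total, type(total))
```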

Booleans¶

There is a built-in boolean data type (bool) useful to test for truth value. Examples:

type(True), type(False)

(bool, bool)

type(a == 'Hello')

bool

3 < 5

True

Note that there are other rules about testing for truth in Python: for example, empty strings and empty containers are considered false. This is quite convenient if you want to avoid operating on invalid or empty containers:

if '':
    print('This should not happen')

In Python, like in C, any non-zero integer value is true; zero is false:

if 1 and 2:
    print('This will happen')

This will happen
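The truth-testing rules can be explored with the built-in function bool(), which returns the truth value that an if statement would see. The following is a small sketch with values I picked for illustration, not an exhaustive list:

```python
# Values that are considered false: empty containers, zeros and None
falsy = ['', [], (), {}, set(), 0, 0.0, None]
# Values that are considered true: anything non-empty or non-zero
truthy = ['a', [0], (0,), {'k': 0}, {0}, 1, -1.5]

for x in falsy:
    print(repr(x), bool(x))   # repr() shows quotes around strings
for x in truthy:
    print(repr(x), bool(x))

# Typical use: skip the work if a container is empty
values = []
if not values:
    print('Nothing to do')
```

Note that [0] is true: the list is not empty, even though its only element is zero.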

Refer to the docs for an exhaustive list of boolean operations and comparison operators.

Text¶

In python (and many other languages) text sequences are named strings (str), which can be of any length:

type('Français, 汉语')  # unicode characters are no problem in Python

str

Unlike some languages, there is no special type for characters:

for char in 'string':
    # "char" is also a string of length 1
    print(char, type(char))

s <class 'str'>
t <class 'str'>
r <class 'str'>
i <class 'str'>
n <class 'str'>
g <class 'str'>

Since strings behave like lists in many ways, they are often classified together with the sequence types, as we will see below.

Python strings cannot be changed - they are immutable. Therefore, assigning to an indexed position in the string results in an error:

word = 'Python'
word[0] = 'J'

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-38-ad89e228b316> in <module>()
      1 word = 'Python'
----> 2 word[0] = 'J'

TypeError: 'str' object does not support item assignment
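If you need a different string, you have to create a new one. A minimal sketch, which builds the new string from a piece of the old one:

```python
word = 'Python'
# Strings are immutable, so we concatenate a new first letter
# with a slice of the original string
new_word = 'J' + word[1:]
print(new_word)
print(word)  # the original string is unchanged
```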

Python objects have methods attached to them. We will learn more about methods later, but here is an example:

word.upper()  # the method .upper() converts all letters in a string to upper case

'PYTHON'

"She's a witch!".split(' ')  # the .split() method divides strings using a separator

["She's", 'a', 'witch!']

Sequence types - list, tuple, range¶

Python knows a number of sequence data types, used to group together other values. The most versatile is the list, which can be written as a list of comma-separated values (items) between square brackets. Lists might contain items of different types, but usually the items all have the same type.

squares = [1, 4, 9, 16, 25, 36, 49]
squares

[1, 4, 9, 16, 25, 36, 49]

Lists can be indexed and sliced:

squares[0]

1

squares[-3:]

[25, 36, 49]

squares[0:7:2]  # new slicing! From index 0 up to (but not including) 7 in steps of 2

[1, 9, 25, 49]

squares[::-1]  # new slicing! All elements in steps of -1, i.e. reverse

[49, 36, 25, 16, 9, 4, 1]

Careful! Lists are not the equivalent of arrays in Matlab. One major difference is that the addition operator concatenates lists together (like strings), instead of adding the numbers elementwise like in Matlab:

squares + [64, 81, 100]

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
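To make the difference explicit, here is a short sketch comparing concatenation with an elementwise addition written by hand. It uses zip() and a list comprehension, a construct which is not covered in this chapter, so take it as a preview only:

```python
a = [1, 2, 3]
b = [10, 20, 30]

print(a + b)   # concatenation: a new list with six elements
print(a * 2)   # repetition: the content of the list, twice

# Elementwise addition has to be written explicitly
elementwise = [x + y for x, y in zip(a, b)]
print(elementwise)
```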

Unlike strings, which are immutable, lists are a mutable type, i.e. it is possible to change their content:

cubes = [1, 8, 27, 65, 125]  # something's wrong here
cubes[3] = 64
cubes

[1, 8, 27, 64, 125]

Assignment to slices is also possible, and this can even change the size of the list:

letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
letters[2:5] = ['C', 'D', 'E']  # replace some values
letters

['a', 'b', 'C', 'D', 'E', 'f', 'g']

letters[2:5] = []  # now remove them
letters

['a', 'b', 'f', 'g']

The built-in function len() also applies to lists:

len(letters)

4

It is possible to nest lists (create lists containing other lists), since lists can store objects of any type. For example:

a = ['a', 'b', 'c']
n = [1, 2, 3]
x = [a, n, 3.14]
x

[['a', 'b', 'c'], [1, 2, 3], 3.14]

x[0][1]

'b'

Lists also have methods attached to them (see section 5.1, "More on Lists", of the official Python tutorial for the most commonly used ones). For example:

alphabet = ['c', 'b', 'd']
alphabet.append('a')  # add an element to the list
alphabet.sort() # sort it
alphabet

['a', 'b', 'c', 'd']
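A common source of confusion is the difference between the .sort() method and the built-in function sorted(). The sketch below illustrates it: the method changes the list in place, while the function leaves it untouched and returns a new list.

```python
alphabet = ['c', 'b', 'd', 'a']

new_list = sorted(alphabet)   # built-in function: returns a new sorted list
print(new_list, alphabet)     # the original list is unchanged

result = alphabet.sort()      # method: sorts in place and returns None
print(result, alphabet)
```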

Other sequence types include strings, tuples and ranges. Sequence types support a common set of operations and therefore behave very similarly:

l = [0, 1, 2]
t = (0, 1, 2)
r = range(3)
s = '123'

# Test if elements can be found in the sequence(s)
1 in l, 1 in t, 1 in r, '1' in s

(True, True, True, True)

# Ask for the length
len(l), len(t), len(r), len(s)

(3, 3, 3, 3)

# Addition
print(l + l)
print(t + t)
print(s + s)

[0, 1, 2, 0, 1, 2]
(0, 1, 2, 0, 1, 2)
123123

The addition operator won't work for the range type though. Ranges are a little different from lists or strings:

r = range(2, 13, 2)
r  # r is an object of type "range". It doesn't print all the values, just the interval and steps

range(2, 13, 2)

list(r)  # applying list() converts range objects to a list of values

[2, 4, 6, 8, 10, 12]

Ranges are usually used as loop counters or to generate other sequences. Ranges have a strong advantage over lists and tuples: their elements are generated when they are needed, not before. Ranges therefore have a very low memory consumption. See the following:

range(2**100)  # no problem

range(0, 1267650600228229401496703205376)

list(range(2**100))  # trying to make a list of values out of it results in an error

---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
<ipython-input-61-d049a54d5d56> in <module>()
----> 1 list(range(2**100))  # trying to make a list of values out of it results in an error

OverflowError: Python int too large to convert to C ssize_t

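The OverflowError tells us that the requested list is far too big: its length cannot even be represented by the integer type that Python uses internally for sizes (C ssize_t), let alone fit into memory.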
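The memory argument can be checked with sys.getsizeof() from the standard library, which reports the size of an object in bytes. This is only a rough sketch: the exact numbers depend on the Python version and platform, and for a list the function counts the container but not the integers stored in it.

```python
import sys

r = range(10**6)
l = list(r)  # one million integers, all stored in memory

print(sys.getsizeof(r))  # small, and independent of the length of the range
print(sys.getsizeof(l))  # grows with the number of elements

# A range still supports len(), indexing and membership tests
print(len(r), r[10], 999999 in r)
```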

The "tuple" data type is probably a new concept for you, as tuples are quite specific to python. A tuple behaves almost like a list, but the major difference is that a tuple is immutable:

l[1] = 'ha!'  # I can change an element of a list
l

[0, 'ha!', 2]

t[1] = 'ha?'  # But I cannot change an element of a tuple

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-63-9bb684dc595d> in <module>()
----> 1 t[1] = 'ha?'  # But I cannot change an element of a tuple

TypeError: 'tuple' object does not support item assignment

It is immutability that makes tuples useful, but for beginners this is not really obvious at first sight. We will get back to tuples later in the lecture.

Sets¶

Sets are an unordered collection of distinct objects:

s1 = {'why', 1, 9}
s2 = {9, 'not'}
s1

{1, 9, 'why'}

# Let's compute the union of these two sets. We use the method ".union()" for this purpose:
s1.union(s2)  # 9 was already in the set, however it is not doubled in the union

{1, 9, 'not', 'why'}

Sets are useful for operations such as intersection, union, difference, and symmetric difference between collections. You won't see much use for them this semester, but remember that they exist.
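The other set operations work like .union(). A short sketch with the same two sets (the results are not shown, because the order in which set elements are displayed is arbitrary):

```python
s1 = {'why', 1, 9}
s2 = {9, 'not'}

print(s1.intersection(s2))           # elements found in both sets
print(s1.difference(s2))             # elements in s1 but not in s2
print(s1.symmetric_difference(s2))   # elements in exactly one of the sets

# Converting a list to a set removes the duplicates
unique = set([1, 1, 2, 3, 3])
print(unique)
```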

Mapping types - dictionaries¶

A mapping object maps keys to arbitrary objects (values). The most frequently used mapping object is the dictionary (dict), a collection of (key, value) pairs:

tel = {'jack': 4098, 'sape': 4139}
tel

{'jack': 4098, 'sape': 4139}

tel['guido'] = 4127
tel

{'guido': 4127, 'jack': 4098, 'sape': 4139}

del tel['sape']
tel

{'guido': 4127, 'jack': 4098}

Keys can be of any immutable type: e.g. strings and numbers are often used as keys. The keys in a dictionary are all unique (they have to be):

d = {'a':1, 2:'b', 'c':1}  # a, 2, and c are keys
d

{2: 'b', 'a': 1, 'c': 1}
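Both rules can be tried out in a few lines. In the sketch below, the dictionary and its content are invented for illustration (grid coordinates mapped to elevations), and the try ... except construct, which we haven't seen yet, is only there to catch the error so that the example runs until the end:

```python
# Hypothetical example: (row, column) coordinates as keys
elevation = {(0, 0): 512, (0, 1): 530}
elevation[(1, 0)] = 498   # a tuple is immutable, so it is a valid key
print(elevation[(0, 1)])

# Keys are unique: assigning to an existing key replaces its value
elevation[(0, 0)] = 515
print(len(elevation), elevation[(0, 0)])

try:
    elevation[[1, 1]] = 501   # a list is mutable and cannot be a key
except TypeError as err:
    print('TypeError:', err)
```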

You can ask whether a key is available in a dict with the expression:

2 in d

True

However, the in operator does not test membership by value: it only looks at the keys, and the values are not necessarily unique anyway:

1 in d

False
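If you do need to look for a value or for a (key, value) pair, the dict methods .values() and .items() can be used. A minimal sketch with the same dictionary:

```python
d = {'a': 1, 2: 'b', 'c': 1}

print(1 in d)                  # False: 1 is not a key
print(1 in d.values())         # True: 1 appears among the values
print(('a', 1) in d.items())   # True: this (key, value) pair exists

# .get() returns a default value instead of raising a KeyError
# when the key is missing
print(d.get('z', 'not found'))
```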

Dictionaries are (together with lists) the container type you will use the most often.

Note: there are other mapping types in Python, but they are all related to the original dict. Examples include collections.OrderedDict, a dictionary which remembers the order in which the keys were entered (since Python 3.7, the built-in dict preserves insertion order as well).

Exercise: can you think of examples of application of a dict? Describe a couple of them!

Semantics parenthesis: "literals"¶

Literals are the fixed values of a programming language ("notations"). Some of them are pretty universal, like numbers or strings (9, 3.14 and "Hi!" are all literals); others are more language specific and belong to the language's syntax. Curly brackets {}, for example, are the literal representation of a dict. The literal syntax has been added for convenience only:

d1 = dict(bird='parrot', plant='crocus')  # one way to make a dict
d2 = {'bird':'parrot', 'plant':'crocus'}  # another way to make a dict
d1 == d2

True

Both {} and dict() are equivalent: using one or the other to construct your containers is a matter of taste, but in practice you will see the literal version more often.
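The other built-in containers follow the same pattern, with one exception worth knowing: there is no literal for the empty set. A short sketch (the variable names are mine):

```python
empty_list = []        # same as list()
empty_tuple = ()       # same as tuple()
empty_dict = {}        # same as dict()
empty_set = set()      # no literal available for an empty set

print(type({}))        # {} alone is a dict, not a set
print(type({1, 2}))    # curly brackets with elements and no colon make a set
print(empty_list == list(), empty_tuple == tuple(), empty_dict == dict())
```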

Control flow¶

First steps towards programming¶

Of course, we can use Python for more complicated tasks than adding two and two together. For instance, we can write an initial sub-sequence of the Fibonacci series as follows:

# Fibonacci series:
# the sum of two previous elements defines the next
a, b = 0, 1
while a < 10:
    print(a)
    a, b = b, a+b

This example introduces several new features.

The first line contains a multiple assignment: the variables a and b simultaneously get the new values 0 and 1. On the last line this is used again, demonstrating that the expressions on the right-hand side are all evaluated first before any of the assignments take place. The right-hand side expressions are evaluated from the left to the right.
The while loop executes as long as the condition (here: a < 10) remains true. The standard comparison operators are written the same as in C: < (less than), > (greater than), == (equal to), <= (less than or equal to), >= (greater than or equal to) and != (not equal to).
The body of the loop is indented: Python groups statements by indentation, not with brackets or begin .. end statements. Hate it or love it, this is how it is ;-). I learned to like this style a lot. Note that each line within a basic block must be indented by the same amount. Although the indentation could be anything (two spaces, three spaces, tabs...), the recommended way is to use four spaces.
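The fact that the right-hand side is evaluated before any assignment takes place is what makes the classic "swap" a one-liner in Python. A minimal sketch, compared with the version using a temporary variable:

```python
a, b = 1, 2
a, b = b, a      # both right-hand side values are read before assigning
print(a, b)

# Without multiple assignment, a temporary variable is needed
x, y = 1, 2
tmp = x
x = y
y = tmp
print(x, y)
```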

The print() function accepts multiple arguments:

i = 256*256
print('The value of i is', i)

The value of i is 65536

The keyword argument (see definition below) end can be used to avoid the newline after the output, or end the output with a different string:

a, b = 0, 1
while a < 1000:
    print(a, end=',')
    a, b = b, a+b

0,1,1,2,3,5,8,13,21,34,55,89,144,233,377,610,987,

The `if` statement¶

Perhaps the most well-known statement type is the if statement:

x = 12
if x < 0:
    x = 0
    print('Negative changed to zero')
elif x == 0:
    print('Zero')
elif x == 1:
    print('Single')
else:
    print('More')

More

There can be zero or more elif parts, and the else part is optional. The keyword elif is short for "else if", and is useful to avoid excessive indentation.

The `for` statement¶

The for loops in Python can be quite different from those in other languages: in Python, one iterates over sequences, not indexes. This is a feature I very much like for its readability:

words = ['She', 'is', 'a', 'witch']
for w in words:
    print(w)

She
is
a
witch

The equivalent for loop with a counter is considered "unpythonic", i.e. not elegant.

Unpythonic:

seq = ['This', 'is', 'very', 'unpythonic']
# Do not do this at home!
n = len(seq)
for i in range(n):
    print(seq[i])

This
is
very
unpythonic

Pythonic:

seq[-1] = 'pythonic'
for s in seq:
    print(s)

This
is
very
pythonic

for i in range(xx) is almost never what you want to do in Python. If you want to iterate over several sequences at once, use zip():

squares = [1, 4, 9, 25]
for s, l in zip(seq, squares):
    print(l, s)

1 This
4 is
9 very
25 pythonic
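If you really need the index together with the element, the built-in function enumerate() is a readable alternative to range(len(...)). The sketch below also shows a detail of zip() to keep in mind: it stops at the end of the shortest sequence.

```python
seq = ['This', 'is', 'pythonic']
for i, s in enumerate(seq):
    print(i, s)   # i is the index, s the element

# zip() stops silently when the shortest sequence is exhausted
pairs = list(zip([1, 2, 3], ['a', 'b']))
print(pairs)
```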

The `break` and `continue` statements¶

The break statement breaks out of the innermost enclosing for or while loop:

for letter in 'Python':
    if letter == 'h':
        break
    print('Current letter:', letter)

Current letter: P
Current letter: y
Current letter: t

The continue statement continues with the next iteration of the loop:

for num in range(2, 10):
    if num % 2 == 0:
        print("Found an even number", num)
        continue
    print("Found a number", num)

Found an even number 2
Found a number 3
Found an even number 4
Found a number 5
Found an even number 6
Found a number 7
Found an even number 8
Found a number 9

Defining functions¶

A first example¶

def fib(n):
    """Print a Fibonacci series up to n."""
    a, b = 0, 1
    while a < n:
        print(a, end=' ')
        a, b = b, a+b

# Now call the function we just defined:
fib(2000)

0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597

The def statement introduces a function definition. It must be followed by the function name and the parenthesized list of formal parameters. The statements that form the body of the function start at the next line, and must be indented.

The first statement of the function body can optionally be a string literal; this string literal is the function's documentation string, or docstring (more about docstrings later: in the meantime, make a habit of writing one).
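The docstring is not just a comment: it is stored with the function and can be read back. A small sketch with a made-up function (say_hello is not part of the lecture code):

```python
# Hypothetical function, for illustration only
def say_hello(name):
    """Print a greeting for name."""
    print('Hello', name)

say_hello('World')
print(say_hello.__doc__)  # the docstring is stored on the function object
```

The built-in function help() displays the same text, together with the function's signature.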

A function definition introduces the function name in the current scope (we will learn about scopes soon). The value of the function name has a type that is recognized by the interpreter as a user-defined function. This value can be assigned to another name which can then also be used as a function. This serves as a general renaming mechanism:

fib

<function __main__.fib(n)>

f = fib
f(100)

0 1 1 2 3 5 8 13 21 34 55 89

Coming from other languages, you might object that fib is not a function but a procedure since it doesn't return a value. In fact, even functions without a return statement do return a value, albeit a rather boring one. This value is called None (it’s a built-in name). Writing the value None is normally suppressed by the interpreter if it would be the only value written. You can see it if you really want to by using print():

fib(0)  # shows nothing

print(fib(0))  # prints None

None

It is simple to write a function that returns a list of the numbers of the Fibonacci series, instead of printing it:

def fib2(n):  # return Fibonacci series up to n
    """Return a list containing the Fibonacci series up to n."""
    result = []
    a, b = 0, 1
    while a < n:
        result.append(a) 
        a, b = b, a+b
    return result

r = fib2(100)  # call it
r  # print the result

[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]

Positional and keyword arguments¶

Functions have two types of arguments: positional arguments and keyword arguments.

Keyword arguments are preceded by an identifier (e.g. name=) and are given a default value. They are therefore optional:

def f(arg1, arg2, kwarg1=None, kwarg2='Something'):
    """Some function with arguments."""
    print(arg1, arg2, kwarg1, kwarg2)

f(1, 2)  # no need to specify them - they are optional and have default values

1 2 None Something

f(1, 2, kwarg1=3.14, kwarg2='Yes')  # but you can set them to a new value
f(1, 2, kwarg2='Yes', kwarg1=3.14)  # and the order is not important!

1 2 3.14 Yes
1 2 3.14 Yes

Unfortunately, it is also possible to set keyword arguments without naming them, in which case the order matters:

f(1, 2, 'Yes', 'No')

1 2 Yes No

I am not a big fan of this feature because it reduces the clarity of the code. I recommend always using the kwarg= syntax. Others agree with me, and therefore Python implemented a syntax to make calls like the above illegal:

# The * before the keyword arguments makes them keyword arguments ONLY
def f(arg1, arg2, *, kwarg1=None, kwarg2='None'):
    print(arg1, arg2, kwarg1, kwarg2)

f(1, 2, 'Yes', 'No')  # This now raises an error

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-94-81123ad67022> in <module>()
----> 1 f(1, 2, 'Yes', 'No')  # This now raises an error

TypeError: f() takes 2 positional arguments but 4 were given

Positional arguments are named like this because their position matters. Unlike keyword arguments, they don't have a default value and are mandatory. Forgetting to set them results in an error:

f(1)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-95-281ab0a37d7d> in <module>()
----> 1 f(1)

TypeError: f() missing 1 required positional argument: 'arg2'
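To summarize the calling rules, here is a sketch with a hypothetical function (greet and its arguments are invented for this example). It also shows that positional arguments can be passed by name if you wish:

```python
# Hypothetical function, for illustration only
def greet(name, *, greeting='Hello', punctuation='!'):
    """Return a greeting for name."""
    return greeting + ', ' + name + punctuation

print(greet('Ada'))                                  # defaults are used
print(greet('Ada', punctuation='?', greeting='Hi'))  # keywords in any order
print(greet(name='Ada'))                             # positional argument passed by name
```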

Importing modules and functions¶

Although Python ships with some built-in functions available in the interpreter (e.g. len(), print()), these are by far not enough for real-world programming. Thankfully, Python comes with a mechanism which gives us access to much more functionality:

import math
print(math)
print(math.pi)

<module 'math' (built-in)>
3.141592653589793

math is a module, and it has attributes (e.g. pi) and functions attached to it:

math.sin(math.pi / 4)  # compute a sine

0.7071067811865475

math is available in the python standard library (https://docs.python.org/3/library/): this means that it comes pre-installed together with python itself. Other modules can be installed (like numpy or matplotlib), but we won't need them for now.

Modules often have a thematic grouping, e.g. math, time, multiprocessing. You will learn more about them in the next lecture.
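The import statement comes in a few variants. The sketch below shows the most common ones, using the math module again:

```python
import math                  # import the whole module
from math import pi, sqrt    # import selected names directly
import math as m             # import the module under a shorter alias

print(sqrt(16))
print(pi == math.pi, m.pi == math.pi)  # all three refer to the same value
```

Which variant to use is partly a matter of taste: import math keeps it obvious where each name comes from, while from math import ... saves some typing.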

Take home points¶

in python, everything is an object
all objects have a data type: examples of data types include floats, strings, dicts, lists...
you can ask for the type of an object with the built-in function type()
"built-in" means that a function or data type is available at the command prompt without import statement
objects also have methods attached to them, e.g. .upper() for strings, .append() for lists
lists and dicts are the container data types you will use most often
certain objects are immutable (strings, tuples), but others are mutable and can change their state (dicts, lists)
in python, indentation matters! This is how you define blocks of code. Keep your indentation consistent, with 4 spaces
in python, one iterates over sequences, not indexes
functions are defined with def, and also rely on indentation to define blocks. They can have a return statement
there are two types of arguments in functions: positional (mandatory) and keyword (optional) arguments
the import statement opens a whole new world of possibilities: you can access other standard tools that are not available at the top-level prompt

What's next?¶

We learned the basic elements of the Python syntax: to become fluent with this new language you will have to get familiar with all of the elements presented above. With time, you might want to get back to this chapter (or to the Python reference documentation) to revisit what you've learned. I also highly recommend following the official Python tutorial, sections 3 to 5.

