# Python basics

## Variables
In Python, as in many programming languages, objects are stored in variables.
* A value is assigned to a variable using the `=` sign. 
* **Warning:** unlike in mathematics, the `=` sign in python is directional: the variable name must always be on the left of the `=`, and the value to assign to the variable on the right.  
  Example:
  ```python
  a = 23    # is a valid assignment.
  8 = b     # is NOT a valid assignment.
  ```

In Python, variable names must adhere to these restrictions:
* they may only contain uppercase and lowercase letters (`A-Z`, `a-z`), digits (`0-9`), and the underscore character `_`.
* the first character of a variable name cannot be a digit. E.g. `file_1` is a valid variable name, but `1_file` is not.
* by convention, variable names starting with a single or double underscore `_`/`__` are reserved for "special" variables (class private attributes, "magic" variables).
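
The naming rules above can be checked directly in Python. The short sketch below uses the built-in string method `isidentifier()`, which reports whether a piece of text is spelled like a valid name. The names being tested are illustrative only.

```python
# isidentifier() returns True when the text follows the naming rules.
valid_name = "file_1".isidentifier()
starts_with_digit = "1_file".isidentifier()
contains_dash = "input-file".isidentifier()

print(valid_name)          # True: letters, digits and underscore only
print(starts_with_digit)   # False: the first character is a digit
print(contains_dash)       # False: "-" is not an allowed character
```

Note that this only checks the spelling rules: reserved words such as `for` also pass the check, even though they cannot be used as variable names.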


> **pro tip**: using explicit variable names makes your code more understandable to others, as well as you-from-the-future. For instance `input_file` is better than `iptf`.

In [1]:
myVariable = 35      # assign the value 35 to variable "myVariable".
var_a = 2.3          # assign the value 2.3 to variable "var_a".
var_b = var_a        # assign the value of "var_a" to "var_b".

# By the way, text located after a "#" character - just like this line - is a comment.
# Comments are not executed, but they are useful for documenting code.
print(myVariable)
print(var_a)
print(var_b)

<br>

## Functions
Another very important concept in Python - as in most programming languages - is that of **functions**.
* functions are re-usable blocks of code that have been given a name and are ready to perform an action. How to define your own functions will be detailed later in this course.
* functions can be written to perform anything, from the simplest task to the most complex.
* to call a function, one uses its name followed by parentheses `()`, which contain the function's arguments, if any (arguments are the values that the function needs to work on).
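
As a small illustration of the calling syntax described above, here are three built-in functions called with one, two, and several arguments. The values are arbitrary.

```python
# len() takes one argument and returns the number of elements in it.
n_letters = len("spam")
print(n_letters)               # 4

# round() is given two arguments here: the number, and the number of decimals to keep.
rounded = round(3.14159, 2)
print(rounded)                 # 3.14

# max() accepts several arguments and returns the largest one.
largest = max(3, 7, 5)
print(largest)                 # 7
```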

### The help function - your best friend in Python
In python, almost any object or function is extensively documented: what it is, what it does, how to use it, ...  
We access this help using the `help()` function, which takes as argument the object we want to get help with.

In [2]:
## let's try to look up the help page of the print function that you have already encountered.
help(print)

Help on built-in function print in module builtins:

print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
    
    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.



It tells us that:
 * `print` is a function.
 * it "Prints the values to a stream, or to sys.stdout by default." Here a stream can be a file, and `sys.stdout` is the console, so the function prints to the console by default (and can print to a file instead).
 * its arguments are the things that will be printed.
 * it has 4 optional arguments that refine its use.

Let's try to apply our new knowledge of the `print` function :

In [3]:
print('test')                # simple usage.
print('test' , 42)           # we can make it print several values. by default, they are separated by spaces.
print('test' , 42 , sep='/') # the sep argument can be used to change the separator between values.

test
test 42
test/42
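
The two other optional arguments listed in the help page, `end` and `file`, can be tried in the same way. In this sketch, an in-memory text buffer (`io.StringIO`, from the standard library) stands in for a real file; that choice is an assumption made to keep the example self-contained.

```python
import io

# 'end' replaces the newline that print normally appends after the last value.
print('spam', end=' ')
print('eggs')                  # both words end up on the same line: spam eggs

# 'file' redirects the output to another stream instead of the console.
buffer = io.StringIO()
print('test', 42, sep='/', file=buffer)
captured = buffer.getvalue()
print(repr(captured))          # 'test/42\n'
```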


**Don't hesitate to use the `help` function on any object or function to understand how they work.**

## Reading and understanding errors

Unless you are a perfect human being, your code will contain errors at some point.  
Errors ~~can sometimes be~~ are frustrating, but they are unavoidable, and the best way to correct them is to actually read and try to understand them.

Here is an error example:

In [4]:
var_a = 42
var_b = var_a + 3
print(var_c)

NameError: name 'var_c' is not defined

The first line indicates the **type** of the error. In our example we got a `NameError`, which means that a name has not been found.  
If you want to know more about a certain error type, you can use the help function on it: `help(NameError)`.

The following lines point out where the error occurred, which is very useful when your code has hundreds of lines. Here the error occurred on line `3`, the line of the `print` call.

Finally, we have `NameError: name 'var_c' is not defined`, which points out that we tried to print the variable `var_c` when that variable does not exist (i.e., that name is not defined).
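
The two pieces of information discussed above, the error type and the error message, can also be looked at separately. The sketch below relies on `try`/`except`, which is not covered in this section and is used here only so that the example keeps running after the error.

```python
var_a = 42
var_b = var_a + 3

try:
    print(var_c)               # var_c was never defined, so this line raises an error
except NameError as err:
    error_type = type(err).__name__
    error_message = str(err)

print(error_type)              # NameError
print(error_message)           # name 'var_c' is not defined
```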

> <span style="color:blue">Arguably, being able to **read and understand errors** and being able to **read the help** accounts for ~50% of "coding skills"...</span>.



**Micro exercise** : look at the error given by the following code. Try to understand it and modify the code accordingly.

In [1]:
42 + "a"

TypeError: unsupported operand type(s) for +: 'int' and 'str'

<br>

## Object types: simple types
Python divides objects into several categories, which are called **types**.  
There exist plenty of types (it is even common to define your own), but there are a few very common ones - known as **built-in** types - that you ought to know.
* `bool`: boolean/logical values, either `True` or `False`, like a 0 or 1
* `int` : integer number
* `float`: floating point number

To know the type of an object, use the `type()` function.  
**Note:** contrary to some other languages (like C++ for instance) variables in python are not restricted to a single type and can be reassigned another type of value at any time.

In [None]:
# In this example we successively assign different values and types to the variable "a".
# boolean
a = True
print("type of a is:", type(a))

# float
a = 4.2
print("type of a is:", type(a))

# integer
a = 42
print("type of a is:", type(a))
print("type of 42 is:", type(42))

Type conversion is (often) fairly easy: just use the type name as a function.

In [None]:
# Convert an integer to a float:
print("type of a before conversion:", type(a))
a = float(a)
print("type of a after conversion:", type(a))
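
The same "type name as a function" pattern also applies to text. The sketch below shows a few conversions to and from strings, with arbitrary values; the last comment illustrates why conversion is only *often* easy.

```python
# Text to numbers: the string must look like a number of the requested type.
n = int("42")
x = float("3.5")
print(n, type(n))              # 42 <class 'int'>
print(x, type(x))              # 3.5 <class 'float'>

# Numbers to text.
s = str(42)
print(s, type(s))              # 42 <class 'str'>

# Numbers to booleans: 0 converts to False, any other number to True.
print(bool(0), bool(3))        # False True

# Not every conversion is possible: int("4.2") raises a ValueError,
# because "4.2" is not the text of an integer.
```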

**Micro exercise** : convert `a` back to an integer (`int`). Look up the `help` for integers.

<br>

## Operators
Now that you have variables containing objects of a certain _**type**_, you can begin to play with them using operators.

### Arithmetic operators 
You know most of these already:

In [None]:
print( 3 + 7 )            # + : addition
print( 1.1 - 5 )          # - : subtraction
print( 5 / 2 )            # / : division
print( 5 // 2 )           # //: integer division
print( 5 * 2 )            # * : multiplication
print( 2 ** 4 )           # **: power
print( 5 % 2 )            # % : modulus (remainder of the division)

# Variables can be used there as well:
x = 4
y = 16 * x**2 - 2 * x + 0.5 
print(y)

Now you can use Python as a fancy calculator!
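
A few details of these operators are easy to overlook. The following sketch, with arbitrary numbers, illustrates how `/`, `//` and `%` relate to each other and how operator precedence works.

```python
# "/" always returns a float, even when the division is exact.
print(4 / 2)                   # 2.0

# "//" and "%" go together: quotient and remainder of the same division.
quotient = 17 // 5
remainder = 17 % 5
print(quotient, remainder)     # 3 2
print(quotient * 5 + remainder)   # 17

# "**" is evaluated before "*", which is evaluated before "+" and "-".
# Parentheses make the intended order explicit.
print(2 + 3 * 2 ** 2)          # 14
print((2 + 3) * 2 ** 2)        # 20
```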

**Bonus:** when modifying the value of a variable, you can use the following shortcut operators:

In [None]:
a = 0

# Same as a = a + 3
a += 3
print("The value of 'a' is now:", a)

# Same as a = a - 1
a -= 1                                 
print("The value of 'a' is now:", a)

# Same as a = a * 3
a *= 3
print("The value of 'a' is now:", a)

# Same as a = a / 2
a /= 2
print("The value of 'a' is now:", a)

<br>

### Comparison operators

These operators return a `bool` value (`True`  or `False`)


In [None]:
a = 5
print("is a equal to 1?:", a == 1)                      # == : equality
print("is a different from 13.37?:", a != 13.37)        # != : inequality
print("is a greater than 5?:", a > 5)                   # >  : greater than
print("is a lower than 10?:", a < 10)                   # <  : lower than
print("is a greater than or equal to 5?:", a >= 5)      # >= : greater or equal
print("is a lower than or equal to 10?:", a <= 10)      # <= : lower or equal

print("is a equal to '5'?:", a == '5')                  # note that comparisons are sensitive to types, so this evaluates to False.

# booleans (resulting from comparisons) can be combined using 'and' or 'or'
print("'and' requires both elements to be True:" , True and ( 1 + 1 != 2 ) )
print("'or' requires at least one element to be True:" , ( a * 2 > 10 ) or ( a > 0 ) )
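
Two further points about comparisons are worth a quick look: equality tests on floating point numbers can be surprising because of rounding, and comparisons can be chained. The sketch below uses `math.isclose` from the standard library as one possible way to compare floats; the numbers are arbitrary.

```python
import math

total = 0.1 + 0.2
print(total)                          # 0.30000000000000004
print(total == 0.3)                   # False: tiny rounding error
print(math.isclose(total, 0.3))       # True: equal within a small tolerance

# Comparisons can also be chained, as in mathematics.
a = 5
in_range = 0 < a <= 10                # same as (0 < a) and (a <= 10)
print(in_range)                       # True
```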

**Micro exercise** : compute the product of 348 and 157.2. Use a comparison operator to check if the result is larger than 230 squared (`230**2`).

<br>

## Object types: container types
These types are objects that contain other objects:
* `str`: string - text
* `list`: "mutable" list of Python objects
* `tuple`: "immutable" list of Python objects
* `dict`: dictionary associating a 'key' to a 'value'

They all have a dedicated `[]` operator that lets the user access one - or several - of the objects they contain.  
In addition, the number of objects a container has (its length) can be accessed using the `len()` function.  
**Important:** in Python (unlike e.g. in R), indexing is zero-based. This means that the first element of a container type object is accessed with `object[0]`, and not `object[1]`.
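
As a quick preview of what the following sections detail, the sketch below applies `len()` and `[]` to one small object of each container type. All values are made up for illustration.

```python
text = "spam"
numbers = [10, 20, 30]
pair = ("a", 4.2)
ages = {"Anne": 26, "Viktor": 31}

# len() works on every container type.
print(len(text), len(numbers), len(pair), len(ages))   # 4 3 2 2

# Index 0 is the first element of a string, list or tuple.
print(text[0], numbers[0], pair[0])                    # s 10 a

# The last valid index is therefore len() - 1.
print(numbers[len(numbers) - 1])                       # 30

# For a dict, [] takes a key rather than a position.
print(ages["Anne"])                                    # 26
```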

### Strings
* In Python, the `str` type (string) is a sequence of characters that can be used to represent text of any length.
* Strings are written surrounded by single `'` or double `"` quotes. One can also use triple quotes `"""` to make a multi-line string.

In [None]:
# Both single and double quotes can be used to define a string.
gene_seq = "ATGCGACTGATCGATCGATCGATCGATGATCGATCGATCGATGCTAGCTAC"
name = 'Sir Lancelot of Camelot'

# Triple quotes can be used to define multi-line strings.
long_string = """Let me tell you something, my lad. 
When you’re walking home tonight and some great 
homicidal maniac comes after you with a bunch 
of loganberries, don’t come crying to me!\n"""
print(long_string)

# Special characters are possible.
my_quote = """Gracieux : « aimez-vous à ce point les oiseaux
que paternellement vous vous préoccupâtes
de tendre ce perchoir à leurs petites pattes ? »"""

# We also commonly use special characters, such as:
print('a\tb')  # \t : tabulation
print('a\nb')  # \n : newline

# We can use the len() function to know the length of a string:
print("The length of the string in the 'name' variable is:", len(name))

# NB: strings can be added together and multiplied by an integer:
print( 'dead' + 'parrot' ) 
print( 'spam' * 5 ) 

The different letters of a string are accessed using the **`[]` operator**, with the index of the desired element.  
Remember that in python, the index of the first element is `[0]`.

In [None]:
my_string = "And now, something completely different."
print("The first element of this string is:", my_string[0] )    # 0 is the index of the first element of the string
print("The 5th element of this string is:", my_string[4] )      # 5th element of the string
print("The last element of this string is:", my_string[-1] )     # -1 is the index of the last element of the string

Indices can also be used to retrieve several elements at once: this is called a **slice operation** or **slicing**.

In [None]:
print(my_string[0:5])   # slice operation: get all elements from index 0 (included) to index 5 (excluded)
print(my_string[:5])    # this implicitly slices from the beginning of the string up to (but not including) index 5.
print(my_string[5:])    # this implicitly slices from index 5 until the end of the string
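
A few more properties of slicing, sketched on the same sentence: the length of a slice, the optional third value (the step), and the fact that a slice reaching past the end of the string is not an error.

```python
my_string = "And now, something completely different."

# The length of a slice [start:stop] is stop - start.
word = my_string[4:7]
print(word)                    # now
print(len(word))               # 3

# A third value sets the step: here every second character.
every_second = my_string[0:7:2]
print(every_second)            # Adnw

# A stop index beyond the end of the string simply stops at the end.
tail = my_string[30:1000]
print(tail)                    # different.
```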

**Micro exercise** : create a `str` variable containing your name. Extract the last 3 letters from it using slicing.

<br>

### Lists and tuples
Lists and tuples are containers that can contain any type of element.  
* Lists are declared by surrounding a comma separated list of objects with `[]`.  
* Tuples are declared similarly, but using `()`.

In [None]:
myList = [ 1 , 2 , 3 , 5 , 5.2 , 6.99 ]
myTuple = ( 'a' , 4.2 , 5 ) # a list/tuple is not limited to a single type

The `[]` operator works in much the same way as with strings:

In [None]:
print(myTuple[0])
print(myList[2:])

A `tuple` is **_immutable_** : its length is fixed and its elements cannot be changed.

By comparison, a `list` is **_mutable_** : it can be extended, reduced, and its elements can be changed. 

In [None]:
# Changing an element in a list
myList[3] = "Spam"
print(myList[3])

In [None]:
# Trying the same with a tuple raises an error:
myTuple[3] = "Spam"

Remember the `help()` function? Let's use it to gain a better understanding of lists:

In [None]:
help(list)

That's a lot of information... let's go through it!  
The help page first defines the object `Built-in mutable sequence.`, then it describes the behaviour of `list()` if no argument is given (creates an empty list). 

Then, it says `Methods defined here:`. **methods** are functions that are attached to a type to enable some basic manipulation of objects of that type.  
Methods are called using the syntax `object.method(...)`

Let's focus on two of them :
 * `append(self, object, /)`: this method adds an object (given as argument) at the end of the list.
 * `insert(self, index, object, /)`: this method inserts an object (given as 2nd argument) before the index given as the 1st argument.
 
Let's try that:

In [None]:
print("list before:", myList)

# Calling the "append()" method of myList to add an element at the end of it.
myList.append("ham") 
print("list after append:", myList)

# Calling the method insert of myList to add an element in second position. 
# Remember that python indices start with 0, so inserting before position 1 puts the new object in second position in myList.
myList.insert(1 , "beans") 
print("list after insert:", myList)

Methods are an important part of python. Before you start writing your own code to manipulate an object, **always** check if the object already has a method that does exactly (or nearly) what you want.
This will save you a lot of time and grief.
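
To give an idea of what is already available, here is a short sketch using a few other methods listed in `help(list)`. The list content is made up for the example.

```python
shopping = ["spam", "eggs", "ham", "eggs"]

print(shopping.count("eggs"))  # 2: number of times "eggs" appears
print(shopping.index("ham"))   # 2: position of the first "ham"

shopping.remove("eggs")        # removes the first "eggs" only
print(shopping)                # ['spam', 'ham', 'eggs']

last = shopping.pop()          # removes and returns the last element
print(last)                    # eggs

shopping.sort()                # sorts the list in place
print(shopping)                # ['ham', 'spam']
```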

#### From list to string, and back again ...

A string can be converted to a list using the `list()` function:

In [None]:
myString = "Drop your panties Sir William, I cannot wait till lunchtime."
myList = list(myString)
print(myList)

The default behavior is that each letter of the string becomes an element in the list.

Often we prefer to create a list that contains each word of the string. For this we use the `split()` method of strings:

In [None]:
myWords = myString.split()
print(myWords)

The `split()` method is very useful when reading formatted text files. 
It accepts an optional `sep` argument that allows separation of fields using another character (look up `help(str.split)` for details).
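
As an illustration of the `sep` argument, the sketch below splits one tab-separated line and one comma-separated line. The line contents are made up and do not come from a real file.

```python
# One line of a tab-separated file (made-up content).
line = "geneA\tchr1\t1500"
fields = line.split("\t")
print(fields)                  # ['geneA', 'chr1', '1500']

# Same idea with a comma as separator.
csv_fields = "Anne,26,Basel".split(",")
print(csv_fields)              # ['Anne', '26', 'Basel']

# The pieces are always strings: convert them when a number is needed.
position = int(fields[2])
print(position + 1)            # 1501
```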

To convert a list to a string, use the `join` method (which may be seen as the inverse of `split`).
Somewhat counter-intuitively, the `join` method applies to strings, and takes a list as argument:

In [None]:
# Here, the join method is called on the separator string and takes the list "myWords" as argument.
myString = " ".join(myWords) 
print(myString)

# One can use a more exotic separator.
myString = "_SEP_".join(myWords) 
print(myString)

# TIP: use an empty separator to just join letters.
myString = "".join(['to','ba','c','co','ni','st']) 
print(myString)
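
Combining `split` and `join` gives a simple way to change the separator of a piece of text. The sketch below uses made-up content, and also shows that `join` only accepts strings.

```python
# Tab-separated text (made-up content) turned into comma-separated text.
tsv_line = "geneA\tchr1\t1500"
csv_line = ",".join(tsv_line.split("\t"))
print(csv_line)                # geneA,chr1,1500

# join only accepts strings: convert other types with str() first.
parts = ["sample", str(3)]
label = "_".join(parts)
print(label)                   # sample_3
```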

**Bonus**: lists can be concatenated with the `+` operator, extended with `+=` (addition assignment) and "multiplied" with `*`:

In [None]:
# Create a new list by concatenating two lists.
list_one = [ ',' , 1159 ]
list_two = list_one + [10.1, '45', 7] 
print(list_two)

# A list can also be extended in place with the += operator.
# For instance, the following line would add three elements to list_one:
# list_one += [10.1,'45',7] 

# As well as multiplication
menu = ['spam','eggs'] * 3  
print(menu)

**Micro exercise** : create a list with all integers from 0 to 3 in it. Add two numbers at the end of the list. Use a slicing operation to select the fourth element in the list.

<br><br>

### Dictionaries
Dictionaries, or `dict`, are containers that associate a **key** to a **value**, just like a real world dictionary associates a word to its definition.
* Dictionaries are instantiated with the `{key:value}` or `dict()` syntax.
* **keys** must be unique in the dictionary.
* **values** can appear as many times as desired in the dictionary.
* the `[]` operator is used to select objects from the dictionary, but using their key instead of their index.
* Unlike with **Lists** or **Tuples**, values in a **Dict** cannot be retrieved by index position: elements are looked up by key only (since Python 3.7 dictionaries do preserve insertion order, but they still do not support positional indexing).  
  E.g. `test_dict[0]` raises a `KeyError`, unless `0` is itself a key in the dict.

In [None]:
student_age = dict()                          # this is one way to create a dictionary
student_age = { 'Anne' : 26 , 'Viktor' : 31 } # this is another way to create a dictionary, directly with data in it

# Adding key:value pairs to an existing dictionary is as easy as:
student_age['Eleonore'] = 5
print('dictionary:', student_age)

# Modifying the value associated to a key is equally easy:
student_age['Eleonore'] = 25
print('dictionary:',student_age)

# We are not restricted to a particular type for keys, nor for values. We can e.g. make a dict of lists or a dict of dicts.
student_age[0] = 'zero' 
student_age['group_1'] = [23, 25, 28] 
student_age['group_2'] = {'bob':26, 'alice':27}
print('dictionary:',student_age)

# Removing objects from the dictionary is done with the pop() method, look at the help for more details.
student_age.pop('Anne') 
print('dictionary:',student_age)
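
A few other common dictionary operations are sketched below on a fresh, small dictionary: testing whether a key exists, looking up a key that may be missing, and listing keys and values.

```python
student_age = {'Anne': 26, 'Viktor': 31}

# 'in' tests whether a key is present.
print('Anne' in student_age)           # True
print('Bob' in student_age)            # False

# get() returns None (or a default of your choice) instead of raising a KeyError.
print(student_age.get('Bob'))          # None
print(student_age.get('Bob', 0))       # 0

# keys() and values() give access to all keys or all values.
print(list(student_age.keys()))        # ['Anne', 'Viktor']
print(list(student_age.values()))      # [26, 31]

# pop() removes a key and returns the value that was associated to it.
removed = student_age.pop('Anne')
print(removed)                         # 26
print(len(student_age))                # 1
```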


<br><br>

## Exercises: 1.1 - 1.7

<br>

We recommend you have a look at the `Mutable_vs_immutable.ipynb` notebook to gain a better understanding of the difference between some of the objects presented here.
This is an important notion that newcomers to Python need to be aware of, as ignoring it can lead to serious bugs in our code.
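
To give a first taste of the kind of surprise involved, here is a minimal sketch. It is not taken from the notebook mentioned above, and the variable names are made up for illustration.

```python
# Two names for the same list: a change made through one name is visible through the other.
list_a = [1, 2, 3]
list_b = list_a
list_b.append(4)
print(list_a)                  # [1, 2, 3, 4]

# list() builds a new, independent list with the same elements.
list_c = list(list_a)
list_c.append(5)
print(list_a)                  # [1, 2, 3, 4]
print(list_c)                  # [1, 2, 3, 4, 5]

# Numbers are immutable: updating one name leaves the other untouched.
num_a = 10
num_b = num_a
num_b += 1
print(num_a, num_b)            # 10 11
```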