{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Plotting in Python using `matplotlib`\n", "\n", "The [matplotlib](https://matplotlib.org/) library is the standard when it comes to creating plots in Python.\n", "It is reasonably easy to use, accommodates a wide variety of visualisations (including interactive plots), and offers plenty of possibilities to personalize your plots.\n", "\n", "Most of matplotlib's functionality can be accessed through its `pyplot` module, a collection of functions that were originally developed to make matplotlib resemble MATLAB. Since typing `matplotlib.pyplot` each time a function is accessed quickly becomes tiresome, `matplotlib.pyplot` is traditionally imported as `plt`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> **Note:** if this line fails, it's probably because `matplotlib` is not installed on your system. \n", "> We recommend you install `matplotlib` as part of \n", " [`scipy`](https://www.scipy.org/install.html) using anaconda or pip.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### During this tutorial you will learn how to:\n", " * Plot data with matplotlib.\n", " * Decorate plots: labels, titles, legend.\n", " * Set custom styles for your plots.\n", " * Create multi-plot layouts.\n", " * Save your plots to file.\n", " * Use plenty of other useful graph types to plot your data.\n", " \n", "The **Matplotlib** documentation also contains a [gallery](https://matplotlib.org/gallery.html), where many examples of how to plot your data in different ways are shown." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "### Procedural vs. Object-Oriented interface\n", "Matplotlib can be used in two different modes:\n", "* **Procedural** (also referred to as **state-based** or **MATLAB-like**): in this mode, successive calls to `pyplot` functions are made, and the state of the plot (or figure) is preserved between calls. This mode was developed with the intention to resemble MATLAB usage, and is also similar to basic plotting in R.\n", "\n", "* **Object-oriented:** in this usage mode, a figure consists of a **figure** object that can contain one or more **axes**. An **axes** object represents one plot inside a figure - not just an axis, as its name would suggest! Elements such as data content, legends or axis labels are all drawn onto **axes** objects.\n", "\n", "Importantly, both the procedural and object-oriented approaches can be used to achieve the same results. So the choice is really up to your personal preference. Both approaches will be illustrated in this course module." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
\n", "
\n", "\n", "# Your first plot \n", "Adding data to a figure is done with the `plt.plot()` function.\n", "\n", "`plt.plot()` accepts 1 or 2 sequences (e.g. tuples or lists) with the same length.\n", "\n", "* When passing only **one sequence** to `plt.plot()`, its values are used as **Y axis** values.\n", " The **X axis** values default to the position of each value in the sequence (starting with index 0).\n", " In the example below, you can e.g. see that the x-coordinate of the value 8 is 3, because 3 is the\n", " index of value 8 in our list (it's the 4th value in the list)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Passing a single sequence to plot()\n", "x = [1,2,3,8]\n", "plt.plot(x)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* When passing **two sequences** to `plt.plot()`, the **first sequence** is taken as **X axis** values and the **second sequence** as **Y axis** values." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = [1,2,3,5]\n", "y = [1,2,3,8]\n", "\n", "# Passing 2 arguments to plot()\n", "plt.plot(x, y)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see from the examples above, one particularity of `matplotlib.pyplot` functions is that they preserve the state of a plot between function calls. For instance, when calling `plt.show()`, there is no need to pass it any argument - it automatically shows the last plot we have created with `plt.plot()`.\n", "\n", "The general workflow when creating plots with pyplot is therefore to:\n", "1. Create a new plot or figure, e.g. with `plt.plot()`.\n", "2. Add elements to the plot using various pyplot functions (e.g. add a title, axes names, legends, ...).\n", "3. 
Render the plot with `plt.show()`.\n", "\n", "Note however that, once a plot is rendered, its underlying plot object is deleted and it cannot be rendered again without rebuilding it. In the cell below, we are calling the `plt.show()` function again, but nothing is being plotted:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Calling the show() function does not produce anything,\n", "# because the underlying plot object no longer exists.\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Add a title, name the axes\n", "To add a title to our plot:\n", "* `plt.title()`\n", "\n", "To label the axes we use the following functions:\n", "* `plt.xlabel()` for the X axis\n", "* `plt.ylabel()` for the Y axis" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = [1,2,3,5]\n", "y = [1,2,3,8]\n", "\n", "# Create plot.\n", "plt.plot(x,y)\n", "\n", "# Add a title and axis names to our plot.\n", "plt.title(\"my very first plot, now with a title\")\n", "plt.xlabel(\"X values\")\n", "plt.ylabel(\"Y values\")\n", "\n", "# Render the plot.\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Adding a legend\n", "Adding a legend to your plot is often a necessity. 
Luckily, matplotlib makes it easy, thanks to the `plt.legend()` function.\n", "\n", "Here is an example to illustrate its usage:\n", "* Let's generate 100 numbers equally spaced between 0 and 10 (we use `numpy` to do that easily).\n", " Then let's apply the `sin()` and `cos()` functions to these numbers and plot the result" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "x = np.linspace(0, 10, 100)\n", "plt.plot(x, np.sin(x))\n", "plt.plot(x, np.cos(x))\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* To indicate which curve is the sine and which is the cosine, let's add a legend to our plot with \n", " `plt.legend()`.\n", "* By default, the legend is placed at what matplotlib considers to be the \"best\" location, i.e. the \n", " location with the least overlap with other features.\n", "* **Important:** to be able to display a legend, pyplot **needs a label for each plotted element**. \n", " * This is specified by passing the `label=...` argument to `plt.plot()`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.plot(x, np.sin(x), label='sin')\n", "plt.plot(x, np.cos(x), label='cos')\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* More control over the position of the legend can be achieved through the `loc` argument, e.g.:\n", " * `plt.legend(loc='upper right')`\n", " * `plt.legend(loc='upper center')`\n", " * `plt.legend(loc='best')` - this is the default.\n", " * [see here for all options](https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.legend.html)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "plt.plot(x, np.sin(x), label='sin')\n", "plt.plot(x, np.cos(x), label='cos')\n", "plt.legend(loc='upper right')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Saving your plot as a file\n", "Now that you have created a plot, you might also want to save it permanently as an image file. This is exactly what `plt.savefig()` is designed to do. The only mandatory argument for `plt.savefig()` is the file name (or path + name) under which to save the plot:\n", "* `plt.savefig(fname)`\n", "\n", "Note that the extension given to the file name is used by matplotlib to determine the format of the file:\n", "* `plot.png` will create a PNG file - this is the default.\n", "* `plot.jpg` will create a JPEG file.\n", "* `plot.pdf` will create a PDF file.\n", "\n", "Other optional arguments that relate to image quality or size can also be passed to `plt.savefig()`, for instance:\n", "* `dpi`: dots-per-inch value to use for the output image file.\n", "* `pil_kwargs`: only applies to formats written through Pillow, such as JPEG - a dictionary of extra options, e.g. `pil_kwargs={'quality': 90}` to set the JPEG quality from 1 (low quality, smaller size) to 95 (highest quality, largest size). Older matplotlib versions accepted this as a separate `quality` argument, which recent versions no longer support.\n", "* `transparent`: if True, the image background is set to transparent. 
\n", "\n", "For a more comprehensive documentation of the `plt.savefig()` function, [see this link](https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.savefig.html).\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "plt.plot(x, np.sin(x), label='sin')\n", "plt.plot(x, np.cos(x), label='cos')\n", "plt.legend(loc='best')\n", "\n", "plt.savefig('sin_cos_plot.png', dpi=300) # saves the plot as a png file.\n", "plt.savefig('sin_cos_plot.jpg', dpi=150) # saves the plot as a jpeg file.\n", "plt.savefig('sin_cos_plot.pdf', dpi=150) # saves the plot as a PDF file." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
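The optional `savefig()` arguments described above can be combined, as in the minimal sketch below. The file name is only an illustrative choice, and `bbox_inches='tight'` (which trims the empty margin around the plot) is an extra option that is not covered above. Note that `plt.savefig()` is called before any `plt.show()`, since the plot no longer exists once it has been rendered.\n", "\n", "```python\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "x = np.linspace(0, 10, 100)\n", "plt.plot(x, np.sin(x), label='sin')\n", "plt.legend()\n", "\n", "# Illustrative file name; the .png extension selects the PNG format.\n", "plt.savefig('sin_transparent.png', dpi=200, transparent=True, bbox_inches='tight')\n", "```\n", "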
\n", "\n", "## The object-oriented mode\n", "In object-oriented mode, the steps to create a figure/plot are the following:\n", "1. Create a **figure** and an **axes** object.\n", "2. Add content to the **axes** object (plotted data, legends, axes labels).\n", "3. Display or save the figure.\n", "\n", "Several options exist to create a new **figure** and an **axes** object.\n", "* One that is frequently used is the `plt.subplots()` function, which returns a tuple composed \n", " of a figure object, and an axes object `(figure, axes)`. This tuple is generally immediately \n", " **unpacked** into its individual components (the **figure** and the **axes** object), which is why you will frequently see the syntax:\n", " * `fig, ax = plt.subplots()`\n", "\n", "\n", "* Another method is to first create a figure object, and then use its `add_subplot()` method to\n", " create an axes object.\n", " ```python\n", " fig = plt.figure()\n", " ax = fig.add_subplot()\n", " ```" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Two methods of creating new figure and axes objects.\n", "# In both cases, the figure is just an empty shell at this point.\n", "fig, ax = plt.subplots()\n", "\n", "fig = plt.figure()\n", "ax = fig.add_subplot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is important to remember that the **axes** object is the object onto which plots are drawn. 
Here we use the `plot()`, `set_xlabel()`, `set_ylabel()`, `set_title()` and `legend()` methods of the **axes** object to add elements to the plot.\n", "\n", "As you might have noted, many of the axes methods have the same name as the pyplot functions but with an added `set_` prefix:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Add content to the axes object\n", "fig, ax = plt.subplots()\n", "ax.plot(x, np.sin(x), label='sin')\n", "ax.plot(x, np.cos(x), label='cos')\n", "ax.set_title(\"sine and cosine plot\")\n", "ax.set_xlabel(\"value\")\n", "ax.set_ylabel(\"sine/cosine value\")\n", "ax.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we can also save our figure to an image file with the figure's `savefig()` method. The arguments of the `savefig()` method are the same as those of the `plt.savefig()` function we saw earlier." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Save figure to a file.\n", "fig.savefig('sin_cos_plot_objectoriented.png', dpi=300)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
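Because both interfaces drive the same underlying objects, they can also be mixed. A minimal sketch, using `plt.gcf()` and `plt.gca()` (two pyplot functions not introduced above) to retrieve the figure and axes that the procedural calls created implicitly:\n", "\n", "```python\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "x = np.linspace(0, 10, 100)\n", "\n", "# Procedural calls: pyplot keeps track of the current figure and axes.\n", "plt.plot(x, np.sin(x), label='sin')\n", "plt.title('sine plot')\n", "plt.legend()\n", "\n", "# Retrieve the implicit figure and axes objects ...\n", "fig, ax = plt.gcf(), plt.gca()\n", "\n", "# ... and continue in object-oriented style on the same plot.\n", "ax.set_xlabel('value')\n", "ax.set_ylabel('sine value')\n", "plt.show()\n", "```\n", "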
\n", "\n", "## Exercise 6.1\n", "* Try to plot the Distance, Velocity, and Acceleration vs. Time for Usain Bolt's 100 meter Olympic record (2008).\n", "* You should draw the following elements:\n", " * Plot distance, velocity and acceleration as a function of time.\n", " * Add a title.\n", " * Add labels to your axes.\n", " * Add a legend to the plot.\n", " \n", "Here is the raw data:\n", "\n", "```python\n", "time = [0, 1.85, 2.87, 3.78, 4.65, 5.50, 6.31, 7.14, 7.96, 8.79, 9.69]\n", "distance = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]\n", "velocity = [0, 5.41, 9.80, 10.99, 11.49, 11.76, 12.35, 12.05, 12.20, 12.05, 11.11]\n", "acceleration = [0, 2.92, 4.30, 1.31, 0.57, 0.32, 0.73, -0.36, 0.18, -0.18, -1.04]\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
\n", "
\n", "\n", "# Plot styling\n", "Changing the style of a plot (e.g. color, use a line or dots, shape of line or dots) can be achieved using two different methods:\n", "* passing specific **keyword arguments** to `plt.plot()` (i.e. passing arguments while specifying their names).\n", "* using so called **format strings**. These are essentially shortcuts to the keyword arguments.\n", "\n", "## Changing plot style via keyword arguments\n", "Here is a list of useful keyword arguments that can be passed to `plt.plot()`:\n", "* `color`: the color of the line or dots. This argument accepts different types of inputs, e.g.:\n", " * strings giving the name of the color: e.g. `\"green\"`, `\"blue\", \"red\", \"orange\", \"yellow\", \"black\"`.\n", " * hex strings: e.g. `\"#008000\"`.\n", " * RGB tuples of floats between 0 and 1: e.g. `(0.6, 0.6, 0.6)`. \n", "\n", "\n", "* `marker`: the type of symbol used to draw data points, for instance:\n", " * `\"o\"`: circles.\n", " * `\"s\"`: square.\n", " * `\"^\"`: triangle.\n", " * `\"+\"`: as a \"+\" shape.\n", " * `\"None\"`: do not draw data points (the keyword can also simply be omitted). Note that this value is\n", " the string \"None\", not the python None object.\n", " * [see here for a complete list of matplotlib markers](https://matplotlib.org/2.1.1/api/markers_api.html#module-matplotlib.markers)\n", "\n", "\n", "* `linestyle`: the type of line to draw, e.g. `\"solid\"`, `\"dashed\"`, `\"dotted\"`, `\"dashdot\"`.\n", "* `linewidth`: float value giving the width (thickness) of the line in points. With matplotlib's default settings, this is `1.5`.\n", "* `markersize`: float value giving the size of markers (i.e. 
symbols used to draw points).\n", "* `markerfacecolor` and `markeredgecolor`: color of respectively the inside and edge of markers.\n", "\n", "A complete list of keyword arguments can be found [here](https://matplotlib.org/2.1.1/api/_as_gen/matplotlib.pyplot.plot.html)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = [1,2,3,5]\n", "y = [1,2,3,8]\n", "\n", "# Create plot with different line and marker styles.\n", "plt.plot(x, y)\n", "plt.plot(x, [v * 2 for v in y], color='orange', marker='^', linestyle='dashed')\n", "plt.plot(x, [v * 3 for v in y], color='red', marker='o', linestyle='dotted')\n", "plt.plot(x, [v * 4 for v in y], color='#008000', marker='s', linestyle='dashdot', linewidth=1)\n", "plt.plot(x, [v * 5 for v in y], color='#008000', marker='s', linestyle='dashdot', linewidth=3,\n", "         markersize=12, markerfacecolor=\"yellow\", markeredgecolor=\"purple\")\n", "# color='None' hides the line itself, so that only the markers are drawn.\n", "plt.plot(x, [v * 10 for v in y], color='None', marker='+', markersize=8, markeredgecolor=\"grey\")\n", "\n", "# Render the plot.\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Changing plot style via format strings\n", "**Format strings** are a sort of shortcut to pass style arguments as a single and short string to `plt.plot()`. 
The format string must be passed as a positional argument to the function, just after the data to plot.\n", "\n", "Here are a few common format string values:\n", "* color: `'b'` (blue), `'r'` (red), `'g'` (green), `'y'` (yellow), `'k'` (black).\n", "* marker symbol: `'o'` (circle), `'s'` (square), `'^'` (triangle), `'+'` (\"+\").\n", "* line type: `'-'` (solid), `'--'` (dashed), `':'` (dotted), `'-.'` (dashdot).\n", "* [see here for a complete list of format strings](https://matplotlib.org/2.1.1/api/_as_gen/matplotlib.pyplot.plot.html)\n", "\n", "These can then be combined in a single string, making for a really compact way to pass styling instructions.\n", "Here are a few examples:\n", "* `'-ob'` = solid blue line with blue dots for data points.\n", "* `'--g'` = dashed green line with no markers for data points.\n", "\n", "Note that, while more convenient, format strings do not offer all the customization options provided by keyword arguments." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Similar plot as above, but styled with format strings.\n", "x = [1,2,3,5]\n", "y = [1,2,3,8]\n", "plt.plot(x, y, '-')\n", "plt.plot(x, [v * 2 for v in y], '--^')\n", "plt.plot(x, [v * 3 for v in y], ':or')\n", "plt.plot(x, [v * 4 for v in y], '-.sg')\n", "plt.plot(x, [v * 5 for v in y], '-.sg', linewidth=3, markersize=12,\n", "         markerfacecolor=\"yellow\", markeredgecolor=\"purple\")\n", "plt.plot(x, [v * 10 for v in y], '+k')\n", "\n", "# Render the plot.\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise 6.1 - continued\n", "* Change the styling of your Usain Bolt 100m plot, e.g.:\n", " * Draw some lines with dashes.\n", " * Add data point marker symbols on some lines.\n", " * Change the color of lines and marker symbols." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
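As a small illustration of the two styling methods from the plot styling section, the sketch below draws the same data twice, once with a format string and once with the equivalent keyword arguments. The second line is shifted up by 1 only to keep the two lines from overlapping.\n", "\n", "```python\n", "import matplotlib.pyplot as plt\n", "\n", "x = [1, 2, 3, 5]\n", "y = [1, 2, 3, 8]\n", "\n", "# Format string: dashed line ('--'), circle markers ('o'), red ('r').\n", "plt.plot(x, y, '--or', label='format string')\n", "\n", "# Same style spelled out with keyword arguments.\n", "plt.plot(x, [v + 1 for v in y], linestyle='dashed', marker='o', color='red',\n", "         label='keyword arguments')\n", "\n", "plt.legend()\n", "plt.show()\n", "```\n", "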
\n", "
\n", "
\n", "\n", "# Matplotlib Figures\n", "So far we have seen how to draw a single plot. But matplotlib can do much more than that, allowing you e.g. to have multiple subplots. This requires introducing the concept of **figure**, which can be seen as the \"drawing board\" on which you then add individual plots.\n", "\n", "## Modifying a plot's size\n", "A first use of the matplotlib figure is to modify a plot's size and aspect ratio. This can be done using the `plt.figure()` function:\n", "* `plt.figure(figsize=(width, height))`, where width and height define the size of the plot in inches.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = [1,2,3,5]\n", "y = [1,2,3,8]\n", "\n", "# Here we ask for our plot to be 10 inches wide and 2 inches high.\n", "plt.figure(figsize=(10, 2))\n", "plt.plot(x, y, color='green', marker='o', linestyle='dashdot')\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Same as above, but now with a square aspect-ratio.\n", "plt.figure(figsize=(5, 5))\n", "plt.plot(x, y, color='green', marker='o', linestyle='dashdot')\n", "plt.show()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Object-oriented approach\n", "The size of a figure can also be modified in the object-oriented approach, with either:\n", "* `plt.figure(figsize=(5, 5))` - exactly the same as in procedural mode.\n", "* The `.set_size_inches(w=10,h=3)` method of a figure object." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = [1,2,3,5]\n", "fig = plt.figure(figsize=(5, 5))\n", "ax = fig.add_subplot()\n", "ax.plot(x, [v**2 for v in x])\n", "plt.show()\n", "\n", "fig = plt.figure()\n", "fig.set_size_inches(w=10,h=3)\n", "ax = fig.add_subplot()\n", "ax.plot(x, [v**2 for v in x])\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
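Since `figsize` is expressed in inches, the size of a figure in pixels depends on its resolution. A minimal sketch, assuming the `dpi` argument of `plt.figure()` and the figure's `get_size_inches()` method and `dpi` attribute (none of which are introduced above):\n", "\n", "```python\n", "import matplotlib.pyplot as plt\n", "\n", "# Request a 10 x 2 inch figure at 100 dots per inch.\n", "fig = plt.figure(figsize=(10, 2), dpi=100)\n", "\n", "# Size in pixels = size in inches x dots per inch.\n", "width_px, height_px = fig.get_size_inches() * fig.dpi\n", "print(width_px, height_px)\n", "```\n", "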
\n", "\n", "## Drawing multiple plots per figure\n", "Having more than one plot per figure is possible using pyplot's `plt.subplot()` or `plt.subplots()` functions.\n", "Choosing between these two functions comes down to whether we want to use the **procedural** or an **object-oriented** approach:\n", "* In the **procedural** approach, we make repeated calls to `plt.subplot()` to each time focus on a different subplot of the figure.\n", "* In the **object-oriented** approach, we use `plt.subplots()` to create a **figure** and an **axes** object. Each individual subplot is then drawn on one element of the **axes** object.\n", "\n", "In both cases, the underlying idea is that sub-plots are organized on a grid defined by a **number of rows** and **number of columns**." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Procedural approach\n", "This approach uses repetitive calls to the `plt.subplot()` function, passing it 3 arguments each time (**in this order**): \n", "* number of rows.\n", "* number of columns.\n", "* number of the sub-plot to currently draw on.\n", "\n", "Example: `plt.subplot(2, 3, 1)` will:\n", "* Divide the figure into a grid of 6 subplot positions distributed over 2 rows and 3 columns.\n", "* Set the focus on the first subplot (i.e. any command passed at this point will draw on subplot 1).\n", "\n", "Subplots are numbered starting with 1 for the top-left subplot, increasing as we move to the right of the first row, then continuing on the second row, etc.\n", "\n", "The general code structure then looks something like this:\n", "```python\n", " plt.figure()\n", " plt.subplot(2, 3, 1) # create a 2 rows x 3 cols matrix of subplots, draw on the first subplot.\n", " ... # pyplot instructions for the first subplot.\n", " ...\n", " plt.subplot(2, 3, 2) # move to the second subplot.\n", " ... 
# pyplot instructions for the second subplot.\n", " ...\n", " plt.subplot(2, 3, 3) # move to the third subplot.\n", " # etc.\n", "```\n", "\n", "Here is an actual example where we draw a figure with 2 subplots:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = np.linspace(0, 10, 100)\n", "\n", "# Create figure.\n", "plt.figure(figsize=(12, 4))\n", "\n", "# Create subplots - focus on subplot 1.\n", "plt.subplot(1, 2, 1)\n", "plt.plot(x, np.sin(x), label='sin', color='cornflowerblue')\n", "plt.xlabel('X')\n", "plt.ylabel('Y')\n", "plt.title('sin function')\n", "plt.legend(loc='best')\n", "\n", "# Focus on subplot 2.\n", "plt.subplot(1, 2, 2)\n", "plt.plot(x, np.cos(x), label='cos', color='orange')\n", "plt.xlabel('X')\n", "plt.ylabel('Y')\n", "plt.title('cos function')\n", "plt.legend(loc='best')\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Object-oriented approach\n", "We start by invoking `plt.subplots(nrows, ncols)`, and pass it the number of rows and columns that our subplot matrix should have. \n", "For instance:\n", "* `plt.subplots(2, 3)` creates a matrix of 2 rows and 3 columns, allowing you to draw a total of 6 subplots.\n", "* `plt.subplots(4, 3)` creates a matrix of 4 rows and 3 columns, allowing you to draw a total of 12 subplots.\n", "\n", "`plt.subplots()` returns a tuple composed of a **figure** object, and an **axes** object. \n", "This tuple is traditionally immediately **unpacked** into its individual components, which is why you will very often see the syntax:\n", "* `fig, ax = plt.subplots(nrows, ncols)`\n", "\n", "where `fig` stores the **figure** object, and `ax` the **axes** object. 
\n", "The `ax` object is an array-like structure, allowing you to access the individual **AxesSubplot** objects, on each of which you can then draw a subplot.\n", "\n", "Here is an example to illustrate this:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "nrows = 2\n", "ncols = 4\n", "fig, ax = plt.subplots(nrows, ncols)\n", "fig.set_size_inches(15, 4)\n", "counter = 0\n", "for row in range(nrows):\n", "    for col in range(ncols):\n", "        ax[row][col].text(0.4, 0.45, f\"plot {counter}\", fontsize=20)\n", "        counter += 1\n", "\n", "print(\"Axes object is of type:\", type(ax), \", and has a dimension (rows x cols) of:\", ax.shape)\n", "print(\"It is composed of objects of type:\", type(ax[0][0]))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "x = np.linspace(0, 10, 100)\n", "\n", "# subplots() returns a tuple with a \"Figure\" and an \"Axes\" object, that we immediately unpack.\n", "fig, ax = plt.subplots(nrows=1, ncols=2)\n", "fig.set_size_inches(w=12, h=3)\n", "\n", "# Accessing the first subplot with ax[0]\n", "ax[0].plot(x, np.sin(x), label='sin', color='cornflowerblue')\n", "ax[0].set_ylabel('Y')\n", "ax[0].set_xlabel('X')\n", "ax[0].set_title('Sin function')\n", "ax[0].legend(loc='best')\n", "\n", "# Accessing the second subplot with ax[1]\n", "ax[1].plot(x, np.cos(x), label='cos', color='orange')\n", "ax[1].set_ylabel('Y')\n", "ax[1].set_xlabel('X')\n", "ax[1].set_title('Cos function')\n", "ax[1].legend(loc='best')\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Vertical layout\n", "By inverting the values for `nrows` and `ncols`, we can have a vertical layout.\n", "\n", "Because this layout does not give enough space by default between the plots, we must adjust it\n", "with the function `plt.subplots_adjust()`.\n", "\n", "There is more documentation 
[here](https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.subplots_adjust.html)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = np.linspace(0, 10, 100)\n", "\n", "# Create figure and axes objects. Define the size of the figure.\n", "fig, ax = plt.subplots(nrows=2, ncols=1, figsize=(8,5))\n", "\n", "# Draw the first subplot with ax[0]\n", "ax[0].plot(x, np.sin(x), label='sin', color='cornflowerblue')\n", "ax[0].set_ylabel('Y') # you can define the fontsize of the labels\n", "ax[0].set_xlabel('X')\n", "ax[0].set_title('Sin function') \n", "ax[0].legend(loc='best')\n", "\n", "# Draw the second subplot with ax[1]\n", "ax[1].plot(x, np.cos(x), label='cos', color='orange')\n", "ax[1].set_ylabel('Y')\n", "ax[1].set_xlabel('X')\n", "ax[1].set_title('Cos function')\n", "ax[1].legend(loc='best')\n", "\n", "# Adjust space between plots in the subplot layout\n", "plt.subplots_adjust(hspace=0.8)\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
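When a grid holds many subplots, addressing each one by index becomes repetitive. Since `ax` is a numpy array, a loop is one alternative. The sketch below assumes the array's `flat` iterator (not introduced above), and uses `fig.tight_layout()` as another way to keep titles and labels from overlapping; the choice of functions to plot is arbitrary.\n", "\n", "```python\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "x = np.linspace(0, 10, 100)\n", "functions = {'sin': np.sin, 'cos': np.cos, 'tanh': np.tanh, 'sqrt': np.sqrt}\n", "\n", "fig, ax = plt.subplots(nrows=2, ncols=2, figsize=(8, 5))\n", "\n", "# ax.flat iterates over the 2 x 2 array of axes, row by row.\n", "for axis, (name, func) in zip(ax.flat, functions.items()):\n", "    axis.plot(x, func(x), label=name)\n", "    axis.set_title(name + ' function')\n", "    axis.legend(loc='best')\n", "\n", "fig.tight_layout()\n", "plt.show()\n", "```\n", "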
\n", "\n", "### Multiple plots with different sizes\n", "We can create a grid-shaped combination of differently-sized subplots using the following functions:\n", "* `gridspec.GridSpec(nrows, ncols)` to specify the geometry of the grid where we will draw our subplots.\n", "* `plt.subplot2grid(shape, loc, rowspan, colspan)`, which is similar to the `subplot` function.\n", " * `shape`: a sequence of 2 integers giving the shape of the grid, e.g. `(3, 3)`.\n", " * `loc`: a sequence of 2 integers giving the location of the subplot within the grid. \n", " * The location is given as `(row, col)` coordinate, e.g. `(0, 0)` is the top-leftmost subplot\n", " and `(0,2)` is the subplot in the 1st row, 3rd col. \n", " * When the subplot spans over several rows/columns, the coordinate is the one of the top-left\n", " subplot in the range of rows/columns covered by the subplot.\n", " * `rowspan`: The number of grid rows occupied by the subplot. By default `rowspan=1`.\n", " * `colspan`: The number of grid columns occupied by the subplot. 
By default `colspan=1`.\n", " \n", "\n", "You can find more information with `help(gridspec.GridSpec)` and `help(plt.subplot2grid)`, or [here](https://matplotlib.org/3.2.1/tutorials/intermediate/gridspec.html).\n", "\n", "Note that the first thing we need to do is to import matplotlib's `gridspec` module:\n", "\n", " import matplotlib.gridspec as gridspec" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.gridspec as gridspec\n", "import numpy as np\n", "\n", "x = np.linspace(0, 10, 100)\n", "\n", "# Plot figure with subplots of different sizes.\n", "fig = plt.figure(1)\n", "\n", "# Set up the subplot grid: in this example we create a 3 by 3 matrix.\n", "gridspec.GridSpec(3, 3)\n", "\n", "# Draw a large subplot: it spans over 3 rows and 2 columns.\n", "plt.subplot2grid((3, 3), (0, 0), rowspan=3, colspan=2) \n", "plt.title('Sin function')\n", "plt.xlabel('x values')\n", "plt.plot(x, np.sin(x), label='sin', color='cornflowerblue')\n", "\n", "# small subplot 1\n", "plt.subplot2grid((3, 3), (0, 2))\n", "plt.title('Tan function')\n", "plt.xlabel('x values')\n", "plt.plot(x, np.tan(x), label='tan', color='cornflowerblue')\n", "\n", "# small subplot 2\n", "plt.subplot2grid((3, 3), (1, 2))\n", "plt.title('Cos function')\n", "plt.xlabel('x values')\n", "plt.plot(x, np.cos(x), label='cos', color='cornflowerblue')\n", "\n", "\n", "# Automatically adjust subplot \n", "fig.tight_layout()\n", "\n", "# Set the size of the figure\n", "fig.set_size_inches(w=11,h=7)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
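In the example above, the `GridSpec` object is created but not used directly. As an alternative sketch in the object-oriented style, a `GridSpec` can be indexed like a 2D array and each slice passed to the figure's `add_subplot()` method. The variable names below are illustrative only.\n", "\n", "```python\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import matplotlib.gridspec as gridspec\n", "\n", "x = np.linspace(0, 10, 100)\n", "\n", "fig = plt.figure(figsize=(11, 7))\n", "gs = gridspec.GridSpec(3, 3)\n", "\n", "# All 3 rows, first 2 columns: the large subplot.\n", "ax_big = fig.add_subplot(gs[:, :2])\n", "ax_big.plot(x, np.sin(x), color='cornflowerblue')\n", "ax_big.set_title('Sin function')\n", "\n", "# A single grid cell: first row, third column.\n", "ax_small = fig.add_subplot(gs[0, 2])\n", "ax_small.plot(x, np.cos(x), color='cornflowerblue')\n", "ax_small.set_title('Cos function')\n", "\n", "fig.tight_layout()\n", "plt.show()\n", "```\n", "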
\n", "
\n", "\n", "## Scatter plots\n", "* Good to visualize two numeric variables.\n", "* Drawn with `plt.scatter(x, y, s=None, c=None)`.\n", " * `x, y` mandatory arguments, the **x and y coordinates** of the data points.\n", " * `s` optional, the **marker size** used for each data point, which makes it possible to draw different points\n", " with different sizes.\n", " * `c` optional, the **marker color** used for each data point.\n", " * `marker` optional, the **type of marker** to use to represent data points (same as what we have seen in\n", " the plot styling section).\n", "* For the full documentation, see [here](https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.scatter.html)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import random\n", "\n", "# Generate some random data for X and Y coordinates.\n", "length = np.random.uniform(low=0, high=10, size=(50,))\n", "width = np.random.uniform(low=8, high=10, size=(50,))\n", "\n", "# Plot the data. Since we do not give any values for the \"s\", \"c\" and \"marker\" arguments\n", "# of plt.scatter, the default values are used.\n", "plt.scatter(length, width)\n", "plt.title(\"length vs. width\")\n", "plt.xlabel('length')\n", "plt.ylabel('width')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's plot our values again, but this time passing the `s` and `c` optional arguments to `scatter()`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Plot scatter points with different sizes and colors.\n", "marker_size = range(1, 101, 2)\n", "marker_color = random.choices(['red','teal','orange','lightgreen','pink'], k=50)\n", "plt.scatter(length, width, c=marker_color, s=marker_size)\n", "\n", "plt.title(\"length vs width\")\n", "plt.xlabel('length')\n", "plt.ylabel('width')\n", "\n", "plt.show()" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "### Exercise 6.2:\n", "* Given the lists of height, weight and gender below:\n", " * Create a scatter plot of height (y) as a function of weight (x).\n", " * Give different colors to females and males (see `gender` list below). \n", " Specifically, use \"teal\" color for males, and \"darkorange\" for females.\n", " * Add a title, and axes labels.\n", "\n", "```\n", "height = [150, 152, 152, 152, 152, 152, 152, 152, 155, 155, 155, 155, 155, 155, 155, 155, 155, 155, 157, 157, 157, 157, 157, 157, 157, 157, 160, 160, 160, 160, 160, 160, 160, 160, 160, 160, 160, 163, 163, 163, 163, 163, 163, 163, 163, 163, 163, 163, 165, 165, 165, 165, 165, 165, 165, 165, 168, 168, 168, 168, 168, 168, 168, 168, 168, 168, 168, 168, 168, 168, 170, 170, 170, 170, 170, 170, 170, 173, 173, 173, 173, 173, 173, 173, 173, 173, 173, 173, 175, 175, 175, 175, 175, 175, 175, 178, 178, 178, 178, 178, 178, 178, 180, 180, 180, 183, 183, 183, 183, 183, 183, 183, 183, 183, 183, 183, 183, 185, 188, 188, 191, 193]\n", "\n", "weight = [73, 72, 45, 59, 95, 62, 54, 68, 77, 92, 83, 56, 47, 77, 63, 81, 52, 110, 54, 59, 61, 90, 81, 63, 63, 59, 52, 77, 50, 63, 58, 72, 63, 65, 69, 59, 72, 67, 59, 59, 65, 77, 61, 111, 93, 74, 68, 59, 68, 50, 77, 74, 65, 72, 77, 86, 61, 65, 53, 60, 67, 99, 61, 70, 97, 54, 88, 99, 68, 70, 56, 86, 63, 65, 72, 81, 68, 79, 70, 81, 79, 85, 99, 77, 71, 81, 89, 72, 106, 56, 90, 81, 87, 81, 94, 83, 77, 94, 86, 95, 77, 77, 92, 77, 77, 131, 97, 101, 108, 99, 90, 91, 86, 113, 126, 104, 81, 77, 117, 63, 68, 101]\n", "\n", "gender = [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 2, 2, 2, 2, 2, 1, 2, 2, 1, 1, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 1, 1, 2, 2, 1, 2, 2, 1, 1, 1, 2, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]\n", "```\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "# Other plot types" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pie Charts" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "channels = ['Netflix', 'HBO', 'Disney+', 'Amazon Prime']\n", "clients = [45, 20, 25, 10]\n", "\n", "# Create the figure and keep a reference to it, so that it can be resized.\n", "fig = plt.figure()\n", "\n", "# Specify the size of the plot\n", "fig.set_size_inches(w=11,h=7)\n", "\n", "plt.pie(clients, labels = channels, shadow = True, autopct='%1.1f%%')\n", "plt.legend(title=\"List of channels\",loc='upper left')\n", "\n", "plt.tight_layout()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Histograms\n", "* To graphically summarize the distribution of your data, you can plot it as a histogram using the `plt.hist()` function.\n", "* You can find more documentation about plotting histograms [here](https://matplotlib.org/3.2.1/gallery/statistics/hist.html)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "mu = 4 # mean\n", "sigma = 1 # standard deviation\n", "\n", "# Draw 1000 values from a normal distribution.\n", "x = np.random.normal(mu, sigma, 1000)\n", "\n", "plt.hist(x)\n", "plt.xlim((min(x), max(x))) # you can also specify the limits of the plot axes\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Boxplots\n", "* You can draw boxplots with the `plt.boxplot()` function, which takes the data to summarize as a list or array." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "mu = 4 # mean\n", "sigma = 1 # standard deviation\n", "x = np.random.normal(mu, sigma, 1000)\n", "\n", "plt.boxplot(x)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also visualize multiple boxplots inside the same plot." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "mu = 4 # mean\n", "sigma = 1 # standard deviation\n", "x = np.random.normal(mu, sigma, 1000)\n", "\n", "y = x + 2*np.random.randn(len(x))\n", "\n", "# to draw several boxplots, pass a list with one dataset per box\n", "plt.boxplot([x,y])\n", "\n", "plt.xticks([1, 2], ['x', 'y']) # set the current tick locations and labels of the x-axis\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Bar plots" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mu = 4 # mean\n", "sigma = 1 # standard deviation\n", "x = np.random.normal(mu, sigma, 1000)\n", "\n", "y = np.random.normal(6, 3, 1000)\n", "\n", "# The height of each bar is the standard deviation of one variable.\n", "std_variables = [np.std(x), np.std(y)]\n", "variable_name = ['x', 'y']\n", "\n", "plt.bar(variable_name, std_variables, color='cornflowerblue')\n", "\n", "plt.ylabel('Standard deviation')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Heatmap plots\n", "\n", "Heatmap plots allow you to visualize the values contained in a matrix as colors.\n", "\n", "You can find more documentation and examples about heatmap plots [here](https://matplotlib.org/3.1.1/gallery/images_contours_and_fields/image_annotated_heatmap.html) and [here too](https://python-graph-gallery.com/heatmap/)." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "channels = ['Netflix', 'HBO', 'Disney+', 'Amazon Prime']\n", "country = ['Spain', 'Switzerland', 'France', 'Italy']\n", "# One row per country, one column per channel.\n", "clients = np.array([[45, 20, 25, 10],\n", "                    [43, 18, 22, 17],\n", "                    [30, 28, 17, 25],\n", "                    [33, 23, 20, 24]])\n", "\n", "\n", "fig, ax = plt.subplots()\n", "# the cmap parameter selects the colormap, e.g. cmap=\"Spectral\"\n", "im = ax.imshow(clients, cmap=\"Set3\")\n", "\n", "# We want to show all ticks...\n", "ax.set_xticks(np.arange(len(channels)))\n", "ax.set_yticks(np.arange(len(country)))\n", "\n", "# ... and label them with the respective list entries\n", "ax.set_xticklabels(channels)\n", "ax.set_yticklabels(country)\n", "\n", "# Rotate the tick labels and set their alignment.\n", "plt.setp(ax.get_xticklabels(), rotation=45, ha=\"right\",\n", "         rotation_mode=\"anchor\")\n", "\n", "# Loop over data dimensions and create text annotations.\n", "# Rows (i) correspond to countries, columns (j) to channels.\n", "for i in range(len(country)):\n", "    for j in range(len(channels)):\n", "        text = ax.text(j, i, clients[i, j],\n", "                       ha=\"center\", va=\"center\", color=\"k\")\n", "\n", "fig.tight_layout()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Colormaps\n", "\n", "Choosing a suitable colormap helps to represent the plotted data faithfully.\n", "There are different types of colormaps:\n", "* Sequential: 'Purples', 'Blues', 'Greens', ...\n", "* Diverging: 'RdYlGn', 'Spectral', 'coolwarm', ...\n", "* Cyclic: 'twilight', 'twilight_shifted', 'hsv'\n", "* Qualitative: 'Pastel1', 'Pastel2', 'Paired', ...\n", "\n", "You can find more extensive documentation [here](https://matplotlib.org/2.0.1/users/colormaps.html)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
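As a minimal sketch of how the colormap families listed in the Colormaps section are applied in practice, the code below draws the same matrix with a sequential, a diverging and a qualitative colormap. The random matrix and the names `data` and `cmap_names` are made up for illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt

# Illustrative data only: a 10 x 10 matrix of random values between 0 and 1.
rng = np.random.default_rng(seed=0)
data = rng.random((10, 10))

# One colormap from three of the families listed in the Colormaps section.
cmap_names = ['Blues', 'coolwarm', 'Pastel1']

fig, axes = plt.subplots(1, len(cmap_names), figsize=(12, 4))
for ax, name in zip(axes, cmap_names):
    im = ax.imshow(data, cmap=name)   # the cmap argument selects the colormap
    ax.set_title(name)
    fig.colorbar(im, ax=ax)           # shows which color encodes which value

fig.tight_layout()
plt.show()
```

A sequential map such as Blues suits values that run from low to high, whereas a qualitative map such as Pastel1 has no inherent order and is meant for categories rather than continuous values.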
\n", "
\n", "\n", "# Additional Exercises" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise 6.3\n", "Given a dictionary with the popularity of each programming language, plot the popularity of the different languages using the plot type that you think suits the data best." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "code_languages = {'Java': 17.2,\n", "                  'Python': 22.2,\n", "                  'PHP': 8.8,\n", "                  'JavaScript': 8,\n", "                  'R': 9.3,\n", "                  'C++': 6.7}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise 6.4\n", "\n", "Grab the most recent data on:\n", " * [number of hospitalised persons](https://raw.githubusercontent.com/daenuprobst/covid19-cases-switzerland/master/covid19_hospitalized_switzerland_openzh.csv)\n", "\n", "1. Download this data file and read it as a pandas `DataFrame`:\n", "\n", "> import pandas as pd\n", "\n", "> datHospit = pd.read_csv('file.csv')\n", "\n", "2. Replace all the NaN values by 0:\n", "\n", "> datHospit = datHospit.fillna(0)\n", "\n", "3. Plot the values of one canton to see how they evolved through time." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.6" } }, "nbformat": 4, "nbformat_minor": 2 }