DV-20: Exercises | SIB

Visualizing a dataset

The file etubiol.csv contains information collected from biology students at the University of Lausanne. You can load the file in R using the command:

etubiol <- read.csv("etubiol.csv")

You can view the contents of the dataset using the command:

View(etubiol)

Look at the different variables, and think about how you would visualize each of them separately.
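As a starting point, the sketch below inspects a small made-up data frame and picks a default plot according to the type of each variable. The column names (sex, height) and the values are assumptions used for illustration only; the real etubiol columns may differ.

```r
# Toy stand-in for etubiol; column names and values are hypothetical.
toy_vars <- data.frame(
  sex    = factor(c("F", "M", "F", "M", "F")),
  height = c(165, 180, 170, 176, 162)
)

# Structure and per-column summaries
str(toy_vars)
summary(toy_vars)

# Classify each column as numeric or categorical
var_types <- sapply(toy_vars, function(x) {
  if (is.numeric(x)) "numeric" else "categorical"
})
print(var_types)

# A histogram suits a numeric variable, a bar plot of counts a categorical one
hist(toy_vars$height, main = "Height", xlab = "Height")
barplot(table(toy_vars$sex), main = "Sex", ylab = "Count")
```

The same calls should work on etubiol once the toy column names are replaced by the real ones.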

Then, think about how you would best display them to see whether each variable is distributed differently between males and females.

For example, draw a scatterplot of students' height vs. weight, specifying a title and using different colours according to sex. Then plot two histograms comparing the distribution of heights for both sexes.
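One possible approach with base R graphics is sketched below on simulated data. The column names (sex, height, weight), the units and the values are assumptions, not the actual contents of etubiol.

```r
# Toy stand-in for etubiol; column names, units and values are hypothetical.
set.seed(1)
toy <- data.frame(
  sex    = factor(rep(c("F", "M"), each = 20)),
  height = c(rnorm(20, 166, 6), rnorm(20, 179, 7)),
  weight = c(rnorm(20, 60, 7), rnorm(20, 74, 9))
)

# One colour per level of sex, looked up by level name
sex_cols <- c(F = "darkorange", M = "steelblue")

# Scatterplot of weight against height, coloured by sex, with a title
plot(toy$height, toy$weight,
     col = sex_cols[as.character(toy$sex)], pch = 19,
     main = "Height vs. weight by sex",
     xlab = "Height (cm)", ylab = "Weight (kg)")
legend("topleft", legend = names(sex_cols), col = sex_cols, pch = 19)

# Shared breaks so that both histograms use the same bins and x-axis
breaks <- seq(floor(min(toy$height)), ceiling(max(toy$height)) + 5, by = 5)

# Two stacked histograms, one per sex
old_par <- par(mfrow = c(2, 1))
for (s in levels(toy$sex)) {
  hist(toy$height[toy$sex == s], breaks = breaks, col = sex_cols[s],
       main = paste("Height, sex =", s), xlab = "Height (cm)")
}
par(old_par)
```

Using the same breaks for both panels keeps the bins aligned, which makes a shift between the two distributions easier to see.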

Survey on graphics

The file quiz.csv contains the average scores that you gave for the "utility" and "aesthetic" of the graphs that were shown to you. You can load it in R using the command:

quiz <- read.csv("quiz.csv")

How would you visualize this data?

Note: if you want to use ggplot2 and split the data according to the type of score (utility vs aesthetic, for example), you will need to convert the data to the "long" format (only one value per row, and "type of score" becomes a separate variable). You can do this with the melt() command from the reshape2 package:

library(reshape2)
quiz_long <- melt(quiz)
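To make the effect of melt() concrete, here is a minimal sketch on a made-up table. The column names (graph, utility, aesthetic), the new names (score_type, score) and the values are assumptions for illustration; the real quiz.csv may be laid out differently.

```r
library(reshape2)

# Toy stand-in for quiz; column names and values are hypothetical.
quiz_toy <- data.frame(
  graph     = c("A", "B", "C"),
  utility   = c(3.5, 4.2, 2.1),
  aesthetic = c(4.0, 2.8, 3.3)
)

# id.vars names the columns that identify a row;
# all other columns are stacked into a single value column
quiz_toy_long <- melt(quiz_toy, id.vars = "graph",
                      variable.name = "score_type", value.name = "score")
print(quiz_toy_long)
```

When id.vars is omitted, as in melt(quiz), reshape2 uses the factor and character columns as identifiers and prints a message saying which ones it picked.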

Timecourse experiment

The timecourse.csv file (which can be read in the same way as the previous files) contains measurements from 10 animals (5 wild-type, 5 knock-out), taken at 5 time points.

How would you represent the data?

How would you represent the data if we are interested in seeing both the individual data points and the average per group?

Note: you will probably need to convert the data to the "long" format, as described above.
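One way to show both levels at once is sketched below with reshape2 and ggplot2 on simulated data. The wide layout (one row per animal, columns t1 to t5), the column names and the values are assumptions; the actual timecourse.csv may be organised differently.

```r
library(reshape2)
library(ggplot2)

# Toy stand-in for timecourse; layout, names and values are hypothetical.
set.seed(2)
tc_toy <- data.frame(
  animal   = paste0("a", 1:10),
  genotype = rep(c("WT", "KO"), each = 5),
  matrix(rnorm(50, mean = 10), nrow = 10,
         dimnames = list(NULL, paste0("t", 1:5)))
)

# Wide to long: one row per animal and time point
tc_long <- melt(tc_toy, id.vars = c("animal", "genotype"),
                variable.name = "time", value.name = "value")

# Turn the labels "t1".."t5" into the numbers 1..5
tc_long$time <- as.integer(sub("t", "", as.character(tc_long$time)))

# Average per genotype and time point
tc_mean <- aggregate(value ~ genotype + time, data = tc_long, FUN = mean)

p <- ggplot(tc_long, aes(x = time, y = value, colour = genotype)) +
  geom_line(aes(group = animal), alpha = 0.4) +      # one faint line per animal
  geom_point(alpha = 0.6) +                          # individual measurements
  geom_line(data = tc_mean) +                        # group averages
  geom_point(data = tc_mean, shape = 18, size = 4) + # emphasise the averages
  labs(title = "Time course by genotype", x = "Time point", y = "Value")
print(p)
```

Drawing the individual animals faintly and the group averages with larger markers keeps the raw data visible without hiding the trend per group.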

Last modified: Tuesday, 4 February 2020, 12:17 PM