R is a complete and flexible system for statistical analysis that has become a tool of choice for biologists and biomedical scientists who need to analyze and visualize large amounts of data. One reason for this success is the large number of contributed packages, which are freely available and can be installed and run directly from R. In bioinformatics in particular, most published papers include a link to an R package implementing the methods described in the article. This "First Steps with R" course is aimed at beginners who want to become familiar with the R environment and master the most common commands so that they can start exploring their own datasets.

During the first part of this workshop, researchers and professionals involved in Big Data management at VitalIT/SIB and in Data Management Plan preparation at UNIL/CHUV will teach you best practices in data management, including how to collect, describe, store, secure and archive research data. You will be introduced to the preparation of a Data Management Plan (DMP), an evolving document that describes how research data will be managed during and after a research project.

This course will present all the bioinformatics tools required to analyze RNA-seq gene expression data, from the raw data to the biological interpretation. This two-day course will discuss the following topics:

  • Quality control and read cleanup
  • RNA-seq read mapping to the genome and transcriptome
  • Gene read counting; differential expression of genes and exons
  • GO enrichment and pathway analysis

This course is designed to provide researchers in biomedical sciences with experience in the application of basic statistical analysis techniques to a variety of biological problems.

The course will combine lectures on statistics and practical exercises. The participants will learn how to work with the widely used "R" language and environment for statistical computing and graphics.

Topics covered during the course include reminders about numerical and graphical summaries and hypothesis testing, followed by multiple testing, linear models, correlation and regression, among other topics. Participants will also have the opportunity to ask questions about the analysis of their own data.


With the constant evolution of technologies, laboratory biologists face an increasing need for bioinformatics skills to deal with high-throughput data storage, retrieval and analysis.

Although several resources developed for such tasks have a web interface (most of the time, the first choice of biologists), many operations can be handled more efficiently from the command line (CLI).

In this course, R programmers will learn how to create R packages, the best way to make R scripts reusable. Participants will learn how to identify and create clear, clean and usable packages in R.

This course is recommended even for programmers who do not plan to distribute their R scripts or datasets: R packages are also useful for developers who work alone and want to keep track of their scripts and the related documentation.

This two-day course will provide an overview of the RNA-seq analysis pipeline, as well as the downstream analysis of the resulting data using Bioconductor packages in R. The course will cover the following topics:

  • The structure of an RNA-seq analysis pipeline:
    • Raw data quality check;
    • RNA-seq read alignment;
    • Gene expression level quantification and normalization by read counting;
    • De novo transcript reconstruction and differential splicing.
  • Overview of downstream analysis:
    • Differential expression analysis with R/Bioconductor packages;
    • Class discovery: use of principal component analysis, clustering, heatmaps and gene set enrichment analysis in RNA-seq analysis.

Next Generation Sequencing (NGS) techniques will not be covered in this course, and neither experimental design nor the statistical methods will be discussed in detail.