The course will focus on learning and internalizing the practices of unit testing, refactoring, and version control through hands-on experience. The first morning will start with an introduction to these concepts and the tools used to support them. In the afternoon, we will transition to a code clinic and work together in small groups, applying these practices to improve code brought by participants. The second day will continue with the code clinic.
The use of NGS is increasing in several biological fields thanks to a very rapid decrease in cost. However, an experiment often produces hundreds of Gb of data, which makes downstream analysis very challenging and requires bioinformatics skills.
In this module, we will introduce the most widely used sequencing technologies and explain the concepts behind them.
We will also introduce repositories such as the European Nucleotide Archive (ENA) and the Sequence Read Archive (SRA), from which you can retrieve raw data for specific experiments. We will practice using command-line tools to search for and fetch NGS raw data efficiently.
Finally, using different datasets, we will practice quality control screening, filtering reads for better downstream analysis, mapping reads to a reference genome, and visualizing the output.
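As an illustration of the read-filtering step mentioned above, here is a minimal Python sketch, not the tooling used in the course. It assumes uncompressed four-line FASTQ records with Phred+33 quality encoding; the function names and the thresholds (`min_mean_q=20`, `min_length=1`) are hypothetical choices made for this example.

```python
from statistics import mean


def parse_fastq(lines):
    """Yield (header, sequence, quality) tuples from an iterable of FASTQ lines.

    Assumes the simple four-lines-per-record layout (no wrapped sequences).
    """
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it).rstrip("\n")
        next(it)  # '+' separator line, ignored
        qual = next(it).rstrip("\n")
        yield header, seq, qual


def mean_phred(qual, offset=33):
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    return mean(ord(c) - offset for c in qual)


def filter_reads(records, min_mean_q=20, min_length=1):
    """Keep reads that are long enough and whose mean quality passes the threshold."""
    for header, seq, qual in records:
        if len(seq) >= min_length and mean_phred(qual) >= min_mean_q:
            yield header, seq, qual


# Two toy reads: 'I' encodes Phred 40, '!' encodes Phred 0.
example = [
    "@read1", "ACGT", "+", "IIII",
    "@read2", "ACGT", "+", "!!!!",
]
kept = list(filter_reads(parse_fastq(example)))
print([header for header, _, _ in kept])
```

On real data, the same functions could be applied to an open file handle instead of the toy list, since `parse_fastq` only needs an iterable of lines.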
With the constant evolution of technologies, laboratory biologists face an increasing need for bioinformatics skills to handle the storage, retrieval and analysis of high-throughput data.
Although several resources developed for such tasks have a web interface (most of the time, the first choice of biologists), many operations can be handled more efficiently from the command-line interface (CLI).