    Biological Data Analysis Enabled via High Performance Computing

    Riyadh, Saudi Arabia, 3-5 November 2013

    In the Big Data era, analysing data-intensive genomes is a complex task that requires the combined knowledge and tools of specialists in genomics, data analysis and high performance computing (HPC). In particular, genome analysis pipelines usually require high performance compute clusters and large-scale, fast storage systems. This workshop provides both theoretical background and hands-on experience in using computational tools efficiently on KAIMRC’s modern HPC infrastructure to analyse biological data.

    Objectives:

    • Acquire the basic and advanced UNIX knowledge needed to work with KAIMRC's Linux-based HPC infrastructure
    • Obtain basic knowledge of high throughput and high performance computing
    • Learn how to use KAIMRC's infrastructure for scientific applications (including hands-on exercises)
    • Discuss bioinformatics pipelines (for example RNA-Seq and ChIP-Seq) for analysing biological data, including the steps required to get the data from the sequencing machine to the HPC infrastructure and through to the final analysis

    Requirements:

    • Basic understanding of working with command line tools on Linux or Windows-based operating systems
    • Experience with a scripting language such as Bash or Perl is an asset

    Background:

    This workshop is based on a joint project between:

    • KAIMRC (King Abdullah International Medical Research Center), Riyadh, Saudi Arabia
    • SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland