Autumn School Machine Learning applied to Bioinformatics - November 2017
Autumn School Machine Learning applied to Systems Biology
Schwarzenberg 19-24 November 2017
This page is addressed to registered participants. To access course description and application form (now closed), please click here.
For any assistance, please contact training@sib.swiss
-
Hotel
The address of the Autumn School event is Hotel und Bildungszentrum Matt, Mattstrasse 19, 6103 Schwarzenberg, Switzerland.
This is a nice location in the mountains, near the famous Pilatus. Schwarzenberg, a village of about 1700 inhabitants, is about 20 minutes by car from Luzern. The hotel website provides further information, but unfortunately only in German: https://www.bzmatt.ch/
Venue
By public transportation:
- The final stop is Schwarzenberg Ennematt. To get there:
- From any main railway station, travel to Luzern station. Then,
- Luzern -> Malters by regional train, then
- Malters -> Schwarzenberg Ennematt by bus, then
- From the bus stop, cross the road, take Mattstrasse and walk for 5 minutes.
You can find timetables at https://www.sbb.ch/en
On Sunday, you might plan the following:
Luzern -> Schwarzenberg Ennematt
16h16 -> 16h43 or
17h16 -> 17h43
By car:
The address of the hotel is:
Hotel & Bildungszentrum Matt
Mattstrasse 19
CH-6103 Schwarzenberg / Luzern
Car parking is free of charge.
-
The approximate timing for a typical day is as follows:
09:00 - 12:30 lectures
12:30 - 14:00 lunch
14:00 - 17:00/30 practical / exercises
17:00/30 - 19:00/30 free
19:00/30 dinner
Sunday 19 November - Broad introduction and welcome dinner
Dr Frédéric Schütz, SIB Swiss Institute of Bioinformatics
17:00 - 18:00 Arrival of the participants
18:15 Informal welcome and presentation of the event (Grégoire Rossier, co-organizer, SIB Swiss Institute of Bioinformatics)
18:30 Broad machine learning introduction
19:30 "Round table" with participants' background and expectations
~20:15 Welcome dinner
Monday 20 November: Introduction to machine learning
Dr Frédéric Schütz, SIB Swiss Institute of Bioinformatics
Morning: lectures
- Supervised vs unsupervised learning
- Introduction to some classification and machine learning algorithms: k-means, LDA/QDA, Random forest, etc.
- Evaluating performance
- generalization/overfitting
- training, test sets
- cross-validation, bootstrap, jackknife
- Model selection
- ROC curves
Tuesday 21 November: Best practice in applied machine learning
Dr Eric Paquet, Computational Systems Biology, EPFL
Morning: lectures
- Pitfalls, experimental design and batch effect
- Diagnostic/QC plots in R
- PCA
- Clustering/heatmaps
- Boxplots
- Normalization
- Feature selection
- Regularization (lasso, ridge and elastic net)
- Neural networks (perceptron)
- Kernel trick (spectral)
- Reproducible research, Sweave, Jupyter notebooks, git
- Example of the MAQC II
- Example of applied machine learning in Systems Biology
- Cancer subtypes: how many subtypes are there, and how to identify them
- HMM
- image analysis (drug discovery)
- image analysis (morphology classification)
- HMM
- Image analysis
- Dataset to use: http://www.nature.com/articles/sdata201718
Wednesday 22 November: Participants’ day
Morning: "Participants, the floor is yours..."
- Lightning presentations
- Poster session
(see the detailed lists below)
- Visit to a glass factory, including fun activities.
Thursday 23 November: Machine Learning and metagenomics to study microbial communities
Dr Luis Pedro Coelho, EMBL, Heidelberg, Germany
Morning: lectures
- Brief introduction to microbial community wetlab technologies
- Presentation of important questions in the field
- Overview of raw data processing with the NGLess tool
- Classification based on metagenomics-derived features
- Example based on Zeller et al., 2014: http://doi.org/10.15252/msb.20145645
- Feature normalization/filtering
- Biomarker discovery
Afternoon: exercises
- Clustering for metagenomics: Metagenomic species, mOTUs, subspecies discovery…
- Machine learning for the exploration of community/environmental links:
- Example based on Sunagawa et al., 2015: http://doi.org/10.1126/science.1261359
- Different forms of ordination analysis
- Feature normalization for clustering
- Discussion of batch effects and techniques to minimize their impact on the final analysis
- Computer vision techniques for studying micro-eukaryotic communities
Friday 24 November: Deep learning in single-cell analysis
Dr María Rodríguez-Martínez, IBM Research Lab Zurich
Morning: lectures
- Introduction to deep learning
- Why and how deep
- Activation functions
- Cost functions
- Backpropagation
- Regularization
- Optimization
- Multi-Layer Perceptron (MLP)
- Autoencoders (AE)
- Convolutional Neural Networks (CNN)
- Recurrent Neural Networks (RNN)
- Word Embeddings for molecular interaction inference (INtERAcT)
- Deep SWATH-MS, deep and unsupervised MS processing (DeepSWATH)
- Characterizing cell populations on single-cell data
Posters
- Bulak Arpat: Analysis of Translational Pausing by Disome Profiling
- Amel Bekkar: Logical modeling of cardiovascular disease
- Violeta Castelo Szekely: Sequence determinants of DENR-MCTS1 mediated translation reinitiation
- Chiara Cotroneo: Computational prediction of clusters of bacterial genes
- Sunniva Foerster: Pairwise drug combinations against Neisseria gonorrhoeae
- Anamarija Fofonjka: An elastic instability generates predictable folding of the frilled dragon erectile ruff during development
- Qingyao Huang: Integrative analysis of cancer genome profiling data to study the interplay of genetic background and molecular mechanisms in cancer
- Lidia Lacruz: Prevision of facial morphology within the context of forensic DNA phenotyping
- Mose Manni: Machine Learning for predictions in infectious diseases outbreaks
- Marco Meola: Improved classification of short read sequences from dairy products for bacterial species identification using the manually curated reference database DAIRYdb
- Gautam Munglani: Image feature recognition and quantification of tip-growing cells
- Rocío Rama Ballesteros: Pattern recognition of relevant features involved in coevolution
- Stephan Schmeing: ReSequenceR: Simulating more realistic high-throughput sequencing data
- Marthe Solleder: Analysis and prediction of phosphopeptide-HLA interaction
- Daniel Spies: De-convolution of epigenetic regulators of RNAi mutant mESC in MES and 2i media
- Christoph Stritt: Population genomics of transposable elements in a Mediterranean grass species
Lightning presentations
Each presentation should last 5 minutes, followed by 2-3 minutes of questions.
- Nicolas Blöchliger: Quantifying the uncertainty of antimicrobial susceptibility testing
- Janko Tackmann: Inference of Microbial Interaction Networks from Massive Data Sets through Causal Knowledge Discovery
- Monica Ticlla Ccenhua: Identifying metabolic pathways in Mycobacterium tuberculosis relevant for transmission
- Mariamawit Ashenafi: 3D Functional Organization of Plant Nucleus
- Marti Bernardo Faura: A systems biologist's expedition: from systems biomedicine to plant biotechnology
- Adhideb Ghosh: Prediction of gain-of-function and loss-of-function variants using in silico bioinformatics tools
- David Dreher: Towards shapes as landmarks in development
- Athos Fiori: Gene expression dynamic across cell cycle
- Simon Friedensohn: Mining immune repertoires for functional antibodies
POSTER SESSION
- Annika Gable: Using biological networks to identify new drug targets for rare genetic diseases
- Tilman Flock: Beyond structure-guided drug design
- Alicia Kaestli: A software for automated arrhythmia detection in iPSC-derived cardiomyocytes
- Mattia Tomasoni: GWAS on features extracted from the retinal images
- Lisa Lamberti: Efficient methods for detecting genetic interactions
- Hyunjin Shim: Feature learning of virus genome evolution with the nucleotide skip-gram neural network
- Marie Zufferey: Comparison of TAD calling methods
- Garif Yalak: Exoenzymes: Enzymes of the extracellular matrix
Knowledge / skills:
- Active participation.
- Ready for networking with peers and teachers.
- Good programming skills in Python and R.
- Basic statistical knowledge.
- Basics of terminal (shell) usage.
Here is the list of software, with links to the installation pages:
- R 3.4.2, and the following packages: gplots, e1071, class, ROCR, ggplot2, randomForest, caret, nnet
- RStudio Desktop (open source license)
- Python, and the following packages: scikit-learn, matplotlib, seaborn, numpy, scipy
- Jupyter notebook
- Docker
- Weka 3.8
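To check before the course that the Python packages listed above are installed, something like the following minimal sketch may help. It is not part of the official course material. The function name `find_missing` and the list `REQUIRED` are hypothetical names chosen for this illustration. The only assumption about the packages is their usual import names (scikit-learn is imported as `sklearn`).

```python
import importlib.util

# Import names for the Python packages listed above.
# Note: scikit-learn is imported under the name "sklearn".
REQUIRED = ["sklearn", "matplotlib", "seaborn", "numpy", "scipy"]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed Python packages are installed.")
```

Running the script in the Python environment you intend to use for the course prints the packages that still need to be installed, if any.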