Applied Statistics - Week 5

Monday the 20th - Tuesday the 21st of December 2021

The following describes what we will go through during this week of the course. The chapter references and computer exercises are expected to be read, understood, and solved by the beginning of the following class, where I'll briefly go through the exercise solutions.

General notes, links, and comments:
  • How to produce (great?) plots: Plotting inspiration and code
  • A neat tree-based algorithm is XGBoost, which is described in the XGBoost paper.

    Monday:
    We will use the last two days of this year to consider the theme MultiVariate Analysis (MVA), that is, the analysis of data with more than one (typically many) variables. To begin with, we will consider the relatively simple linear case, which is handled by Fisher's Discriminant, and then move on to more complex data sets, for which more advanced non-linear methods, such as Boosted Decision Trees (BDT) and Neural Networks (NN), are more useful.
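As a rough illustration of the linear case, the sketch below computes Fisher weights for two variables using the standard formula w = (Σ_sig + Σ_bkg)⁻¹ (μ_sig − μ_bkg). The toy Gaussian data, the helper names `fisher_weights` and `separation`, and the chosen means and correlation are assumptions made for this example and are not taken from the course exercises.

```python
import numpy as np

def fisher_weights(sig, bkg):
    """Fisher weights w = (Sigma_sig + Sigma_bkg)^-1 (mu_sig - mu_bkg)."""
    mu_diff = sig.mean(axis=0) - bkg.mean(axis=0)
    cov_sum = np.cov(sig, rowvar=False) + np.cov(bkg, rowvar=False)
    # Solve the linear system instead of inverting the matrix explicitly.
    return np.linalg.solve(cov_sum, mu_diff)

def separation(a, b):
    """Distance between the means in units of the combined spread."""
    return abs(a.mean() - b.mean()) / np.sqrt(a.var() + b.var())

# Toy data (assumed for illustration): two strongly correlated variables.
rng = np.random.default_rng(42)
cov = [[1.0, 0.8], [0.8, 1.0]]
sig = rng.multivariate_normal([1.0, 0.0], cov, size=5000)
bkg = rng.multivariate_normal([0.0, 1.0], cov, size=5000)

w = fisher_weights(sig, bkg)
f_sig, f_bkg = sig @ w, bkg @ w  # Fisher discriminant value for each event

print("Fisher weights:          ", w)
print("Separation, variable 0:  ", separation(sig[:, 0], bkg[:, 0]))
print("Separation, variable 1:  ", separation(sig[:, 1], bkg[:, 1]))
print("Separation, Fisher value:", separation(f_sig, f_bkg))
```

With correlated variables like these, the combined Fisher value separates the two samples better than either variable alone, which is the point of combining them linearly.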

    Reading:
  • NOTE: You should by now have read the curriculum (roughly Barlow chapters 1-8).
    Lecture(s):
  • Bayes Theorem
  • MultiVariate Analysis - Part I
    Zoom:
  • Link to lecture. Recording of Lecture video.
  • Link to exercises.
    Computer Exercise(s):
  • 2par_discriminant.ipynb
  • fisher_discriminant.ipynb and data.

    Tuesday:
    While Fisher's Discriminant is a powerful (and transparent) tool, it is superseded by the more performant Machine Learning (ML) algorithms, which this lecture and exercise are meant to whet your appetite for. Note that Machine Learning is not part of the curriculum.
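To give a feeling for why non-linear methods can do better, here is a small sketch on assumed toy data with an XOR-like layout, where the two classes have (nearly) identical means, so no linear combination separates them. The function `tree_says_signal` is a hand-written two-level set of cuts standing in for what a decision tree could learn; it is not a trained model, and all names and numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, sigma = 1000, 0.3

def blob(cx, cy):
    """Gaussian cluster of n points centred at (cx, cy)."""
    return rng.normal([cx, cy], sigma, size=(n, 2))

# XOR-like toy data: each class occupies two diagonally opposite corners.
sig = np.vstack([blob(+1, +1), blob(-1, -1)])
bkg = np.vstack([blob(+1, -1), blob(-1, +1)])

# Linear Fisher discriminant, cut halfway between the projected class means.
cov_sum = np.cov(sig, rowvar=False) + np.cov(bkg, rowvar=False)
w = np.linalg.solve(cov_sum, sig.mean(axis=0) - bkg.mean(axis=0))
cut = 0.5 * (sig.mean(axis=0) + bkg.mean(axis=0)) @ w
fisher_acc = 0.5 * (np.mean(sig @ w > cut) + np.mean(bkg @ w <= cut))

def tree_says_signal(x):
    """Two-level cuts: split on variable 0 first, then on variable 1."""
    return np.where(x[:, 0] > 0, x[:, 1] > 0, x[:, 1] <= 0)

tree_acc = 0.5 * (np.mean(tree_says_signal(sig)) + np.mean(~tree_says_signal(bkg)))

print(f"Fisher accuracy:          {fisher_acc:.3f}")
print(f"Two-level cut accuracy:   {tree_acc:.3f}")
```

On data like this the linear discriminant does about as well as guessing, while two nested cuts separate the classes almost perfectly. Real tree-based algorithms find such cuts automatically from the data.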

    Reading:
  • No formal reading, but please consider these introductions to Decision Trees and Neural Nets.
  • Further inspiration can be found here: ML links.
    Lecture(s):
  • MultiVariate Analysis - Part II
    Zoom:
  • Link to lecture. Recording of Lecture video.
  • Link to exercises.
    Computer Exercise(s):
  • MachineLearningExample.ipynb and associated data sample: DataSet_ML.txt.
  • DecisionTree_InteractiveExample.ipynb: an illustration of decision trees through an interactive example.


    Finally, a link to an online course on Machine Learning (by Udacity).
    Alternatively, our own Applied Machine Learning course runs in block 4 (Schedule C).
    Last updated: 15th of December 2021.