Applied Statistics - Week 5
Monday the 20th - Tuesday the 21st of December 2021
The following describes what we will go through during this
week of the course. The chapter references and computer exercises are
expected to be read, understood, and solved by the beginning of the
following class, where I'll briefly go through the exercise
solution.
General notes, links, and comments:
How to produce (great?) plots:
Plotting inspiration and code
A neat tree-based algorithm is XGBoost, which is described in
the XGBoost paper.
Monday:
We will use the last two days of this year to consider the theme
MultiVariate Analysis (MVA),
that is, the analysis of data with more than one (typically many) variables. To begin with, we will consider
the relatively simple linear case, which is described by Fisher's Discriminant, and then move on to
more complex sets of data, for which more advanced non-linear methods, such as Boosted Decision Trees (BDT)
and Neural Networks (NN), are considerably more useful.
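As a minimal sketch of the linear case, the Fisher weights for two samples A and B can be taken as w ∝ (Σ_A + Σ_B)⁻¹ (μ_A − μ_B), where μ and Σ are the sample means and covariance matrices. The toy Gaussian data, the function name `fisher_discriminant`, and the separation measure used below are illustrative assumptions and are not taken from the course notebooks.

```python
import numpy as np

def fisher_discriminant(sample_a, sample_b):
    """Return Fisher weights w for two samples of shape (n_events, n_variables)."""
    mu_a = sample_a.mean(axis=0)
    mu_b = sample_b.mean(axis=0)
    # Sum of the two within-class covariance matrices
    cov_sum = np.cov(sample_a, rowvar=False) + np.cov(sample_b, rowvar=False)
    # Solve (Sigma_A + Sigma_B) w = (mu_A - mu_B) instead of inverting explicitly
    return np.linalg.solve(cov_sum, mu_a - mu_b)

# Toy data (assumption): two correlated 2D Gaussians with shifted means
rng = np.random.default_rng(42)
cov = [[1.0, 0.8], [0.8, 1.0]]
sample_a = rng.multivariate_normal([0.0, 1.0], cov, size=5000)
sample_b = rng.multivariate_normal([1.0, 0.0], cov, size=5000)

w = fisher_discriminant(sample_a, sample_b)

# Project every event onto w to get a single discriminating variable
f_a = sample_a @ w
f_b = sample_b @ w

def separation(u, v):
    """Distance between the means in units of the combined spread."""
    return abs(u.mean() - v.mean()) / np.sqrt(u.var() + v.var())

print("Fisher weights:", w)
print("Separation in variable 0 alone:", separation(sample_a[:, 0], sample_b[:, 0]))
print("Separation in Fisher variable: ", separation(f_a, f_b))
```

In this toy setup neither variable separates the samples well on its own, while the linear combination does, because the Fisher weights take the correlation between the variables into account.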
Reading:
NOTE: You should by now have read the curriculum (roughly Barlow chapters 1-8).
Lecture(s):
Bayes Theorem
MultiVariate Analysis - Part I
Zoom:
Link to lecture.
Recording of Lecture video.
Link to exercises.
Computer Exercise(s):
2par_discriminant.ipynb
fisher_discriminant.ipynb and
data.
Tuesday:
While Fisher's Discriminant is a powerful (and transparent) tool, it is often outperformed by
Machine Learning (ML) algorithms, which this lecture and exercise
are meant to whet your appetite for. Note that Machine Learning is not part of the curriculum.
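To give a flavour of tree-based methods: a decision tree is built from successive cuts on single variables, and a boosted decision tree combines many such trees. The sketch below finds only one such cut (a "decision stump") by scanning thresholds on one variable. The toy data and the function name `best_cut` are illustrative assumptions, and this is not the algorithm used in the course notebooks.

```python
import numpy as np

def best_cut(x, y):
    """Scan thresholds on one variable and return (threshold, accuracy) of the best cut.

    x: 1D array of values; y: 1D array of labels (1 = signal, 0 = background).
    Events with x > threshold are classified as signal.
    """
    best_threshold, best_accuracy = None, 0.0
    for threshold in np.unique(x):
        predicted = (x > threshold).astype(int)
        accuracy = np.mean(predicted == y)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = threshold, accuracy
    return best_threshold, best_accuracy

# Toy data (assumption): signal centred at +1, background at -1, unit width
rng = np.random.default_rng(1)
signal = rng.normal(loc=1.0, scale=1.0, size=2000)
background = rng.normal(loc=-1.0, scale=1.0, size=2000)

x = np.concatenate([signal, background])
y = np.concatenate([np.ones(signal.size, dtype=int),
                    np.zeros(background.size, dtype=int)])

threshold, accuracy = best_cut(x, y)
print(f"Best cut: x > {threshold:.2f}, accuracy = {accuracy:.3f}")
```

A full decision tree repeats this search on each of the two resulting subsamples, across all variables, which is what lets it capture non-linear structure that a single linear discriminant cannot.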
Reading:
No formal reading, but please consider these introductions to
Decision Trees and
Neural Nets.
Further inspiration can be found here: ML links.
Lecture(s):
MultiVariate Analysis - Part II
Zoom:
Link to lecture.
Recording of Lecture video.
Link to exercises.
Computer Exercise(s):
MachineLearningExample.ipynb and
associated data sample: DataSet_ML.txt.
Illustration of decision trees through an interactive example:
DecisionTree_InteractiveExample.ipynb.
Finally, a link to an
online
course on Machine Learning (by Udacity).
Alternatively, our own
Applied Machine Learning
course runs in block 4 (Schedule C).
Last updated: 15th of December 2021.