Applied Statistics - Week 6

Friday the 3rd of January 2025

The following is a description of what we will go through during this week of the course. The chapter references and computer exercises are considered read, understood, and solved by the beginning of the following class, where I'll briefly go through the exercise solutions.

General notes, links, and comments:
  • A neat tree-based algorithm is XGBoost, described in the nice XGBoost paper.
  • An alternative which is faster and roughly equally performant is LightGBM.
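Both libraries build on the same core idea: fit many small decision trees sequentially, each one correcting the residual errors of the ensemble built so far. The following is a toy pure-Python sketch of that gradient-boosting loop using depth-1 trees (stumps) and squared-error loss; the function names are made up for illustration, and the real XGBoost/LightGBM algorithms add regularization, second-order gradients, histogram binning, and much more.

```python
# Toy gradient boosting with decision stumps (squared-error loss).
# Illustration only -- not the actual XGBoost/LightGBM implementation.

def fit_stump(x, residuals):
    """Find the 1D threshold split minimizing squared error on the residuals."""
    best = None
    for t in sorted(set(x)):
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        if not left or not right:
            continue  # degenerate split, skip
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, lmean, rmean)
    _, t, lmean, rmean = best
    return lambda xi, t=t, l=lmean, r=rmean: l if xi <= t else r

def boost(x, y, n_rounds=20, lr=0.5):
    """Sequentially fit stumps to the residuals of the current ensemble."""
    pred = [0.0] * len(x)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [pi + lr * stump(xi) for pi, xi in zip(pred, x)]
    return lambda xi: sum(lr * s(xi) for s in stumps)

# Example: the ensemble quickly learns a noiseless step function.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
model = boost(x, y)
```

Note the shrinkage factor `lr`: each stump only moves the prediction part of the way toward the residual, which is the same learning-rate idea the real libraries expose as a key hyperparameter.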


    Friday:
    We will start the new year with an introduction to Machine Learning, which covers non-linear methods for classification and regression, typically based on algorithms such as Boosted Decision Trees (BDT) and Neural Networks (NN).
    While Fisher's Discriminant is a powerful (and transparent) tool, it is superseded by the more performant Machine Learning (ML) algorithms, which this lecture and exercise are meant to whet your appetite for. Note that Machine Learning is not part of the curriculum.
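As a reminder of the transparency of Fisher's discriminant: the separating direction is simply w = S_W^{-1} (mu_A - mu_B), where S_W is the summed within-class scatter matrix and mu_A, mu_B the class means. A minimal pure-Python sketch on hypothetical toy data (2D, so the matrix inverse is written out by hand):

```python
# Fisher's linear discriminant on toy 2D data (hypothetical points).
A = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0)]  # class A
B = [(6.0, 5.0), (7.0, 8.0), (8.0, 7.0)]  # class B

def mean(pts):
    n = len(pts)
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)

def scatter(pts, mu):
    """Within-class scatter: sum of (x - mu)(x - mu)^T over the sample."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for x, y in pts:
        dx, dy = x - mu[0], y - mu[1]
        s[0][0] += dx * dx; s[0][1] += dx * dy
        s[1][0] += dy * dx; s[1][1] += dy * dy
    return s

muA, muB = mean(A), mean(B)
SA, SB = scatter(A, muA), scatter(B, muB)
SW = [[SA[i][j] + SB[i][j] for j in range(2)] for i in range(2)]

# Fisher direction w = SW^{-1} (muA - muB), using the explicit 2x2 inverse.
det = SW[0][0] * SW[1][1] - SW[0][1] * SW[1][0]
d = (muA[0] - muB[0], muA[1] - muB[1])
w = ((SW[1][1] * d[0] - SW[0][1] * d[1]) / det,
     (-SW[1][0] * d[0] + SW[0][0] * d[1]) / det)

def project(p):
    """Project a point onto the Fisher direction; classes separate along it."""
    return w[0] * p[0] + w[1] * p[1]
```

For this toy sample the two classes end up in disjoint intervals along w, which is exactly the 1D separation the ML methods generalize to non-linear decision boundaries.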

    Reading:
  • No formal reading, but please consider these introductions to Decision Trees and Neural Nets.
  • Further inspiration can be found here: ML links.
    Lecture(s):
  • Introduction to Machine Learning
    Computer Exercise(s):
  • MachineLearningExample.ipynb and associated data sample: DataSet_ML.txt.

    Finally, a link to the NBI Applied Machine Learning course, which runs in block 4 (Schedule C).
    Last updated: 29th of December 2024.