Applied Machine Learning - Week 1

Monday the 20th - Friday the 24th of April 2026

Groups: We highly recommend that you work on and discuss the exercises in a group. For the final project, you should find a group (administered by Norman).

Reference Data I (Aleph b-quark identification):
To learn about ML, we need a dataset to train and test on that is nice, simple, scaled, sampled mutually exclusively, numerically sound, unflawed, possibly large, and perfectly labelled (i.e. simulated), and that comes with competitive predictions to compare performance against. It sounds like an impossibility, but I happen to have the "Aleph b-quark tagging" dataset, which includes a Neural Net prediction (first paper from 1992!); see bottom of page.

Monday 20th of April (afternoon):
Lectures: Intro to course, outline, groups, and discussion of data and goals (TP).
     Introduction to AppML Course and Introduction to Machine Learning (TP).

Exercise: Setup of infrastructure (Python, GitHub, etc.). Test your Python setup with ML_MethodsDemos.ipynb.
     Getting a feel for the Curse of Dimensionality, making life in high dimensions a lonely one!
     Inspecting data and making a "human" decision tree for classification. Code for initial analysis: BjetSelection_original.ipynb (classifying with if-statements!)
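
To get a feel for the Curse of Dimensionality mentioned in the exercise above, the following minimal sketch may help. It is not taken from the course notebooks: the function name distance_contrast and the choice of uniformly random points in a unit hypercube are assumptions made purely for illustration. It compares the nearest and the farthest of a set of random points as seen from one query point. As the dimension grows, the ratio approaches 1, i.e. all points become almost equally far away.

```python
import math
import random


def distance_contrast(n_dim, n_points=500, seed=42):
    """Return nearest/farthest distance from one random query point
    to n_points uniformly sampled points in the unit hypercube.

    Hypothetical helper for illustration; not part of the course code.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(n_dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dim)]
        dists.append(math.dist(query, point))  # Euclidean distance
    return min(dists) / max(dists)


# A ratio close to 0 means "some neighbours are much closer than others";
# a ratio close to 1 means "everything is roughly equally far away".
for n_dim in (1, 2, 10, 100, 1000):
    ratio = distance_contrast(n_dim)
    print(f"dim = {n_dim:5d}   nearest/farthest = {ratio:.3f}")
```

With a fixed number of points, the nearest neighbour in high dimensions is barely closer than the farthest one, which is one way to see why distance-based reasoning gets "lonely" there.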

Wednesday 22nd of April (morning - exceptionally starting at 8:15!):
Lectures: Intro to Tree-based algorithms, Stochastic Gradient Descent, and Training/Validation (TP).

Exercise: Classification of b-quark jets in Aleph data with tree-based methods.
     Compare performance to your own Decision Tree and the Aleph NN.
     Additional (reference) data, on classifying stars, galaxies, and quasars: Data_SDSS.txt (6.3 MB).
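
Comparing a hand-made decision tree, a trained tree-based model, and the Aleph NN requires a common performance measure. One common choice (an assumption here; the exercise may use a different metric) is the area under the ROC curve (AUC). The sketch below computes it with the standard library only, using the rank-sum identity. The function name roc_auc and the tiny label and score lists are hypothetical and made up for illustration; they are not Aleph data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: sequence of 0/1 truth values (1 = signal, e.g. a b-jet).
    scores: sequence of classifier outputs, higher = more signal-like.
    Tied scores receive their average rank.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one example of each class")

    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the group of entries sharing the same score.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average of the 1-based ranks i+1..j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j

    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


# Made-up toy example: hard 0/1 decisions from if-statements
# versus continuous scores from some trained model.
truth = [0, 0, 1, 1, 0, 1]
hand_tree_decisions = [0, 0, 1, 0, 0, 1]
model_scores = [0.10, 0.40, 0.80, 0.35, 0.20, 0.90]

print(f"hand-made tree AUC: {roc_auc(truth, hand_tree_decisions):.3f}")
print(f"model AUC:          {roc_auc(truth, model_scores):.3f}")
```

Because the function accepts both hard decisions and continuous scores, the same number can be quoted for every method being compared.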

Wednesday 22nd of April (afternoon):
Lectures: Introduction to NeuralNet-based algorithms and Loss Functions (TP).
     Additional slides: ML2026_AppliedML_Top10.pdf

Exercise: Classification of b-quark jets in Aleph data with Neural Net-based methods.
     Compare performance to your tree-based method(s) and the Aleph NN.
     Challenge: Given a "large" dataset on b-jets, see how performance improves with data size.
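
The challenge above amounts to measuring a learning curve: train on increasingly large subsets and evaluate on a fixed test set. The sketch below shows the idea using the standard library only. Everything in it is an assumption for illustration: the toy two-class Gaussian data stands in for the b-jet dataset, a simple nearest-centroid classifier stands in for the tree or Neural Net, and all function names are hypothetical.

```python
import random


def make_sample(n, rng, n_dim=5, shift=1.0):
    """Toy data: unit Gaussians, class 1 shifted by `shift` in every dimension.
    Labels alternate 0, 1, 0, 1, ... so both classes are always present."""
    data = []
    for i in range(n):
        label = i % 2
        x = [rng.gauss(shift * label, 1.0) for _ in range(n_dim)]
        data.append((x, label))
    return data


def fit_centroids(train):
    """'Training' of a nearest-centroid classifier: mean vector per class."""
    n_dim = len(train[0][0])
    sums = {0: [0.0] * n_dim, 1: [0.0] * n_dim}
    counts = {0: 0, 1: 0}
    for x, label in train:
        counts[label] += 1
        for k, value in enumerate(x):
            sums[label][k] += value
    return {c: [s / counts[c] for s in sums[c]] for c in (0, 1)}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda c: sq_dist(centroids[c], x))


def accuracy(centroids, sample):
    correct = sum(predict(centroids, x) == label for x, label in sample)
    return correct / len(sample)


def learning_curve(train_sizes, n_test=2000, seed=1):
    """Return [(n_train, test_accuracy), ...] for nested training subsets."""
    rng = random.Random(seed)
    test = make_sample(n_test, rng)            # fixed, independent test set
    full_train = make_sample(max(train_sizes), rng)
    curve = []
    for n in train_sizes:
        centroids = fit_centroids(full_train[:n])
        curve.append((n, accuracy(centroids, test)))
    return curve


for n, acc in learning_curve([2, 10, 100, 1000, 5000]):
    print(f"n_train = {n:5d}   test accuracy = {acc:.3f}")
```

The key design points carry over to the real exercise: keep the test set fixed and independent of the training data, and use nested training subsets so that the change in performance reflects the data size rather than a different sample.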

Last updated: 15th of April 2026 by Troels Petersen.