Applied Machine Learning 2025

"Despite the connotations of machine learning and artificial intelligence as a mysterious and radical departure from traditional approaches, we stress that machine learning has a mathematical formulation that is closely tied to statistics, the calculus of variations, approximation theory, and optimal control theory."
[Introduction to Machine Learning, Particle Data Group (pdg.lbl.org) 2023]

Troels C. Petersen - Lecturer, Associate Prof. - NBI, High Energy Physics - Tel. 26 28 37 39 - Mac user - petersen@nbi.dk
Daniel Murnane - Teacher, DDSA Fellow - NBI, High Energy Physics - Tel. 93 83 89 58 - Windows/Linux expert - daniel.murnane@nbi.ku.dk
Johann "Janni" Nikolaides - Teaching assistant, Ph.D. - NBI, Neutrino Physics - Tel. 31 50 23 59 - Linux expert - johann.nikolaides@nbi.ku.dk
Norman Pedersen - Teaching assistant, Ph.D. - NBI/WARD, Vital Sign ML - Tel. 60 74 44 04 - Mac/Windows expert - norman.pedersen@nbi.ku.dk
Aayush Arya - Teaching assistant, Ph.D. - NBI, Astrophysics - Tel. 50 55 01 82 - Mac/Linux expert - aayush.arya@nbi.ku.dk


What, when, where, prerequisites, books, curriculum and evaluation:
Content: Graduate course on Machine Learning and applications/projects in science (7.5 ECTS).
Level: Intended for students at graduate level (4th-5th year) and new Ph.D. students.
Prerequisites: Programming experience (essential, preferably in Python) and Math (calculus and linear algebra).
When: Mondays 13-14 / 14-17 and Wednesdays 9-10 / 10-12 & 13-14 / 14-17 for lectures/exercises (Week Schedule Group C).
Where (lectures): Mondays: Library Room at DIKU (bib 4-0-17, right next to Lille UP1). Wednesdays: Lille UP1 at DIKU.
Where (exercises): Mondays: Biocenter 4-0-02, 4-0-10, and 4-0-32. Wednesday mornings: DIKU 1-0-18, 1-0-26, and 1-0-30. Wednesday afternoons: Biocenter 4-0-02 and 2-2-07/09; see the KU Room Schedule plan.
Format: Shorter lectures followed by computer exercises and discussion with emphasis on application and projects.
Textbook: References are made to (the excellent!) Applied Machine Learning by David Forsyth.
Suppl. literature: We (i.e. you) will make extensive use of online ML resources, collected for this course.
Programming: Primarily Python 3.12 with a few packages on top, though this is an individual choice.
Code repository: All code we provide can be found in the AppliedML2025 GitHub repository.
Communication: All announcements will be made through Absalon. To reach me, email is preferable.
Initial Project: Initial project (à la Kaggle competition) to be submitted by Sunday the 18th of May at 22:00.
Final Project: Final project (Exam) presentations on Wednesday the 11th (all day) and Thursday the 12th of June (morning).
Evaluation: Initial project (40%), and final project (60%), evaluated by lecturers following the Danish 7-step scale.


"People often say that data is the new oil, and it's not. The rare asset is what TO DO with all this data, what's actionable - this is the power of AI."
[Matt Wilson, at Google]


Before course start:
An introduction to the course can be obtained from this ML subject overview and the related film introducing the course subjects (23 min, 1.48 GB).
Specific course information can be found here: ML2025_CourseInformation.pdf
To help us know who you are and to optimise the course accordingly, please fill in the course questionnaire.
To test your "Python & Packages" setup, try running ML_MethodsDemos.ipynb (which is also meant to whet your appetite); a quick import check is sketched below.
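
The check below is only a sketch: the package list is an assumption on our part, so adjust it to whatever your own setup and the notebook actually require.

```python
# Hypothetical quick check that the usual scientific Python stack is in place.
# The package list is an assumption - see ML_MethodsDemos.ipynb for the
# course's actual requirements.
import importlib

for name in ["numpy", "scipy", "matplotlib", "pandas", "sklearn", "torch"]:
    try:
        module = importlib.import_module(name)
        print(f"{name:12s} OK (version {getattr(module, '__version__', 'unknown')})")
    except ImportError:
        print(f"{name:12s} MISSING - install it before the first exercise")
```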



Course outline:
Below is the preliminary course outline, subject to possible changes throughout the course.

Week 1 (Introduction to course and Machine Learning concepts. Tree and Neural Network learning):
Apr 21: 13:15-17:00: No teaching (Easter Monday).
     Bonus (self) study: Recipe for training a neural network and associated video on the same subject.
Apr 23: 8:15-12:00: Introduction to the course itself, ML concepts, Loss functions, Training, Cross Validation, and Tree-based algorithms (TP).
     Exercise: Setting up. Classification on reference data sets with Boosted Decision Tree based methods (see the sketch after this week's outline).
Apr 23: 13:15-17:00: Introduction to NeuralNet-based algorithms (TP).
     Exercise: Classification (and regression) on reference data sets with Neural Net based methods.
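
As a taste of the Apr 23 exercises, here is a minimal sketch of tree-based classification with scikit-learn on synthetic stand-in data (the real exercises use the course's reference data sets, and you may of course prefer e.g. XGBoost or LightGBM). Swapping the model for, say, sklearn.neural_network.MLPClassifier gives the neural-net analogue of the afternoon session.

```python
# Minimal boosted-decision-tree classification sketch; synthetic data stands
# in for the course's reference data sets.
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary-classification problem with a handful of informative features.
X, y = make_classification(n_samples=5000, n_features=10, n_informative=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

model = HistGradientBoostingClassifier(max_iter=200, learning_rate=0.1, random_state=42)
model.fit(X_train, y_train)

# Score with the area under the ROC curve, a standard classification metric.
scores = model.predict_proba(X_test)[:, 1]
print(f"Test AUC: {roc_auc_score(y_test, scores):.3f}")
```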

Week 2 (Initial project kickoff, Hyperparameter optimisation, Feature Importance, Introduction to unsupervised learning and clustering):
Apr 28: 13:15-17:00: Hyperparameters, Overtraining, and Early stopping (TP).
     Exercise: Hyperparameter optimisation of simple tree and NN algorithms (a combined sketch of this week's exercise themes follows the outline below).
Apr 30: 9:15-12:00: Initial project kickoff. Feature Importance calculated using permutations and Shapley values (TP).
     Exercise: Determine feature ranking for reference data sets, and cross check these with actual models.
Apr 30: 13:15-17:00: Introduction to Unsupervised Learning: Clustering and Nearest Neighbor algorithms (TP).
     Exercise: Try to apply the k-NN (and other) algorithms to reference data sets.
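
This week's three exercise themes fit into one small sketch (assumptions: scikit-learn and synthetic data): a cross-validated grid search over the k in k-NN, followed by permutation importance of the best model. Shapley values (e.g. via the shap package) answer the same feature-ranking question from a game-theoretic angle and are covered in the session itself.

```python
# Hyperparameter optimisation of a k-NN classifier via cross-validated grid
# search, then permutation feature importance of the best model.
from sklearn.datasets import make_classification
from sklearn.inspection import permutation_importance
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=2000, n_features=8, n_informative=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Search over the number of neighbours, scored by 5-fold cross validation.
search = GridSearchCV(KNeighborsClassifier(), {"n_neighbors": [3, 5, 11, 21, 41]}, cv=5)
search.fit(X_train, y_train)
print(f"best k = {search.best_params_['n_neighbors']}, CV accuracy = {search.best_score_:.3f}")

# Permutation importance: shuffle one feature at a time on held-out data and
# record the drop in score (model-agnostic, unlike tree-specific importances).
result = permutation_importance(search.best_estimator_, X_test, y_test, n_repeats=20, random_state=0)
for i in result.importances_mean.argsort()[::-1]:
    print(f"feature {i}: {result.importances_mean[i]:+.3f} +/- {result.importances_std[i]:.3f}")
```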

Week 3 (Convolutional Neural Networks (CNNs), Graph Neural Networks (GNNs), Time-Series, and Natural Language Processing (NLP)):
May 5: 13:15-17:00: Convolutional Neural Networks (CNNs) and image analysis (DM).
     Exercise: Recognize images (MNIST dataset and insolubles from Greenland ice cores) with a CNN (a minimal CNN sketch follows this week's outline).
May 7: 9:15-12:00: Graph Neural Networks (GNNs) and geometric learning (DM). IceCube example (TP).
     Exercise: Work on classic GNN example data (TBD).
May 7: 13:15-17:00: Time-series, Transformers, and Natural Language Processing (NLP) (Inar Timiryasov).
     Exercise: Predict future flight traffic and do NLP on IMDB movie reviews.
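
For orientation before the CNN session, here is a minimal PyTorch sketch of a convolutional network for MNIST-shaped (1x28x28) inputs. Random tensors stand in for real images, and the architecture is an illustrative assumption, not the one used in the exercise.

```python
# Tiny convolutional network for 28x28 grayscale images (MNIST-shaped).
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, n_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 28 -> 14
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 14 -> 7
        )
        self.classifier = nn.Linear(32 * 7 * 7, n_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = SmallCNN()
images = torch.randn(8, 1, 28, 28)       # batch of 8 fake images
labels = torch.randint(0, 10, (8,))      # fake class labels
logits = model(images)
loss = nn.CrossEntropyLoss()(logits, labels)
loss.backward()                          # one backward pass, as in a training step
print(f"logits shape: {tuple(logits.shape)}, loss: {loss.item():.3f}")
```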

Week 4 (AutoEncoders and anomaly detection, Final Project kickoff, and Dimensionality reduction):
May 12: 13:15-17:00: (Variational) AutoEncoders and anomaly detection (TP). Preparing groups and subjects for Final Project.
     Exercise: AutoEncoding the MNIST dataset and possibly detecting anomalies in the data sample.
May 14: 9:15-12:00: Final projects kickoff. Discussion of projects and how to work on them (TP).
     Exercise: Getting and plotting the final project data, planning the work, and discussing project goals.
May 14: 13:15-17:00: Dimensionality reduction with introduction to the t-SNE and UMAP algorithms (TP) (a small t-SNE sketch follows this week's outline).
     Exercise: Work on initial project and/or final project.
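
A small dimensionality-reduction sketch for the May 14 afternoon session, using t-SNE from scikit-learn on its built-in digits set (8x8-pixel images, a lightweight stand-in for MNIST). UMAP works analogously via the separate umap-learn package, and an autoencoder's bottleneck plays a similar dimensionality-reducing role.

```python
# Embed the 64-dimensional digits data into 2D with t-SNE and plot it,
# coloured by the true digit class.
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

digits = load_digits()
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(digits.data)

plt.scatter(embedding[:, 0], embedding[:, 1], c=digits.target, s=5, cmap="tab10")
plt.colorbar(label="digit class")
plt.title("t-SNE embedding of the digits data")
plt.show()
```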

Week 5 (GPU acceleration, Data preprocessing, Summary of Curriculum, and Foundation Models):
May 19: 13:15-17:00: GPU accelerated data analysis - Rapids (Mads Ruben Kristensen, Nvidia - formerly NBI).
The initial project should be submitted the day before (18th of May) by 22:00 on Absalon!
May 21: 9:15-12:00: Preprocessing and summary of curriculum so far (TP).
     Exercise: Clean data and run algorithms on reference flawed data sets (see the preprocessing sketch after this week's outline). Work on final project.
May 21: 13:15-17:00: Fast data loaders and Foundation Models (Inar Timiryasov + TP).
     Exercise: Work on final project.
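
Ahead of the May 21 preprocessing session, here is a sketch of a typical cleaning chain (assumptions: scikit-learn, and synthetic data with values knocked out at random): impute missing entries, standardise features, and fit a model inside one Pipeline, so that exactly the same transformations are applied at training and test time.

```python
# Preprocessing sketch: imputation + standardisation + model in one Pipeline.
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_full = rng.normal(size=(200, 4))
y = (X_full[:, 0] + X_full[:, 1] > 0).astype(int)  # labels from the clean data
X = X_full.copy()
X[rng.random(X.shape) < 0.1] = np.nan              # knock out ~10% of the entries

pipe = make_pipeline(SimpleImputer(strategy="median"), StandardScaler(), LogisticRegression())
pipe.fit(X, y)
print(f"training accuracy: {pipe.score(X, y):.2f}")
```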

Week 6 (Generative Adversarial Networks, Reinforcement Learning, CNNs on Beer, and Exam Example presentations):
May 26: 13:15-17:00: Example of CNN at work on beer, AutoEncoders at work on food, and environmental techniques in ML (Carl Johnsen).
     Exercise: Work on final project.
May 28: 9:15-12:00: Generative Adversarial Networks, Diffusion Models (TBC), and Reinforcement Learning (TP).
     Exercise: Work on final project.
May 28: 13:15-17:00: Discussion of differences between real and simulated data and Hybrid/Adversarial training (TP). Example exam presentation (TBC).
     Exercise: Work on final project.

Week 7 (Results and Feedback on initial project, new ML developments, and Ethics in ML):
Jun 2: 13:15-17:00: Results and Feedback on initial project. Discussion of industry cases.
     Exercise: Work on final project.
Jun 4: 9:15-12:00: New developments in Machine Learning (optimal transport, diffusion models, generative production chains) (Malte Algren + DM).
     Exercise: Work on final project.
Jun 4: 13:15-17:00: Ethics in the usage of Machine Learning (TP+DM).
     Special (15:00-): ML discussions and exchanges on board a heritage train from Østerport to Helsingør!

Week 8 (Bonus self study and... Exam):
Jun 9: 13:15-17:00: No teaching (Whit Monday). Potentially, work on final project.
     Bonus (self) study (by video): Infrastructure, Networks, Scaling, and Speed (Brian Vinter).
Jun 11: 8:45-12:00: Presentations of final projects (TP, DM, JN, NP, AA, and potentially others!).
Jun 11: 13:15-17:00: Presentations of final projects (continued).
Jun 12: 8:45-12:00: Presentations of final projects (if needed!) (TP, DM, JN, NP, AA, and potentially others!).




Presentations from previous years


Course comments/praise (very biased selection!):
"Best day of my life!" (Pressumably at the University, red.)
[Christian M. Clausen, on the day of final project presentations, 2019]

"Student 1: Damn..."
"Student 2: I was just thinking what a shame you didn't get to see a whole classroom worth of 'damn' faces! But the feeling is there."

[Reaction in Zoom chat, after having explained the capabilities of Reinforcement Learning exemplified by AlphaZero, 2020]
[Fortunately, I got to see the reaction the year before!]

"Troels is the perfect shepherd guiding relatively inexperienced statisticians to machine learning in an approachable and fun way."
[Anon, course evaluation, 2021]

"This course (and Applied Statistics) were among the most useful and insightful courses I have taken in my academic life."
[Petroula Karakosta, 2022]

"I applaud the delivery with hands-on tutorial sessions, supported by overview lectures. The assessments excellently supported the learning with the initial project helping us get over the initial bump, and the group project showing us how to apply ML to our own interests. 5/5 stars!"
[Alice Patig, Ph.D. student at DTU, 2023]

"I have really enjoyed working on the final project, as it becomes super clear how important data preparation is. I also find that we discuss possibilities of using almost every ML method we've covered to tackle different issues in preparing, handling, and evaluating data."
[Anon, course evaluation, 2024]


Last updated 16th of April 2025 by Troels Petersen.