Lecturer: D. Jason Koskinen

Email: koskinen (at) nbi.ku.dk

- Block 3 - Timetable A of the 2016 academic calendar
- Tues 08:00 - 12:00 and Thurs 08:00 - 12:00 & 13:00 - 15:00
- Actual
- 08:00 - 08:30 student study time for both days
- 08:30 - 09:00 Q&A or discussion with teacher in Aud. M
- 09:00+ lecture on new material
- Auditorium M at the Blegdamsvej campus
- Odd-numbered classes are 4 hours, while even-numbered classes consist of 2 blocks of 4 hours.
- Classes will be composed of ~20-30% lecture and demonstrations, followed by exercises
- While assignments, projects, and exercises can be done in the programming language of the student's choice, the examples and demonstrations will be mainly in Python and/or scientific packages thereof, e.g. SciPy and PyROOT.
- Required text or textbooks: None

- Oral presentation and 1-page summary (10%)
- ~8-9 minute summary presentation. Plan on ~6 slides if you are doing a PowerPoint-type presentation.
- Can work alone or in groups of up to 3.
- A single 1-page summary that includes the names of all group members.
- Presentation does NOT have to be given by all group members.
- Be sure to put down which article you are using here to avoid duplication
- Example presentation on Finite Monte Carlo article
- Other possible articles
- Frequency Difference Gating: A Multivariate Method for Identifying Subsets That Differ Between Samples (article)
- Probability binning comparison: a metric for quantitating multivariate distribution differences (article)
- Firefly Monte Carlo: Exact MCMC with Subsets of Data (article)
- This is just a small sample. Find something related, interesting, or applicable to your area of research.
- The 1-page summary is due via email on March 9 by 17:00 (5pm) CET.
- Presentations will be selected at random and begin during class time on March 10. At the discretion of the Lecturer and if needed, some presentations will be postponed for a later date.
- If you have any questions or concerns, email Jason

- Graded problem sets (15%)
- Problem set 1 (5%)
- Deadline has now passed
- Grades have been posted on Absalon
- Problem set 2 (10%)
- Will be assigned sometime between March 7 and March 17
- Due: Friday April 8 at 16:00 Copenhagen time via email to Jason
- Problem Set 2 assignment
- Solution(s) to Problem Set 2

- Project (25%)
- Similar to the oral presentation, this project focuses on using a method or statistical treatment, preferably related to your field of research, that you or your group select. Unlike the oral presentation, the project includes not just understanding and explaining the method, but also applying it to an appropriate data set of your own choosing.
- Can be done alone or in groups of up to 3 people
- The only hand-in is a 4-6 page written report. You can submit the code as well if you would like.
- Due: Friday April 8 at 16:00 Copenhagen time via email to Jason

**Final exam** (50%) - Must work on your own!
- Take home exam
- 28 hours between start and submission
- Start at 08:00 on MONDAY April 11
- Submit by 12:00 on Tuesday April 12
- If for some reason you are absolutely positive that there is no way you can do a 28-hour take home exam from April 11 to 12, let Jason know immediately.
- The exam will be very similar to problem set 2 and here is the simplest of outlines of what the exam may look like
- Here are two extra practice problems similar to what will be on the exam for those

- Extra Credit (+2% to final course grade based on a 1-100% scale)
- 2016 NCAA Men's Basketball Bracket submission due by 21:00 on March 17
- This is NOT a requirement or obligation for the course
- Extra Credit Outline

The outline is a rough sketch of the course material, and is 100% likely to change throughout the course. Even so, we will absolutely cover the following topics, which may require additional software support:

- Multivariate analysis (MVA) techniques including Boosted Decision Trees (BDTs)
- The MultiNest Bayesian inference tool
- Basis splines
- Markov Chain Monte Carlo
- Likelihood minimization techniques

- Join the slack-team AdvancedMethodsKU2016
- Sign in with your "******@alumni.ku.dk" e-mail address
- Choose password and username
- Profit

Class 0 - Pre-Course, attendance is not required

- Optional time to make sure your laptop is set up
- Feb. 2, 2016
- 10:00-12:00 in Aud. D
- Lecture 0
- Example Submission for problem sets (PDF)

- Course Information
- Chi-square
- Code chi-square
- Data for exercise 1 (FranksNumbers.txt)
- Review of 'basic' statistics
- Lecture 1
- Jason's python code for exercise 1
- Problem set 1 (Due Feb. 17 at 17:00 CET)
- Example Submission (PDF)

- Note: LIGO is announcing something important with a webcast starting at 16:30 to be shown in Aud. M (since all they can observe is gravitational waves, it's probably gravitational waves). Also, Jason has an unavoidable scientific 'thing' at 15:30.
- Random number generators
- Simple Monte Carlo
- Lecture 2
- Extra exercise about Merged Binning Combinatorics

Class 3 - Method of Least Squares

- Today's lecture is more analytical and mathematical than usual, but should be used as a reference
- Lecture 3
- Some useful links

Class 4 - Likelihoods and Numerical Minimization Fitting

Class 5 - Bayesian Statistics Introduction

- Lecture 5
- Time to get 2 software packages ready for later in the course and very likely Lecture 6
- Markov Chain Monte Carlo
- MultiNest (installable packages are available for at least Python, R, and Matlab)

Class 6 - Markov Chain Monte Carlo

Class 7 - Parameter Estimation

Class 8 - Hypothesis Testing

- Lecture 8
- Data file for one of the exercises
- Journal Articles related to this lecture
- Failure of Wilks' theorem in neutrino physics

Class 9 - Splines

- Lecture 9
- Data (SplineOsc1.txt, SplineCubic.txt, and DustLogger.dat)
- Here is the full 1.27M+ entry dust logger file
- Journal Articles
- Penalized splines for smoothness in higher-dimensions

Class 10 - Oral presentations (in class) & Non-parametric Tests

- Lecture 10
- Many of the presentations can be found here

Class 11 - Multivariate Analysis (MVA) Techniques

- Last group of oral presentations
- Boosted Decision Trees
- Lecture 11
- Exercise 1 python TMVA (see 2017 course webpage)
- Exercise 2 python TMVA (see 2017 course webpage)
- Data
- Exercise 1 (training signal, training background, testing signal, testing background)
- Exercise 2 (16 variable file)
- The first column is the index, so the file has 17 columns, but the index is only for bookkeeping and has no impact on whether an event is signal or background.
- Rows alternate between 'signal' and 'background': counting rows from zero, every even row is signal and every odd row is background. Thus, there are two rows for each index in the first column: the first is the signal and the second is the background. [The format is odd, but I got it from a colleague].
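
The row layout described for the Exercise 2 file can be unpacked with a short script. This is only a minimal sketch: it assumes whitespace-separated numeric columns and counts rows from zero, and the helper name `split_signal_background` is hypothetical. The in-memory example is made up and uses 3 variables instead of 16; for the real file, pass its path and adjust the delimiter if needed.

```python
import io

import numpy as np


def split_signal_background(source):
    """Split alternating rows into (index, signal, background) arrays.

    Assumptions (not confirmed by the course page): whitespace-separated
    numeric columns, column 0 is the bookkeeping index, and rows are
    counted from zero, so rows 0, 2, 4, ... are signal and rows
    1, 3, 5, ... are background.
    """
    data = np.loadtxt(source, ndmin=2)
    if data.shape[0] % 2 != 0:
        raise ValueError("expected an even number of rows (signal/background pairs)")

    signal_rows = data[0::2]
    background_rows = data[1::2]

    # Each index should appear once as signal and once as background.
    if not np.array_equal(signal_rows[:, 0], background_rows[:, 0]):
        raise ValueError("index column does not pair up signal and background rows")

    index = signal_rows[:, 0].astype(int)
    # Drop the index column: it carries no discriminating information.
    return index, signal_rows[:, 1:], background_rows[:, 1:]


# Tiny made-up stand-in for the real file.
example = io.StringIO(
    "0 1.0 2.0 3.0\n"
    "0 0.1 0.2 0.3\n"
    "1 4.0 5.0 6.0\n"
    "1 0.4 0.5 0.6\n"
)
index, signal, background = split_signal_background(example)
print(signal.shape, background.shape)
```

Checking that the index column matches between each signal/background pair is a cheap guard against an off-by-one in the row parity.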

Class 12 - Data Processing and Signal Processing

- To prepare for the class make sure that a wavelet package is available
- Python - "pip install PyWavelets"
- Matlab - http://se.mathworks.com/products/wavelet/
- Lecture 12 (by Dr. James Monk)
- Python example scripts (wavelet_gaussian.py, wavelet_LIGO.py)
- The LIGO data can be found via the link in the lecture notes or:
- We will be covering wavelet and Hough transforms
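
For Python users, a quick way to confirm the wavelet package is available before class is sketched below. It relies on the fact that PyWavelets is imported under the module name `pywt`; the helper name `wavelet_package_available` is hypothetical, and the check only looks the module up without importing it.

```python
import importlib.util


def wavelet_package_available(module_name="pywt"):
    """Return True if the named top-level module can be found."""
    return importlib.util.find_spec(module_name) is not None


if wavelet_package_available():
    print("PyWavelets found; ready for Class 12.")
else:
    print('PyWavelets not found; try "pip install PyWavelets".')
```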

Class 13 - Rare Events

- Lecture 13 (by Dr. Ken Clark)
- Data

Class 14 - Nested Sampling in Bayesian Inference

- Lecture 14
- Note that external packages for conducting nested sampling, e.g. MultiNest, are necessary
- Jason's pymultinest code for the exercises

Class 15 - Background subtraction and sPlots

- Lecture 15 (by Troels Petersen)
- Data file for the exercises
- Scripts - sWeights.py and sWeights_solution.py (check 2017 class)

Class 16 - Review

- Lecture 16
- Pure review of the course

Extra Projects of a more difficult nature, for those who want something more challenging.

- Parameter Goodness-of-fit (PG) in Global physics fits