Applied Statistics - From data to results (Fall 2012)

"Coincidences, in general, are great stumbling blocks in the way of that class of thinkers who have been educated to know nothing of the theory of probabilities [and statistics] - that theory to which the most glorious objects of human research are indebted for the most glorious of illustration."
[Edgar Allan Poe, "The Murders in the Rue Morgue", 1841]
The final take-home exam has been posted! (Hand in by Friday 12:00)

General information:
Lecturer: Troels C. Petersen (NBI High Energy Physics (HEP)) (petersen@nbi.dk).
Additional teacher: Sascha Mehlhase (NBI High Energy Physics (HEP)) (mehlhase@nbi.dk).
When: Monday 9-12, Tuesday 13-17, and Friday 9-12 (Week Schedule Group B).
Where: Auditorium M (Building M at NBI).
Period: Block 1 (3rd of September - 2nd of November 2012), 9 weeks.
Evaluation: Problem set (15%), Projects (15% each), Take-home exam (55%).
Exam: Take-home (24 hour) exam given Thursday the 1st of November 2012 at 8:15.
Grading: Internal examiner evaluation (following the Danish 7-point grading scale).
Credits: 7.5 ECTS (i.e. 1/8 of an academic year's work).
Level: Intended for students in their 3rd to 5th year of study and new Ph.D. students.
Prerequisites: Simple mathematics and some programming (any language, but see below).
Note: Programming is an essential tool and is therefore necessary for the course.
Programs used: Simple C++ and the CERN software ROOT.
Textbook: Roger Barlow: Statistics: A Guide to the Use of Statistical Methods in the Physical Sciences.
Additional literature: Philip R. Bevington: Data Reduction and Error Analysis for the Physical Sciences.
Glen Cowan: Statistical Data Analysis.
Pensum/Curriculum: The course curriculum can be found here.
Outline: Graduate statistics course giving an advanced introduction to data analysis.
Course format: Shorter lectures followed by computer exercises and discussion.
Key words: PDF, Uncertainties, Correlation, Chi-Square, Likelihood, Fitting, Monte Carlo.
Language: Danish (English if requested). All exercises, problem sets, exams, notes, etc. are in English.

Further information can be found here: Applied Statistics course information
A "course introduction" questionnaire can be found at: http://goo.gl/nZdbo

Problem sets, projects, and exam set:
During the course there will be a problem set to be solved, two projects to be carried out, and a final take-home exam to be handed in, all of which can (in time) be found below:
  • Project 1 (Mon. 10th - Fri. 21st September).
  • Problem Set (Fri. 21st September - Tues. 2nd October).
               Solution Suggestions/Discussion
  • Project 2 (Tues. 9th - Fri. 26th October).
  • Final take-home exam (Thur. 1st - Fri. 2nd November).



    Course outline:
    Week 0: (Pre-course session)
    31 (13:15-15:00): Setting up computers, introduction to C++/ROOT (Aud. M).

    Week 1 (Introduction, general concepts)
    3: Intro to course, photos, questionnaire and table measurements (Aud. A). Central limit theorem. Mean, RMS and estimators. Correlation.
    4: Distributions and Error propagation (which is a science!).
    7: Chi-Square and introduction to project 1.

    Week 2 (Chi-Square, Systematic Errors)
    10: Start project 1 (for Friday the 21st of September) [Sascha teaching].
    11: Work on project 1 and ROOT tutorial [Sascha teaching].
    14: Random numbers and their use in MC. Systematic errors.

    Week 3 (Likelihood, Fitting, Using Simulation):
    17: Likelihood and fitting. Fitting data (which is an art!).
    18: More fitting and ROOT tutorial [Sascha teaching].
    21: Producing random numbers (handing in project 1). Handing out problem set (for Tuesday the 2nd of October).

    Week 4 (Hypothesis Testing):
    24: Hypothesis testing. Kolmogorov-Smirnov and Wald-Wolfowitz tests.
    25: Catching up, including extended examples.
    28: Evaluation of project 1 results. Midway repetition.

    Week 5 (Bayes' Theorem and Confidence Intervals):
    1: Bayes' theorem. Separating/classifying events.
    2: Limits and confidence intervals (handing in the problem set).
    5: Summary of curriculum.

    Week 6 (Classifying events and calibration):
    8: Multivariate Analysis (MVA). Fisher discriminant and ROOT's TMVA.
    9: Start project 2 (for Friday the 26th of October) [Sascha teaching].
    12: Designing experiments. Calibration and use of control channels. Blind analysis. Work on 2nd project.

    Week 7 (Autumn break / project 2):
    15:
    16:
    19:

    Week 8 (Project 2 presentations and Fitting):
    22: Project 2
    23: Project 2
    26: Presentations of 2nd projects.

    Week 9 (Calibration, summary and exam):
    29: Calibration
    30: Summary/repetition of course curriculum.
    1: Exam given (posted on the course webpage at 8:00 in the morning).
    2: 12:00 Exam to be handed in.



    Notes:
    In addition to the textbook and other literature, some notes will be used during the course:
  • PDG notes on Probability.
  • PDG notes on Statistics.
  • PDG notes on Monte Carlo Techniques.
  • The Data Analysis Brief Book.
  • Statistics resources.
  • Note on Frequentist vs. Bayesian statistics and discoveries.

    Links:
  • Blog on how to use crime rates for predicting taxi demand.