Applied Statistics - Week 2

Monday the 26th - Friday the 30th of November 2018

ERDA shared link to full week material: FxhETpGzhA

The following is a description of what we will go through during this week of the course. The chapter references and computer exercises are expected to be read, understood, and solved by the beginning of the following class, where I'll briefly go through the exercise solution.

General notes, links, and comments:
  • Friday of this week and Monday next week are special, as the class will be divided into two halves, which will alternate between doing experiments for the project in First Lab and following the usual lectures and associated exercise (given by Jason Koskinen).
  • The exercise on Friday/Monday next week (i.e. 30th of November and 3rd of December) is also a bit special, as this will be the first time that the exercise has very little code in it! It is thus up to you to write/copy code into your analysis to obtain the best estimate of the length of the table in Auditorium A.
  • Finally, the table measurement exercise is also slightly special in that we would like you to submit your answers!

    Even in a complex world, a few PDFs play a central role again and again. We will go through these "natural" PDFs, in particular the Binomial, Poisson, and Gaussian distributions and see how they are related. Other PDFs will also be discussed.

  • Barlow, chapter 3
  • Binomial, Poisson, and Gaussian
    Computer Exercise(s):
  • Binomial, Poisson and Gaussian: BinomialPoissonGaussian.ipynb
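    The relation between the three distributions can be checked numerically. The following is a minimal sketch using only the Python standard library, not the content of BinomialPoissonGaussian.ipynb; the function names and the chosen values of N, p, and lambda are assumptions made for illustration. It compares a Binomial with a Poisson of the same mean (large N, small p), and a Poisson with a Gaussian of mean lambda and width sqrt(lambda) (large lambda).

```python
import math

def binomial_pmf(k, n, p):
    """Probability of k successes in n trials with success probability p."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """Poisson probability of k counts, computed in log form to avoid overflow."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def gaussian_pdf(x, mu, sigma):
    """Gaussian probability density at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Binomial -> Poisson: keep lambda = N * p fixed while N grows and p shrinks.
lam_fixed = 3.0  # illustrative choice
binom_vs_poisson = []
for n in (10, 100, 1000):
    p = lam_fixed / n
    diff = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam_fixed)) for k in range(11))
    binom_vs_poisson.append(diff)
    print(f"N = {n:5d}, p = {p:.4f}: max |Binomial - Poisson| = {diff:.5f}")

# Poisson -> Gaussian: compare with a Gaussian of mu = lambda, sigma = sqrt(lambda).
poisson_vs_gauss = []
for lam in (2.0, 20.0, 200.0):
    sigma = math.sqrt(lam)
    k_max = int(lam + 6 * sigma)
    diff = max(abs(poisson_pmf(k, lam) - gaussian_pdf(k, lam, sigma)) for k in range(k_max + 1))
    poisson_vs_gauss.append(diff)
    print(f"lambda = {lam:6.1f}: max |Poisson - Gaussian| = {diff:.5f}")
```

    In both loops the largest difference between the two distributions shrinks as the limit is approached, which is the sense in which the three PDFs are related.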

    The main theme will be the Likelihood function and the central role it plays in statistics. It is in principle the most powerful method for fitting and estimation, and the ChiSquare method can be derived from it. As a little "bonus", there is an illustration of Simpson's paradox, which concerns correlations!

  • Barlow, chapter 5.1 to 5.7 (but not 5.5 and the proofs).
  • Maximum likelihood function
    Computer Exercise(s):
  • Likelihood fit illustration: LikelihoodFit.ipynb
  • Simpson's paradox: Simpsons_Paradox.ipynb
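    To make the idea of a likelihood fit concrete, here is a minimal sketch of an unbinned maximum likelihood estimate. It is not the content of LikelihoodFit.ipynb: the exponential model, the simulated data, the seed, and the simple grid scan are all assumptions chosen so the example runs with the standard library alone. The estimate is the parameter value that minimises the negative log-likelihood, and the 1 sigma interval is read off where it rises by 0.5 above the minimum.

```python
import math
import random

random.seed(42)    # fixed seed so the illustration is reproducible
true_tau = 2.0     # hypothetical true lifetime used to simulate data
data = [random.expovariate(1.0 / true_tau) for _ in range(500)]

def neg_log_likelihood(tau, xs):
    """-ln L for the PDF f(x; tau) = exp(-x / tau) / tau."""
    return sum(math.log(tau) + x / tau for x in xs)

# Scan tau on a fine grid from 1.0 to 3.0 (a minimiser would normally do this job).
taus = [1.0 + 0.001 * i for i in range(2001)]
nll = [neg_log_likelihood(t, data) for t in taus]

i_min = min(range(len(taus)), key=nll.__getitem__)
tau_hat, nll_min = taus[i_min], nll[i_min]

# 1 sigma interval: all tau values where -ln L is within 0.5 of its minimum.
inside = [t for t, v in zip(taus, nll) if v - nll_min <= 0.5]
tau_lo, tau_hi = inside[0], inside[-1]

# For this particular PDF the estimate is known analytically: the sample mean.
tau_analytic = sum(data) / len(data)

print(f"tau_hat (scan)     = {tau_hat:.3f}  (+{tau_hi - tau_hat:.3f} / -{tau_hat - tau_lo:.3f})")
print(f"tau_hat (analytic) = {tau_analytic:.3f}")
```

    For measurements with Gaussian uncertainties, -2 ln L equals the ChiSquare up to a constant that does not depend on the fit parameters, which is why minimising the ChiSquare is a special case of maximising the likelihood.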

    Experiments for project: (Group A)
    We will be working on the experiments for Project in First Lab.
    This project should be handed in (PDF by mail to me) by 22:00 on Sunday the 16th of December 2018 (please, don't sit up all night!).
    I would be happy if you would give the file the logical name "Project_GroupX_Name1Name2Name3Name4Name5.pdf", where Name1 to Name5 are the first names of the group members.

    Lectures and exercises: (Group B)
    Real data almost never follows theoretical PDFs, as the real world contains dirty wires, unknown biases, and mismeasurements. We will devote the day to discussion of real data analysis and systematic errors, and apply this to our "Table Measurements" from Aud. A.

  • Barlow, chapter 4.4
  • Chauvenet's Criterion on Wikipedia
  • Systematic Uncertainties (given by Jason): Systematic Errors
    Computer Exercise(s):
  • TableMeasurements
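    As an illustration of how Chauvenet's Criterion can be applied to a set of measurements, here is a minimal sketch. The length values below are hypothetical and are not the Aud. A data; the function name and the single-pass application are also assumptions (the criterion is sometimes applied iteratively). A point is rejected if the expected number of measurements at least that far from the mean, N times the two-sided Gaussian tail probability, is below 0.5.

```python
import math
import statistics

def chauvenet_mask(values, threshold=0.5):
    """Return a list of booleans: True for points kept by Chauvenet's criterion."""
    n = len(values)
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)  # sample standard deviation
    keep = []
    for v in values:
        z = abs(v - mu) / sigma
        # Two-sided Gaussian probability of a deviation of at least z sigma.
        p_two_sided = math.erfc(z / math.sqrt(2))
        keep.append(n * p_two_sided >= threshold)
    return keep

# Hypothetical table lengths in metres; the last value mimics a gross mismeasurement.
lengths_m = [3.361, 3.364, 3.359, 3.362, 3.360, 3.363, 3.358, 3.365, 3.361, 3.662]

mask = chauvenet_mask(lengths_m)
kept = [v for v, k in zip(lengths_m, mask) if k]
rejected = [v for v, k in zip(lengths_m, mask) if not k]

print("Rejected:", rejected)
print(f"Mean before: {statistics.mean(lengths_m):.4f} m")
print(f"Mean after:  {statistics.mean(kept):.4f} m")
```

    Note that the outlier itself inflates the mean and standard deviation used in the test, so for real data it is worth inspecting the distribution before and after any rejection.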

    In addition, the 2018 data exists in an expanded format, where two columns are added: Gender (M/F - sorry, no third gender option, except blank) and whether the measurement was done with plenty of time (i.e. in Week0) or at high pace (Monday the 19th). If you managed to get a (good?) result on the "standard" problem, you can consider whether the hurried measurements are worse or more faulty than the slower ones, and/or whether there is any difference between men and women in the measurements: data_TableMeasurements2018_WithGenderSpeedInfo.txt
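    One possible way to compare two subsets of measurements is sketched below. It does not read data_TableMeasurements2018_WithGenderSpeedInfo.txt, since the column layout of that file is not described here; instead it uses simulated, hypothetical samples, and the group sizes, widths, and seed are assumptions. The sketch compares the spread of the two groups and expresses the difference of their means in units of its uncertainty.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the illustration is reproducible

# Hypothetical samples in metres: same true length, different resolution.
slow = [random.gauss(3.360, 0.005) for _ in range(40)]
hurried = [random.gauss(3.360, 0.015) for _ in range(40)]

def summarize(xs):
    """Return mean, sample standard deviation, and uncertainty on the mean."""
    mean = statistics.mean(xs)
    std = statistics.stdev(xs)
    return mean, std, std / math.sqrt(len(xs))

mean_s, std_s, err_s = summarize(slow)
mean_h, std_h, err_h = summarize(hurried)

# Difference of the means in units of its combined uncertainty.
z = (mean_h - mean_s) / math.sqrt(err_s**2 + err_h**2)

print(f"Slow:    mean = {mean_s:.4f} +- {err_s:.4f} m, std = {std_s:.4f} m")
print(f"Hurried: mean = {mean_h:.4f} +- {err_h:.4f} m, std = {std_h:.4f} m")
print(f"Ratio of spreads (hurried / slow) = {std_h / std_s:.2f}")
print(f"Difference of means = {z:.2f} sigma")
```

    A larger spread indicates worse resolution, while a significant difference of means would point to a bias in one of the groups. The same comparison can be made between any two subsets, once outliers have been treated consistently in both.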

    The result of your table measurement analysis should be submitted HERE by Thursday the 6th of December before 16:00.
    Last updated: 25th of November 2018.