Applied Statistics - Week 2
Monday the 26th - Friday the 30th of November 2018
ERDA shared link to full week material:
FxhETpGzhA
The following is a description of what we will go through during this
week of the course. The chapter references and computer exercises are
considered read, understood, and solved by the beginning of the
following class, where I'll briefly go through the exercise
solution.
General notes, links, and comments:
Friday of this week and Monday next week are special, as the
class will be divided into two halves, which will alternate between
doing experiments for the project in First Lab and following the usual
lectures and associated exercise (done by Jason Koskinen).
The exercise on Friday/Monday next week (i.e. 30th of November
and 3rd of December) is also a bit special, as this will be the first
time that the exercise has very little code in it! It is thus up to
you to write/copy code into your analysis to yield the best estimate
of the length of the table in Auditorium A.
Finally, the table measurement exercise is also slightly special
in that we would like you to submit your answers!
Monday:
Even in a complex world, a few PDFs play a central role again and
again. We will go through these "natural" PDFs, in particular the
Binomial, Poisson, and Gaussian distributions and see how they are
related. Other PDFs will also be discussed.
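As a small illustration of how these three distributions are related, here is a minimal sketch using only the Python standard library. The numbers (n, p and the two lambda values) are arbitrary choices made for this example and are not taken from the course material.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of k successes in n trials, each with success probability p."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """Probability of observing k counts when lam counts are expected."""
    # Evaluated in log space to avoid overflow for large k and lam.
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def gaussian_pdf(x, mu, sigma):
    """Gaussian probability density with mean mu and width sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma)**2) / (sigma * math.sqrt(2 * math.pi))

# Binomial -> Poisson: many trials, small p, with lambda = n * p (example values).
n, p = 1000, 0.003
lam = n * p
max_diff_bp = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam))
                  for k in range(21))

# Poisson -> Gaussian: large lambda, with mu = lambda and sigma = sqrt(lambda).
lam_large = 100.0
max_diff_pg = max(abs(poisson_pmf(k, lam_large)
                      - gaussian_pdf(k, lam_large, math.sqrt(lam_large)))
                  for k in range(50, 151))

print(f"Largest |Binomial - Poisson| difference:  {max_diff_bp:.2e}")
print(f"Largest |Poisson - Gaussian| difference: {max_diff_pg:.2e}")
```

Both differences are small compared to the peak probabilities, and they shrink further if n is increased (at fixed n*p) or lambda is increased.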
Reading:
Barlow, chapter 3
Lecture(s):
Binomial, Poisson, and Gaussian
Computer Exercise(s):
Binomial, Poisson and Gaussian:
BinomialPoissonGaussian.ipynb
Tuesday:
The main theme will be the Likelihood function, and the central
role it plays in statistics. It is in principle the most powerful
method for fitting and estimation, and the ChiSquare can be derived
from it. As a little "bonus", there is an illustration of Simpson's
paradox, which concerns correlations!
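To make the link between the likelihood and the ChiSquare concrete, here is a minimal sketch for the simplest possible case: independent measurements of one quantity with Gaussian uncertainties. The values and uncertainties are made up for the example, and the grid scan stands in for a proper minimiser. It is not the code from the exercise.

```python
import math

# Hypothetical measurements of the same quantity, with Gaussian uncertainties.
values = [10.2, 9.7, 10.5, 9.9]
sigmas = [0.3, 0.2, 0.4, 0.2]

def neg2_log_likelihood(mu):
    """-2 ln(L) for independent Gaussian measurements of a common true value mu."""
    total = 0.0
    for y, s in zip(values, sigmas):
        pdf = math.exp(-0.5 * ((y - mu) / s)**2) / (s * math.sqrt(2 * math.pi))
        total += -2.0 * math.log(pdf)
    return total

def chi2(mu):
    """ChiSquare of the same measurements with respect to mu."""
    return sum(((y - mu) / s)**2 for y, s in zip(values, sigmas))

# -2 ln(L) and ChiSquare differ only by a constant that does not depend on mu,
# so both have their minimum at the same place.
offset_a = neg2_log_likelihood(9.5) - chi2(9.5)
offset_b = neg2_log_likelihood(10.5) - chi2(10.5)

# Maximise the likelihood with a simple grid scan over mu (step 0.001).
grid = [9.0 + 0.001 * i for i in range(2001)]
mu_ml = min(grid, key=neg2_log_likelihood)

# The ChiSquare minimum is known analytically here: the weighted mean.
weights = [1.0 / s**2 for s in sigmas]
mu_wm = sum(w * y for w, y in zip(weights, values)) / sum(weights)

print(f"Offset at mu=9.5: {offset_a:.6f}, at mu=10.5: {offset_b:.6f}")
print(f"Likelihood estimate: {mu_ml:.3f}, weighted mean: {mu_wm:.3f}")
```

For PDFs that are not Gaussian the two methods no longer coincide, which is where the likelihood shows its strength.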
Reading:
Barlow, chapter 5.1 to 5.7 (but not 5.5 and the proofs).
Lecture(s):
Maximum likelihood function
Computer Exercise(s):
Likelihood fit illustration: LikelihoodFit.ipynb
Simpson's paradox: Simpsons_Paradox.ipynb
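Simpson's paradox can be seen in a few lines: two groups that each show a positive correlation can show a negative correlation when combined. The sketch below uses invented numbers chosen to make the effect obvious. It is not the data used in the exercise.

```python
import math

def pearson_r(xs, ys):
    """Linear (Pearson) correlation coefficient between xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x)**2 for x in xs)
    var_y = sum((y - mean_y)**2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Invented data: within each group, y increases with x.
x_a, y_a = [1, 2, 3, 4], [10.0, 11.5, 11.8, 13.0]
x_b, y_b = [6, 7, 8, 9], [4.0, 5.2, 5.9, 7.0]

r_a = pearson_r(x_a, y_a)
r_b = pearson_r(x_b, y_b)

# Combined, group B sits at higher x but lower y, which reverses the trend.
r_all = pearson_r(x_a + x_b, y_a + y_b)

print(f"Group A: r = {r_a:+.2f}")
print(f"Group B: r = {r_b:+.2f}")
print(f"Combined: r = {r_all:+.2f}")
```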
Friday:
Experiments for project: (Group A)
We will be working on the experiments for the
Project in First Lab.
This project should be handed in (PDF by mail to me) by 22:00 on Sunday the
16th of December 2018 (please, don't sit up all night!).
I would be happy if you would give the file the logical name
"Project_GroupX_Name1Name2Name3Name4Name5.pdf", where NameX is the
first name of each group member.
Lectures and exercises: (Group B)
Real data almost never follows theoretical PDFs, as the real world
contains dirty wires, unknown biases, and mismeasurements. We will
devote the day to discussion of real data analysis and systematic
errors, and apply this to our "Table Measurements" from Aud. A.
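One (debated) way of dealing with obvious mismeasurements is Chauvenet's Criterion from the reading list: a point is rejected if, given the mean and standard deviation of the sample, fewer than 0.5 points are expected to lie that far from the mean. The sketch below is a single-pass version using the standard library only. The measurements are invented for the example, the function name is hypothetical, and this is not the provided TableMeasurement.py.

```python
import math
import statistics

def chauvenet_split(data):
    """Split data into (kept, rejected) using a single pass of Chauvenet's Criterion."""
    n = len(data)
    mean = statistics.mean(data)
    std = statistics.stdev(data)  # sample standard deviation (N-1 in the denominator)
    kept, rejected = [], []
    for x in data:
        z = abs(x - mean) / std
        # Two-sided Gaussian probability of lying at least z sigma from the mean.
        p_two_sided = math.erfc(z / math.sqrt(2))
        # Reject if fewer than 0.5 such points are expected among n measurements.
        if n * p_two_sided < 0.5:
            rejected.append(x)
        else:
            kept.append(x)
    return kept, rejected

# Invented length measurements in metres; one looks like a misreading by 1 m.
lengths = [2.51, 2.50, 2.49, 2.50, 2.52, 2.50, 2.48, 2.51, 1.50, 2.50]

kept, rejected = chauvenet_split(lengths)
print(f"Kept {len(kept)} measurements, mean = {statistics.mean(kept):.3f} m")
print(f"Rejected: {rejected}")
```

Note that the outlier itself inflates the standard deviation used in the test, so in real data the criterion may have to be applied with care (and the choice documented).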
Reading:
Barlow, chapter 4.4
Chauvenet's Criterion on Wikipedia
Lecture(s):
Systematic Uncertainties (given by Jason):
Systematic Errors
Computer Exercise(s):
TableMeasurements:
TableMeasurement.py
data_TableMeasurements2009.txt
data_TableMeasurements2010.txt
data_TableMeasurements2011.txt
data_TableMeasurements2012.txt
data_TableMeasurements2013.txt
data_TableMeasurements2014.txt
data_TableMeasurements2015.txt
data_TableMeasurements2016.txt
data_TableMeasurements2017.txt
data_TableMeasurements2018.txt
In addition, the 2018 data exists in an expanded format, where two
columns are added: Gender (M/F - sorry, no third gender option, except
blank) and whether the measurement was done with plenty of time (i.e.
in Week0) or at high pace (Monday the 19th).
If you managed to get a (good?) result on the "standard" problem, you
can consider whether the hurried measurements are worse or more faulty
than the slower ones, and/or whether there is any difference between
men and women in the measurements:
data_TableMeasurements2018_WithGenderSpeedInfo.txt
The result of your table measurement analysis should be submitted
HERE before 16:00 on Thursday the 6th of December.
Last updated: 25th of November 2018.