Applied Statistics - Week 4
Monday the 10th - Friday the 14th of December 2018
ERDA shared link to full week material:
B9Q922ZbiI
ERDA shared link to solution examples:
cXOhPVbMT4
The following describes what we will go through during this week of
the course. The chapter references and computer exercises are
expected to be read, understood, and solved by the beginning of the
following class, where I'll briefly go through the exercise
solution.
General notes, links, and comments:
Louis Lyons discussing discovery levels:
1310.1284v1_LouisLyons_Why5sigma.pdf
Comparison between different tests for normality:
Power_Comparisons_of_Shapiro-Wilk_Kolmogorov-Smirn.pdf
Illustration of ROC curves: ROCcurves_GaussianSeparations.pdf
Comment on multiple hypothesis testing p-values:
p-value histogram
Paper by George Marsaglia on testing random numbers:
Random Number Generators
Monday:
The main theme of this week will be
Hypothesis testing, and
we will start with an exercise gently introducing the subject.
In addition to the ChiSquare test, there are several other tests, some
simple (one- and two-sample tests) and some more conceptually challenging
(the Kolmogorov test and the Wald-Wolfowitz runs test).
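As a rough illustration of the Wald-Wolfowitz runs test mentioned above, here is a minimal sketch (not the course's own implementation). It classifies each value as above or below a threshold, counts the runs, and compares the count to its expectation using the Gaussian approximation, which is only reasonable for fairly long sequences. The function name `runs_test` and the example data are made up for this illustration.

```python
import math
import random

def runs_test(sequence, threshold):
    """Wald-Wolfowitz runs test on values above/below a threshold.

    Returns (number of runs, z-score, two-sided p-value), where the
    p-value uses the Gaussian approximation for the number of runs.
    """
    signs = [x > threshold for x in sequence]
    n_above = sum(signs)
    n_below = len(signs) - n_above
    if n_above == 0 or n_below == 0:
        raise ValueError("need values on both sides of the threshold")

    # A new run starts every time two neighbours fall on different sides.
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))

    n = n_above + n_below
    mean = 2.0 * n_above * n_below / n + 1.0
    var = (2.0 * n_above * n_below * (2.0 * n_above * n_below - n)
           / (n * n * (n - 1.0)))
    if var <= 0.0:
        raise ValueError("sequence too short for the Gaussian approximation")

    z = (runs - mean) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided
    return runs, z, p_value

# Illustrative data only: pseudo-random numbers versus a strictly alternating pattern.
random.seed(42)
random_values = [random.random() for _ in range(200)]
alternating = [0, 1] * 100

for name, data in (("random", random_values), ("alternating", alternating)):
    runs, z, p = runs_test(data, threshold=0.5)
    print(f"{name:12s} runs = {runs:4d}   z = {z:6.2f}   p = {p:.3g}")
```

A strictly alternating sequence has far too many runs and a clustered one far too few; both show up as a large |z|, which is what makes the test sensitive to the ordering of the data rather than to the frequencies alone.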
Reading:
Barlow, chapter 8 on hypothesis testing (in particular 8.1-8.3).
Cohen, chapter 4 on hypothesis testing (perhaps omitting 4.2-4.4).
Lecture(s):
Hypothesis Testing
On p-value histograms
Computer Exercise(s):
Hypothesis testing: HypothesisTests.ipynb
Producing a ROC curve: MakeROCfigure.ipynb
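To give an idea of what a ROC curve for two Gaussian distributions looks like numerically, here is a small sketch under simple assumptions: "background" is a unit Gaussian centred at 0, "signal" is a unit Gaussian centred at a chosen separation, and everything above a cut is selected as signal. The function names are hypothetical and this is not the content of MakeROCfigure.ipynb.

```python
import math

def gaussian_tail(x, mean=0.0, sigma=1.0):
    """P(X > x) for a Gaussian with the given mean and width."""
    return 0.5 * math.erfc((x - mean) / (sigma * math.sqrt(2.0)))

def roc_curve(separation, n_points=2001):
    """Return (fpr, tpr) lists for unit Gaussians centred at 0 and `separation`."""
    lo, hi = -8.0, separation + 8.0
    fpr, tpr = [], []
    for i in range(n_points):
        # Scan the cut from high to low, so both rates grow from 0 to 1.
        cut = hi - (hi - lo) * i / (n_points - 1)
        fpr.append(gaussian_tail(cut, mean=0.0))         # background passing the cut
        tpr.append(gaussian_tail(cut, mean=separation))  # signal passing the cut
    return fpr, tpr

def area_under_curve(fpr, tpr):
    """Trapezoidal area under the ROC curve."""
    return sum(0.5 * (tpr[i] + tpr[i - 1]) * (fpr[i] - fpr[i - 1])
               for i in range(1, len(fpr)))

for d in (0.0, 1.0, 2.0, 3.0):
    fpr, tpr = roc_curve(d)
    print(f"separation = {d:.1f} sigma  ->  AUC = {area_under_curve(fpr, tpr):.3f}")
```

With no separation the curve is the diagonal (no discrimination), and the area under the curve approaches 1 as the two distributions move apart. The `fpr` and `tpr` lists can be passed directly to a plotting routine.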
Tuesday:
Today's lecture will be on Confidence Intervals, which in
principle is a simple subject (and we will not go beyond simple here),
but one with complicated details. For once, there will be no
matching exercise (as the subject is fairly general).
I will also return to hypothesis tests, and the exercise of the day will
focus on applying different tests to your own random (?) data.
Reading:
Barlow, chapter 7.2
Lecture(s):
Confidence Intervals And Limits
Computer Exercise(s):
Random Digits Runs Test:
RandomDigitsTest.py,
data_RandomDigits2018_A.txt,
data_RandomDigits2018_B.txt,
data_RandomDigits2018_C.txt,
data_RandomDigits2018_D.txt,
data_RandomDigits2018_E.txt,
data_RandomDigits2018_F.txt, and
data_RandomDigits2018_G.txt
For a large scale test, try one million digits of pi: pi1000000.txt
To see whether you can test individuals' ability to produce
random numbers, consider this data file (from last year):
PersonsDigitsForTest2017.txt
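Besides the runs test, one of the simplest checks one might apply to a sequence of digits is a ChiSquare test of the digit frequencies against a flat expectation (10 bins, hence 9 degrees of freedom). The sketch below is only an illustration and not RandomDigitsTest.py. The helper names are made up, the layout of the data files is not described here (so `digits_from_text` simply keeps every digit character, which is an assumption), and the ChiSquare probability uses a standard closed form that is valid for an odd number of degrees of freedom only.

```python
import math
import random
from collections import Counter

def chi2_sf_odd_dof(chi2, dof):
    """Upper-tail ChiSquare probability, closed form for ODD dof only."""
    if dof < 1 or dof % 2 != 1:
        raise ValueError("this closed form only covers odd dof")
    chi = math.sqrt(chi2)
    series, denominator = 0.0, 1.0
    for r in range(1, (dof - 1) // 2 + 1):
        denominator *= (2 * r - 1)             # 1, 1*3, 1*3*5, ...
        series += chi ** (2 * r - 1) / denominator
    return (math.erfc(chi / math.sqrt(2.0))
            + math.sqrt(2.0 / math.pi) * math.exp(-chi2 / 2.0) * series)

def digits_from_text(text):
    """Assumed file handling: keep every digit character, ignore the rest."""
    return [int(c) for c in text if c.isdigit()]

def digit_frequency_test(digits):
    """ChiSquare of the 10 digit counts against a uniform expectation."""
    counts = Counter(digits)
    expected = len(digits) / 10.0
    chi2 = sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))
    return chi2, chi2_sf_odd_dof(chi2, dof=9)

# Illustrative data only: pseudo-random digits, and digits that avoid 0.
random.seed(1)
uniform_digits = [random.randrange(10) for _ in range(5000)]
no_zero_digits = [random.randrange(1, 10) for _ in range(5000)]

for name, digits in (("uniform", uniform_digits), ("no zeros", no_zero_digits)):
    chi2, p = digit_frequency_test(digits)
    print(f"{name:10s} chi2 = {chi2:8.1f}   p = {p:.3g}")
```

Note that a frequency test says nothing about the order of the digits: a sequence such as 0123456789 repeated many times passes it perfectly, which is exactly why a runs test is a useful complement.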
Friday:
In the lecture, we will mainly focus on discussion of the
TableMeasurement (in Aud. A), which covers both the philosophy of data
handling and analysis, and actually also the construction of fits.
In the exercises, we'll try a simple example of doing integration in
many dimensions using simple simulation. First comes the estimate of
pi, followed by the rational numbers in front of the (hyper)volumes of
balls in many dimensions!
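The idea can be sketched as a "hit-or-miss" simulation: throw uniform random points in the cube [-1, 1]^d and count the fraction that land inside the unit ball. In two dimensions this estimates pi. The sketch below is a minimal illustration, not the content of PiEstimate.ipynb. The function names are made up, and the closed form V_d = pi^(d/2) / Gamma(d/2 + 1) used for comparison is a standard result that is not stated above.

```python
import math
import random

def ball_volume_mc(dim, n_points=100_000, seed=1):
    """Hit-or-miss estimate of the unit-ball volume in `dim` dimensions.

    Returns (volume, statistical error), the error being the binomial
    uncertainty on the hit fraction scaled by the cube volume.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_points):
        r2 = sum(rng.uniform(-1.0, 1.0) ** 2 for _ in range(dim))
        if r2 <= 1.0:
            hits += 1
    fraction = hits / n_points
    cube_volume = 2.0 ** dim
    volume = cube_volume * fraction
    error = cube_volume * math.sqrt(fraction * (1.0 - fraction) / n_points)
    return volume, error

def ball_volume_exact(dim):
    """Closed-form unit-ball volume, for comparison only."""
    return math.pi ** (dim / 2.0) / math.gamma(dim / 2.0 + 1.0)

for dim in range(2, 7):
    volume, error = ball_volume_mc(dim)
    # Dividing by pi^(dim // 2) isolates the rational number in front.
    factor = volume / math.pi ** (dim // 2)
    print(f"d = {dim}:  V = {volume:.3f} +- {error:.3f}"
          f"   exact = {ball_volume_exact(dim):.3f}   V / pi^{dim // 2} = {factor:.3f}")
```

The fraction of points inside the ball drops quickly with the dimension, so the relative uncertainty grows for a fixed number of points. That is worth keeping in mind when trying to identify the rational number from the simulation alone.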
Reading:
No reading - logic and reason suffice (along with math and Python!).
Lecture(s):
Table Measurement Solution/Discussion
Testing random numbers
Computer Exercise(s):
Estimating pi from simulation: PiEstimate.ipynb
NBI Coffee Usage problem: CoffeeUsage.ipynb
Last updated: 14th of December 2018.