Applied Statistics - Week 4

Monday the 10th - Friday the 14th of December 2018

ERDA shared link to full week material: B9Q922ZbiI

ERDA shared link to solution examples: cXOhPVbMT4

The following describes what we will go through during this week of the course. The chapter references and computer exercises are expected to be read, understood, and solved by the beginning of the following class, where I'll briefly go through the exercise solutions.

General notes, links, and comments:
  • Louis Lyons discussing discovery levels: 1310.1284v1_LouisLyons_Why5sigma.pdf
  • Comparison between different tests for normality: Power_Comparisons_of_Shapiro-Wilk_Kolmogorov-Smirn.pdf
  • Illustration of ROC curves: ROCcurves_GaussianSeparations.pdf
  • Comment on multiple hypothesis testing p-values: p-value histogram
  • Paper of George Marsaglia on testing random numbers: Random Number Generators

    Monday:
    The main theme of this week will be hypothesis testing, and we will start with an exercise gently introducing the subject. In addition to the ChiSquare test, there are several other tests, some simple (one/two sample tests) and some more conceptually challenging (the Kolmogorov test and the Wald-Wolfowitz runs test).

    Reading:
  • Barlow, chapter 8 on hypothesis testing (in particular 8.1-8.3).
  • Cohen, chapter 4 on hypothesis testing (perhaps omitting 4.2-4.4).
    Lecture(s):
  • Hypothesis Testing
  • On p-value histograms
    Computer Exercise(s):
  • Hypothesis testing: HypothesisTests.ipynb
  • Producing a ROC curve: MakeROCfigure.ipynb
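
    As an illustration of the ROC curve idea, here is a minimal sketch under stated assumptions; it is not the content of MakeROCfigure.ipynb. It assumes two unit-width Gaussian samples separated by 2 sigma, and the function names roc_points and roc_area are made up for this example. Each threshold on x gives one point (false positive rate, true positive rate), and the area under the curve can be compared with the value expected for two unit-width Gaussians.

```python
import math
import numpy as np


def roc_points(signal, background, n_thresholds=500):
    """Return (fpr, tpr) arrays for the selection 'x > threshold'."""
    lo = min(signal.min(), background.min())
    hi = max(signal.max(), background.max())
    thresholds = np.linspace(lo, hi, n_thresholds)
    # Fraction of each sample that passes the cut, for every threshold.
    tpr = np.array([(signal > t).mean() for t in thresholds])
    fpr = np.array([(background > t).mean() for t in thresholds])
    return fpr, tpr


def roc_area(fpr, tpr):
    """Area under the ROC curve from the trapezoidal rule."""
    # Rates fall as the threshold rises, so reverse to get increasing fpr.
    x, y = fpr[::-1], tpr[::-1]
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))


# Assumed toy setup: unit-width Gaussians, 2 sigma apart.
separation = 2.0
rng = np.random.default_rng(42)
background = rng.normal(0.0, 1.0, 10_000)
signal = rng.normal(separation, 1.0, 10_000)

fpr, tpr = roc_points(signal, background)
auc = roc_area(fpr, tpr)

# For two unit-width Gaussians, AUC = Phi(separation / sqrt(2)).
expected_auc = 0.5 * math.erfc(-separation / 2.0)
print(f"AUC from samples: {auc:.3f}, expected: {expected_auc:.3f}")
```

    Plotting tpr against fpr (for instance with matplotlib) gives the usual figure; larger separations push the curve towards the upper left corner.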

    Tuesday:
    Today's lecture will be on confidence intervals, which in principle is a simple subject (and we will not go beyond simple here), but one with complicated details. For once, there will be no matching exercise (as the subject is fairly general).
    I will also revisit hypothesis tests, and the exercise of the day will focus on applying different tests to your own random (?) data.

    Reading:
  • Barlow, chapter 7.2
    Lecture(s):
  • Confidence Intervals And Limits
    Computer Exercise(s):
  • Random Digits Runs Test: RandomDigitsTest.py,
    data_RandomDigits2018_A.txt,
    data_RandomDigits2018_B.txt,
    data_RandomDigits2018_C.txt,
    data_RandomDigits2018_D.txt,
    data_RandomDigits2018_E.txt,
    data_RandomDigits2018_F.txt, and
    data_RandomDigits2018_G.txt
    For a large-scale test, try one million digits of pi: pi1000000.txt
    To see whether you can test individuals' ability to produce random numbers, consider this data file (from last year): PersonsDigitsForTest2017.txt
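
    The idea behind a runs test on digits can be sketched as follows. This is an illustrative example, not the content of RandomDigitsTest.py: it assumes the digits are available as a plain string, reduces them to even/odd, and uses the normal approximation of the Wald-Wolfowitz runs test. The helper names digits_to_even_odd and runs_test are made up here, and the layout of the data files is not assumed.

```python
import math
import numpy as np


def digits_to_even_odd(text):
    """Turn a string of digits into booleans (True = even digit)."""
    return np.array([int(c) % 2 == 0 for c in text if c.isdigit()])


def runs_test(binary):
    """Wald-Wolfowitz runs test (normal approximation).

    Returns (number of runs, z-score, two-sided p-value).
    """
    b = np.asarray(binary, dtype=bool)
    n1 = int(b.sum())
    n2 = int(b.size - n1)
    if n1 == 0 or n2 == 0:
        raise ValueError("Both categories must occur at least once.")
    n = n1 + n2
    # A new run starts wherever neighbouring entries differ.
    runs = 1 + int(np.count_nonzero(b[1:] != b[:-1]))
    mean = 2.0 * n1 * n2 / n + 1.0
    var = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n**2 * (n - 1.0))
    z = (runs - mean) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return runs, z, p_value


# Pseudo-random digits as a stand-in for the course data files.
rng = np.random.default_rng(1)
random_digits = "".join(str(d) for d in rng.integers(0, 10, 2000))
runs_rand, z_rand, p_rand = runs_test(digits_to_even_odd(random_digits))

# A 'too regular' sequence: strictly alternating even and odd digits.
alternating = "01" * 500
runs_alt, z_alt, p_alt = runs_test(digits_to_even_odd(alternating))

print(f"pseudo-random: runs={runs_rand}, z={z_rand:.2f}, p={p_rand:.3f}")
print(f"alternating:   runs={runs_alt}, z={z_alt:.2f}, p={p_alt:.2e}")
```

    Human-generated digits often alternate too much, which shows up as a large positive z (too many runs). Other splits, such as low/high digits, can be tested in the same way.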

    Friday:
    In the lecture, we will mainly focus on discussion of the TableMeasurement (in Aud. A), which covers both the philosophy of data handling and analysis, and actually also the construction of fits.
    In the exercises, we'll try a simple example of doing integration in many dimensions using simple simulation. First we estimate pi, and then the rational numbers in front of the (hyper)volumes of balls in many dimensions!

    Reading:
  • No reading - logic and reason suffice (along with math and Python!).
    Lecture(s):
  • Table Measurement Solution/Discussion
  • Testing random numbers
    Computer Exercise(s):
  • Estimating pi from simulation: PiEstimate.ipynb
  • NBI Coffee Usage problem: CoffeeUsage.ipynb
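
    The integration-by-simulation idea in the pi exercise can be sketched like this. It is a minimal hit-or-miss Monte Carlo example under assumed settings (200,000 points, dimensions 2 to 5), not the content of PiEstimate.ipynb, and the function names are made up. Points are drawn uniformly in the cube [-1, 1]^d, and the fraction landing inside the unit ball estimates the ratio of the ball volume to the cube volume. In two dimensions the result is an estimate of pi.

```python
import math
import numpy as np


def unit_ball_volume_mc(dim, n_points, rng):
    """Hit-or-miss estimate of the unit ball volume in 'dim' dimensions.

    Returns (volume estimate, binomial uncertainty on the estimate).
    """
    points = rng.uniform(-1.0, 1.0, size=(n_points, dim))
    inside = np.sum(points**2, axis=1) <= 1.0
    fraction = inside.mean()
    cube_volume = 2.0**dim
    volume = cube_volume * fraction
    error = cube_volume * math.sqrt(fraction * (1.0 - fraction) / n_points)
    return volume, error


def unit_ball_volume_exact(dim):
    """Closed form: pi^(d/2) / Gamma(d/2 + 1)."""
    return math.pi ** (dim / 2.0) / math.gamma(dim / 2.0 + 1.0)


rng = np.random.default_rng(2018)
n_points = 200_000  # assumed sample size for this sketch

results = {}
for dim in (2, 3, 4, 5):
    results[dim] = unit_ball_volume_mc(dim, n_points, rng)
    volume, error = results[dim]
    exact = unit_ball_volume_exact(dim)
    print(f"d={dim}: V = {volume:.4f} +- {error:.4f}  (exact {exact:.4f})")

# In 2D the volume is pi itself; in 4D it is pi^2 / 2, so dividing the
# estimate by pi^2 should give a number close to the rational factor 1/2.
pi_estimate = results[2][0]
factor_4d = results[4][0] / math.pi**2
print(f"pi estimate: {pi_estimate:.4f}, 4D factor: {factor_4d:.4f}")
```

    The fraction of points inside the ball shrinks quickly with dimension, so the relative uncertainty grows for a fixed number of points.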
    Last updated: 14th of December 2018.