Applied Statistics - Week 3

Monday the 30th of November - Friday the 4th of December 2020

The following is a description of what we will go through during this week of the course. The chapter references and computer exercises are expected to be read, understood, and solved by the beginning of the following class, where I'll briefly go through the exercise solution.

General notes, links, and comments:
The project groups (Version 3.0, Tuesday the 26th of November) can be found here: ProjectGroups_v3.0_26nov.pdf.
The overview of where to show up (and later do experiments) can be found here: AS2020_NBIoverview_ExperimentalLocations.pdf.


Monday:
Experiments for project: (Group B)
We will be working on the experiments for the project in First Lab.
This project should be handed in (PDF by mail to me) by 22:00 on Sunday the 13th of December 2020 (please, don't sit up all night!).
I would be very happy if you would give the file the logical name "Project_GroupX_Name1Name2Name3Name4Name5.pdf", where Name1, Name2, etc. are the first names of the group members.

Lectures and exercises: (Group A)
Real data almost never follows theoretical PDFs, as the real world contains dirty wires, unknown biases, and mismeasurements. We will devote the day to discussion of real data analysis and systematic errors, and apply this to our "Table Measurements" from Aud. A.

Reading:
  • Barlow, chapter 4.4
  • Chauvenet's Criterion on Wikipedia
    Podcast:
  • Systematic Uncertainties.
    Lecture(s):
  • Systematic Errors
  • Systematic Errors (Pre-recorded Zoom lecture): Lecture video and Lecture audio.
    Zoom: Link to exercises.
  • Recording of Experiment Lecture video and Experiment Lecture audio.
    Computer Exercise(s):
  • TableMeasurements: TableMeasurement_original.ipynb,
    data_TableMeasurements2009.txt
    data_TableMeasurements2010.txt
    data_TableMeasurements2011.txt
    data_TableMeasurements2012.txt
    data_TableMeasurements2013.txt
    data_TableMeasurements2014.txt
    data_TableMeasurements2015.txt
    data_TableMeasurements2016.txt
    data_TableMeasurements2017.txt
    data_TableMeasurements2018.txt
    data_TableMeasurements2019.txt
    data_TableMeasurements2020.txt


    The result of your table measurement analysis should be submitted here: Table measurement submission form. The purpose is an analysis of your results (without names!), showing to what extent you, given the same data and the same questions, can get different answers.

    In addition, the 2020 data exists in an expanded format, where the day of measurement was included: data_TableMeasurements2020_WithDates.txt
    If you managed to get a (good?) result on the "standard" problem, you can consider whether the variations in the instructions for the measurements affect the measurement results.
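As an illustration of the outlier treatment described in the Chauvenet's Criterion reading, here is a minimal sketch in plain Python. It is not the solution to the TableMeasurements exercise: the function name chauvenet_outliers, the threshold argument, and the ten length values are all hypothetical, and the criterion assumes the bulk of the data is roughly Gaussian, which real measurements often are not.

```python
import math
import statistics

def chauvenet_outliers(data, threshold=0.5):
    """Return indices of points flagged by Chauvenet's criterion.

    A point is flagged if the expected number of equally extreme points
    among N Gaussian measurements, N * P(|z| > z_i), is below `threshold`
    (0.5 in the classic formulation).
    """
    n = len(data)
    mean = statistics.mean(data)
    std = statistics.stdev(data)  # sample standard deviation (N - 1)
    flagged = []
    for i, x in enumerate(data):
        z = abs(x - mean) / std
        # Two-sided Gaussian tail probability: P(|Z| > z) = erfc(z / sqrt(2))
        p_two_sided = math.erfc(z / math.sqrt(2.0))
        if n * p_two_sided < threshold:
            flagged.append(i)
    return flagged

# Hypothetical table lengths in metres; the last one is off by a full metre.
lengths = [3.36, 3.37, 3.35, 3.36, 3.38, 3.36, 3.35, 3.37, 3.36, 4.36]
flagged = chauvenet_outliers(lengths)
print("Flagged indices:", flagged)
```

Note that the mean and standard deviation are themselves pulled by the outlier, so in practice one would typically remove the most extreme flagged point, recompute, and repeat, while keeping track of how many points were discarded and why.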


    Tuesday:
    We will consider Monte Carlo techniques, which are a ubiquitous tool in statistics. The central point is to be able to generate random numbers according to any given distribution, and subsequently use them.
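To make the idea of generating random numbers according to a given distribution concrete, here is a minimal sketch of the two standard approaches for the linear PDF f(x) = 2x on [0, 1]. It is written in plain Python for self-containment and is not taken from the course notebooks; the function names, the seed, and the sample size are assumptions chosen for illustration.

```python
import math
import random

def sample_transformation(n, rng):
    """Transformation method for f(x) = 2x on [0, 1].

    The CDF is F(x) = x^2, so inverting u = F(x) gives x = sqrt(u),
    with u drawn uniformly in [0, 1).
    """
    return [math.sqrt(rng.random()) for _ in range(n)]

def sample_accept_reject(n, rng, f=lambda x: 2.0 * x,
                         xmin=0.0, xmax=1.0, fmax=2.0):
    """Accept-reject (hit-and-miss) method.

    Needs only f itself and a bound fmax >= max of f on [xmin, xmax];
    no integral or inverse of f is required.
    """
    samples = []
    n_tries = 0
    while len(samples) < n:
        x = rng.uniform(xmin, xmax)   # candidate position
        y = rng.uniform(0.0, fmax)    # uniform height in the bounding box
        n_tries += 1
        if y < f(x):                  # keep x only if the point lies under f
            samples.append(x)
    return samples, n_tries

n_samples = 50_000
rng = random.Random(42)  # fixed seed so the run is reproducible

x_trans = sample_transformation(n_samples, rng)
x_ar, n_tries = sample_accept_reject(n_samples, rng)

mean_trans = sum(x_trans) / len(x_trans)
mean_ar = sum(x_ar) / len(x_ar)
efficiency = len(x_ar) / n_tries

# Both means should be close to the true mean, integral of x * 2x = 2/3,
# and the efficiency close to (area under f) / (box area) = 1/2.
print(f"Transformation mean: {mean_trans:.4f}")
print(f"Accept-reject mean:  {mean_ar:.4f}")
print(f"Accept-reject efficiency: {efficiency:.3f}")
```

The transformation method uses every random number but requires that the PDF can be integrated and the integral inverted. Accept-reject works for any bounded function on a finite range, at the cost of discarding a fraction of the candidates.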

    Reading:
    Interestingly, Barlow does not cover this important area, but there are fortunately plenty of other references:
  • Glen Cowan: Chapter 3.
  • Wiki transformation method.
  • Wiki Hit-and-Miss (Von Neumann) method.
  • Particle Data Group (PDG) note on Monte Carlo generators (optional - extends GC chapter 3).
    Lecture(s):
  • Monte Carlo methods.
    Zoom: Link to lecture.
  • Recording of Lecture video, Lecture audio, and Lecture chat.
    Computer Exercise(s):
  • Making Random Numbers according to any distribution:
             For illustration (with linear function): TransformationAcceptReject_simple_original.ipynb
             For testing (with 3rd degree polynomial): TransformationAcceptReject_pol3_original.ipynb
             For general problems (various functions): TransformationAcceptReject_general_original.ipynb


    Friday:
    The main theme will be fitting and fitting strategies when faced with real data, in this case using ChiSquare fits. We are now considering real data and more complicated functions, including discontinuities.
    I'll also briefly comment on calculating a ChiSquare between two histograms, and there is a small exercise on calculating the ChiSquare for a weighted mean.
    Finally, I'll be lecturing on types of data and ways of plotting, and we'll briefly discuss Simpson's Paradox, for which there is also an exercise (for those who want).
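The two ChiSquare calculations mentioned above can be sketched as follows. This is a minimal plain-Python illustration, not the content of WeightedMeanSigmaChi2.ipynb; the function names and all numbers are hypothetical. It assumes independent Gaussian uncertainties for the weighted mean, and raw (unscaled) Poisson counts with identical binning for the histograms.

```python
import math

def weighted_mean(values, sigmas):
    """Weighted mean with weights 1/sigma^2, its uncertainty, and the
    ChiSquare of the measurements with respect to that mean."""
    weights = [1.0 / s**2 for s in sigmas]
    wsum = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / wsum
    sigma_mean = math.sqrt(1.0 / wsum)
    chi2 = sum(((v - mean) / s) ** 2 for v, s in zip(values, sigmas))
    ndof = len(values) - 1  # one parameter (the mean) was estimated
    return mean, sigma_mean, chi2, ndof

def chi2_histograms(counts_a, counts_b):
    """ChiSquare between two histograms of raw counts with the same binning.

    Each bin contributes (a - b)^2 / (a + b), since the Poisson variance
    of the difference is a + b. Bins that are empty in both are skipped.
    """
    chi2 = 0.0
    n_bins_used = 0
    for a, b in zip(counts_a, counts_b):
        if a + b > 0:
            chi2 += (a - b) ** 2 / (a + b)
            n_bins_used += 1
    return chi2, n_bins_used

# Hypothetical measurements of the same quantity with different uncertainties.
values = [9.8, 10.3, 10.1]
sigmas = [0.2, 0.1, 0.2]
mean, sigma_mean, chi2_wm, ndof_wm = weighted_mean(values, sigmas)
print(f"Weighted mean: {mean:.3f} +- {sigma_mean:.3f}")
print(f"Chi2 / Ndof:   {chi2_wm:.2f} / {ndof_wm}")

# Hypothetical bin counts from two samples.
chi2_h, n_bins_used = chi2_histograms([10, 20, 0], [14, 16, 0])
print(f"Histogram Chi2: {chi2_h:.2f} using {n_bins_used} bins")
```

A ChiSquare probability can then be obtained from the ChiSquare value and the number of degrees of freedom, for example with scipy.stats.chi2.sf(chi2, ndof). For the histogram comparison, the number of degrees of freedom is the number of bins used, or one less if the two histograms were normalised to each other.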

    Reading:
  • Barlow, chapter 5.3 to 5.7 (excluding 5.5 and the proofs).
    Lecture(s):
  • ChiSquare between histograms
  • Fitting and significance
  • Types of data and ways of plotting
    Zoom: Link to lecture (and conceptual questions during exercises).
                 Link to exercises.
  • Recording of Lecture video, Lecture audio, and Lecture chat.
    Computer Exercise(s):
  • Weighted Mean - and relation to ChiSquare: WeightedMeanSigmaChi2.ipynb (small exercise in preparation for project)
  • Danish Company Sizes: CompanySizes_original.ipynb
  • NBI Coffee Usage and Xmas vacation problem (extra problem): CoffeeUsage_original.ipynb (empty version)
    Last updated: 1st of December 2020.