Applied Statistics - Week 5

Monday the 16th - Friday the 20th of December 2024

The following is a description of what we will go through during this week of the course. The chapter references and computer exercises are expected to be read, understood, and solved by the beginning of the following class, where I'll briefly go through the exercise solution.

General notes, links, and comments:
  • Lady tasting tea (Wikipedia).
  • Short note on Lady tasting tea.

    Monday:
    In the lecture, we will start with a discussion of the TableMeasurement (in Aud. A), which covers both the philosophy of data handling and analysis, and also the construction of fits.
    Then we will focus on calibration, which is a subtle subject, yet fairly straightforward once you get the hang of the idea. The associated exercise is inspired by typical data analysis and calibration work.
    Reading:
  • Barlow, chapter 7.2
    Lecture(s):
  • Table Measurement Solution/Discussion
  • Calibration
    Computer Exercise(s):
  • Calibration: Calibration_original.ipynb (empty)
  • Calibration data file: data_calib.txt
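
    To make the idea of calibration concrete before opening the notebook, here is a minimal sketch on synthetic data. It does not use data_calib.txt, and the setup (a known reference value, a single auxiliary variable, and a linear bias) is an assumption made purely for illustration. The idea is to measure how the relative deviation from the reference depends on the auxiliary variable, fit that dependence, and divide it out.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical setup (not the course data): a quantity with a known reference
# value is measured, but the raw measurement drifts linearly with an
# auxiliary variable (think of temperature or position).
n = 2000
true_value = 10.0
aux = rng.uniform(0.0, 5.0, n)
raw = true_value * (1.0 + 0.04 * (aux - 2.5)) + rng.normal(0.0, 0.2, n)

# Relative deviation of the raw measurement from the reference value.
rel_dev = (raw - true_value) / true_value

# Fit a straight line, rel_dev = a * aux + b (polyfit returns the slope first).
a, b = np.polyfit(aux, rel_dev, 1)

# Calibrate by dividing out the fitted dependence.
calibrated = raw / (1.0 + a * aux + b)

# Compare the spread of the relative deviation before and after calibration.
rms_before = np.std(rel_dev)
rms_after = np.std((calibrated - true_value) / true_value)
print(f"slope a = {a:.4f}, offset b = {b:.4f}")
print(f"relative spread before: {rms_before:.4f}, after: {rms_after:.4f}")
```

    In this sketch the spread after calibration should approach the intrinsic measurement noise. With real data the dependence need not be linear, and there may be several variables to correct for.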

    Tuesday:
    We will consider the theme of MultiVariate Analysis (MVA), that is, the analysis of data with more than one (typically many) variables. To begin with, we will consider the relatively simple linear case, which is described by (Fisher's) Linear Discriminant Analysis (LDA), and then move on to more complex sets of data.
    Reading:
  • Barlow, chapter 7
  • An additional possible source is Fisher’s Linear Discriminant: Intuitively Explained
    Lecture(s):
  • Linear MultiVariate Analysis
    Computer Exercise(s):
  • 2par_discriminant.ipynb
  • fisher_discriminant.ipynb and Fisher's Iris data
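
    As a warm-up for the linear case, the sketch below computes Fisher's discriminant for two synthetic, correlated Gaussian samples (not the Iris data; the means and covariance are invented for illustration). The weights are w = (Cov_A + Cov_B)^-1 (mu_A - mu_B), and the separation is measured as |mean_A - mean_B| / sqrt(var_A + var_B), which is one common choice among several.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two hypothetical classes with the same strongly correlated covariance.
cov = np.array([[1.0, 0.8], [0.8, 1.0]])
sample_a = rng.multivariate_normal([0.0, 0.0], cov, 1000)
sample_b = rng.multivariate_normal([1.0, -0.5], cov, 1000)


def fisher_weights(a, b):
    """Fisher weights w = (Cov_A + Cov_B)^-1 (mu_A - mu_B)."""
    mu_a, mu_b = a.mean(axis=0), b.mean(axis=0)
    cov_sum = np.cov(a, rowvar=False) + np.cov(b, rowvar=False)
    return np.linalg.solve(cov_sum, mu_a - mu_b)


def separation(x_a, x_b):
    """Distance between the means in units of the combined spread."""
    return abs(x_a.mean() - x_b.mean()) / np.sqrt(x_a.var(ddof=1) + x_b.var(ddof=1))


w = fisher_weights(sample_a, sample_b)

# Project both samples onto the Fisher direction.
fisher_a, fisher_b = sample_a @ w, sample_b @ w

sep_fisher = separation(fisher_a, fisher_b)
sep_best_single = max(separation(sample_a[:, i], sample_b[:, i]) for i in range(2))
print(f"best single variable: {sep_best_single:.2f}, Fisher: {sep_fisher:.2f}")
```

    Because the Fisher direction maximises this separation measure over all linear combinations, it can never do worse than the best single variable on the same sample. The gain is largest when the variables are correlated, as they are here.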

    Friday:
    I will start by saying a few words about the problem set, just to ensure that all parts are clear. We will then spend the Friday on a more "intuitive" exercise, which illustrates the idea of separating data into categories, and how to measure and optimise the performance of this in real data with all of its quirks and twists (i.e. non-Gaussianity).
    The data comes from an ATLAS testbeam at CERN and deals with separating particles in a beam into electrons and pions, but could in principle be from any other area of research and/or business.

    Reading:
  • No reading - focus on ATLAS testbeam data analysis.
    Lecture(s):
  • Real data analysis - ATLAS testbeam
    Computer Exercise(s):
  • The exercise is on the real ATLAS testbeam data (PDFs unknown!), where the use of three independent detectors is key.
  • Analysis of ATLAS testbeam data: ATLAStestbeam.ipynb along with main data (2 GeV) and alternative data (9 GeV).
  • In case you want to "fight" harder data, here are the 2 GeV and 9 GeV data sets with some of the original problems still in.
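
    To illustrate what "measuring and optimising the performance" of a separation can mean, here is a minimal sketch that scans a cut on a single variable and builds a ROC curve. The two distributions are synthetic stand-ins for "electrons" and "pions" (the real PDFs are unknown, as noted above), and choosing the cut that maximises efficiency minus contamination is just one possible figure of merit.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical detector variable: Gaussian for "electrons" (signal) and a
# decidedly non-Gaussian, long-tailed shape for "pions" (background).
electrons = rng.normal(2.0, 0.5, 5000)
pions = rng.exponential(0.7, 5000)

# Scan a cut: everything above the cut is called an electron.
cuts = np.linspace(-1.0, 8.0, 200)
tpr = np.array([(electrons > c).mean() for c in cuts])  # electron efficiency
fpr = np.array([(pions > c).mean() for c in cuts])      # pion contamination

# Area under the ROC curve by the trapezoidal rule. The arrays are reversed
# so that the false positive rate is increasing.
x, y = fpr[::-1], tpr[::-1]
auc = np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0)

# One possible optimisation: the cut maximising efficiency minus contamination.
best = np.argmax(tpr - fpr)
best_cut = cuts[best]
print(f"AUC = {auc:.3f}")
print(f"best cut = {best_cut:.2f}: efficiency {tpr[best]:.3f}, contamination {fpr[best]:.3f}")
```

    The best cut depends on the purpose: a high-purity electron sample and a high-efficiency one call for different working points along the same ROC curve.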

    Last updated: 13th of December 2024.