Applied Statistics - Week 1

Monday the 20th - Friday the 24th of November 2023

The following describes what we will go through during this week of the course. The chapter references and computer exercises should be read, understood, and solved by the beginning of the following class, where I will briefly go through the exercise solutions.

Monday:
The first day of class will start at 8:15 in Auditorium 2 at HCO. We will begin with a general introduction to the course and go through its different parts, so that you know what to expect.
Then we will lecture on Means, Standard Deviation, Correlations, Significant Digits, and the Central Limit Theorem (CLT). You almost surely know most or all of these already; this is simply to set the scene.
At 10:15 we will move to the exercises, where we'll work on the exercise below on the Central Limit Theorem, which is the reason the Gaussian distribution plays such a central role in statistics.

Reading:
  • Barlow, chapters 1 and 2 (most of which you should know), and 4.1 + 4.2.
    Podcast:
  • Introduction to Mean, Standard Deviation (aka. RMSE), Correlations, and Central Limit Theorem.
    Lecture(s):
  • Mean and Width.
  • Correlations.
  • Significant Digits.
  • Central Limit Theorem.
  • Why Statistics?.
  • Recording of Lecture video (course information, 2022).
  • Recording of Lecture video (statistics, 2022).
    Computer Exercise(s):
  • Central Limit Theorem: CentralLimit_original.ipynb (empty version)
  • Anscombe's Quartet: AnscombesQuartet_original.ipynb (just for illustration!)
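As a warm-up for the CLT exercise, here is a minimal sketch (not part of the course material; all numbers are chosen for illustration) of the effect the exercise explores: sums of many uniform random numbers approach a Gaussian with predictable mean and width.

```python
import numpy as np

# Central Limit Theorem sketch: the sum of N independent uniform
# numbers approaches a Gaussian with mean N*mu and std sqrt(N)*sigma.
rng = np.random.default_rng(42)
N, n_experiments = 100, 10_000

# Each row is one "experiment": the sum of N uniform [0,1) numbers.
sums = rng.uniform(0.0, 1.0, size=(n_experiments, N)).sum(axis=1)

# For uniform [0,1): mu = 0.5 and sigma = 1/sqrt(12).
mu_expected = N * 0.5
sigma_expected = np.sqrt(N / 12.0)

print(f"mean: {sums.mean():.2f}  (expected {mu_expected:.2f})")
print(f"std:  {sums.std(ddof=1):.2f}  (expected {sigma_expected:.2f})")
```

Plotting a histogram of `sums` against the expected Gaussian is a good check; the notebook exercise does this more carefully for several input distributions.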

Tuesday:
The main theme will be error propagation, which most of you should know the basics of already. While error propagation is a craft, there are nevertheless smart ways of doing it numerically.
When done with the exercise, you should also do the analytical error propagation for the two formulae for the gravitational acceleration, g, from the pendulum and ball-on-incline experiments. You will need these in the project and its preparation (estimating the largest source of error).

Reading:
  • Barlow, chapter 4.3.
    Podcast:
  • Introduction to Error Propagation.
    Lecture(s):
  • Error Propagation.
  • Recording of Lecture video (2022).
    Computer Exercise(s):
  • Error Propagation: ErrorPropagation.ipynb (empty version)
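To illustrate the "numerical" route mentioned above, here is a small sketch (my own, not the exercise notebook) that propagates uncorrelated uncertainties through the standard pendulum formula g = 4&pi;&sup2;L/T&sup2; using finite-difference derivatives. The measurement values and uncertainties are made up for illustration.

```python
import numpy as np

# Pendulum formula: g = 4*pi^2 * L / T^2 (standard result).
def g_pendulum(L, T):
    return 4.0 * np.pi**2 * L / T**2

def propagate(f, values, sigmas, eps=1e-6):
    """Propagate uncorrelated uncertainties through f:
    sigma_f^2 = sum_i (df/dx_i)^2 * sigma_i^2,
    with each derivative estimated by a central finite difference."""
    values = np.asarray(values, dtype=float)
    var = 0.0
    for i, sigma in enumerate(sigmas):
        step = np.zeros_like(values)
        step[i] = eps * max(abs(values[i]), 1.0)
        dfdx = (f(*(values + step)) - f(*(values - step))) / (2.0 * step[i])
        var += (dfdx * sigma) ** 2
    return f(*values), np.sqrt(var)

# Illustrative numbers: L = 1.000 +- 0.002 m, T = 2.007 +- 0.004 s.
g, sigma_g = propagate(g_pendulum, [1.000, 2.007], [0.002, 0.004])
print(f"g = {g:.3f} +- {sigma_g:.3f} m/s^2")
```

Comparing the two derivative contributions (here the T term dominates) is exactly the "largest source of error" estimate you will need for the project preparation.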


Friday:
We will focus on the ChiSquare Method, which is the basic method behind fitting data. As it turns out, this method has the great advantage of providing a goodness-of-fit measure, which can be used to test whether the fit actually describes the data.

Reading:
  • Barlow, chapter 6.
    Podcast:
  • Introduction to ChiSquare.
    Lecture(s):
  • ChiSquare Test
  • The ChiSquare "Miracle"
  • P-values
  • Recording of Lecture video I (2022).
  • Recording of Lecture video II (2022).
    Computer Exercise(s):
  • ChiSquare Test: ChiSquareTest.ipynb (empty version)
  • ChiSquare Test - several examples: ChiSquareTest_SeveralExamples.ipynb
  • Weighted Mean - and relation to ChiSquare: WeightedMeanSigmaChi2.ipynb (small exercise for "anytime")
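As a taste of the weighted-mean exercise, here is a minimal sketch (assuming NumPy and SciPy; the measurements are invented) of the weighted mean viewed as a one-parameter ChiSquare fit, with the resulting ChiSquare value used as a goodness-of-fit test:

```python
import numpy as np
from scipy import stats

# The weighted mean is the one-parameter fit minimizing
# chi2 = sum_i ((x_i - mu) / sigma_i)^2.
# Measurements and uncertainties are made up for illustration.
x = np.array([9.76, 9.83, 9.79, 9.88])
sigma = np.array([0.05, 0.04, 0.06, 0.07])

w = 1.0 / sigma**2
mu = np.sum(w * x) / np.sum(w)          # weighted mean
sigma_mu = 1.0 / np.sqrt(np.sum(w))     # its uncertainty

chi2 = np.sum(((x - mu) / sigma) ** 2)  # goodness-of-fit
ndof = len(x) - 1                       # one fitted parameter
p_value = stats.chi2.sf(chi2, ndof)     # prob. of chi2 at least this large

print(f"weighted mean = {mu:.3f} +- {sigma_mu:.3f}")
print(f"chi2/ndof = {chi2:.2f}/{ndof},  p = {p_value:.2f}")
```

A p-value that is neither very small (measurements disagree) nor suspiciously close to 1 (uncertainties overestimated) indicates a consistent combination.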



General notes, links, and comments:
  • Analytical Linear Fit note: StraightLineFit.pdf
  • Analytical Linear Fit implementation: AnalyticalLinearFit.ipynb
  • More Python intro: A good Python exercise is to consider the program below, which calculates and plots the distribution of prime numbers. As you surely know the math behind it, see if you can also follow how it is programmed in Python: CalcAndPlotPrimeNumbers.ipynb.


Last updated: 16th of November 2023.