Applied Statistics - Project 2

Project description:
The purpose of Project 2 is for you to use your newly acquired statistical skills on data of your own choosing. Only by applying the methods yourself do you discover their strengths and weaknesses, and gain the experience they require. It is also a chance for you to affiliate yourself with some of the groups at NBI, and to take part in the research that goes on here.

Requirements:
You are free to choose any dataset you like, but it should meet the following very loose requirements:
  • There must be at least 500 data points/measurements.
  • These must not be from (simple) simulations.
  • You should apply a hypothesis test on them.

    If you don't have any data to analyze, the following two links contain many datasets of various sizes. Look through them, and choose an exciting one:
    Quora large datasets.
    Kaggle data competitions.


    Suggestions, comments and advice:
    While you are free to do your project as you see fit, the following are a few pieces of advice and suggestions:
  • Start the article by presenting your motivation/aim and the data you will use to test your hypothesis. State how much data you have and its format (No, not "It is a text file of 2MB").
  • Describe what you do to the data in enough detail that others will be able to redo what you did.
  • Set up hypothesis tests (or whatever you're looking at) for each subject stated in the opening, and state the result quantitatively.
  • The summary and the abstract should be very short (5-10 half lines) and short (5-10 abstract lines), respectively, and the abstract should also summarize your results.
  • Most importantly, you should think about which figures you want to include, and how to make them the best possible. They should contain as much information as possible (be your main result!) without getting cluttered and hard to read. Remember, they will be 75% of what people see, refer to, include in slides and posters, and understand from.
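    As an illustration of stating a hypothesis test result quantitatively, here is a minimal sketch, assuming you want to compare the means of two large samples. The function name two_sample_z_test and the placeholder numbers are hypothetical and not part of the project material; the generated values only stand in for your real measurements. It uses the normal approximation, which is only reasonable for large samples such as the several hundred points asked for above.

```python
import math
import random


def two_sample_z_test(a, b):
    """Compare the means of two samples.

    Returns (z, p) where z is the difference in means divided by its
    standard error, and p is the two-sided p-value under the normal
    approximation (reasonable only for large samples).
    """
    n_a, n_b = len(a), len(b)
    mean_a = sum(a) / n_a
    mean_b = sum(b) / n_b
    # Unbiased sample variances.
    var_a = sum((x - mean_a) ** 2 for x in a) / (n_a - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (n_b - 1)
    # Standard error of the difference between the two means.
    se = math.sqrt(var_a / n_a + var_b / n_b)
    z = (mean_a - mean_b) / se
    # Two-sided p-value: P(|Z| > |z|) for a unit Gaussian.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Placeholder numbers only (an assumption for illustration);
# replace them with your own measurements.
rng = random.Random(42)
group_a = [rng.gauss(10.0, 2.0) for _ in range(500)]
group_b = [rng.gauss(10.5, 2.0) for _ in range(500)]

z, p_value = two_sample_z_test(group_a, group_b)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.3g}")
```

    Quoting the test statistic together with its p-value, as in the last line, is one way to state a result quantitatively; other tests may suit your data better.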

    Experiments:
  • Experiment 1 (Allan): "Do our students learn anything?"
  • Experiment 2 (Anne, Stine): "Temperatures in North-East Greenland"
  • Experiment 3 (Rasmus, Maria, Magnus): "Measuring the Higgs mass"
  • Experiment 4 (Sissal): "Testing model of mercury in the atmosphere"
  • Experiment 5 (Willi, Malthe, Bjarke): "Timing calibration of ATLAS detector"
  • Experiment 6 (Mathias S, Mathias K): "Correlations in pregnancy data"
  • Experiment 7 (Esben B, Amalie, Marius): "Mobile phone data and habits"
  • Experiment 8 (Mikkel): "AGN distance measurements"
  • Experiment 9 (Julius, Mathias R, Rogvi (RD)): "When does mRNA unfold?"
  • Experiment 10 (Ursula, Monika, Tim): "Testing protein difference in ALS mice"
  • Experiment 11 (Rune, Merlin, Nino): "Comparing time series of climate data"
  • Experiment 12 (Dana): "Optimizing VBF Higgs->WW selection"
  • Experiment 13 (Patryk): "Higher order correlation measures"
  • Experiment 14 (Timmi): "Correlations in exoplanet data"
  • Experiment 15 (Mikkel, Kristoffer, Andreas): "Flight delays"
  • Experiment 16 (Aparajita): "Single molecule microscope noise measurements"
  • Experiment 17 (Valdas): "Estimating peaks in laser spectra"
  • Experiment 18 (Jonathan): "Determining the melting of Antarctica"
  • Experiment 19 (Jochen): "Who survived on Titanic?"
  • Experiment 20 (Thomas, Esben M): "Measuring the Higgs mass"
  • Experiment 21 (Christian Okkels): "Analyzing financial data"


    Writing up results:
    The project should be written in Physical Review Letters style (or something close to it, if you don't like LaTeX), and thus be no more than 3-6 pages. Below you can find the files needed (they work with pdftex as well, except for the figures, which need to be converted into .pdf or .png):
  • PRL LaTeX template.
  • Test figure 1.
  • Test figure 2 (wide).
  • Result using current template.

    Comments:
    Enjoy, have fun, and throw yourself without worries at the data.


    Last updated: 8th of October 2012.