Applied Statistics - Project 2
The purpose of Project 2 is for you to use your newly acquired
statistical skills on data of your own choosing. Only by applying the
methods yourself do you discover their strengths and weaknesses, and
gain the experience that they require.
This is at the same time a chance for you to affiliate yourself with
some of the groups at NBI, and to take part in the research that goes
on here.
Requirements:
You are free to choose any dataset you like, but your project should
strive to fulfill the following very loose requirements:
There must be 500 data points/measurements.
These must not be from (simple) simulations.
You should apply a hypothesis test to them.
Make a writeup of no more than 4 pages for Sunday the 26th of October.
Prepare a 5-6 minute presentation for Monday the 27th of October.
Ideas and proposals for projects:
Project 2 is meant as an opportunity to throw yourself at data in your
favorite field of research, or one you would like to explore. Talk to
research groups around NBI or elsewhere, as surely they all have data,
or aspects of it, that they never got around to analyzing.
If you don't have any data to analyze, I have some interesting and
illustrative yet relatively simple data sets. Come and ask me.
Finally, the following two links contain many datasets of various
sizes:
Quora large datasets and
Kaggle data competitions.
Suggestions, comments and advice:
While you are free to do your project as you see fit, the following
are a few pieces of advice and suggestions:
Start the article by presenting your motive/aim and the data
you will use to test your hypothesis. State how much data you have
and the format (No, not "It is a text file of 2MB").
Describe what you do to the data in enough detail that others
will be able to redo what you did.
Set up hypothesis tests (or whatever you're looking at) for
each subject stated in the opening, and state the result
quantitatively.
The summary and the abstract should be very short (5-10 half lines)
and short (5-10 abstract lines), respectively, and the abstract should
summarize your results as well.
Most importantly, you should think about what figures you want
to include, and how to make them the best possible. They should
contain as much information as possible (be your main result!) while
not getting cluttered and hard to read. Remember, they will be 75%
of what people see, refer to, include in slides and posters, and
base their understanding on.
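As an illustration of "state the result quantitatively", the following is a minimal Python sketch of one possible hypothesis test: a large-sample two-sample test of whether two groups share the same mean. The function name two_sample_z_test and the numbers in the usage part are hypothetical placeholders, not part of the project material, and the test itself is only one choice among many. The normal approximation it relies on assumes many measurements per group (as the 500-point requirement suggests); the tiny lists below only show how the function is called.

```python
import math
import statistics


def two_sample_z_test(sample_a, sample_b):
    """Large-sample test of H0: both samples share the same mean.

    Returns (difference of means, its uncertainty, z, two-sided p-value).
    Assumes enough points per sample for the normal approximation to hold.
    """
    mean_a = statistics.fmean(sample_a)
    mean_b = statistics.fmean(sample_b)
    # Squared uncertainty on each mean: sample variance divided by N.
    var_mean_a = statistics.variance(sample_a) / len(sample_a)
    var_mean_b = statistics.variance(sample_b) / len(sample_b)
    diff = mean_a - mean_b
    sigma_diff = math.sqrt(var_mean_a + var_mean_b)
    z = diff / sigma_diff
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return diff, sigma_diff, z, p_value


if __name__ == "__main__":
    # Placeholder numbers for illustration only, NOT real measurements.
    group_a = [4.9, 5.1, 5.3, 4.8, 5.0, 5.2]
    group_b = [5.4, 5.6, 5.3, 5.7, 5.5, 5.8]
    diff, sigma_diff, z, p = two_sample_z_test(group_a, group_b)
    # Report the number with its uncertainty, the test statistic, and p.
    print(f"difference = {diff:.3f} +- {sigma_diff:.3f}, z = {z:.2f}, p = {p:.3g}")
```

Reporting the difference together with its uncertainty, the test statistic, and the p-value lets a reader judge the result without rerunning the analysis.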
Writing up results:
The project should be written in Physical Review Letters style (or
something close to it, if you don't like LaTeX), and thus be no more
than 3-4 pages. Below you can find example files:
PRL LaTeX template.
Test figure 1.
Test figure 2 (wide).
Result using current template.
Comments:
Enjoy, have fun, and throw yourself boldly at the data.
Last updated: 15th of October 2014.