1. Classification_JonathanGreve_LightGBM.txt:
Algorithm: LightGBM.
Key HP values: num_boosting_round=600, boosting_type='dart', num_leaves=50, min_child_samples=100, learning_rate=0.05, subsample=0.75, subsample_freq=5.
HP optimisation: I did 5-fold CV with 100 fits in total (20 candidate settings * 5 folds); see the sketch for 1. after entry 5.
Parameters in model: I don't know how to calculate it.
Pre-processing: Used sklearn's SelectKBest with mutual_info_classif, together with cross-validation, to select 25 features. I tried fewer features and PCA as well, but with worse results.
Loss function and value on validation set: I used LogLoss and got 0.192051 on the last test run on the validation set.
Own evaluation: It performs well, with an AUC of ~0.957 on the validation set.

2. Classification_JonathanGreve_LightGBMWith25BestShapVariables.txt:
Algorithm: LightGBM.
Key HP values: Same as in 1. above.
HP optimisation: Same as in 1. above.
Parameters in model: Same as in 1. above.
Pre-processing: I used SHAP instead of SelectKBest to find the 25 features; see the sketch for 2. after entry 5.
Loss function and value on validation set: I used LogLoss and got 0.1388144626468073 on the last test run on the validation set.
Own evaluation: This model performed extremely well. Using SHAP to select the variables decreased the error (LogLoss) dramatically. Got an AUC of ~0.981 on the validation set.

3. Classification_JonathanGreve_NN.txt:
Algorithm: TensorFlow's Sequential NN; see the sketch for 3. after entry 5.
Key HP values: Nhidden1=2048, Dropout(0.5), Nhidden2=1024, Dropout(0.5), Nhidden3=512, Dropout(0.5), LearningRate=0.0001.
HP optimisation: I tried many different configurations of the layers, both increasing the number of layers and the number of neurons in each layer. Ultimately there was only a few percentage points' difference in performance between my choices.
Parameters in model: 2676737.
Pre-processing: Scaled the input features using StandardScaler(). First ran the model on all 150 features (after removing 10 constant columns), then used SHAP to extract the best 25 features, which I used for the final model.
Loss function and value on validation set: 0.17188113176232153.
Own evaluation: A very good model, close to LightGBM but still slightly worse. It performed best on all features but was basically equally as good with the 25 features selected using SHAP. SHAP is very good.

4. Regression_JonathanGreve_LightGBM.txt:
Algorithm: LightGBM.
Key HP values: boosting_type='gbdt', subsample=0.75, subsample_freq=5.
HP optimisation: 3-fold CV on num_leaves, min_child_samples and learning_rate; see the sketch for 4. after entry 5.
Parameters in model: I don't know how to calculate it.
Pre-processing: Removed constant columns before training on all features, then used SHAP to get the best 15 variables, which I used in the 3-fold CV. I tried various kinds of preprocessing, such as normalising with StandardScaler, clipping outliers, and PCA, but those steps did not improve the validation error, so I left them out in the end.
Loss function and value on validation set: 5878.6397 MAE.
Own evaluation: Quite a good model, and better than the Neural Network model I also trained.

5. Regression_JonathanGreve_NN.txt:
Algorithm: TensorFlow's Sequential NN; see the sketch for 5. below.
Key HP values: Nhidden1=1024, Nhidden2=512, Nhidden3=256, Nhidden4=128, Nhidden5=64, LearningRate=0.001.
HP optimisation: I manually tried different layouts (i.e. the number of layers and neurons in each layer). Dropout seemed to have a negative effect here, as opposed to in the classification solution above.
Parameters in model: 713729.
Pre-processing: I scaled the data using sklearn's StandardScaler and then clipped all data to within 5 standard deviations to remove the effect of outliers.
Loss function and value on validation set: 6303.8306 MAE.
Own evaluation: Still a good model, but noticeably worse than the model trained using LightGBM.
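Sketch for 1. (hypothetical reconstruction): the fixed HPs come from the description above; the data and the candidate grid are stand-ins, since the actual search ranges are not recorded here. 20 sampled candidates over 5 folds gives the 100 fits mentioned.

import lightgbm as lgb
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=1000, n_features=160, random_state=0)  # stand-in data

# Select the 25 features with the highest mutual information with the target.
X_sel = SelectKBest(mutual_info_classif, k=25).fit_transform(X, y)

base = lgb.LGBMClassifier(
    n_estimators=600,      # the 600 boosting rounds from the description
    boosting_type='dart',
    num_leaves=50,
    min_child_samples=100,
    learning_rate=0.05,
    subsample=0.75,
    subsample_freq=5,
)

# 20 sampled candidates * 5 folds = 100 fits in total.
search = RandomizedSearchCV(
    base,
    param_distributions={               # assumed candidate values
        'num_leaves': [31, 50, 80, 120],
        'min_child_samples': [20, 50, 100, 200],
        'learning_rate': [0.01, 0.05, 0.1],
    },
    n_iter=20,
    cv=5,
    scoring='neg_log_loss',
    random_state=0,
)
search.fit(X_sel, y)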
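Sketch for 2.: a minimal sketch of how the SHAP-based selection of the 25 strongest features could look. shap.TreeExplainer and its shap_values method are the real shap API; the data and the refit step are assumptions.

import lightgbm as lgb
import numpy as np
import shap
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=160, random_state=0)  # stand-in data

# Fit a first model on all features with the HPs from 1.
model = lgb.LGBMClassifier(
    n_estimators=600, boosting_type='dart', num_leaves=50,
    min_child_samples=100, learning_rate=0.05,
    subsample=0.75, subsample_freq=5,
).fit(X, y)

# Rank features by mean absolute SHAP value and keep the 25 strongest.
shap_values = shap.TreeExplainer(model).shap_values(X)
if isinstance(shap_values, list):   # older shap returns one array per class
    shap_values = shap_values[1]
importance = np.abs(shap_values).mean(axis=0)
top25 = np.argsort(importance)[::-1][:25]

X_top = X[:, top25]  # the final model is then refit on these columns only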
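Sketch for 3.: the network as described. The sigmoid output layer, ReLU activations and Adam optimiser are assumptions, but with 25 input features this layout reproduces the reported 2676737 trainable parameters exactly.

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(25,)),            # the 25 SHAP-selected features
    tf.keras.layers.Dense(2048, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1024, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation='sigmoid'),  # assumed binary output
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
    loss='binary_crossentropy',
)
model.summary()  # shows 2,676,737 trainable parameters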
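Sketch for 4.: the 3-fold CV over num_leaves, min_child_samples and learning_rate. The candidate grid and the data are stand-ins; only the tuned parameter names and the fixed HPs come from the description.

import lightgbm as lgb
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=1000, n_features=15, random_state=0)  # stand-in for the 15 SHAP features

search = GridSearchCV(
    lgb.LGBMRegressor(boosting_type='gbdt', subsample=0.75, subsample_freq=5),
    param_grid={                      # assumed candidate values
        'num_leaves': [31, 50, 80],
        'min_child_samples': [20, 50, 100],
        'learning_rate': [0.01, 0.05, 0.1],
    },
    cv=3,
    scoring='neg_mean_absolute_error',
)
search.fit(X, y)
print(search.best_params_)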
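Sketch for 5.: the preprocessing and network layout. ReLU activations and the Adam optimiser are assumptions; with 15 input features this layout reproduces the reported 713729 parameters exactly, which suggests the same 15 SHAP variables as in 4. were used.

import numpy as np
import tensorflow as tf
from sklearn.preprocessing import StandardScaler

X = np.random.randn(1000, 15)  # stand-in data, 15 features
y = np.random.randn(1000)

# After StandardScaler every column has unit variance, so clipping to
# [-5, 5] keeps all values within 5 standard deviations of the mean.
X_scaled = np.clip(StandardScaler().fit_transform(X), -5.0, 5.0)

model = tf.keras.Sequential(
    [tf.keras.Input(shape=(15,))]
    + [tf.keras.layers.Dense(n, activation='relu') for n in (1024, 512, 256, 128, 64)]
    + [tf.keras.layers.Dense(1)]  # linear output for regression
)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss='mae')
model.summary()  # shows 713,729 trainable parameters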
6. Clustering_JonathanGreve_Kmeans.txt:
Algorithm: sklearn k-means clustering.
Key HPs: n_clusters=15.
HP optimisation: Tested various values of n_clusters between 3 and 40.
Parameters in model: 2 (50 bytes).
Pre-processing: No preprocessing other than removing potential constant features and selecting 10 features.
Loss function and value on validation set: I had an accuracy of 0.93. I used my own error function, which assigned each cluster the ground-truth class that occurs most often within it and then computed the accuracy from that mapping (sketched below). This only worked because we have the ground truth for the training set; without a ground truth it would be a lot more difficult to evaluate the model.
Own evaluation: I don't know much about particle physics, but it seems plausible to me that the collisions would create multiple types of particles, which should be possible to cluster. I don't know whether 15 is a large or a small number of clusters in this case, but when I chose more than 20 clusters the number of correct cluster predictions fell; between 3 and 15 clusters seemed to perform best. The model performed alright, though not as well as I had hoped on the validation set, but I'm satisfied for now, until I study this area some more.
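Sketch for 6.: a minimal sketch of the evaluation, assuming the mapping is a majority vote per cluster; the data here is a stand-in.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, y_true = make_blobs(n_samples=1000, centers=5, n_features=10, random_state=0)  # stand-in data

labels = KMeans(n_clusters=15, random_state=0).fit_predict(X)

# Map each cluster to the ground-truth class it most often contains,
# then score the resulting predictions as ordinary accuracy.
pred = np.empty_like(y_true)
for c in np.unique(labels):
    members = labels == c
    classes, counts = np.unique(y_true[members], return_counts=True)
    pred[members] = classes[np.argmax(counts)]

print(f"cluster-majority accuracy: {(pred == y_true).mean():.3f}")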