The idea was to implement both a neural network (NN) and a boosted tree for classification and regression. Both NNs were implemented with TensorFlow, since I set it up to run on my GPU. For the boosted trees I wanted to try different algorithms, and for classification and regression I wanted to NOT use scikit-learn. All HP optimization was done with cross-validation and Optuna (thanks to Hævi for introducing us to that package); a sketch of that setup is given after item 5 below. To find the best features I wanted to use SHAP values for all supervised algorithms (also sketched after item 5), and for the clustering I wanted to go with the combined best 10 features from the two classification algorithms. Finally, for the clustering I wanted to try at least two different algorithms. For all algorithms the RobustScaler was applied to the data, which helps in particular with the NNs.

1: Classification_KasperNielsen_LightGBM.txt
Algorithm: LightGBM (LGBMClassifier)
Key HP values: learning_rate=0.008729220139305117, num_leaves=174, max_depth=78, min_data=55
HP optimization: the parameters above were optimized with Optuna over 25 trials and 5-fold cross-validation.
Parameters in model: at most 1076*174 = 187224 (1076 is the number of estimators/trees, each with at most 174 leaves)
Loss function and value on validation set: 0.144483 (binary cross-entropy/logloss on 10% of the training data)
Own evaluation: this model is doing pretty well; it gets 94.5% accuracy on the validation set for the final training.

2: Classification_KasperNielsen_Tensorflow.txt
Algorithm: TensorFlow (Sequential model of Dense layers)
Key HP values: n_layers=2, n_units_l0=87, n_units_l1=87, adam_learning_rate=0.006612175544531648
HP optimization: Optuna for 25 trials with 5-fold CV.
Parameters in model: 10006
Loss function and value on validation set: 0.1974 (binary cross-entropy on 10% of the training data)
Own evaluation: this model is also pretty good; it gets 92.8% accuracy on the validation set, slightly worse than LightGBM.

3: Regression_KasperNielsen_XGBoost.txt
Algorithm: XGBoost regressor
Key HP values: max_depth=7, gamma=277.75, eta=0.06, min_child_weight=23.7
HP optimization: Optuna for 25 trials with 5-fold CV.
Parameters in model: at most 2^7*376 = 48128 (376 trees, each with at most 2^7 = 128 leaves)
Loss function and value on validation set: 0.024 [mean((y_pred-y_true)/y_true) on 10% of the training data]
Own evaluation: this model is pretty good; an average relative deviation of 2.5% is quite good.

4: Regression_KasperNielsen_Tensorflow.txt
Algorithm: TensorFlow (Sequential model of Dense layers)
Key HP values: n_layers=1, n_units_l0=13, adam_learning_rate=0.0058963348477237415
HP optimization: Optuna for 25 trials with 5-fold CV.
Parameters in model: 222
Loss function and value on validation set: 0.039 [mean((y_pred-y_true)/y_true) on 10% of the training data]
Own evaluation: this model is also pretty good, but not as good as XGBoost; an average relative deviation of 3.9%.

5: Clustering_KasperNielsen_KMeans.txt
Algorithm: sklearn.cluster.KMeans
Key HP values: n_clusters=7
HP optimization: plot of n_clusters vs inertia and look for the "elbow" (manual inspection for n_clusters in the range [3, 15]; see the sketch after item 7).
Parameters in model: 7 cluster centres, i.e. 7*10 = 70 coordinates given the 10 selected features
Loss function and value on validation set: inertia = 3.5e6
Own evaluation: assuming that cluster 0 is electrons and the others are non-electrons, the accuracy is 86.4%, which is actually quite good for such a simple model.
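To make the HP-optimization setup concrete, here is a minimal sketch of how the Optuna + 5-fold cross-validation search for the LightGBM classifier (item 1) could look, with the RobustScaler applied inside the pipeline. The search ranges, the stand-in dataset, and the fixed n_estimators=1000 are illustrative assumptions, not the values used for the submitted model.

```python
import optuna
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import RobustScaler

# Stand-in data; the real features and labels come from the course dataset.
X, y = make_classification(n_samples=2000, n_features=25, random_state=0)

def objective(trial):
    params = {
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "num_leaves": trial.suggest_int("num_leaves", 20, 300),
        "max_depth": trial.suggest_int("max_depth", 3, 100),
        "min_child_samples": trial.suggest_int("min_child_samples", 10, 200),  # LightGBM alias: min_data
    }
    model = make_pipeline(RobustScaler(), LGBMClassifier(n_estimators=1000, **params))
    # 5-fold CV on negative logloss; Optuna maximizes the returned value.
    return cross_val_score(model, X, y, cv=5, scoring="neg_log_loss").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=25)
print(study.best_params)
```

Keeping the RobustScaler inside the pipeline means it is refit on each training fold, so no information from the validation fold leaks into the scaling.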
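The TensorFlow models (items 2 and 4) are described as Sequential stacks of Dense layers; below is a sketch of how such a model could be rebuilt from the reported hyperparameters. The ReLU activation and the input dimensions are assumptions (they are not stated in the report), but they are consistent with the reported parameter counts: with 25 input features the classifier configuration (two hidden layers of 87 units, sigmoid output) has exactly 10006 parameters, and with 15 input features the regression configuration (one hidden layer of 13 units, linear output) has exactly 222.

```python
import tensorflow as tf

def build_model(n_layers, n_units, learning_rate, n_features, regression=False):
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(n_features,)))
    for _ in range(n_layers):
        model.add(tf.keras.layers.Dense(n_units, activation="relu"))  # activation is an assumption
    if regression:
        model.add(tf.keras.layers.Dense(1))  # linear output for regression
        model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate), loss="mse")
    else:
        model.add(tf.keras.layers.Dense(1, activation="sigmoid"))  # binary classification output
        model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate),
                      loss="binary_crossentropy", metrics=["accuracy"])
    return model

clf = build_model(n_layers=2, n_units=87, learning_rate=0.0066, n_features=25)
clf.summary()  # 10006 trainable parameters with 25 input features
reg = build_model(n_layers=1, n_units=13, learning_rate=0.0059, n_features=15, regression=True)
reg.summary()  # 222 trainable parameters with 15 input features
```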
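For the feature selection, here is a minimal sketch of the SHAP-based ranking mentioned in the introduction, assuming the workflow is: compute the mean |SHAP value| per feature for a trained classifier, take its top 10, and combine the lists from the two classifiers. The stand-in dataset, model, and feature names are placeholders.

```python
import numpy as np
import shap
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification

# Stand-in data and model; in practice this would be the trained classifier from item 1.
X_train, y_train = make_classification(n_samples=2000, n_features=25, random_state=0)
feature_names = [f"f{i}" for i in range(X_train.shape[1])]
lgbm_model = LGBMClassifier().fit(X_train, y_train)

explainer = shap.TreeExplainer(lgbm_model)       # works directly on LightGBM/XGBoost trees
shap_values = explainer.shap_values(X_train)
if isinstance(shap_values, list):                # some shap versions return one array per class
    shap_values = shap_values[1]
importance = np.abs(shap_values).mean(axis=0)    # mean |SHAP| per feature
top10 = [feature_names[i] for i in np.argsort(importance)[::-1][:10]]
print(top10)
```

The same ranking can be repeated for the TensorFlow classifier (e.g. with shap.DeepExplainer or shap.KernelExplainer), and the union of the two top-10 lists then gives the features used for the clustering.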
6: Clustering_KasperNielsen_GaussianMixture.txt
Algorithm: sklearn.mixture.GaussianMixture
Key HP values: n_components=9
HP optimization: plot of n_components vs AIC and BIC (Akaike information criterion, Bayesian information criterion) and look for a minimum (manual inspection; see the sketch below).
Parameters in model: 9 Gaussian components (each with a mean, a covariance matrix and a mixture weight)
Loss function and value on validation set: BIC = 2.4e6
Own evaluation: this clustering algorithm mixes electrons into several different clusters, so one cannot say that a single cluster is mostly electrons. Also, no minimum of either AIC or BIC was found in the scanned range.

7: Clustering_KasperNielsen_GaussianMixture2.txt
Algorithm: sklearn.mixture.GaussianMixture
Key HP values: n_components=3
HP optimization: plot of n_components vs AIC and BIC and look for a minimum (manual inspection).
Parameters in model: 3 Gaussian components
Loss function and value on validation set: BIC = 3.6e6
Own evaluation: (this is not really a separate algorithm, just a different setup of the previous one.) Same algorithm as above, but now with only 3 clusters/classes. It gets 80% accuracy on the training data if we assume cluster 0 is electrons, and 90% if we say clusters 0 and 2 are electrons.
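As a final sketch, this is roughly how the manual model selection for the clustering could be scanned and plotted: inertia vs n_clusters for KMeans (look for the elbow) and AIC/BIC vs n_components for the GaussianMixture (look for a minimum). The blob data is a stand-in for the 10 selected, RobustScaler-transformed features.

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Stand-in for the 10 selected (and scaled) features of the real dataset.
X_clu, _ = make_blobs(n_samples=2000, n_features=10, centers=7, random_state=0)

ks = range(3, 16)
inertias = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_clu).inertia_ for k in ks]
gmms = [GaussianMixture(n_components=k, random_state=0).fit(X_clu) for k in ks]
aics = [g.aic(X_clu) for g in gmms]
bics = [g.bic(X_clu) for g in gmms]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(ks, inertias, "o-")
ax1.set_xlabel("n_clusters")
ax1.set_ylabel("KMeans inertia")
ax2.plot(ks, aics, "s-", label="AIC")
ax2.plot(ks, bics, "o-", label="BIC")
ax2.set_xlabel("n_components")
ax2.set_ylabel("GaussianMixture criterion")
ax2.legend()
plt.tight_layout()
plt.show()
```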