1. Identify_ArnauMoranchoTarda_XGBoost.txt:
Algorithm: XGBoost XGBClassifier
Key HP values: max_depth=6, n_estimators=300, learning_rate=0.01
HP optimisation: Not done, since the method was very slow.
Parameters in model:
Loss function and value on validation set: 0.1793 (binary log loss, 25% held out for validation)
Own evaluation: Very slow model. Selected the 25 best variables using Permutation Importance.

2. Identify_ArnauMoranchoTarda_LightGBM.txt:
Algorithm: LightGBM
Key HP values: max_depth=93, num_leaves=55, learning_rate=0.063
HP optimisation: Bayesian optimization over (max_depth, num_leaves, learning_rate) with cross-validation.
Parameters in model:
Loss function and value on validation set: 0.1442 (binary log loss, 25% held out for validation)
Own evaluation: Good model with great validation results (3% better) and much faster than XGBoost. When selecting the 25 best variables, SHAP was much faster than Permutation Importance and gave better results.

3. Regression_ArnauMoranchoTarda_LightGBM.txt:
Algorithm: LightGBM
Key HP values: learning_rate=0.0999, max_depth=11, n_estimators=528, num_leaves=54
HP optimisation: Randomized Search with cross-validation.
Parameters in model:
Pre-processing:
Loss function and value on validation set: 0.0677 (MAE of relative deviation (P-T)/T, 20% held out for validation)
Own evaluation: Great model, with good accuracy and easy to run. Selected the 15 best variables using Feature Importance.

4. Regression_ArnauMoranchoTarda_TensorFlow.txt:
Algorithm: TensorFlow Keras
Key HP values: n_hidden1=165, n_hidden2=142, n_hidden3=365, n_epochs=19
HP optimisation: Bayesian Optimization with cross-validation.
Parameters in model:
Pre-processing: Scaled the input features using QuantileTransformer.
Loss function and value on validation set: 0.7583 (MAE of relative deviation (P-T)/T, 25% held out for validation)
Own evaluation: Good model, but I find it hard to work with all the classes and definitions. Selected the 15 best variables using Permutation Importance.

5. Clustering_ArnauMoranchoTarda_K-MC.txt:
Algorithm: k-means
Key HPs: n_init=30, n_clusters=3
HP optimisation: Tested various values of n_clusters.
Parameters in model:
Pre-processing: Scaled the input features using QuantileTransformer and used PCA to improve the results.
Loss function and value on validation set: In validation the electron cluster was 96% accurate, but the other two clusters were not good (35% and 40% electrons).
Own evaluation: Clusters were found, but the result was not great. Sometimes I obtained two electron clusters, which did not make much sense to me. I selected the 10 best features that I found.

6. Clustering_ArnauMoranchoTarda_DBSCAN.txt:
Algorithm: DBSCAN
Key HPs: eps=0.3, min_samples=250
HP optimisation: Tested various values for the HPs.
Parameters in model:
Pre-processing: Scaled the input features using QuantileTransformer and used PCA to improve the results.
Loss function and value on validation set: Most of the points were not assigned to any of the 3 clusters that were found.
Own evaluation: This algorithm is not good, at least with the features and HPs chosen. I am not happy with it.
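
The sketches below are illustrative only, not the original code behind the entries above. This first one corresponds to entry 1: an XGBClassifier with the stated hyper-parameters, the binary log loss on a 25% hold-out, and a Permutation Importance ranking of the 25 best variables. X (a pandas DataFrame of features) and y (binary labels) are assumed inputs; the random seeds and n_repeats are arbitrary choices.

    import numpy as np
    from sklearn.inspection import permutation_importance
    from sklearn.metrics import log_loss
    from sklearn.model_selection import train_test_split
    from xgboost import XGBClassifier

    # 25% hold-out for validation, as stated in entry 1.
    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=42)

    model = XGBClassifier(max_depth=6, n_estimators=300, learning_rate=0.01)
    model.fit(X_train, y_train)

    # Binary log loss on the validation split.
    print(log_loss(y_val, model.predict_proba(X_val)[:, 1]))

    # Permutation Importance on the validation split; keep the 25 best columns.
    result = permutation_importance(model, X_val, y_val, n_repeats=5, random_state=42)
    best_features = X_val.columns[np.argsort(result.importances_mean)[::-1][:25]]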
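
For entry 2, a minimal sketch of Bayesian hyper-parameter optimization with cross-validation followed by SHAP feature selection. It assumes scikit-optimize's BayesSearchCV (the original library is not named in the entry), the same X_train/X_val/y_train/y_val split as above, and illustrative search ranges.

    import numpy as np
    import shap
    from lightgbm import LGBMClassifier
    from skopt import BayesSearchCV
    from skopt.space import Integer, Real

    # Bayesian search over (max_depth, num_leaves, learning_rate) with 4-fold CV.
    search = BayesSearchCV(
        LGBMClassifier(objective="binary"),
        {
            "max_depth": Integer(3, 100),
            "num_leaves": Integer(10, 100),
            "learning_rate": Real(0.01, 0.3, prior="log-uniform"),
        },
        n_iter=30, cv=4, scoring="neg_log_loss", random_state=42,
    )
    search.fit(X_train, y_train)
    model = search.best_estimator_

    # Mean |SHAP value| per feature on the validation split; keep the 25 best.
    shap_values = shap.TreeExplainer(model).shap_values(X_val)
    if isinstance(shap_values, list):        # some versions return one array per class
        shap_values = shap_values[1]
    best_features = X_val.columns[np.argsort(np.abs(shap_values).mean(axis=0))[::-1][:25]]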
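
For entry 3, a sketch of an LGBMRegressor tuned with Randomized Search and cross-validation, scored with the MAE of the relative deviation (P-T)/T on a 20% hold-out. X, y, the search ranges, and the number of iterations are assumptions.

    import numpy as np
    from lightgbm import LGBMRegressor
    from scipy.stats import randint, uniform
    from sklearn.model_selection import RandomizedSearchCV, train_test_split

    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.20, random_state=42)

    param_dist = {
        "learning_rate": uniform(0.01, 0.3),
        "max_depth": randint(3, 15),
        "n_estimators": randint(100, 1000),
        "num_leaves": randint(10, 100),
    }
    search = RandomizedSearchCV(LGBMRegressor(), param_dist, n_iter=50, cv=4,
                                scoring="neg_mean_absolute_error", random_state=42)
    search.fit(X_train, y_train)

    # MAE of the relative deviation (P - T) / T on the hold-out set.
    pred = search.best_estimator_.predict(X_val)
    print(np.mean(np.abs((pred - y_val) / y_val)))

    # Built-in feature importance of the fitted booster; keep the 15 best columns.
    imp = search.best_estimator_.feature_importances_
    best_features = X_train.columns[np.argsort(imp)[::-1][:15]]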
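
For entry 4, a sketch of a Keras network with the three listed hidden-layer sizes, trained for 19 epochs on QuantileTransformer-scaled inputs. The activations, optimizer, and loss are assumptions, and the train/validation arrays are reused from the regression split above.

    import tensorflow as tf
    from sklearn.preprocessing import QuantileTransformer

    # Scale the input features with QuantileTransformer, as stated in entry 4.
    scaler = QuantileTransformer(output_distribution="normal")
    X_train_s = scaler.fit_transform(X_train)
    X_val_s = scaler.transform(X_val)

    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(X_train_s.shape[1],)),
        tf.keras.layers.Dense(165, activation="relu"),   # n_hidden1
        tf.keras.layers.Dense(142, activation="relu"),   # n_hidden2
        tf.keras.layers.Dense(365, activation="relu"),   # n_hidden3
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mae")
    model.fit(X_train_s, y_train, epochs=19, validation_data=(X_val_s, y_val))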
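
For entry 5, a sketch of k-means with the stated HPs on QuantileTransformer-scaled, PCA-reduced inputs. The number of PCA components is an assumption; X is the feature DataFrame restricted to the 10 selected features.

    from sklearn.cluster import KMeans
    from sklearn.decomposition import PCA
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import QuantileTransformer

    # Scale, reduce with PCA, then cluster into 3 groups with 30 restarts.
    pipeline = make_pipeline(
        QuantileTransformer(output_distribution="normal"),
        PCA(n_components=5),
        KMeans(n_clusters=3, n_init=30, random_state=42),
    )
    labels = pipeline.fit_predict(X)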
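
For entry 6, the same pre-processing with DBSCAN and the stated HPs. Points that DBSCAN does not assign to any cluster are labelled -1, which is how "most of the points are not in any cluster" shows up; the PCA component count is again an assumption.

    import numpy as np
    from sklearn.cluster import DBSCAN
    from sklearn.decomposition import PCA
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import QuantileTransformer

    pipeline = make_pipeline(
        QuantileTransformer(output_distribution="normal"),
        PCA(n_components=5),
        DBSCAN(eps=0.3, min_samples=250),
    )
    labels = pipeline.fit_predict(X)
    print(np.unique(labels, return_counts=True))   # label -1 counts the unclustered points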