A Worrying Analysis of Recent Neural Recommendation Approaches
The talk will focus on two of our recent articles on the reproducibility and evaluation of neural recommendation approaches. In these studies we report the results of a systematic analysis of algorithmic proposals for top-n recommendation tasks. Specifically, we considered 26 algorithms that were presented at top-level research conferences in recent years. Only 12 of them could be reproduced with reasonable effort. Moreover, it turned out that 11 of these 12 could often be outperformed by comparably simple methods, e.g., nearest-neighbor or graph-based techniques, or by simple machine-learned linear ranking methods. The remaining one clearly outperformed the nearest-neighbor and graph-based baselines but did not consistently outperform a well-tuned non-neural linear ranking method. We also found methodological issues to be very frequent in current experimental practice, e.g., incorrect data splitting, use of test data during training, reliance on default hyper-parameters, and comparison against weak baselines. Overall, our work sheds light on a number of potential problems in today’s machine learning scholarship and calls for improved scientific practices in this area.
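To make the methodological points above concrete, here is a minimal sketch (not the exact protocol or code from the papers) of the kind of evaluation hygiene they argue for: the interaction data is split into train and test sets before any model fitting, a simple item-based nearest-neighbor baseline is trained on the training split only, and top-n accuracy is measured on the held-out items. The data, split ratio, and metric here are illustrative assumptions for an implicit-feedback setting.

```python
# Illustrative sketch only: correct train/test separation plus an item-kNN baseline.
import numpy as np
from scipy.sparse import csr_matrix

rng = np.random.default_rng(42)

# Toy implicit-feedback data: (user, item) pairs. In practice this comes from a real dataset.
n_users, n_items = 200, 100
interactions = {(rng.integers(n_users), rng.integers(n_items)) for _ in range(3000)}
interactions = np.array(sorted(interactions))

# --- Correct data splitting: hold out ~20% of each user's interactions BEFORE fitting anything.
train_rows, test_rows = [], []
for u in range(n_users):
    items_u = interactions[interactions[:, 0] == u, 1]
    if len(items_u) < 2:
        train_rows += [(u, i) for i in items_u]
        continue
    rng.shuffle(items_u)
    cut = max(1, int(0.8 * len(items_u)))
    train_rows += [(u, i) for i in items_u[:cut]]
    test_rows += [(u, i) for i in items_u[cut:]]

def to_csr(pairs):
    pairs = np.array(pairs)
    return csr_matrix((np.ones(len(pairs)), (pairs[:, 0], pairs[:, 1])),
                      shape=(n_users, n_items))

train, test = to_csr(train_rows), to_csr(test_rows)

# --- Item-kNN baseline (cosine similarity), computed from the training split ONLY.
# Hyper-parameters (e.g., neighborhood size, shrinkage) should be tuned on a separate
# validation split, never on the test data.
item_norms = np.sqrt(np.asarray(train.power(2).sum(axis=0))).ravel() + 1e-9
sim = np.asarray((train.T @ train).todense()) / np.outer(item_norms, item_norms)
np.fill_diagonal(sim, 0.0)

# Score all items per user and exclude items already seen during training.
scores = np.asarray(train @ sim)
scores[train.toarray() > 0] = -np.inf

# --- Recall@10 on the held-out test interactions.
N = 10
top_n = np.argpartition(-scores, N, axis=1)[:, :N]
hits, total = 0, 0
for u in range(n_users):
    held_out = set(test[u].indices)
    if not held_out:
        continue
    hits += len(held_out & set(top_n[u]))
    total += len(held_out)
print(f"Recall@{N}: {hits / total:.3f}")
```

Even a simple baseline like this, when given a fair split and proper tuning, is the kind of reference point against which the reproduced neural methods were compared.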
See the recorded talk on YouTube!
Reference articles:
Are we really making much progress? A worrying analysis of recent neural recommendation approaches (arXiv)
A Troubling Analysis of Reproducibility and Progress in Recommender Systems Research (arXiv)