XGBoost: A scalable tree boosting system
Users: 3 - Average Rating: 4.33
This paper describes XGBoost, a tree boosting system that is a popular and effective solution in many machine learning and data mining competitions. The main reason for its success is its scalability across different scenarios: the authors report that it runs more than ten times faster than existing popular solutions on a single machine and scales to billions of examples in distributed or memory-limited settings, using far fewer resources than existing systems. In particular, the authors introduce a novel sparsity-aware algorithm for handling sparse data and a weighted quantile sketch for approximate tree learning, both of which contribute to the scalability of XGBoost. More importantly, they provide insights on cache access patterns and out-of-core computation (data compression and sharding) for building a scalable tree boosting system.
Several datasets are used to evaluate the scaling behavior of the system and the impact of out-of-core computation and the sparsity-aware algorithm.
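To make the sparsity-aware idea concrete, here is a minimal pure-Python sketch of split finding for a single feature with missing entries. It assumes the regularized gain from the paper, Gain = 1/2 [G_L^2/(H_L + lambda) + G_R^2/(H_R + lambda) - G^2/(H + lambda)] - gamma, where G and H are sums of first- and second-order gradient statistics. The function names (`leaf_score`, `best_split_sparse`) and the use of `None` for missing values are illustrative choices, not the library's API, and the single pass over sorted values is a simplification of the paper's two-scan formulation.

```python
from typing import List, Optional, Tuple


def leaf_score(g: float, h: float, lam: float) -> float:
    """Structure score G^2 / (H + lambda) of one leaf."""
    return g * g / (h + lam)


def best_split_sparse(
    values: List[Optional[float]],
    grad: List[float],
    hess: List[float],
    lam: float = 1.0,
    gamma: float = 0.0,
) -> Tuple[float, Optional[float], Optional[str]]:
    """Return (gain, threshold, default_direction) for one feature.

    None marks a missing entry. Only the non-missing entries are sorted
    and scanned; the missing entries are sent as a group to the left or
    to the right, whichever yields the larger gain.
    """
    g_total, h_total = sum(grad), sum(hess)
    present = sorted(
        (v, g, h) for v, g, h in zip(values, grad, hess) if v is not None
    )
    # Statistics of the missing entries, obtained by subtraction.
    g_miss = g_total - sum(g for _, g, _ in present)
    h_miss = h_total - sum(h for _, _, h in present)
    parent = leaf_score(g_total, h_total, lam)

    best: Tuple[float, Optional[float], Optional[str]] = (0.0, None, None)
    g_left = h_left = 0.0
    for i in range(len(present) - 1):
        v, g, h = present[i]
        g_left += g
        h_left += h
        next_v = present[i + 1][0]
        if next_v == v:
            continue  # no threshold fits between equal values
        threshold = (v + next_v) / 2
        candidates = (
            ("right", g_left, h_left),                   # missing go right
            ("left", g_left + g_miss, h_left + h_miss),  # missing go left
        )
        for direction, gl, hl in candidates:
            gr, hr = g_total - gl, h_total - hl
            gain = 0.5 * (
                leaf_score(gl, hl, lam) + leaf_score(gr, hr, lam) - parent
            ) - gamma
            if gain > best[0]:
                best = (gain, threshold, direction)
    return best
```

For example, calling `best_split_sparse([1.0, 2.0, None, None], [-2.0, 2.0, 2.0, 2.0], [1.0, 1.0, 1.0, 1.0])` picks the threshold 1.5 and sends the missing entries to the right, because their gradients resemble those of the larger observed value. The cost of the scan depends only on the number of non-missing entries, which is the property the paper exploits on sparse data.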
Type:
Scientific Paper
Area:
Data Analytics, Machine Learning
Target Group:
Advanced
DOI:
https://doi.org/10.1145/2939672.2939785
Cite as:
Chen, T. and Guestrin, C., XGBoost: A scalable tree boosting system, Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2016): 785-794.
Author of the review:
Giulia Cademartori
University of Genoa