Abstract:
This paper describes a software implementation of a parallel gradient boosted trees algorithm that requires distributed data storage and is intended mainly for large machine learning tasks. Computational experiments on large datasets show that the proposed implementation has an advantage in performance and scalability over some other open-source implementations. Experimental quality evaluations also show that the proposed implementation is competitive.