Machine learning on data streams. A comparison of MLlib and Mahout.

Arlety Leticia Garcia Garcia; Armando Jesús Plasencia Salgueiro

Date

2018

Abstract

More and more organizations need to process large data collections across different IT systems. Because of the size of these collections, they very often require a parallel computing paradigm to process them efficiently. This work analyzes Mahout and MLlib with respect to performance, usability, implementation cost, data processing, fault tolerance, and security, as well as the algorithms each one provides. Which of the two is preferable depends on the characteristics of the data and the context of use. MLlib offers better solutions for processing data streams in real time, while Mahout offers better solutions for feature extraction and dimensionality reduction.