Abstract
Many emerging applications generate high-volume data streams. These streams must be processed online, under limited memory resources and strict time constraints. Data streams therefore pose challenges that classical machine learning techniques were not designed for: either existing techniques have to be modified, or new algorithms have to be devised that meet these requirements. In this paper, we present a new clustering algorithm for data streams based on Centroid Tracking. The idea behind this algorithm is to model centroid movements and use this model to predict the next movements. The centroid movement model is updated with new stream samples, and we fall back to a standard clustering algorithm only in the rare event of a significant quality loss. We compare our algorithm experimentally with ClusTree, a state-of-the-art stream clustering algorithm, and assess the robustness of both in the presence of noisy data. We conduct experiments on real-world and synthetic datasets. The results show that the proposed approach performs well.
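The abstract gives only the outline of the approach, so the following is a minimal Python sketch of that outline and not the ClusCTA algorithm as published. It assumes a simple position-plus-velocity model per centroid, a smoothed prediction error as the quality measure, and plain k-means as the "standard clustering algorithm". The names `CentroidTracker`, `alpha`, `max_error` and `window` are hypothetical and chosen for illustration only.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two points given as sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=10, seed=0):
    """Plain Lloyd's k-means; stands in for the 'standard clustering algorithm'."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            groups[nearest].append(p)
        for i, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


class CentroidTracker:
    """Illustrative tracker: one position and one velocity per centroid (assumed model)."""

    def __init__(self, centroids, alpha=0.1, max_error=3.0, window=50):
        self.centroids = [list(map(float, c)) for c in centroids]
        self.velocity = [[0.0] * len(c) for c in self.centroids]
        self.alpha = alpha          # hypothetical smoothing factor
        self.max_error = max_error  # hypothetical quality-loss threshold
        self.window = window        # fresh samples gathered before re-clustering
        self.error = 0.0            # smoothed prediction error (quality proxy)
        self.pending = None         # samples collected after a quality loss
        self.fallbacks = 0

    def update(self, x):
        a = self.alpha
        # Predict where each centroid should be now, using its movement model.
        predicted = [[c + v for c, v in zip(cs, vs)]
                     for cs, vs in zip(self.centroids, self.velocity)]
        j = min(range(len(predicted)), key=lambda i: dist(x, predicted[i]))
        residual = [xi - p for xi, p in zip(x, predicted[j])]
        # Correct the winning centroid and its velocity with the new sample.
        self.centroids[j] = [p + a * r for p, r in zip(predicted[j], residual)]
        self.velocity[j] = [v + a * a * r
                            for v, r in zip(self.velocity[j], residual)]
        self.error = (1 - a) * self.error + a * dist(x, predicted[j])
        if self.pending is not None:
            self.pending.append(list(x))
            if len(self.pending) >= self.window:
                self._fallback()
        elif self.error > self.max_error:
            self.pending = [list(x)]  # quality loss: start gathering fresh samples
        return j  # index of the cluster the sample was assigned to

    def _fallback(self):
        """Rare path: re-cluster the fresh samples and reset the movement model."""
        self.centroids = kmeans(self.pending, len(self.centroids))
        self.velocity = [[0.0] * len(c) for c in self.centroids]
        self.error, self.pending = 0.0, None
        self.fallbacks += 1
```

In this sketch, a caller would seed the tracker with initial centroids (for example from one k-means run on the first samples) and then call `update(x)` once per incoming sample. Slow drift is absorbed by the per-centroid velocity, while an abrupt change pushes the smoothed error above `max_error` and triggers a single re-clustering on the next `window` samples.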
| Original language | English |
|---|---|
| Article number | 25 |
| Journal | Espacios |
| Volume | 39 |
| Issue | 14 |
| Status | Published - 2018 |
Bibliographical note
Publisher Copyright: © 2018.
Minciencias product types
- Research articles with Q3 quality
Fingerprint
Dive into the research topics of 'ClusCTA: A Clustering technique based on Centroid Tracking for Data Streams'. Together they form a unique fingerprint.