
ClusCTA: A Clustering technique based on Centroid Tracking for Data Streams

    Research output: Contribution to scientific journal › Article in an indexed scientific journal › peer-review

    1 Scopus citation

    Abstract

    Many emerging applications generate high-volume data streams. These streams must be processed online, under limited memory and strict time constraints, and thus pose challenges that classical machine learning techniques were not designed for. Existing techniques need to be modified, or new algorithms devised, to meet these requirements. In this paper, we present a new clustering algorithm for data streams based on centroid tracking. The idea is to model centroid movements and use this model to predict their next movements. The centroid movement model is updated with new stream samples, and only in the rare event of a significant quality loss do we fall back to a standard clustering algorithm. We compare our algorithm experimentally with a state-of-the-art stream clustering algorithm, ClusTree, and assess the robustness of both in the presence of noisy data. We conduct experiments on real-world and synthetic datasets. The results show that the proposed approach performs well.
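
The loop described in the abstract (predict centroid positions, update the movement model from new samples, fall back to a standard algorithm on quality loss) can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify the movement model, the quality measure, or the fallback algorithm. The sketch assumes a smoothed constant-velocity model, mean distance to the nearest centroid as the quality measure, and plain k-means as the fallback. The names `CentroidTracker`, `alpha`, and `tolerance` are hypothetical.

```python
import math
import random

def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))

def group_means(batch, centroids):
    """One assignment step: mean of the points nearest to each centroid.
    A centroid that attracts no points keeps its position."""
    groups = [[] for _ in centroids]
    for p in batch:
        groups[nearest(p, centroids)].append(p)
    return [
        [sum(col) / len(g) for col in zip(*g)] if g else list(c)
        for g, c in zip(groups, centroids)
    ]

def quality(batch, centroids):
    """Mean distance from each point to its nearest centroid (lower is better)."""
    return sum(min(math.dist(p, c) for c in centroids) for p in batch) / len(batch)

def kmeans(batch, k, iters=20, seed=0):
    """Plain Lloyd's k-means, an assumed stand-in for the fallback algorithm."""
    centroids = [list(p) for p in random.Random(seed).sample(batch, k)]
    for _ in range(iters):
        centroids = group_means(batch, centroids)
    return centroids

class CentroidTracker:
    """Hypothetical tracker: alpha smooths the velocity estimate, and
    tolerance is the allowed quality degradation before falling back."""

    def __init__(self, k, alpha=0.5, tolerance=2.0):
        self.k, self.alpha, self.tolerance = k, alpha, tolerance
        self.centroids = None
        self.velocity = None
        self.baseline = None
        self.fallbacks = 0

    def _recluster(self, batch):
        # Full clustering: reset positions, velocities, and the quality baseline.
        self.centroids = kmeans(batch, self.k)
        self.velocity = [[0.0] * len(c) for c in self.centroids]
        self.baseline = quality(batch, self.centroids)

    def update(self, batch):
        """Process one batch (a list of points) and return the current centroids."""
        if self.centroids is None:
            self._recluster(batch)
            return self.centroids
        # Predict where each centroid has moved, then refine with the new samples.
        predicted = [[x + v for x, v in zip(c, vel)]
                     for c, vel in zip(self.centroids, self.velocity)]
        observed = group_means(batch, predicted)
        # Significant quality loss: fall back to the standard algorithm.
        if quality(batch, observed) > self.tolerance * self.baseline:
            self.fallbacks += 1
            self._recluster(batch)
            return self.centroids
        # Update the movement model with the displacement just observed.
        self.velocity = [
            [self.alpha * (o - c) + (1 - self.alpha) * v
             for o, c, v in zip(obs, cen, vel)]
            for obs, cen, vel in zip(observed, self.centroids, self.velocity)
        ]
        self.centroids = observed
        return self.centroids
```

Calling `tracker.update(batch)` once per batch of stream samples keeps the per-batch cost to a single assignment pass in the common case. The iterative k-means run happens only on the first batch and when the quality check fails, which `tracker.fallbacks` counts.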

    Original language: English
    Article number: 25
    Journal: Espacios
    Volume: 39
    Issue number: 14
    State: Published - 2018

    Bibliographical note

    Publisher Copyright:
    © 2018.

    Keywords

    • Adaptive learning
    • Classification
    • Concept drift
    • Data Stream Mining

    Minciencias types

    • Research articles with Q3 quality
