An Automatic Merge Technique to Improve the Clustering Quality Performed by LAMDA

Morales, Luis; Aguilar, Jose

dc.citation.journalTitle	IEEE Access	eng
dc.contributor.author	Morales, Luis
dc.contributor.author	Aguilar, Jose
dc.contributor.department	Universidad EAFIT. Departamento de Ingeniería de Sistemas	spa
dc.contributor.researchgroup	I+D+I en Tecnologías de la Información y las Comunicaciones	spa
dc.creator	Morales, Luis
dc.creator	Aguilar, Jose
dc.date.accessioned	2021-04-12T20:55:50Z
dc.date.available	2021-04-12T20:55:50Z
dc.date.issued	2020-01-01
dc.description.abstract	Clustering is a research challenge focused on discovering knowledge from data samples, with the goal of building good-quality partitions. This paper proposes an approach based on LAMDA (Learning Algorithm for Multivariable Data Analysis), whose most important features are: a) it is a non-iterative fuzzy algorithm that can work with online data streams, b) it does not require the number of clusters to be specified, and c) it can generate new partitions from objects that are not sufficiently similar to the preexisting clusters (incremental learning). However, in some applications the number of partitions created does not correspond to the number of desired clusters and can be excessive or impractical for the expert. Therefore, our contribution is the formalization of an automatic merge technique that updates the cluster partition produced by LAMDA to improve the quality of the clusters, together with a new methodology for computing the Marginal Adequacy Degree that enhances the individual-cluster assignment. The proposal, called LAMDA-RD, is applied to several benchmarks, and its results are compared against the original LAMDA and other clustering algorithms to evaluate performance on different metrics. Finally, LAMDA-RD is validated in a real case study on the identification of production states in a gas-lift well, using a data stream. The results show that LAMDA-RD achieves competitive performance with respect to other well-known algorithms, especially on unbalanced benchmarks and benchmarks with an overlap of around 9%. In these cases, our algorithm is the best, reaching a Rand Index (RI) >98%. Moreover, it is consistently among the best for all metrics considered (Silhouette coefficient, modification of the Silhouette coefficient, WB-index, Performance Coefficient, among others) in all case studies analyzed in this paper. In the real case study, it is better on all metrics.	eng
dc.identifier	https://eafit.fundanetsuite.com/Publicaciones/ProdCientif/PublicacionFrw.aspx?id=12241
dc.identifier.doi	10.1109/ACCESS.2020.3021675
dc.identifier.issn	21693536
dc.identifier.other	WOS;000572947800001
dc.identifier.uri	http://hdl.handle.net/10784/28656
dc.language.iso	eng	eng
dc.publisher	Institute of Electrical and Electronics Engineers Inc.
dc.relation	DOI;10.1109/ACCESS.2020.3021675
dc.relation	WOS;000572947800001
dc.rights	https://v2.sherpa.ac.uk/id/publication/issn/2169-3536
dc.source	IEEE Access
dc.subject	Clustering algorithms	eng
dc.subject	Partitioning algorithms	eng
dc.subject	Production	eng
dc.subject	Unsupervised learning	eng
dc.subject	Data analysis	eng
dc.subject	Proposals	eng
dc.subject	Benchmark testing	eng
dc.subject	Automatic merging	eng
dc.subject	clustering	eng
dc.subject	LAMDA	eng
dc.subject	unsupervised learning	eng
dc.title	An Automatic Merge Technique to Improve the Clustering Quality Performed by LAMDA	eng
dc.type	info:eu-repo/semantics/article	eng
dc.type	article	eng
dc.type	info:eu-repo/semantics/publishedVersion	eng
dc.type	publishedVersion	eng
dc.type.local	Artículo	spa
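
The abstract reports results in terms of the Rand Index (RI). As a reading aid, the following is a minimal sketch of the standard pairwise Rand Index, not code from the paper; the function name rand_index and the toy label lists are illustrative assumptions.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Standard Rand Index: fraction of sample pairs on which two
    partitions agree (both put the pair together, or both keep it apart).
    Illustrative helper, not taken from the LAMDA-RD paper."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        # With fewer than two samples there are no pairs to disagree on.
        return 1.0
    agreements = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Hypothetical toy labels: the partitions are identical up to renaming.
print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

Because the index compares pairs rather than label values, it does not depend on how clusters are numbered, which makes it usable when an algorithm creates its own cluster identifiers.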

Files

Original bundle

Name: An_Automatic_Merge_Technique_to_Improve_the_Clustering_Quality_Performed_by_LAMDA.pdf
Size: 7.71 MB
Format: Adobe Portable Document Format

Collections

Artículos (GIDITIC)