Master's in Data Science and Analytics (theses)
Browsing Master's in Data Science and Analytics (theses) by Subject "ACORDES"
Publication: Automatic chord detection using audio characterization techniques and machine learning (Detección automática de acordes empleando técnicas de caracterización de audio y machine learning)
Universidad EAFIT, 2025
Gil Urrego, Rafael Alejandro; Martínez Vargas, Juan David; Sepúlveda Cano, Lina María

Automatic chord detection in audio tracks is essential for musical applications such as music transcription and score generation, and the data science community has shown growing interest in strategies that address it. The main approach studied in recent years extracts features that carry chord information from audio files. Transforming the audio signal with frequency analysis tools, such as the Mel spectrogram and the chromagram, yields data that better describe the musical components of the processed track. Supervised models such as Support Vector Machines (SVM), Random Forest, and Convolutional Neural Networks (CNN) have been employed in several studies and have demonstrated high accuracy in chord identification. In most cases, however, they are limited in the number of chord classes they can estimate, typically to a maximum of 24, because adding classes tends to confuse the system. In this thesis, a system for automatic chord identification was developed by implementing different classical and modern analytical models. For audio feature extraction, the pre-trained models HuBERT and VGGish were used. The extracted features were then fed into three classical models (SVM, Random Forest, and Gradient Boosting) to compare their results with those obtained by a modern model. The HuBERT architecture was chosen as the modern baseline model because it can function both as a feature extractor and as a classifier.
The experiments were conducted using recordings of 48 different chord classes, all played on a digital piano, which served as the dataset for training and evaluating the proposed system. The study confirmed previous research findings: accurate chord class estimation depends on improving the characterization techniques applied to the input audio recordings. A recurring issue was the lack of a detailed description of the musical components in the recordings, which kept the models from delivering optimal results. The findings highlight that precise feature extraction is key to reducing model generalization error, enabling better chord class identification in both classical supervised approaches and modern architectures such as HuBERT. Finally, the thesis concludes that modern models, including those based on Transformers, depend heavily on the quantity and diversity of the data. To adapt effectively, they need training data with sufficient variation within each class. When data lack intra-class variability, these systems struggle to adapt to new recordings, especially those with background noise or distortions.
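
As an illustration of the chromagram idea mentioned in the abstract, the following is a minimal sketch of how spectral energy can be folded into 12 pitch-class bins. It is not the thesis's method (the thesis reports HuBERT and VGGish features); the function names, the 440 Hz tuning reference, and the example peak values are assumptions made only for this illustration.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class(freq_hz, a4_hz=440.0):
    """Map a frequency to its nearest equal-tempered pitch class (0 = C)."""
    # MIDI note number: A4 (440 Hz, assumed tuning) is note 69.
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    return midi % 12


def chroma_vector(peaks):
    """Fold (frequency_hz, magnitude) peaks into 12 normalized chroma bins."""
    chroma = [0.0] * 12
    for freq_hz, magnitude in peaks:
        if freq_hz > 0:
            # Octave information is discarded: C4 and C5 share a bin.
            chroma[pitch_class(freq_hz)] += magnitude
    total = sum(chroma)
    return [v / total for v in chroma] if total > 0 else chroma


# Hypothetical spectral peaks for a C major triad (C4, E4, G4) plus C5.
c_major_peaks = [(261.63, 1.0), (329.63, 0.8), (392.00, 0.9), (523.25, 0.3)]
chroma = chroma_vector(c_major_peaks)
active = [PITCH_CLASSES[i] for i, v in enumerate(chroma) if v > 0]
```

A vector like `chroma` could then serve as one input row for a classifier such as an SVM. Real recordings would first need a spectral transform to obtain the peaks, which this sketch leaves out.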