Maestría en Ciencias de los Datos y Analítica (tesis)
Browsing Maestría en Ciencias de los Datos y Analítica (tesis) by title
Showing 1 - 20 of 173
Publication: A Dynamic Approach to Modeling Count Data Based on Intensity Functions of Non-Homogeneous Poisson Processes and Functional Data Techniques (Universidad EAFIT, 2024) Chavarría Serna, Juan Esteban; Ortiz Arias, Santiago; Velasco, Henry
Publication: A Multivariate Outlier Detection Methodology Based on S-Orthogonal DOBIN Projections (Universidad EAFIT, 2024) Cano Campiño, Andrés Mauricio; Ortiz Arias, Santiago
Publication: A new segmentation approach using dynamic variables on individuals (Universidad EAFIT, 2021) Prieto Escobar, Nicolás; Laniado Rodas, Henry; Monroy Osorio, Juan Carlos
Publication: A predictive approach based on fuzzy cognitive maps with federated learning (Universidad EAFIT, 2023) Garatejo Vargas, Edison Camilo; Aguilar Castro, José Lizandro; Hoyos, William
Publication: A retail demand forecasting system of product groups characterized by time series based on “ensemble machine learning” techniques with feature enginnering (Universidad EAFIT, 2022) Mejía Chitiva, Santiago; Aguilar Castro, José Lisandro
Publication: A Robust Version of a Risk-Inverse Weighing Methodology for Portfolio Selection (Universidad EAFIT, 2024) Renza Chavarría, Juan Felipe; Ortiz Arias, Santiago
Publication: Algoritmo evolutivo para resolver el problema de enrutamiento de vehículos tiempo dependiente con ventanas de tiempo en una compañía del sector de alimentos y bebidas en Colombia (Universidad EAFIT, 2023) Ramírez Guilombo, Camilo; Rivera Agudelo, Juan Carlos
Publication: Análisis comparativo de modelos predictivos para la estimación de PM2.5 : un enfoque basado en aprendizaje automático y predicción conformal (Universidad EAFIT, 2024) Camelo Valera, Matías; Martínez Vargas, Juan David; Sepúlveda Cano, Lina Maria
Fine particulate matter (PM2.5) pollution poses a significant environmental and public health challenge, requiring accurate predictive models for its monitoring and control.
This study compares different machine learning approaches, including Linear Regression, Random Forest, and XGBoost, with and without the inclusion of mobility variables, to estimate PM2.5 levels. Additionally, inductive conformal prediction is implemented to quantify uncertainty in the estimates and provide confidence intervals with α = 0.05. The results show that while XGBoost experiences performance deterioration during training when mobility variables are included, it achieves the best validation performance with the lowest mean absolute error and the highest coefficient of determination. Conformal prediction enabled the establishment of confidence intervals with 89.26% coverage, close to the expected 95%, ensuring model reliability across different spatial and temporal scenarios. In conclusion, the use of machine learning models combined with advanced validation and calibration techniques, such as conformal prediction, enhances the accuracy and reliability of PM2.5 estimation. However, the quality of input variables, particularly mobility-related data, remains a challenge, highlighting the need to incorporate meteorological information and improve data resolution. These findings contribute to the development of more reliable predictive tools for environmental management and air quality policy decision-making.
Publication: Análisis de discurso basado en modelos grandes de lenguaje (Universidad EAFIT, 2024) Jiménez Jaimes, Edgar Leandro; Montoya Múnera, Edwin Nelson
This thesis explores the implementation of natural language processing techniques and large language models (LLMs) to support discourse analysis tasks in the context of the "Tenemos que hablar Colombia" program. Techniques such as topic modeling, sentiment analysis, clustering, visualization, and the creation of a conversational assistant based on Retrieval Augmented Generation (RAG) have been addressed using advanced text modeling, vector embeddings, and prompt engineering approaches.
A text classification model focused on predicting the label of the verbal indicator variable, assigned manually by the interviewer, is also presented, although this model is not directly applied to discourse analysis. This work adds to the studies of the "Tenemos que hablar Colombia" program, where other authors have contributed through computational linguistics analysis and machine learning techniques. Using advanced NLP techniques, we have sought to improve the interpretation of text data and its application in discourse analysis. The results have shown improvements in the accuracy of data classification and analysis through the techniques explored, providing a better understanding of citizen perceptions.
Publication: Análisis de discurso de los máximos responsables de las empresas participantes en el COLCAP (Universidad EAFIT, 2024) Cuervo Garcia, Dairo Alberto; Pantoja Robayo, Javier Orlando; Ceballos Cañón, Johan Armando
Publication: Análisis de explicabilidad en modelos predictivos basados en técnicas de aprendizaje automático sobre el riesgo de re-ingresos hospitalarios (Universidad EAFIT, 2023) Lopera Bedoya, Juan Camilo; Aguilar Castro, José Lisandro
Big Data and medical care are essential to analyze the risk of re-hospitalization of patients with chronic diseases and can even help prevent their deterioration. By leveraging this information, healthcare institutions can deliver accurate preventive care and thus reduce hospital admissions. Calculating the level of risk will allow planning the spending on in-patient care, in order to ensure that medical spaces and resources are available to those who need them most. This article presents several supervised models to predict when a patient may be hospitalized again after discharge.
In addition, an explainability analysis will be carried out with the predictive models to extract information associated with the predictions they make, in order to determine, for example, the degree of importance of the predictors/descriptors. In this way, the work seeks to make the results obtained more understandable for health personnel.
Publication: Análisis de la tendencia de la solución de una interacción con un Chatbot de atención al cliente, basado en análisis de sentimiento y otras variables (Universidad EAFIT, 2023) Flórez Salazar, Luz Stella; Montoya Múnera, Edwin Nelson
A chatbot is a program created with artificial intelligence. In the context of customer service, chatbots can hold conversations with customers and are trained to resolve their queries, problems, and complaints. A chatbot's ability to identify when a customer's request is not being met represents a challenge for companies that currently use this technology. One strategy to keep customers from abandoning the conversation for this reason is to escalate or transfer the conversation to a human agent. Therefore, it is essential to detect when it is time to carry out this escalation. This project evaluates different Natural Language Processing (NLP) techniques, rule-based labeling algorithms, classical supervised machine learning models, and a simple neural network for classification, applied to interactions between a customer service chatbot and a user. The goal is to find a mechanism for automatic labeling of the data and to build a model that can be used to decide whether the customer should continue interacting with the chatbot or should be transferred to a conversation with a human agent. The labeling mechanism could also be used to classify historical data, to later train a model.
Different models and techniques are evaluated, and those with the best performance in detecting the conversations that should escalate to a human agent are presented.
Publication: Análisis de la utilidad potencial del mercado colombiano a través de modelos de segmentación y customer life value para una empresa originadora de créditos de libranza (Universidad EAFIT, 2022) González Cano, Juan José; Montoya Cano, Jorge Esteban; Ochoa, Natalia
Companies currently define their target market in order to focus on certain individuals and groups of the population. However, they fail to understand in depth the future economic benefit that these market niches represent, and therefore whether their business model is attractive from a financial point of view. This project focuses on the Colombian financial sector, seeking to contribute directly to the way in which companies in this sector analyze and define the economic potential of their target market. It uses analytical and financial tools such as segmentation models and Customer Life Value analysis to estimate the value that each niche can represent in utility for the company, allowing it to outline a business strategy that ensures sustainability over time and in the market.
Drawing on the comprehensive capabilities of the project team, segmentation techniques that support different types of variables will be used to find groups whose individuals are very homogeneous within each group but very heterogeneous across groups, and thus to identify which clusters will lead the company to obtain a greater benefit.
Publication: Análisis de los resultados de la aplicación del instrumento para la evaluación docente de la universidad EAFIT (Universidad EAFIT, 2024) Fernández Carmona, Laura Catalina; Guarín Zapata, Nicolás; Mola Ávila, José Antonio; Universidad EAFIT
Publication: Análisis de patrones de violencia armada en la frontera de Colombia con Venezuela usando algoritmos de aprendizaje automático (Universidad EAFIT, 2025) Lopera Pai, Daniela; Aguilar Castro, José Lisandro
Publication: Análisis de quiebra empresarial ante escenarios de contracción de la oferta y la demanda ocasionados por el Covid-19 : un estudio del sector comercio colombiano (Universidad EAFIT, 2021) Urán González, Ana María; Arjona, Mateo
Publication: Análisis de registros de mantenimiento de centrales de generación de energía con técnicas de procesamiento de lenguaje natural (Universidad EAFIT, 2024) Ocampo Davila, Andrés Alonso; Salazar Martínez, Carlos Andres
Publication: Análisis de riesgo de impago en el sector financiero : enfoque en tarjetas de crédito (Universidad EAFIT, 2024) Herrera Olivares, Maher Stehisy; Moreno Reyes, Nicolás Alberto
Publication: Análisis del efecto que tienen los subsidios a la demanda para la adquisición de vivienda nueva en los ingresos monetarios de los beneficiarios (Universidad EAFIT, 2022) Betancur Londoño, David; Dávalos Álvarez, Eleonora
Publication: Análisis del volumen útil diario del embalse de El Peñol de 2010 a 2023 a partir de datos funcionales (Universidad EAFIT, 2025) Giraldo Gómez, Sebastián; Ortiz Arias, Santiago
This study analyzes the hydroelectric behavior of the El Peñol reservoir, with an emphasis on its historical dynamics.
Comparisons were made among four Colombian reservoirs: El Peñol, Playas, Punchiná, and San Lorenzo. To achieve this, functional statistical techniques were applied to historical data from the period 2010-2023 provided by XM, along with information on the El Niño and La Niña phenomena obtained from the Institute of Hydrology, Meteorology, and Environmental Studies (IDEAM). The variables analyzed include the turbined volume, daily usable volume, total energy generation, and market prices, with the main objective of identifying temporal patterns, seasonal trends, and functional relationships between these variables. The analysis included the calculation of functional means, the estimation of functional variances, and the application of functional principal component analysis (functional PCA). These techniques made it possible to reduce the dimensionality of the data and understand the main factors influencing hydroelectric behavior. As part of the methodology, Fourier smoothing was used to represent the variables as continuous curves, facilitating noise removal and capturing underlying trends. This approach allowed for functional comparisons between the reservoirs, highlighting both similarities and differences in their operation. The results of this functional analysis provide a solid foundation for interpreting hydrological patterns in the Antioquia region, with special attention to the El Peñol reservoir and its impact on regional hydroelectric efficiency. This reservoir, one of the most important in the country, faces significant challenges arising from fluctuations in water availability and the effects of climate change, emphasizing the need for sustainable management strategies. In this context, functional indicators were developed to evaluate the sustainability of the reservoir’s operation and propose improvements in its management.
This study contributes to the advancement of specific analytical tools for hydroelectric management in Colombia, also establishing a precedent for future research aimed at reservoirs with similar characteristics, both regionally and internationally.
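As an illustration of the techniques named in this abstract, the following is a minimal sketch of Fourier smoothing followed by functional PCA. It is not the thesis's implementation: the data are synthetic, and the function names (`fourier_basis`, `fourier_smooth`, `functional_pca`), the number of harmonics, and the one-curve-per-year layout are all assumptions made for the example.

```python
import numpy as np


def fourier_basis(t, n_harmonics):
    """Orthonormal Fourier basis on [0, 1): 1, sqrt(2)*sin(2*pi*k*t), sqrt(2)*cos(2*pi*k*t)."""
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        cols.append(np.sqrt(2.0) * np.sin(2.0 * np.pi * k * t))
        cols.append(np.sqrt(2.0) * np.cos(2.0 * np.pi * k * t))
    return np.column_stack(cols)


def fourier_smooth(curves, n_harmonics):
    """Least-squares Fourier fit. `curves` has shape (n_curves, n_points)."""
    n_points = curves.shape[1]
    t = np.arange(n_points) / n_points
    basis = fourier_basis(t, n_harmonics)
    # Solve basis @ coefs = curves.T for all curves at once.
    coefs, *_ = np.linalg.lstsq(basis, curves.T, rcond=None)
    return coefs.T, basis


def functional_pca(coefs, n_components):
    """PCA on basis coefficients. Because the basis is orthonormal,
    this matches PCA of the smoothed curves in the L2 sense."""
    mean_coefs = coefs.mean(axis=0)
    centered = coefs - mean_coefs
    _, sing, vt = np.linalg.svd(centered, full_matrices=False)
    explained = sing**2 / np.sum(sing**2)
    scores = centered @ vt[:n_components].T
    return mean_coefs, vt[:n_components], scores, explained[:n_components]


# Synthetic stand-in data (assumption): 14 yearly curves of 365 daily values,
# a seasonal sine wave with year-to-year amplitude changes plus noise.
rng = np.random.default_rng(0)
n_curves, n_points = 14, 365
t = np.arange(n_points) / n_points
amplitudes = rng.normal(1.0, 0.3, size=n_curves)
curves = (5.0 + amplitudes[:, None] * np.sin(2.0 * np.pi * t)
          + rng.normal(0.0, 0.1, size=(n_curves, n_points)))

coefs, basis = fourier_smooth(curves, n_harmonics=3)
smoothed = coefs @ basis.T            # smoothed curves on the daily grid
mean_coefs, components, scores, explained = functional_pca(coefs, n_components=2)
mean_curve = basis @ mean_coefs       # functional mean evaluated on the grid
```

In this sketch the functional mean is the basis expansion of the average coefficient vector, and each principal component can be plotted as a curve with `basis @ components[i]`. A small number of harmonics acts as the smoothing parameter; a real analysis would choose it by cross-validation or a roughness penalty.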