Maestría en Ciencias de los Datos y Analítica (tesis)
Browsing Maestría en Ciencias de los Datos y Analítica (tesis) by Title
Showing 1 - 20 of 139
Item: A Dynamic Approach to Modeling Count Data Based on Intensity Functions of Non-Homogeneous Poisson Processes and Functional Data Techniques (Universidad EAFIT, 2024) Chavarría Serna, Juan Esteban; Ortiz Arias, Santiago; Velasco, Henry

Item: A Multivariate Outlier Detection Methodology Based on S-Orthogonal DOBIN Projections (Universidad EAFIT, 2024) Cano Campiño, Andrés Mauricio; Ortiz Arias, Santiago

Item: A new segmentation approach using dynamic variables on individuals (Universidad EAFIT, 2021) Prieto Escobar, Nicolás; Laniado Rodas, Henry; Monroy Osorio, Juan Carlos

Item: A predictive approach based on fuzzy cognitive maps with federated learning (Universidad EAFIT, 2023) Garatejo Vargas, Edison Camilo; Aguilar Castro, José Lizandro; Hoyos, William

Item: A retail demand forecasting system of product groups characterized by time series based on “ensemble machine learning” techniques with feature enginnering (Universidad EAFIT, 2022) Mejía Chitiva, Santiago; Aguilar Castro, José Lisandro

Item: A Robust Version of a Risk-Inverse Weighing Methodology for Portfolio Selection (Universidad EAFIT, 2024) Renza Chavarría, Juan Felipe; Ortiz Arias, Santiago

Item: Algoritmo evolutivo para resolver el problema de enrutamiento de vehículos tiempo dependiente con ventanas de tiempo en una compañía del sector de alimentos y bebidas en Colombia (Universidad EAFIT, 2023) Ramírez Guilombo, Camilo; Rivera Agudelo, Juan Carlos

Item: Análisis de discurso basado en modelos grandes de lenguaje (Universidad EAFIT, 2024) Jiménez Jaimes, Edgar Leandro; Montoya Múnera, Edwin Nelson
This thesis explores the implementation of natural language processing techniques and large language models (LLMs) to support discourse analysis tasks in the context of the "Tenemos que hablar Colombia" program.
Techniques such as topic modeling, sentiment analysis, clustering, visualization, and the creation of a conversational assistant based on Retrieval Augmented Generation (RAG) are addressed using advanced text modeling, vector embeddings, and prompt engineering approaches. A text classification model that predicts the label of the verbal indicator variable, assigned manually by the interviewer, is also presented, although this model is not directly applied to discourse analysis. This work adds to the studies of the "Tenemos que hablar Colombia" program, to which other authors have contributed through computational linguistics analysis and machine learning techniques. Using advanced NLP techniques, the work seeks to improve the interpretation of text data and its application in discourse analysis. The results show improvements in the accuracy of data classification and analysis with the techniques explored, providing a better understanding of citizen perceptions.

Item: Análisis de discurso de los máximos responsables de las empresas participantes en el COLCAP (Universidad EAFIT, 2024) Cuervo Garcia, Dairo Alberto; Pantoja Robayo, Javier Orlando; Ceballos Cañón, Johan Armando

Item: Análisis de explicabilidad en modelos predictivos basados en técnicas de aprendizaje automático sobre el riesgo de re-ingresos hospitalarios (Universidad EAFIT, 2023) Lopera Bedoya, Juan Camilo; Aguilar Castro, José Lisandro
Big Data and medical care are essential for analyzing the risk of re-hospitalization of patients with chronic diseases and can even help prevent their deterioration. By leveraging this information, healthcare institutions can deliver accurate preventive care and thus reduce hospital admissions. Calculating the level of risk makes it possible to plan spending on in-patient care, so that medical spaces and resources are available to those who need them most.
This article presents several supervised models to predict when a patient may be hospitalized again after discharge. In addition, an explainability analysis is carried out on the predictive models to extract information about the predictions they make, in order to determine, for example, the degree of importance of the predictors/descriptors. In this way, the work seeks to make the results more understandable for health personnel.

Item: Análisis de la tendencia de la solución de una interacción con un Chatbot de atención al cliente, basado en análisis de sentimiento y otras variables (Universidad EAFIT, 2023) Flórez Salazar, Luz Stella; Montoya Múnera, Edwin Nelson
A chatbot is a program created with artificial intelligence. In the context of customer service, chatbots can hold conversations with customers and are trained to resolve their queries, problems and complaints. A chatbot's ability to identify when it is not meeting a customer's request is a challenge for companies that currently use this technology. One strategy to keep customers from abandoning the conversation for this reason is to escalate or transfer the conversation to a human agent. It is therefore essential to detect when this escalation should take place. This project evaluates different Natural Language Processing (NLP) techniques, rule-based labeling algorithms, classical supervised machine learning models and a simple neural network for classification, applied to interactions between a customer service chatbot and a user. The aim is to find a mechanism for automatically labeling the data and to build a model that can decide whether the customer should continue interacting with the chatbot or should be transferred to a human agent. The labeling mechanism could also be used to classify historical data for later model training.
Different models and techniques are evaluated, and those with the best performance in detecting the conversations that should be escalated to a human agent are presented.

Item: Análisis de la utilidad potencial del mercado colombiano a través de modelos de segmentación y customer life value para una empresa originadora de créditos de libranza (Universidad EAFIT, 2022) González Cano, Juan José; Montoya Cano, Jorge Esteban; Ochoa, Natalia
Companies currently define their target market in order to focus on certain individuals and population groups. However, they fail to understand in depth the future economic benefit that these market niches represent, and therefore whether their business model is attractive from a financial point of view. This project focuses on the Colombian financial sector and seeks to contribute to the way companies in this sector analyze and define the economic potential of their target market. It uses analytical and financial tools such as segmentation models and Customer Life Value analysis to estimate the value that each niche may represent in utility for the company, allowing it to outline a business strategy that ensures sustainability over time and in the market.
Drawing on the comprehensive capabilities of the project team, segmentation techniques that support different types of variables will be used to find groups that are internally very homogeneous but very heterogeneous from one another, and thus to identify which clusters will bring the company the greatest benefit.

Item: Análisis de los resultados de la aplicación del instrumento para la evaluación docente de la universidad EAFIT (Universidad EAFIT, 2024) Fernández Carmona, Laura Catalina; Guarín Zapata, Nicolás; Mola Ávila, José Antonio

Item: Análisis de quiebra empresarial ante escenarios de contracción de la oferta y la demanda ocasionados por el Covid-19 : un estudio del sector comercio colombiano (Universidad EAFIT, 2021) Urán González, Ana María; Arjona, Mateo

Item: Análisis de registros de mantenimiento de centrales de generación de energía con técnicas de procesamiento de lenguaje natural (Universidad EAFIT, 2024) Ocampo Davila, Andrés Alonso; Salazar Martínez, Carlos Andres

Item: Análisis de riesgo de impago en el sector financiero : enfoque en tarjetas de crédito (Universidad EAFIT, 2024) Herrera Olivares, Maher Stehisy; Moreno Reyes, Nicolás Alberto

Item: Análisis del efecto que tienen los subsidios a la demanda para la adquisición de vivienda nueva en los ingresos monetarios de los beneficiarios (Universidad EAFIT, 2022) Betancur Londoño, David; Dávalos Álvarez, Eleonora

Item: Análisis predictivo de la deserción laboral en BPO : aplicaciones avanzadas de Machine Learning (Universidad EAFIT, 2023) Castelblanco Benítez, Julián; Almonacid Hurtado, Paula Maria

Item: Análisis predictivo del riesgo de default en microcrédito un enfoque de machine learning en sector financiero (Universidad EAFIT, 2024) Mendoza Trillos, Laura; Suárez Sierra, Biviana Marcela

Item: Análisis y predicción de la deserción de empleados : un caso de estudio en la industria de software colombiana (Universidad EAFIT, 2022) Sierra Buriticá, Eliana Marcela; Almonacid Hurtado, Paula María
The objective of this study is to analyze and predict employee attrition at a software company in Medellín, based on a private database that contains 19 characteristics of 1497 workers, of whom 900 are active in the company and the rest have left their jobs. First, a descriptive and exploratory analysis was carried out. It found that some variables, such as type of identification and contract start date, contributed no information to the model. The correlation among some variables was also examined at this stage, and correlated variables were removed from the set of descriptive characteristics of the problem, since keeping them would leave redundant information in the model. Second, four machine learning models were trained (Naive Bayes, Random Forest, Decision Tree, Logistic Regression) and their results were compared in order to find the one that best fits the employee attrition problem. This step found that the best classifier is a decision tree with 14 layers, since metrics such as its learning curve and ROC curve gave better results than those of the other trained models.