Master's in Data Science and Analytics (theses)
Recent submissions

Analysis and prediction of motorcycle sales using the “Customer Value Map” methodology and Machine Learning techniques (Universidad EAFIT, 2024) Díaz Cordero, Sandra Marcela; Martínez Vargas, Juan David; Vallejo Correa, Paola Andrea

Analysis of maintenance records from power generation plants using natural language processing techniques (Universidad EAFIT, 2024) Ocampo Davila, Andrés Alonso; Salazar Martínez, Carlos Andres

A framework for data governance. Case study: Energy Transmission and Distribution Company (Universidad EAFIT, 2024) Hernández Montoya, María Margarita; Tabares Betancur, Marta Silvia del Socorro

Financial asset portfolio optimization using Markowitz and Black-Litterman: a quantum computing perspective (Universidad EAFIT, 2024) Jaramillo Pineda, Carlos Andrés; Almonacid Hurtado, Paula María; Lalinde Pulido, Juan Guillermo
Quantum computing, currently at an emerging stage, has the potential to transform various sectors, including finance. While portfolio optimization strategies based on classical methods such as Markowitz and Black-Litterman have already proven effective, the introduction of quantum algorithms could significantly enhance these techniques in terms of predictive ability and computational efficiency. Building on previous research such as Bova (2021), which underlines the crucial role that quantum computing could play given the volume and complexity of financial data, this paper proposes a framework that integrates the classical Markowitz and Black-Litterman theories with quantum computing.
Through this hybrid approach, we explore how hybrid classical-quantum algorithms can enrich the portfolio optimization process, offering significant advantages in financial analysis and strategic decision-making.

Nighttime-light satellite imagery as a proxy for the value added of the municipalities of the department of Antioquia (Universidad EAFIT, 2024) Marín Castaño, Kevin Stiv; Almonacid Hurtado, Paula María

Spatial dependence in credit risk. Spatial analysis for the case of Medellín (Universidad EAFIT, 2024) Castro Castaño, Esteban; García Cruz, Gustavo Adolfo

A Multivariate Outlier Detection Methodology Based on S-Orthogonal DOBIN Projections (Universidad EAFIT, 2024) Cano Campiño, Andrés Mauricio; Ortiz Arias, Santiago

Open cluster membership using robust and non-parametric statistical methods (Universidad EAFIT, 2024) Madrid Álvarez, Simón; Laniado, Henry; Muñoz, Juan Carlos

Workflow based on Natural Language Processing (NLP) for extracting insights from user-generated content (UGC). Case study applied to a fintech (Universidad EAFIT, 2024) Barrera Ravelo, Angie Karina; Montoya Múnera, Edwin Nelson

Analyzing patterns of success on YouTube: a recommendation system for creators of educational content (Universidad EAFIT, 2024) Osorio Urrea, Vanessa; Ortiz Arias, Santiago; del Castillo Cortázar, Francisco Javier

Predicting agricultural crop yield in the five districts (corregimientos) of the city of Medellín using Machine Learning models (Universidad EAFIT, 2024) Gómez Arango, Alba Miriam; Valencia Diaz, Edison; Zuluaga Orrego, Juan Fernando
In a global context where agriculture and food production play a crucial role in food security, employment, and sustainability, this study focuses on predicting the yield of agricultural crops in the five districts of Medellín.
The main objective is to design a prediction model for nine local crops using machine learning techniques. Medellín is distinguished by its diversity of crops, including peri-urban agriculture characterized by small productive plots distributed across various chagra-type crops. These traditional agricultural practices are carried out by an aging population of farmers. Accurate yield prediction is essential, because a significant portion of the production is dedicated to self-consumption, with a subsistence focus. Surpluses are also traded, however, which directly affects the food security of the local community. The results highlight the effectiveness of machine learning models, particularly ensemble models such as PCA Random Forest and PCA XGBoost, in predicting the yield of the crops under study. These models are able to capture relationships between variables and the heterogeneity present in territorial production. Opportunities to reduce model error have nevertheless been identified, and these can be addressed through continuous data collection and technical support provided to farmers. This will not only increase data availability but also help refine the model and improve understanding of performance behavior in the analyzed crops, facilitating decision-making in the agricultural sector of the municipality of Medellín. This project represents a valuable tool for professionals in the agricultural sector and for institutions responsible for agricultural planning and development. It offers an innovative approach to sector data analysis, leveraging the advantages of data science.
Through these techniques, opportunities open up to establish strategies, plans, and projects that contribute to crop planning, the management of productive areas in the municipality, and the strengthening of local food security.

A Robust Version of a Risk-Inverse Weighing Methodology for Portfolio Selection (Universidad EAFIT, 2024) Renza Chavarría, Juan Felipe; Ortiz Arias, Santiago

Predicting Infarct In-Hospital Mortality Using Machine Learning Supervised Models and Administrative Data for Colombia (Universidad EAFIT, 2023) Giraldo González, Sebastián; Posso Suárez, Christian Manuel; Gómez Toro, Catalina

Predicting nutritional disorders based on the weight-for-height index in children under 5 years of age in the city of Medellín (Universidad EAFIT, 2023) Bedoya Ríos, Santiago; Martínez Vargas, Juan David; Sepúlveda Cano, Lina María

Dynamic cost forecasting Drayage Product in inland transport using machine learning models (Universidad EAFIT, 2023) Peralta Jaramillo, Juliana Andrea; Moreno Reyes, Nicolás Alberto

Deep reinforcement learning for fixed-income portfolio management (Universidad EAFIT, 2023) Mejía Estrada, David; Almonacid Hurtado, Paula María
This paper applies deep reinforcement learning techniques to the management of fixed-income investment portfolios, specifically sovereign securities issued by the Colombian government. The period of analysis runs from January 2015 to December 2022. We find that it is possible to generate returns and manage risk efficiently through the trading strategies that deep reinforcement learning models identify as most suitable given market conditions and the characteristics of each security, such as the risk implied by metrics like DV01, duration, and convexity.
Finally, this study contributes to the field of machine learning and artificial intelligence applications in investment portfolio management, with a relatively new focus on the fixed-income market in general, and is one of the first works to apply reinforcement learning techniques to the Colombian public debt market.

Predicting credit payment default at a financial institution using customer service chats (Universidad EAFIT, 2023) Patiño Serna, Javier; Martínez Vargas, Juan David; Vallejo Correa, Paola Andrea

Hybrid Heuristic Solution Strategies for the 𝑝-Regions Problem (Universidad EAFIT, 2023) Castañeda Bedoya, Leidy Marcela; Rivera Agudelo, Juan Carlos

Estimating a relationship between tombstone cost and longevity using cemetery data in Medellín City in Colombia (Universidad EAFIT, 2023) Martínez Rodríguez, Astrid Lizeth; Zapata Múnera, Uriel
Recent studies in the West suggested that tombstone cost is associated with longevity. The main goal of this study was to investigate the association between tombstone cost and life expectancy in a large cemetery in Latin America. Age at death was obtained from 2,273 consecutive death certificates held at the San Pedro Cemetery Museum in Medellín, Colombia. Subjects died between 2020 and 2022. Tombs are arranged in galleries in the cemetery, and tombstone cost is based on the material from which the tombstone was made, its position in the gallery, and its ornamentation. Approximately 76% of the tombstones were low cost, 16% medium cost, and 8% high cost. Analysis of variance was used, and the assumption of equal variance was not violated. Because the data were not normally distributed, non-parametric techniques were needed to assess statistical differences, and the Kruskal-Wallis test was employed for this purpose. Longevity was similar in the low-cost group and medium-cost group: 61.5±23.9 versus 61.6±24.6 years [estimated mean (95% confidence interval)].
Longevity was lower in the high-cost group: 55.8±25.8 years. The inverse association between tombstone cost and longevity would suggest that people in Medellín are inclined to spend more on tombstones when commemorating the tragic death of a young person.

Detecting Outliers with a Non-parametric estimation of the Mahalanobis distance (Universidad EAFIT, 2023) Piedrahita Jaramillo, Catalina; Laniado Rodas, Henry; Saldarriaga Aristizábal, Pablo Andrés
This paper proposes a robust version of the Mahalanobis distance for the outlier identification problem. It uses robust and non-parametric estimators of the covariance matrix, such as Kendall's tau and the Median Absolute Deviation (MAD), together with techniques that improve the numerical properties of the covariance matrix and reduce error during numerical calculations, such as Ledoit-Wolf shrinkage. The performance of the methods is evaluated through simulation on independent normal data, correlated normal data, and real data sets, and is compared with some methods from the literature. The proposed methods achieve a high percentage of correct identification of outliers and have a low false positive rate for both data types, particularly in the case of correlated normal data.
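
As a rough illustration of the idea in the last abstract above, the following minimal sketch builds a Mahalanobis-type distance from the coordinate-wise median, MAD scales, and a correlation matrix derived from Kendall's tau. It is not the authors' implementation. The function names, the sin(pi * tau / 2) mapping from tau to a correlation, and the fixed `shrink` weight toward the identity (a simple stand-in for the data-driven Ledoit-Wolf intensity) are all assumptions made for illustration.

```python
import numpy as np

def kendall_tau(x, y):
    """Kendall's tau-a from pairwise sign concordance (assumes no ties)."""
    dx = np.sign(x[:, None] - x[None, :])
    dy = np.sign(y[:, None] - y[None, :])
    n = len(x)
    # Summing over ordered pairs counts each unordered pair twice.
    return (dx * dy).sum() / (n * (n - 1))

def robust_mahalanobis(X, shrink=0.1):
    """Hypothetical robust distance; `shrink` is a fixed, assumed intensity."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    # Robust location and scale: median and MAD (scaled for normal data).
    center = np.median(X, axis=0)
    mad = 1.4826 * np.median(np.abs(X - center), axis=0)
    # Rank-based correlation; the sine mapping is an assumption that is
    # appropriate for elliptical data.
    R = np.eye(p)
    for i in range(p):
        for j in range(i + 1, p):
            tau = kendall_tau(X[:, i], X[:, j])
            R[i, j] = R[j, i] = np.sin(np.pi * tau / 2.0)
    # Shrink toward the identity to improve conditioning. The pairwise
    # matrix is not guaranteed positive definite without this step.
    R = (1.0 - shrink) * R + shrink * np.eye(p)
    cov = R * np.outer(mad, mad)
    diff = X - center
    sol = np.linalg.solve(cov, diff.T).T
    return np.sqrt(np.einsum("ij,ij->i", diff, sol))
```

Observations with unusually large distances are outlier candidates. The cutoff (for example, a chi-square quantile with p degrees of freedom) is left to the caller, since the abstract does not state which threshold the thesis uses.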