Master's in Data Science and Analytics (theses)

Recent submissions

Showing 1 - 20 of 77
  • Item
    Analizando patrones de éxito en YouTube : un sistema de recomendación para creadores de contenidos educativos
    (Universidad EAFIT, 2024) Osorio Urrea, Vanessa; Ortiz Arias, Santiago; del Castillo Cortázar, Francisco Javier
  • Item
    Predicción del rendimiento de cultivos agrícolas en los cinco corregimientos de la ciudad de Medellín, utilizando modelos de Machine Learning
    (Universidad EAFIT, 2024) Gómez Arango, Alba Miriam; Valencia Diaz, Edison; Zuluaga Orrego, Juan Fernando
    In a global context where agriculture and food production play a crucial role in food security, employment, and sustainability, this study focuses on predicting the yield of agricultural crops in the five districts of Medellín. The main objective is to design a prediction model for nine local crops using machine learning techniques. Medellín is distinguished by its diversity of crops, including peri-urban agriculture characterized by small productive plots distributed across various chagra-type crops. These traditional agricultural practices are carried out by an aging population of farmers. Accurate yield prediction is essential, as a significant portion of production is dedicated to self-consumption, with a subsistence focus. However, surpluses are also traded, directly affecting the food security of the local community. The results highlight the effectiveness of machine learning models, particularly ensemble models such as PCA Random Forest and PCA XGB Boosting, in predicting the yield of the crops under study. These models are able to capture relationships between variables and the heterogeneity of production across the territory. However, opportunities to reduce model error have been identified, which can be addressed through continuous data collection and technical support for farmers. This will not only increase data availability but also help refine the model and clarify yield behavior in the analyzed crops, facilitating decision-making in the agricultural sector of the municipality of Medellín. This project is a valuable tool for professionals in the agricultural sector and for institutions responsible for agricultural planning and development. It offers an innovative approach to sector data analysis that leverages data science. These techniques open opportunities to establish strategies, plans, and projects that contribute to crop planning, the management of productive areas in the municipality, and the strengthening of local food security.
  • Item
    A Robust Version of a Risk-Inverse Weighing Methodology for Portfolio Selection
    (Universidad EAFIT, 2024) Renza Chavarría, Juan Felipe; Ortiz Arias, Santiago
  • Item
    Predicting Infarct In-Hospital Mortality Using Machine Learning Supervised Models and Administrative Data for Colombia
    (Universidad EAFIT, 2023) Giraldo González, Sebastián; Posso Suárez, Christian Manuel; Gómez Toro, Catalina
  • Item
    Predicción de alteraciones nutricionales en función del índice peso para la talla en niños menores de 5 años, de la ciudad de Medellín
    (Universidad EAFIT, 2023) Bedoya Ríos, Santiago; Martínez Vargas, Juan David; Sepúlveda Cano, Lina María
  • Item
    Dynamic cost forecasting Drayage Product in inland transport using machine learning models
    (Universidad EAFIT, 2023) Peralta Jaramillo, Juliana Andrea; Moreno Reyes, Nicolás Alberto
  • Item
    Aprendizaje reforzado profundo para la administración de portafolios de renta fija
    (Universidad EAFIT, 2023) Mejía Estrada, David; Almonacid Hurtado, Paula María
    This paper applies deep reinforcement learning techniques to the management of fixed income investment portfolios, specifically sovereign securities issued by the Colombian government. The period of analysis covers seven years, from January 2015 to December 2022. We find that it is possible to generate returns and manage risk efficiently through the trading strategies that deep reinforcement learning models select as most suitable given market conditions and the characteristics of each security, such as the risk implied by metrics like DV01, duration, and convexity. Finally, this study contributes to the field of machine learning and artificial intelligence applications in investment portfolio management, with a relatively new focus on the fixed income market in general, and is one of the first works to apply reinforcement learning techniques to the Colombian public debt market.
  • Item
    Predicción de incumplimiento de pagos de crédito en una entidad financiera utilizando chats de servicio al cliente
    (Universidad EAFIT, 2023) Patiño Serna, Javier; Martínez Vargas, Juan David; Vallejo Correa, Paola Andrea
  • Item
    Hybrid Heuristic Solution Strategies for the 𝑝-Regions Problem
    (Universidad EAFIT, 2023) Castañeda Bedoya, Leidy Marcela; Rivera Agudelo, Juan Carlos
  • Item
    Estimating a relationship between tombstone cost and longevity using cemetery data in Medellín City in Colombia
    (Universidad EAFIT, 2023) Martínez Rodríguez, Astrid Lizeth; Zapata Múnera, Uriel
    Recent studies in the West have suggested that tombstone cost is associated with longevity. The main goal of this study was to investigate the association between tombstone cost and life expectancy in a large cemetery in Latin America. Age at death was obtained from 2,273 consecutive death certificates held at the San Pedro Cemetery Museum in Medellín, Colombia. Subjects died between 2020 and 2022. Tombs are arranged in galleries in the cemetery, and tombstone cost is based on the material from which the tombstone was made, its position in the gallery, and its ornamentation. Approximately 76% of the tombstones were low cost, 16% medium cost, and 8% high cost. Analysis of variance was used, and the assumption of equal variance was not violated. Because the data did not follow a normal distribution, non-parametric techniques were needed to assess statistical differences, and the Kruskal-Wallis test was employed for this purpose. Longevity was similar in the low-cost and medium-cost groups: 61.5±23.9 versus 61.6±24.6 years [estimated mean (95% confidence interval)]. Longevity was lower in the high-cost group: 55.8±25.8 years. The inverse association between tombstone cost and longevity suggests that people in Medellín are inclined to spend more on tombstones when commemorating the tragic death of a young person.
  • Item
    Detecting Outliers with a Non-parametric estimation of the Mahalanobis distance
    (Universidad EAFIT, 2023) Piedrahita Jaramillo, Catalina; Laniado Rodas, Henry; Saldarriaga Aristizábal, Pablo Andrés
    This paper proposes a robust version of the Mahalanobis distance for the outlier identification problem. It uses robust and non-parametric estimates of the covariance matrix, such as Kendall's Tau and the Median Absolute Deviation (MAD), together with techniques that improve the numerical properties of the covariance matrix and reduce error during numerical calculations, such as Ledoit and Wolf's shrinkage. The performance of the methods is evaluated through simulations of independent normal data and correlated normal data, as well as on real data sets, and is compared with some methods from the literature. The proposed methods achieve a high percentage of correctly identified outliers and a low false positive rate for both data types, particularly in the case of correlated normal data.
  • Item
    Análisis predictivo de la deserción laboral en BPO : aplicaciones avanzadas de Machine Learning
    (Universidad EAFIT, 2023) Castelblanco Benítez, Julián; Almonacid Hurtado, Paula María
  • Item
    Predicting Stock prices in Latin America using Associative Deep Neural Networks
    (Universidad EAFIT, 2023) Gallego Rojas, Juan Fernando; Almonacid Hurtado, Paula María
    The stock market is a critical sector of the global economy, and predicting stock prices is of great interest to investors and companies. However, market movements are volatile, non-linear, and complex. This topic has attracted the attention of researchers, who have proposed formal models showing that accurate predictions can be made with appropriate variables and techniques. Deep learning algorithms are often used for this purpose because of their accuracy in time series and complex pattern analysis. This paper proposes to predict the opening, closing, highest, and lowest prices of selected Latin American market indexes using associative deep neural networks that can simultaneously predict related values, based on the Long Short-Term Memory (LSTM) technique, which is known for its high accuracy in this area. Classic econometric methods for time series analysis, such as ARIMA models, are also used. The proposed model achieved good predictive performance, which in turn makes it possible to find interesting trading opportunities for investors. The results of the models were measured using the average RMSE of the predicted prices and compared with those obtained using a naive model.
  • Item
    Uso de kernels en series tiempo para la detección de prácticas manipulativas en mercados financieros
    (Universidad EAFIT, 2023) Herrera Ochoa, José Daniel; Quintero Montoya, Olga Lucía
    Intuitively, one might think that any deviation in trading data could be easily detected because of the statistical basis on which the financial sciences rest. However, the markets in which financial assets are traded operate under the principle of supply and demand, as well as the principle of opportunity, which makes them very susceptible to price manipulation. For this reason, it is increasingly relevant to consider techniques that identify elements in financial time series indicating whether or not a stock has been subject to manipulative practices. This work therefore proposes the use of kernels for signal decomposition and filtering in financial time series. This technique yields elements of the time series, such as power and frequency, that can later help characterize a stock that has been subject to fraudulent or manipulative trading. Based on that characterization, diverse machine learning techniques are then considered in order to achieve more timely detection, particularly in dynamic and constantly evolving trading environments. For this purpose, the performance of the kernels will be contrasted against traditional techniques, and the most appropriate ones will be chosen. In the same way, various machine learning techniques will be evaluated, and the one that best learns and represents the patterns or artifacts in fraudulent operations will be chosen. The work thereby seeks to raise trading standards in financial markets and to explore the applications of signal decomposition and filtering with kernels, not only as a data visualization tool but also as inputs for machine learning techniques.
  • Item
    Análisis de explicabilidad en modelos predictivos basados en técnicas de aprendizaje automático sobre el riesgo de re-ingresos hospitalarios
    (Universidad EAFIT, 2023) Lopera Bedoya, Juan Camilo; Aguilar Castro, José Lisandro
    Big Data and medical care are essential for analyzing the risk of re-hospitalization of patients with chronic diseases and can even help prevent their deterioration. By leveraging this information, healthcare institutions can deliver accurate preventive care and thus reduce hospital admissions. Calculating the level of risk allows spending on in-patient care to be planned, so that medical spaces and resources are available to those who need them most. This article presents several supervised models to predict when a patient may be hospitalized again after discharge. In addition, an explainability analysis will be carried out on the predictive models to extract information associated with the predictions they make, in order to determine, for example, the degree of importance of the predictors/descriptors. The aim is to make the results more understandable for health personnel.
  • Item
    A predictive approach based on fuzzy cognitive maps with federated learning
    (Universidad EAFIT, 2023) Garatejo Vargas, Edison Camilo; Aguilar Castro, José Lisandro; Hoyos, William
  • Item
    Reconocimiento de emociones a partir del Speech (SER)
    (Universidad EAFIT, 2023) Giraldo Toro, Jeison Erley; Montoya Múnera, Edwin Nelson; Martínez Vargas, Juan David
  • Item
    Selección de sentimiento y tópicos a través de Transformers
    (Universidad EAFIT, 2023) Rendón Jiménez, Alejandro; Hernández Torres, Santiago
  • Item
    Development of a machine learning-based methodology for an automatic control model in a Kaolin washing process
    (Universidad EAFIT, 2023) Contreras Buitrago, Oscar Javier; Martínez Vargas, Juan David
  • Item
    Hacia un modelo predictivo para identificar aspectos clave en la rotación de empleados
    (Universidad EAFIT, 2023) Cárdenas López, Paula Andrea; Tabares Betancur, Marta Silvia
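
The abstract of "Detecting Outliers with a Non-parametric estimation of the Mahalanobis distance" listed above mentions Kendall's Tau, the Median Absolute Deviation (MAD), and shrinkage as building blocks for a robust Mahalanobis distance. The following is only a minimal sketch of how such pieces might fit together, not the authors' method. It assumes the median as the location estimate, the usual 1.4826 MAD consistency factor for normal data, the sin(pi * tau / 2) conversion from Kendall's Tau to a correlation, and a fixed shrinkage weight toward the identity in place of Ledoit and Wolf's data-driven estimate. The function names and the `shrink` parameter are hypothetical.

```python
import numpy as np


def kendall_tau(x, y):
    """Kendall's tau-a: mean sign agreement over all pairs (assumes no ties)."""
    dx = np.sign(x[:, None] - x[None, :])
    dy = np.sign(y[:, None] - y[None, :])
    n = len(x)
    return (dx * dy).sum() / (n * (n - 1))


def robust_mahalanobis(X, shrink=0.1):
    """Distance of each row of X from the coordinate-wise median.

    Assumptions (not taken from the thesis): MAD-based scales,
    a Kendall-based correlation matrix, and a fixed shrinkage weight.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    center = np.median(X, axis=0)
    # MAD rescaled so that it estimates the standard deviation for normal data.
    scale = 1.4826 * np.median(np.abs(X - center), axis=0)

    corr = np.eye(d)
    for i in range(d):
        for j in range(i + 1, d):
            tau = kendall_tau(X[:, i], X[:, j])
            corr[i, j] = corr[j, i] = np.sin(np.pi * tau / 2.0)

    # Fixed shrinkage toward the identity to improve conditioning
    # (a stand-in for a data-driven shrinkage estimate).
    corr = (1.0 - shrink) * corr + shrink * np.eye(d)
    cov = corr * np.outer(scale, scale)

    diff = X - center
    solved = np.linalg.solve(cov, diff.T).T  # cov^{-1} (x - center) per row
    return np.sqrt(np.sum(diff * solved, axis=1))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(size=(200, 3))
    data[0] = [10.0, 10.0, 10.0]  # planted outlier
    distances = robust_mahalanobis(data)
    print("index of largest distance:", int(np.argmax(distances)))
```

In a sketch like this, points whose distance exceeds a chosen cutoff would be flagged as outliers; the cutoff itself is a further design choice that the abstract does not specify.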
Anyone who consults this repository may copy excerpts of the text, provided the source is always cited, that is, the title of the work and the author. This authorization does not imply a waiver of the author's right to publish the work in whole or in part.
The University shall not be liable for any claim that may arise from third parties asserting authorship of the work presented by the author.
All rights reserved.