Master's in Data Science and Analytics (theses)

Recent submissions

Showing 1 - 20 of 65
  • Ítem
    Predicting Stock prices in Latin America using Associative Deep Neural Networks
    (Universidad EAFIT, 2023) Gallego Rojas, Juan Fernando; Almonacid Hurtado, Paula María
    The stock market is a critical sector of the global economy, and predicting stock prices is of great interest to investors and companies. However, market movements are volatile, non-linear, and complex. This topic has attracted the attention of researchers, who have proposed formal models showing that accurate predictions can be made with appropriate variables and techniques. Deep learning algorithms are often used for this purpose because of their superior accuracy in time-series and complex-pattern analysis. This paper proposes to predict the opening, closing, highest, and lowest prices of selected Latin American market indexes using associative deep neural networks that simultaneously predict related values, based on the Long Short-Term Memory (LSTM) technique, known for its high accuracy in this area. Classic econometric methods for time-series analysis, such as ARIMA models, are also used. The proposed model achieved good predictive performance, which in turn helps identify interesting trading opportunities for investors. The models were evaluated using the average RMSE of the predicted prices and compared with a naive model.
  • Ítem
    Uso de kernels en series tiempo para la detección de prácticas manipulativas en mercados financieros
    (Universidad EAFIT, 2023) Herrera Ochoa, José Daniel; Quintero Montoya, Olga Lucía
    Intuitively, one might think that any deviation in trading data could be easily detected, given the statistical basis on which the financial sciences rest. However, the markets in which financial assets are traded operate under the principle of supply and demand, as well as the principle of opportunity, which makes them very susceptible to price manipulation. For this reason, it is increasingly relevant to consider techniques that identify elements in financial time series indicating whether or not a stock has been subject to manipulative practices. This work therefore proposes the use of kernels for signal decomposition and filtering in financial time series. With this technique, elements of the time series such as power and frequency can be obtained, which can later facilitate the characterization of a stock that has been subject to fraudulent or manipulative trading. Building on that characterization, diverse machine learning techniques are then considered to achieve more timely detection, particularly in dynamic and constantly evolving trading environments. For this purpose, the performance of the kernels will be contrasted against traditional techniques, choosing the most appropriate ones. In the same way, various machine learning techniques will be evaluated, and the one that best learns and represents the patterns or artifacts in fraudulent operations will be chosen. The aim is to raise trading standards in financial markets and to explore the applications that kernel-based signal decomposition and filtering can have, not only as a data visualization tool but also as inputs for machine learning techniques.
  • Ítem
    Análisis de explicabilidad en modelos predictivos basados en técnicas de aprendizaje automático sobre el riesgo de re-ingresos hospitalarios
    (Universidad EAFIT, 2023) Lopera Bedoya, Juan Camilo; Aguilar Castro, José Lisandro
    Big Data and medical care are essential to analyze the risk of re-hospitalization of patients with chronic diseases and can even help prevent their deterioration. By leveraging this information, healthcare institutions can deliver accurate preventive care and thus reduce hospital admissions. Calculating the level of risk allows spending on in-patient care to be planned, so that medical spaces and resources are available to those who need them most. This article presents several supervised models to predict when a patient may be hospitalized again after discharge. In addition, an explainability analysis will be carried out on the predictive models to extract information associated with the predictions they make, in order to determine, for example, the degree of importance of the predictors/descriptors. In this way, the work seeks to make the results more understandable for health personnel.
  • Ítem
    A predictive approach based on fuzzy cognitive maps with federated learning
    (Universidad EAFIT, 2023) Garatejo Vargas, Edison Camilo; Aguilar Castro, José Lisandro; Hoyos, William
  • Ítem
    Reconocimiento de emociones a partir del Speech (SER)
    (Universidad EAFIT, 2023) Giraldo Toro, Jeison Erley; Montoya Múnera, Edwin Nelson; Martínez Vargas, Juan David
  • Ítem
    Selección de sentimiento y tópicos a través de Transformers
    (Universidad EAFIT, 2023) Rendón Jiménez, Alejandro; Hernández Torres, Santiago
  • Ítem
    Development of a machine learning-based methodology for an automatic control model in a Kaolin washing process
    (Universidad EAFIT, 2023) Contreras Buitrago, Oscar Javier; Martínez Vargas, Juan David
  • Ítem
    Hacia un modelo predictivo para identificar aspectos clave en la rotación de empleados
    (Universidad EAFIT, 2023) Cárdenas López, Paula Andrea; Tabares Betancur, Marta Silvia
  • Ítem
    Hurto a personas en la ciudad de Medellín : análisis predictivo de la cantidad de casos en diferentes zonas de la ciudad a partir de modelos de machine learning implementando técnicas de MLOps
    (Universidad EAFIT, 2023) Arboleda Colorado, Jeferson Stiven; Martínez Vargas, Juan David
    Robbery of individuals in Medellín is an issue demanding immediate attention. This prompted the study of the phenomenon within an analytics project spanning data collection, database construction, modeling, and production deployment. An MLOps methodology was employed using AWS services, and visual tools related to the phenomenon were integrated to facilitate analysis.
  • Ítem
    Marco de trabajo para modelo de atribución de ventas en una empresa de ecommerce
    (Universidad EAFIT, 2023) Patiño Barraza, Daniel; Salazar Martínez, Carlos Andrés
  • Ítem
    Aplicación de técnicas de clusterización para la clasificación de música dance electrónica
    (Universidad EAFIT, 2023) Murillo Martínez, Carlos Alberto; Alunno, Marco; Martínez Vargas, Juan David
    Audio processing is one of the essential tasks for a data scientist, and audio analysis has applications in a diverse range of fields, such as medicine, telecommunications, improving sound quality in music production, and even military applications (filtering suspicious or terrorist audio). This project aims to use hard clustering techniques (such as k-means or k-nearest neighbor) and soft clustering techniques (such as fuzzy clustering) to classify input songs using different metrics. The classification methods will be used to segment previously processed input audio and obtain a sample of representative segments of the songs, determining their similarity to other songs of the same genre. Another technique that has proven effective for audio classification is convolutional neural networks (CNNs), which have been used in a wide range of fields, for example to classify violin bowing techniques [1] and even to detect potential heart problems from heartbeat sounds [2]. In this project, CNNs will be used up to the point of feature extraction, and classical classification techniques will then determine which group a section of a song belongs to.
  • Ítem
    FocusNET : an autofocusing learning‐based model for digital lensless holographic microscopy
    (Universidad EAFIT, 2023) Montoya Zuluaga, Manuel; Trujillo Anaya, Carlos Alejandro; Lopera Acosta, María Josef
    This paper reports on a convolutional neural network (CNN)-based regression model, called FocusNET, to predict the accurate reconstruction distance of raw holograms in Digital Lensless Holographic Microscopy (DLHM). This proposal provides a physical-mathematical formulation to extend its use to DLHM setups whose optical and geometrical conditions differ from those used to record the training dataset; this unique feature is tested by applying the proposal to holograms of diverse samples recorded with different DLHM setups. Additionally, a comparison between FocusNET and conventional autofocusing methods in terms of processing times and accuracy is provided. Although the proposed method predicts reconstruction distances with approximately 54 µm standard deviation, accurate information about the samples in the validation dataset is still retrieved. When compared to a method that utilizes a stack of reconstructions to find the best focal plane, FocusNET performs 600 times faster, as no hologram reconstruction is needed. When implemented in batches, the network can achieve up to a 1200-fold reduction in processing time, depending on the number of holograms to be processed. The training and validation datasets, and the code implementations, are hosted on a public GitHub repository that can be freely accessed.
  • Ítem
    Predicción del cargue de rutas de distribución mediante aprendizaje de máquina
    (Universidad EAFIT, 2023) Ramírez Aguilar, Santiago; Téllez Falla, Diego Fernando; Marentes Cubillos, Luis Andrés
  • Ítem
    RF-kNN: A Novel Ensemble Method for Improved Classification tasks
    (Universidad EAFIT, 2023) Muñoz Mercado, José Jorge; Almonacid Hurtado, Paula María; López Aguirre, Esteban
  • Ítem
    Predicción de rotación de empleados usando modelos de aprendizaje automático
    (Universidad EAFIT, 2023) Palacio Mesa, Luis Javier; Suárez Sierra, Biviana Marcela; Román Calderón, Juan Pablo
  • Ítem
    Metodología para la clasificación de documentos de texto de hojas de vida basado en aprendizaje de máquina
    (Universidad EAFIT, 2023) Matamoros Villegas, Javier Leomar; Montoya Múnera, Edwin Nelson
  • Ítem
    Análisis de la tendencia de la solución de una interacción con un Chatbot de atención al cliente, basado en análisis de sentimiento y otras variables
    (Universidad EAFIT, 2023) Flórez Salazar, Luz Stella; Montoya Múnera, Edwin Nelson
    A chatbot is a program created with artificial intelligence. In the context of customer service, chatbots can hold conversations with customers and are trained to resolve their queries, problems, and complaints. A chatbot's ability to identify when a customer's request is not being met represents a challenge for companies that currently use this technology. One strategy to keep customers from abandoning the conversation for this reason is to escalate or transfer the conversation to a human agent. Therefore, it is essential to detect when it is time to carry out this escalation. This project evaluates different Natural Language Processing (NLP) techniques, rule-based labeling algorithms, classical supervised machine learning models, and a simple neural network for classification, applied to interactions between a customer service chatbot and a user. The goal is to find a mechanism for automatically labeling the data and to build a model that can decide whether the customer should continue interacting with the chatbot or be transferred to a conversation with a human agent. The labeling mechanism could also be used to classify historical data in order to later train a model. Different models and techniques are evaluated, and those with the best performance in detecting the conversations that should be escalated to a human agent are presented.
  • Ítem
    Definición de una metodología para análisis de discurso basado en lingüística computacional y técnicas de aprendizaje de máquina
    (Universidad EAFIT, 2023) Fajardo Becerra, Daian Paola; Montoya Múnera, Edwin Nelson; Ariza Jiménez, Leandro Fabio
    The actions carried out by a state regulatory body generate multiple opinions among citizens, giving rise to debates in which people agree, disagree, or partially agree with the decisions or strategies proposed. To learn the opinions of citizens, a project called "Tenemos que hablar Chile" (We have to talk Chile) was created in Chile. It asked structured questions to a group of citizens, and each person's answer was classified by the moderator. This label was used for different discourse analyses that began to be developed without any specific order. The project was replicated in Colombia under the same dynamics, in order to learn the opinions of citizens; however, the techniques used were different from those of the Chilean project. As a result, although both projects had the same dynamics and sought a similar result, it was not possible to reuse the techniques developed in the Chilean project in Colombia. For this reason, this master's project proposes a methodology that allows the use of different discourse analysis techniques based on computational linguistics and machine learning, providing the team of analysts with a scheme of stages equipped with Natural Language Processing (NLP) tools and techniques to improve the efficiency of this type of project. The project draws on the strengths of the director, who has extensive experience in Machine Learning (ML) and NLP; of the co-director, who has a broad understanding of the project "Tenemos que Hablar Colombia" (TQHC); and of the student, whose grounding in the Master of Data Science and Analytics supports research on NLP techniques.
Anyone who consults this repository may copy excerpts of the text, provided the source is always cited, that is, the title of the work and the author. This authorization does not imply that the author waives the right to publish the work in whole or in part.
The University will not be liable for any claim that may arise from third parties asserting authorship of the work submitted by the author.
All rights reserved.