Nonparametric Generation of Synthetic Data Using Copulas

dc.contributor.advisor: Laniado Rodas, Henry [spa]
dc.contributor.advisor: Rivera Agudelo, Juan Carlos [spa]
dc.contributor.author: Restrepo Lopera, Juan Pablo
dc.coverage.spatial: Medellín de: Lat: 06 15 00 N degrees minutes Lat: 6.2500 decimal degrees Long: 075 36 00 W degrees minutes Long: -75.6000 decimal degrees [eng]
dc.creator.degree: Magíster en Matemáticas Aplicadas [spa]
dc.creator.email: jurest82@eafit.edu.co [spa]
dc.creator.grantor: This research was funded by the call 852-2019 of the Ministry of Science, Technology and Innovation of the Republic of Colombia (MinCiencias), which allowed the development of the project with code 1216-852-72082 called “Descriptive and predictive analysis of the cement and concrete production process” [spa]
dc.date.accessioned: 2023-05-16T21:34:10Z
dc.date.available: 2023-05-16T21:34:10Z
dc.date.issued: 2023
dc.description.abstract: This article presents a novel nonparametric approach to generating synthetic data using copulas, which are functions that describe the dependence structure of the real data. The proposed method addresses several challenges faced by existing synthetic data generation techniques, such as preserving the complex multivariate structures present in real data. By using all the information in the real data and verifying, through homogeneity tests, that the generated synthetic data follow the same behavior as the real data, our method is a significant improvement over existing techniques. The method is easy to implement and interpret, making it a valuable tool for addressing class imbalance in machine learning models, improving the generalization of deep learning models, and anonymizing information in finance and healthcare, among other applications. [spa]
dc.identifier.ddc: 006.312 R436
dc.identifier.uri: http://hdl.handle.net/10784/32480
dc.language.iso: spa [spa]
dc.publisher: Universidad EAFIT [spa]
dc.publisher.department: Escuela de Ciencias Aplicadas e Ingeniería. Departamento de Ciencias Matemáticas [spa]
dc.publisher.place: Medellín [spa]
dc.publisher.program: Maestría en Matemáticas Aplicadas [spa]
dc.relation.uri: https://doi.org/10.3390/electronics12071601 [spa]
dc.relation.uri: https://github.com/jurest82/SyntheticDataCopulas [spa]
dc.rights: All rights reserved [spa]
dc.rights.accessrights: info:eu-repo/semantics/openAccess [spa]
dc.rights.local: Open access [spa]
dc.subject: Synthetic data generation [spa]
dc.subject: Data augmentation [spa]
dc.subject: Homogeneity test [spa]
dc.subject: Empirical copulas [spa]
dc.subject: Nonparametric statistics [spa]
dc.subject.keyword: Synthetic data generation [spa]
dc.subject.keyword: Data augmentation [spa]
dc.subject.keyword: Homogeneity test [spa]
dc.subject.keyword: Empirical copulas [spa]
dc.subject.keyword: Nonparametric statistics [spa]
dc.subject.lemb: MATHEMATICS FOR ENGINEERS [spa]
dc.subject.lemb: STATISTICAL DATA [spa]
dc.subject.lemb: DATA MINING [spa]
dc.title: Nonparametric Generation of Synthetic Data Using Copulas [spa]
dc.type: masterThesis [eng]
dc.type: info:eu-repo/semantics/masterThesis [eng]
dc.type.hasVersion: acceptedVersion [eng]
dc.type.local: Master's thesis [spa]
dc.type.spa: Article [spa]
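
As a rough illustration of the idea summarized in the abstract, the Python sketch below resamples the joint rank structure of a real data set (a draw from its empirical copula) and maps the ranks back to values through each column's empirical quantiles. It is not the algorithm of the thesis, which is documented in the linked article and repository. The function name `empirical_copula_sample`, the half-rank jitter, and the linear interpolation between order statistics are assumptions made for this example only.

```python
import random


def empirical_copula_sample(real, n_samples, seed=None):
    """Draw synthetic rows that reuse the joint rank structure of `real`.

    Hypothetical helper for illustration: `real` is a list of equal-length
    numeric rows. Each synthetic row copies the rank vector of a randomly
    chosen real row, jitters every rank by up to half a step, and converts
    the jittered rank back to a value by interpolating between the sorted
    values of that column.
    """
    rng = random.Random(seed)
    n, d = len(real), len(real[0])

    # Sorted values per column act as the empirical quantile function.
    sorted_cols = [sorted(row[j] for row in real) for j in range(d)]

    # ranks[i][j] is the 0-based rank of real[i][j] within column j.
    ranks = [[0] * d for _ in range(n)]
    for j in range(d):
        order = sorted(range(n), key=lambda i: real[i][j])
        for r, i in enumerate(order):
            ranks[i][j] = r

    synth = []
    for _ in range(n_samples):
        i = rng.randrange(n)  # whole row: keeps the dependence between columns
        row = []
        for j in range(d):
            # Jitter the rank, then clamp it to the valid range [0, n - 1].
            pos = ranks[i][j] + rng.uniform(-0.5, 0.5)
            pos = min(max(pos, 0.0), n - 1.0)
            lo = int(pos)
            hi = min(lo + 1, n - 1)
            frac = pos - lo
            col = sorted_cols[j]
            # Linear interpolation between neighbouring order statistics.
            row.append(col[lo] + frac * (col[hi] - col[lo]))
        synth.append(row)
    return synth
```

Calling `empirical_copula_sample(real, 500, seed=1)` on a list of rows returns 500 synthetic rows whose values stay within the observed range of each column. A homogeneity test comparing the real and synthetic samples, as the abstract describes, would be the natural next check; it is left out here because the abstract does not specify which test the thesis uses.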

Files

Original bundle
Showing 1 - 3 of 3

Name: carta_aprobacion_trabajo_grado_eafit.pdf
Size: 78.09 KB
Format: Adobe Portable Document Format
Description: Degree project approval letter

Name: JuanPablo_RestrepoLopera_2023.pdf
Size: 8.83 MB
Format: Adobe Portable Document Format
Description: Degree project

Name: formulario_autorizacion_publicacion_obras.pdf
Size: 1.31 MB
Format: Adobe Portable Document Format
Description: Authorization form for publication of works

License bundle
Showing 1 - 1 of 1

Name: license.txt
Size: 2.5 KB
Format: Item-specific license agreed upon to submission
Description: