Nonparametric Generation of Synthetic Data Using Copulas

Date
2023
Author(s)
Restrepo Lopera, Juan Pablo
Advisor(s) / Researcher(s)
Laniado Rodas, Henry
Rivera Agudelo, Juan Carlos
Abstract
This article presents a novel nonparametric approach to generating synthetic data using copulas, which are functions that describe the dependence structure of the real data. The proposed method addresses several challenges faced by existing synthetic data generation techniques, such as preserving the complex multivariate structures present in real data. By using all the information from the real data and verifying, through homogeneity tests, that the generated synthetic data follow the same behavior as the real data, our method offers a significant improvement over existing techniques. It is easy to implement and interpret, making it a valuable tool for solving class imbalance problems in machine learning models, improving the generalization capabilities of deep learning models, and anonymizing information in the finance and healthcare domains, among other applications.
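
To make the idea of nonparametric copula-based generation concrete, the following is a minimal sketch and not the authors' implementation (see the repository linked below for that). It assumes one common nonparametric construction, the checkerboard (jittered empirical) copula: draw a rank vector from the real data, jitter it uniformly within its rank cell, and map each coordinate back through the empirical quantile function of the corresponding column. The function name `sample_checkerboard_copula` and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np


def sample_checkerboard_copula(real, n_samples, seed=None):
    """Draw synthetic rows that mimic the marginals and dependence of `real`.

    Illustrative assumption: a checkerboard (jittered empirical) copula,
    not necessarily the construction used in the article.
    """
    rng = np.random.default_rng(seed)
    real = np.asarray(real, dtype=float)
    n, d = real.shape

    # Column-wise ranks in 0..n-1 (ties are broken by sort order).
    ranks = real.argsort(axis=0).argsort(axis=0)

    # Pick observed rank vectors at random; this carries the dependence.
    rows = rng.integers(0, n, size=n_samples)

    # Jitter inside each rank cell to get copula samples in [0, 1).
    u = (ranks[rows] + rng.random((n_samples, d))) / n

    # Map back to the data scale with each column's empirical quantiles.
    return np.column_stack(
        [np.quantile(real[:, j], u[:, j]) for j in range(d)]
    )


# Toy "real" data: two positively dependent columns.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
real = np.column_stack([x, np.exp(x + 0.3 * rng.normal(size=500))])

synth = sample_checkerboard_copula(real, n_samples=1000, seed=1)
print(synth.shape)
```

Because the dependence is carried by the observed rank vectors and the marginals by empirical quantiles, no parametric family is assumed for either. A homogeneity check of the kind the abstract mentions would then compare `real` and `synth`; which test to use is left open here.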

Source / Editor URL
https://doi.org/10.3390/electronics12071601
https://github.com/jurest82/SyntheticDataCopulas