Comparison and evaluation of different methods for the feature extraction from educational contents

dc.citation.journalTitle: Computation (eng)
dc.contributor.author: Aguilar, J.
dc.contributor.author: Salazar, C.
dc.contributor.author: Velasco, H.
dc.contributor.author: Monsalve-Pulido, J.
dc.contributor.author: Montoya, E.
dc.contributor.department: Universidad EAFIT. Departamento de Ingeniería de Sistemas (spa)
dc.contributor.researchgroup: I+D+I en Tecnologías de la Información y las Comunicaciones (spa)
dc.creator: Aguilar, J.
dc.creator: Salazar, C.
dc.creator: Velasco, H.
dc.creator: Monsalve-Pulido, J.
dc.creator: Montoya, E.
dc.date.accessioned: 2021-04-12T20:55:48Z
dc.date.available: 2021-04-12T20:55:48Z
dc.date.issued: 2020-01-01
dc.description.abstract: This paper analyses the capabilities of different techniques to build a semantic representation of educational digital resources. Educational digital resources are modeled using the Learning Object Metadata (LOM) standard, and these semantic representations can be obtained from different LOM fields, such as the title and the description, in order to extract the features of the digital resources. The feature extraction methods used in this paper are Best Matching 25 (BM25), Latent Semantic Analysis (LSA), Doc2Vec, and Latent Dirichlet Allocation (LDA). The features/descriptors they generate are tested on three types of educational digital resources (scientific publications, learning objects, patents), on a paraphrase corpus, and in two use cases: an information retrieval context and an educational recommendation system. The analysis uses unsupervised metrics, namely two similarity functions and the entropy, to determine the quality of the features produced by each method. In addition, the paper presents tests of the techniques for the classification of paraphrases. The experiments show that the performance of the feature extraction methods varies considerably with the type of content and the metric: a method that outperforms the others in some cases is outperformed by them in others. © 2020 by the authors. (eng)
dc.identifier: https://eafit.fundanetsuite.com/Publicaciones/ProdCientif/PublicacionFrw.aspx?id=11932
dc.identifier.doi: 10.3390/COMPUTATION8020030
dc.identifier.issn: 20793197
dc.identifier.other: WOS;000551199500003
dc.identifier.other: SCOPUS;2-s2.0-85085131929
dc.identifier.uri: http://hdl.handle.net/10784/28641
dc.language.iso: eng (eng)
dc.publisher: MDPI Multidisciplinary Digital Publishing Institute
dc.relation: DOI;10.3390/COMPUTATION8020030
dc.relation: WOS;000551199500003
dc.relation: SCOPUS;2-s2.0-85085131929
dc.relation.uri: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85085131929&doi=10.3390%2fCOMPUTATION8020030&partnerID=40&md5=be87f14dd269657e1b1e6a9be41b0e6c
dc.rights: MDPI Multidisciplinary Digital Publishing Institute
dc.source: Computation
dc.subject: Content analysis (eng)
dc.subject: Educational contents (eng)
dc.subject: Feature extraction (eng)
dc.subject: Information retrieval (eng)
dc.subject: Recommendation system (eng)
dc.subject: Semantic representation (eng)
dc.title: Comparison and evaluation of different methods for the feature extraction from educational contents (eng)
dc.type: info:eu-repo/semantics/article (eng)
dc.type: article (eng)
dc.type: info:eu-repo/semantics/publishedVersion (eng)
dc.type: publishedVersion (eng)
dc.type.local: Artículo (spa)

Files

Original bundle
Name: computation8020030.pdf
Size: 282.08 KB
Format: Adobe Portable Document Format