Repositorio Universidad del Cauca

Adaptación de un modelo de espacio vectorial de recuperación de información a textos escritos en Nasa Yuwe (Adaptation of a vector space information retrieval model to texts written in Nasa Yuwe)


dc.contributor.author Sierra Martínez, Luz Marina es
dc.date.accessioned 2019-11-05T14:55:37Z
dc.date.available 2019-11-05T14:55:37Z
dc.date.issued 2016-02-04
dc.identifier.uri http://repositorio.unicauca.edu.co:8080/xmlui/handle/123456789/1353
dc.description.abstract Nasa Yuwe is an official language of Colombia and is currently in danger of extinction. National and indigenous organizations are promoting strategies, including the use of information technologies, to support the visibility and use of the language through computational tools. This document describes the development and results of adapting a vector space model for information retrieval to texts written in Nasa Yuwe. First, a closed test collection of texts written in Nasa Yuwe was built, which involved: • Field work with Nasa teachers from several community shelters near the city of Popayán. • The compilation of 97 documents written in Nasa Yuwe. • The definition of 8 queries. • The recording of expert judgments on the relevance of the documents to each query. Second, a prototype information retrieval system for texts written in Nasa Yuwe was developed, which involved: • The adaptation of a Nasa Yuwe tokenizer based on the Lucene .NET standard tokenizer (version 2.9.4). • The definition of a stopword removal list applied to the documents and queries of the Nasa Yuwe test collection. • Performance evaluation of the prototype using traditional measures of the research area, such as the precision-recall curve. Although Nasa Yuwe is a language still in the process of being described, it was possible to adapt a tokenizer and define a stopword removal list for it, and thereby obtain a prototype information retrieval system for texts written in Nasa Yuwe. The performance evaluation showed that adapting the tokenizer is an important task for retrieval, with promising results relative to the baseline; the stopword removal list, by contrast, did not substantially improve the prototype's performance. en
dc.language.iso spa es
dc.publisher Universidad del Cauca es
dc.rights.uri https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject Tokenizer Nasa en
dc.subject Nasa Yuwe es
dc.subject Information retrieval for texts written in Nasa Yuwe en
dc.subject Stopwords removal list for Nasa Yuwe en
dc.subject Adapting of a tokenizer for Nasa Yuwe en
dc.title Adaptación de un modelo de espacio vectorial de recuperación de información a textos escritos en Nasa Yuwe es
dc.type Tesis maestría es
dc.rights.creativecommons https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.type.driver info:eu-repo/semantics/masterThesis
dc.type.coar http://purl.org/coar/resource_type/c_bdcc
dc.publisher.faculty Facultad de Ingeniería Electrónica y Telecomunicaciones es
dc.publisher.program Maestría en Ingeniería Telemática es
dc.rights.accessrights info:eu-repo/semantics/openAccess
dc.type.version info:eu-repo/semantics/publishedVersion
dc.coar.version http://purl.org/coar/version/c_970fb48d4fbd8a85
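
The pipeline summarized in the abstract (tokenizer, stopword removal list, vector space ranking, precision-recall evaluation) can be sketched roughly as follows. This is a minimal illustration in Python, not the thesis prototype, which was built on Lucene .NET. The tokenization rule (keeping an apostrophe inside a word as part of the token), the TF-IDF weighting, and all function names are assumptions made for this sketch; no real Nasa Yuwe vocabulary or stopword list is reproduced here.

```python
import math
import re
from collections import Counter

# Assumption for illustration only: a word may contain internal apostrophes,
# and such apostrophes are kept as part of the token instead of splitting it.
TOKEN_RE = re.compile(r"\w+(?:'\w+)*")


def tokenize(text, stopwords=frozenset()):
    """Lowercase the text, split it into tokens, and drop stopwords."""
    tokens = TOKEN_RE.findall(text.lower())
    return [t for t in tokens if t not in stopwords]


def tfidf_vectors(docs_tokens):
    """Build one sparse TF-IDF vector (a dict) per tokenized document."""
    n = len(docs_tokens)
    df = Counter(term for toks in docs_tokens for term in set(toks))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(toks).items()}
        for toks in docs_tokens
    ]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def search(query, docs, stopwords=frozenset()):
    """Return indices of documents with non-zero similarity, best first."""
    docs_tokens = [tokenize(d, stopwords) for d in docs]
    vectors, idf = tfidf_vectors(docs_tokens)
    q_counts = Counter(tokenize(query, stopwords))
    q_vec = {t: tf * idf.get(t, 0.0) for t, tf in q_counts.items()}
    scored = [(cosine(q_vec, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in scored if score > 0]


def precision_recall_points(ranking, relevant):
    """Return (recall, precision) at each rank where a relevant doc appears."""
    points = []
    hits = 0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))
    return points
```

With a test collection like the one described (documents, queries, and expert relevance judgments), `search` would be run once per query and `precision_recall_points` would compare each ranking against the judged relevant set; repeating this with and without the stopword list, or with a different tokenizer, gives the kind of baseline comparison the abstract reports.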



Except where otherwise noted, this item's license is described as https://creativecommons.org/licenses/by-nc-nd/4.0/
