From June 1, 2021, 12:15 to June 1, 2021, 13:15

2021 Colloquium Series: "TopicVisExplorer: Supporting multi-corpora comparison through visual exploration of topic modeling"

Posted by Katherine Quezada

The Departamento de Informática of the Universidad Técnica Federico Santa María is pleased to invite the university community to its 2021 colloquium series. This talk will be streamed at https://tv.inf.utfsm.cl/coloquio on Tuesday, June 1, starting at 12:15. No prior registration is required: to take part, open the link on the day and at the time of the event (the link will be updated when the colloquium begins).


Speaker

Felipe González, Ingeniero Civil Informático (USM) and student in the Magíster en Ciencias de la Ingeniería Informática program.

Short Bio

Felipe González is a student in the Magíster en Ciencias de la Ingeniería Informática program. He was recently accepted as a PhD student in Computer Science at The University of British Columbia, Canada, and as a visiting scholar at the Max Planck Institute for Informatics, Saarbrücken, Germany. He previously earned the professional title of Ingeniero Civil Informático at USM, receiving the “Federico Santa María” distinction as the top graduate of his class. His interests include social computing, text mining, and information visualization.

Abstract

The constant increase in the volume of textual data has led to the development of various algorithms intended to aid in summarizing and understanding this type of data. A promising solution to this problem is topic modeling, a statistical approach for extracting themes from high volumes of data. People who directly interact with and interpret the output of topic modeling rely on visualization tools to obtain a better interpretation. These tools still have limitations: current visual representations are designed to support monolingual corpora comparison and keyword-based topic splitting operations, which may result in poor interpretations of topics. To address these limitations, we propose to develop TopicVisExplorer, a set of web-based interactive visualizations of topics estimated using Latent Dirichlet Allocation over short texts. Our visualization attempts to help users answer the following questions: (1) What is the meaning of each topic? (2) How prevalent is each topic? (3) How do the topics relate to each other? Additionally, it aims to help users visualize the evolution of topics over time and compare multiple topic modeling outputs. The two key innovations of this proposal seek to support the identification of similar topics between two corpora and model refinement: we propose a topic similarity metric that supports multi-corpora comparison and a new document-based topic splitting algorithm. To validate our proposal, we conducted a user study to evaluate whether the proposed functionalities of TopicVisExplorer allow for better interpretation of topics.

We look forward to seeing you!