From June 30, 2020, 11:45 to June 30, 2020, 13:15

2020 Colloquium Series: "A Worst-case Optimal Join Algorithm for SPARQL"

Posted by Katherine Quezada

The Department of Informatics of the Universidad Técnica Federico Santa María is pleased to invite the university community to its colloquium series. This talk will be held by videoconference on the Zoom platform on Tuesday, June 30, at 11:45.

Speaker

Adrián Soto, faculty member at Universidad Adolfo Ibáñez, PhD in Computer Science, PUC Chile

Mini Bio

Adrián Soto is an Assistant Professor at UAI and a software engineer. His main research interests are the Semantic Web, data management, DBMS implementation, and algorithms for efficient query processing. His work focuses on extending the SPARQL query language and optimizing the algorithms currently used in RDF engines.

Abstract

Worst-case optimal multiway join algorithms have recently gained a lot of attention in the database literature. These algorithms not only offer strong theoretical guarantees of efficiency, but have also been empirically demonstrated to significantly improve query runtimes for relational and graph databases. Despite these promising theoretical and practical results, the Semantic Web community has yet to adopt such techniques; to the best of our knowledge, no native RDF database currently supports such join algorithms. In this paper we demonstrate that this should change. We propose a novel procedure for evaluating SPARQL queries that adapts an existing worst-case optimal join algorithm, Leapfrog Triejoin, and we implement it in Apache Jena. We then present experiments over the Berlin and WatDiv SPARQL benchmarks, as well as a novel Wikidata-based benchmark that we propose, designed to provide insights into join performance for a more diverse set of basic graph patterns. Our results show that with this new join algorithm, Apache Jena often runs orders of magnitude faster than the base version and than two other SPARQL engines, Virtuoso and Blazegraph.
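
For readers unfamiliar with Leapfrog Triejoin, the following is a minimal sketch of its single-variable building block, often called a leapfrog join: intersecting several sorted lists by repeatedly seeking the cursor with the smallest key forward to the largest key seen so far. It is an illustration only, not the Apache Jena implementation discussed in the talk. The function name `leapfrog_intersect` and the use of plain Python lists of integers (standing in for sorted candidate bindings of one query variable) are assumptions made for this example.

```python
from bisect import bisect_left


def leapfrog_intersect(sorted_lists):
    """Intersect sorted, duplicate-free lists by leapfrogging cursors.

    Illustrative sketch: each list stands in for the sorted candidate
    values that one triple pattern allows for a shared variable.
    """
    if not sorted_lists or any(not lst for lst in sorted_lists):
        return []

    k = len(sorted_lists)
    pos = [0] * k  # one cursor per list

    # Visit the lists in order of their first key, so that the list
    # visited just before the current one always holds the largest key.
    order = sorted(range(k), key=lambda i: sorted_lists[i][0])
    hi = sorted_lists[order[-1]][0]  # largest key among the cursors

    out = []
    p = 0
    while True:
        i = order[p]
        lst = sorted_lists[i]
        key = lst[pos[i]]
        if key == hi:
            # Smallest key equals largest key: all cursors agree.
            out.append(key)
            pos[i] += 1
        else:
            # Seek this cursor to the first element >= hi.
            pos[i] = bisect_left(lst, hi, pos[i])
        if pos[i] >= len(lst):
            return out  # one list is exhausted, no further matches
        hi = lst[pos[i]]
        p = (p + 1) % k


if __name__ == "__main__":
    print(leapfrog_intersect([[1, 3, 4, 7, 9], [3, 4, 5, 9, 11], [0, 3, 9, 10]]))
```

The full algorithm applies this intersection one variable at a time over trie-structured indexes, descending a level for each variable that is bound. That part is omitted here.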