[Monetdb-developers] RDF data management in MonetDB/SQL

Hi MonetDB developers,

I have a question about RDF data management in MonetDB/SQL. The comment of sql.rdfshred says "shredding an RDF data file from location results in 7 new tables (6 permutations of SPO and a mapping) ... We can then query with SQL queries the RDF triple store by quering tables gid_spo, gid_pso etc., ...".

In my opinion, if the spo table is considered the triples table, the other 5 permutation tables (sop, pso, pos, osp, ops), leaving aside the mapping table, can be viewed as indexes on the triples table spo.

When I write SQL to query the shredded RDF data, I have two options. The first is to use only the spo table and self-join it. The second is to join across all 6 tables.

I noticed that the "MonetDB/SQL Reference Manual" says: "The heart is the MonetDB server, which comes with the following innovative features. ... Index selection, creation and maintenance is automatic". If I explicitly use the 6 tables (as indexes) in my joins, it seems that I am writing the query plan myself. However, I think this work should be done by the SQL optimizer, using statistics from the system catalog.

I wonder whether these tables have already been declared as indexes in the internal code, or whether there is a way to declare them so that the optimizer can use them as indexes when generating query plans. I am not sure if my understanding is correct.

I would appreciate any help from the developers. Thank you in advance.

Best regards,
Xin Wang

Hi,

your understanding is in general correct. The availability of the six different indices on the triple table should be understood by the SQL/SPARQL query optimizer and used accordingly. However, the MonetDB/RDF module has not been announced yet, because it is not finished yet. You are using a piece of code that is experimental and unfinished.

As such, the only way to test Monet and its capabilities on RDF data is to write the SQL query manually, choosing for each join the permutation of the triple table whose column order fits that join. In the future, of course, this will be done by the optimizer, and the user will only have to write simple SPARQL queries that refer to the name of the graph instead of the underlying storage schema. Until then, I am afraid you will have to do it by hand.

The good news is that if you are using RDF data and testing queries that have previously been published in papers as benchmarks, most likely someone else has already done the translation to a correct SQL query (since most experiments on newly built engines that do not support SPARQL use this method).

Hope this helps you a bit,
lefteris

In reply to wangx <wangx@tju.edu.cn>, 2010/7/15.
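
To make the two options from the question concrete, here is a minimal sketch of the same two-pattern query written once as a self-join on spo and once with a hand-picked permutation per pattern. It uses Python's sqlite3 module only as a runnable stand-in for an SQL engine. The table names (g_spo, g_pos, g_pso, g_map), the column names, and the sample triples are all assumptions made for illustration; the tables that sql.rdfshred actually creates may be named and laid out differently.

```python
import sqlite3

# Hypothetical schema and data: the real tables produced by sql.rdfshred
# may use different names and columns. Terms are stored as integer ids
# and resolved back to strings through a mapping table.
terms = ["alice", "bob", "acme", "type", "Person", "worksFor"]
ids = {term: i for i, term in enumerate(terms)}
triples = [
    ("alice", "type", "Person"),
    ("bob", "type", "Person"),
    ("alice", "worksFor", "acme"),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE g_map (id INTEGER PRIMARY KEY, term TEXT)")
con.executemany("INSERT INTO g_map VALUES (?, ?)",
                [(i, term) for term, i in ids.items()])
con.execute("CREATE TABLE g_spo (subject INT, predicate INT, object INT)")
con.executemany("INSERT INTO g_spo VALUES (?, ?, ?)",
                [(ids[s], ids[p], ids[o]) for s, p, o in triples])

# Two of the six permutations, built here as sorted copies of g_spo.
con.execute("CREATE TABLE g_pos AS SELECT predicate, object, subject "
            "FROM g_spo ORDER BY predicate, object, subject")
con.execute("CREATE TABLE g_pso AS SELECT predicate, subject, object "
            "FROM g_spo ORDER BY predicate, subject, object")


def persons_with_employer(type_table, works_table):
    """Answer { ?x type Person . ?x worksFor ?y } using the given tables."""
    query = f"""
        SELECT m.term
        FROM {type_table} AS a
        JOIN {works_table} AS b ON a.subject = b.subject
        JOIN g_map AS m ON m.id = a.subject
        WHERE a.predicate = ? AND a.object = ? AND b.predicate = ?
        ORDER BY m.term
    """
    args = (ids["type"], ids["Person"], ids["worksFor"])
    return [row[0] for row in con.execute(query, args)]


# Option 1: self-join on the spo table only.
self_join = persons_with_employer("g_spo", "g_spo")

# Option 2: pick a permutation per pattern by hand. The first pattern
# binds predicate and object, so pos fits; the second binds only the
# predicate and joins on the subject, so pso fits.
permuted = persons_with_employer("g_pos", "g_pso")

print(self_join, permuted)
```

Both variants return the same rows; the only difference is which physical ordering each pattern reads. The sketch says nothing about how MonetDB would execute either form, it only illustrates what choosing the permutation by hand looks like in the query text.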
_______________________________________________
Monetdb-developers mailing list
Monetdb-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/monetdb-developers