Internals of intermediate results in MonetDB

Hi all,

Thanks for your hard work on this great open source project. I’m a student new to MonetDB, working on implementing some research ideas, and I want to understand the internals of how MonetDB chooses to materialize intermediate results and how it represents them.

As I understand it, table data and intermediate results in MonetDB are all stored as BATs, but I’m not sure whether there are any special optimizations on BATs beyond plain array(s) of atoms. Here are two simple examples to elaborate on my questions:

1. Given two tables, Items(id, order, price, tax) and Orders(id, discount):

Items
  id:    [I1, I2, I3, I4, I5, I6, I7, I8, I9, I10, I11...]
  order: [O3, O3, O3, O3, O3, O3, O3, O3, O3, O3, O4...]
  price: [10, 11, 20, 15, 110, 80, 90, 12, 13, 88, 30...]
  tax:   [0.1, 0.15, 0, 0.1, 0.1, 0.15, 0.11, 0.18, 0.10, 0.15, 0.20, ...]

Orders
  id:       [O1, O2, O3]
  discount: [0.8, 0.9, 0.95]

When we perform a join between these two tables on the field `order` (a one-to-many relationship expressed through the primary/foreign key, where O3 matches [I1, I2, …, I10]), will MonetDB duplicate `O3` 10 times in the output intermediate BATs of the join? Are there any optimizations currently in MonetDB to remove or reduce the duplication?

2. Given a table item(id, price, tax), when MonetDB performs the filter `price<10` on the table, will MonetDB actually copy all matched tuples into intermediate BATs? Or does it just keep a list of OIDs of the matched tuples as references?

Thanks!

Best,
Guodong

Hi,

On 21/09/2020 05:31, Guodong Jin wrote:
> Hi all,
> Thanks for your hard work on this great open source project. I’m a student newly working on MonetDB to implement some research ideas,

For a PhD, or a semester task?

> and I want to understand internals of how MonetDB chooses to materialize and how it represents intermediate results.
> As I understand, table data and intermediate results in MonetDB are all stored as BATs, but I’m not sure if there are some special optimizations on BATs, rather than just array(s) of atoms. Here are two simple examples to elaborate my questions a little bit:
> 1. Given two tables, Items(id, order, price, tax), and Orders(id, discount).
> Items [I1, I2, I3, I4, I5, I6, I7, I8, I9, I10, I11...] [O3, O3, O3, O3, O3, O3, O3, O3, O3, O3, O4...] [10, 11, 20, 15, 110, 80, 90, 12, 13, 88, 30...] [0.1, 0.15, 0, 0.1, 0.1, 0.15, 0.11, 0.18, 0.10, 0.15, 0.20, ...]
> Orders [O1, O2, O3] [0.8, 0.9, 0.95]
> When we perform a join between these two tables on the field `order` (one-to-many relationship represented by the primary-foreign key, O3 matches with [I1, I2, …, I10]), in the output intermediate BATs of the join, will MonetDB duplicate `O3` 10 times?

Have a look at the result of EXPLAIN SELECT * FROM Items JOIN Orders to view the MAL program and learn the intermediate types. The join result is a collection of OID pairs.

> Are there any optimizations currently in MonetDB to remove/reduce the duplication?
> 2. Given a table item(id, price, tax), when MonetDB performs the filter `price<10` on the table, will MonetDB actually copy all matched tuples into intermediate BATs? Or just keep a list of OIDs of matched tuples as references?

See the MAL plan: the result is a list of OIDs.

> Thanks!
> Best, Guodong
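
The two answers above (a join yields OID pairs, a selection yields a list of OIDs) can be illustrated with a small toy model. This is only a sketch in Python, not MonetDB code: the function names `select_lt`, `join_eq`, and `project` are hypothetical, a "column" is just a Python list, and an OID is simply a position in that list. The filter bound is 15 rather than 10 because none of the sample prices shown in the question is below 10.

```python
# Toy model of OID-based intermediates (an illustration, NOT MonetDB internals).
# A column is a plain list; an OID is the position of a value in that list.

items_order = ["O3"] * 10 + ["O4"]
items_price = [10, 11, 20, 15, 110, 80, 90, 12, 13, 88, 30]
orders_id = ["O1", "O2", "O3"]
orders_discount = [0.8, 0.9, 0.95]


def select_lt(column, bound):
    """Return the OIDs of values below `bound`; no values are copied."""
    return [oid for oid, value in enumerate(column) if value < bound]


def join_eq(left, right):
    """Equi-join returning two aligned OID lists: (left OIDs, right OIDs)."""
    # Build a value -> OIDs index on the right side (a simple hash join).
    index = {}
    for r_oid, value in enumerate(right):
        index.setdefault(value, []).append(r_oid)
    l_oids, r_oids = [], []
    for l_oid, value in enumerate(left):
        for r_oid in index.get(value, []):
            l_oids.append(l_oid)
            r_oids.append(r_oid)
    return l_oids, r_oids


def project(oids, column):
    """Fetch the actual values for a list of OIDs (late materialization)."""
    return [column[oid] for oid in oids]


# Question 2: the filter result is only a list of OIDs.
candidates = select_lt(items_price, 15)

# Question 1: the join result is a pair of aligned OID lists.
l_oids, r_oids = join_eq(items_order, orders_id)

# Values are duplicated only when a column is projected through the OIDs.
discounts = project(r_oids, orders_discount)

print(candidates)
print(l_oids, r_oids)
print(discounts)
```

In this model the OID of the matching Orders row is repeated once per matching Items row, while the `O3` value itself is copied only if the corresponding column is projected. Whether and when MonetDB does the equivalent for a given query is best checked in the EXPLAIN output, as suggested in the reply.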
Participants (2): Guodong Jin, Martin Kersten