Hello all-
First of all thanks to you guys for your great help these past few
weeks. I really appreciate it. I've learned a bit, and hopefully
your product is improving from my bug reports.
I've managed to pull a bit of my data from my proprietary db into
monetdb. (Have some 60 million rows in the fact table, pulled in 5
million)! I even ran a query against it and got the same numbers
(though in a different order)!
So I did a very basic poor mans benchmark. My python code can hit the
old db or monetdb by changing one line. So I took timestamps around
the query execution portion.
The old db takes 0.21 seconds when cold. Subsequent queries take
0.027-0.072 seconds
Monetdb takes 0.67 seconds when cold and 0.12-0.23 otherwise.
It appears that the old db is faster when cold than monetdb is after
being primed.
A few more caveats. (Yes, this is a poor benchmark but after reading
pdfs saying that monetdb was 7 times faster than a 64 way oracle box,
I thought I'd get quite a bit of speedup.) The old db is running
under vmware on a Core2 1.86Ghz box. Monetdb is running on my laptop,
a Pentium M 1.86 Ghz machine.
Since the query is confidential I can't really put it here, but I can
describe it. I'm pulling stuff out of my fact table and two other
dimension tables. The select clause sums a few columns, and on one
sums the product of one column and the sum of a few others. The
coalesce function is used to convert NULLs to 0. The where statement
joins the tables using the = operator and then has two "IN" clauses.
Then there is a group by (since I'm aggregating in the select) and an
order by.
Any hints on how to increase performance? Do I need to tune my query
or tweak other things? (Yes I know the laptop machine is not apples
to apples to the vmware machine)
Am open to suggestions.
thanks much,
-matt