Hi list,
I was just doing some basic queries on MonetDB5/SQL to see if it is
suitable for my application, I'm doing lots of aggregates on some
logfile-abstractions. Basically they all boil down to 'how many unique
visitors and total pageviews where there in period X-Y in section Z'.
I have this table:
pageviews (
timestamp timestamp not null,
clientip varchar(15) not null,
sectionid smallint not null,
itemid integer not null,
channelid smallint default 0
)
Currently it only contains data for last september, with about 2M
records/day, and 5.6M in total.
There are no additional indexes in this case.
When doing a query like this, monetdb very fast. Once the data is in the
memory cache, it returns (according to trace) in about half a second.
select count(*) from pageviews
where timestamp between '2007-09-21 00:00:00' and '2007-09-22 00:00:00';
The result is 1916813
This one is also pretty fast, taking about 1.7 second
select count(distinct clientip) from pageviews
where timestamp between '2007-09-21 00:00:00' and '2007-09-22 00:00:00';
The result is 165700
And the third which is pretty fast:
select channelid, count(*) from pageviews
where timestamp between '2007-09-21 00:00:00' and '2007-09-22 00:00:00'
group by channelid;
Here's the distribution, and its returned in about 0.6 second
[ 0, 538187 ]
[ 1, 1108478 ]
[ 4, 42867 ]
[ 3, 145565 ]
[ 2, 81716 ]
But when I combine those last two queries, the result isn't returned in
a reasonable amount of time, I waited for more than half an hour and it
still hadn't returned the results.
select channelid, count(distinct clientip), count(*)
from pageviews
where timestamp between '2007-09-21 00:00:00' and '2007-09-22 00:00:00'
group by channelid;
I'm not very good at reading your explain output yet, so I've attached
the resulting explain for that query.
Is there a way to speed up this type of query? It seems a bit odd that
it's taking more than half an hour (postgresql does it in about 20
seconds) while the other queries return much faster (postgresql does
them in about 14 seconds).
Best regards,
Arjen
function user.s5_1():void;
_1:bat[:oid,:timestamp]{rows=55914459:lng,bid=1968} := sql.bind("sys","pageviews","timestamp",0);
_10:bat[:oid,:timestamp]{rows=0:lng,bid=2323} := sql.bind("sys","pageviews","timestamp",1);
constraints.emptySet(_10);
_10:bat[:oid,:timestamp]{rows=0:lng,bid=2323} := nil;
_12:bat[:oid,:timestamp]{rows=0:lng,bid=2324} := sql.bind("sys","pageviews","timestamp",2);
constraints.emptySet(_12);
_12:bat[:oid,:timestamp]{rows=0:lng,bid=2324} := nil;
_17:bat[:oid,:sht]{notnil=true,rows=55914459:lng,bid=1969} := sql.bind("sys","pageviews","channelid",0);
_24:bat[:oid,:str]{rows=55914459:lng,bid=1961} := sql.bind("sys","pageviews","clientip",0);
_8{rows=27957230:lng} := algebra.uselect(_1,2007-09-21 00:00:00.000,2007-09-22 00:00:00.000,true,true);
_1:bat[:oid,:timestamp]{rows=55914459:lng,bid=1968} := nil;
_15{rows=27957230:lng} := algebra.markT(_8,0@0);
_8{rows=27957230:lng} := nil;
_16{rows=27957230:lng} := bat.reverse(_15);
_15{rows=27957230:lng} := nil;
_19{rows=27957230:lng} := algebra.join(_16,_17);
(ext40,grp38):= group.new(_19);
_19{rows=27957230:lng} := nil;
_22 := bat.mirror(ext40);
ext40 := nil;
_23 := algebra.joinPath(_22,_16,_17);
_17:bat[:oid,:sht]{notnil=true,rows=55914459:lng,bid=1969} := nil;
_26{rows=27957230:lng} := algebra.join(_16,_24);
_16{rows=27957230:lng} := nil;
_24:bat[:oid,:str]{rows=55914459:lng,bid=1961} := nil;
(_27,grp55):= group.new(grp38);
(_29,grp57):= group.derive(_27,grp55,_26);
_27 := nil;
grp55 := nil;
_31 := bat.mirror(_29);
_29 := nil;
_32 := algebra.semijoin(_26,_31);
_26{rows=27957230:lng} := nil;
_31 := nil;
_33:bat[:oid,:int] := aggr.count_no_nil(_32,grp38,_22);
_32 := nil;
_34:bat[:oid,:int] := aggr.count(grp38,grp38,_22);
grp38 := nil;
_22 := nil;
_35 := sql.resultSet(3,1,_23);
sql.rsColumn(_35,"sys.pageviews","channelid","smallint",16,0,_23);
_23 := nil;
sql.rsColumn(_35,"sys.pageviews","count_no_nil_clientip","int",15,0,_33);
_33:bat[:oid,:int] := nil;
sql.rsColumn(_35,"sys.pageviews","count_channelid","int",32,0,_34);
_34:bat[:oid,:int] := nil;
sql.exportResult(_35,"");
end s5_1;