
On 04/07/2013 15:12, Stefan Manegold wrote:
> Please check what the monetdbd / merovingian log file says.

Hmm... /var/log/monetdb is empty, and I can't locate any monet* log file.
The table size on disk for PostgreSQL is 37 GB. Using VectorWise, the table loads and takes only 7 GB on disk, but the loading process (very similar to what we do with MonetDB) takes a very long time (25 hours), much longer than with the 109-column table.
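For reference, this is roughly the kind of bulk load we run on the MonetDB side; a minimal sketch, with a placeholder file path and record count (the real values differ):

    -- Sketch of the bulk load; '/tmp/dwhcol.csv' and the record count
    -- are placeholders, not our actual values.
    -- Pre-declaring the number of records lets MonetDB preallocate space.
    COPY 1000000 RECORDS INTO dwhcol
    FROM '/tmp/dwhcol.csv'
    USING DELIMITERS ',', '\n', '"'
    NULL AS '';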
I am trying to make sense of that and understand why this table is giving MonetDB (and somehow VectorWise as well) such a (very) hard time. Does anyone have an explanation?

> Can you share the schemas (column types) of both tables?

Yes, sure.
The first table is called dwhinv and has:
- 89 columns of type varchar(32)
- 2 timestamps (not null)
- 2 decimals (at least one of them is set on each row)
- 19 integers

The second table is called dwhcol and has:
- 83 columns of type varchar(32)
- 2 timestamps
- 420 columns of type decimal

(A condensed DDL sketch of both tables follows below.)
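For illustration only; the column names and decimal precisions here are made up, and the repeated columns are elided in comments:

    -- Hypothetical, condensed schema sketch; real names and precisions differ.
    CREATE TABLE dwhinv (
        v001 VARCHAR(32),         -- ... 89 varchar(32) columns in total
        ts1  TIMESTAMP NOT NULL,
        ts2  TIMESTAMP NOT NULL,
        d1   DECIMAL(18,3),       -- 2 decimal columns; at least one set per row
        d2   DECIMAL(18,3),
        i001 INTEGER              -- ... 19 integer columns in total
    );

    CREATE TABLE dwhcol (
        v001 VARCHAR(32),         -- ... 83 varchar(32) columns in total
        ts1  TIMESTAMP,
        ts2  TIMESTAMP,
        d001 DECIMAL(18,3)        -- ... 420 decimal columns in total
    );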
> For each type used, the number of columns with that type might be sufficient.
> If there are no string columns at all, that would be sufficient info for us
> to estimate the expected storage size.
> Other than duplicate elimination (dictionary encoding) for string columns,
> MonetDB does not yet use compressed storage. We do have good experiences
> with compressed file systems, though.

What about NULL values? I was under the impression that NULL values were not stored in MonetDB (just non-existent entries in the BATs). Because each decimal column has at most 1 million non-NULL values, and many have only a few thousand... Doesn't that matter?
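In case it is useful for checking this, I believe MonetDB exposes the actual per-column footprint through the sys.storage view, so a query along these lines should show what the decimal columns really occupy (exact field names of the view may vary by release; this is an untested sketch):

    -- Hedged sketch: list per-column storage for dwhcol, biggest first.
    -- sys.storage is MonetDB's storage inspection view; field names may
    -- differ slightly between releases.
    SELECT "column", type, count, columnsize, heapsize
    FROM sys.storage
    WHERE "table" = 'dwhcol'
    ORDER BY columnsize DESC;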
> Do you have constraints and/or keys defined on your tables prior to bulk loading?

No, there are no foreign keys, nor constraints of any sort.
Franck