Re: [MonetDB-users] Open source Column based DB ... for timeseries

Yue,
> I really like "ff", haven't used "bit" yet. I use ff together with biglm for large-scale regression.
Thanks.
> I need a database to store all my data, in the billions of rows, so that I can keep analysis within R and data retrieval in the database.
Under 32 bit, ff limits vector length to integer.max (2^31 - 1, roughly 2 billion elements). We have not compiled ff under 64 bit so far; you might want to try this on a machine with plenty of RAM and a fast RAID.
> Do you think "ff" is just as fast as MonetDB when it comes to dealing with billions of rows, drawn from different "ff" files (the equivalent of tables)?
1) ff does not have tables (or data.frames) with columns of different types in the current version. This will happen during this year. Of course you can just work with a bunch of vectors.

2) Speed is a stubborn beast which behaves very differently depending on what you are actually doing. Take your example, "give me the total traded volume of IBM between 10:05am and 10:09am". If we assume that your tick data comes in at a fixed time rhythm and you are looking at historical data, then it should be possible to CALCULATE the positions of the elements falling in your time window. In that case you would get away with reading just those elements from disk: no reading of an index, no scanning of a complete column. In ff you can access vector elements just by their position. Hard to believe that any DB access could be faster than that. If, by contrast, you need to scan many columns to decide which records you want, a column-based or even row-based DB with indexes might be faster. Details matter.

Let me know in case you try compiling ff under 64 bit.

Jens
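
A minimal sketch of the position calculation described in point 2, written in Python rather than R and not using ff itself. Everything specific here is an assumption made for illustration: one tick per second starting at 09:30, volumes stored as a flat file of little-endian 32-bit integers, a half-open window [10:05, 10:09), and the hypothetical names tick_window, total_volume and ibm_volume.bin. It only shows the idea of seeking straight to the computed positions instead of scanning or consulting an index.

```python
import math
import os
import struct
import tempfile
from datetime import datetime, timedelta

# Assumed on-disk layout: one little-endian 32-bit integer per tick.
ITEM = struct.Struct("<i")


def tick_window(series_start, window_start, window_end, tick):
    """Return (first, count) for ticks t with window_start <= t < window_end.

    Tick k is assumed to occur at series_start + k * tick.
    """
    first = max(0, math.ceil((window_start - series_start) / tick))
    stop = max(first, math.ceil((window_end - series_start) / tick))
    return first, stop - first


def total_volume(path, first, count):
    """Sum `count` values starting at element `first`, reading only those bytes."""
    with open(path, "rb") as f:
        f.seek(first * ITEM.size)          # jump directly to the first element
        data = f.read(count * ITEM.size)   # read just the window
    return sum(v for (v,) in ITEM.iter_unpack(data))


# Demo data (made up): 09:30 to 16:00 at one tick per second = 23400 ticks.
series_start = datetime(2009, 1, 5, 9, 30)
tick = timedelta(seconds=1)
volumes = [(i % 7) * 100 for i in range(23400)]

path = os.path.join(tempfile.mkdtemp(), "ibm_volume.bin")
with open(path, "wb") as f:
    f.write(b"".join(ITEM.pack(v) for v in volumes))

first, count = tick_window(
    series_start,
    datetime(2009, 1, 5, 10, 5),
    datetime(2009, 1, 5, 10, 9),
    tick,
)
total = total_volume(path, first, count)
print(first, count, total)
```

The work done is proportional to the size of the window, not to the length of the series. If ticks arrive irregularly, the positions can no longer be computed from the timestamps alone and some form of search or index is needed, which is the case where a database may win.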
participants (1)
-
Jens Oehlschlägel