
Hello Karanbir,

Well, that can't be the cause then. :) I will file the bug report regarding the BOM.

When I try to write the command exactly as you did (with both '-s' and a pipe), I get:

    $ echo 'CREATE TABLE aap (a int);' | mclient -lsql -dtest
    $ cat data.dat
    1
    2
    3
    4
    5
    $ N=4; head -n $N data.dat | mclient -lsql -dtest -s "copy $N records into aap from STDIN;"
    MAPI = monetdb@localhost:50000
    QUERY = copy 4 records into aap from STDIN;
    ERROR = !SQLException:sql:value ';' while parsing ';' from line 0 field 0 not inserted, expecting type int
    !SQLException:importTable:failed to import table

It seems that mclient is confused about what to read first (the statement or stdin), and perhaps that is a bug? I think an mclient guru might be able to answer this.

I also tried the following, and this seems to work fine:

    $ (N=4; echo "copy $N records into aap from STDIN;"; head -n $N data.dat) | mclient -lsql -dtest
    [ 4 ]
    $ mclient -lsql -dtest -s "select * from aap;"
    % sys.aap # table_name
    % a # name
    % int # type
    % 1 # length
    [ 1 ]
    [ 2 ]
    [ 3 ]
    [ 4 ]

So this could be another workaround? Have you tried this already?

Wouter

p.s. Otherwise, I guess another workaround would be to create a (temporary) pipe on your filesystem (but I'm not sure whether that works):

    $ mkfifo /tmp/workaroundpipe
    $ (cat mydata > /tmp/workaroundpipe) &
    $ mclient -lsql -s "copy x records into x from '/tmp/workaroundpipe';"

2009/5/5 Karanbir Singh <mail-lists@karan.org>:
Hi Wouter,
Wouter Alink wrote:
Hello Karanbir,
This sounds like a BOM (Byte Order Mark, http://unicode.org/faq/utf_bom.html#BOM) is not dealt with correctly.
That's interesting, and not something I'd considered at all. However:
If you try:
xxd /home/kbsingh/data/data/1000.utf8 | head
does it start with 'EF BB BF'?
    [kbsingh@koala ~]$ xxd /home/kbsingh/data/data/1000.utf8 | head
    0000000: 3664 6266 6339 6431 6635 3464 3137 3366  6dbfc9d1f54d173f
    0000010: 6130 3962 6664 6131 3965 3566 6335 3062  a09bfda19e5fc50b
So that does not seem to be the issue in this case.
A little experiment (on the head) reveals a bug in mclient (it does not correctly handle the optional BOM at the beginning of the input):
    $ cat selectWithBOM.py
    print "\xEF\xBB\xBFSELECT 1;"
    $ python selectWithBOM.py > queryWithBOM.sql
    $ xxd queryWithBOM.sql
    0000000: efbb bf53 454c 4543 5420 313b 0a          ...SELECT 1;.
    $ cat queryWithBOM.sql
    SELECT 1;
    $ echo "SELECT 1;" | mclient -lsql
    % . # table_name
    % single_value # name
    % tinyint # type
    % 1 # length
    [ 1 ]
    $ cat queryWithBOM.sql | mclient -lsql
    (Hangs)
I guess a bug should be filed.
Good call. Should I go ahead and do that using your test case here, or would you like to file the bug report yourself? The only reason I am hesitant to do this is that, while this issue does seem to exist, it's not one that my data suffers from here.
If your data starts with the BOM, a workaround would be to strip the first three bytes of your data (as the BOM is not very meaningful when using UTF-8).
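For reference, the BOM check and the three-byte strip could be sketched in shell roughly as follows. The function names has_bom and strip_bom are made up for this illustration and are not part of mclient or MonetDB; the sketch also assumes a head that supports -c, which is common but not strictly POSIX.

```shell
# has_bom FILE: succeed if FILE starts with the UTF-8 BOM bytes EF BB BF.
# (Hypothetical helper, not part of mclient.)
has_bom() {
    # od prints the first three bytes as hex; tr drops the spacing and newline
    [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# strip_bom FILE: write FILE to stdout, skipping a leading BOM if there is one.
strip_bom() {
    if has_bom "$1"; then
        # tail -c +4 starts output at the fourth byte, i.e. after the BOM
        tail -c +4 "$1"
    else
        cat "$1"
    fi
}

# Intended use (needs a running server, so it is left as a comment):
#   strip_bom queryWithBOM.sql | mclient -lsql
```

A file without a BOM passes through unchanged, so the helper can be applied without checking each input by hand first.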
I don't think that's the case here, so what workaround options are available? Essentially: I need to load about 600 to 700 GB worth of data that's going to be delivered to me in a .gz file. Expanding that to raw text is not something I'd like to consider unless that is the _only_ way to get the data loaded here.
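One possible shape for this, building on the approach of sending the statement and the data down the same pipe, is sketched below. It is untested against a real server and rests on assumptions: copy_stream is a hypothetical helper name, the table and record count are passed in by the caller, and the data is assumed to be one record per line.

```shell
# copy_stream TABLE N FILE.gz  (hypothetical helper, not part of MonetDB)
# Prints a COPY ... FROM STDIN statement followed by the first N lines of the
# decompressed file, so that statement and data arrive on one stream.
copy_stream() {
    table=$1
    n=$2
    gzfile=$3
    echo "copy $n records into $table from STDIN;"
    # gunzip -c decompresses to stdout; nothing is expanded onto disk
    gunzip -c "$gzfile" | head -n "$n"
}

# Intended use (needs a running server, so it is left as a comment):
#   copy_stream aap 4 data.dat.gz | mclient -lsql -dtest
```

The record count has to be known up front here; getting it with something like gunzip -c on the file piped into wc -l would mean decompressing the file twice, which may or may not be acceptable at this size.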
-- Karanbir Singh : http://www.karan.org/ : 2522219@icq