Re: [MonetDB-users] huge files and xrpc

Hello Francois,

Since the error you get also happens without XRPC, it is very possible that there is a bug in the query execution, so I have CC-ed this e-mail to the monetdb-users mailing list, so that more people can look at this problem. My answers / further questions are inlined below.

On Mar 27, 2009, at 16:07, francois guérin wrote:
On 26 March 2009 at 11:36, Ying Zhang <Y.Zhang@cwi.nl> wrote:
Hello Francois,
I need some more information to be able to tell what's wrong here.
- which version of MonetDB/XQuery are you using? How did you install it (from CVS, using the super tarball, the Windows installer, an RPM package)?
the 18 March nightly build, installed from the super tarball
- on which OS? CPU? RAM?
64-bit Linux, kernel 2.6.24, Mandriva distribution, Intel(R) Pentium(R) 4 CPU 3.40GHz, 2 GB RAM
- how did you insert your document? Using pf:add-doc() via e.g. mclient, or by making an XRPC call?
I used pf:add-doc()
- have you checked that the insertion of your document has succeeded or not?
It has. The document appears when I call pf:documents() (but I can't view the whole file, because it's too big).
That should be sufficient. To be more certain, maybe try something like count(doc("yourdoc.xml")//*), but this could take quite some time with your document :)
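For example, saved in a file check.xq and run with "mclient -lx check.xq" (just a sketch; adjust the document name to the one shown by pf:documents()):

    (: count all element nodes, which forces a full traversal of the stored document :)
    count(doc("wikipedia")//*)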
- have you inserted the document as a read-only document or updatable document?
as an updatable document
- what is the exact xrpc request you have sent?
get_collocations_PASSAGE("wikipedia","section")
- which function are you trying to call (please send me the xquery code)?
declare function foo:get_collocations_PASSAGE($corpus as xs:string, $word as xs:string) {
  for $grp in doc($corpus)/DOCUMENT/Sentence/G
  for $w in $grp/W/@form
  return if (fn:contains($w, $word))
         then foo:groupe_text_PASSAGE($grp)
         else ()
};
Would you please also send us the code of the function foo:groupe_text_PASSAGE() (and the code of any other user-defined functions called by this function, if any)?
(that code worked with a corpus much smaller than the one called "wikipedia")
What is the size of your smaller document? Would it be possible for you to find out the size of the largest document on which your query still runs?
- can you execute the same function without xrpc?
With an .xq file executed by "mclient -lx", I got the same error.
my .xq file:

    for $grp in doc("wikipedia")/DOCUMENT/Sentence/G
    for $w in $grp/W/@form
    return if (fn:contains($w, "section")) then $grp else ()
So the error seems unrelated to the function foo:groupe_text_PASSAGE(). Have you executed any updating function before executing this function/code? The functions you mentioned in your e-mail are all read-only functions.

Could you please also try to add your (4.6GB) document as a read-only document, and execute the functions with and without XRPC to see what happens? If your functions execute without any error on a read-only document, the problem is most likely in the update-related code (updatable documents are stored differently than read-only documents).

To reproduce and debug your problem, we would need your document. Would it be possible to put the (zipped) document in a place where we could download it?
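A rough sketch of such a test is below. The file path and the name "wikipedia_ro" are only examples, and the exact pf:add-doc() options that control read-only vs. updatable storage depend on your MonetDB/XQuery version, so please check the documentation of your installation:

    (: step 1: add the same file again under another name; pass whatever
       option your version uses to store it as a read-only document :)
    pf:add-doc("file:///path/to/wikipedia.xml", "wikipedia_ro")

    (: step 2: in a separate query, run the same read-only code on the copy :)
    for $grp in doc("wikipedia_ro")/DOCUMENT/Sentence/G
    for $w in $grp/W/@form
    return if (fn:contains($w, "section")) then $grp else ()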
- after a document has been stored in the database, is the URL (the one appearing in pf:documents()) still needed? Or may I rename my files on my computer, or even delete them once they've been stored?
You can rename/remove the original documents once they have been added to the DB. As far as I know, the URLs displayed by pf:documents() are merely administrative information. However, you might want to keep your original document as a backup.

Kind regards,
Jennie
Thanks,
francois.
Kind regards,
Jennie
On Mar 25, 2009, at 11:54, francois guérin wrote:
Hello,
(it's francois again)
I've inserted a huge file (4.8 GB) into the MonetDB XML database - the whole process went fine - but when I send an XRPC request, I always get the error message below (and I don't know what it means :s):
<?xml version="1.0" encoding="utf-8"?>
<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">
  <env:Body>
    <env:Fault>
      <env:Code>
        <env:Value>env:Sender</env:Value>
      </env:Code>
      <env:Reason>
        <env:Text xml:lang="en">Error occurred during execution.
!ERROR: [remap]: 2 times inserted nil due to errors at tuples 3@0, 5@0.
!ERROR: [remap]: first error was:
!ERROR: CMDremap: operation failed.
!ERROR: interpret_unpin: [remap] bat=559,stamp=-664 OVERWRITTEN
!ERROR: BBPdecref: 1000000006_rid_nid does not have pointer fixes.
!ERROR: interpret_params: leftfetchjoin(param 2): evaluation error.
        </env:Text>
      </env:Reason>
    </env:Fault>
  </env:Body>
</env:Envelope>
Where does it come from? Should I perhaps make several small files instead of one big one (even though the documentation says to prefer one big file over several small ones)? My application has to store huge XML data (either in one big file or in several small ones) with fast storage and query times. What would you advise?
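(If I split the corpus, I suppose my query would just have to loop over the parts, something like the sketch below, with made-up document names:)

    (: same query, but over several smaller documents instead of one big one :)
    for $part in ("wikipedia_part1", "wikipedia_part2", "wikipedia_part3")
    for $grp in doc($part)/DOCUMENT/Sentence/G
    for $w in $grp/W/@form
    return if (fn:contains($w, "section")) then $grp else ()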
Regards,
-- francois.