extreme performance breakdown after update

Hello List, I'm running monetdb on a Debian server and the software is managed via Debian packages. A few days ago I installed an update for the monetdb5-server package. After that I noticed a problem with queries with the function lower() in the WHERE clause:
WHERE LOWER(“fieldname”) LIKE '%lower_string%'<<
without any change to the data, the requests became much, much slower then before (from some hundred microseconds to some minutes). Not only in my application but also in the monetdb client... I could solve the problem via a switch from using LIKE to ILIKE and no lower()-function, but I'm curious what could be the cause for this dramatic performance breakdown? I searched the release history but I found nothing... Has anybody experienced the same effect? Anny ideas on this phenomenon? Greetings from Germany, Andreas (theafh)

Hi, You can check the merovingian.log file for WARNINGs In the development branch 'default' we had a temporary bug that failed to find some bulk-operations. It has been fixed already. Perhaps you hit this jackpot. regards, Martin On 24/02/16 21:38, Andreas F. Hoffmann wrote:

@Martin: The issue you mention where never release; hence, I do not think they play a role, here. @Andreas: Most important question: Which version of MonetDB did work fine before, and which does no longer after the upgrade? For the new version, you could profile your quer(y|ies) by prefixing them with TRACE --- feel free to share the results for analysis --- unfortunately, you cannot (easily) downgrade to the older version to also profile that one. Would you be to share all ingredients to reproduce the problem? Also, do I understand correctly that you tested the alternative omitting LOWER() and using ILIKE, instead, and that works fine (also) with the new version of MonetDB? If so, you might also want to profile (TRACE) that one to compare the trace with that of the "slow" query. Best, Stefan ----- On Feb 24, 2016, at 9:43 PM, Martin martin.kersten@cwi.nl wrote:
-- | Stefan.Manegold@CWI.nl | DB Architectures (DA) | | www.CWI.nl/~manegold/ | Science Park 123 (L321) | | +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |

Hi Andreas, from your traces, I can read the following: the query with lower() + like is (significantly) more expensive than the query with ilike, which is in turn (slightly) more expensive than the query with like: $ grep ' := user.s' lower.log ilike.log simple.log lower.log:[ 84857647, "X_4=0@0:void := user.s4_2(\"%qcrawl%\":str);" ] ilike.log:[ 394212, "X_4=0@0:void := user.s3_2(\"%qcrawl%\":str);" ] simple.log:[ 131674, "X_4=0@0:void := user.s2_2(\"%QCrawl%\":str);" ] (times are in micro-seconds) For the lower() + like, almost all of the time is spent in the lower() function: $ echo ; for i in lower.log ilike.log simple.log ; do echo $i ; grep ' := ' $i | egrep -v ' := (language.dataflow|user.s[0-9])' | sort -nrk2 | head ; echo ; done lower.log [ 75923265, "X_311=<tmp_3453>[675067]:bat[:oid,:str]{horigin=6} := batstr.toLower(X_301=<tmp_3362>[675067]:bat[:oid,:str]);" ] [ 71794523, "X_308=<tmp_3462>[675067]:bat[:oid,:str]{horigin=3} := batstr.toLower(X_295=<tmp_3554>[675067]:bat[:oid,:str]);" ] [ 70768563, "X_313=<tmp_1412>[675074]:bat[:oid,:str]{horigin=8} := batstr.toLower(X_305=<tmp_2760>[675074]:bat[:oid,:str]);" ] [ 69586703, "X_310=<tmp_2207>[675067]:bat[:oid,:str]{horigin=5} := batstr.toLower(X_299=<tmp_3357>[675067]:bat[:oid,:str]);" ] [ 67852676, "X_309=<tmp_3510>[675067]:bat[:oid,:str]{horigin=4} := batstr.toLower(X_297=<tmp_3316>[675067]:bat[:oid,:str]);" ] [ 57176388, "X_312=<tmp_3572>[675067]:bat[:oid,:str]{horigin=7} := batstr.toLower(X_303=<tmp_3560>[675067]:bat[:oid,:str]);" ] [ 27191014, "X_307=<tmp_2755>[675067]:bat[:oid,:str]{horigin=2} := batstr.toLower(X_293=<tmp_3604>[675067]:bat[:oid,:str]);" ] [ 16893811, "X_306=<tmp_1277>[675067]:bat[:oid,:str]{horigin=1} := batstr.toLower(X_291=<tmp_2406>[675067]:bat[:oid,:str]);" ] [ 58652, "X_319=<tmp_3175>[0]:bat[:oid,:oid]{torigin=5, horigin=5} := algebra.likesubselect(X_310=<tmp_2207>[675067]:bat[:oid,:str],X_119=<tmpr_2101>[675067]:bat[:oid,:oid],A0=\"%qcrawl%\":str,\"\":str,false:bit);" ] [ 56453, "X_321=<tmp_3514>[0]:bat[:oid,:oid]{torigin=7, horigin=7} := algebra.likesubselect(X_312=<tmp_3572>[675067]:bat[:oid,:str],X_123=<tmpr_2705>[675038]:bat[:oid,:oid],A0=\"%qcrawl%\":str,\"\":str,false:bit);" ] ilike.log [ 336293, "X_297=<tmp_3462>[0]:bat[:oid,:oid]{torigin=6, horigin=6} := algebra.ilikesubselect(X_133=<tmp_3505>[675067]:bat[:oid,:str],X_123=<tmpr_3444>[675067]:bat[:oid,:oid],A0=\"%qcrawl%\":str,\"\":str,false:bit);" ] [ 323465, "X_296=<tmp_2406>[0]:bat[:oid,:oid]{torigin=5, horigin=5} := algebra.ilikesubselect(X_132=<tmp_3410>[675067]:bat[:oid,:str],X_121=<tmpr_3254>[675067]:bat[:oid,:oid],A0=\"%qcrawl%\":str,\"\":str,false:bit);" ] [ 320678, "X_298=<tmp_3357>[0]:bat[:oid,:oid]{torigin=7, horigin=7} := algebra.ilikesubselect(X_134=<tmp_3204>[675067]:bat[:oid,:str],X_125=<tmpr_3260>[675038]:bat[:oid,:oid],A0=\"%qcrawl%\":str,\"\":str,false:bit);" ] [ 320629, "X_293=<tmp_3311>[0]:bat[:oid,:oid]{torigin=2, horigin=2} := algebra.ilikesubselect(X_129=<tmp_2207>[675067]:bat[:oid,:str],X_115=<tmpr_2506>[675067]:bat[:oid,:oid],A0=\"%qcrawl%\":str,\"\":str,false:bit);" ] [ 319128, "X_294=<tmp_3362>[0]:bat[:oid,:oid]{torigin=3, horigin=3} := algebra.ilikesubselect(X_130=<tmp_2651>[675067]:bat[:oid,:str],X_117=<tmpr_1352>[675067]:bat[:oid,:oid],A0=\"%qcrawl%\":str,\"\":str,false:bit);" ] [ 307071, "X_299=<tmp_3560>[0]:bat[:oid,:oid]{torigin=8, horigin=8} := algebra.ilikesubselect(X_135=<tmp_3514>[675074]:bat[:oid,:str],X_127=<tmpr_3164>[675073]:bat[:oid,:oid],A0=\"%qcrawl%\":str,\"\":str,false:bit);" ] [ 303231, "X_295=<tmp_3156>[0]:bat[:oid,:oid]{torigin=4, horigin=4} := algebra.ilikesubselect(X_131=<tmp_3602>[675067]:bat[:oid,:str],X_119=<tmpr_3007>[675067]:bat[:oid,:oid],A0=\"%qcrawl%\":str,\"\":str,false:bit);" ] [ 288524, "X_292=<tmp_1500>[1]:bat[:oid,:oid]{torigin=1, horigin=1} := algebra.ilikesubselect(X_128=<tmp_3337>[675067]:bat[:oid,:str],X_113=<tmpr_3306>[675067]:bat[:oid,:oid],A0=\"%qcrawl%\":str,\"\":str,false:bit);" ] [ 42323, "X_127=<tmpr_3164>[675073]:bat[:oid,:oid]{horigin=8, torigin=8} := sql.tid(X_3=0:int,\"sys\":str,\"agents\":str,7:int,8:int);" ] [ 40222, "X_125=<tmpr_3260>[675038]:bat[:oid,:oid]{horigin=7, torigin=7} := sql.tid(X_3=0:int,\"sys\":str,\"agents\":str,6:int,8:int);" ] simple.log [ 65030, "X_297=<tmp_3341>[0]:bat[:oid,:oid]{torigin=6, horigin=6} := algebra.likesubselect(X_133=<tmp_1751>[675067]:bat[:oid,:str],X_123=<tmpr_3473>[675067]:bat[:oid,:oid],A0=\"%QCrawl%\":str,\"\":str,false:bit);" ] [ 64647, "X_296=<tmp_3323>[0]:bat[:oid,:oid]{torigin=5, horigin=5} := algebra.likesubselect(X_132=<tmp_3311>[675067]:bat[:oid,:str],X_121=<tmpr_3175>[675067]:bat[:oid,:oid],A0=\"%QCrawl%\":str,\"\":str,false:bit);" ] [ 63896, "X_293=<tmp_3135>[0]:bat[:oid,:oid]{torigin=2, horigin=2} := algebra.likesubselect(X_129=<tmp_3006>[675067]:bat[:oid,:str],X_115=<tmpr_3552>[675067]:bat[:oid,:oid],A0=\"%QCrawl%\":str,\"\":str,false:bit);" ] [ 62461, "X_298=<tmp_3543>[0]:bat[:oid,:oid]{torigin=7, horigin=7} := algebra.likesubselect(X_134=<tmp_3534>[675067]:bat[:oid,:str],X_125=<tmpr_3475>[675038]:bat[:oid,:oid],A0=\"%QCrawl%\":str,\"\":str,false:bit);" ] [ 62363, "X_294=<tmp_3307>[0]:bat[:oid,:oid]{torigin=3, horigin=3} := algebra.likesubselect(X_130=<tmp_3324>[675067]:bat[:oid,:str],X_117=<tmpr_3434>[675067]:bat[:oid,:oid],A0=\"%QCrawl%\":str,\"\":str,false:bit);" ] [ 61810, "X_295=<tmp_3137>[0]:bat[:oid,:oid]{torigin=4, horigin=4} := algebra.likesubselect(X_131=<tmp_1177>[675067]:bat[:oid,:str],X_119=<tmpr_2305>[675067]:bat[:oid,:oid],A0=\"%QCrawl%\":str,\"\":str,false:bit);" ] [ 58262, "X_292=<tmp_3540>[1]:bat[:oid,:oid]{torigin=1, horigin=1} := algebra.likesubselect(X_128=<tmp_2506>[675067]:bat[:oid,:str],X_113=<tmpr_3570>[675067]:bat[:oid,:oid],A0=\"%QCrawl%\":str,\"\":str,false:bit);" ] [ 57441, "X_299=<tmp_3520>[0]:bat[:oid,:oid]{torigin=8, horigin=8} := algebra.likesubselect(X_135=<tmp_3410>[675074]:bat[:oid,:str],X_127=<tmpr_3401>[675073]:bat[:oid,:oid],A0=\"%QCrawl%\":str,\"\":str,false:bit);" ] [ 39817, "X_117=<tmpr_3434>[675067]:bat[:oid,:oid]{horigin=3, torigin=3} := sql.tid(X_3=0:int,\"sys\":str,\"agents\":str,2:int,8:int);" ] [ 39441, "X_125=<tmpr_3475>[675038]:bat[:oid,:oid]{horigin=7, torigin=7} := sql.tid(X_3=0:int,\"sys\":str,\"agents\":str,6:int,8:int);" ] The overall order of these three versions is to be expected. The simple like does least work. The with the ilike, the actual comparison is expected to be (slightly?) more expensive than with the simple like; basically, is also has to nomalize the case internally. The version with lower() + like --- basically re-implementing ilike "the crude way", is expected to be more expensive than the other two, mainly as it first needs to convert and materialize all data (in your case 5400543 strings of up to 100 characters each, i.e., roughly between 5.5 MB and 550 MB, or so) --- in memory, if it fits --- before it does the same like comparison as the simple version. Whether the difference is bound to be that big is had to tell without running it myself, or at least without knowing some more details about your data and machine, e.g., how much memory does the machine have, and how big (in MB) is your data (in particular the "ua" column); you might consider using the storage() function to check the latter: select * from storage(); Also, memory performance might be influenced by Linux kernel settings that might have changed with your latest Debian upgrade, we cannot exclude that the performance changes are not only due to the MonetDB upgrade. I cannot find any code chances in MonetDB's lower() implementation since the Oct2014 release that would trigger any performance degradation, let alone between the Jul2015-SP1 and Ju2015-SP2 releases. All I can find is a very recent change that is supposed to improve performance, but that is not included in any release, yet. cf., http://dev.monetdb.org/hg/MonetDB/rev/43fb721ed302 and https://www.monetdb.org/pipermail/developers-list/2016-January/004563.html I am sure you are facing the same problem. In fact a quick test on 20M strings of 2 to 147 chars (35 chars on average) long just revealed that a query similar to your lower() + like query takes 1.5 min with the Jul2015-SP2 release, but only 7 secs with the latest (not yet released) code base of the Jul2015 branch, i.e., after above fix. (all on my 16GB 8-core 3.4GHz machine) (We are currently planning to prepare the next feature release, which will also include the above fix, but that will take a few more months. I'm not sure, whether we'll manage to issue a Jul2015-SP3 bugfix release before that ...) The questions remains, why you pervious version of MonetDB showed so much better performance on the very same data and very same machine for the very same query, although I cannot find and lower() related code changes since the Oct2014 release. For this, it would really help us to know with which version of MonetDB your lower() + like was so much (> 100x ?) faster than it is now with Jul2015-SP2. Maybe your merovingian log(s) date back for enough? Best, Stefan ----- On Feb 28, 2016, at 1:32 PM, Andreas F. Hoffmann me@theafh.net wrote:
-- | Stefan.Manegold@CWI.nl | DB Architectures (DA) | | www.CWI.nl/~manegold/ | Science Park 123 (L321) | | +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |

In the next release, lower() (and upper()) should be significantly faster. The day before the release of SP2 I checked in a change that did that, but that change is not in SP2 since the release was built before I made the change. The changeset in question is 43fb721ed302. On 02/28/2016 04:47 PM, Stefan Manegold wrote:
-- Sjoerd Mullender

Hello Stefan, thank you for digging so deep in to that problem! :-) Yes I also got a new Linux kernel and some new version of system libraries with that update and it is possible that this might be the source of that problem... Luckily my merovingian.log might reveal something: before the update it states "Merovingian 1.7 (Jul2015-SP1) starting" and after the update it shows "Merovingian 1.7 (Jul2015-SP2)". Before that, also working with the lower() function in my SQL, it states "(Oct2014-SP4)"... Is that what you are locking for? Greetings from Berlin, Andreas On 28.02.2016 16:47, Stefan Manegold wrote:
participants (4)
-
Andreas F. Hoffmann
-
Martin Kersten
-
Sjoerd Mullender
-
Stefan Manegold