Re: [Monetdb-developers] [Monetdb-checkins] MonetDB/src/gdk gdk_utils.mx, MonetDB_1-22, 1.206, 1.206.2.1

Hi Stefan,

The pftijah bug was unrelated to vmalloc(), a relationship that I never assumed. The vmalloc() troubles occurred, as you may recall, during the data preparation of the sorted 100GB TPC-H.

As it is entirely foreseeable that we will at some point again encounter an OS with a malloc implementation that is prone to fragmentation, I'd keep the vmalloc() code in for the moment (though in disabled state). Linux and Windows do not need it.

So, my advice is to leave vmalloc() disabled for the moment, until the issue is properly investigated and a bugfix is found. Let's call it option (0).

Peter

Dear all,

since Roberto's bug appears to be fixed by Peter's ".priv" (vm_minsize related) fixes (Thanks!), and disabling of vmalloc() (mem_bigsize related) does not appear to have any impact on this bug (just tested successfully both with vmalloc() disabled and vmalloc() enabled), there are two ways to finish this release (more or less) quickly:

(1) finish the disabling of vmalloc() in all parts of the code as indicated below (or better: remove all remains completely) --- this might need some extra testing to ensure that it does not have any (significant) impact on performance ...

(2) for now (i.e., in the release branch) re-enable vmalloc(), and then finish the removal of it (incl. extensive testing) in the development trunk for the next release.

Since I'm not completely aware of the impact of vmalloc(), I cannot really make this decision --- the "safe" back-up would IMHO be (2).

Any other comments, opinions, suggestions, expertise?

Thanks in advance!

Stefan

On Sun, Feb 10, 2008 at 01:33:04PM +0100, Stefan Manegold wrote:
-- | Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl | | CWI, P.O.Box 94079 | http://www.cwi.nl/~manegold/ | | 1090 GB Amsterdam | Tel.: +31 (20) 592-4212 | | The Netherlands | Fax : +31 (20) 592-4312 |

Dear all, On Mon, Feb 11, 2008 at 02:54:59PM +0100, Peter Boncz wrote:
Indeed, that was (only) my mis-interpretation. Sorry.
Ok.

Just to sum up the current state. Please check whether this is OK and consistent:

a) Documentation in MonetDB4/conf/MonetDB.conf.in & MonetDB5/conf/monetdb5.conf.in says:
"
    # gdk_mem_bigsize & gdk_vm_minsize will be set/limited to
    # 1/2 of the physically available amount of main-memory
    # during start-up in src/tools/Mserver.mx

    # memory chunks of size >= gdk_mem_bigsize (in bytes) will be mmaped anonymously
    #gdk_mem_bigsize=262144

    # memory chunks of size >= gdk_vm_minsize (in bytes) will be mmaped;
    #gdk_vm_minsize=137438953472
"

b) Documentation in ./monetweb/Docs/XQuery/MonetDB.conf.texi says
"
    @item @code{gdk_mem_bigsize}:
    minimum size for memory-mapped columns, e.g. @code{262144}.
    @item @code{gdk_vm_minsize}:
    column size above which memory mapped files are used always, e.g. @code{1749291171}
"

c) MonetDB/src/common/monet_options.mx sets
"
    set[i].kind = opt_builtin;
    set[i].name = strdup("gdk_mem_bigsize");
    set[i].value = strdup("262144");
"

d) MonetDB/src/common/monet_options.py.in sets
"
    # gdk_mem_bigsize & gdk_vm_minsize will be set/limited to
    # 1/2 of the physically available amount of main-memory
    # during start-up in src/tools/Mserver.mx
    gdk_mem_bigsize = '256K'
    gdk_vm_minsize = '128G'
"

e) GDKinit() in MonetDB/src/gdk/gdk_utils.mx sets
"
    if ((p = GDKgetenv("gdk_mem_bigsize"))) {
        /* when allocating >6% of all RAM; do so using vmalloc() iso malloc() */
        lng max_mem_bigsize = GDK_mem_maxsize/16;

        /* sanity check to avoid memory fragmentation */
        GDK_mem_bigsize = (size_t) MIN(max_mem_bigsize, strtol(p, NULL, 10));
    }
"

f) GDKmallocmax() in MonetDB/src/gdk/gdk_utils.mx ignores GDK_mem_bigsize

g) HEAPalloc() in MonetDB/src/gdk/gdk_heap.mx uses GDK_mem_bigsize as follows:
"
    /* when using anonymous vm we malloc we need 64K chunks, also we
     * 20% extra malloc */
    if (h->size > GDK_mem_bigsize) {
        h->maxsize = (size_t) ((double) h->maxsize * BATMARGIN) - 1;
        h->maxsize = (1 + (h->maxsize >> 16)) << 16;
    }
"

h) HEAPextend() in MonetDB/src/gdk/gdk_heap.mx uses GDK_mem_bigsize as follows:
"
    /* extend a malloced heap, possibly switching over to file-mapped storage */
    Heap bak = *h;
    int can_mmap = (h->filename && size >= GDK_mem_bigsize);
    int must_mmap = can_mmap && (size >= GDK_vm_minsize || (h->newstorage != STORE_MEM));
    [...]
    if (can_mmap) {
        /* in anonymous vm, if have to realloc anyway, we reserve some extra space */
        if (size > h->maxsize) {
            h->maxsize = (size_t) ((double) size * BATMARGIN);
        }
        /* when using anonymous vm we malloc we need 64K chunks */
        h->maxsize = (1 + ((h->maxsize - 1) >> 16)) << 16;
    [...]
    /* too big: convert it to a disk-based temporary heap */
    if (can_mmap) {
        char privext[PATHLENGTH], *of = h->filename;
        FILE *fp;

        h->filename = NULL;
        sprintf(privext, "%s.priv", ext);
        fp = GDKfilelocate(nme, "wb", privext);
        if (fp != NULL) {
            fclose(fp);
            /* a non-persistent heap: we create a .priv but *not* MMAP_PRIV !!! */
            h->storage = STORE_MMAP;
            h->base = NULL;
            if (HEAPload(h, nme, privext, FALSE) >= 0) {
                memcpy(h->base, bak.base, bak.free);
                HEAPfree(&bak);
                return 0;
            }
        }
        GDKfree(of);
    }
"

i) MIL offers mem_bigsize() & mem_bigsize(lng) to get / set mem_bigsize in MonetDB4/src/modules/plain/sys.mx

j) both MIL & MAL offer commands to set mem_maxsize and vm_minsize; however, only the MIL variants (still) relate mem_maxsize & vm_minsize to mem_bigsize, while the MAL variants no longer do so:

MonetDB4/src/modules/plain/sys.mx:
"
    int
    set_mem_maxsize(lng *num)
    {
        @:num2sze(mem_maxsize)@

        if (sze < GDK_mem_bigsize)
            set_mem_bigsize(num);
        GDK_mem_maxsize = MAX(GDK_mem_bigsize, sze);
        return GDK_SUCCEED;
    }

    int
    set_vm_minsize(lng *num)
    {
        @:num2sze(vm_minsize)@

        if (sze < GDK_mem_bigsize)
            set_mem_bigsize(num);
        GDK_vm_minsize = MAX(GDK_mem_bigsize, sze);
        return GDK_SUCCEED;
    }
"

MonetDB5/src/modules/kernel/status.mx
"
    int
    set_mem_maxsize(lng *num)
    {
        @:num2sze(mem_maxsize)@

    #if 0
        if (sze < GDK_mem_bigsize)
            set_mem_bigsize(num);
        GDK_mem_maxsize = MAX(GDK_mem_bigsize, sze);
    #endif
        GDK_mem_maxsize = sze;
        return GDK_SUCCEED;
    }

    int
    set_vm_minsize(lng *num)
    {
        @:num2sze(vm_minsize)@

    #if 0
        if (sze < GDK_mem_bigsize)
            set_mem_bigsize(num);
        GDK_vm_minsize = MAX(GDK_mem_bigsize, sze);
    #endif
        GDK_vm_minsize = sze;
        return GDK_SUCCEED;
    }
"

If all this is considered OK and consistent, we're done. Otherwise, we need to do whatever is required to get the code and documentation consistent, preferably before the release.

Stefan
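
For anyone who wants to check items (g) and (h) by hand, here is a minimal C sketch of the two pieces of logic quoted above: the rounding of maxsize to 64K chunks, and the can_mmap / must_mmap flags from HEAPextend(). The function names, the struct, and the plain int/size_t parameters are hypothetical stand-ins for this illustration only; they are not part of GDK, and BATMARGIN is not modelled because its value is not quoted in this thread. Note that HEAPalloc() subtracts 1 in the statement before the shift while HEAPextend() subtracts it inside the shift expression, so both appear to compute the same round-up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the expression is taken from the quoted HEAPextend()
 * code, (1 + ((n - 1) >> 16)) << 16. It rounds n up to the next multiple of
 * 64K (1 << 16 == 65536) and leaves exact multiples unchanged. */
static inline size_t round_up_64k(size_t n)
{
    assert(n > 0);              /* n == 0 would underflow in (n - 1) */
    return (1 + ((n - 1) >> 16)) << 16;
}

/* Hypothetical container for the two flags computed in HEAPextend(). */
struct mmap_decision {
    int can_mmap;               /* heap may be switched to mmapped storage */
    int must_mmap;              /* heap has to be switched to mmapped storage */
};

/* Assumed mapping onto the quoted code:
 *   has_filename   stands for  h->filename != NULL
 *   storage_is_mem stands for  h->newstorage == STORE_MEM
 *   mem_bigsize    stands for  GDK_mem_bigsize
 *   vm_minsize     stands for  GDK_vm_minsize */
static inline struct mmap_decision
decide_mmap(int has_filename, int storage_is_mem,
            size_t size, size_t mem_bigsize, size_t vm_minsize)
{
    struct mmap_decision d;

    d.can_mmap = has_filename && size >= mem_bigsize;
    d.must_mmap = d.can_mmap && (size >= vm_minsize || !storage_is_mem);
    return d;
}
```

With these definitions, round_up_64k(65536) stays 65536 and round_up_64k(65537) becomes 131072. With mem_bigsize = 262144 and a (deliberately small, made-up) vm_minsize of 1048576, a heap of exactly 262144 bytes that has a filename and STORE_MEM storage gets can_mmap = 1 but must_mmap = 0; only at 1048576 bytes or more, or with non-STORE_MEM storage, does must_mmap become 1.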
Participants (2): Peter Boncz, Stefan Manegold