Posted by 
Robert G. Jakabosky on 
					
					URL: http://elua-development.15.s1.nabble.com/GC-Parameters-Emergency-Collection-tp2846858p2866833.html
					
					On Monday 11, Bogdan Marinescu wrote:
> > I agree.  Does the performance hit only show up during a GC (emergency or
> > no)?  I've not clocked any of this yet.
>
> Yes, the performance hit happens only during GC, which becomes a visible
> problem when GC is called too often. But hey, at least your program is
> running :)
The only performance hit during a normal GC is the in-place resizing.  During 
an emergency (out of memory) you can choice between throwing an error or 
taking a performance hit and calling the GC.
The in-place resizing should be the only change in EGC that can cause slow 
downs (I wasn't able to messure any slow downs on my desktop, maybe the 
in-place resizing was helped by the L2 cache).  Also only the hashpart 
resizing will require extra cpu work (it is also the most complex part of the 
EGC patch).  Re-ordering the strings into the correct hash bucket was easy 
for the string table, since the buckets are normal linked lists.  The 
hashpart of tables are more difficult to resize since the chain of nodes for 
each bucket are nodes from a single array instead of each node being 
allocated as a seperate block of memory.
The other 1/3 of the EGC patch is just bug fixes to make it safe to call the 
GC from any place in Lua (i.e. on any allocation).  Those changes were mostly 
moving code arround and not adding more work.  Basically you just have to 
make sure the GC can find all GC'able objects, this means keeping atleast a 
reference to them on the Lua stack and not just the C-stack.
> > The other side is that if the penalty is only at GC time, people could
> > disable collection during time-critical operations.
>
> Definitely, part II :)
You can disable the GC from happening durimg time-critical ops and it will 
also stop it from happening during an emergency.
You could override the default Lua allocator and tune it to your needs.
One way would be to reserve some memory for use when the GC is blocked.  Use 
the memlimit as a softmark, that will call the GC during allocations only if 
the memory usage is above the memory limit but not throw an error if the GC 
can't free enough memory to lower the usage to below the softmark.  Then 
during time-critical ops the GC will be blocked and there should still be 
some free memory to handle allocations during that time.
> > I haven't taken a deep look to understand how every part of the patch
> > works yet, but I see that it does provide a configurable of a memory
> > limit.  Maybe another interesting tunable might be how many GC steps to
> > take when an emergency GC is called (i.e.: just enough for the allocation
> > pending, full gc, or something else?)  I'm also still curious about the
> > idea of running GC in fixed timesteps,
>
> I did quite a few tests with Robert's patch a while ago, I really need to
> look for my messages related to this subject on the Lua list and repost
> them here. They're quite interesting.
The default Lua allocator was change to call the luaC_step() multiple times 
untill just enough memory was freed to handle the current allocation or 
untill atleast one full GC cycle is completed.  That is if the memory usage + 
allocation size is greater then the set memory limit.
If realloc() returns NULL, then a real emergency full GC cycle is done.
A fixed timestep based GC would be very easy to do with a count debug hook.  
You could call the singlestep() from lgc.c once every X Lua ops.
looking at the luaC_step() function again, it looks like it might be doing to 
much work to be used from the Lua allocator.  I might look into creating a 
more flexible/simple GC step function.
> > In the interests of pragmatism, I don't think I'd pursue anything more
> > detailed here unless I run up against some situation where it doesn't
> > work as desired. I think, overall, a default of sane incremental
> > collection (maybe with the GCPAUSE set to 110) and a backstop emergency
> > GC should be a good solution for most cases.
>
> I'd say a configurable emergency GC should be a good solution for most
> cases. That is, the user should be able to define how the emergency GC will
> run, which will give him a choice between speed and memory usage. Not hard
> to do, fortunately. Again, things will probably be more clear after I post
> the results of Robert's patch here.
Your posts helped me alot with developing the EGC patch, thanks.
Another place to tune Lua's memory usage is the size of the internal string 
table.  Right now the table will grow to twice the old size when the number 
of unique strings equals the old table size.  This is not required for the 
string table since it uses buckets that chain together nodes that are 
allocated seperately (unlike the hashpart of Lua tables).  The string tables 
bucket array can be many times smaller then the total number of unique 
strings.  Having a smaller number of buckets will only slow down new string 
creation.  Once strings are internalized they don't touch the string table.
some quiuck tests with life.lua:
1Mbyte of memory without EGC patch:
str count = 257, max buckets 512
1Mbyte of memory with EGC:
str count = 256, max buckets 512
64Kbyte of memory with EGC:
str count = 128, max buckets 256
1Mbyte of memory with EGC and allow bucket count < 2 * str count:
str count = 273, max buckets 256
64Kbyte of memory with EGC and allow bucket count < 2 * str count:
str count = 145, max buckets 128
1Mbyte of memory with EGC and allow bucket count < 4 * str count:
str count = 273, max buckets 128
64Kbyte of memory with EGC and allow bucket count < 4 * str count:
str count = 145, max buckets 64
Each bucket is a pointer so 512 buckets would equal 2Kbytes with 32bit 
pointers.
I only timed two tests to see the difference in speed:
64Kbyte of memory with EGC:
str count = 128, max buckets 256
real time = 11.035 seconds
64Kbyte of memory with EGC and allow bucket count < 4 * str count:
str count = 145, max buckets 64
real time = 8.832 seconds
I was very surpised to see that the run time decrease with a smaller number of 
buckets, I guess the EGC didn't have to run as many times since there was an 
extra 768 bytes of memory.  Also notice that the string count is higher on 
the faster test, life.lua creates a lot of temp strings to render the output, 
so with more memory available to hold the string it doesn't have to recreate 
them as much.
The attached patch has the changes to allow bucket count < 4 * str count and 
the fprintf I used to get the above string & bucket counts.
I think it would be good to add support for tunning the low & high water marks 
for shrinking/growing the string table.
Atleast this shows that it is possible to squeeze more memory out of the Lua 
core.
-- 
Robert G. Jakabosky
_______________________________________________
Elua-dev mailing list
[hidden email]
https://lists.berlios.de/mailman/listinfo/elua-dev