I have a particular long-running, data-intensive rake task that
sometimes works great and sometimes either hangs or exits unexpectedly
with no message. The places where it fails vary from run to run, but
they seem to be related to the size of the data I'm passing in. (In
other words, some of my smaller data sets never fail.)
I suspect that it is running out of memory. Is there a way to increase
the amount of memory that is available to Ruby and/or my rake task?
Ruby will take pretty much as much memory as it can get if it needs
it; there's no hard maximum like the JVM's heap limit. That isn't
always a good thing: a few years ago, running `gem install rails` on a
small (<256 MB) VPS would eat all of main memory and all of swap.
You may want to try logging space usage, perhaps with
ObjectSpace.count_objects or with something like this:
real_size = `ps -o rsz #{$$}`.split("\n")[1].to_f
which gives you the physical memory usage of your process in KB (the
rsz field may need to be rss on your platform; run ps manually to
check). I snagged that little snippet from the New Relic agent code.
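To make that actionable in a long-running task, you could wrap both ideas in a small helper and call it at checkpoints. This is just a sketch: the helper names are mine, and `ps -o rss=` assumes a Linux/macOS-style ps (the trailing `=` suppresses the header so no splitting is needed):

```ruby
# Resident set size of this process in KB, via ps.
# The field is rss on Linux/macOS; some systems spell it rsz.
def rss_kb
  `ps -o rss= -p #{Process.pid}`.to_f
end

# Log physical memory plus the live object-slot count at a checkpoint.
# ObjectSpace.count_objects returns a hash of slot counts by type,
# including a :TOTAL key.
def log_memory(label)
  total_objects = ObjectSpace.count_objects[:TOTAL]
  puts format("%s: rss=%.0f KB, object slots=%d", label, rss_kb, total_objects)
end
```

Calling `log_memory("after batch 12")` between stages of the rake task shows whether memory climbs steadily (a leak or retained references) or spikes at one particular step.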
You might also want to see if you can help the GC by releasing
references to objects you no longer need (set the variables to nil),
and in extreme circumstances you might even call GC.start right after
creating a large batch of objects you're finished with.
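As a sketch of that pattern, assuming the task can be restructured into batches (the method and its per-batch work are hypothetical stand-ins):

```ruby
# Process data in batches, releasing each batch's intermediate objects
# before moving on so the GC can reclaim them.
def process_in_batches(batches)
  batches.each do |batch|
    rows = batch.map { |n| n * 2 }  # stand-in for expensive per-batch work
    # ... use rows here ...
    rows = nil  # drop the reference so these objects become collectable
    GC.start    # force a collection; usually unnecessary, so measure first
  end
end
```

Explicit GC.start calls slow the task down, so they're a last resort; the bigger win is usually making sure long-lived variables (caches, accumulating arrays, memoized results) aren't silently retaining every batch you've processed.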