Or I guess it could mean that both Ruby and Go are not accessing the macOS filesystem in the most performant way. Apple publishes plenty of guidance on filesystem performance - Performance Tips
I wonder if ramdisk on Mac is slower than ramdisk on a VM on the same system
If this is the case, it would indicate there is some sort of filter driver slowing things down
@tenderlove what approach should we be taking here: putting all gems into one file in some sort of cache (a zero-validation Bootsnap sort of option), or should we be giving up and kicking up a fuss with our Apple friends? Something else?
I made an APFS ram disk on macOS:
diskutil partitionDisk $(hdiutil attach -nomount ram://2048000) 1 GPTFormat APFS 'ramdisk' '100%'
Copied 10,000 files of 100 random bytes each, and then ran the same Go code as before against the ramdisk. The initial run was much quicker than against the regular filesystem, but still not super fast. After multiple runs it also still ended up at around the same 240ms.
$ ./gofile
difference = 646.605246ms
$ ./gofile
difference = 440.253606ms
$ ./gofile
difference = 340.012913ms
$ ./gofile
difference = 271.434962ms
...
$ ./gofile
difference = 232.221546ms
$ ./gofile
difference = 238.020729ms
I repeated with a HFS+ ramdisk and got similar results.
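The Go benchmark itself isn't shown in this thread, so for anyone wanting to reproduce it, here is a hypothetical Ruby equivalent (the directory name is made up; point `dir` at the ramdisk mount to compare): it writes 10,000 files of 100 random bytes each, then times opening and reading all of them.

```ruby
require "benchmark"
require "fileutils"
require "securerandom"
require "tmpdir"

# Hypothetical Ruby equivalent of the Go benchmark: write 10,000 files
# of 100 random bytes each, then pay one open()+read() per file, timed.
dir = File.join(Dir.tmpdir, "open-read-bench") # change to the ramdisk mount to compare
FileUtils.mkdir_p(dir)

10_000.times do |i|
  File.binwrite(File.join(dir, "f#{i}"), SecureRandom.random_bytes(100))
end

elapsed = Benchmark.realtime do
  10_000.times { |i| File.binread(File.join(dir, "f#{i}")) }
end
puts format("difference = %.6fms", elapsed * 1000)
```

Running it a few times back to back should show whether the warming effect in the numbers above is specific to Go.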
It's not like there are many alternatives to the open() and read() syscalls… And the documentation you point to is for Swift/Objective-C APIs; it's safe to assume they use these syscalls under the hood, just like Go, Ruby, and pretty much every other programming language.
I have heard from a small avian friend that, because of the macOS filesystem sandbox security guarantees, read() is around the same speed but open() is dramatically slower. It also seems like those guarantees are non-negotiable from the OS security side. I think that leaves options like:
- Try to cooperate with someone who has the right job to optimize the security stuff without disabling it, with a payoff in months or years.
- Try to add something to Bootsnap or Bundler that reduces calls to open(). Has anyone tried concatenating all their library files into one giant file and comparing perf that way?
I suppose there's nothing stopping us from doing both.
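For the concatenation experiment, a minimal sketch (all names hypothetical): write every library file into one blob, remember each file's offset and length, and afterwards serve reads with a single open() for the blob plus one pread() per lookup.

```ruby
require "tmpdir"

# Concatenate many source files into one blob, recording each file's
# byte offset and length so it can be read back without its own open().
def build_bundle(paths, bundle_path)
  index = {}
  File.open(bundle_path, "wb") do |out|
    paths.each do |path|
      data = File.binread(path)
      index[path] = [out.pos, data.bytesize]
      out.write(data)
    end
  end
  index
end

# Reading back costs one open() for the bundle (done once at boot),
# then a positional read per entry.
def read_entry(bundle, index, path)
  offset, length = index.fetch(path)
  bundle.pread(length, offset)
end

Dir.mktmpdir do |dir|
  paths = 3.times.map { |i| File.join(dir, "lib#{i}.rb") }
  paths.each_with_index { |path, i| File.write(path, "module Lib#{i}; end\n") }

  bundle_path = File.join(dir, "bundle.blob")
  index = build_bundle(paths, bundle_path)

  File.open(bundle_path, "rb") do |bundle|
    puts read_entry(bundle, index, paths[1]) # prints "module Lib1; end"
  end
end
```

Timing many `read_entry` calls against many individual `File.binread` calls would isolate exactly how much of the cost is open() versus read().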
So that backs up my hunch above.
It also means it doesn't only interest Rubyists but pretty much all developers. Even if I don't have high hopes, publicizing this in popular places such as HN might actually lead to something long term. But yes, I agree that this isn't something you can just do now and wait on.
I tried several variations of that a few years back when I was working on bootscale (bootsnap's father), but there were many problems I didn't find solutions to. Off the top of my head, changing the code path means you need to rewrite all __FILE__ and __LINE__ references and dependent calls such as require_relative, etc.
However, what we can do much more easily, and get most of the gain with little effort, is to store the Bootsnap iseqs in a giant indexed file. After your first boot you're no longer reading the Ruby source files anyway, but the Bootsnap cache.
We just need an efficient way to keep a big mmapped hash open. Something like this: GitHub - luispedro/diskhash: Diskbased (persistent) hashtable
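For a rough feel of the "persistent hash in one file" interface, Ruby's stdlib PStore can stand in, though unlike diskhash it is not mmapped and rewrites the whole file on each commit, so it only illustrates the API shape, not the performance profile (the file name and key below are made up):

```ruby
require "pstore"
require "tmpdir"

# Stand-in for a persistent key/value store of compiled iseqs.
# Real candidates (diskhash, LMDB) keep one file handle / mapping open,
# so lookups avoid a per-entry open() syscall.
store = PStore.new(File.join(Dir.tmpdir, "iseq_cache.pstore"))

store.transaction do
  store["lib/foo.rb"] = "compiled-iseq-bytes" # value would be the iseq blob
end

store.transaction(true) do # read-only transaction
  puts store["lib/foo.rb"]
end
```

The interesting part of a diskhash/LMDB-style store is that the read side stays hot across lookups, which is exactly the property a one-open-per-file cache layout lacks.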
So I wrote a quick hacky PR: HACK: try to store iseqs in LMDB to reduce the number of opened files by casperisfine · Pull Request #297 · Shopify/bootsnap · GitHub
This uses LMDB as a backend to store the ISeqs. We've been using LMDB as a store for Sprockets for years; it's a bit finicky to use, but I think we've ironed most of the bugs out since then.
On microbenchmarks, LMDB#get is 4 times faster than File.read on my machine: lmdb.rb · GitHub.
Also, on paper this saves one open() syscall per Ruby file loaded, and if we were to consider gem contents immutable, we could also avoid the open() and stat() used to validate cache freshness.
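The freshness check being avoided might look like this sketch (hypothetical helper names; not necessarily bootsnap's exact scheme): the cache entry records the source file's mtime and size, and every load pays a stat() to re-validate them.

```ruby
require "tmpdir"

# Hypothetical cache-freshness check: record mtime and size when the
# cache entry is written, then stat() the source on every load to
# verify they still match. Treating gem contents as immutable would
# let us skip this stat() entirely for gem files.
def entry_for(path)
  st = File.stat(path)
  { mtime: st.mtime.to_i, size: st.size }
end

def fresh?(entry, path)
  st = File.stat(path) # one extra syscall per load, just for validation
  entry[:mtime] == st.mtime.to_i && entry[:size] == st.size
end

path = File.join(Dir.tmpdir, "fresh_check_example.rb")
File.write(path, "puts :hi\n")
entry = entry_for(path)
p fresh?(entry, path) # prints true; becomes false once the file changes
```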
However, when testing it against our app I can't seem to see any performance improvement. I'd like to try it against the Discourse benchmark, but I'm having trouble setting it up.
It's possible that what is gained by avoiding many open() syscalls is lost by going through the LMDB bindings and managing these blobs in Ruby, rather than in C like the regular Bootsnap cache store does. For instance, Bootsnap uses rb_str_new_static to avoid copying the cache blobs; to do the same I'd need to query LMDB from C.
Sorry to barge right in, but could the Secure Boot setting in the macOS startup utility play a role here?
The screenshot below mentions something about Secure Boot, but I don't know if it has anything to do with the disk access or whether you can disable it for the current system.
That's not Secure Boot we're talking about but "System Integrity Protection": How to turn off System Integrity Protection on your Mac | iMore
I turned off System Integrity Protection and re-ran my Go code. Same sort of results as before:
$ ./gofile
difference = 2.72653261s
$ ./gofile
difference = 1.159182558s
$ ./gofile
difference = 605.57577ms
$ ./gofile
difference = 388.847999ms
...
$ ./gofile
difference = 254.285507ms
$ ./gofile
difference = 250.556632ms
Unfortunately, my source does not know whether stat() is also slowed down.
To be clear, the security/sandbox I am referring to here has nothing to do with Secure Boot or System Integrity Protection. Modern versions of macOS have per-process access controls for basically all hardware, including the file system. There's a conceptual introduction here if you're interested:
The kernel-level security framework that keeps processes from automatically having access to your camera and all your files. That thing. You can't turn it off.
FWIW, I think we should at least try to put together a Radar for this. I think we're close to a reproducible example.
I understand the guarantees that might be causing this behavior are non-negotiable, but it's possible that no one at Apple realized they're slow in these cases and no one's ever tried to optimize them.
Yes, I definitely agree with this. I have also heard that there might be available headcount on the team responsible, so if anyone wants to work on it, get in touch.
On behalf of thousands of developers feeling this every day I just want to say this thread makes me very very excited.
Sooooooo did anyone manage to get anywhere with flagging this to Apple?
I am sure they read this. Boot on M1 is way better even with the slow file access; you just need native vs Rosetta to get the full bang.
Almost night and day difference for me. Identical RSpec test on M1 is almost 6x faster in the "files took X seconds to load" bit.