#development

aloof-angle-91616

04/11/2020, 6:10 AM
i'm hacking away on a framework (https://github.com/cosmicexplorer/upc) to share memory between processes on a local machine so that i can delegate filesystem operations in pants to a virtual git filesystem (VCFS) from twitter that i'm trying to open source. there's nothing there right now but my plan is to make it really easy for: (1) _virtual filesystems_: to communicate their contents with pants (2) _command-line tools_: to be modified to read file contents from shared memory instead of the filesystem (3) _build tools_: to have super high-resolution tracing on the subprocesses they invoke without having to build in zipkin themselves, to replay nondeterministic failures from a remote machine! might go nowhere!! but i have a prototype of the shared memory part working and i think it can be made generic so that we can plug arbitrary subprocesses into our virtual file system and still have them read everything at high speed, without taking up any temp dir space! let me know if that intersects with any work anyone else is doing anytime this year!
😎 1
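A minimal sketch of the shared-memory part of this idea, assuming Python's standard `multiprocessing.shared_memory` module rather than anything from the upc repo. The helper names `publish` and `read_published` are hypothetical. A reader in another process would attach by segment name in the same way; here both handles live in one process to keep the example short.

```python
from multiprocessing import shared_memory


def publish(contents: bytes) -> shared_memory.SharedMemory:
    """Copy file contents into a new named shared-memory segment."""
    # A zero-sized segment is not allowed, so reserve at least one byte.
    shm = shared_memory.SharedMemory(create=True, size=max(len(contents), 1))
    shm.buf[: len(contents)] = contents
    return shm


def read_published(name: str, length: int) -> bytes:
    """Attach to an existing segment by name and copy its contents out."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        return bytes(shm.buf[:length])
    finally:
        shm.close()


if __name__ == "__main__":
    data = b"struct Foo {}\n"
    owner = publish(data)
    try:
        # Only the segment name and the length need to cross the process boundary.
        print(read_published(owner.name, len(data)))
    finally:
        owner.close()
        owner.unlink()
```

The length is passed separately because the operating system may round the segment size up, so the segment size alone does not say where the file contents end.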

witty-crayon-22786

04/11/2020, 4:26 PM
neat.
it overlaps a bit with brfs, so make sure you've taken a look at that: https://github.com/pantsbuild/pants/blob/master/src/rust/engine/fs/brfs/src/main.rs#L661
no plans to push on that soon since i haven't really profiled that portion of the code

aloof-angle-91616

04/11/2020, 4:28 PM
i'm looking at brfs now

witty-crayon-22786

04/11/2020, 4:28 PM
it's a FUSE filesystem that daniel wrote a while back. read-only from a snapshot.

aloof-angle-91616

04/11/2020, 4:29 PM
ok fantastic
thank you for pointing me to that!
i've been able to cut away a lot of superfluous, unnecessarily complex communication since thinking about this last night

witty-crayon-22786

04/11/2020, 4:29 PM
mounting it below a union/overlayfs would allow for reading through from the snapshot, and writing out to something else.

aloof-angle-91616

04/11/2020, 4:29 PM
ah, ok
i'm currently trying to see how much porting effort it would take to get scrooge to access a fully virtual filesystem (by editing the codebase), the idea being if i can virtualize file i/o, then i can avoid having to materialize or digest any files at all.

witty-crayon-22786

04/11/2020, 4:32 PM
in the workspace, you mean?

aloof-angle-91616

04/11/2020, 4:32 PM
yes

witty-crayon-22786

04/11/2020, 4:33 PM
one of the things that we had discussed was exposing a cheap "what is the digest of this path" operation
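One way to picture a cheap "what is the digest of this path" operation, as a sketch only: hash the file once, then answer repeat queries from a cache keyed on `stat()` metadata. The `Digest` tuple, the `_cache` dict, and `digest_of_path` are hypothetical names for illustration, not the pants engine API.

```python
import hashlib
import os
from typing import Dict, NamedTuple, Tuple


class Digest(NamedTuple):
    fingerprint: str  # hex SHA-256 of the file contents
    size_bytes: int


# Hypothetical in-process cache keyed on (real path, mtime in ns, size).
_cache: Dict[Tuple[str, int, int], Digest] = {}


def digest_of_path(path: str) -> Digest:
    """Return the content digest of a file, hashing only on a cache miss."""
    st = os.stat(path)
    key = (os.path.realpath(path), st.st_mtime_ns, st.st_size)
    cached = _cache.get(key)
    if cached is not None:
        return cached
    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large files are not loaded all at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            hasher.update(chunk)
    digest = Digest(hasher.hexdigest(), st.st_size)
    _cache[key] = digest
    return digest
```

A virtual filesystem that already knows the content hash of each file could skip the hashing step entirely and answer from its own metadata.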

aloof-angle-91616

04/11/2020, 4:33 PM
yes
that was one of the things i made a lot simpler after thinking about it last night, but still up in the air (still drawing diagrams with arrows at this point)
right now i'm thinking that VCFS can write directly to the LMDB store by depending on the engine binary, then return a Digest (via thrift) to pants, so i think that is aligned with what you've just mentioned above
the thrift part works already, integrating the LMDB store via the engine crate i haven't tried yet

witty-crayon-22786

04/11/2020, 4:37 PM
there is prior art for exposing digests via filesystem metadata too
which can make it more generic.

aloof-angle-91616

04/11/2020, 4:37 PM
i hadn't realized that, thank you for the tip
sorry, what kind of prior art?
or do you mean in brfs as well

witty-crayon-22786

04/11/2020, 4:38 PM

aloof-angle-91616

04/11/2020, 4:38 PM
!!! haven't seen that word in a while
hm. that also sounds like a way to address https://github.com/pantsbuild/pants/issues/9428 (can't materialize symlinks)
(in that case we were able to work around needing to materialize symlinks, though)

witty-crayon-22786

04/11/2020, 4:39 PM
a bit on the prior art in bazel https://github.com/bazelbuild/bazel/issues/923

aloof-angle-91616

04/11/2020, 4:40 PM
this is a very interesting issue
thank you again!

witty-crayon-22786

04/11/2020, 4:40 PM
(can search for xattr in that thread)
🌈 1
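A rough sketch of what exposing digests via filesystem metadata could look like from the consumer side, assuming Linux extended attributes through Python's `os.getxattr` and `os.setxattr`. The attribute name `user.sha256` and both helper names are made up for illustration. When xattrs are unavailable (non-Linux platform, or a filesystem without user xattr support) the code falls back to hashing the contents.

```python
import hashlib
import os

XATTR_NAME = "user.sha256"  # hypothetical attribute name


def sha256_hex(path: str) -> str:
    """Hash the file contents the slow way."""
    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def store_digest_xattr(path: str) -> bool:
    """Try to record the digest as an xattr; report whether it worked."""
    try:
        os.setxattr(path, XATTR_NAME, sha256_hex(path).encode("ascii"))
        return True
    except (AttributeError, OSError):
        # AttributeError: platform without os.setxattr.
        # OSError: filesystem rejects user xattrs.
        return False


def digest_from_xattr(path: str) -> str:
    """Prefer the digest stored in metadata; hash the file if it is absent."""
    try:
        return os.getxattr(path, XATTR_NAME).decode("ascii")
    except (AttributeError, OSError):
        return sha256_hex(path)
```

In a virtual filesystem the attribute would be served by the filesystem itself rather than written by a client, so a stored value could not go stale the way it can in this sketch.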

aloof-angle-91616

04/11/2020, 4:42 PM
i'm gonna spend a bit of time today prototyping the jvm library to virtualize i/o in-memory and seeing whether that's prohibitively slow for scrooge in the monorepo. i'm really interested in using the expected output files/directories from the EPR to make any file operations outside of the desired output just no-op, for example
i might get nowhere. but brfs is super helpful as a bridge between the hacks i have on top of VCFS and what i want to see, thanks again
will ask for help or just ask questions, etc
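The "no-op anything outside the expected outputs" idea can be sketched as an in-memory write filter. This is an illustration in Python rather than the JVM library mentioned above, and `OutputFilteringFS`, `output_files`, and `output_dirs` are hypothetical names standing in for whatever the process request declares.

```python
from pathlib import PurePosixPath
from typing import Dict, Iterable


class OutputFilteringFS:
    """Keeps writes to declared outputs in memory and drops all others."""

    def __init__(self, output_files: Iterable[str], output_dirs: Iterable[str]):
        self._files = {PurePosixPath(p) for p in output_files}
        self._dirs = [PurePosixPath(p) for p in output_dirs]
        self.contents: Dict[PurePosixPath, bytes] = {}

    def _is_expected(self, path: PurePosixPath) -> bool:
        # A path counts if it is a declared file or sits under a declared dir.
        return path in self._files or any(d in path.parents for d in self._dirs)

    def write(self, path: str, data: bytes) -> bool:
        """Store the write if it targets an expected output, else no-op."""
        p = PurePosixPath(path)
        if not self._is_expected(p):
            return False
        self.contents[p] = data
        return True


if __name__ == "__main__":
    fs = OutputFilteringFS(["out/Foo.scala"], ["gen"])
    fs.write("gen/a/Bar.scala", b"class Bar")  # kept
    fs.write("/tmp/scratch.log", b"noise")     # silently dropped
    print(sorted(str(p) for p in fs.contents))
```

Whether silently dropping writes is safe depends on the tool never reading back its own scratch files, which would need checking for each tool.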

witty-crayon-22786

04/11/2020, 4:51 PM
what's the connection between scrooge and VCFS? that wouldn't be in the workspace probably... would be in a sandbox