# development
w
@sparse-lifeguard-95737: you’ve had a series of issues around `experimental_shell_command`: it’d be interesting to hear more about the use case, and how it is working out. we’ll need to scope what would be required to stabilize it at some point.
s
I’ve primarily been trying to use it to work around other not-yet-implemented features in Pants 🙂
for example, we’re unfortunately still in a situation where front-end and back-end code is tightly coupled, so we need to build all the front-end assets as part of building docker images for the web app. we first tried adding all the raw static asset files as dependencies to the `docker_image` target, but hit a wall because:
• there are so many dependencies to tailor and include on each image (we reuse the same assets for multiple apps)
• building the assets within docker was much less performant than building directly on the host (mainly because of the docker emulation perf on M1 Macs)
I had the idea to string together `yarn` commands in `experimental_shell_command` targets to recreate the front-end build process. we’d still need to manually tailor `file` targets for everything, but ideally only specify the dependencies once. and `docker_image`s would depend on the final output of the shell commands instead of all the raw sources
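roughly what I had in mind, as a sketch (the target names and globs here are invented, and `experimental_shell_command`’s field names have shifted between Pants versions, so treat this as illustrative only):

```python
# BUILD — illustrative sketch only; names/globs invented, fields vary by Pants version
files(
    name="frontend-sources",
    sources=["src/**/*", "package.json", "yarn.lock"],  # invented globs
)

experimental_shell_command(
    name="frontend-assets",
    command="yarn install && yarn build",
    tools=["bash", "yarn"],
    outputs=["dist/"],                  # capture only the built assets
    dependencies=[":frontend-sources"],
)

docker_image(
    name="webapp",
    dependencies=[":frontend-assets"],  # depend on built output, not raw sources
)
```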
my proof-of-concept showed that it’s possible to set this up, but still a bit messy. and I hit https://github.com/pantsbuild/pants/issues/16825 for a command that produced a big `node_modules/`, so I might have hit a dead end
another `experimental_shell_command` use case I considered was injecting a `docker pull` for each of our `docker_image` targets, to improve our cache hit rate. I never pursued the idea because I figured out a hack to wire in buildx’s `--build-from` support to get remote caching there
`experimental_shell_command` also isn’t very `asdf`-friendly - to run `yarn` (which we install via `asdf`) I had to add these `tools`:
```
bash
basename
dirname
sed
uniq
cut
awk
grep
head
tr
readlink
uname
```
and these `extra_env_vars`:
```
ASDF_DIR
HOME
```
and a dependency on `//.tool-versions`
I’m not sure if it’s on Pants to make `asdf` Just Work, but it is a surprising mismatch to have dedicated support for it at the bootstrapping layer and not in process execution
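for concreteness, the resulting target looked something like this (a sketch from memory — the target name is made up, and the `.tool-versions` address may differ per setup):

```python
# BUILD — sketch of the asdf workaround; illustrative only
experimental_shell_command(
    name="yarn-install",
    command="yarn install",
    # every coreutil invoked by the asdf shims has to be listed by hand:
    tools=[
        "bash", "basename", "dirname", "sed", "uniq", "cut",
        "awk", "grep", "head", "tr", "readlink", "uname",
    ],
    extra_env_vars=["ASDF_DIR", "HOME"],  # so the shims can locate asdf
    dependencies=["//.tool-versions"],
)
```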
w
> I’ve primarily been trying to use it to work around other not-yet-implemented features in Pants 🙂
yea, that’s definitely the intent. we’d like it to be possible to get JS applications packaged this way.
> my proof-of-concept showed that it’s possible to set this up, but still a bit messy. and I hit https://github.com/pantsbuild/pants/issues/16825 for a command that produced a big `node_modules/` so I might have hit a dead end
i’ll take a look at this one…. getting a profile while it’s running would be helpful, but it sounds like it might have been capturing the `node_modules` into a snapshot? that was the goal, i assume? if you capture a zip/tar file as the output of the process instead, does performance get better?
s
w
got it, thanks: that actually perfectly highlights the issue.
s
Nice!
no rush, but when you have time to check it: I assume that you are running with `cache_content_behavior="fetch"` (the default)? If so, you might try capturing another trace with `cache_content_behavior="validate"` to see how much of a difference that makes. If it’s significant, we can probably adjust that codepath to unconditionally use `validate` for a local cache.
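i.e., in `pants.toml` (assuming a Pants version that has the option — check your version’s docs):

```toml
[GLOBAL]
# roughly: "fetch" (the default) materializes cache content eagerly,
# while "validate" only checks that the cache entries exist
cache_content_behavior = "validate"
```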
w
…ok, yea. that’s the difference between 204s to “fetch” the cache entry, vs 9.6s to “validate” it. thanks!
the disconnected `setup_sandbox` workunit is still a mystery, but we can fix this bit.
are you able to privately share some of the repro? or do you have a theory about which other processes might have been running?
s
I should be able to share a minimized repro - it’s a big `yarn install`, so nothing really proprietary about it AFAIK. will try to put something together
w
While the scale is useful in terms of the total timings, my guess is that something that is just "shaped" correctly will still repro the dangling workunit
But yea, thanks: either way.
s
I suspect that if I `.tar.gz`’d the `node_modules/` output dir things would drastically improve
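something like this (a sketch — target name invented, command and field names assumed, not tested):

```python
# BUILD — sketch: capture one archive instead of a huge node_modules file tree
experimental_shell_command(
    name="node-modules-archive",
    command="yarn install && tar czf node_modules.tar.gz node_modules",
    tools=["bash", "yarn", "tar"],
    outputs=["node_modules.tar.gz"],
)
```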
checking now…
“drastically improve” other than eating the extra IO to zip/unzip the giant node_modules…
will open a PR in the repo and leave it unmerged