< sparse lifeguard 95737> you ve had a series of issues arou Pants #development

<@U02Q3KHC59B>: you’ve had a series of issues arou...

witty-crayon-22786

09/13/2022, 7:14 PM

@sparse-lifeguard-95737: you’ve had a series of issues around `experimental_shell_command`: it’d be interesting to hear more about the use case, and how it is working out. we’ll need to scope what would be required to stabilize it at some point.

👀 1

sparse-lifeguard-95737

09/13/2022, 7:37 PM

I’ve primarily been trying to use it to work around other not-yet-implemented features in Pants 🙂

sparse-lifeguard-95737

09/13/2022, 7:41 PM

for example, we’re unfortunately still in a situation where front-end and back-end code is tightly coupled, so we need to build all the front-end assets as part of building docker images for the web app. we first tried adding all the raw static asset files as dependencies to the

docker_image

target, but hit a wall because: • there are so many dependencies to tailor and include on each image (we reuse the same assets for multiple apps) • building the assets within docker was much less performant than building directly on the host (mainly because of the docker emulation perf on M1 macs) I had the idea to string together

yarn

commands in

experimental_shell_command

targets to recreate the front-end build process. we’d still need to manually tailor

file

targets for everything, but ideally only specify the dependencies once. and

docker_image

s would depend on the final output of the shell commands instead of all the raw sources

sparse-lifeguard-95737

09/13/2022, 7:42 PM

my proof-of-concept showed that it’s possible to set this up, but still a bit messy. and I hit https://github.com/pantsbuild/pants/issues/16825 for a command that produced a big

node_modules/

so I might have hit a dead end

sparse-lifeguard-95737

09/13/2022, 7:44 PM

another

experimental_shell_command

use-case I considered was injecting a

docker pull

for each of our

docker_image

targets, to improve our cache hit-rate. I never pursued the idea because I figured out a hack to wire in buildx’s

--build-from

support to get remote caching there

sparse-lifeguard-95737

09/13/2022, 7:48 PM

experimental_shell_command

also isn’t very

asdf

-friendly - to run

yarn

(which we install via

asdf

) I had to add these

tools

Copy code

bash
basename
dirname
sed
uniq
cut
awk
grep
head
tr
readlink
uname

and these `extra_env_vars`:

Copy code

ASDF_DIR
HOME

and a dependency on

//.tool-versions

I’m not sure if it’s on Pants to make

asdf

Just Work, but it is a surprising mismatch to have dedicated support for it at the bootstrapping layer and not in process execution

witty-crayon-22786

09/13/2022, 8:02 PM

I’ve primarily been trying to use it to work around other not-yet-implemented features in Pants 🙂

yea, that’s definitely the intent. we’d like it to be possible to get JS applications packaged this way.

witty-crayon-22786

09/13/2022, 8:04 PM

my proof-of-concept showed that it’s possible to set this up, but still a bit messy. and I hit https://github.com/pantsbuild/pants/issues/16825 for a command that produced a big
node_modules/
so I might have hit a dead end

i’ll take a look at this one…. getting a profile while it’s running would be helpful, but it sounds like it might have been capturing the

node_modules

into a snapshot? that was the goal, i assume? if you capture a zip/tar file as the output of the process instead, does performance get better?

👀 1

witty-crayon-22786

09/13/2022, 8:17 PM

https://github.com/pantsbuild/pants/issues/16825#issuecomment-1245912195

sparse-lifeguard-95737

09/13/2022, 8:50 PM

https://app.toolchain.com/organizations/color/repos/color/builds/pants_run_2022_09_13_16_49_38_116_3fef1fa93fcc4d4199978b95906c0eae/ is the run with trace-level streaming workunits

👍 1

witty-crayon-22786

09/13/2022, 9:14 PM

got it, thanks: that actually perfectly highlights the issue.

sparse-lifeguard-95737

09/13/2022, 9:20 PM

Nice!

witty-crayon-22786

09/13/2022, 9:28 PM

https://github.com/pantsbuild/pants/issues/16825#issuecomment-1245971491

witty-crayon-22786

09/14/2022, 8:30 PM

no rush, but when you have time to check it:

I assume that you are running with
cache_content_behavior="fetch"
(the default)? If so, you might try capturing another trace with
cache_content_behavior="validate"
to see how much of a difference that makes. If it’s significant, we can probably adjust that codepath to unconditionally use
validate
for a local cache.

👍 1

sparse-lifeguard-95737

09/14/2022, 8:45 PM

https://app.toolchain.com/organizations/color/repos/color/builds/pants_run_2022_09_14_16_45_13_208_d1ab5cac59e542f99c6523b2bf917a6c/

witty-crayon-22786

09/14/2022, 8:56 PM

…ok, yea. that’s the difference between 204s to “fetch” the cache entry, vs 9.6s to “validate” it. thanks!

witty-crayon-22786

09/14/2022, 8:57 PM

the disconnected

setup_sandbox

workunit is still a mystery, but we can fix this bit.

witty-crayon-22786

09/14/2022, 9:11 PM

are you able to privately share some of the repro? or do you have a theory about which other processes might have been running?

sparse-lifeguard-95737

09/15/2022, 3:10 PM

I should be able to share a minimized repro - it’s a big

yarn install

so nothing really proprietary about it AFAIK. will try to put something together

witty-crayon-22786

09/15/2022, 3:43 PM

While the scale is useful in terms of the total timings, my guess is that something that is just "shaped" correctly will still repro the dangling workunit

👍 1

witty-crayon-22786

09/15/2022, 3:43 PM

But yea, thanks: either way.

sparse-lifeguard-95737

09/15/2022, 6:07 PM

https://github.com/danxmoran/pants-yarn-install-repro

sparse-lifeguard-95737

09/15/2022, 6:08 PM

I suspect that if I

.tar.gz

’d the

node_modules/

output dir things would drastically improve

sparse-lifeguard-95737

09/15/2022, 6:09 PM

checking now…

sparse-lifeguard-95737

09/15/2022, 6:20 PM

“drastically improve” other than eating the extra IO to zip/unzip the giant node_modules…

sparse-lifeguard-95737

09/15/2022, 6:20 PM

will open a PR in the repo and leave it unmerged

2 Views

Open in Slack

Previous Next