I can configure `named_caches_dir = "/codefresh/vo...
# general
s
I can configure `named_caches_dir = "/codefresh/volume/.cache/pants/named_caches"` for CI, but how do I configure the location shown in `New virtual environment successfully created at /root/.cache/nce`? I'm trying out removing `./pants` and just using `pants` via `get-pants.sh`, but my CI keeps bootstrapping pants. If I could tell pants to look at `/codefresh/volume/.cache/nce`, I think that'd solve this.
e
You can set the `SCIE_BASE` env var. See more here: https://github.com/a-scie/jump/blob/main/jump/README.md#the-nce-cas
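A minimal sketch of that suggestion for a CI step, assuming the persisted volume is mounted at `/codefresh/volume` as in the question. The `use_ci_scie_base` helper and the `CI_CACHE_ROOT` variable are hypothetical names used only for illustration; `SCIE_BASE` is the only name taken from the thread.

```shell
# Hypothetical helper: point the scie-pants cache at a directory the CI system persists.
use_ci_scie_base() {
  cache_root="$1"
  # SCIE_BASE replaces the default cache location (e.g. /root/.cache/nce for root).
  SCIE_BASE="$cache_root/nce"
  export SCIE_BASE
  echo "SCIE_BASE=$SCIE_BASE"
}

# CI_CACHE_ROOT is an assumed override; the default mirrors the path from the question.
use_ci_scie_base "${CI_CACHE_ROOT:-/codefresh/volume/.cache}"
```

Any `pants` invocation later in the same shell would then inherit the exported `SCIE_BASE`.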
s
Perfect, Thanks!
b
Is there any way for scie-pants to default this to underneath the pants umbrella?
e
That was a super lazy question, but no (yes in scie-jump, no in science): https://github.com/a-scie/lift/pull/41
w
I think I’m bumping into this. I am on CircleCI and using the default cache dirs. I’m restoring the cache, e.g.:
Found a cache from build 160764 at pants-bootstrap-2.16.0-20230822
Size: 113 MiB
Cached paths:
  * /home/circleci/.cache/pants/setup
  * /home/circleci/.cache/nce
  * /home/circleci/.cache/named_caches

Downloading cache archive...
Validating cache...

Unarchiving cache...
but using `./get-pants.sh` in CI is still bootstrapping. Did I miss something? I’m fairly sure the cached pants is the same version as needed by `pants.toml`.
e
@wonderful-boots-93625 your pattern match here was in error: nothing to do with Yusuf's request afaict. That said, the listed cached paths are the correct ones (`/home/circleci/.cache/nce` in particular, assuming the user is `circleci`), but you don't want to run `./get-pants.sh` each time. It does not check the cache and then skip the install as you're hoping. Instead, it's very simple and it just always installs pants. I assume you're not using https://github.com/pantsbuild/actions/tree/main/init-pants? If so, maybe use that. It should have everything taken care of. In particular see here: https://github.com/pantsbuild/actions/blob/b16b9cf47cd566acfe217b1dafc5b452e27e6fd7/init-pants/action.yaml#L86-L99
That change was on May 3rd; so sort of recent: https://github.com/pantsbuild/actions/pull/21
Actually, this looks broken to me all the way back to the 1st commit with support in the action for scie-pants (https://github.com/pantsbuild/actions/commit/ca4e0c41384a272b6bc7661080169ec0be465a85). I think the step I pointed to must set the PATH to include `~/bin` as its 1st step.
When I say broken, I mean that assuming `~/bin` happens to be on the PATH already (up to GitHub whims) is a bad assumption.
@happy-kitchen-89482 I feel like I must be crazy. It seems unlikely the init action has been broken like this for so long. Perhaps you can provide a sanity check.
h
So I understand the context - @wonderful-boots-93625 is running on circleci, so how is the github action relevant?
e
Ah right, I guess not except as an example of how to use get-pants.sh conditionally correctly. Which then leads to the still relevant question of if the init action does this correctly! AFAICT it does not.
h
IIRC this was intended to differentiate between self-hosted runners that already have `pants` preinstalled on their `$PATH`, and hosted runners that don't. In the latter case we expect to re-install every time anyway because we don't cache `~/bin`. So I think this is working as intended. We could add caching for the `pants` binary in the hosted runner case, but it's not clear to me that this would be meaningfully faster than running get-pants.sh.
Also note my comment from just now on that PR: `echo "$HOME/bin" >> $GITHUB_PATH` only affects subsequent actions in the job. So if we did want to check for existing `~/bin/pants` we would need to modify `$PATH` in this action.
So it is true that moving this line does not do what the author intended, AFAICT.
w
If all I have to do is cache the binary in `~/bin` and check for it before calling get-pants.sh, I can make that work.
But for now I think I have other bits that are taking more time than this install. I was just hopeful that it would be an easy fix.
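The "install only when missing" idea discussed above could be sketched roughly as follows. `ensure_pants` is a hypothetical helper, not part of get-pants.sh or the init-pants action, and `~/bin` is assumed to be the install directory as described in the thread.

```shell
# Sketch only: skip the installer when a cached `pants` launcher is already usable.
ensure_pants() {
  bin_dir="${1:-$HOME/bin}"          # directory restored from the CI cache (assumed)
  installer="${2:-./get-pants.sh}"   # command that installs the launcher

  # Put the cached directory on PATH *before* looking for the binary.
  PATH="$bin_dir:$PATH"
  export PATH

  if command -v pants >/dev/null 2>&1; then
    echo "pants launcher found; skipping install"
  else
    echo "pants launcher not found; running installer"
    "$installer"
  fi
}
```

A CI step would call `ensure_pants "$HOME/bin" ./get-pants.sh` after restoring the cache. Whether this is meaningfully faster than always running the installer is, as noted above, unclear.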
r
Hey @happy-kitchen-89482 and @enough-analyst-54434. I hope you're doing well! I think I have an issue similar to Nasron's. I am running the `SCIE_BASE=/.cache/setup pants` binary, and this creates the `/.cache/setup/` folder containing both Python 3.9 and the virtual environment created by pants. I am running this while creating a Docker image; that's why `/.cache/setup` is in my root folder. When I want to reuse the `/.cache/setup` data, I need to copy the folder to my CI/CD workspace directory that's shared by all CI steps, so I can reuse these files in all CI steps. I do this by running `cp -r /.cache/ $WORKSPACE/`, so in the end I have `$WORKSPACE/.cache/setup` available.
When running pants again, I want to run it using the Python and virtual environment installation available in `$WORKSPACE/.cache/setup` instead of `/.cache/setup`, so I am running `SCIE_BASE=$WORKSPACE/.cache/setup pants`, but every time I run this command, it downloads a new installation of Python and creates a new virtual environment, overwriting what I already copied. I'm using the same `pants.toml` between the first and second `pants` execution and I know pants is downloading the exact same Python version and creating the exact same virtual environment, as the hashes of the folders created are the same. For some reason `pants` is just overwriting `$WORKSPACE/.cache/setup` and I am not sure how to solve this issue. I am using pants 2.16 if that matters.
To reproduce, you can do the following:
1. Create a new pants project using pants `2.16`.
2. Run `SCIE_BASE=<abs_path_to_cur_folder>/setup pants -V`. This will bootstrap Pants.
3. Copy the `setup` folder to `setup2` or something else in the same directory.
4. Run `SCIE_BASE=<abs_path_to_cur_folder>/setup2 pants -V`. This will reproduce the issue I am facing.
One very interesting thing that's also happening is that if you delete the original `setup` folder, make a copy from `setup2` to `setup`, and run `SCIE_BASE=<abs_path_to_cur_folder>/setup pants -V`, it will reuse the files that are available in the `setup` folder instead of bootstrapping itself. Here are the exact commands I executed for debugging purposes:
$ cat pants.toml
[GLOBAL]
pants_version = "2.16.0"
pantsd = false
$ SCIE_BASE=/Users/mikaelsilva/testing-pants/setup pants -V
Downloading https://github.com/indygreg/python-build-standalone/releases/download/20230507/cpython-3.9.16%2B20230507-x86_64-apple-darwin-install_only.tar.gz...
Bootstrapping Pants 2.16.0 using cpython 3.9.16
Installing pantsbuild.pants==2.16.0 into a virtual environment at /Users/mikaelsilva/testing-pants/setup/50b60c9f29c54578592fd2b4725722435f65d43cb21e6e2732268b6d5050a09a/bindings/venvs/2.16.0
New virtual environment successfully created at /Users/mikaelsilva/testing-pants/setup/50b60c9f29c54578592fd2b4725722435f65d43cb21e6e2732268b6d5050a09a/bindings/venvs/2.16.0.
2.16.0
$ cp -r setup setup2
$ SCIE_BASE=/Users/mikaelsilva/testing-pants/setup2 pants -V
Bootstrapping Pants 2.16.0 using cpython 3.9.16
Installing pantsbuild.pants==2.16.0 into a virtual environment at /Users/mikaelsilva/testing-pants/setup2/50b60c9f29c54578592fd2b4725722435f65d43cb21e6e2732268b6d5050a09a/bindings/venvs/2.16.0
New virtual environment successfully created at /Users/mikaelsilva/testing-pants/setup2/50b60c9f29c54578592fd2b4725722435f65d43cb21e6e2732268b6d5050a09a/bindings/venvs/2.16.0.
2.16.0
These are the steps that recreate the exact same folders that already exist. Now the part that I find very interesting:
$ rm -rf setup
$ cp -r setup2 setup
$ SCIE_BASE=/Users/mikaelsilva/testing-pants/setup pants -V
2.16.0
I am not sure why it works when I run it with a folder that I used in the past, but not when using a new folder. There were a few warnings about not using the `telemetry` parameter that I removed from the text above so it would be easier to read.
e
@rhythmic-butcher-20315 I did not read all this but stopped here: "BASE_SCI=/.cache/setup ./pants". SCIE_BASE is only respected and understood by `pants`, the pants binary (otherwise known as scie-pants). Your leading command says `./pants`, which is the pants shell script. That is much older and completely different. It uses `~/.cache/pants/setup`. So, before going further, in the hopes this clears up confusion: `./pants` and `pants` should be considered totally unrelated ways to launch Pants. Pick one, then go from there. The one you pick should be `pants` for all new projects / repos using Pants, since that is the more modern, now-supported way to run Pants.
Basically, you have a lot of apparent typos, maybe, but they are close enough to the real things (`./pants` vs `pants`, `~/.cache/setup` vs `~/.cache/pants/setup`) that it is unclear in all of the above whether you are confused, typoing, or both.
r
@enough-analyst-54434 Ty for your input. I had a few typos as you noticed and I think I might be using the older pants binary. I tested what I said in both a Docker container and locally on my Mac. I think the `get-pants.sh` script I am using in my Docker image might be outdated. I'll download the newer `get-pants.sh` script, install the newest pants version and try to repro this issue again.
Ok, so, after deleting every instance of pants that I had on my machine and downloading the latest one with the `get-pants.sh` script, I still can't do:
$ SCIE_BASE=/Users/mikaelsilva/testing-pants/setup pants -V
$ cp -r setup setup2
$ SCIE_BASE=/Users/mikaelsilva/testing-pants/setup2 pants -V
When running the second command, pants still tries to download everything again. That's my main issue. I want to reuse what I already downloaded in the `setup` folder that I copied to `setup2`. I also updated the original post because I was using the new `pants` binary there too.
e
Ok, I repro, but I believe this is because venvs are not relocatable (they have scripts with absolute paths in shebangs). So the question is: why do you do it this way? Why can't the path be the same for each step?
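To see what "absolute paths in shebangs" looks like on disk, a small sketch follows. `list_pinned_shebangs` is a hypothetical helper, and it assumes the venv uses the usual `bin/` layout. It only illustrates the claim above and does not by itself confirm the cause of the re-bootstrap.

```shell
# Hypothetical helper: list scripts in <venv>/bin whose shebang is an
# absolute path inside <venv> (such scripts break if the venv is moved).
list_pinned_shebangs() {
  venv_dir="$1"
  for f in "$venv_dir"/bin/*; do
    [ -f "$f" ] || continue
    # Read the first line only; strip NUL bytes so binaries do not trigger shell warnings.
    first_line=$(head -n 1 "$f" | tr -d '\000')
    case "$first_line" in
      "#!$venv_dir"/*) echo "$f" ;;
    esac
  done
}
```

Running `list_pinned_shebangs <venv_dir>` against the venv directory named in the bootstrap output would print each script tied to that location.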
r
It's because I want to create a Docker image with all necessary pants files, but when I use the image in the CI pipeline, only files that are available in `/drone/src/github.com/company/repo/` are shared between build steps. Currently, we have a `bootstrap` CI step that runs `SCIE_BASE=/drone/src/github.com/company/repo/.cache/setup pants -V`, and this creates a `pants` installation that is shared between steps, but this step always bootstraps Pants and I wanted to preprocess this step in the image itself to save some time.
e
So, why not a Dockerfile like:
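# ... other steps to get `pants` (scie-pants) set up ...
# VERSIONS_NEEDED is assumed to be a space-separated list of Pants versions
# (e.g. set via ARG or ENV); a plain list is used because the default RUN
# shell, /bin/sh, has no array support on many base images.
RUN touch pants.toml && \
    for version in ${VERSIONS_NEEDED}; do \
        PANTS_VERSION="${version}" pants; \
    done && \
    rm pants.toml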
No custom SCIE_BASE, no copying in the pants scie cache - just get the cache built in the image itself.
r
I am doing something very similar to what you proposed, but then pants is only available in this image. We want to run pants inside a few different images we already have in our CI pipeline. That being said, it would be a lot easier to just create a simple pants image as you suggested and check if we can use it in all our CI steps, instead of doing this MacGyver stuff. Thank you again for all the support!
e
Yeah, since venvs are platform specific, trying to build them here then copy them there is a fool's errand afaict.
Are you allowed to volume-mount an image? You could make a pants image that you use as a data image and volume mount it into the existing corporate images.
I do this for Pex in CI. There is a base image I volume mount in a data-only cache image for various caches I want to share and update separately from the base image.
r
I am not sure about this approach. I'll also check if someone in the company can share some info about `volume-mount`s with me. At this moment, I'm just trying to build a PoC for a way to make our CI builds faster.
e
If you're using docker but haven't heard of volume mounts - definitely read up. In almost every build situation I've seen, they are integral and critical.
To be sure - volume mounting from an image is decidedly less common. Typically you volume mount in from the host machine filesystem.
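A rough sketch of the data-image idea, using a data-only container and `--volumes-from`. Several things are assumptions for illustration, not anything confirmed in the thread:

- `share_pants_cache`, `run`, and the `pants-cache` container name are hypothetical.
- Both image names are placeholders.
- The cache is assumed to live at the default `/root/.cache/nce`.
- The cache image is assumed to contain a `true` binary.
- The `pants` launcher is assumed to be on the PATH of the working image.

By default the commands are only printed (`DRY_RUN=1`), so nothing touches Docker until that is turned off.

```shell
# Dry-run wrapper: print commands by default, execute only when DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper: expose the cache baked into one image to another image.
share_pants_cache() {
  cache_image="$1"   # image whose /root/.cache/nce was pre-populated at build time
  work_image="$2"    # existing CI image that should reuse that cache

  # Create (but do not start) a container that declares the cache dir as a volume.
  run docker create --name pants-cache -v /root/.cache/nce "$cache_image" true
  # Mount that container's volumes into the working image at the same path.
  run docker run --rm --volumes-from pants-cache "$work_image" pants -V
}
```

Because the cache is mounted at the same absolute path it was built at, the relocation problem discussed above should not apply, though the two images still need to be platform-compatible.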