# general

f
Hey there, I have a (maybe naive) question about caching. I have a project with quite a few external dependencies. When I do a `pants run` on the target runnable pex, every time I touch any code in my repo (which is itself quite light), Pants starts rebuilding all the requirements like this:
```
Building 26 requirements for my_app.pex from the 3rdparty/python/default.lock resolve: Jinja2<4.0,>=3.0.0, PyYAML>=6.0.1
```
which (even with `pantsd` enabled) takes well over 30 seconds every time, even for very minor code changes on my side. I've checked the common issues and global options, but I feel I'm missing something here.
The `pants.toml` (slightly cleaned up; there are also some `black` and `flake8` options in there, plus the sources config) is basically:
```toml
[GLOBAL]
pants_version = "2.18.3"
backend_packages = [
  "pants.backend.python",
]
local_cache = true
pantsd = true

[source]
...

[python]
enable_resolves = true
default_resolve = "default"
interpreter_constraints = ["==3.11.*"]

[python.resolves]
default = "3rdparty/python/default.lock"
tools-external = "3rdparty/python/tools-external.lock"

[python-repos]
indexes = ["https://some.internal.repo/repository/pip-cache/simple"]

[setuptools]
install_from_resolve = "tools-external"
requirements = ["//tools/external:requirements"]
```
w
Few questions:
• Is there anything in the logs in the `.pants.d` folder that might point at a culprit?
• Does this happen when you just run `pants package` on the pex?
• Does `-ldebug` give any potential insight as to why it's packaging requirements?
• Are you using or modifying any of the ignore items? https://www.pantsbuild.org/2.20/reference/global-options#pants_ignore https://www.pantsbuild.org/2.20/reference/global-options#pants_ignore_use_gitignore
f
> any of the ignore items
🤦 using gitignore, set to ignore everything like `.*/`, which includes `.pants.d`
thanks for the quick answer 😅
So, updates:
• Does this happen when you just run `pants package` on the pex? -> Yeah, same behaviour
• It actually still happens even if I set `pants_ignore_use_gitignore = false`. Every time I add a new character, it goes on "building 26 requirements".
• Added `pants -ldebug` but I don't get much other info 😕
• The logs in `.pants.d` show me the same stuff:
```
16:17:59.15 [WARN] /home/.../specifiers.py:255: DeprecationWarning: Creating a LegacyVersion has been deprecated and will be removed in the next major release
  warnings.warn(

16:18:21.90 [INFO] Completed: Building 26 requirements for some_app.pex from the 3rdparty/python/default.lock resolve: Jinja2<4.0,>=3.0.0, PyYAML>=6.0.1, beautifulsoup4<5.0,>=4.12.2... (498 characters truncated)
16:18:21.98 [INFO] Wrote dist/some_app.pex
```
w
This is strange; is there a minimal reproducible example you could post? Also, does this happen on Pants 2.20 (for example)?
I may have misread the original post. Does this happen on consecutive runs of `pants package`? Or is it ONLY after some code has been manipulated?
On the machine I first replied on, the "I touch any code" was cut off, leading me to think this happened every time you ran the same command, regardless of whether you touched code 🤦‍♂️
So, what you're currently seeing isn't unusual (at least, not to me): modifying the targets that are packaged up causes a re-build. I have that happen too on some of my projects. One way I've mitigated this when it annoys me is splitting up my dependencies to be a bit finer-grained and easier to cache independently.
edit: Struggling to find the project where I did it, but I recall doing something like this: https://www.pantsbuild.org/blog/2022/08/02/optimizing-python-docker-deploys-using-pants#multi-stage-build-leveraging-2-pexs
I don't know if that will work for you, but it just so happened to work in my circumstance because of how everything else needed to work. Essentially, I built a requirements pex and an application pex, and merged the two in some way to build the binary I needed.
d
Is it the same when you run `pants run path/to/your_main_file.py`?
f
@wide-midnight-78598
> Or is it ONLY after some code has been manipulated?
Yeah, exactly, only after manipulating code. I see that multistage build trick, but it seems very targeted at building Dockerfiles; I'd have to see if I can adapt it.
@dry-architect-80370
> Is it the same when you run `pants run path/to/your_main_file.py`?
Actually no! Then it's fine. However, the PEX uses a custom entrypoint to launch a FastAPI server via gunicorn. Not sure there'd be an easy way to reproduce this with `pants run <file>`.
d
Interesting... What do you put in the `dependencies` field of the pex target?
f
```python
pex_binary(
    name="some_api_pex",
    script="gunicorn",
    args=[
        "some_api.app:create()",
        "--worker-class",
        "uvicorn.workers.UvicornWorker",
    ],
    dependencies=[
        ":some_api_src",
        "//:requirements#gunicorn",
        "//:requirements#uvicorn",
    ],
    output_path="some_api.pex",
)
```
where `:some_api_src` is just the Python sources
w
Okay, so you're using this something like I am, with the pex holding uvicorn and gunicorn. I literally just discarded the multi-step pex example last night. Let me try to do this with one of my simpler repos this morning. While the example works for Docker specifically, that's more about how the pex files are reconciled in the end. I did something similar with `scie` packages.
d
You could try changing the `:some_api_src` dep to just `some_api.app` (not sure what's the correct syntax to point it at a single file, but it's definitely doable); otherwise I have no other ideas.
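For what it's worth, Pants does support file-level addresses, so a hedged sketch of that suggestion might look like the BUILD fragment below. The path `some_api/app.py` and the exact address form are my assumptions, not something confirmed in this thread; `pants peek` or the targets docs would give the real address.

```python
# BUILD — hypothetical sketch, not the confirmed fix: depend on one
# generated file-level target instead of the whole python_sources target,
# so that fewer files feed into the pex's inputs. Paths are assumed.
pex_binary(
    name="some_api_pex",
    script="gunicorn",
    dependencies=[
        "some_api/app.py:some_api_src",  # assumed file address: <file path>:<owning target>
        "//:requirements#gunicorn",
        "//:requirements#uvicorn",
    ],
    output_path="some_api.pex",
)
```

Note that dependency inference would still pull in whatever `app.py` imports, so this narrows the declared inputs rather than the transitive closure.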
w
@fancy-policeman-6755 Would there be any benefit in structuring it something like this? All your deps in one pex, sources in another, and then a top-level pex that takes in both. My use case in the end does something like this, so your mileage may vary. I realized it's hard to really make a good example on some of my trivial projects, because my computer is so fast that changing sources is a sub-1-second re-build anyways 🤦‍♂️
```
PEX_TOOLS=1 python3.11 hellofastapi-pex.pex venv --bin-path prepend --compile --rm all ./venv
./venv/bin/uvicorn hellofastapi.main:app
```
```python
python_sources(
    name="libhellofastapi",
    sources=["**/*.py"],
)

pex_binary(
    name="hellofastapi-deps",
    include_sources=False,
    include_tools=True,
    dependencies=["//:reqs#uvicorn", "//:reqs#fastapi", "//:reqs#numpy", "//:reqs#pywebview", "//:reqs#gunicorn"],
)

pex_binary(
    name="hellofastapi-srcs",
    include_requirements=False,
    include_tools=True,
    dependencies=[":libhellofastapi"],
)

pex_binary(
    name="hellofastapi-pex",
    dependencies=[":hellofastapi-srcs", ":hellofastapi-deps"],
    include_tools=True,
)
```
f
Hey there! So the problem is that I don't think `uvicorn` and `gunicorn` are the heaviest dependencies in there (there's also fastapi, dash, and a bunch of other heavy things). If I understand right, I'd have to explicitly define them all in the `hellofastapi-deps` pex, which somewhat defeats the point of `pants` autodiscovery.
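As an aside (my assumption, not something confirmed in this thread): in Pants, depending on a target generator is equivalent to depending on all of its generated targets, so a deps pex could pull in every third-party requirement without naming each one. A sketch, reusing the `//:reqs` name from the example above:

```python
# BUILD — hypothetical sketch: depend on the whole python_requirements
# target generator ("//:reqs") instead of listing "//:reqs#<name>" one
# by one. The trade-off: the deps pex now contains every locked
# requirement, used or not, and rebuilds whenever the lockfile changes.
pex_binary(
    name="hellofastapi-deps",
    include_sources=False,
    include_tools=True,
    dependencies=["//:reqs"],
)
```

This trades precision for convenience: autodiscovery is bypassed for this one target, but nothing has to be maintained by hand as requirements come and go.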
However, I've been thinking about what @dry-architect-80370 says about just running a source file directly. I guess the key point is that I don't really need to run the pex itself to test incremental changes; it's building the pex (so, I guess, zipping the deps) that causes the slowdown. So what I did is just add a
```python
if __name__ == "__main__":
    import uvicorn

    uvicorn.run(create_app(), debug=True, ...)
```
in the module that creates the FastAPI app (the one called by gunicorn in the pex entrypoint). Then I can just run the module directly, without having to build the whole pex. It's not exactly the same as in the production env, but for testing a minor code change it's enough.