# general
b
We've got quite a few `package`-able Python targets: mostly `python_awslambda` but some `pex_binary` too. We run `./pants package ::` in CI to validate that they're all packageable, but this ends up spending a lot of time bundling up requirements (~10s per target, as a rule of thumb), so it'd be nice to make this faster. I've made two observations:
1. Changing our code without changing the dependency structure seems to require repackaging the dependencies from scratch (I ran `./pants package ::` before the modification locally, made the change, and then ran `./pants package ::` after, with the same pantsd process).
2. Multiple targets that have the same set of requirements seem to build their requirements separately (more details in thread).
Is this expected behaviour? Any tips for improving it?
Example for 2: here's the relevant subset of the log for two targets (renamed `first.zip` and `second.zip`) that use the same dependencies:
```
22:39:03.55 [INFO] Starting: Building 16 requirements for first.zip from the 3rdparty/python/default.lock resolve: asyncpg==0.27.0; python_version >= "3.9" and py... (1124 characters truncated)
22:39:12.91 [INFO] Completed: Building 16 requirements for first.zip from the 3rdparty/python/default.lock resolve: asyncpg==0.27.0; python_version >= "3.9" and py... (1124 characters truncated)
22:39:22.45 [INFO] Starting: Building 16 requirements for second.zip from the 3rdparty/python/default.lock resolve: asyncpg==0.27.0; python_version >= "3.9" and python... (1120 characters truncated)
22:39:31.71 [INFO] Completed: Building 16 requirements for second.zip from the 3rdparty/python/default.lock resolve: asyncpg==0.27.0; python_version >= "3.9" and python... (1120 characters truncated)
```
• Based on the timestamps these builds run sequentially (i.e. it doesn't seem like there'd be concurrency issues inhibiting cache usage)
• Both steps take ~9 seconds
h
Hmmmmm
That is unexpected, I think
1. is not unexpected, at least in that the zip obviously needs to bundle all the deps inside it, even if they haven’t changed from the previous zip, so I would guess that is what’s taking the time
But 2. is unexpected, that should surely only happen once
b
For 1, I see the "Building ... requirements for ..." output taking a long time, so I take it that includes the time to construct a zipfile with the dependencies and source? That would suggest that 2 is similar: it's just the time to construct the packaged zip (including both deps and source) that's the problem?
h
I’ll dig in a little more tomorrow, I’m not 100% sure what that process does exactly.
It’s possible that the description is a tad misleading and it’s actually building the package itself, including the sources.
👍 1
h
`package` is our slowest Python goal because it builds a single PEX with both source files and requirements in the same file. Normally we use a dedicated requirements.pex plus loose source files, so that changes to source code don't require building any new PEX.
👍 1
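(For reference, if splitting deps out of the single-file zip fits the use case: a hypothetical BUILD sketch — the target name and entry point are made up, and I may be misremembering the exact field values, but pex_binary has a `layout` field where `packed` keeps third-party deps in separately cached artifacts instead of the default monolithic `zipapp`:)

```
# BUILD — hypothetical target; `layout="packed"` asks Pants/PEX to store
# requirements in per-dep cached files, so source-only edits don't force
# the requirements to be re-bundled into a fresh zipapp.
pex_binary(
    name="app",
    entry_point="app.py",
    layout="packed",
)
```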
h
Yes, but it’s unclear that “Building requirements for…” is doing that work. The description is misleading, as it sounds like it’s just resolving.
👍 2
b
Okay, thanks. It sounds like the speed here is about normal, which is helpful to know and will guide optimisation efforts. 👍 Just brainstorming: I wonder if it's feasible (and faster) to build these sorts of zips incrementally, e.g. two steps that are independently cacheable:
1. build the requirements into a zip (presumably shared across all targets that need the same deps, including tests etc.)
2. append the source code into that zip
(I imagine a packed or loose layout would solve some of the problems here, but I’m not sure if that’ll work for us. I’ll have to think about it.)
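(To make the two-step idea concrete — this is not how Pants implements `package`, and the helper names and file layout here are entirely made up — a minimal sketch with the standard-library `zipfile`: build the deps zip once per requirement set, then copy it and append each target's sources:)

```python
import shutil
import zipfile
from pathlib import Path


def build_deps_zip(deps: dict[str, bytes], out: Path) -> Path:
    """Step 1: bundle the shared third-party deps into one zip.

    In a real build this artifact would be cached, keyed on the resolved
    requirement set, so every target with the same deps reuses it.
    """
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in deps.items():
            zf.writestr(name, data)
    return out


def package_target(deps_zip: Path, sources: dict[str, bytes], out: Path) -> Path:
    """Step 2: copy the cached deps zip and append this target's sources."""
    shutil.copyfile(deps_zip, out)
    with zipfile.ZipFile(out, "a", zipfile.ZIP_DEFLATED) as zf:
        for name, data in sources.items():
            zf.writestr(name, data)
    return out
```

With this split, packaging `first.zip` and `second.zip` would pay the deps cost once rather than once per target; only step 2 runs per target.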