# general
a
Are there any simple tricks to get Pants to be better at resolving transitive dependencies? I told it to run some tests; it's been running for 10 minutes and it currently looks like this:
```
1917702 cris.bi+  20   0  813.5g   6.6g 108692 S 103.7  21.2  12:40.82 pantsd [/home/c
```
I have 32GB of memory, so it's not a huge issue, but our CI doesn't approve of this. 😞
r
Are you using lockfiles? That helps in speeding up things a bit.
a
We have parametrised interpreter constraints, so it's not really easy to move to lockfiles
f
there are others who face performance issues (time / memory usage) when querying large dependency graphs. I'm afraid troubleshooting performance would require careful log collection and interpretation, so you may want to file a GitHub issue with more details: the command you run, the size of your repo, the memory consumption, anything that would help us troubleshoot. Apart from that, I think the only way to achieve decent performance for a large repo is to have a cache, either a local or a remote one, to avoid unnecessary computations. Is this an option? Please see https://www.pantsbuild.org/docs/using-pants-in-ci It doesn't have to be anything fancy immediately; even a directory shared between builds that take place on the same node should have a profound effect on performance (if you can't have a shared remote cache, e.g. https://www.pantsbuild.org/docs/remote-caching-execution#server-compatibility)
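For reference, a persistent local cache mostly comes down to pointing Pants's store directories at a path that survives between builds. A minimal sketch (the paths are illustrative, not taken from this thread):
```
# pants.toml (or pants.ci.toml) - sketch only; local_store_dir and named_caches_dir
# are the standard [GLOBAL] cache locations, the paths are made-up examples
[GLOBAL]
local_store_dir = "/var/cache/pants/lmdb_store"
named_caches_dir = "/var/cache/pants/named_caches"
```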
a
We do have the cache shared, but this happens locally even without me changing any code, 10 minutes of pain and misery.
f
gotcha, thanks for clarifying. It's likely that you are suffering from the same issue as the one I've shared above.
a
Yeah, I'm reading it now
πŸ‘ 1
Yeah, we have 6.5k python files, so I'm guessing the growth is not linear. πŸ™‚
I think I might just bite the bullet and look into lockfiles with all the magic macros that we need to use.
f
I am not sure lockfiles would resolve your pantsd memory / performance issues
a
Oh, then I won't πŸ™‚
This did get much worse when we went from 2.6 to 2.14 (in one go, so I can't really tell you exactly when), and not by 20% but by 300-400%
And I'm not sure whether it's just been getting slower because we have more crap in our repo, or because of changes in Pants, but it's definitely way worse lately.
f
πŸ˜„ if you let it run on a fresh machine, say it takes 10 mins. If you re-run it, say, 100 times, does it consistently take only a few seconds to fetch from memory (as the results will be memoized)? You may also need to tweak your pantsd memory, see https://www.pantsbuild.org/docs/reference-global#pantsd_max_memory_usage
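Raising that limit is a single setting; the 8GiB below is only an illustration, not a recommendation from this thread:
```
# pants.toml - sketch; pantsd_max_memory_usage is the option linked above,
# the value chosen here is just an example
[GLOBAL]
pantsd_max_memory_usage = "8GiB"
```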
a
well, if it takes 7GB of memory, it'll get killed (as it's a looot, regardless of what I said initially about 32GB making it okay)
but, yes, for smaller runs, it's instant the second time
f
oh I see, so it does grow beyond the 4GiB used by default, gotcha
what if you disable using pantsd at all?
`--no-pantsd`
and only use local cache?
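On the command line that would look roughly like this; the test goal and the `::` target spec are placeholders rather than the actual command from this thread:
```
# assumes the repo's usual launcher script; disables the daemon for this run only
./pants --no-pantsd test ::
```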
a
let me give it a go, it might actually be faster, I was looking at the CI and it's NOT that slow
f
I am not intimately familiar with how the pantsd daemon works, but if it gets killed by the OS, it may then take a lot of time to get started / schedule work, etc.
a
Oh, it doesn't get killed by the OS, it dies after it's done with the command, because of that option you mentioned.
βœ… 1
f
ah, so you left it at the default 4GiB; I saw it taking ~7GiB, so I thought you had increased it but then the OS interferes and kills it
a
It's at least not 5 times faster, it's been running for 2 minutes now πŸ™‚
Nah, it doesn't get killed, but it does need to swap a bit. It doesn't seem to affect Pants, though; I closed PyCharm and Chrome, no extra swapping, and it still took >10m
βœ… 1
Okay, so it's much faster and uses less memory without pantsd.
It uses 4GB of memory and it was done in 4-5 minutes
Or, hm, there's still a 'resolve transitive deps' step in there, but it said it's running tests
f
this is something. What I would also like to try is https://www.pantsbuild.org/v2.17/docs/reference-python-infer#use_rust_parser in 2.17; if you have lots of files, maybe you are hitting a bottleneck there
I saw some folks reporting significant dep inference performance improvements
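For completeness, enabling it is a one-line config change (this assumes a 2.17 Pants, per the link above):
```
# pants.toml - sketch; [python-infer].use_rust_parser is the 2.17 option linked above
[python-infer]
use_rust_parser = true
```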
a
Let's see (but, it's gonna take a bit, need to take care of some personal stuff before I try to upgrade pants)
f
no need to do the full Pants upgrade, you can just use the new version locally to experiment πŸ™‚
a
I mean, I need to see if our plugins work πŸ™‚
f
When I experiment with a new version, I just disable them in the backends if there are any changes in the plugin API (during the experiment) πŸ™‚
a
True, could do that.
βœ… 1
Btw, we don't use dependency inference. Just the `__init__.py` one, and that causes me headaches every so often.
f
> Btw, we don't use dependency inference
oh, this is the first time I've been exposed to a repo that doesn't have it enabled. How do you declare dependencies between targets?
a
Just manually.
f
wow
a
we're nothing if not hard working πŸ™‚
πŸ˜‚ 1
Are lockfiles mandatory in 2.17?
f
> Just manually.
there must be a good reason why you do that, I'd love to learn more! If you'd like to explore automatic build target generation, feel free to look at https://www.pantsbuild.org/v2.17/docs/reference-tailor. I know that, for example, in Bazel you have to manually declare dependencies, but you would still take advantage of tooling such as Gazelle to generate them for you. You do of course get into an awkward situation when you mix human- and machine-generated dependencies, which is suboptimal.
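For reference, the tailor goal from that link is run like this; `::` just means the whole repo:
```
# generates missing BUILD targets across the repo; it does not infer dependencies itself
./pants tailor ::
```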
> Are lockfiles mandatory in 2.17?
no they are not
a
okay, it was our plugin that was screwing it up, I guess
and, well, I'm not sure why, except this is how we've always done it.
f
imho, lockfiles are optional and require careful planning, as you may end up with lots of complications. I had worked in a Pants monorepo with constraints files generated with `pip-compile`, and https://www.pantsbuild.org/v2.17/docs/python-third-party-dependencies#constraints-files worked lovely. So I wouldn't go for them just for the sake of having them. In a large monorepo with complicated tooling, lockfiles may make your life worse if not carefully researched first, imho.
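A rough sketch of that constraints-file setup, with made-up file names; [python].requirement_constraints is the option the linked doc describes:
```
# pants.toml - sketch only; constraints.txt would be generated separately,
# e.g. with `pip-compile requirements.in -o constraints.txt`
[python]
requirement_constraints = "constraints.txt"
```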
a
We do want them, even if it will make life more complicated; our 3rd party deps are a mess currently, only part of them are in constraints... Anyway, we need to figure out how to do parametrised interpreter constraints in a way that doesn't make people sad
Definitely not a huge improvement in speed with the rust inference parser.
f
that's because you don't read the `import` statements in Python files, do you
the file parsing and AST construction is what was sped up
a
it still does `__init__` files, and quite a few of those have a million imports.
f
gotcha, so some parsing is still done, I see
a
I just now read that it gets worse if you turn inference off, I wonder...
Well, at least it failed fast.
```
pants.engine.target.InvalidFieldException: The target defender/defender/tasks/populate_defender_monitoring_stats.py:../../lib has the `interpreter_constraints` ('CPython~=3.7.4', 'CPython~=3.10.9'), which are not a subset of the `interpreter_constraints` of some of its dependencies:

  * ('CPython~=3.10.9',): insights/insights/defender_stats/entities.py:../../lib
  * ('CPython~=3.10.9',): insights/insights/defender_stats/stat_types.py:../../lib
```
Anyway, I'll try to look into this more. The problem is that I'm trying to reduce the CI time, and I managed to get all tests to run in parallel, but now, if we don't split into multiple workers, the dependencies bit is slowing things down πŸ™‚
h
To clarify, what do you mean by "resolving transitive dependencies" ? Which processes are taking a long time?
Is it a pip resolve?
Forcing Pants to use a newer pip can speed those up, since newer pips have better backtracking heuristics
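If pip resolution is indeed the slow part, recent Pants releases expose a [python].pip_version option; the value below is only an example and the set of allowed version strings depends on the release you're on:
```
# pants.toml - sketch; assumes a Pants release that has [python].pip_version,
# check your release's docs for the versions it accepts
[python]
pip_version = "23.1.2"
```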
a
Sorry, I misspoke. It says
```
323.05s	Resolve transitive targets
```
h
That is shockingly long. That rule isn't running external processes, it's just doing graph traversal in memory.
How big is your build graph?
(how many source files in the repo is a good approximation)
a
About 6.4k python files
h
That's a small number, so hmm...
Something is very off here
a
If there's anything I can share to look into this, I'm more than willing πŸ™‚
I just don't really know where to begin 😞
h
I assume this code is proprietary and you can't share it in a public github repo?
a
okay, short of that πŸ˜„
yeah, it is
h
Well, confidential channel support (including NDA) is potentially available to project sponsors now! See the new sponsorship page: https://www.pantsbuild.org/docs/sponsorship
a
Not because of that, but I was trying to get us to sponsor pants, since it's such a big part of our workflow πŸ™‚
❀️ 2
f
Are you using any `UnionRule`s with `InferDependencyRequest` by chance?
I have 8k targets and I am struggling with similar issues
a
Our only custom plugin just does the releases, and it's really simple, other than that it's just a huge mess of python sources that depend on each other too much. πŸ™‚
f
> just a huge mess of python sources that depend on each other too much
Perhaps that's the commonality here πŸ˜„
πŸ˜„ 1
a
Yeah, I still think it's excessive. I might try to get the powers that be to approve me spending a few days on profiling this, but I'm not sure I can figure it out; my only exposure to Rust was reading parts of their books while commuting, so yeah. 😞
f
We use dep inference, and I did notice in experiments turning it off that it got slower, not faster. "Not using dep inference" isn't a well-tested use case in Pants, because it's not really expected. I'm going through what I think is the relevant code here https://github.com/pantsbuild/pants/blob/7c3270f3631d6540fd822deac0236703696f59af/src/python/pants/engine/internals/graph.py#L1307
a
We tried turning it on, but we got some weird errors from doing it.
f
(i think that's a link to 2.17.0rc2 's code, but you get the picture)
f
I'd also advocate for trying to chop off a part of the repo and run Pants again. I wonder whether it's the size that matters (it certainly may make things slower), or whether it's something about the runtime environment. For example, I get pantsd occasionally crashing on my hobby repo as well
That is, keep the config in place, just try a smaller subset
f
George, what mechanism do you use to turn off dependency inference?
a
Okay, we have it on for `__init__.py` files, but we didn't turn it on for anything else:
```
[python-infer]
imports = false
```
f
hmmm, I don't think that actually makes dependency inference not run
a
That's a bit silly. πŸ™‚
I, personally, am not a fan of this feature, but I never looked too much into it, since we cannot turn it on anyway.
f
oh... i'm lying
it might depend on which version you're using
a
We're on 2.16.0
a
To be fair, I'm sure that even parsing those, it shouldn't be this slow.
f
yeah it's not the parsing that's slow
or that's taking up so much memory
I'm just trying to nail down what's actually happening
a
It seems to me that it's just something that doesn't scale linearly, things are getting way worse as we're adding files
f
Yeah, I think I'm gonna go through at some point and start commenting things out in the source or replacing bits with no-ops and see where I get a speed-up
there's something really funky going on
a
But, yeah, it's getting slightly ridiculous for us, for some of our apps, running tests is basically 5 minutes of pants doing shit, then 3-4 minutes of running tests 😞
f
I've submitted some speedscope profiles before and it's led to a few improvements, but there's something weird happening here and I think it's gonna take a more aggressive approach
Oh we have hours of tests to run, so we barely even feel it
I'm only kinda joking
We feel it more when we try to put it in dev-facing tooling
a
We split our tests by app; it actually does get better if you ask Pants to do this for fewer targets.
Actually, we even split apps into multiple 'shards' (not sure what CircleCI calls them; that's what we called them before we moved there), and the time it takes to start running tests depends on how many tests we want to run
f
I wish that were an option for us. We use `dependents`, which requires that Pants calculate all dependencies
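For context, that's the goal that answers "what depends on X"; the target path below is made up:
```
# lists everything that transitively depends on the given file; the path is a placeholder
./pants dependents --transitive src/python/mylib/util.py
```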
But I'm going to dive into this. I think hacking and slashing the graph and dep inference code might at least let me find where a bottleneck is in the code
a
we have a step at the beginning that gets the dependencies between this PR and master, then splits them into, erm, we call them components (sometimes more than one app/lib), and each of these runs tests/lint/typechecking separately.
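A rough sketch of how that kind of change-based selection can look with Pants's built-in flags; this isn't necessarily what the pipeline above actually does (older releases spell the flag --changed-dependees):
```
# run goals only for targets changed relative to master, plus everything that depends on them
./pants --changed-since=origin/master --changed-dependents=transitive test lint check
```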
f
This is hard for much of the Pants team to work on because it's not clear how to create reproducers for these issues
I'm a maintainer and I have access to a repo that can reproduce this, so I guess it should fall on me πŸ˜…
a
we used to run it all together, it did take more than one hour, but this was on an ancient version of pants
f
> we have a step at the beginning that gets the dependencies between this PR and master, then splits them into, erm, we call them components (sometimes more than one app/lib), and each of these runs tests/lint/typechecking separately
This is where I want to get to as well but it will take time
a
well, if you have any ideas, I can try them on our side, and I definitely am willing to commiserate πŸ™‚
f
If you have some time to hack, https://www.pantsbuild.org/docs/running-pants-from-sources will show you how to build and run your own version of pants locally
(This is what I'm going to do... tear bits out of the code I linked you and then run it on my codebase)
a
oh, right, I thought you meant removing parts of your repo
which is... really hard, hehe
f
nah I need those bits
we're a monolith masquerading as a monorepo, it's rough
a
I don't need 90% of it, but it's very hard to separate them πŸ˜„
btw, are you using lockfiles?
f
heh... yes but sparingly
a
'cause, in the past, when I complained about this, I was told that lockfiles should make this better
or, well, that they should make things better, in general
f
depends on what your issue is
we use fake non-existent dependencies for python third party deps and run our tests outside of Pants because our actual code is provided by RPM packages, not PyPI
a
My condolences... I thought our setup was screwed up, but yeah... πŸ˜„
h
To clarify @ancient-france-42909’s issue, FWICT the time consuming thing is graph traversal, not dep inference? So I wonder if the hairball-ness is the issue.
f
Maybe so, but we need to find ways to isolate where these bottlenecks actually are in the code
And speedscope profiles haven't really shown anything that stands out as a primary cause here. So I feel like hacking at this the old-fashioned way and just removing and replacing bits of code with no-ops until I figure out what's going on ☺️
h
πŸ™‚
@witty-crayon-22786 might know a way of creating a flame graph of rules
πŸ™ 1