<@UB2J9BQA0> We just noticed <https://github.com/p...
# development
a
@hundreds-father-404 We just noticed https://github.com/pantsbuild/pants/issues/9346 - any guesses at a root cause before we dig in?
🔥 3
h
ohhhh great find! cc @happy-kitchen-89482 @polite-garden-50641 @hundreds-breakfast-49010 this could be one of the culprits for why the cache is being invalidated more than we’d expect So, looking at the constructor for
PathGlobs
in
engine/fs.py
, the solution should be really simple. Change line 84 from
Copy code
python
 self.globs = tuple(globs)
to
Copy code
python
 self.globs = tuple(sorted(globs))
Globs are an
Iterable[str]
, so it’s always safe to sort (or we got bigger problems) and sorting strings is cheaper than bad cache misses
👍 2
h
yeah that's a super-quick fix
h
Specifically, the ordering won’t be stable if the passed
globs
were not deterministic, e.g. if you passed a set to the constructor Even if you passed a deterministic data structure like a tuple, we will still get more cache misses than we’d like. target1 with
['f1.py', 'f2.py']
sources and target2 with
['f2.py', 'f1'.py']
would be treated as distinct for no good reason
h
you wanna write that up and then I can review it @hundreds-father-404?
👍 1
it'd be interesting to see how much improvement this fix gets us
👍 1
h
I suspect it’s only one of many culprits. But, one down gets us closer and it’s extremely low hanging fruit 🎉
a
Wow, that was fast - thanks @hundreds-father-404!!
❤️ 1
w
good find, thanks folks.
❤️ 1
h
Great stuff
❤️ 1
w
added one comment on https://github.com/pantsbuild/pants/pull/9347 ... we have a test for this, so would be good to get a repro from @average-vr-56795
a
The repro was literally:
./pants --native-engine-visualize-to=/some/path list 3rdparty:
in the pants repo
And the problem wasn’t the ordering of the Snapshot
w
yea, Eric followed up on the ticket: i understand now
a
It was the ordering of the include-elements in the
PathGlobs
w
inputs vs outputs
👍 1
a
Cool 🙂
w
thanks!