A question about dependencies. I have a `3rdparty...
# general
b
A question about dependencies. I have a `3rdparty/requirements.txt` file with `python_requirements()` in the sister BUILD file. So, for example:
❯ ./pants dependencies 3rdparty:boto3
3rdparty:requirements.txt
Now suppose I change the version of boto3 in `requirements.txt`. Running `./pants list --changed-since=HEAD --changed-dependees=transitive` will return, well, basically everything. On some level this makes sense (everything has a 3rd party requirement), but how can I filter on just those files that are affected by the change to that single requirement?
h
Hmm interesting question! At the git level we see "this file changed", not "this single requirement changed", and as you surmised, the potential blast radius of the former is larger than the actual blast radius of the latter.
I can think of a hacky way to do this.
For example, assuming your requirements.txt is very regular (one requirement per line, with nothing preceding the project name), this would list the requirement targets that have changed:
git diff -U0 | grep -E -o "^[-+]\w+" | sed 's/^./3rdparty:/g' | sort | uniq
So this would list the targets affected:
git diff -U0 | grep -E -o "^[-+]\w+" | sed 's/^./3rdparty:/g' | sort | uniq | \
  xargs ./pants dependees --transitive
But that seems somewhat fragile
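A slightly less fragile variant of the same idea, as a minimal Python sketch. It assumes the diff text comes from something like `git diff -U0 -- 3rdparty/requirements.txt`, that each requirement sits on its own line, and that targets are addressed as `3rdparty:<project name>` as in the example above. `changed_requirement_targets` is a hypothetical helper name, not part of Pants.

```python
import re

# Project name at the start of a requirement line. Unlike \w+ this also
# accepts hyphens and dots (e.g. "google-cloud-storage").
_NAME = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def changed_requirement_targets(diff_text, prefix="3rdparty:"):
    """Return sorted target addresses for requirement lines added or removed in a unified diff."""
    targets = set()
    for line in diff_text.splitlines():
        # Skip the "---"/"+++" file headers and anything that is not an added/removed line.
        if line.startswith(("---", "+++")) or not line.startswith(("-", "+")):
            continue
        match = _NAME.match(line[1:].strip())
        if match:  # comments, blank lines and option lines such as "-r other.txt" do not match
            targets.add(prefix + match.group(0))
    return sorted(targets)
```

The returned addresses could then be passed to `./pants dependees --transitive`, as in the pipeline above. This still breaks if the generated target names differ from the names written in the file.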
b
Thanks. I suppose I was hoping there was a diff that would apply after transmuting `requirements.txt` into the individual `python_requirement(...)` targets. More generally, can `pants` compute two subgraphs of targets, using the same specification but for two different git revisions, and then compute the diff?
h
That would require pants to actually check out two different git states, which is not a path we've gone down
c
If the blast radius is large and you have a huge requirements.txt file, you could split it up into smaller files, so that changing only one of those files gives a smaller blast radius. Also a bit hackish, but it may work if there is a sensible way to partition your 3rdparty deps.
b
I see, thanks for clarifying. I suppose I like to think in terms of targets: how have they changed between git refs? Whereas right now `--changed-since` is merely a file selector, and one can only pivot from that to asking what targets are impacted by those files having been changed. Probably my point of view is a lot more work to implement 🙂 I thought I could solve this by listing everything in `requirements.txt` individually as a `python_requirement(...)` in the sister BUILD file. Alas, it does not, for the above reasons. I'll have to think a bit more about what the desired behavior is here, and perhaps try writing a plugin if that appears doable.
h
The problem here is that git diffs are intrinsically file level. The file->target mapping may not be straightforward, given things like macros. So if you want to go to the target level, you have to actually be checked out at the git sha and run some pants command at that sha, then check out the other git sha and run the pants command at that new sha, and then diff the two results
So question 1 would be: are you willing to have Pants change your git state out from under you?
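The final "diff the two results" step can be sketched in a few lines of Python. This assumes you have already captured, at each git sha, a mapping from target address to some fingerprint of its definition. How that mapping gets produced is left open here, and `diff_targets` and the fingerprint values are hypothetical.

```python
def diff_targets(before, after):
    """Compare two {address: fingerprint} snapshots taken at different git shas."""
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    # A target present in both snapshots counts as changed when its fingerprint differs.
    changed = sorted(
        addr for addr in set(before) & set(after) if before[addr] != after[addr]
    )
    return {"added": added, "removed": removed, "changed": changed}
```

For a version bump of a single requirement, only that requirement's target would land in `changed`, which is the narrower blast radius asked about at the top of the thread.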
e
We could probably `git clone --reference` or `git clone --shared` into another spot and operate against that to get this info if we wanted to go down that route.
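A rough Python sketch of the `git clone --shared` route, so the working checkout is never touched. It assumes `rev` is a commit sha that exists in the local repository. The helper names and the temp-directory prefix are made up for illustration, and nothing here is existing Pants behavior.

```python
import subprocess
import tempfile


def shared_clone_commands(repo_dir, dest_dir, rev):
    """Build the git invocations for a shared clone checked out at `rev`."""
    return [
        # --shared borrows objects from repo_dir instead of copying them.
        ["git", "clone", "--shared", "--no-checkout", repo_dir, dest_dir],
        # Detach at the requested revision so no branch in the clone is moved.
        ["git", "-C", dest_dir, "checkout", "--detach", rev],
    ]


def checkout_elsewhere(repo_dir, rev):
    """Clone repo_dir into a fresh temp directory at `rev` and return that directory."""
    dest_dir = tempfile.mkdtemp(prefix="pants-rev-")
    for cmd in shared_clone_commands(repo_dir, dest_dir, rev):
        subprocess.run(cmd, check=True)
    return dest_dir
```

Running the same Pants command inside the returned directory and in the current checkout would give the two results to compare. Note that a shared clone depends on the original repository's object store, so it should be treated as short-lived.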
c
Another option (perhaps inferior) would be to track the line numbers from a source file to the targets generated from it, and then operate only on the generated targets whose lines changed in that source file.
b
Thanks to your suggestions, it ended up being pretty easy to throw together a prototype: https://gist.github.com/AlexanderVR/84c89ecdb9d845600af96b6831d5c504
c
Cool! 👍🏽
h
This is neat!
b
So, I feel like this should be the default functionality of `--changed-since`, or at least be another option like `--changed-since-exact`. If you all agree, and think the road there doesn't require a major overhaul, I'd be happy to take a pass at implementation.