# general
r
Thoughts on adding Git integration to introspection goals like `./peek`? Might be a bit scope creepy but very useful for CI + there's already subsystem logic for Git. One use case I have in mind is raising a CI lint error if a Docker image target changes but its version number hasn't changed.
h
Can you give an example? Not sure I understand what this would do beyond what `--changed-since` does?
r
The actual use case here is to figure out what the tag change was between two revisions for a single target. Simplest approach would be similar to `gsw master && ./pants peek` + some scripting.
main_tag=$(./pants peek --git-ref=main :tgt | jq ..)
head_tag=$(./pants peek :tgt | jq ..)
I could imagine something more advanced than that to reflect on changes discovered in `changed-since`. But much less trivial.
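A minimal sketch of the "some scripting" part, written in Python instead of jq. It assumes each `./pants peek` call prints a JSON list of target objects with an `address` key, and that the version lives in a field named `image_tags`; both should be checked against real peek output. `find_target` and `version_bump_missing` are hypothetical helper names, and `--git-ref` in the snippet above is the proposed option, not an existing one.

```python
import json


def find_target(peek_output: str, address: str) -> dict:
    """Return the entry for `address` from a JSON list of peeked targets."""
    for target in json.loads(peek_output):
        if target.get("address") == address:
            return target
    raise KeyError(address)


def version_bump_missing(base: dict, head: dict, version_field: str = "image_tags") -> bool:
    """True if some other field differs while the version field stayed the same.

    `version_field` defaults to "image_tags", which is an assumption about
    where the version is stored.
    """
    other_fields = (set(base) | set(head)) - {version_field}
    changed = any(base.get(name) != head.get(name) for name in other_fields)
    return changed and base.get(version_field) == head.get(version_field)
```

In CI this would be fed the peek output captured at the two revisions and fail the build when it returns True. It only compares fields that appear in the peek output, so a change that is not reflected there would go unnoticed.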
h
So pants would change the state of the repo via git commands, at least temporarily? That would be a bold step…
r
I don't mean actually changing the git state. It's just how you could accomplish the same right now. That's what I hope to avoid.
h
So it would read file content from git instead of from the filesystem, something like that?
r
Yep. I assumed it would be more straightforward than `--changed-since`
h
Hmm, I don’t know how straightforward that would be? How would you know which git state to look at?
r
Well I mean I assume `--changed-since=main` has all the same technical challenges and more? It has to read the merge base git ref and then compute some logical diff between that and the worktree. `./pants peek --git-ref=main` would be the same thing minus the logical diff. Unless I'm way off about how `--changed-since` works
h
Ah, `--changed-since` only looks at which paths have changed, but not their content.
Not their previous content, I mean
Doing “peek at some gitref” correctly would require checking the entire repo out at that gitref
Maybe in a tmpdir, so we don’t pull the rug out from under Pants at its current state
So it’s more involved, and is ~equivalent to doing “git reset --hard <gitref> && ./pants peek”
Unless there’s something I’m not seeing
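For reference, the tmpdir checkout idea can be approximated today, outside of Pants, with `git worktree`, which leaves the current working tree and index untouched. A rough sketch, assuming `git` is on PATH; `run_at_ref` is a hypothetical helper, not part of Pants.

```python
import subprocess
import tempfile
from pathlib import Path


def run_at_ref(repo: Path, git_ref: str, command: list[str]) -> str:
    """Run `command` inside a throwaway checkout of `git_ref` and return its stdout."""
    with tempfile.TemporaryDirectory() as tmp:
        checkout = Path(tmp) / "checkout"
        # Create a detached worktree so no branch is created or moved.
        subprocess.run(
            ["git", "-C", str(repo), "worktree", "add", "--detach", str(checkout), git_ref],
            check=True,
            capture_output=True,
        )
        try:
            result = subprocess.run(
                command, cwd=checkout, check=True, capture_output=True, text=True
            )
            return result.stdout
        finally:
            # Unregister the worktree before the temporary directory is deleted.
            subprocess.run(
                ["git", "-C", str(repo), "worktree", "remove", "--force", str(checkout)],
                check=True,
                capture_output=True,
            )
```

Something like `run_at_ref(Path("."), "main", ["./pants", "peek", ":tgt"])` would then stand in for the proposed `--git-ref=main`, assuming the `./pants` launcher is checked in at that ref. Pants runs in a separate directory here, so the state of the current checkout is not disturbed.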
r
"`--changed-since` only looks at which paths have changed, but not their content."
I'm a bit confused here. Are you saying it doesn't even look at the contents of BUILD files?
Because for example I have a simple git repo with pants installed and the following history
commit f070e507e8b4bb2d4f44549d00be9f5936d1bd50
Author: Navneeth Jayendran <navneeth.jayendran@affirm.com>
Date:   Wed Nov 16 19:04:40 2022 -0800

    Update message

diff --git a/README.md b/README.md
index 716ed14..dd880bc 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,3 @@
 # Hello world
+
+Add some more content

commit 2d244ca0497b53065235b6f3efa0a9410a3de3e8
Author: Navneeth Jayendran <navneeth.jayendran@affirm.com>
Date:   Wed Nov 16 19:04:23 2022 -0800

    Initial commit

diff --git a/BUILD.pants b/BUILD.pants
new file mode 100644
index 0000000..2666440
--- /dev/null
+++ b/BUILD.pants
@@ -0,0 +1,4 @@
+file(
+    name="README",
+    source="README.md"
+)
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..716ed14
--- /dev/null
+++ b/README.md
@@ -0,0 +1 @@
+# Hello world
diff --git a/pants.toml b/pants.toml
new file mode 100644
index 0000000..8d269d8
--- /dev/null
+++ b/pants.toml
@@ -0,0 +1,2 @@
+[GLOBAL]
+pants_version = "2.14.0"
If I run `./pants list --changed-since HEAD~`, it shows that `//:README` changes. The only change since the previous commit was to modify the README file contents
h
Yes, because Pants sees that the BUILD file has changed, so it invalidates it, but it doesn’t diff against the previous content. For example, if you add a space to an existing BUILD file, all the targets in it will be invalidated.
`--changed-since` behaves as if you had hand-edited all the files that have changed since the gitref
When you hand-edit a file, Pants has no way of knowing what it previously held
Generally speaking, Pants is agnostic to git state, except for `--changed-since`
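A toy model of the path-based behaviour described here, not Pants's actual implementation. It assumes a precomputed `owners` mapping from each target address to the paths that affect it (its sources plus the BUILD file that defines it); `changed_targets` and the `//:OTHER` target are made up for illustration.

```python
def changed_targets(changed_paths: set[str], owners: dict[str, set[str]]) -> set[str]:
    """Return every target that owns at least one changed path.

    Only path names are consulted; the previous file contents never are.
    """
    return {address for address, paths in owners.items() if paths & changed_paths}


# Hypothetical ownership data for a repo like the one in the example above.
owners = {
    "//:README": {"README.md", "BUILD.pants"},
    "//:OTHER": {"other.txt", "BUILD.pants"},
}

# Editing README.md flags only the target that owns it.
readme_edit = changed_targets({"README.md"}, owners)

# Touching the BUILD file, even with a whitespace-only edit,
# flags every target defined in it.
build_edit = changed_targets({"BUILD.pants"}, owners)
```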
r
Ah... that's not what I expected 😢 . I thought it was something like https://github.com/Tinder/bazel-diff
c
Aha, so they do what Benjy suggests as well, checking out the entire project in some tmp location to extract relevant information..
`bazel-diff` works as follows
• The previous revision is checked out, then we run `generate-hashes`. This gives us the hashmap representation for the entire Bazel graph, then we write this JSON to a file.
• Next we checkout the initial revision, then we run `generate-hashes` and write that JSON to a file. Now we have our final hashmap representation for the Bazel graph.
• We run `bazel-diff` on the starting and final JSON hash filepaths to get our impacted set of targets. This impacted set of targets is written to a file.
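The last step boils down to comparing two target-to-hash maps. A minimal sketch of that comparison, assuming each JSON file deserializes to a flat `{target: hash}` dict; `impacted_targets` is a hypothetical name and this is not bazel-diff's own code.

```python
def impacted_targets(start: dict[str, str], final: dict[str, str]) -> set[str]:
    """Targets in the final graph whose hash is new or differs from the start.

    Assumption: targets that exist only in `start` (i.e. were removed)
    are not reported.
    """
    return {target for target, digest in final.items() if start.get(target) != digest}


# Made-up hashes for illustration.
start_hashes = {"//app:bin": "aaa", "//lib:core": "bbb", "//lib:old": "ccc"}
final_hashes = {"//app:bin": "aaa", "//lib:core": "ddd", "//lib:new": "eee"}

impacted = impacted_targets(start_hashes, final_hashes)
```

The contrast with `--changed-since` is that the hashes are computed from the graph at both revisions, so the previous state is actually inspected.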