Considering how hard it is to use `--changed-since...
# development
c
Considering how hard it is to use
--changed-since
with generated targets, due to there being “too many” targets affected from a single source file, I’m playing with the idea to track the source lines of targets, so we can get more granular filtering for these targets. 🧵
👀 2
As a POC for myself, to see how feasible my approach is, I’ve hardcoded in some data:
Copy code
$ ./pants peek src/python/pants/bin:bin
18:44:35.91 [INFO] Initializing scheduler...
18:44:43.22 [INFO] Scheduler initialized.
[
  {
    "address": "src/python/pants/bin:bin",
    "target_type": "python_sources",
    "definition_location": {
      "path": "src/python/pants/bin/BUILD",
      "start_line": 2,
      "end_line": 5
    },
    "dependencies": [
...
The definition location here is hardcoded in the BUILD file parser only, then that is passed on all the way to peek via a `Target`…
so if this looks good, I can finish this up with tests and proper detection of source lines (hopefully without to big perf impact, in case this should be behind a ft toggle, perhaps)
as pre work for follow up work to offer finer grained filtering based on this data…
e
Is it not the case that --changed-since, even if it acts on 1 real target that produces 100 generated targets, already short circuits on all but the changed generated targets in downstream rules? Say the 2 out of 98 that are actually new? It should at a high level since the other 98 will mix into the same cache keys they mixed into on prior runs.
c
The purpose of using
--changed-since
(from my perspective) is to exclude targets from an operation, regardless of whether there exists a prior cached result or not. Say I want to package (and publish) all dists that have changed since a previous commit. With a single
requirements.txt
file, that would be all dists that use any dep from those requirements, which is wrong. I’d prefer if I could filter out so only those dists that actually depend on a dep that was changed in that
requirments.txt
file was published. Does that make sense?
But you have a good point, opeining up for an alternative approach to solve this, and that is to filter based on if there was a cache miss or not.. although that feels a bit fragile, and fails if you wipe the cache clean.
w
doing line change detection wouldn’t work for variables or methods, unfortunately…
c
what do you mean?
w
this is one of the dirty secrets of why 111 and “more BUILD files” in general ends up being a bit better in the end.
@curved-television-6568: if a target consumes a variable, and you change the variable, you’ve changed the definition of the target
this relates to https://github.com/pantsbuild/pants/issues/11206, although that is going to be a significant undertaking
💯 1
c
Oh… I haven’t come across using variables in BUILD files yet.. but macros, being methods is of course an edge case I haven’t looked at yet
My mind is still on requirements.txt and the
python_requirements
target..
plain and simple..
however I’m starting to realise this may be a rather naive approach to the problem… It was a fun half hour experiment, but I think I’ll leave it at that for the time being..
👍 1
h
I haven’t come across using variables in BUILD files yet.
Yeah, BUILD files are full-on Python. Just we ban
import
💯 1
c
Oh, so you could actually create classes and stuff.. 🤪 and use builtins like
open
? 🤔
h
You can indeed create classes if you really want hah. But they'd have to be defined to the local BUILD file because imports are banned. I think we do ban
open
given that it will mess up caching, we scan the AST for both
import
and
open
c
Ok, cool.