Question to the community: One of the features we...
# general
h
Question to the community: One of the features we're looking at for 2.0 is not requiring so much BUILD file boilerplate. For example, we can infer dependencies in most cases by analyzing import statements, and sometimes we can even infer the targets themselves (for example, if you do `./pants binary path/to/main.py` we know what you mean even if there is no `python_binary` target). See https://bit.ly/3d3wHq1 for more detail. How valuable would you find such a feature on a scale of 1 (not at all valuable) to 5 (extremely valuable)?
w
the semantics would be to match the `import` semantics of the language as closely as possible: so if you import something, it’s a dependency.
taking python as an example, it would be as if “all source root entries” are pythonpath entries. the main difference from python imports is that we would likely not rely on “ordering” the way that the pythonpath does, and instead require that a dep be explicitly specified if there was more than one source of it.
so very minimal magic: if you understand your language’s imports, you understand the behavior.
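A minimal sketch of the idea described above, not Pants' actual implementation: collect the imports from a source file, look each one up in a table of which files provide which module, and refuse to guess when more than one file provides it. The names `imported_modules`, `infer_deps`, and the `owners` table are hypothetical and exist only for illustration.

```python
import ast
from typing import Dict, List, Set


def imported_modules(source: str) -> Set[str]:
    """Collect candidate absolute module names imported by a Python source string."""
    modules: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # `from pkg import name` may refer to module `pkg` or to submodule `pkg.name`,
            # so both are recorded as candidates.
            modules.add(node.module)
            modules.update(f"{node.module}.{alias.name}" for alias in node.names)
    return modules


def infer_deps(source: str, owners: Dict[str, List[str]]) -> Set[str]:
    """Map each import to its single owning file; raise if the owner is ambiguous."""
    deps: Set[str] = set()
    for module in sorted(imported_modules(source)):
        candidates = owners.get(module, [])
        if len(candidates) == 1:
            deps.add(candidates[0])
        elif len(candidates) > 1:
            # Unlike PYTHONPATH, no ordering is used to pick a winner.
            raise ValueError(
                f"ambiguous owners for {module!r}: {candidates}; declare the dep explicitly"
            )
        # Imports with no known owner (stdlib, for example) are skipped in this sketch.
    return deps
```

In this sketch the `owners` table plays the role of "all source root entries": it would be built by walking each source root and recording the module name each file corresponds to.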
a
seems nice, but i've seen enough edge cases in building our BUILD file autoupdate tooling that i'm skeptical it could be done in a manner that was performant in large repositories, and i'd be worried about the potential edge cases
w
that’s part of the reason we’ve been pushing on pantsd. relatively little needs to be kept warm in memory to get equivalent performance to having BUILD files on disk
a
like, wouldn't this mean that to run `./pants dependees` etc., pants would have to essentially parse every line of python in the repository, install every possible third party dependency, and generate all generated code? cacheable, sure, but...
w
@astonishing-jelly-60479: re: 3rdparty code: no, see https://github.com/pantsbuild/pants/pull/10025 for how that works
for dependees, yes. but “parsing every line of python code” is similar to “parsing every BUILD file”
cheaper even, because BUILD files are interpreted
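To illustrate why dependees needs the whole repository: answering "what depends on X" means inverting the forward dependency graph, so every file's dependencies have to be known first, whether they come from BUILD files or from inference. A rough sketch, assuming a hypothetical `dependencies` dict from each file to the files it depends on (this is not Pants' internal representation):

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def invert(dependencies: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Turn a file -> dependencies mapping into a file -> direct dependees mapping."""
    dependees: Dict[str, Set[str]] = defaultdict(set)
    for target, deps in dependencies.items():
        for dep in deps:
            dependees[dep].add(target)
    return dict(dependees)


def transitive_dependees(roots: Iterable[str], dependencies: Dict[str, Set[str]]) -> Set[str]:
    """Walk the inverted graph to find everything that depends on any root."""
    direct = invert(dependencies)
    seen: Set[str] = set()
    stack = list(roots)
    while stack:
        current = stack.pop()
        for dependee in direct.get(current, ()):
            if dependee not in seen:
                seen.add(dependee)
                stack.append(dependee)
    return seen
```

The inversion itself is cheap; the cost being debated here is in producing the forward mapping for every file, which is the part that can be cached.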
a
ah, i would've thought that BUILD file parsing would've been something that was either done in native code or would eventually be done in native code
the dependency inference code is interesting i suppose but sounds painful at scale if every package has to have its entrypoint specified... though i guess we could autogenerate that from the tooling we have now
w
every package has to have its entrypoint specified
what do you mean?
a
if you have to specify `module_mapping` for every python requirement you bring in
w
if the import doesn’t match the name of the requirement, yea.
but that’s a one time cost per dependency that saves per usage
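A small sketch of what such a mapping buys, assuming a default where the importable module is the requirement name lowercased with dashes turned into underscores. The helper names and that default rule are assumptions for illustration, not the behavior of the linked PR. The two mapped entries are real cases where the names differ: `PyYAML` is imported as `yaml`, and `ansicolors` is imported as `colors`.

```python
from typing import Dict, List, Optional


def modules_for_requirement(name: str, module_mapping: Dict[str, List[str]]) -> List[str]:
    """Return the top-level modules a requirement provides, using the mapping when present."""
    if name in module_mapping:
        return module_mapping[name]
    # Assumed default: the import name matches the normalized requirement name.
    return [name.lower().replace("-", "_")]


def requirement_for_module(
    module: str, requirements: List[str], module_mapping: Dict[str, List[str]]
) -> Optional[str]:
    """Find which requirement, if any, provides the imported module."""
    top_level = module.split(".")[0]
    for requirement in requirements:
        if top_level in modules_for_requirement(requirement, module_mapping):
            return requirement
    return None


# One entry per requirement whose import name differs, written once and reused by every importer.
module_mapping = {"PyYAML": ["yaml"], "ansicolors": ["colors"]}
```

With this table, `requirement_for_module("yaml.parser", ["requests", "PyYAML"], module_mapping)` resolves to `PyYAML`, while `requests` needs no entry at all.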
a
fair
h
Plus we (Toolchain) may offer an API that generates that mapping for you
Apart from performance are there other reasons you might prefer mostly-autogenerated checked-in dependencies in BUILD files over inferred deps at pants runtime?
c
You say that in “most cases” you could infer. Would it be clear when it can't? That's my main worry as a user: many exceptions to the rules.
Apart from that concern I'd be interested in this
👍 1
a
it sounds like it'd be great for new projects, but i'm worried about migrating to it. it'd be a decent adjustment for users, and we've invested a lot of energy in building a CI system around pants' quirks, avoiding any kind of persistent caching due to inadequate or incorrect fingerprinting in prior versions or in certain codegen plugins, etc., and having a choice between a shared cache/background daemon versus long startup times and parsing python files with all of the myriad edge cases that brings makes me a bit nervous. on its own, probably not a big deal. as one more thing that's changing with a v2 migration, it's stressful.
additionally, inferring deps means the build graph can change without any actual source code or BUILD file changes (i.e. because of a pants upgrade). like, if inference rules change because of a bug fix, or worse, because of a regression. i'm anticipating supporting production issues where users are asking "why is my job failing because of this ImportError" and having practically no way to track down the root cause without a line-by-line review of all of pants and pex
w
as one more thing that’s changing with a v2 migration, it’s stressful.
one thing that you should know is that this will absolutely be optional, and independent from pantsd. so very decoupled from a 2.0 upgrade
additionally, inferring deps means the build graph can change without any actual source code or BUILD file changes (i.e. because of a pants upgrade).
agreed. in this regard, it’s a lot like changing the version of a linter, or of mypy or pytest
it’s particularly similar to mypy though… in compiled languages, compiler version bumps frequently mean new dependencies are necessary.
one thing that we could consider doing would be decoupling the version of the dependency inferencer from the version of pants, but i expect that code to be relatively stable, because python’s import semantics are
You say that in “most cases” you could infer. Would it be clear when it can’t? That’s my main worry as a user: many exceptions to the rules.
@crooked-gpu-88495: this is a good point, and that’s probably where the majority of the usability lies. part of what would be so different about a change like this is that in some cases users might not learn about BUILD files for a while… they might become an “advanced topic”, where you only need a BUILD file if something is failing to import.
i think that if “where i need to create the BUILD file” and “which dep i need to add” are clear enough though, it might be a net benefit for teachability, because new users don’t need to learn those topics until later
thanks for the feedback everyone!
c
I haven’t had time to catch up on the conversation, but I will say that one nice thing about having dependencies specified explicitly is that it makes it very easy to understand what depends on what. As long as there is a subcommand that answers the question “what are the dependencies for this target?”, I think this should be fine.
w
yea, `./pants dependencies` will continue to do that
👍 1
c
@witty-crayon-22786 That command doesn’t seem to exist in v1.25? Is this new? I see `dependees`, but not `dependencies`.
w
… mm, that is a v2-migration rough edge. i believe that in a v2-only install it’s called `dependencies2` in older versions… it’s taken back the name `dependencies` in more recent versions, iirc.
c
I don’t see `dependencies` at all in my `pants goals` output:
Use `./pants help $goal` to get help for a particular goal.

   bash-completion: Generate a Bash shell script that teaches Bash how to autocomplete pants command lines.
            binary: Create a runnable binary.
         bootstrap: Bootstrap tools needed by subsequent build steps.
          buildgen: Automatically generate BUILD files.
            bundle: Create a deployable application bundle.
         clean-all: Delete all build products, creating a clean workspace.
              cloc: Print counts of lines of code.
           compile: Compile source code.
  deferred-sources: Map `remote_sources()` to files that produce the product `UnpackedArchives`.
         dependees: List all targets that depend on any of the input targets.
           filemap: Print a mapping from source file to the target that owns the source file.
            filter: Filter the input targets based on various criteria.
               fmt: Autoformat source code.
               gen: Generate code.
                go: Runs an arbitrary go command against zero or more go targets.
            go-env: Runs an arbitrary command in a go workspace defined by zero or more go targets.
       kill-pantsd: Terminate the pants daemon.
        killserver: Kill the reporting server.
              lint: Find formatting errors in source code.
             login: Task to auth against some identity provider.
          minimize: Print a minimal covering set of targets.
      node-install: Installs a node_module target into the directory that the target is defined in.
           options: Display meta-information about options.
              path: Find a dependency path from one target to another.
             paths: List all dependency paths from one target to another.
              repl: Run a REPL.
           resolve: Resolve external binary dependencies.
               run: Invoke a binary.
            server: Run the reporting server.
          setup-py: Generate setup.py-based Python projects.
              sort: Topologically sort the targets.
           targets: List available target types.
     unpack-wheels: Extract native code from `NativePythonWheel` targets for use by downstream C/C++ sources.
(v1.25.0)