Having a couple issues getting started with builds (Pants #general)

nice-florist-55958

11/13/2021, 12:44 AM

Having a couple of issues getting started with builds. I have a test project called casper with a ./src/casper/<pkg, mod, etc.> structure and a "src/" pattern for the root. When using ./pants run <target> on a target located in ./src/casper/pkg/subpkg, I get a module-not-found error on import casper.pkg.subpkg. I would have assumed Pants would place ./src on the PYTHONPATH (or perhaps install the package in the hermetic env to make it self-importable)? I don't know if it makes a difference, but the import was in the __init__.py of pkg. The second issue is that when I make any changes to the offending file, Pants does not seem to pick them up: the build and run goals keep using the same stale version of the file.

enough-analyst-54434

11/13/2021, 12:46 AM

1st issue: run ./pants roots to see what Pants guesses for PYTHONPATH entries.
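
For readers following along, here is a minimal sketch of why the source root matters for this import. It is plain Python, not Pants itself: it builds a throwaway src/casper/pkg/subpkg layout and checks which sys.path entry makes import casper.pkg.subpkg resolve. The helper names make_demo_repo and can_import are made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_demo_repo(root: Path) -> Path:
    """Create src/casper/pkg/subpkg with empty __init__.py files; return the source root."""
    src = root / "src"
    subpkg_dir = src / "casper" / "pkg" / "subpkg"
    subpkg_dir.mkdir(parents=True)
    for package_dir in (src / "casper", src / "casper" / "pkg", subpkg_dir):
        (package_dir / "__init__.py").write_text("")
    return src


def can_import(module: str, path_entry: Path) -> bool:
    """Try importing `module` with `path_entry` temporarily prepended to sys.path."""
    sys.path.insert(0, str(path_entry))
    try:
        importlib.invalidate_caches()
        importlib.import_module(module)
        return True
    except ModuleNotFoundError:
        return False
    finally:
        sys.path.remove(str(path_entry))
        # Drop cached casper modules so each attempt starts from a clean state.
        for name in [m for m in sys.modules if m == "casper" or m.startswith("casper.")]:
            del sys.modules[name]


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    src_root = make_demo_repo(repo)
    # With the source root (./src) on the path, the dotted import resolves.
    with_src_root = can_import("casper.pkg.subpkg", src_root)
    # With only the repo root (./) on the path, there is no top-level "casper".
    with_repo_root = can_import("casper.pkg.subpkg", repo)

print(with_src_root, with_repo_root)
```

If ./pants roots lists src, the first case is the one that should apply to first-party code.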

hundreds-father-404

11/13/2021, 12:46 AM

"but the import was in __init__.py of pkg"

It might be this gotcha about __init__.py files: https://www.pantsbuild.org/v2.8/docs/python-backend Have you seen that yet?

nice-florist-55958

11/13/2021, 12:54 AM

@enough-analyst-54434 I get back src and tests, so "./src" should be on the path and import casper.pkg.subpkg should be possible? Curiously, when packaging, it gives me ./dist/src.casper.pkg.subpkg/target_module.pex. Is the src. prefix an indication that it only put ./ on the path?

hundreds-father-404

11/13/2021, 12:55 AM

"./dist/src.casper.pkg.subpkg/target_module.pex. Is the src. prefix an indication that it only put ./ on the path?"

That part is a red herring. See the output_path field: https://www.pantsbuild.org/v2.8/docs/reference-pex_binary#codeoutput_pathcode
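
To illustrate why the src. prefix says nothing about the import path: as far as the linked reference describes it, the default output location is just the BUILD file's directory with slashes turned into dots, followed by the target name. The function below is a hypothetical re-creation of that naming rule for illustration, not code taken from Pants.

```python
def default_output_path(spec_path: str, target_name: str, ext: str = ".pex") -> str:
    """Sketch of the default naming: BUILD-file directory with "/" replaced by ".", then the target name.

    `default_output_path` is a made-up helper; the real logic lives inside Pants.
    """
    prefix = spec_path.strip("/").replace("/", ".")
    filename = f"{target_name}{ext}"
    return f"{prefix}/{filename}" if prefix else filename


path = default_output_path("src/casper/pkg/subpkg", "target_module")
print(path)
```

The name is derived from where the target is declared in the repo, which is why it starts from the build root rather than from the source root.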

nice-florist-55958

11/13/2021, 1:15 AM

@hundreds-father-404 Gotcha. Per your suggestion, I enabled the first option, but the second option, python.tailor_ignore_solitary_init_files = false, appears to be available only on 2.8 (I'm using 2.7.1). Even with the first option enabled, and after recreating the BUILD files and blanking out the init file, the build/package goals still use the original init file with the bad import. Is Pants not invalidating the cache of that file when I make changes?

nice-florist-55958

11/13/2021, 1:26 AM

I attempted this again on a fresh tailor of ./src/casper/pkg2/subpkg2 with only vacuous init files, and I'm having the same import casper.xx issues. I'm also having the same issue of Pants using the original files to build/package despite the changes made.

nice-florist-55958

11/13/2021, 1:33 AM

So I tried again with a package that only imports stdlib packages, and that works. I then tried one that imports a third-party package, pandas, and I'm having import issues. Maybe I am misunderstanding some things, so I'll go back and read over the documentation before posting more, but I thought that Pants would infer the dependencies, create an isolated venv with the required dependencies from that, and then perform the run goal (and package that venv in a pex file when disting). Admittedly I was probably being too optimistic here. I see now there is some mention of having to teach Pants about your dependencies through a requirements.txt file, and I have not yet begun to look at the target-specific dependency declarations.

nice-florist-55958

11/13/2021, 1:52 AM

Ah, so: https://www.pantsbuild.org/v2.8/docs/python-third-party-dependencies#requirementstxt. Dependency inference is only done within the universe you define up-front.

nice-florist-55958

11/13/2021, 1:53 AM

Is it a feature for Pants to not warn you about an import it cannot find in your declarations during the import inference stage? Is it meant to be caught only at run-time?

enough-analyst-54434

11/13/2021, 1:59 AM

Yeah - we do not do magic - you have to declare 3rdparty dependencies via requirements or elsewise - we will not guess a version for you.

👍 2
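
A rough sketch of the idea that inference only works within the declared universe. This is not how Pants implements it; it just parses imports with the standard ast module and matches their top-level names against names you have declared. The function names, the first-party set, and the declared-requirements set are assumptions for illustration, and stdlib handling is left out.

```python
import ast
from typing import Set, Tuple


def top_level_imports(source: str) -> Set[str]:
    """Collect the top-level package name of every absolute import in `source`."""
    names: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module.split(".")[0])
    return names


def infer(source: str, first_party: Set[str], declared: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Split imports into those matching a known owner and those with no owner."""
    imports = top_level_imports(source)
    known = first_party | declared
    return imports & known, imports - known


SOURCE = """\
import numpy as np
import pandas as pd
from casper.pkg import subpkg
"""

# Hypothetical universe: casper is first-party, only numpy is in requirements.txt.
matched, unmatched = infer(SOURCE, first_party={"casper"}, declared={"numpy"})
print(sorted(matched), sorted(unmatched))
```

Here pandas ends up in the unmatched set: nothing declares it, so nothing can be inferred for it, and the failure only shows up when the import actually runs.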

enough-analyst-54434

11/13/2021, 1:59 AM

As to the silence on missed imports - that was just made better 2 days ago: https://github.com/pantsbuild/pants/pull/13491

👍 1

enough-analyst-54434

11/13/2021, 2:00 AM

See the block here: https://github.com/pantsbuild/pants/pull/13491/files#diff-b5f1dd9557ee0bdb5054c727d4620db34752fec1c7851c9040bdb1be8ee564faR213 That'll be in 2.9.x - not sure it's getting backported.

hundreds-father-404

11/13/2021, 2:02 AM

Yeah, no plans to backport. We'll be doing a stable release of 2.8, likely on Monday, assuming no bugs are reported before then. Taylor, to check, are things working now? Anything we can help with?

nice-florist-55958

11/13/2021, 2:09 AM

I found the macro to use requirements.txt, but again stuck in the situation where Pants won't invalidate its caches when I make file changes. I wonder if this has anything to do with my files being checked out in my NFS home directory, but the .cache/ being on the local host in /var/tmp?

enough-analyst-54434

11/13/2021, 2:21 AM

I don't think it has to do with separate locations. I'm almost positive it is exactly due to NFS.

enough-analyst-54434

11/13/2021, 2:22 AM

You should be able to quickly test this, but we use inotify and the like and these do not see NFS edits (since they're remote) IIUC.

nice-florist-55958

11/13/2021, 4:32 AM

Is there any way to “wake it up” (like an event loop)?

fast-nail-55400

11/13/2021, 2:09 PM

Discussion on Stack Overflow suggests the cause is that NFS has no protocol support for notification events: https://stackoverflow.com/questions/4231243/inotify-with-nfs

fast-nail-55400

11/13/2021, 2:10 PM

(since NFS is several decades old and predates even the existence of the Linux kernel)

nice-florist-55958

11/13/2021, 3:22 PM

Right, but can Pants be instructed to scan/recompute hashes for a given file, dir, project manually? Or maybe it needs a polling feature. For example, PyCharm complains about file syncing because it can't place file watchers on NFS mounts, but we still manage because it periodically polls the project files and you can force resync.

fast-nail-55400

11/13/2021, 3:43 PM

probably needs an opt-in polling feature

👍 1
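
For what a polling fallback could look like in principle, here is a minimal sketch: hash every file under a directory, take a second snapshot later, and diff the two. This is an illustration only, not something Pants offers or how its Rust file watcher works; snapshot and changed_paths are made-up names.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, Set


def snapshot(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    return {
        str(path.relative_to(root)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_paths(old: Dict[str, str], new: Dict[str, str]) -> Set[str]:
    """Return paths that were added, removed, or whose contents changed."""
    return {path for path in old.keys() | new.keys() if old.get(path) != new.get(path)}


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.py").write_text("x = 1\n")
    (root / "b.py").write_text("y = 1\n")
    before = snapshot(root)

    # Simulate edits that a watcher on an NFS mount would never be told about.
    (root / "a.py").write_text("x = 2\n")
    (root / "c.py").write_text("z = 1\n")
    after = snapshot(root)

    changes = changed_paths(before, after)

print(sorted(changes))
```

Content hashing avoids relying on notification events at all, at the cost of re-reading files on every poll, which is presumably why it would need to be opt-in.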

nice-florist-55958

11/13/2021, 4:32 PM

In the meantime, could I kill whatever server processes Pants gins up to listen for file changes, so that it is forced to re-examine the files?

enough-analyst-54434

11/13/2021, 4:33 PM

Yeah, which is more easily accomplished with --no-pantsd. Performance will not be good though.

➕ 1

nice-florist-55958

11/13/2021, 4:51 PM

Does that option disable caching? Is keeping the cache and killing the server a better option if I'm willing to trade longer init times for faster goal completion?

nice-florist-55958

11/13/2021, 5:01 PM

So I tried both ways: with --no-pantsd (working now!), and without setting the option but modifying the pants script to kill the server PID at exit. Since the proto project is trivial, I don't know if "poor performance" just means slow init time or if Pants is starting from scratch on each goal run?

nice-florist-55958

11/13/2021, 5:02 PM

Also, is the inotify logic in the rust or python codebase? Wondering how difficult architecturally polling would be to implement (or force resync) for the server.

hundreds-father-404

11/13/2021, 5:19 PM

It does not disable on-disk caching, only in-memory memoization. But that memoization is pretty important to avoid redoing things like "Creating module mapping" for dependency inference. Rather than killing the server, it'd be simpler to use --no-pantsd so you don't start the server in the first place. The file watching is implemented in Rust. I'm not familiar enough w/ that part of the code to know how tricky it would be, but definitely possible!

hundreds-father-404

11/13/2021, 5:21 PM

maybe open a feature request to support NFS by adding polling for file watching? Can link to https://github.com/pantsbuild/pants/issues/10842, where we talk about adding polling for paths outside of the "build root" (the repo)

nice-florist-55958

11/13/2021, 5:39 PM

@hundreds-father-404 Thanks! I'll add this request to my list of things I'm keeping as we delve into this over the weekend. I think local performance is really important (for developers to run goals before committing), but our CI jobs would always be starting from disk (at best) and so the initialization lag is already going to be a fact of life there. They download the Git workspace and at some point run your custom build logic. The intention is to have Pants already bootstrapped on an NFS drive the build uid can access, so when it starts running goals using the pants script, it will not need to re-bootstrap itself and have access to the cache, which will hopefully be in a state based on the last successful release. Still need to sort out the details though. If you have any examples of people deploying Pants under similar constraints that would be helpful!

enough-analyst-54434

11/13/2021, 10:00 PM

I'm slightly confused by your scenario description since it sounds ~opposite of what we've been working through here, where the code repo is on NFS. Here you describe ~/.cache/pants on NFS IIUC. Pants has native support for a remote / shared cache: https://www.pantsbuild.org/docs/remote-caching-execution. The Pants OSS CI jobs dogfood this with a remote cache hosted / donated by Toolchain in fact.
