qq: I've added a dependency to requirements.txt bu...
# general
g
qq: I've added a dependency to requirements.txt but the
python_requirements()
macro doesn't seem to be picking it up when I run my tests. Do I need to refresh the cache or something? Or, at a minimum, how can I debug this (is there a goal to validate what the macro expands to)?
h
Hi, where is the requirements.txt file located specifically?
g
in my
src/python
root; the other dependencies there are working just fine
there also doesn't seem to be a clean goal to clear up any cache of the requirements
h
Okay I suspect your intuition is right about caching -- specifically that the Pants daemon is not recognizing the change. Try
rm -rf .pids
to force Pantsd to restart. What's going on is that
python_requirements
uses an old macro system that predates the Rules API, so it does not properly use the file watching that everything else in Pants uses. There's an option
[GLOBAL].pantsd_invalidation_globs
that forces Pantsd to restart when the file changes, but the default doesn't include every location like
src/python/reqs.txt
We fixed this in Pants 2.10, which is very close to a stable release 🙂
g
ok awesome, I literally just tried that and it worked
🙌 1
so I just need to add
pantsd_invalidation_globs = ["/src/python/requirements.txt"]
to fix this in the future?
h
It'd be great to confirm that
rm -rf .pids
does fix this. If so, then two options: 1. Stick with your current release and add
pantsd_invalidation_globs.add = ["src/python/reqs.txt"]
2. Use Pants 2.10.0rc3 and follow the instructions to switch to the fixed "target generator" mechanism
Great! Really sorry for the trouble, this has tripped up several people and I'm very excited for 2.10 to be released so it stops being so confusing
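To illustrate option 1, here is a minimal sketch of the idea behind invalidation globs: a changed file only triggers a restart if it matches one of the configured globs. This is not Pants's implementation. The function name `should_restart` and both glob lists are hypothetical, and `fnmatch` only approximates Pants's glob semantics.

```python
from fnmatch import fnmatch


def should_restart(changed_path, invalidation_globs):
    """Return True if the changed file matches any invalidation glob.

    Hypothetical helper: fnmatch is a stand-in for Pants's own glob matching.
    """
    return any(fnmatch(changed_path, glob) for glob in invalidation_globs)


# Hypothetical glob lists, with paths relative to the build root.
without_entry = ["requirements.txt", "pants.toml"]
with_entry = without_entry + ["src/python/requirements.txt"]

changed = "src/python/requirements.txt"
restart_before = should_restart(changed, without_entry)
restart_after = should_restart(changed, with_entry)
print(restart_before, restart_after)
```

In this sketch the nested requirements file is only noticed once its path is added to the list, which mirrors the `pantsd_invalidation_globs.add` suggestion above.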
g
yup, I have to add another dependency, so I'll test now whether the invalidation glob works
also qq while we're on 3rdparty deps: inference doesn't seem to work with
from lib import mod
statements. Will that be fixed in the future, or is it not planned to be supported?
invalidation globs seem to work, thanks for the help
❤️ 1
h
Hm, it should be! Pants understands both
import path.to.module
and
from path.to.module import Foo
Have you seen https://www.pantsbuild.org/docs/troubleshooting#import-errors-and-missing-dependencies for how to debug missing imports?
👍 1
👀 1
g
I'll check it out, for some reason it's not inferring pyspark correctly
which is weird because it looks like it's inferring it correctly for other sources 🤔
h
what is the import for pyspark? I wonder if you need to set up
module_mapping
g
E   ModuleNotFoundError: No module named 'pyspark.sql.SparkSession'
python_requirement() dependency:
src/python:pyspark
I think inference worked fine for other modules (I do need a
module_mapping
for some other dependencies)
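One detail worth checking in the error above: the missing name is `pyspark.sql.SparkSession`, not `pyspark`. In plain Python, that usually means the parent package was found but the last component is not a module, for example a class imported with `import a.b.SomeClass` instead of `from a.b import SomeClass`. Whether that is what happened here is not confirmed in this thread. The sketch below reproduces the distinction with the standard library only; the helper `missing_name` and the package name `no_such_pkg_xyz` are hypothetical.

```python
import importlib


def missing_name(dotted_path):
    """Try to import dotted_path as a module.

    Returns the name Python could not find, or None if the import succeeded.
    """
    try:
        importlib.import_module(dotted_path)
    except ModuleNotFoundError as exc:
        return exc.name
    return None


# json is an installed package; JSONDecoder is a class inside it, not a submodule.
found_package = missing_name("json")
class_as_module = missing_name("json.JSONDecoder")
# no_such_pkg_xyz stands in for a distribution that is not installed at all.
not_installed = missing_name("no_such_pkg_xyz.sub")

print(found_package, class_as_module, not_installed)
```

When the whole distribution is missing, the error names the top-level package. When only the last component is named, the package itself was importable.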
h
hmmm fishy. purely for the sake of debugging, could you try other imports like
import pyspark
, then run
./pants dependencies path/to/foo.py
to see if it shows up?
also you could try running
./pants peek src/python:pyspark
to make sure that target looks good
g
❯ ./pants peek src/python:pyspark
[
  {
    "address": "src/python:pyspark",
    "target_type": "python_requirement",
    "dependencies": [
      "src/python:requirements.txt"
    ],
    "dependencies_raw": [
      ":requirements.txt"
    ],
    "description": null,
    "modules": null,
    "requirements": [
      "pyspark==3.1.2"
    ],
    "tags": null,
    "type_stub_modules": null
  }
]
so inference definitely failed, I wonder if it's because it's a pex target? does it only work for sources?
is there more documentation on inference? the only thing I've seen is in the third party deps page and it's pretty sparse
h
what do you mean pex target? the import is happening in a source file right?
g
yeah
h
I don't think we have more detailed docs, but you could check out the code to see what's happening:
1. For each file, we run this parser to grab your import statements: https://github.com/pantsbuild/pants/blob/2.9.x/src/python/pants/backend/python/dependency_inference/import_parser.py
2. We create a module mapping of third-party and first-party code. The third-party code looks like this: https://github.com/pantsbuild/pants/blob/93037980e22ba711a2594f9e43ad0356141087da/src/python/pants/backend/python/dependency_inference/module_mapper.py#L287-L368
3. For each import, we then look up whether it's in either module mapping, but we don't infer a dependency if there's ambiguity: https://github.com/pantsbuild/pants/blob/93037980e22ba711a2594f9e43ad0356141087da/src/python/pants/backend/python/dependency_inference/module_mapper.py#L396-L443
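The three steps above can be sketched roughly as follows. This is a simplified illustration, not the code in the linked files: the function names, the parent-module fallback, and the `3rdparty:pyyaml` address are assumptions made for the example.

```python
import ast


def parse_imports(source):
    """Step 1: collect the dotted names a source file imports."""
    imports = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            imports.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Record "pkg.mod" and "pkg.mod.Name", since Name may itself be a module.
            imports.add(node.module)
            imports.update(f"{node.module}.{alias.name}" for alias in node.names)
    return imports


def infer(imports, module_to_targets):
    """Steps 2 and 3: look each import up in the mapping, skipping ambiguous owners."""
    inferred = set()
    for imp in imports:
        parts = imp.split(".")
        # Assumption: fall back to parent modules, longest prefix first.
        for end in range(len(parts), 0, -1):
            owners = module_to_targets.get(".".join(parts[:end]))
            if owners:
                if len(owners) == 1:
                    inferred.add(owners[0])
                break
    return inferred


# Hypothetical module mapping; "yaml" has two owners, so it is ambiguous.
mapping = {
    "pyspark": ["src/python:pyspark"],
    "yaml": ["src/python:pyyaml", "3rdparty:pyyaml"],
}
source = "from pyspark.sql import SparkSession\nimport yaml\n"
result = infer(parse_imports(source), mapping)
print(result)
```

In this sketch the `pyspark` import resolves to a single target, while `yaml` is skipped because two targets claim it.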
btw have you tried explicitly adding the pyspark target to the dependencies of the
python_source
/
python_sources
target? To confirm that fixes things
g
ok this is weird, what I meant by pex target was that I was checking
./pants dependencies src/python/path:target
which wasn't showing the dependencies correctly (which makes sense I guess since the target's dependency is just the file) but when I tried it by pointing at the file the correct dependencies are showing up, which means I'm prob importing it incorrectly in my tests
s/correctly/what I expected/
h
Oh yeah for the pex target, you'd want to use
dependencies --transitive
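A minimal sketch of why the direct and transitive views differ, using a toy dependency graph. The graph is hypothetical (the file name `main.py` in particular is invented), and this is not how Pants computes the result.

```python
from collections import deque


def direct_deps(graph, target):
    """Dependencies listed on the target itself."""
    return set(graph.get(target, ()))


def transitive_deps(graph, target):
    """Everything reachable from the target, found breadth-first."""
    seen = set()
    queue = deque(graph.get(target, ()))
    while queue:
        dep = queue.popleft()
        if dep not in seen:
            seen.add(dep)
            queue.extend(graph.get(dep, ()))
    return seen


# Hypothetical graph: the pex target depends on one file, which imports pyspark.
graph = {
    "src/python/path:target": ["src/python/path/main.py"],
    "src/python/path/main.py": ["src/python:pyspark"],
    "src/python:pyspark": ["src/python:requirements.txt"],
}

direct = direct_deps(graph, "src/python/path:target")
transitive = transitive_deps(graph, "src/python/path:target")
print(sorted(direct))
print(sorted(transitive))
```

The direct view shows only the source file, while the transitive view also reaches the pyspark requirement.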
g
oh nice, ok that's cool
h
https://www.pantsbuild.org/docs/project-introspection has more on queries like that, including the
dependees
goal 🙂
g
that'll be helpful to debug why the test isn't importing the right dependencies
awesome, thank you so much Eric
❤️ 1
this should be enough for me to figure out where the hell I went wrong
h
You're welcome! I'd love to figure this out - it's crucial for Pants to be ergonomic that dep inference works well, so we're eager to know of any bugs, or ways to make docs & errors more clear
g
yeah so it looks like the inference worked correctly, but I'm still getting an import module error, which leads me to believe I'm just importing incorrectly
actually it looks like it's not picking up transitive dependencies?
dependencies --transitive
is showing the correct dependencies, but it doesn't seem to be building the requirements when running the tests?
h
Hm
"but it doesn't seem to be building the requirements when running the tests?"
What do you mean? Is this based on Pants's logging when building requirements.txt for example?
g
I wrote a small noop test that just imports pyspark, and it worked as expected: you can see the output of it building the dependency. No such thing happens when it runs my other test, which depends on a file that depends on pyspark
h
hmm, you could try running with
--no-local-cache --no-pantsd
to force Pants to rebuild the requirements PEX? It'd be a mission-critical bug if that actually makes a difference, but it's worth checking to be sure
g
oh maybe it was the local cache, it seems to be collecting pyspark now
nope, never mind, still got the same error 🤔
I'm trying to package the pex now to see if there's an issue there, I'm getting a weird error
WARNING: Discarding <https://files.pythonhosted.org/packages/89/db/e18cfd78e408de957821ec5ca56de1250645b05f8523d169803d8df35a64/pyspark-3.1.2.tar.gz#sha256=5e25ebb18756e9715f4d26848cc7e558035025da74b4fc325a0ebc05ff538e65> (from <https://pypi.org/simple/pyspark/>) (requires-python:>=3.6). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
ERROR: Could not find a version that satisfies the requirement pyspark==3.1.2
👀 1
this is interesting since my interpreter constraint is
CPython==3.7.*
h
But it looks like Pyspark was included in the requirements.pex? Here's another thing you can try,
--no-process-cleanup
https://www.pantsbuild.org/docs/troubleshooting#debug-tip-inspect-the-sandbox-with---no-process-cleanup A PEX file can be inspected with
unzip
. There is a file
PEX-INFO
at the root that can say if pyspark is included. So you can inspect requirements.pex
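As an alternative to `unzip`, the same check can be scripted. This is a sketch under assumptions: it treats `PEX-INFO` as JSON with `requirements` and `distributions` keys, and the helper names `read_pex_info` and `mentions` are hypothetical. The demo builds a stand-in file in a temporary directory, so point it at the real `requirements.pex` from the preserved sandbox instead.

```python
import json
import tempfile
import zipfile
from pathlib import Path


def read_pex_info(pex_path):
    """Load PEX-INFO from a PEX that is either a zip file or an unpacked directory."""
    path = Path(pex_path)
    if path.is_dir():
        return json.loads((path / "PEX-INFO").read_text())
    with zipfile.ZipFile(path) as pex:
        return json.loads(pex.read("PEX-INFO"))


def mentions(info, project):
    """Return True if the project name appears in the assumed metadata keys."""
    entries = list(info.get("requirements", [])) + list(info.get("distributions", {}))
    return any(project.lower() in str(entry).lower() for entry in entries)


# Stand-in PEX for demonstration only; its contents are made up.
with tempfile.TemporaryDirectory() as tmp:
    fake_pex = Path(tmp) / "requirements.pex"
    with zipfile.ZipFile(fake_pex, "w") as zf:
        zf.writestr("PEX-INFO", json.dumps({"requirements": ["pyspark==3.1.2"]}))
    info = read_pex_info(fake_pex)
    found = mentions(info, "pyspark")
    missing = mentions(info, "numpy")

print(found, missing)
```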
g
oh blah, I think I know what's going on, this might be a problem with
pyenv
I might need to bootstrap the interpreter by hand
👀 1
h
Huh, how come?
g
I changed two things, not sure which fixed the issue, but it is fixed: 1. I changed the interpreter to 3.7.10 2. I changed the search_path to
["<PYENV>"]
More than likely it was the first change that fixed this issue. Either way, the pex built correctly, so I'm gonna run the tests again
based on the warning I assume it was picking up the wrong interpreter somehow, not sure why
test is still failing with an ImportError though
^^^ that's without --no-local-cache and --no-pantsd, let me test with those flags
yup, still same error
h
K how about
--no-process-cleanup
?
g
ran both commands with --no-process-cleanup
h
were you able to follow the linked instructions to inspect the relevant sandbox and use
unzip
to look at the
requirements.pex
?
g
it's definitely building it (or trying)
👍 1
and yeah it's in the pex
h
Okay then I have no idea what's going on. Sounds like Pants is correctly detecting the dependency & building it. At that point, this is no longer in Pants's control and it's the underlying Pytest process / Python machinery
👍 1
g
yeah, it's prob something I did wrong, I'll start from scratch and see what's going on, thanks for your help Eric
❤️ 1
you gave me enough tools that I don't need to flounder around as much 😛
🙌 2