Can someone help me debug the code that pants is p...
# general
p
Can someone help me debug the code that pants is providing to pylint? The
./pants dependencies ...
output looks correct, but pylint is complaining that it can't find some modules (modules that are listed in the
./pants dependencies
output). What's more, if I only include the targets in that directory, pylint has a different set of errors. And if I do a very narrow selection of files to lint (including files that had errors with a larger selection of targets) then I don't get any errors at all. I thought maybe it was overlapping source roots, so I temporarily deleted
//conftest.py
, the relevant target from
//BUILD
, and the
/
source root, but that didn't resolve it. Here's a bunch of
./pants ...
output: https://gist.github.com/cognifloyd/f428a423a07e1e859d4e2625c0cd66f1 And here's the code base I'm working with: https://github.com/st2sandbox/st2/tree/pants
h
Are you using
./pants dependencies
rather than
./pants dependencies --transitive
? Note that Pylint only looks at direct deps, not transitive You could also try
--no-process-execution-local-cleanup
to inspect the files you expect are there
p
In this case, the deps pylint is complaining about are direct dependencies. Just to be sure, yes, they show up in both
./pants dependencies
and
./pants dependencies --transitive
.
πŸ‘ 1
Interesting. Looking just at this error:
Copy code
************* Module st2api.controllers.v1.keyvalue
st2api/st2api/controllers/v1/keyvalue.py:1:0: F0002: <class 'ModuleNotFoundError'>: No module named 'st2common.logging' (astroid-error)
If I look at
./pants dependencies st2api/st2api/controllers/v1/keyvalue.py
I do not see
st2common/st2common/logging/__init__.py
though it is present in
--transitive
. And when I look in the dir left by
--no-process-execution-local-cleanup
, the
st2common/st2common/logging
directory is not present. The relevant import in
keyvalue.py
is
from st2common import log as logging
.
st2common/st2common/log.py
in turn has:
Copy code
from st2common.logging.filters import LoggerNameExclusionFilter

# Those are here for backward compatibility reasons
from st2common.logging.handlers import FormatNamedFileHandler
from st2common.logging.handlers import ConfigurableSyslogHandler
It looks like pylint is considering transitive deps somehow
h
Yeah I'm surprised by that, we have a test that transitive deps should be ignored https://github.com/pantsbuild/pants/blob/461b2ab078795824c83f2c157baa80d7388d3992/src/python/pants/backend/python/lint/pylint/rules_integration_test.py#L181-L223 I wonder if the
__init__.py
changes that somehow? To test, you could try creating a dummy repo w/o Pants and with a similar layout to what you have. Delete the transitive dep file, and see if Pylint complains
p
Perhaps. Backing up from
,/pants lint st2api::
(with the weird potentially transitive-related errors) to
./pants lint ::
- that is more clear cut - all the files that pylint says are missing are definitely present. Given the tmp directory
/tmp/process-execution5xrkvP
there are directories
/tmp/process-execution5xrkvP/st2api
and
/tmp/process-execution5xrkvP/st2api/st2api
. Is
/tmp/process-execution5xrkvP
, the directory that contains
__run.sh
implicitly included in the PYTHONPATH somehow? Because then it might try looking in
st2api/
instead of
st2api/st2api/
which would not work.
h
I think
__run.sh
should show
PEX_EXTRA_SYS_PATH
, which is the Pex way of setting
PYTHONPATH
p
This is interesting:
Copy code
$ ./pylint_runner.pex_bin_python_shim.sh
Python 3.6.13 (default, Jun 12 2021, 12:27:04) 
[GCC 10.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import st2api
>>> st2api.__path__
_NamespacePath(['/tmp/process-execution5xrkvP/st2api'])
>>> import sys
>>> sys.path
['', '/usr/lib64/python36.zip', '/usr/lib64/python3.6', '/usr/lib64/python3.6/lib-dynload', '/tmp/process-execution5xrkvP/.cache/pex_root/venvs/6e8f51fbbffcacbdf7d21b92a25fe2057ac9e907/f0dd7252e78b9eef2b00bf88b4b41d699b14a04a/lib64/python3.6/site-packages']
st2api is not a namespace package
Copy code
export PEX_EXTRA_SYS_PATH=$'contrib/chatops:contrib/core:contrib/examples:contrib/examples/lib:contrib/hello_st2:contrib/linux:contrib/packs:contrib/runners/action_chain_runner:contrib/runners/announcement_runner:contrib/runners/http_runner:contrib/runners/inquirer_runner:contrib/runners/local_runner:contrib/runners/noop_runner:contrib/runners/orquesta_runner:contrib/runners/python_runner:contrib/runners/remote_runner:contrib/runners/winrm_runner:pylint_plugins:scripts:st2actions:st2api:st2auth:st2client:st2common:st2common/benchmarks/micro:st2exporter:st2reactor:st2stream:st2tests:st2tests/testpacks/checks:tools:__plugins'
anywho. I'm off to bed.
OK. So, I ended up chatting much later than I planned and had another quick go at debugging this. I edited
__run.sh
from
./pants lint ::
so that it tested just two files:
st2api/bin/st2api st2api/st2api/app.py
, which failed, but
st2api/st2api/app.py st2api/bin/st2api
passes, so pylint's module detection logic is getting confused by the binaries that match the module names. From the pylint docs:
It is also possible to analyze Python files, with a few restrictions. The thing to keep in mind is that Pylint will try to convert the file name to a module name, and only be able to process the file if it succeeds.
and pants is passing all of the filenames to pylint, instead of modules. So I wonder if it would be safer to pass modules instead, except for the files that aren't in modules? (like st2api/bin/st2api)? For now,
skip_pylint=True
in
st2*/bin
allows
./pants lint ::
to pass (as much as I expect it to). selecting a smaller subset still fails, which is odd. but its better.
Passing stuff to pylint seems so fiddly.
argh
h
Ouch, that is really fiddly indeed.
We know how to convert file paths to modules because we know the source roots, so passing modules to pylint seems possible
πŸ‘ 2
We assume that all files are modules, so it's all or nothing
Well, all files are modules, by definition, iirc. It sounds like the issue is with figuring out the full module name
I'd need to look a little more closely
Which I will do
@proud-dentist-22844 I assume
st2api/
is a source root?
And that when you write
st2api/bin/st2api
you mean
st2api/bin/st2api.py
?
Oh, looked at the source and I see that you did not mean that
βž• 1
I need to get to stronger wifi to keep on reproducing and debugging this, but I can confirm that (AFAICT) we don't set sys.path correctly for the Pylint process. It should be set to include all the relevant source roots
p
st2api/
is a source root.
st2api/bin/st2api
is an executable without a
.py
extension. I explicitly added it to a
python_libraries
target sources.
Hmm. Should I add st2api/bin as a separate source root? Except that then that could mess up imports for other things. Everything in
st2*/bin
should be considered a leaf (of sorts) - ie, nothing should be allowed to import from those files, but they should be able to import from everything else. Not sure how to inform pants (or pylint) about something like that. πŸ˜•
Maybe pants needs a concept of
isolated_source_roots
? ie a source root that never gets added to PYTHONPATH, but that counts as a root when calculating its "module"?
Then, when passing things to pylint, anything in an
isolated_source_root
would get passed to pylint last, which apparently doesn't confuse it.
or maybe
bin_root
?
h
the repo root itself can be a fallback source root (the innermost source root wins), presumably no one will try to import
st2api.bin.st2api
for example
p
Or
bin.st2api
since
st2api/
is a source root, however, I think pylint inspects
st2api/bin/st2api
and decides that it provides the
st2api
module. It must be caching that because subsequent files that import something
from st2api
fail. But if it loads the bin file last, it already processed the real
st2api
module, so there is no ambiguity.
Ah ha. Yes, there is a cache: https://stackoverflow.com/a/22294044
astroid
stores a cache of parsed modules. https://github.com/PyCQA/pylint/issues/158
That might also be why I had so many issues with pylint running on the test files.
st2api/tests
and
st2auth/tests
would both resolve to the module
tests
So pants needs to somehow detect module collisions and run them in batches. Or even run one pylint run for each source root.
Per source root would not resolve the
st2api/bin/st2api
vs
st2api/st2api
issue though.
Here's a similar pylint issue that has sadly been closed. https://github.com/PyCQA/pylint/issues/2095
This issue about file ordering is still open: https://github.com/PyCQA/pylint/issues/689
Ah. So any file that is in a directory without
__init__.py
should be passed to pylint last.
h
Hmm, why would pylint think that
st2api/bin/st2api
provides
st2api
? It should only do that if
st2api/bin
is in PYTHONPATH no?
Oh, or if it has an
___init___.py
But
st2api/bin
does not
Oh, because it's an implicit namespace package
πŸ™ƒ 1
That's why
dammit pylint
To be fair this is more of a "dammit python"
βž• 1
Not sure why they closed https://github.com/PyCQA/pylint/issues/2095 it's a legitimate issue
h
No reproduction case. Sounds like they'd reopen if you have one
h
Someone posted one
Yeah the issue here is that there are files that are never intended to be imported (e.g., tests, binaries) that Pylint still treats as if they were
So this is a "dammit pylint" situation
βž• 1
Trying to think if we can detect such cases
Got them to reopen it, maybe that's where the solution will come from
🀞 1
πŸ™Œ 2
p
Yeah, this is totally pylint making things difficult.
as far as tests go, I have code that does import from tests files (crappy unittest stuff - maybe after pants is in with pytest as the test runner, I can work on convincing people to switch over to pytest style tests with conftest.py-style fixtures) So, I really do need a way for pants to pass source roots separately if I ever want to approach passing test files through pylint. For now, I’m fine skipping pylint for all the tests directories (thanks for the skip_pylint feature!). But, this is a potential blocker that is independent of the irritating bugs in pylint.
h
So tests import from non-test files that are under the
tests/
dir?
How do they specify such an import unambiguously?
p
let me show you… grabbing links…
h
Got it
How does this work outside of Pants?
p
Run the tests for each package separately.
the st2* packages (plus several others, like *_runner) get installed in a virtualenv (generally with
python setup.py develop
- scary I know), and then the makefile iterates through each of the package directories and runs
nosetest
in them.
I think I have all of the various fixtures files registered in BUILD files, so the next steps are to go through various runs of
./pants lint
and
./pants test
and fix any issues that come up.
πŸ‘ 1
Thus the pylint stuff πŸ™‚
h
Got it
The suggestion on that pylint issue to cache against the file path and not the module name sounds right to me
βž• 2
should solve all the issues