One other problem I've run into is output files. W...
# general
i
One other problem I've run into is output files. We have a task that produces an output and a binary that pulls in the task as a dependency. In 1.8, this output file was being wrapped into the resulting pex file, but is not occurring in 1.28. Has something changed underneath with how output files are handled? Example:
Copy code
example_task(
  name='compile',
  sources=['./**/*'],
  output='output.cpython-36m-x86_64-linux-gnu.so',
  dependencies=[
    '3rdparty:numpy
    ]
)


In another BUILD file:

python_library(
  name='impl',
  sources=['cli/*.py'],
  dependencies=[
    'submodules/plugins/plugins:example_task',
  ]
)

python_binary(
  name='cli',
  zip_safe=False,
  sources=['cli/cli.py'],
  dependencies=[':impl'],
  tags={'default-build', 'docker-build'}
)
a
1.8 => 1.28 is a long distance to go in a single leap! i can think of (1) pants runs in python 3 by default now (2) it depends on how you're marshalling those output files into the pex!
do you have a link to your plugin's source?
our strong recommendation is to go a single version at a time, in case nobody's told you before
(the v2 engine makes this sort of thing a lot easier to debug -- there's a lot less duplicated logic for interacting with the filesystem. but i would for sure focus on upgrading your v1 tasks first, then later we could look at rewriting in v2 if useful)
i
I was unaware of the recommendation when I started, but I definitely should have tried to do a lesser step for the upgrade 😂. We were using the pex_build_util prior to this and I had to update it to the PexBuildWrapper. Let me grab the file real quick so you can take a peek.
a
i just remembered -- glob matching definitely changed!
is the output
.so
file supposed to be at the top level of the zip file (the pex file)?
i
Yes, it should be.
a
i might try adding
'./*'
to my globs
for
sources
in the
example_task()
target
to be more clear:
Copy code
example_task(
  name='compile',
  sources=['./*', './**/*'],
  output='output.cpython-36m-x86_64-linux-gnu.so',
  dependencies=[
    '3rdparty:numpy
    ]
)
i
Aaaah interesting. I thought that the
**
would include the top level.
a
it should!! this would be a bug if so, since i would expect
**/*
to include the current directory. cc @witty-crayon-22786
hm. @important-librarian-62877 could you also try globs of
['*', '**/*']
, without the leading
./
?
let me know if any of that works
i
Will do 👍
a
or maybe just
'**/*'
would be ideal. but it's possible they all may fail, no rush
i
I should note, I do see the output file being created in the pants cache or temp directory (can't remember the exact file path), it just isn't making it into the pex
a
to be clear, i agree that your original glob should be working, in a better world
I do see the output file being created in the pants cache or temp directory (can't remember the exact file path), it just isn't making it into the pex
see it how?
oh you mean in
.pants.d/gen/
? ok i get that
i think that also aligns with the current hypothesis of having bad sources globs
because when a pex is created (you can see this in the source for
PexBuilderWrapper
all the source files are individually added to the pex, for sources and resources separately. if a file is left off, it won't get included just because it's in the same directory if it's not specified in the
sources
kwarg of your generated
resources()
target
sorry if overload
would love to listen to what you're seeing
i
Nope you're fine. This library is kind of giant so it takes a long to compile. I had assumed the output field would be treated as a source (resource? not sure on the difference) and would be added to the pex.
a
oh! sorry!
you'll need to modify
_copy_target_attributes
and make it return
["output"]
it's horribly documented
i'm very sorry
oh!
no, sorry, red herring
I had assumed the output field would be treated as a source
this is super super reasonable! i think it might be possible to create more than one "sources" field, if you check the docstring and usages of
self.create_sources_field()
in
src/python/pants/build_graph/target.py
i
Ha that's what I've sounded like the past couple of days 😂. So the returning output is not something it should be doing?
a
but i wouldn't actually do that, personally
i would just expect your glob to work
unfortunately, it's nontrivial in v1 to make a random field contain source files in a nice way
the returning output is doing everything right atm, sorry
👍 1
if you let me know whether any of those other globs work, as well as trying to add the literal string
'output.cpython-36m-x86_64-linux-gnu.so'
as a glob, that would be helpful to make an issue
sorry i know the wait is long ^_^
i
I know that
path/***/**.py
works as I have run into that previously where a library was not pulling in all the modules. I suppose the
.
could be throwing it off.
a
hmm, it should be a glob though, not a regex, if that's what you're concerned about there
i
FWIW the file shows up in this directory
./.pants.d/pyprep
a
!!!!
hm
where specifically in that directory? could you provide the full file path you just found to the elusive
.so
and the contents of the containing directory?
i
./.pants.d/pyprep/compile-cython/9c44cc2743fa/submodules.plugins.plugins.compile-plugins/85f16c5ba5a9/output.cpython-36m-x86_64-linux-gnu.so
a
ok, that makes sense. i'm assuming you registered your task in the
pyprep
goal, which makes sense. the
SimpleCodegenTask
will copy over sources.
hm. but then why were those sources not copied over again to the output?
what is your full pants command line, by the way?
at this point i think it would be super useful if you could take basically the content of the messages you've sent in this thread with some explanatory text and make a new issue at https://github.com/pantsbuild/pants/issues/new
it's not as easy of a glob issue as i had thought apparently :( sorry
i
./pants binary cli/cli::
is the command I am running. Yepp, I can certainly do that 😄
a
also what are the contents of the (symlink) directory
.pants.d/binary/py/current/submodules.plugins.plugins.compile-plugins/current/
?
oh wait oops sorry
well yeah forget about that
just the issue with what you have now would be the best
i
The end result pex file is in that directory except it's a
cli.cli.cli
path. There isn't a submodules one.
👍 1
a
ah, yes, thank you, that makes sense
i
Ok, I figured out what the issue is here. We were using a
find_sources()
function in our plugin that was deprecated in favor of the
sources_globs
attribute of the SimpleCodegenTask. Setting that attribute to the
target.output
fixed the problem. Thanks for the assistance on this one 😄
💵 1
👖 1
💯 1
a
how did you figure out the failing function?
that’s great to hear!!!
i
Well I realized we were returning
[target.output]
in the
find_sources()
function, so I simulated the "Upgrade one step at a time" by just looking at the relevant files in sequence in git. Thankfully the function was deprecated in 1.10 so I didn't have to look too far 😂.
🔥 1
a
oh yes!!! that’s a hard problem to solve
as in, we have some nice deprecation tooling, but it would be hard to make it more useful here. glad git was useful as always
i
Yeah, I don't think there would be a feasible way to make it an automatic fix. Maybe there could be some sort of scan tool for custom plugins that notifies a user of deprecated functions and attributes across a range of versions. That would quite the effort though.
👍 1
a
the implementation of that does possibly align with separate thoughts i’ve been having about separating testing and documentation from code
it would definitely be an effort though just tossing it out there
i’m glad you haven’t been having worse problems!
💯 1