how should I generate a `UrlToFetch` from a `Binar...
# development
h
how should I generate a
UrlToFetch
from a
BinaryToolBase
?
a
use the
.get_external_url_generator()
method to get a url generator!
and use the default one in
binary_util.py
if that returns None
the one named
class PantsHosted(BinaryToolUrlGenerator):
w
there was a previous thread on this. the challenge is that you need a digest per platform
(see rest of thread)
a
(1) isn't that concept what @early-needle-54791 is enabling with multiplatform process requests? (2) cloc is not platform dependent
also, i described to @hundreds-breakfast-49010 how to do this already with platform awareness
I'm trying ot figure out how the
PantsHosted
class you mentioned works [3:28 PM] cosmicexplorer [3:28 PM] you take the
binary_request
(which
BinaryToolBase
generates) and add the
baseurls
which are a bootstrap option (
--binaries-baseurls
) [3:28 PM] (see the
__init__()
) [3:28 PM] then it just works [3:29 PM] Greg Shuflin [3:29 PM] should
_make_binary_request
be public? [3:30 PM] cosmicexplorer [3:30 PM] it could be made public [3:30 PM] it's still an implementation detail imho [3:30 PM] well [3:30 PM] no [3:30 PM] Greg Shuflin [3:30 PM] maybe
ClocBinary
could have a method that returns a
UrlToFetch
? [3:30 PM] cosmicexplorer [3:30 PM] what i wanted to do was to actually make a rule from
BinaryRequest -> Digest
[3:31 PM] Greg Shuflin [3:31 PM] where should that rule live? [3:31 PM] cosmicexplorer [3:31 PM] i wouldn't put it on
ClocBinary
[3:31 PM]
binary_tool.py
[3:31 PM] we have all the info we need already in
BinaryToolBase
and
PantsHosted
(edited) [3:31 PM] Greg Shuflin [3:31 PM]
PantsHosted
relies on having a baseurl [3:31 PM] cosmicexplorer [3:31 PM] yes [3:32 PM] Greg Shuflin [3:32 PM] which
BinaryUtil
provides [3:32 PM] cosmicexplorer [3:32 PM] that's a bootstrap option [3:32 PM]
--binaries-baseurls
[3:32 PM] yes [3:32 PM] you can make an
optionable_rule()
[3:32 PM] if we can make
_make_binary_request
public, then make a ruleset from
BinaryRequest -> Digest
which tries all the urls provided for a given
BinaryRequest
via invoking the url generator, i think that would work [3:33 PM] Greg Shuflin [3:33 PM] and that ruleset would also rely on
BinaryUtil
as an input [3:33 PM]
@rule(Digest [BinaryRequest, BinaryUtil])
? [3:36 PM] cosmicexplorer [3:36 PM] yes [3:36 PM] i think [3:36 PM] and then declaring an
optionable_rule(BinaryUtil)
so that it can be consumed [3:38 PM] and with the rule you just described (
@rule(Digest, [BinaryRequest, BinaryUtil]
), anyone who wants a binary tool can just say:``` @rule(Whatever, [ClocBinary]) def do_cloc(cloc_binary): cloc_snapshot = yield Get(Digest, BinaryRequest, cloc_binary.make_binary_request()) # do whatever with it def rules(): return [ do_cloc, optionable_rule(ClocBinary), ] ```
w
1)
only for ExecuteProcessRequest... would need something similar for binaries.
2)
mm, yep. just one digest then.
a
so in short, by making a rule from
(BinaryRequest, BinaryUtil) -> Digest
we can avoid marrying a
Subsystem
in general with a
UrlToFetch
and we can continue to use the same classes we've been using, just using a different method than
.select()
this time
and
BinaryRequest
contains the platform
is why this is feasible
hm no
w
k. still need to add a tree of digests to whatever subsystem.
a
still need to add a tree of digests
what does this mean? making
.make_binary_request()
public in
BinaryToolbase
(allowing the
@rule
at the end of my above quoted conversation) gives us all the information we need to make a platform-specific binary request, without attaching any new information to any subsystem
h
I'm confused too, and also about what this rule would have to do with
ExecuteProcessRequest
once we have a rule that yields a
Digest
we can do whatever we want with it, whether it's use it in
ExecuteProcessRequest
or something else, right?
a
yes!
i think
ExecuteProcessRequest
is not relevant now, i brought it up because i wasn't sure why the platform-specificity was seen as a blocker
h
w
the reason
--version
is an option for these tools is that people want to change the version
a
yes
w
when they do that, they will also need to change the digests
a
yes
w
and they will need a digest per platform
a
this is all in the
BinaryRequest
object
w
maybe that's a
dict_option
rather than a tree, but.
a
that's only if they need to download it for both platforms
i don't think we want to do that automatically
h
so passing a new`--version` flag will make the
BinaryRequest
object be different (and invalidate any caching)?
👍 1
w
you'd... probably have to? 99% of usecases involve both linux and osx.
a
like i wouldn't want to remote the use of cloc i would think?
w
we already do
a
ok
i don't know if i agree with the "99% of usecases" part but returning a dict is still not hard to do and can be done at the python level
w
not disagreeing. that's just why i was mentioning a subsystem / place for the new option(s).
a
where would the new options come from?
a
i'm really having difficulty understanding your point
does changing options not invalidate
optionable_rule()
instances?
w
if i set
./pants --cloc-binary-version=0.10.0-mycustom-version cloc
, it will break, because any hardcoded digest will mismatch
i must also specify a new digest
as an option
💡 1
a
oh incredible
@hundreds-breakfast-49010 ^
thank you very much for explaining stu!
i had forgotten that we provide digests for
UrlToFetch
it might still be something we can stick onto
BinaryToolBase
but yeah
h
I'm confused about the hardcoded Digest
a
when we do a
UrlToFetch
, we also provide an expected digest
if it doesn't match the result, it fails
h
hm, ok
a
so while we can hardcode in some digests for some known versions, making it an option is necessary
h
by "making it an option" do you mean inviting the user to pass in the digest hex string as a command-line argument?
a
or an env var, or a config file entry
so yes
would recommend thinking about the user interface for this a bit
h
hm, ok
a
if there's a way to split it off from implementing the basic functionality you're looking for with cloc i might recommend it, but it would make sense to have it together
h
well the whole reason I'm going down this rabbit hole
a
right
h
is because i currently get the cloc binary from calling
.select()
on an instance of
ClocBinary
and this appraently won't work when remoting
a
yes
h
because it points at a local file system location
a
only `ExecuteProcessRequest`s work remotably
yes
so
especially since
cloc
isn't platform-dependent
h
so right now something is successfully downloading the cloc perl script to that location (maybe it's not checking a hash?)
a
you could do a hardcoded digest for now
h
but whatever it is is all v1 code that I don't understand
a
it's all in
binary_util.py
h
so I just want to do whatever the right thing to do is, in a v2 context
a
it's complex and annoying code iirc
yes
h
so that's the
select()
method tehre
a
so a
BinaryRequest -> Digest
rule would be the right thing
h
ok
a
(imho)
h
and internally to that rule, I'll need to invoke
UrlToDownload -> Snapshot
a
yes
h
which I think we're already using
a
yes
h
and that's the thing that expects a precomputed digest
👍 1
a
that's correct!
h
and here digest is just, like, a sha-256 hash of the file or something?
not entirely sure what that is
a
yes
well
no
i believe it's a hash of the protobuf representing that directory containing the file
i can check
h
I guess I can just run it with a bogus string and hope the error message reports the correct hash, which I can then hardcode
🔥 1
a
absolutely 100% correct
h
I think we're also assuming a specific cloc version
I didn't actually know there was a command line flag for specifying different versions of the cloc binary
a
you can bolster this investigation by using ripgrep or something to search for
UrlToFetch
(or
UrlToDownload
?) and see whether the error message reports the correct hash
h
maybe we should just remove that? does anyone really care about being able to specify this?
a
yes
because it's an option on all
BinaryToolBase
subclasses too
and many of those definitely care about their version
what we can do is provide a default value for the
version -> digest
dict_option
(stu was mentioning this earlier)
and then if users have their own version, they can simply edit the option
well
wait
sorry
no wait yes it's a
dict_option
because of multiple platforms!
so like
Copy code
{'linux': ('1.0.0', 'asdfasdfsadf'),
 'darwin': ('1.0.0', 'asdfasdfdf')}
h
I'm not sure where this
dict_option
you're talking about would fit in
a
we would convert the
--version
option which is registered on
BinaryToolBase
into an option named something like
--version-digests
which would have the value like the dict i messaged just above
hm or
no that's right
h
so that would entail changing the
version
method on
BinaryToolBase
?
ah that's only being used on`select`
a
i think it would entail changing the
register_options()
method, actually
h
at least in that file
a
it's also being used to make a binary request
h
yeah that's what
select
is doing with it
a
yes
h
can we just change option names like that?
a
i was thinking of adding a new option and deprecating the old one
options are really easy to deprecate
h
do we have an example of that?
a
i always just search the repo with ripgrep whenever i want an example
src/python/pants/option/optionable.py
might have the description of the correct keyword arguments
looking at options registered in
register_options()
with the
removal_version
kwarg should be what you want to do