# general
w
@enough-analyst-54434 The recent bash-to-python conversion of release.sh caused some complications for us internally, since it now calls pants and thus requires it to bootstrap itself via the internet. I was wondering if we could just push a pex directly per master SHA to binaries.pantsbuild.org, as it should also benefit the community to be able to consume any SHA with much lower overhead. We are already doing this per tag, so it's basically just increasing the frequency there. Thoughts?
w
i would be in favor of this.
when John added the final deploy shard for github, we resolved https://github.com/pantsbuild/pants/issues/4896 (edit: which was 100% reasonable, to be clear)
i think that doing an s3 pex deploy per master/branch commit would be cheap and worthwhile.
(...lot o' history in 4896... cc @gorgeous-alligator-94763)
e
The community does not consume master commits. Twitter, one important member of the community, does. This is just calling out a - likely innocent - disingenuous argument. Not to mention being hoist by one's own petard. That said, have at it. I'm losing some of my fight on this issue.
w
cool, thanks for understanding. maybe if we build it, people will come 🙂
👍 1
w
Yeah. I could imagine that with advertising, people would use them every now and then.
w
@enough-analyst-54434 qq: Does https://github.com/pantsbuild/pants/blob/38ccf42c49f31e26462a266e8c63fb7875bcec14/.travis.yml#L131-L152 deploy
pants.pex
to s3 first, with the GitHub release pex (e.g. https://github.com/pantsbuild/pants/releases/tag/release_1.11.0rc0) then uploaded manually? If that’s the case, does that mean I only need to add
OR branch=master
to
if:
stages:
  - Test Pants
  - name: Deploy Pants Pex
    if: tag IS present AND tag =~ ^release_.*$
e
No, it builds the multiplatform pex (
./build-support/bin/release.sh -p
), then deploys directly to (GitHub) releases. Master should not abuse GitHub, Bintray, etc. in this way.
Also,
PANTS_PEX_RELEASE=stable
is untrue for master.
w
So we realized that the pex built in OSS is a thin pex, as in only
pantsbuild.pants
and its deps’ wheels are included, but none of the contrib modules. I am not sure how feasible it is to build a thicker pex in OSS, because Twitter needs a
pants.pex
with
pantsbuild.pants
+ a subset of the contribs. (Maybe benjy can offer build-pants-pex-as-a-service :p) I’ve reverted https://github.com/pantsbuild/pants/pull/6674 to unblock our internal release process.
g
I’m thinking of adding an option to the release script, usable only with -p and -q, that will build a fat pex. how does that sound?
e
-p is already fat, what do you mean exactly?
w
It does not contain the wheels needed for pycodestyle for example.
w
there is an environment variable you can pass to include more contrib modules
but yea, if you were going to do
thin
and
thick
, the thick one would probably need to contain all of the contrib modules
e
So, it would be awesome to change one thing at a time. Today we publish a "thin" pex, by your definition, for tags. You wanted this on master too. And prior to that you wanted a simple tool to build these for any branch to use for internal testing. Can we knock those off first before deploying fat pexes, which is something we currently do nowhere?
w
basically push through https://github.com/pantsbuild/pants/pull/6694 first right?
e
Yes, and also just do what was discussed. A fat pex - if we agreed it even made sense - must include all contribs, and would first require the groundwork to turn off all contribs by default if you want to claim this is usable by anyone.
Just noticed:
(Maybe benjy can offer build-pants-pex-as-a-service :p)
I truly am dumbfounded. I thought the whole / only point of Twitter wanting to consume publicly released dists / pexes was that they were the actual ones to go through CI. If you want to build an à la carte pex using pre-tested wheels, you already can, but that exact combo of wheels in 1 pex will not be the binary that ran CI.
IOW, modifying release.sh -p to take args, so that
release.sh -p contrib.x contrib.y
gets you a multiplatform pex built from CI-generated wheels at a given SHA, would be maybe a 30-line delta and require 0 extra CI infra.
Twitter or anyone else could run this to generate their à la carte pex of CI-built / tested component parts and then deploy that however they wished.
With that 30-line delta done, the last step as I understand it is a bit of script to run said-same on an OSX machine, but using locally built wheels via
release.sh -n
once locally on OSX and once in a Docker image for CentOS 6.
That's the bit that supports roundtripping through Twitter internal CI before a PR hits master.
w
there’s already
PANTS_PEX_PACKAGES
, but yeah, CLI args seem better
e
Ah right. So why are deployed pexes needed again if Twitter has to use that env var? That implies you rebuild a pex anyhow.
w
I just learned that, but yes, we did rebuild a pex via
PANTS_PEX_PACKAGES
. Although rebuilding wouldn’t be possible if
release.sh
is shelling out to
./pants ...
, hence the idea of building everything upstream and not having to build the pex internally.
e
But you added shelling out to pants - so don't do that
g
my use case is a little different here
e
Whatcha got?
w
shelling out to pants - so don’t do that
yeah, initially I took on the task of converting some of the bash to python, but it did not seem to work well, hence the revert.
e
python would be fine, it sounds like your requirement is pure python that needs to fetch no deps
... except from s3 - that's ok for some reason
presumably a network proxy whitelist that bans pypi
g
I want to build Linux wheels (for now) that don’t go into S3, and I was hoping I wouldn’t need to change the release script (to build the pex from the contents of dist instead of from S3). I need the fat pexes to be able to run the pex as the pants executable.
w
we whitelisted s3; for pypi we have an internal mirror. the choke point is the rust deps, which include some forks of repos
g
it seems like my best path forward is going back to my original solution which is what @enough-analyst-54434 and I spoke about
e
@gorgeous-alligator-94763 Why did that get derailed as a solution in the first place?
@wide-energy-11069 - gotcha. So a python replacement of the release script can totally leverage libs - say requests or boto - it just can't use pants code
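For concreteness, a minimal sketch of that kind of pure-python helper, using only requests (no pants code, no ./pants bootstrap); the per-SHA URL layout on binaries.pantsbuild.org shown here is hypothetical:

import sys
import requests

def fetch_master_pex(sha, dest="pants.pex"):
    # Hypothetical per-SHA layout - the real binaries.pantsbuild.org path may differ.
    url = "https://binaries.pantsbuild.org/pexes/pants.{}.pex".format(sha)
    resp = requests.get(url, stream=True, timeout=60)
    resp.raise_for_status()
    with open(dest, "wb") as f:
        for chunk in resp.iter_content(chunk_size=8192):
            f.write(chunk)
    return dest

if __name__ == "__main__":
    # Usage: python fetch_master_pex.py <master-sha>
    print(fetch_master_pex(sys.argv[1]))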
g
because @wide-energy-11069 was making changes to the release script at the same time, so to avoid clashes, I tried to find another way to do it. I thought it would be simpler to use -q since we only needed a Linux-compatible pex
e
Actually - it almost certainly can use pants code, but via PYTHONPATH=src/python aka sourcing the pants_venv
Case in point on the latter is Danny's reworking of build-support/bin/download_binary.sh
That gets executed before rust is bootstrapped, which proves the point
Gotcha @gorgeous-alligator-94763.
w
pip install everything
works basically
just not building the rust code
e
Great - Danny's technique can be mimicked then
if you want to detour in a rewrite.
I'd suggest maybe doing that after though, as a cleanup. None of the things we talked about is hard or large in terms of amendments to the existing bash
w
to clarify, were you suggesting using
build-support/bin/download_binary.sh
to download the `engine.so`s?
e
No
I'm suggesting you use the technique used in that script to run pants code
Danny should be a great resource. And you can teach him about Fire - which he missed in that change despite a prod.
w
At a high level, I was just confused about how this would work without building the rust code. Sure, I will talk to Danny 🙂
e
The short answer is that any pants code which does not invoke the engine doesn't need rust - so basically all utils and subsystems are usable.
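A small sketch of that point, assuming the pants.util import paths of this era: with the repo's pants venv active and src/python on PYTHONPATH (per the PYTHONPATH=src/python / pants_venv suggestion above), engine-free helpers import and run with no rust bootstrap, because nothing here parses BUILD files:

# Run as: PYTHONPATH=src/python python engine_free_example.py  (inside the pants venv)
from pants.util.contextutil import temporary_dir
from pants.util.dirutil import safe_mkdir

with temporary_dir() as tmp:
    # Plain filesystem helpers - no engine, no rust, no BUILD file parsing.
    safe_mkdir(tmp + "/scratch")
    print("pants utils ran without the engine under " + tmp)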
w
thanks!
w
no BUILD file parsing
which means no binaries can be built
so would not work in this case.
e
Yeah - aka don't run pants
w
Yi is going to open a ticket
ah, i see what you mean.
e
OK. His prior attempt didn't try to build binaries iirc - just use Fetcher
w
yea, Yi's going to open a ticket
w
His prior attempt didn’t try to build binaries iirc - just use Fetcher
the
run_local_pants
part is the gotcha. https://github.com/pantsbuild/pants/commit/01c807efd51ab4f98e727ab21491c18d71d462c8#diff-9ed7102b7836807dc342cc2246ec4839R438
e
Understood - pointing out that was an arguably extravagant way to run a main that you didn't need to do.
With the venv activated you can just run it
w
@wide-energy-11069: let's do this on the ticket please. i think i understand what John is suggesting, but easiest in that format.
👍 1
e
I'll underscore there was nothing new / interesting in this thread besides fat pexes. That is totally new and something we've never done for pypi dists or pexes.
It would probably be most helpful to just lay out Twitter's requirements explicitly so we can end this once and for all with a solution. IE: We can whitelist up to 2 urls, we can mirror x but not y, and we want the produced binaries to have been used in XYZ ways in CI.
w
yea, that is what i hope to add to Yi's ticket if it doesn't include it.
e
Excellent
w
(and Dorothy's, which is for a different use case)
e
Yeah - good. The 'vet a branch internally before it hits pantsbuild/pants:master' use case.
w
slight follow-up: we were able to quickly narrow down, via the master pexes, an offending commit which caused an openssl failure only on newer versions of Linux (Ubuntu 18 in this case)
e
This could have been caught publicly w/o the master pexes: https://github.com/pantsbuild/setup/pull/29
w
I think what he meant was that he was able to bisect to the precise commit more quickly
e
Yes, but if that CI was run as part of our OSS process - no bisect
It would just be red right away
w
Since this was a use case that required a CentOS binary being consumed on Ubuntu
e
It does not
apparently
w
The bug did
e
Exception message: libssl.so.10: cannot open shared object file: No such file or directory
?
That's what the linked setup CI gets for Ubuntu 12.04 and 14.04 straight up
w
right
because it was consuming a binary built on CentOS 6... i.e., in pantsbuild/pants CI
so bisecting would mean rebuilding things on CentOS and then consuming them on Ubuntu... non-trivial
(easier if you're on linux though)
or i suppose you could do it with two docker images on osx
e
I'm confused, but I'll file an issue to make the setup CI part of release. I'm just showing that the master pex releases have nothing to do with detecting this - we could have been detecting it from the wheels already, as my PR proves.
w
oooh, i see what you mean.
e
Yeah - batting down another poor argument!
I'll see if I can figure something out: https://github.com/pantsbuild/pants/issues/6875
w
i think that if it were possible to bisect directly with the
setup
script (e.g. by fetching from s3) it would be easier to bisect in general
in this case, the bug was so easy to detect in the right env (centos6 -> ubuntu) that just running setup would be sufficient. in other cases it might be harder
but we are not consumers of the setup script in general, and in the past we've talked about switching setup to consuming pexes... not sure whether that is still a goal
e
setup wheel/pex aside - my point here is there should be no need for private bisects - we should publicly test the case whenever possible. Private testing is wonderful to have but horrible to rely on.
As far as setup consuming pexes - no one is clamoring for this that I know of, so it hasn't hit my priority radar at any rate.
w
yep. insofar as it is possible to test wacky combinations like this, we're all for it.
e
Cool. And this is not wacky - Ubuntu is still - I assume - the most popular distro out there.
w
unfortunately, it probably generalizes to the thing you were trying to test before: build a binary once within a travis run, and then kick off all of the other shards to consume that binary
re: ubuntu, yes: but the cross-building thing was unexpected.
e
Yeah - and that proved strictly slower.
w
i thought that the issue was that PR builds couldn't actually use creds to write anywhere? or did you find a way around that
(we added OSX 10.12 and 10.13 sanity-check shards recently, but it's not clear that they're adding as much coverage as they could be, because it's not a cross-compile from some other OSX)
e
There is no way around that ever for any CI provider.
w
which has been nagging at the back of my brain from a remoting perspective for a while =/ (cc @average-vr-56795)
how do you give PR contributors access to a remoting cluster? feels like a whitelist is the only thing
or bring your own creds to test with, which is not very equitable
e
how do you give PR contributors access to a remoting cluster?
You prop up a github app. IE: you can't cheese off travis or circle, etc.
w
it's expensive cheese... and in theory those providers are already plenty exploitable
but yea.
hm... or maybe ... if the provider implemented an authenticated proxy or something?
requests from within the container are accepted without auth, and are then authed to another (gRPC) service
e
Whatever it is, it just has to base all trust on github webhooks - it cannot use credentials configured on the main repo that PR authors fork from.
w
yea.
you could have some semblance of protection if the github webhook api provided something that was signed by their private key for a particular PR
then you could require that signature for a PR that is less than X hours old or something
anyway, back to options parsing =P
e
github already signs the payloads it sends to your webhook with a shared secret to prove github is github. From there you trust the payload and check it against any internal whitelist of parent repos you're allowing to use your service. This much is all solved and it's what travis, circle, etc use.
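A minimal sketch of that trust model (the X-Hub-Signature HMAC check is GitHub's standard webhook signing scheme; the secret value and whitelist here are illustrative):

import hashlib
import hmac
import json

WEBHOOK_SECRET = b"configured-in-github-webhook-settings"  # shared secret; GitHub sends a signature, not the secret
ALLOWED_PARENT_REPOS = {"pantsbuild/pants"}  # example whitelist of repos allowed to use the service

def trusted_pr_event(headers, raw_body):
    # 1. Prove the delivery really came from GitHub: recompute the HMAC and compare.
    expected = "sha1=" + hmac.new(WEBHOOK_SECRET, raw_body, hashlib.sha1).hexdigest()
    if not hmac.compare_digest(expected, headers.get("X-Hub-Signature", "")):
        return False
    # 2. Only then trust the payload, and mask it against the internal whitelist.
    payload = json.loads(raw_body)
    base_repo = payload.get("pull_request", {}).get("base", {}).get("repo", {}).get("full_name")
    return base_repo in ALLOWED_PARENT_REPOS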
w
mhm
neat, thanks