# general
w
@enough-analyst-54434 The recent bash-to-python conversion of release.sh caused some complications for us internally, since it now calls pants and thus requires it to bootstrap itself via the internet. I was wondering if we could just push a pex directly per master SHA to binaries.pantsbuild.org, as it should also benefit the community to be able to consume any SHA with much lower overhead. We are already doing this per tag, so it's basically just increasing the frequency there. Thoughts?
w
i would be in favor of this.
when John added the final deploy shard for github, we resolved https://github.com/pantsbuild/pants/issues/4896 (edit: which was 100% reasonable, to be clear)
i think that doing an s3 pex deploy per master/branch commit would be cheap and worthwhile.
(...lot o' history in 4896... cc @gorgeous-alligator-94763)
e
The community does not consume master commits. Twitter, one important member of the community, does. This is just calling out a - likely innocent - disingenuous argument. Not to mention being hoist by one's own petard. That said, have at it. I'm losing some of my fight on this issue.
w
cool, thanks for understanding. maybe if we build it, people will come 🙂
👍 1
w
Yeah. I could imagine that with advertising, people would use them every now and then.
w
@enough-analyst-54434 qq: Does https://github.com/pantsbuild/pants/blob/38ccf42c49f31e26462a266e8c63fb7875bcec14/.travis.yml#L131-L152 deploy
pants.pex
to s3 first, with the GitHub release pex (e.g. https://github.com/pantsbuild/pants/releases/tag/release_1.11.0rc0) then uploaded manually? If that’s the case, does that mean I only need to add
OR branch=master
to
if:
stages:
  - Test Pants
  - name: Deploy Pants Pex
    if: tag IS present AND tag =~ ^release_.*$
e
No, it builds the multiplatform pex (
./build-support/bin/release.sh -p
), then deploys directly to (GitHub) releases. Master should not abuse GitHub, Bintray, etc. in this way.
Also,
PANTS_PEX_RELEASE=stable
is untrue for master.
w
So we realized that the pex built in OSS is a thin pex, as in only
pantsbuild.pants
and its deps’ wheels are included, but none of the contrib modules. I am not sure how feasible it is to build a thicker pex in OSS, because Twitter needs a
pants.pex
with
pantsbuild.pants
+ a subset of the contribs. (Maybe benjy can offer build-pants-pex-as-a-service :p) I’ve reverted https://github.com/pantsbuild/pants/pull/6674 to unblock our internal release process.
g
I’m thinking of adding an option to the release script, usable only with -p and -q, that will build a fat pex. how does that sound?
e
-p is already fat, what do you mean exactly?
w
It does not contain the wheels needed for pycodestyle for example.
w
there is an environment variable you can pass to include more contrib modules
but yea, if you were going to do
thin
and
thick
, the thick one would probably need to contain all of the contrib modules
e
So, it would be awesome to change one thing at a time. Today we publish a "thin" pex, by your definition, for tags. You wanted this on master too. And prior to that you wanted a simple tool to build these for any branch to use for internal testing. Can we knock those off first before deploying fat pexes, which is something we currently do nowhere?
w
basically push through https://github.com/pantsbuild/pants/pull/6694 first right?
e
Yes, and also just do what was discussed. A fat pex - if we agreed it even made sense - must include all contribs, and would first require the groundwork to turn off all contribs by default if you want to claim this is usable by anyone.
Just noticed:
(Maybe benjy can offer build-pants-pex-as-a-service :p)
I truly am dumbfounded. I thought the whole / only point of Twitter wanting to consume publicly released dists / pexes was that they were the actual ones to go through CI. If you want to build an à la carte pex using pre-tested wheels, you already can, but that exact combo of wheels in 1 pex will not be the binary that ran CI.
IOW, modifying release.sh -p to take args, so that
release.sh -p contrib.x contrib.y
gets you a multiplatform pex built from CI-generated wheels at a given SHA, would be maybe a 30-line delta and require 0 extra CI infra.
Twitter or anyone else could run this to generate their à la carte pex of CI-built / tested component parts and then deploy that however they wished.
With that 30-line delta done, the last step as I understand it is a bit of script to run said-same on an OSX machine, but using locally built wheels via
release.sh -n
once locally on OSX and once in a Docker image for CentOS 6.
That's the bit that supports roundtripping through Twitter internal CI before a PR hits master.
w
there’s already
PANTS_PEX_PACKAGES
, but yeah, CLI args seem better
e
Ah right. So why are deployed pexes needed again if Twitter has to use that env var? That implies you rebuild a pex anyhow.
w
I just learned that, but yes, we did rebuild a pex via
PANTS_PEX_PACKAGES
. Although rebuilding wouldn’t be possible if
release.sh
is shelling out to
./pants ...
, hence the idea of building everything upstream and not having to build the pex internally.
e
But you added shelling out to pants - so don't do that
g
my use case is a little different here
e
Whatcha got?
w
shelling out to pants - so don’t do that
yeah, initially I took on the task of converting some of the bash to python, but it did not seem to work well, hence the revert.
e
python would be fine, it sounds like your requirement is pure python that needs to fetch no deps
... except from s3 - that's ok for some reason
presumably a network proxy whitelist that bans pypi
g
I want to build Linux wheels (for now) that don’t go into S3, and I was hoping I wouldn’t need to change the release script (to build the pex from the contents of dist instead of from S3). I need the fat pexes to be able to run the pex as the pants executable.
w
we whitelisted s3; for pypi we have an internal mirror. the choke point is the rust deps, which include some forks of repos
g
it seems like my best path forward is going back to my original solution which is what @enough-analyst-54434 and I spoke about
e
@gorgeous-alligator-94763 Why did that get derailed as a solution in the first place?
@wide-energy-11069 - gotcha. So a python replacement of the release script can totally leverage libs - say requests or boto - it just can't use pants code
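For concreteness, a minimal sketch of that kind of pure-python helper, using only requests (no pants code, no ./pants bootstrap); the per-SHA URL layout on binaries.pantsbuild.org shown here is hypothetical:

import sys
import requests

def fetch_master_pex(sha, dest="pants.pex"):
    # Hypothetical per-SHA layout - the real binaries.pantsbuild.org path may differ.
    url = "https://binaries.pantsbuild.org/pexes/pants.{}.pex".format(sha)
    resp = requests.get(url, stream=True, timeout=60)
    resp.raise_for_status()
    with open(dest, "wb") as f:
        for chunk in resp.iter_content(chunk_size=8192):
            f.write(chunk)
    return dest

if __name__ == "__main__":
    # Usage: python fetch_master_pex.py <master-sha>
    print(fetch_master_pex(sys.argv[1]))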
g
because @wide-energy-11069 was making changes to the release script at the same time, so to avoid clashes, I tried to find another way to do it. I thought it would be simpler to use -q since we only needed a Linux-compatible pex
e
Actually - it almost certainly can use pants code, but via PYTHONPATH=src/python aka sourcing the pants_venv
Case in point on the latter is Danny's reworking of build-support/bin/download_binary.sh
That gets executed before rust is bootstrapped, which proves the point
Gotcha @gorgeous-alligator-94763.
w
pip install everything
works basically
just not building the rust code
e
Great - Danny's technique can be mimicked then
if you want to detour in a rewrite.
I'd suggest maybe doing that after though, as a cleanup. None of the things we talked about is hard or large in terms of amendments to the existing bash
w
to clarify, were you suggesting using
build-support/bin/download_binary.sh
to download the `engine.so`s?
e
No
I'm suggesting you use the technique used in that script to run pants code
Danny should be a great resource. And you can teach him about Fire - which he missed in that change despite a prod.
w
At a high level, I was just confused about how this would work without building the rust code. Sure, I will talk to Danny 🙂
e
The short answer is that any pants code which does not invoke the engine doesn't need rust - so basically all utils and subsystems are usable.
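A small sketch of that point, assuming the pants.util import paths of this era: with the repo's pants venv active and src/python on PYTHONPATH (per the PYTHONPATH=src/python / pants_venv suggestion above), engine-free helpers import and run with no rust bootstrap, because nothing here parses BUILD files:

# Run as: PYTHONPATH=src/python python engine_free_example.py  (inside the pants venv)
from pants.util.contextutil import temporary_dir
from pants.util.dirutil import safe_mkdir

with temporary_dir() as tmp:
    # Plain filesystem helpers - no engine, no rust, no BUILD file parsing.
    safe_mkdir(tmp + "/scratch")
    print("pants utils ran without the engine under " + tmp)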
w
thanks!
w
no BUILD file parsing
which means no binaries can be built
so would not work in this case.
e
Yeah - aka don't run pants
w
Yi is going to open a ticket
ah, i see what you mean.
e
OK. His prior attempt didn't try to build binaries iirc - just use Fetcher
w
yea, Yi's going to open a ticket
w
His prior attempt didn’t try to build binaries iirc - just use Fetcher
the
run_local_pants
part is the gotcha. https://github.com/pantsbuild/pants/commit/01c807efd51ab4f98e727ab21491c18d71d462c8#diff-9ed7102b7836807dc342cc2246ec4839R438
e
Understood - pointing out that was an arguably extravagant way to run a main that you didn't need to do.
With the venv activated you can just run it
w
@wide-energy-11069: let's do this on the ticket please. i think i understand what John is suggesting, but easiest in that format.
👍 1
e
I'll underscore there was nothing new / interesting in this thread besides fat pexes. That is totally new and something we've never done for pypi dists or pexes.
It would probably be most helpful to just lay out Twitter's requirements explicitly so we can end this once and for all with a solution. IE: We can whitelist up to 2 urls, we can mirror x but not y, and we want the produced binaries to have been used in XYZ ways in CI.
w
yea, that is what i hope to add to Yi's ticket if it doesn't include it.
e
Excellent
w
(and Dorothy's, which is for a different use case)
e
Yeah - good. The 'vet a branch internally before it hits pantsbuild/pants:master' use case.
w
slight follow-up: we were able to quickly narrow down, via the master pexes, an offending commit which caused an openssl failure only on newer versions of Linux (Ubuntu 18 in this case)
e
This could have been caught publicly w/o the master pexes: https://github.com/pantsbuild/setup/pull/29
w
I think what he meant was that he was able to bisect to the precise commit more quickly
e
Yes, but if that CI was run as part of our OSS process - no bisect
It would just be red right away
w
Since this was a use case that required a CentOS binary being consumed on Ubuntu
e
It does not
apparently
w
The bug did
e
Exception message: libssl.so.10: cannot open shared object file: No such file or directory
?
That's what the linked setup CI gets for Ubuntu 12.04 and 14.04 straight up
w
right
because it was consuming a binary built on CentOS 6... i.e., in pantsbuild/pants CI
so bisecting would mean rebuilding things on CentOS and then consuming them on Ubuntu... non-trivial
(easier if you're on linux though)
or i suppose you could do it with two docker images on osx
e
I'm confused, but I'll file an issue to make the setup CI part of release. I'm just showing that the master pex releases have nothing to do with detecting this - we could have been detecting it from the wheels already, as my PR proves.
w
oooh, i see what you mean.
e
Yeah - batting down another poor argument!
I'll see if I can figure something out: https://github.com/pantsbuild/pants/issues/6875
w
i think that if it were possible to bisect directly with the
setup
script (e.g. by fetching from s3) it would be easier to bisect in general
in this case, the bug was so easy to detect in the right env (centos6 -> ubuntu) that just running setup would be sufficient. in other cases it might be harder
but we are not consumers of the setup script in general, and in the past we've talked about switching setup to consuming pexes... not sure whether that is still a goal
e
setup wheel/pex aside - my point here is there should be no need for private bisects - we should publicly test the case whenever possible. Private testing is wonderful to have but horrible to rely on.
As far as setup consuming pexes - no one is clamoring for this that I know of, so it hasn't hit my priority radar at any rate.
w
yep. insofar as it is possible to test wacky combinations like this, we're all for it.
e
Cool. And this is not wacky - Ubuntu is still - I assume - the most popular distro out there.
w
unfortunately, it probably generalizes to the thing you were trying to test before: build a binary once within a travis run, and then kick off all of the other shards to consume that binary
re: ubuntu, yes: but the cross-building thing was unexpected.
e
Yeah - and that proved strictly slower.
w
i thought that the issue was that PR builds couldn't actually use creds to write anywhere? or did you find a way around that
(we added OSX 10.12 and 10.13 sanity-check shards recently, but it's not clear that they're adding as much coverage as they could be, because it's not a cross-compile from some other OSX)
e
There is no way around that ever for any CI provider.
w
which has been nagging at the back of my brain from a remoting perspective for a while =/ (cc @average-vr-56795)
how do you give PR contributors access to a remoting cluster? feels like a whitelist is the only thing
or bring your own creds to test with, which is not very equitable
e
how do you give PR contributors access to a remoting cluster?
You prop up a github app. IE: you can't cheese off travis or circle, etc.
w
it's expensive cheese... and in theory those providers are already plenty exploitable
but yea.
hm... or maybe ... if the provider implemented an authenticated proxy or something?
requests from within the container are accepted without auth, and are then authed to another (gRPC) service
e
Whatever it is, it just has to base all trust on github webhooks - it cannot use credentials configured on the main repo that PR authors fork from.
w
yea.
you could have some semblance of protection if the github webhook api provided something that was signed by their private key for a particular PR
then you could require that signature for a PR that is less than X hours old or something
anyway, back to options parsing =P
e
github already signs the payloads it sends to your webhook with a shared secret to prove github is github. From there you trust the payload and check it against any internal whitelist of parent repos you're allowing to use your service. This much is all solved and it's what travis, circle, etc use.
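A minimal sketch of that trust model (the X-Hub-Signature HMAC check is GitHub's standard webhook signing scheme; the secret value and whitelist here are illustrative):

import hashlib
import hmac
import json

WEBHOOK_SECRET = b"configured-in-github-webhook-settings"  # shared secret; GitHub sends a signature, not the secret
ALLOWED_PARENT_REPOS = {"pantsbuild/pants"}  # example whitelist of repos allowed to use the service

def trusted_pr_event(headers, raw_body):
    # 1. Prove the delivery really came from GitHub: recompute the HMAC and compare.
    expected = "sha1=" + hmac.new(WEBHOOK_SECRET, raw_body, hashlib.sha1).hexdigest()
    if not hmac.compare_digest(expected, headers.get("X-Hub-Signature", "")):
        return False
    # 2. Only then trust the payload, and mask it against the internal whitelist.
    payload = json.loads(raw_body)
    base_repo = payload.get("pull_request", {}).get("base", {}).get("repo", {}).get("full_name")
    return base_repo in ALLOWED_PARENT_REPOS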
w
mhm
neat, thanks