< witty crayon 22786> and cc < enough analyst 54434> curious Pants #development

hundreds-father-404

03/09/2022, 7:11 PM

@witty-crayon-22786 (and cc @enough-analyst-54434), curious your thoughts on `available_concurrency` for building a PEX x lockfiles:

hundreds-father-404

03/09/2022, 7:11 PM

• if no lockfile, we use # requirements. You point out this isn't perfect because we miss transitive deps
• if lockfile, we use # of lines. But I realize that's not good either because one req might have 10 lines due to `--hash`
• w/ Pex JSON lockfiles, we actually could parse the JSON and get an exact count. But John has discouraged that because the format is not stable

I'm wondering if for lockfiles, we switch to the imperfect no-lockfile heuristic of # of input requirements? Altho gr, even that doesn't work well, we only know the req_strings of the current context, which can be a subset. We'd need to parse the lockfile header to get the # of input requirements, but there's no guarantee a lockfile header is present due to manual lockfile generation

So maybe this?
1. Keep requirements.txt the same
2. Parse PEX JSON, but make this fail-safe if the format changes on us. Fall back to the `req_strings`, even tho that's imperfect
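A minimal sketch of what option 2 could look like, for illustration only. The function name `estimate_requirement_count` is hypothetical, and the JSON keys `locked_resolves` and `locked_requirements` are an assumption about the lockfile layout, not a stable contract. Any parse or shape error falls back to the number of `req_strings`:

```python
import json
from typing import Sequence


def estimate_requirement_count(lockfile_text: str, req_strings: Sequence[str]) -> int:
    """Hypothetical helper: try an exact count from the lockfile JSON, and
    fall back to the number of input requirement strings if anything about
    the format is not what we expect."""
    try:
        lock = json.loads(lockfile_text)
        # Assumed layout (may change): a list of resolves, each holding a
        # list of locked requirements. Take the largest resolve.
        count = max(
            len(resolve["locked_requirements"])
            for resolve in lock["locked_resolves"]
        )
        if count > 0:
            return count
    except (ValueError, KeyError, TypeError):
        # ValueError covers invalid JSON and an empty resolve list;
        # KeyError/TypeError cover an unexpected shape.
        pass
    # Imperfect fallback: only the requirements of the current context.
    return len(req_strings)
```

The broad `except` is the point of the sketch: a format change degrades the estimate instead of breaking the build.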

witty-crayon-22786

03/09/2022, 7:12 PM

Altho gr, even that doesn’t work well, we only know the req_strings of the current context, which can be a subset.

consuming the lockfile for a subset should only actually build the subset, right? if so, that should be fine

witty-crayon-22786

03/09/2022, 7:13 PM

…or maybe that is followup work, and we always build the whole lockfile when we consume it right now?

witty-crayon-22786

03/09/2022, 7:14 PM

i don’t think that this matters a whole lot: but biasing toward heuristics that overcount is better than those that undercount. so #lines remains decent.

enough-analyst-54434

03/09/2022, 7:29 PM

For 2. I'd still highly discourage siloing. Why don't you add a pex3 tool to do this since the code is already all there? The LockedResolve.resolve code, which just spits out the list of artifacts to download, takes ~17ms in my medium size jupyter-server tests.

enough-analyst-54434

03/09/2022, 7:30 PM

On the other hand we simply won't have this option with other resolvers in other verticals, so it is perhaps better to not obsess on this optimization generally or even just right now.

👍 1

witty-crayon-22786

03/09/2022, 7:31 PM

yea, i think that this isn’t worth worrying too much about. bias toward overcounting, and leave as a TODO.

hundreds-father-404

03/09/2022, 7:38 PM

Okay +1 to not spending too much effort on this. Any suggestions for a heuristic that over-counts? A super simple one is to still use line counts. Note that we use `--indent=2` when generating, which causes entries to have newlines. It will definitely overcount

witty-crayon-22786

03/09/2022, 7:38 PM

yea, just line count should be fine.
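A rough sketch of the line-count heuristic agreed on above, assuming the lockfile contents are already available as a string. The name `line_count_hint` and the sample entries are made up for illustration. With indented JSON each entry spans several lines, so the count errs on the high side:

```python
import json


def line_count_hint(lockfile_text: str) -> int:
    """Hypothetical helper: count non-blank lines as a cheap estimate that
    deliberately overcounts the number of requirements."""
    return sum(1 for line in lockfile_text.splitlines() if line.strip())


# Illustrative data only, not a real lockfile format.
entries = [
    {"name": "a", "hashes": ["sha256:111", "sha256:222"]},
    {"name": "b", "hashes": ["sha256:333"]},
]
indented = json.dumps(entries, indent=2)

# Every entry is spread across multiple lines, so the hint is larger
# than the true number of entries.
print(line_count_hint(indented), "lines for", len(entries), "entries")
```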

👍 1
