#general
f

few-arm-93065

03/23/2023, 9:45 PM
Hi folks! Is there documentation of the pants lockfile format?
e

enough-analyst-54434

03/23/2023, 9:57 PM
Nope. It comes from Pex and is purposely opaque / undocumented currently. What's your use case?
f

few-arm-93065

03/23/2023, 9:59 PM
We need to generate a software bill of materials (SBOM) to satisfy regulatory requirements, so we’d like to write software that will consume the python lockfile as well as lock files from other languages to produce a combined SBOM.
e

enough-analyst-54434

03/23/2023, 10:01 PM
Gotcha. For now you just have to consume as is and hope it doesn't break. Pants itself does this to print out pretty diffs when you run `pants generate-lockfiles`.
Is there some SBOM standard? I'd feel much more comfortable adding a feature to Pex to emit that than to defining and committing to its own lock file format standard.
SBOM seems to scream standard.
f

few-arm-93065

03/23/2023, 10:06 PM
There is an open standard, SPDX. https://spdx.github.io/spdx-spec/v2.3/
e

enough-analyst-54434

03/23/2023, 10:09 PM
Ok, I just found two others as well - my god. Taking SPDX though, if I provide a `pex3 lock export` (an existing command) `--format` for that, would that suffice?
f

few-arm-93065

03/23/2023, 10:09 PM
That would be amazing!
Yes, unfortunately there are a lot of standards. SPDX seems to be the one that is widely used and supported by the Linux Foundation.
e

enough-analyst-54434

03/23/2023, 10:12 PM
Ok. Since you know something about this would you be willing to file a Pex issue? I need to page in that spec to see how much work this entails, but clearly it's ~exactly the amount of work you are preparing to do anyway.
f

few-arm-93065

03/23/2023, 10:13 PM
Absolutely! Happy to.
And if I can, I’d be happy to contribute to the feature. I would need to learn how.
e

enough-analyst-54434

03/23/2023, 10:16 PM
Ok, that would be great. I'll use the issue to seed some notes on where this goes and where the underlying lock data model is, etc.
This is Pex for completeness: https://github.com/pantsbuild/pex
f

few-arm-93065

03/23/2023, 10:23 PM
c

curved-television-6568

03/23/2023, 10:28 PM
thanks for bringing this to our attention @few-arm-93065! /me follows this 🙂
e

enough-analyst-54434

03/23/2023, 10:40 PM
The checksum thing is pretty bad. Sometimes there are 10s of files per locked project and 100s of locked projects in 1 lock file. Pulling down ~1000 wheels just to re-fingerprint down from sha256 to sha1 is pretty horrible. Maybe I read the spec wrong?
f

few-arm-93065

03/23/2023, 10:46 PM
I’m just diving into this spec as well. Frustrating. It looks like allowing people to use any checksum function has been suggested, and may make it into the v3 version of the standard. https://github.com/spdx/spdx-spec/issues/106
e

enough-analyst-54434

03/23/2023, 10:47 PM
Ok. I mean, that seems like a blocker to me. Poor PyPI for one.
It certainly could be done, just more than a bit crazy since it gives you less surety in your SBOM!
So, @few-arm-93065 come hell or high water though you must produce SPDX across software ecosystems anyhow?
Presumably this is a problem elsewhere too?
f

few-arm-93065

03/23/2023, 10:49 PM
No worries, we can always consume the pants/pex lockfile in the interim
e

enough-analyst-54434

03/23/2023, 10:50 PM
So you will be spamming PyPI?
f

few-arm-93065

03/23/2023, 10:51 PM
Internally, we actually don’t care about perfect adherence to SPDX. We’ll be producing a human-readable regulatory document. So in the short term we can make a somewhat hacky attempt at it.
e

enough-analyst-54434

03/23/2023, 10:51 PM
Aha.
Yeah, I'd definitely want to emit a standard if it is to be a Pex feature.
f

few-arm-93065

03/23/2023, 10:54 PM
I wonder if there’s a more lightweight standard we could export locks into. CycloneDX https://cyclonedx.org is another one that seems to offer a lot more flexibility.
e

enough-analyst-54434

03/23/2023, 10:56 PM
That was one of the other two I found. I'll take a look. Back on SPDX, there is https://github.com/pantsbuild/pex/blob/main/pex/resolve/pep_691/fingerprint_service.py which Pex uses. I'm not sure what actual hashes PyPI tends to present. I'll check that now.
The thing is, that's ~only supported for PyPI. If using custom indexes in addition, downloading all the things from those indexes will still be needed.
f

few-arm-93065

03/23/2023, 10:58 PM
my knowledge of custom indexes isn’t great, are they guaranteed to provide at least one kind of hash, even if it’s not sha1 or sha256?
ah, the code you linked to answers my question. If that were true pex wouldn’t need to download packages to fingerprint them…
e

enough-analyst-54434

03/23/2023, 10:59 PM
They are not. Pex falls back to downloading and hashing.
Ok, tried out PyPI PEP-691 on a project and it only returns sha256 hashes.
$ curl -sSL -H "Accept: application/vnd.pypi.simple.v1+json" https://pypi.org/simple/p537 | jq .
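The same check can be done programmatically. Below is a minimal sketch that assumes the PEP 691 JSON layout (a top-level `files` list whose entries carry a `hashes` mapping that may be empty). The sample response, the project name and the function name `summarize_hashes` are invented for illustration and are not real PyPI data.

```python
import json

# Illustrative PEP 691 style response. The project, URLs and digests are
# placeholders made up for this sketch, not real index data.
SAMPLE_RESPONSE = """
{
  "meta": {"api-version": "1.0"},
  "name": "example-project",
  "files": [
    {"filename": "example_project-1.0-py3-none-any.whl",
     "url": "https://files.example.invalid/example_project-1.0-py3-none-any.whl",
     "hashes": {"sha256": "aaaa"}},
    {"filename": "example_project-1.0.tar.gz",
     "url": "https://files.example.invalid/example_project-1.0.tar.gz",
     "hashes": {}}
  ]
}
"""


def summarize_hashes(index_page):
    """Return (set of hash algorithms seen, list of filenames with no hash)."""
    algorithms = set()
    unhashed = []
    for file_info in index_page.get("files", []):
        hashes = file_info.get("hashes") or {}
        if hashes:
            algorithms.update(hashes)
        else:
            # These are the files a tool would have to download and hash itself.
            unhashed.append(file_info["filename"])
    return algorithms, unhashed


algorithms, unhashed = summarize_hashes(json.loads(SAMPLE_RESPONSE))
print(sorted(algorithms), unhashed)
```

Files that land in the second list are the ones that would force the download-and-hash fallback mentioned above.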
f

few-arm-93065

03/23/2023, 11:02 PM
And that’s just an implementation detail of PyPI... I’m curious what you think of CycloneDX. It seems to allow for many kinds of fingerprint.
e

enough-analyst-54434

03/23/2023, 11:04 PM
I can't even see it requiring 1. Components seem to be 0 or more, and the same goes for hashes: https://cyclonedx.org/docs/1.4/json/#components_items_hashes
This spec is easier to read, if looser. And that draws my attention to licenses. Presumably you need those? (they are also 0 or more).
For that, Pex will again need to download the file to extract the license. You could maybe cheat and just download 1 and assume the license is the same in each wheel variant (and sdist) published for that version.
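To make the "0 or more" point concrete, here is a rough sketch of a CycloneDX 1.4 JSON document built as plain Python dicts, based on my reading of the linked schema. The helper names `cyclonedx_component` and `cyclonedx_bom`, the project name and the digest are all hypothetical; this is not how Pex does or would emit anything.

```python
import json


def cyclonedx_component(name, version, sha256=None, license_id=None):
    """Build one component; hashes and licenses are simply omitted if unknown."""
    component = {"type": "library", "name": name, "version": version}
    if sha256:
        component["hashes"] = [{"alg": "SHA-256", "content": sha256}]
    if license_id:
        # Assumes license_id is an SPDX license identifier such as "MIT".
        component["licenses"] = [{"license": {"id": license_id}}]
    return component


def cyclonedx_bom(components):
    """Wrap components in the minimal top-level BOM fields."""
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.4",
        "version": 1,
        "components": list(components),
    }


bom = cyclonedx_bom(
    [
        # Placeholder digest: 64 hex characters, not a real file hash.
        cyclonedx_component("example-project", "1.0", sha256="a" * 64, license_id="MIT"),
        # A component with no hash and no license information at all.
        cyclonedx_component("other-project", "2.3"),
    ]
)
print(json.dumps(bom, indent=2))
```

The second component carries neither `hashes` nor `licenses`, which is the looseness being discussed here.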
f

few-arm-93065

03/23/2023, 11:08 PM
yes, licenses are going to be important. Doesn’t the pypi JSON API have a license field?
e

enough-analyst-54434

03/23/2023, 11:08 PM
Not at that endpoint.
f

few-arm-93065

03/23/2023, 11:09 PM
I see “license” under the “info” object in the main /pypi/project/json endpoint, I assume pex does hit that as well?
e

enough-analyst-54434

03/23/2023, 11:09 PM
Pex does not, no.
f

few-arm-93065

03/23/2023, 11:09 PM
how about classifiers?
(that’s a long shot admittedly)
e

enough-analyst-54434

03/23/2023, 11:10 PM
Yeah - none of that. Ok, here: https://warehouse.pypa.io/api-reference/json.html I thought that API was deprecated, occasionally blacked out, etc though. Let me check.
OK, I guess not - looks legit. So that would be the primary way to get the extra SBOM data, with, again, fallback to downloading artifacts and cracking them open.
f

few-arm-93065

03/23/2023, 11:14 PM
Even a “lazy” approach would be ok with me - if the authors didn’t set the license in pypi, just don’t include anything in the BOM, indicating we made an attempt but manual investigation is needed
(I assume we’ll be chasing down things like this regardless)
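A sketch of that "lazy" approach, assuming the caller has already fetched and parsed a project's document from the PyPI JSON API and that it exposes `license` and `classifiers` under `info`. The function name `lazy_license` is made up for this example, and the classifier fallback is my own addition.

```python
def lazy_license(project_json):
    """Return the declared license text, or None when the authors set nothing.

    Returning None signals "we looked, manual investigation needed" rather
    than guessing.
    """
    info = project_json.get("info") or {}
    declared = (info.get("license") or "").strip()
    if declared:
        return declared

    # Fall back to trove classifiers such as
    # "License :: OSI Approved :: MIT License".
    prefix = "License :: "
    from_classifiers = [
        classifier[len(prefix):]
        for classifier in info.get("classifiers") or []
        if classifier.startswith(prefix)
    ]
    return "; ".join(from_classifiers) or None
```

A caller would simply leave the license out of the BOM entry whenever this returns `None`.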
e

enough-analyst-54434

03/23/2023, 11:14 PM
If you can find a lazy enough spec then I'm happy to support that directly in Pex. That's crucial though for Pex support.
I don't want Pex to emit invalid spec X.
f

few-arm-93065

03/23/2023, 11:15 PM
of course - and understood. CycloneDX does seem to support an array of 0 or more licenses. I’ll need to do some more detailed research on this.
e

enough-analyst-54434

03/23/2023, 11:16 PM
Ok, in the meantime I'll add the code pointers to the issue in case this ends up being feasible as a Pex feature pending your spec investigation.
f

few-arm-93065

03/23/2023, 11:16 PM
thank you!
e

enough-analyst-54434

03/23/2023, 11:27 PM
You're welcome. One last question from my end - ignorant of SBOMs - is it intended that an SBOM actually contains unused software? Pex `--style universal` lock files - which is what Pants uses - lock the artifacts needed to form a PEX across Python versions and target systems (Linux & Mac). Presumably though you only actually build software for some of those. IOW you produce a PEX file that just contains a sub-slice of the lock file and you never actually ship or use - say - 90% of the artifacts in the lock. Is this as intended?
f

few-arm-93065

03/23/2023, 11:29 PM
ah, I didn’t realize that. That is not intended. We use a single platform for all the PEXs we ship (devs on macs but building docker to x86 linux). The regulators only care about the packages that are in the actual product.
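To illustrate the "sub-slice" idea, here is a deliberately crude sketch that filters wheel file names by their tags. It only assumes the standard wheel naming scheme (`name-version-python-abi-platform.whl`). Real compatibility checks are more involved, and the helper names `wheel_tags` and `likely_ships` are hypothetical.

```python
def wheel_tags(filename):
    """Split a wheel file name into its python, abi and platform tag lists."""
    stem = filename[: -len(".whl")]
    _, python_tags, abi_tags, platform_tags = stem.rsplit("-", 3)
    return python_tags.split("."), abi_tags.split("."), platform_tags.split(".")


def likely_ships(filename, interpreter="cp39", platform_prefix="manylinux"):
    """Rough guess at whether a wheel belongs in a PEX for one target.

    Assumption: a pure-Python "py3" / "any" wheel is usable everywhere, and a
    binary wheel is usable only when both interpreter and platform match.
    """
    python_tags, _, platform_tags = wheel_tags(filename)
    python_ok = interpreter in python_tags or "py3" in python_tags
    platform_ok = "any" in platform_tags or any(
        tag.startswith(platform_prefix) for tag in platform_tags
    )
    return python_ok and platform_ok


locked = [
    "PyYAML-6.0-cp37-cp37m-macosx_10_9_x86_64.whl",
    "PyYAML-6.0-cp39-cp39-macosx_10_9_x86_64.whl",
    "PyYAML-6.0-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.whl",
    "six-1.16.0-py2.py3-none-any.whl",
]
shipped = [name for name in locked if likely_ships(name)]
print(shipped)
```

With the default CPython 3.9 on Linux target, only the last two names survive, which is why an SBOM built from a universal lock would over-report.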
e

enough-analyst-54434

03/23/2023, 11:29 PM
Right. And this is why Pex itself defaults to `--style strict` locks.
Pants is getting in your way here and you probably don't actually mean to SBOM a lock file.
f

few-arm-93065

03/23/2023, 11:30 PM
thanks for the heads up on that one. Can I call pex directly to generate a strict style lock? or is there a way I can deconstruct a pex?
e

enough-analyst-54434

03/23/2023, 11:31 PM
So, this whole conversion to an SBOM could instead be a pex-tool added to PEXes, i.e.:
PEX_TOOLS=1 my.pex generate-sbom here.json
That would be easy to add to Pex. The actual wheels you really use are local at that point and re-hashing is cheap and eco-sensitive, etc.
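As a sketch of why local re-hashing is cheap: once the files are on disk, several digests can be computed in a single read pass, so producing a sha1 alongside a sha256 needs no network traffic. The function name `fingerprint` and its parameters are invented for this example and are not Pex APIs.

```python
import hashlib


def fingerprint(path, algorithms=("sha1", "sha256"), chunk_size=1024 * 1024):
    """Hash one local file with several algorithms while reading it once."""
    digests = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as fp:
        # Read in chunks so large wheels do not need to fit in memory.
        for chunk in iter(lambda: fp.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}
```

Calling `fingerprint("some-file.whl")` returns a dict keyed by algorithm name, ready to drop into whichever checksum fields the chosen SBOM format wants.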
Getting Pants to `--style strict` will be a bear I think.
f

few-arm-93065

03/23/2023, 11:32 PM
That’s actually a great solution - and much more airtight when it comes to proving that the sbom is complete and accurate
e

enough-analyst-54434

03/23/2023, 11:33 PM
Ok, great. That's much more sane.
Let me update the ticket notes with new code pointers for pex-tools.
f

few-arm-93065

03/23/2023, 11:33 PM
thank you, I really appreciate it!
e

enough-analyst-54434

03/23/2023, 11:44 PM
And, 1 more. Since a PEX is a single file - in some sense you can just SBOM that and be done. That does leave out the licenses and versions of all the included installed wheels though which you probably want.
For example, the Pants PEX (which few people use), has:
$ zipinfo ~/Downloads/pants.2.16.0.dev3.pex | grep LICENSE
-rw-r--r--  2.0 unx     1078 b- defN 80-Jan-01 00:00 .bootstrap/pex/vendor/_vendored/setuptools/setuptools-44.0.0+3acb925dd708430aeaf197ea53ac8a752f7c1863.dist-info/LICENSE
-rw-r--r--  2.0 unx     1125 b- defN 80-Jan-01 00:00 .bootstrap/pex/vendor/_vendored/wheel/wheel-0.37.1.dist-info/LICENSE.txt
-rw-rw----  2.0 unx     1101 b- defN 80-Jan-01 00:00 .deps/PyYAML-6.0-cp37-cp37m-macosx_10_9_x86_64.whl/PyYAML-6.0.dist-info/LICENSE
-rw-rw----  2.0 unx     1101 b- defN 80-Jan-01 00:00 .deps/PyYAML-6.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl/PyYAML-6.0.dist-info/LICENSE
-rw-rw----  2.0 unx     1101 b- defN 80-Jan-01 00:00 .deps/PyYAML-6.0-cp38-cp38-macosx_10_9_x86_64.whl/PyYAML-6.0.dist-info/LICENSE
-rw-rw----  2.0 unx     1101 b- defN 80-Jan-01 00:00 .deps/PyYAML-6.0-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl/PyYAML-6.0.dist-info/LICENSE
-rw-rw----  2.0 unx     1101 b- defN 80-Jan-01 00:00 .deps/PyYAML-6.0-cp39-cp39-macosx_10_9_x86_64.whl/PyYAML-6.0.dist-info/LICENSE
-rw-rw----  2.0 unx     1101 b- defN 80-Jan-01 00:00 .deps/PyYAML-6.0-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl/PyYAML-6.0.dist-info/LICENSE
-rw-rw----  2.0 unx     1052 b- defN 80-Jan-01 00:00 .deps/certifi-2022.12.7-py3-none-any.whl/certifi-2022.12.7.dist-info/LICENSE
-rw-rw----  2.0 unx     1070 b- defN 80-Jan-01 00:00 .deps/charset_normalizer-2.1.1-py3-none-any.whl/charset_normalizer-2.1.1.dist-info/LICENSE
-rw-rw----  2.0 unx     1081 b- defN 80-Jan-01 00:00 .deps/chevron-0.14.0-py3-none-any.whl/chevron-0.14.0.dist-info/LICENSE
-rw-rw----  2.0 unx    10143 b- defN 80-Jan-01 00:00 .deps/fasteners-0.16.3-py2.py3-none-any.whl/fasteners-0.16.3.dist-info/LICENSE
-rw-rw----  2.0 unx     1523 b- defN 80-Jan-01 00:00 .deps/idna-3.4-py3-none-any.whl/idna-3.4.dist-info/LICENSE.md
-rw-rw----  2.0 unx     2265 b- defN 80-Jan-01 00:00 .deps/ijson-3.1.4-cp37-cp37m-macosx_10_9_x86_64.whl/ijson-3.1.4.dist-info/LICENSE.txt
-rw-rw----  2.0 unx     2265 b- defN 80-Jan-01 00:00 .deps/ijson-3.1.4-cp37-cp37m-manylinux2010_x86_64.whl/ijson-3.1.4.dist-info/LICENSE.txt
-rw-rw----  2.0 unx     2265 b- defN 80-Jan-01 00:00 .deps/ijson-3.1.4-cp38-cp38-macosx_10_9_x86_64.whl/ijson-3.1.4.dist-info/LICENSE.txt
-rw-rw----  2.0 unx     2265 b- defN 80-Jan-01 00:00 .deps/ijson-3.1.4-cp38-cp38-manylinux2010_x86_64.whl/ijson-3.1.4.dist-info/LICENSE.txt
-rw-rw----  2.0 unx     2265 b- defN 80-Jan-01 00:00 .deps/ijson-3.1.4-cp39-cp39-macosx_10_9_x86_64.whl/ijson-3.1.4.dist-info/LICENSE.txt
-rw-rw----  2.0 unx     2265 b- defN 80-Jan-01 00:00 .deps/ijson-3.1.4-cp39-cp39-manylinux2010_x86_64.whl/ijson-3.1.4.dist-info/LICENSE.txt
-rw-rw----  2.0 unx      568 b- defN 80-Jan-01 00:00 .deps/importlib_resources-5.0.7-py3-none-any.whl/importlib_resources-5.0.7.dist-info/LICENSE
-rw-rw----  2.0 unx      197 b- defN 80-Jan-01 00:00 .deps/packaging-21.3-py3-none-any.whl/packaging-21.3.dist-info/LICENSE
-rw-rw----  2.0 unx    10174 b- defN 80-Jan-01 00:00 .deps/packaging-21.3-py3-none-any.whl/packaging-21.3.dist-info/LICENSE.APACHE
-rw-rw----  2.0 unx     1344 b- defN 80-Jan-01 00:00 .deps/packaging-21.3-py3-none-any.whl/packaging-21.3.dist-info/LICENSE.BSD
-rw-rw----  2.0 unx     1252 b- defN 80-Jan-01 00:00 .deps/pex-2.1.116-py2.py3-none-any.whl/pex/vendor/_vendored/toml/toml-0.10.2.dist-info/LICENSE
-rw-rw----  2.0 unx     1082 b- defN 80-Jan-01 00:00 .deps/pex-2.1.116-py2.py3-none-any.whl/pex/vendor/_vendored/attrs/attrs-21.5.0.dev0.dist-info/LICENSE
-rw-rw----  2.0 unx     1125 b- defN 80-Jan-01 00:00 .deps/pex-2.1.116-py2.py3-none-any.whl/pex/vendor/_vendored/wheel/wheel-0.37.1.dist-info/LICENSE.txt
-rw-rw----  2.0 unx     1090 b- defN 80-Jan-01 00:00 .deps/pex-2.1.116-py2.py3-none-any.whl/pex/vendor/_vendored/pip/pip-20.3.4.dist-info/LICENSE.txt
-rw-rw----  2.0 unx      197 b- defN 80-Jan-01 00:00 .deps/pex-2.1.116-py2.py3-none-any.whl/pex/vendor/_vendored/packaging_20_9/packaging-20.9.dist-info/LICENSE
-rw-rw----  2.0 unx    10174 b- defN 80-Jan-01 00:00 .deps/pex-2.1.116-py2.py3-none-any.whl/pex/vendor/_vendored/packaging_20_9/packaging-20.9.dist-info/LICENSE.APACHE
-rw-rw----  2.0 unx     1344 b- defN 80-Jan-01 00:00 .deps/pex-2.1.116-py2.py3-none-any.whl/pex/vendor/_vendored/packaging_20_9/packaging-20.9.dist-info/LICENSE.BSD
-rw-rw----  2.0 unx     1023 b- defN 80-Jan-01 00:00 .deps/pex-2.1.116-py2.py3-none-any.whl/pex/vendor/_vendored/packaging_20_9/pyparsing-2.4.7.dist-info/LICENSE
-rw-rw----  2.0 unx     1078 b- defN 80-Jan-01 00:00 .deps/pex-2.1.116-py2.py3-none-any.whl/pex/vendor/_vendored/setuptools/setuptools-44.0.0+3acb925dd708430aeaf197ea53ac8a752f7c1863.dist-info/LICENSE
-rw-rw----  2.0 unx     1023 b- defN 80-Jan-01 00:00 .deps/pex-2.1.116-py2.py3-none-any.whl/pex/vendor/_vendored/packaging_21_3/pyparsing-2.4.7.dist-info/LICENSE
-rw-rw----  2.0 unx      197 b- defN 80-Jan-01 00:00 .deps/pex-2.1.116-py2.py3-none-any.whl/pex/vendor/_vendored/packaging_21_3/packaging-21.3.dist-info/LICENSE
-rw-rw----  2.0 unx    10174 b- defN 80-Jan-01 00:00 .deps/pex-2.1.116-py2.py3-none-any.whl/pex/vendor/_vendored/packaging_21_3/packaging-21.3.dist-info/LICENSE.APACHE
-rw-rw----  2.0 unx     1344 b- defN 80-Jan-01 00:00 .deps/pex-2.1.116-py2.py3-none-any.whl/pex/vendor/_vendored/packaging_21_3/packaging-21.3.dist-info/LICENSE.BSD
-rw-rw----  2.0 unx    11323 b- defN 80-Jan-01 00:00 .deps/pex-2.1.116-py2.py3-none-any.whl/pex-2.1.116.dist-info/LICENSE
-rw-rw----  2.0 unx     1549 b- defN 80-Jan-01 00:00 .deps/psutil-5.9.0-cp37-cp37m-macosx_10_9_x86_64.whl/psutil-5.9.0.dist-info/LICENSE
-rw-rw----  2.0 unx     1549 b- defN 80-Jan-01 00:00 .deps/psutil-5.9.0-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl/psutil-5.9.0.dist-info/LICENSE
-rw-rw----  2.0 unx     1549 b- defN 80-Jan-01 00:00 .deps/psutil-5.9.0-cp38-cp38-macosx_10_9_x86_64.whl/psutil-5.9.0.dist-info/LICENSE
-rw-rw----  2.0 unx     1549 b- defN 80-Jan-01 00:00 .deps/psutil-5.9.0-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl/psutil-5.9.0.dist-info/LICENSE
-rw-rw----  2.0 unx     1549 b- defN 80-Jan-01 00:00 .deps/psutil-5.9.0-cp39-cp39-macosx_10_9_x86_64.whl/psutil-5.9.0.dist-info/LICENSE
-rw-rw----  2.0 unx     1549 b- defN 80-Jan-01 00:00 .deps/psutil-5.9.0-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl/psutil-5.9.0.dist-info/LICENSE
-rw-rw----  2.0 unx     1023 b- defN 80-Jan-01 00:00 .deps/pyparsing-3.0.9-py3-none-any.whl/pyparsing-3.0.9.dist-info/LICENSE
-rw-rw----  2.0 unx     1147 b- defN 80-Jan-01 00:00 .deps/python_lsp_jsonrpc-1.0.0-py3-none-any.whl/python_lsp_jsonrpc-1.0.0.dist-info/LICENSE
-rw-rw----  2.0 unx    10142 b- defN 80-Jan-01 00:00 .deps/requests-2.28.1-py3-none-any.whl/requests-2.28.1.dist-info/LICENSE
-rw-rw----  2.0 unx     1050 b- defN 80-Jan-01 00:00 .deps/setuptools-63.4.3-py3-none-any.whl/setuptools-63.4.3.dist-info/LICENSE
-rw-rw----  2.0 unx     1066 b- defN 80-Jan-01 00:00 .deps/six-1.16.0-py2.py3-none-any.whl/six-1.16.0.dist-info/LICENSE
-rw-rw----  2.0 unx     1252 b- defN 80-Jan-01 00:00 .deps/toml-0.10.2-py2.py3-none-any.whl/toml-0.10.2.dist-info/LICENSE
-rw-rw----  2.0 unx    12755 b- defN 80-Jan-01 00:00 .deps/typing_extensions-4.3.0-py3-none-any.whl/typing_extensions-4.3.0.dist-info/LICENSE
-rw-rw----  2.0 unx     1959 b- defN 80-Jan-01 00:00 .deps/ujson-5.6.0-cp37-cp37m-macosx_10_9_x86_64.whl/ujson-5.6.0.dist-info/LICENSE.txt
-rw-rw----  2.0 unx     1959 b- defN 80-Jan-01 00:00 .deps/ujson-5.6.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl/ujson-5.6.0.dist-info/LICENSE.txt
-rw-rw----  2.0 unx     1959 b- defN 80-Jan-01 00:00 .deps/ujson-5.6.0-cp38-cp38-macosx_10_9_x86_64.whl/ujson-5.6.0.dist-info/LICENSE.txt
-rw-rw----  2.0 unx     1959 b- defN 80-Jan-01 00:00 .deps/ujson-5.6.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl/ujson-5.6.0.dist-info/LICENSE.txt
-rw-rw----  2.0 unx     1959 b- defN 80-Jan-01 00:00 .deps/ujson-5.6.0-cp39-cp39-macosx_10_9_x86_64.whl/ujson-5.6.0.dist-info/LICENSE.txt
-rw-rw----  2.0 unx     1959 b- defN 80-Jan-01 00:00 .deps/ujson-5.6.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl/ujson-5.6.0.dist-info/LICENSE.txt
-rw-rw----  2.0 unx     1115 b- defN 80-Jan-01 00:00 .deps/urllib3-1.26.13-py2.py3-none-any.whl/urllib3-1.26.13.dist-info/LICENSE.txt
-rw-rw----  2.0 unx     1050 b- defN 80-Jan-01 00:00 .deps/zipp-3.11.0-py3-none-any.whl/zipp-3.11.0.dist-info/LICENSE
That's a weird PEX though that contains 6 platforms worth of wheels, Python 3.{7,8,9} x Linux/Mac.
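The listing above can also be produced with Python's `zipfile` module. This is a rough sketch that assumes the layout seen in that output (`.deps/<wheel name>/<project>-<version>.dist-info/LICENSE*`). It only looks at top-level dependencies, deliberately skips anything vendored deeper inside a wheel or under `.bootstrap/`, and the function name `license_files_by_distribution` is hypothetical.

```python
import zipfile
from collections import defaultdict


def license_files_by_distribution(pex):
    """Map (project, version) to the LICENSE* paths found under .deps/.

    `pex` may be a path or a binary file object, as accepted by ZipFile.
    """
    found = defaultdict(list)
    with zipfile.ZipFile(pex) as zf:
        for name in zf.namelist():
            parts = name.split("/")
            # Expect exactly: .deps / <wheel> / <project>-<version>.dist-info / <file>
            if len(parts) != 4 or parts[0] != ".deps":
                continue
            dist_info, filename = parts[2], parts[3]
            if not dist_info.endswith(".dist-info"):
                continue
            if not filename.upper().startswith("LICENSE"):
                continue
            project, version = dist_info[: -len(".dist-info")].rsplit("-", 1)
            found[(project, version)].append(name)
    return dict(found)
```

For a multi-platform PEX like the one above, a project such as PyYAML would map to one entry per platform-specific wheel, so a caller may want to deduplicate by content.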
f

few-arm-93065

03/23/2023, 11:47 PM
yes, the pex we generate is considered the “medical device” in our use case - the dependencies are what we need to report on. If I’m following your question correctly.
e

enough-analyst-54434

03/23/2023, 11:47 PM
Right, gotcha. Makes sense.