https://pantsbuild.org/ logo
#development
Title
# development
a

ancient-vegetable-10556

01/07/2022, 8:24 PM
@witty-crayon-22786 Working through invalidation headers for JVM, in Python-land we prepend a comment which is ignored by tools that eventually read the lockfiles. Our JVM lockfile format is custom, but currently is fed straight into a JSON parser. To support invalidation headers, I could either start stripping lines that begin with
#
, or swap to a YAML parser (which will read JSON with comments). Preference here?
h

hundreds-father-404

01/07/2022, 8:26 PM
Any thoughts on TOML for lockfile format? I think it's worked quite well for Cargo
a

ancient-vegetable-10556

01/07/2022, 8:27 PM
Toml would be entirely fine, it’s just that we already have lockfiles out there that are JSON, so there’d need to be deprecation work
w

witty-crayon-22786

01/07/2022, 8:27 PM
what about just embedding a key in the JSON…?
a

ancient-vegetable-10556

01/07/2022, 8:27 PM
root object is a list
sorry 🙂
w

witty-crayon-22786

01/07/2022, 8:27 PM
make it not a list?
a

ancient-vegetable-10556

01/07/2022, 8:28 PM
see above
w

witty-crayon-22786

01/07/2022, 8:28 PM
i don’t think that the deprecation work is avoidable? any of these are changes
f

fast-nail-55400

01/07/2022, 8:28 PM
maybe switch to JSON object and embed a
version
key? then can read in using
json.loads
, if a list then assume v1 lockfile, if an object assume v2+
w

witty-crayon-22786

01/07/2022, 8:29 PM
yea. regardless, it seems like the top level needs to be an object.
a

ancient-vegetable-10556

01/07/2022, 8:29 PM
Right
h

hundreds-father-404

01/07/2022, 8:29 PM
We're so early into the backend that I recommend we think about the ideal format for when we have 100 or 1000 users, rather than the few we have now. It'll be much harder to change in 3 months if we realize we need to
a

ancient-vegetable-10556

01/07/2022, 8:29 PM
@fast-nail-55400 your solution seems the least clunky for now
f

fast-nail-55400

01/07/2022, 8:30 PM
a

ancient-vegetable-10556

01/07/2022, 8:31 PM
yeah, Gradle’s is very similar to what we have for Python
f

fast-nail-55400

01/07/2022, 8:31 PM
not suggesting that we use it, but there might be useful attributes in there
also we need to switch to a JSON object any way since we may need to introduce concepts like “resolve parameters” for scala version
and would need to store them in the lockfile
a

ancient-vegetable-10556

01/07/2022, 8:32 PM
One thing that’s nice about having our invalidation headers as a comment is that we can ensure it lives at the top of the file and includes information on how to regenerate the files. That’s less controllable with a JSON object in the root
h

hundreds-father-404

01/07/2022, 8:32 PM
if we're switching to JSON object anyways, would changing to TOML be much more disruptive?
w

witty-crayon-22786

01/07/2022, 8:33 PM
@ancient-vegetable-10556: you can have a comment explaining how to regenerate, but the other bits of data can still be structured as keys
a

ancient-vegetable-10556

01/07/2022, 8:33 PM
@hundreds-father-404 With JSON objects, the parsing will always work with one line; for TOML or JSON, you’d need to pick one format to try first
@witty-crayon-22786 That’s why I’m asking about how to implement comments, which are unsupported by JSON
h

hundreds-father-404

01/07/2022, 8:34 PM
With JSON objects, the parsing will always work with one line; for TOML or JSON, you’d need to pick one format to try first
I don't follow. What do you eman?
w

witty-crayon-22786

01/07/2022, 8:34 PM
in JSON, there is a sort-of convention for:
Copy code
{
  "#": "This is a comment!",
  "key": "value",
}
🤔 1
a

ancient-vegetable-10556

01/07/2022, 8:34 PM
OK, does the Python json serializer guarantee key order?
w

witty-crayon-22786

01/07/2022, 8:35 PM
sorting keys is optional… not sure if that means its guaranteed, but
a

ancient-vegetable-10556

01/07/2022, 8:35 PM
sort_keys is the opposite of what I want
f

fast-nail-55400

01/07/2022, 8:35 PM
Copy code
Note

This module's encoders and decoders preserve input and output order by default. Order is only lost if the underlying containers are unordered.
a

ancient-vegetable-10556

01/07/2022, 8:36 PM
thanks @fast-nail-55400
@hundreds-father-404
json.load
will produce a list or a dict depending on what the contents of the serialized form is (and you can process based on the type of the deserialized object); for toml we’d need to determine whether the file is toml or json somehow (e.g. trying a toml deserializer and falling back to json)
w

witty-crayon-22786

01/07/2022, 8:40 PM
re: toml: I have no preference. The lockfile format isn't meant to be readable... although there is maybe some benefit to avoiding nesting if you're going to merge conflicts.
1
But even with toml, I never assume that I have correctly merged a conflict... have to re-run to confirm.
a

ancient-vegetable-10556

01/07/2022, 8:41 PM
@witty-crayon-22786 the invalidation header is meant to be readable, so that people can get loudly reminded that it’s not meant to be hand-edited
1
h

hundreds-father-404

01/07/2022, 8:41 PM
for toml we’d need to determine whether the file is toml or json somehow (e.g. trying a toml deserializer and falling back to json)
But only for the sake of backwards compatibility if we want to keep the existing JSON lockfiles still working for now, right? That will go away
w

witty-crayon-22786

01/07/2022, 8:41 PM
@ancient-vegetable-10556: sure: my comment was re: toml vs json, not about whether there should be a comment.
a

ancient-vegetable-10556

01/07/2022, 8:43 PM
I’m certain that JSON objects are not the right tool for an invalidation header comment, which leads me back to the starting point here 🙂 swapping in a YAML (de)serialiser and making it use JSON formatting would be the lowest impact to me right now, followed by adding in a way to strip out the invalidation header; adding in a new parser format would be tedious for now, but potentially useful in the long run; maybe
w

witty-crayon-22786

01/07/2022, 8:45 PM
If not JSON, then toml please.
1
a

ancient-vegetable-10556

01/07/2022, 8:46 PM
ok
h

hundreds-father-404

01/07/2022, 8:47 PM
big +1 to TOML over YAML. YAML has so many gotchas
a

ancient-vegetable-10556

01/11/2022, 10:16 PM
How strongly do we feel about the lockfile metadata header being readable along with the rest of the lockfile? It’s currently not readable in Python (we pass the lockfile contents directly to pip in Python land, the invalidation header is a comment and thus ignored), and it would be easiest if we did it that way for JVM, but there’s no reason why that has to be the case per se
w

witty-crayon-22786

01/11/2022, 10:17 PM
ideally there should be a readable comment containing the re-run instruction, but i don’t care if any of the rest of the metadata is readable.
a

ancient-vegetable-10556

01/11/2022, 10:19 PM
Cool.
w

witty-crayon-22786

01/11/2022, 10:19 PM
honestly, even the human readable “run this to regenerate” comment isn’t required though. the point of the metadata is to be able to automatically tell users when/what to run
(imo)
a

ancient-vegetable-10556

01/11/2022, 10:19 PM
Yeah, the “run to regenerate” comment isn’t as important as “don’t touch this” tbqh
w

witty-crayon-22786

01/11/2022, 10:20 PM
mm, true. that’s nice to have.
a

ancient-vegetable-10556

01/11/2022, 10:21 PM
OK. To wrap up this whole thread, in the interests of maintaining the smallest possible scope of work, I’m going to maintain the current lockfile format, but prepend a header block that will be stripped out at read time, we can swap over to TOML in future work if (e.g.) Tom has a real need for it
w

witty-crayon-22786

01/11/2022, 10:41 PM
that would mean that it’s not actually JSON…?
a

ancient-vegetable-10556

01/11/2022, 10:41 PM
Right; does that matter?
w

witty-crayon-22786

01/11/2022, 10:42 PM
it doesn’t matter, but since you’re already changing what data is stored based on whether something is readable via a particular format or not, changing to another format at the same time seems trivial
so, imo: make the TOML change at the same time.
it seemingly just amounts to:
Copy code
try:
    content = toml.loads(..)
except:
    # Is either old, or corrupted: attempt to load as JSON, and assume v1
    content = json.loads(..)
    content['version'] = 'v1'
a

ancient-vegetable-10556

01/11/2022, 10:46 PM
fine