Given a digest object (`pants.engine.fs.Digest`) r...
# general
p
Given a digest object (
pants.engine.fs.Digest
) representing a single (text) file, how do I read the contents of the file into memory (string) ? or get a file-like object ?
h
await Get(FileContent, Digest, my_digest)
Or possibly
await Get(FilesContent, ..)
(plural), then get the first element of
FilesContent
đź‘Ť 1
w
a thing to note here is that we never expose a Digest that represents exactly one file… they exist internally, but when python sees them, they are always for a “directory/Snapshot”
that
Snapshot
might contain exactly one file, but it might not.
so the plural form is likely always necessary.
cc @happy-kitchen-89482 since you mentioned docs on this
p
this is confusing.
Copy code
my_digest = config_snapshot.digest
        config_content = await Get(FileContent, my_digest)
doesn't work.
config_content = await Get(FileContent, Digest, my_digest)
doesn't work either.
there are no examples in the code of using FileContent w/ Get...
h
try
FilesContent
, plural
p
ok.
config_content = await Get(FilesContent, Digest, my_digest)
đź‘Ť 1
this is not an intuitive API and having FilesContent & FileContent is very confusing.
h
Indeed it is confusing having both name. Open to suggestions on renaming it
w
yea. did you see my comment above? Digest is always for a collection of files, never for a single file.
there might be naming tweaks… Digest used to be DirectoryDigest, but maybe it should be SnapshotDigest
h
I think Asher’s feedback is more that
FileContent
and
FilesContent
aren’t good names. Too subtle the difference
p
I think the solution is to have an idiom/shortcut to do those common operations.
the same way django has shortcuts like get_or_404 or get_or_none, etc.. for common ORM operations.
w
yea. maybe FilesContent should just be an orderedmap of path->content?
@polite-garden-50641: maybe, but that method would be falliable… it could fail if the snapshot contained more than one file
and we have avoided exceptions
đź‘Ť 1
(or if it contained zero files)
h
At a minimum, I think it should be
FileContent
and
FileContentCollection
more typing is worth the clarity imo
w
@hundreds-father-404: thoughts on the orderedmap? maybe the issue is just that we don’t actually need both types…
…ah, dang.
is_executable
bool…
đź‘Ť 1
…but it’s unused…? EDIT: is probably used for
InputFilesContent
.
p
a good example is django. if you do get_or_none() and there is more than one object, you will get an exception. get_or_none() is basically a way to say in the code there shouldn't be more than one object that matches the given query.
you should intent in the code, just like dict[key] vs. dict.get(key)
dict[key] means that you expect the key to be there, and if it is it is an error, while using dict.get is saying that it is ok if the key is not in the dict.
w
yea. maybe reasonable here.
h
@polite-garden-50641 any ideas on a rename for
InputFilesContent
? *
CreateDigest
*
CreateFileContentCollection
*
CreateFileContent
It would look like:
Copy code
await Get(Digest, CreateDigest([FileContent(..), FileContent(..)])
I’m renaming
FilesContent
to
FileContentCollection
p
the thing is, this code is very verbose... I think adding shortcut methods (as class methods) will do the trick...
h
That’s not how the engine works, though, unfortunately. You need to do an
await Get(P, S)
. I know that’s clunky, but that’s the constraint of the system
So, barring some major redesigns of Pants, the best we can do now is improve the names to make things less confusing
p
i.e. instead of
InputFilesContent([FileContent(".coveragerc", config_content.encode())])
I should be able to just do
InputFilesContent.create_file(".coveragerc",  config_content)
h
We can do helper methods on
InputFilesContent
, that’s fine. We still need to do
await Get(Digest, InputFilesContent, my_input_files_content)
at the end of the day, though. I’m wondering what is a better name than
InputFilesContent
p
also maybe another class method that takes file name & content tuples.
What I don't fully understand is why we have to specify
InputFilesContent
? can't we infer that from
my_input_files_content
?
h
We parse the AST to determine the rule graph. You can either do the explicit form of:
Copy code
await Get(Digest, InputFilesContent, my_input_files_content)
or the abbreviated inline constructor-style:
Copy code
await Get(Digest, InputFilesContent([FileContent(..)])
John has been exploring alternatives to work around that. But for now, we parse the AST to know that this is going from
InputFilesContent -> Digest
. Just like we parse the type hints of your
@rule
function signature
p
got it. makes sense.
đź‘Ť 1