is `EngineAware.artifacts` supposed to return a di...
# development
h
is
EngineAware.artifacts
supposed to return a dict of `Digest`s or
Union[Snapshot, Digest]
, or some other type?
I remember a while ago there was a problem with using `Digest`s corresponding to stdout/stder, since these weren't really named file output
and IIRC we changed how the python
Digest
type worked to support that
h
Since then, we've added
FileDigest
vs.
Digest
(directory digest)
h
I think there might still be an implicit type discrepency with
EngineAwareReturnType.artifacts
(context for this is, I'm trying to thread the process stdout/stderr types through Python so we can put them on the
Enriched*
workunit)
👍 1
right now we handle putting the stdout/stderr output on the workunit for the process execution (which doesn't get created if that process execution is cached, hence the problem)
we do that by explicitly defining :
Copy code
pub stdout: Option<hashing::Digest>,
 pub stderr: Option<hashing::Digest>,
as two fields on
WorkunitMetadata
I'm still reminding myself of how we thread this data around to get it to come out as text when we process workunits in python
h
right now we handle putting the stdout/stderr output on the workunit for the process execution (which doesn't get created if that process execution is cached, hence the problem)
Ah ha. So we need to stop caring about the
Process
workunit, and care purely about the
enriched_foo
workunit
To confirm, you're going to expose the std{out,err} through a different mechanism than the
message()
that gets printed to the console, right? For example, via
artifacts
Asking because we add some of our own message to what gets printed, e.g. a check or minus sign
h
right now I'm just trying to add new `Snapshot`s on the enriched workunit that are exactly the same as the ones we're storing on the inner process workunit that doesn't always get created
and I'm trying to make sure I understand what the right types to be working with are
h
Snapshot
always refers to the file system. Are you trying to add something to represent the stdout/stderr? If so, I think you'd want a
FileDigest
, which is referring to a
str
h
so we have some code in
workunit_to_py_value
that does:
Copy code
if let Some(stdout_digest) = &workunit.metadata.stdout.as_ref() {
    artifact_entries.push((
      externs::store_utf8("stdout_digest"),
      crate::nodes::Snapshot::store_file_digest(core, stdout_digest),
    ));
  }

  if let Some(stderr_digest) = &workunit.metadata.stderr.as_ref() {
    artifact_entries.push((
      externs::store_utf8("stderr_digest"),
      crate::nodes::Snapshot::store_file_digest(core, stderr_digest),
    ));
  }
so this should be creating a Python
Value
on the workunit that has Python type
Snapshot
or maybe this is actually making a Python
Digest
, I'm not sure
h
No, it's making a
FileDigest
object. The
Snapshot
is confusing - that is purely for namespacing of the method
Copy code
pub fn store_file_digest(core: &Arc<Core>, item: &hashing::Digest) -> Value {
    externs::unsafe_call(
      core.types.file_digest,
      &[
        externs::store_utf8(&item.0.to_hex()),
        externs::store_i64(item.1 as i64),
      ],
    )
  }
Note the type ID
core.types.file_digest
h
ah okay so that's a Python
FileDigest
👍 1
so I think this implies that the right type schema for artifacts is
Union[Snapshot, FileDigest]
but so far we've only been putting `FileDigest`s on in rust code (which can't get typechecked at the python level)
and only putting `Snapshot`s on in the python level
yeah we only use
Snapshot::store_file_digest
for handling stdout/stderr in this exact case
so basically I think we should change the type schema of
EngineAwareReturnType.artifacts
from
def artifacts(self) -> Optional[Dict[str, Snapshot]]:
to
def artifacts(self) -> Optional[Dict[str, Union[Snapshot, FileDigest]]]:
because this reflects the types that clients are already expecting
h
so I think this implies that the right type schema for artifacts is Union[Snapshot, FileDigest]
Yep. Although, I'm not sure we should still use
Snapshot
. It could be
Digest
. At the time, there was no
FileDigest
so it was ambiguous what was meant by
Digest
, hence us wanting to use the unambiguous
Snapshot
to refer to the file system
So maybe it should be
Union[Digest, FileDigest]
?
Snapshot
may be more ergonomic for consumers, I'm not sure how it gets consumed
h
hm, right now we have a bunch of code returning `Snapshot`s in those methods
it's possible we could trivially change those to
Digests
I'm not sure if that helps with anything though
h
Yeah I don't really know how consumers interact with this API. Snapshot is more eager, so likely more convenient that callers won't need to convert a digest into resolved file paths
h
most of the interaction with this api is happening in the toolchain plugin
so it turns out that changing the signature of
artifacts
or even just passing non-
Snapshot
values in that method regardless of the type annotations breaks things at the engine level in a way that I spent way too long trying to untangle