hundreds-father-404
03/13/2020, 3:37 PMThat sounds like a yes in disguise.Maybe we add a
LazyField
vs. EagerField
distinction? So long as the end API for users is the same, i.e. that they call my_tgt.get(MyField).value
without having to know if its lazy or not, then I’m okay with it.
For this specific Address
problem, we would always pass the Address
to the constructor. For EagerField
, it throws away the Address
after the constructor call. For LazyField
, it stores the Address
so that it can refer to it later when lazily hydrated.
@average-vr-56795 this circumvents your concern about using @memoized_property
when it really isn’t appropriate to. It is weird to memoize
this:
class BoolField(PrimitiveField):
default: ClassVar[bool]
raw_value: Optional[bool]
@memoized_property
def value(self) -> bool:
if self.raw_value is None:
return self.default
return self.raw_value
My only concern is that field authors have to decide when to use LazyField
vs EagerField
? Maybe that’s not that bad, that they should be thinking about the performance implications of hydrating their field? We’d default to suggesting EagerField
. And, the majority of fields wouldn’t even have to make a decision because they would subclass some helper field like BoolField
or StringField
, which already made the correct decisionhundreds-father-404
03/13/2020, 3:39 PMAsyncField
, which indicates that we need the engine for hydration. You call await Get[SourcesResult](Sources, my_tgt.get(Sources)
, rather than my_tgt.get(Sources).value
. Only about 5-10% of fields are `AsyncField`s. They’re orthogonal to this discussion because `AsyncField`s are by definition always lazy.)average-vr-56795
03/13/2020, 3:59 PMAsyncField
wherever you’d want to use LazyField
? i.e. “If you want to do something expensive enough to want memoization, do it in an @rule
?average-vr-56795
03/13/2020, 3:59 PM@rule
or a standard Python helper function” when writing rules in generalhundreds-father-404
03/13/2020, 4:01 PMcompatibility
and timeout
, even if those fields never once get used by the downstream rules? On one hand, that’s a benefit that we more eagerly fail for invalid BUILD files. On the other hand, we’re now doing a lot of unnecessary workwitty-crayon-22786
03/13/2020, 4:01 PMwitty-crayon-22786
03/13/2020, 4:02 PMyield Get(HydratedField, (Address(..), Field(..)))
, etcwitty-crayon-22786
03/13/2020, 4:02 PMwitty-crayon-22786
03/13/2020, 4:03 PMhundreds-father-404
03/13/2020, 4:04 PMre: “storing an Address on the field” , if we fixed https://github.com/pantsbuild/pants/issues/7490 , you could project them both as separate paramsThat won’t help unless we make you use the engine to hydrate every single field. For 95% of fields, you simply call
my_tgt.get(Compatibility).value
, i.e. Pure Python with no engine.
That is very intentional to avoid needing to use the engine for accessing something as simple as a boolean field zip_safe
. See https://github.com/pantsbuild/pants/pull/9284#discussion_r391817323 for more on why I think we really really want to avoid always using AsyncField
and should preserve PrimitiveField
witty-crayon-22786
03/13/2020, 4:05 PMwitty-crayon-22786
03/13/2020, 4:06 PMwitty-crayon-22786
03/13/2020, 4:06 PMhundreds-father-404
03/13/2020, 4:11 PMensure_str_list
(which iterates over the entire list) on every compatibility
we encounter when running ./pants filedeps ::
, which never needs compatibility
. Are you sure that we’re okay with that performance implication? (I know V1 behaves this way)
I think I’d advocate for LazyField
vs. EagerField
with a default to EagerField
. LazyField
is for when you have some light-weight hydration/validation that doesn’t need the engine, like calling ensure_str_list
. John is correct in diagnosing that the only truly expensive hydration, like reading from the FS, already will need to be AsyncField
.
So, roughly,
* EagerField
-> used by 70% of fields; do nothing more than maybe use a default value,
* LazyField
-> used by 20% of fields; do light weight validation/hydration
* AsyncField
-> used by 10% of fields; do heavy-weight validationaloof-angle-91616
03/13/2020, 4:13 PMaverage-vr-56795
03/13/2020, 4:16 PMensure_str_list
and say that we should expect fields to have a single correct type to avoid needing to think about when we do and don’t want to pay the costs of implicit type conversion…average-vr-56795
03/13/2020, 4:17 PMLazyField
hundreds-father-404
03/13/2020, 4:19 PMensure_str_list
in general. But, there are still other examples like ensuring that timeout > 0
. Although, very likely, that is so cheap to compute that it’s not enough to justify LazyField
average-vr-56795
03/13/2020, 4:19 PMhundreds-father-404
03/13/2020, 4:23 PMLazyField
is very likely premature. Maybe we end up finding that it is worth adding back, but for now, it’s more complexity than necessary. We should only add it when we find there’s a compelling use case
So, I will change PrimitiveField
to now eagerly compute its .value
, rather than it being a property
For the original Address
problem, this solves things. We pass the Address
to the constructor of PrimitiveField
so it can access it for any validation errors, but then the field will throw it away after. For AsyncField
, we will store Address
as a field that can be accessed upon lazy hydrationaverage-vr-56795
03/13/2020, 4:25 PMhundreds-father-404
03/13/2020, 4:25 PMwitty-crayon-22786
03/14/2020, 11:59 PM./pants filter ::
in pantsbuild pants, and a lot more internally.witty-crayon-22786
03/15/2020, 9:27 PMhundreds-breakfast-49010
03/16/2020, 10:12 PMWorkunit
code uses std::time::SystemTime
rather than std::time::Instant
?witty-crayon-22786
03/16/2020, 11:26 PMhundreds-father-404
03/16/2020, 11:32 PM...we should probably just re-enable the pex cache, and trust it to be concurrency safe.We should probably take this to the new channel #performance , but we’ve been discussing this proposal at Toolchain and think it’s likely a good change to re-enable. At a minimum, re-enable when we know it’s local execution. Pex has sane caching afaict (John put a lot of sweat into ensuring that)
hundreds-father-404
03/17/2020, 3:58 PMX -> Z
A -> X -> Z
B -> X -> Z
Specifically:
StripSnapshotRequest -> SourceRootStrippedSources
StripSourcesFieldRequest -> StripSnapshotRequest -> SourceRootStrippedSources
LegacyStripTargetRequest -> StripSnapshotRequest -> SourceRootStrippedSources
hundreds-father-404
03/17/2020, 4:06 PMhundreds-father-404
03/17/2020, 4:07 PM