
dry-analyst-73584

01/27/2023, 8:30 PM
I'm working on a goal to run security auditing tools. Since the result always depends on the state of the outside world (the currently reported vulnerabilities), I need to make sure we never cache it. Is there a preferred way to mark a goal or result as uncacheable?

wide-midnight-78598

01/27/2023, 8:33 PM
Could that be handled at the Process scope? Don't cache the results? https://www.pantsbuild.org/docs/rules-api-process#process

fast-nail-55400

01/27/2023, 8:33 PM
How and where is the database of security vulnerabilities accessed?
And is that database under the exclusive control of the auditing tool, or are there control knobs that Pants (or another invoker) can use?

dry-analyst-73584

01/27/2023, 8:36 PM
I'm planning to use it first for running the pyaudit tool

fast-nail-55400

01/27/2023, 8:37 PM
For example, if the tool can report the version of its security database, use an uncached Process to get that version number, then store it as a "dummy" env var on the Process that actually runs the audit.
That gives you caching per version of the audit rules, since a different version number changes the env var and therefore produces a different Process.
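The idea can be sketched in plain Python, without the Pants API: a toy cache keyed on a fingerprint of the command line and environment, so that changing a dummy env var forces a re-run. The names `fingerprint`, `run_cached`, and `AUDIT_DB_VERSION` are made up for illustration, and the hashing is only a stand-in for however Pants actually fingerprints a Process.

```python
import hashlib
import json

# Toy stand-in for a process cache: fingerprint -> cached result.
_cache: dict[str, str] = {}
# Counts how many times the "tool" actually ran (cache misses).
runs = 0


def fingerprint(argv: tuple[str, ...], env: dict[str, str]) -> str:
    """Hash the command line and environment into a cache key."""
    payload = json.dumps({"argv": list(argv), "env": env}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def run_cached(argv: tuple[str, ...], env: dict[str, str]) -> str:
    """Return a cached result if the same argv/env was seen before."""
    global runs
    key = fingerprint(argv, env)
    if key not in _cache:
        runs += 1  # pretend to execute the audit tool here
        _cache[key] = f"audit result #{runs}"
    return _cache[key]


argv = ("pip-audit", "-r", "requirements.txt")
# AUDIT_DB_VERSION is a hypothetical dummy variable; the tool never reads it.
first = run_cached(argv, {"AUDIT_DB_VERSION": "2023-01-27"})
again = run_cached(argv, {"AUDIT_DB_VERSION": "2023-01-27"})  # cache hit
fresh = run_cached(argv, {"AUDIT_DB_VERSION": "2023-01-28"})  # new version, re-run
```

Only the env var differs between the second and third calls, which is enough to produce a new cache key and a fresh run.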

dry-analyst-73584

01/27/2023, 8:38 PM
I'm looking now to see where it's fetching data from and also whether I can get any further info from it.

fast-nail-55400

01/27/2023, 8:40 PM
And if you are willing to cache the "version lookup" per session, then you will get a further performance win.
👍 1

dry-analyst-73584

01/27/2023, 8:41 PM
Also: I mean pip-audit, not py-audit

curved-television-6568

01/27/2023, 8:42 PM
Usually there’s a process invocation involved, and it’s easy enough to control the caching for those (as @wide-midnight-78598 was poking at as well). See https://github.com/pantsbuild/pants/blob/d87f9b6810209b87238635101000cb7db512d835/src/python/pants/engine/process.py#L32-L45 for those options, if that turns out to be a fit.
👍 1

dry-analyst-73584

01/27/2023, 8:54 PM
Unfortunately the tool doesn't give me a handle on the current version of the db. Since the data is being fetched from https://github.com/pypa/advisory-database/ I could theoretically make an additional call and check the etag of https://github.com/pypa/advisory-database/tree/main/vulns to determine whether I need to run again or not.
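A rough sketch of that ETag check, using only the standard library. Whether GitHub returns a stable ETag for that page is an assumption I have not verified, and `fetch_etag` and `needs_rerun` are hypothetical helper names, not part of pip-audit or Pants.

```python
import urllib.request
from typing import Optional

VULNS_URL = "https://github.com/pypa/advisory-database/tree/main/vulns"


def fetch_etag(url: str = VULNS_URL) -> Optional[str]:
    """Send a HEAD request and return the ETag header, if the server sent one."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.headers.get("ETag")


def needs_rerun(previous_etag: Optional[str], current_etag: Optional[str]) -> bool:
    """Re-run the audit unless both ETags are known and identical."""
    if previous_etag is None or current_etag is None:
        return True  # no reliable signal, so run the audit to be safe
    return previous_etag != current_etag
```

The value returned by `fetch_etag()` could also serve as the "version" that invalidates a cached audit run, with a missing header treated as "always re-run".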
I might file an issue with pip-audit to see if they can give me a better solution though.
Thanks for the help!

bitter-ability-32190

01/28/2023, 12:27 PM
In the meantime, having it work uncached is better than not having anything, I think 😛

happy-kitchen-89482

01/29/2023, 3:13 AM
There is precedent for this: a goal_rule is always uncacheable, you can mark any other rule as `@_uncacheable_rule`, and you can mark processes as uncacheable (I forget the details).
👍 1

curved-television-6568

01/29/2023, 4:07 AM
There’s also the possibility of having a rule’s return type inherit from `EngineAwareReturnType`, and from there you can flag it as uncacheable. https://github.com/pantsbuild/pants/blob/5580f808ceea83b15bbc85498f0a55b78362772a/src/python/pants/engine/engine_aware.py#L66-L72
👍 1