is it a good idea (and is it possible?) to collect all files from dependencies and subdependencies into a digest?
I'm writing a Tailwind plugin, and Tailwind needs all the HTML files to be able to purge unneeded CSS rules.
09/20/2022, 1:44 PM
It is both fine and possible to do this. Under the covers each individual file has its contents stored as a blob, with its SHA-256 hash as its key, in a fast memory-mapped DB. So storing 1 file is just like storing 1000 files. The knitting is a digest of paths pointing to these blobs, really a digest of this data structure: https://github.com/pantsbuild/pants/blob/0bbf1fbb6078cacb1e3a75f5ea67a14d0398910c/[…]ote-apis/build/bazel/remote/execution/v2/remote_execution.proto; so that's cheap.
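To make the "blobs keyed by hash, plus a digest of paths" idea concrete, here is a toy sketch in plain Python. It is not the Pants implementation: `BlobStore` and `snapshot` are hypothetical names, an in-memory dict stands in for the memory-mapped DB, and a plain text listing stands in for the protobuf structure linked above.

```python
import hashlib


class BlobStore:
    """Toy content-addressed store: blobs keyed by the SHA-256 of their bytes."""

    def __init__(self):
        self.blobs = {}  # stand-in for the memory-mapped DB (assumption)

    def put(self, content: bytes) -> str:
        key = hashlib.sha256(content).hexdigest()
        self.blobs[key] = content  # identical content lands on the same key
        return key


def snapshot(store: BlobStore, files: dict) -> str:
    """Store every file, then store and return the digest of the path listing."""
    listing = "\n".join(
        f"{path} {store.put(data)}" for path, data in sorted(files.items())
    )
    # The "knitting": one more small blob that maps paths to content hashes.
    return store.put(listing.encode())
```

Two files with identical contents end up as a single blob, and the snapshot digest depends only on the paths and contents, not on the order the files were added in.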
The only real issue you'll run into is latency on the read side of things. If you need to materialize a digest pointing at 10K files, that won't be super snappy. You might start running into O(100ms) overheads. If that overhead is large compared to the time it takes to run your action against all those files, you're in a bit of a bind. On the Python side this hits us. If we naively try to store a whole virtualenv as a digest, that's typically O(10k) files and the overhead of materialization can be ~500ms. For running unit tests this often competes with the time to actually run the test. As a result we store digested zips instead of the loose files: basically 1 zip per dependency (wheel), since dependencies are of fixed content. That performance hack greatly complicates the Python rule set, though. If you can avoid it and the performance of storing and materializing loose file digests is good enough, definitely stick with that.
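As a rough illustration of the "1 zip per dependency" trade-off, the sketch below packs a dependency's files into a single in-memory zip, so only one blob has to be materialized instead of many loose files. This is only a standard-library sketch, not how the Pants Python rules do it; `pack_dependency` is a hypothetical helper, and the fixed timestamp is an assumption made so that the same contents always hash to the same digest.

```python
import io
import zipfile


def pack_dependency(files: dict) -> bytes:
    """Pack {path: bytes} into one zip with a stable byte layout."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for path in sorted(files):  # fixed entry order
            # Fixed timestamp so repeated packing yields identical bytes.
            info = zipfile.ZipInfo(path, date_time=(1980, 1, 1, 0, 0, 0))
            zf.writestr(info, files[path])
    return buf.getvalue()
```

The cost is the one mentioned above: whatever consumes the files now has to know how to read them out of the zip, which is where the extra rule complexity comes from.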
09/21/2022, 11:15 AM
thank you for your detailed answer 🙂
so if I ever run into performance issues, it would be possible to either concatenate the HTML files together (Tailwind is just reading the class names) or extract the class names into one big string per target.
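A minimal sketch of the second option, using only the standard library. It assumes the class names Tailwind needs appear in static `class="..."` attributes of the HTML; `ClassCollector` and `classnames_for_target` are hypothetical names, not part of Pants or Tailwind.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collects every token found in class attributes of start tags."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def classnames_for_target(html_sources) -> str:
    """Return one space-separated, de-duplicated string of class names."""
    found = set()
    for source in html_sources:
        collector = ClassCollector()  # fresh parser per file
        collector.feed(source)
        collector.close()
        found |= collector.classes
    return " ".join(sorted(found))
```

The resulting string could then be stored as a single small file per target, which keeps the number of files to materialize low. Class names built dynamically (for example in scripts) would not be picked up by this approach.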
today's my third day with Pants. atm it's more crawling than running towards the solution 🙂
in the meantime i tried it out and it worked as i hoped it would