# general
b
Hey, I'm just curious about a higher-level explanation of the differences between Bazel and Pants. From a general overview, I see that they both have the same aim, and the BUILD files look remarkably similar.
• One issue I constantly come across is how to control the toolchain in Bazel. Many developers get around this by hosting everything in a Docker sandbox and allowing the use of host tools, or by forcing developers to execute remotely in carefully controlled containers. Others create verbose "toolchain" config files, such as this one for C++: https://github.com/codificasolutions/bazel-cross-compile-example/blob/master/toolchain.bzl and try to maintain a custom set of rules. Others use a macro known as rbe_autoconfig, which tries to guess what the toolchain would be in your Docker container. I'm unsure if any of these solutions are optimal at the moment, so I wanted to know if you did things differently?
• Why a daemon? I notice Bazel does the same (or at least similar), and you mention in your documentation that it is to keep "the action graph warm". Does that mean you're just doing this to keep the action graph in memory? Do you watch for any files that change and update the action graph in the background?
• Does anyone know how the generation of an action graph differs? I assume at a basic level you both try to achieve the same thing: you create a DAG and look at what needs rebuilding in the tree when an Action changes. After that, is the difference just what a Bazel rule and a Pants rule actually define as an Action?
• How does one create a build rule for Pants? A very brief look at your documentation doesn't allude to any easy way to create a plugin, though that could just be me not being thorough enough.
h
Hello! Welcome! For the first questions, I pinged some folks who can answer those questions better than I can. For the last question, that’s solely because we haven’t had the time to finish out the plugin docs yet. We started work on them the past 3 days, so they are still very much a WIP and incomplete. It’s my current project, and I am hoping to finish in the next two weeks. It’ll include an example repository demoing how you could, for example, add a plugin for Bash support. In the meantime, did you attend the Python meetup last night? Plugins are created the same way as demoed last night - using Python 3 coroutines (async/await) with type hints. For example,
@rule
def demo(a: A, b: B) -> C:
  ...
b
I just joined the Slack group today, so I did not. And if the meetup was evening in the Californian timezone, I most certainly will have been asleep. That's interesting, return types are definitely something missing from Starlark.
Before I ask my next question, I'm not saying Python is a bad choice. I am curious as to why you chose Python and not a language such as Go which has both types and coroutines baked into the language?
h
Ah, got it. Let me make a copy of the slides from last night - the talk was about how we use Python 3 features like async/await, type hints, and dataclasses to achieve caching, concurrency, and remote execution
How Pants Leverages Python 3 Features.pdf
👍 1
b
Thank you so much.
❤️ 1
h
not a language such as Go which has both types and coroutines baked into the language?
One of the main reasons is the popularity of Python and its accessibility for new developers who haven’t used it before. We wrote the engine in Rust for performance, and the rules API / plugins in Python for accessibility. Another reason is Python’s strong CFFI story: a couple of Rust projects, like the cpython crate, let Rust and Python talk to each other. Also, Python 3 now has types and coroutines baked in as well. (Python 2 kind of did, but not in a robust way.)
Oh, didn’t realize we had a slide with a gif in it. I’ll make a non-PDF copy so that doesn’t get botched
b
So for you, it's about expressing what you want done as easily as possible in one language, whilst leaving the harder graph calculation to a compiled language such as Rust.
👍 1
👆 1
haha yeah - I was wondering what the blurry image was ^ - ^
So for you, it’s about expressing what you want done as easily as possible in one language, whilst leaving the harder graph calculation to a compiled language such as Rust.
Precisely. Great way to describe it.
b
When you create your Python rules, how do you ensure everyone is using the same Python version, or does it not matter for your use case?
h
Good question! See https://pants.readme.io/docs/python-interpreter-compatibility for how to control which Python version is used when Pants makes subprocesses, like when it runs tests. For the Python you can use when writing a rule, you can use 3.6, 3.7, or 3.8. Pants requires Python 3.6+ to run itself, so plugins need to be written for Python 3.6+. (But your non-plugin code can still be whatever, like Python 2.)
Btw, have you been using the docs at pantsbuild.org or at pants.readme.io/docs/welcome-to-pants? We recommend the latter if you are primarily going to use Python
w
with regard to something like Bazel’s toolchains: we’ve begun to sketch options, but don’t have a complete design yet. one of our goals is for multi-platform (multi-host) speculation of requests, so an important aspect of the design is that multiple platforms with different toolchains might be in use in different subgraphs of the action graph
👍 1
and with regard to the daemon: a daemon is pretty critical when you are doing deep fingerprinting of files rather than only timestamp based invalidation (as in make or ninja)… and digesting files is necessary for caching and remote execution
finally, the action graph differs quite a bit in that we expose dynamic dependencies to the users (ie, we’re a “monadic” system from a Build Systems à la Carte perspective) via @rules being coroutines that can await more deps. Bazel has a framework internally called skyframe that is similar, but it is not exposed to users (for historical reasons, they say)
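A minimal sketch of what "rules being coroutines that can await more deps" could look like, using only the standard library. This is not the Pants API: the `rule` decorator, the `get` helper, the `run` entry point, and the `SOURCES` table are all hypothetical names made up for illustration. The point is that `transitive_deps` only learns which further rules it needs after it has read a file's contents, so the graph is discovered while it runs.

```python
import asyncio

# Hypothetical toy engine, NOT the Pants API.
RULES = {}   # rule name -> coroutine function
_memo = {}   # (rule name, argument) -> asyncio.Task
calls = []   # records which sources were actually read

def rule(fn):
    """Register a coroutine function as a rule under its own name."""
    RULES[fn.__name__] = fn
    return fn

async def get(name, arg):
    """Request another rule's result; each (rule, argument) pair runs at most once."""
    key = (name, arg)
    if key not in _memo:
        _memo[key] = asyncio.ensure_future(RULES[name](arg))
    return await _memo[key]

# Made-up in-memory "files" so the example is self-contained.
SOURCES = {
    "app.py": "import lib\nimport util\n",
    "lib.py": "import util\n",
    "util.py": "",
}

@rule
async def read_source(path):
    calls.append(path)
    return SOURCES[path]

@rule
async def transitive_deps(path):
    # The dependencies are not declared up front: they are discovered
    # from the file's contents and then awaited.
    text = await get("read_source", path)
    direct = [line.split()[1] + ".py"
              for line in text.splitlines() if line.startswith("import ")]
    nested = await asyncio.gather(*(get("transitive_deps", d) for d in direct))
    result = set(direct)
    for deps in nested:
        result |= deps
    return frozenset(result)

def run(name, arg):
    """Evaluate one request with a fresh memo table (the toy has no cycle detection)."""
    _memo.clear()
    return asyncio.run(get(name, arg))
```

Calling `run("transitive_deps", "app.py")` walks `app.py`, `lib.py`, and `util.py`, and the memo table means `util.py` is read once even though two rules request it.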
👍 1
thanks a lot for the questions!
b
I understand the concept of digesting a file vs just taking the timestamp which is critical to getting a build repeatable. What I don't understand still is why that has to be a daemon.
w
@big-baker-75091: like Bazel (because we use the same remote execution API), we use SHA256
b
Is it to stop two processes creating the same hash at the same time?
w
SHA256 is a few orders of magnitude slower than checking a file timestamp… so you want to keep track of which files have not changed since you last digested them
b
ah there we go, so it's more about checking to see if a file has changed
w
yea. that’s important for doing cache lookups and remote execution.
in a timestamp based system, if you toggle back and forth between two versions of a file (or switch back and forth between branches), things will re-run unnecessarily.
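To illustrate the toggling point, here is a small sketch of a cache keyed by content digest. It is an assumption-laden toy, not how Pants or Bazel store results: `ContentKeyedCache` and its `run` method are hypothetical names, and the "action" is just a function applied to bytes.

```python
import hashlib

class ContentKeyedCache:
    """Toy result cache keyed by the SHA-256 of the input (hypothetical, for illustration)."""

    def __init__(self):
        self._results = {}  # hex digest -> cached result
        self.runs = 0       # how many times the action really executed

    def run(self, content: bytes, action):
        key = hashlib.sha256(content).hexdigest()
        if key not in self._results:
            # Only content that has never been seen triggers real work.
            self.runs += 1
            self._results[key] = action(content)
        return self._results[key]
```

Feeding this cache the sequence `b"v1"`, `b"v2"`, `b"v1"`, `b"v2"` runs the action twice, because the third and fourth inputs hash to keys that are already present. A scheme keyed on modification time would see four changes.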
b
So without a daemon, every time I wanted to run a build, it would have to hash everything all over again. Whereas with a daemon, it hashes everything once, and when you rebuild, it just checks which files have changed and need rehashing in order to recalculate the tree.
w
correct.
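The exchange above can be sketched as a long-lived object that remembers digests between builds. This is a simplified assumption, not the Pants implementation: `DigestCache` is a hypothetical name, and it decides whether a file is unchanged by comparing `os.stat` metadata, whereas a real daemon might rely on file watching instead.

```python
import hashlib
import os

class DigestCache:
    """Remembers SHA-256 digests so unchanged files are not re-read (illustrative only)."""

    def __init__(self):
        self._seen = {}           # path -> ((mtime_ns, size), hex digest)
        self.hashes_computed = 0  # counts the expensive operations

    def digest(self, path):
        st = os.stat(path)  # cheap metadata check
        stamp = (st.st_mtime_ns, st.st_size)
        cached = self._seen.get(path)
        if cached is not None and cached[0] == stamp:
            return cached[1]  # file looks unchanged: reuse the digest
        with open(path, "rb") as f:  # expensive: read and hash the contents
            value = hashlib.sha256(f.read()).hexdigest()
        self.hashes_computed += 1
        self._seen[path] = (stamp, value)
        return value
```

The cache only pays off if the object outlives a single build, which is the role the daemon plays: a fresh process would start with an empty `_seen` table and hash everything again.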
at least, that’s the case for Pants. i’m not 100% certain that bazel doesn’t use a combination of digesting and file timestamps*.
but the other reason that both Pants and Bazel have a daemon is that they have components written in languages with non-trivial startup time
Bazel is java, with a C client (iirc)
b
A C client?
w
Pants is about 80% python and 20% rust: our client is currently python, but will be ported to rust in the next few weeks
👍 1
@big-baker-75091: yes: you connect to the daemon with a client
b
I was more surprised that they have some C in there ^ - ^
w
yea. they’ve also needed to port various filesystem interaction code to C, because Java exposes limited API there.
i would say that rust is a definite advantage there… we can realistically be a very fast two language system (python and rust) rather than a three language system (starlark, java, c)
b
I totally agree with you there.
Have you ever run comparative speed tests?
w
nothing particularly rigorous, unfortunately. it’s quite challenging to compare apples to apples there.
pants is pushing on python as our flagship language for v2, with other languages to follow. and Bazel’s support for python is fairly scattered. it’s possible that pants’ second v2 language will be something that we can compare more directly.
👍 4