# plugins
c
Hi - I have a question relating to a `@goal_rule` that I'm implementing. The goal needs to (through one way or another) invoke the `git` command to create a git tag. Is that possible, with Pants' hermetic workspace? I've read somewhere in the docs that you can get a read-only subset of the current git info, but what about calling git commands that will actually modify the state?
e
w
that would be one way… but particularly for actually mutating things, i’d recommend doing it in an `InteractiveProcess` directly in your goal, at the top of the stack
c
Thanks for your replies. I was looking at `InteractiveProcess`, thinking that might be the recommended path, but I was wondering if it's possible to use a 3rd-party lib such as `GitPython`, as it provides some useful abstractions - or is that against the rules of @rules, since it modifies things in a way that Pants can't monitor?
w
It does violate the sandbox, yea. To do it safely, you'd have to mark the @rule uncacheable, or ensure that you were only doing it directly from your @goal_rule function.
c
Ok, so the @goal_rule is implicitly marked as uncacheable?
w
Correct
c
Are there any limitations as to what you can do in the @goal_rule? Or can you pretty much do anything (i.e. no idempotency restrictions, side effects, network calls, etc.)?
w
well. it’s probably closer to call it “unspecified behavior”. for example, https://github.com/pantsbuild/pants/issues/10542 hopes to eventually run goals concurrently, until they reach a critical section marked by builtins like `InteractiveProcess`, `Workspace.write_digest`, `Console.print`, etc.
if you’re creating side effects outside of those APIs, you will likely need to fix your code eventually.
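(For illustration, a rough, untested sketch of what confining side effects to those builtins looks like in a goal rule - the goal name, target type, and tag value here are all invented, and this follows the Pants 2.x plugin API as I understand it:)

```python
# Hypothetical, untested sketch (Pants 2.x plugin API; names invented).
# All side effects go through the sanctioned builtins, directly in the
# @goal_rule rather than in a cached helper @rule.
@goal_rule
async def tag_release(console: Console) -> TagRelease:
    result = await Effect(
        InteractiveProcessResult,
        InteractiveProcess(argv=["git", "tag", "v1.0.0"], run_in_workspace=True),
    )
    console.print_stdout("created tag v1.0.0")  # sanctioned side effect
    return TagRelease(exit_code=result.exit_code)
```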
c
Ah, so even if I can use a lib such as `GitPython`, it's probably going to break in a future release?
w
probably for #10542 there will be an escape hatch to say “i’m about to do something with side effects”. but it will be an update to your code, whereas the existing APIs won’t need that
c
Ok, makes sense
e
I think it's generally best to pretend rule Python is not Python - just for manipulation of data structures, loops, and conditionals.
c
Sorry, sent too soon.. rewriting
Going down the InteractiveProcess route, just so I understand, would I be able to do something like this? Or have I misunderstood?
```python
class MyGitHelper:
    def __init__(self, repo_path):
        self.repo_path = repo_path

    async def some_complex_multistep_op(self):
        ...

        res1 = await Effect(
            InteractiveProcessResult,
            InteractiveProcess(argv=["git", "something"]),
        )

        ...

        res2 = await Effect(
            InteractiveProcessResult,
            InteractiveProcess(argv=["git", "something-else"]),
        )

        ...

        return some_retval


@goal_rule
async def my_goal(targets: Targets) -> MyGoal:
    helper = MyGitHelper(os.getcwd())

    await helper.some_complex_multistep_op()
```
w
yea, that should work.
after the first `InteractiveProcess` has started, your code will be in a critical section, and not interruptible/restartable
(…which is what you want)
c
Ok, great. I'll play around with that approach - I was hoping to avoid having to call `git` directly, being able to use existing libs etc., but this is definitely manageable as we only need a couple of specific ops.
Thank you both!
Sorry, one last question - what if I want to get the `stdout` of `InteractiveProcess`? Say I want to list the git tags via `git tag` - it seems that if I do it via `Process` it will fail since the `.git` directory is not available, but if I do it via `InteractiveProcess` I can't retrieve the `stdout`
w
…oh, darn. yea, that’s an issue.
you’d have to pass in the git dir to `Process`, and then additionally mark the process uncacheable with `ProcessCacheScope`.
…ok, sorry. might be a good idea to look at what John recommended, or using a library directly from your `@goal_rule`
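(A rough, untested sketch of that `Process`-plus-`ProcessCacheScope` idea, with the path and description invented - `PER_SESSION` here means the process re-runs each session rather than being served from cache:)

```python
# Hypothetical, untested sketch: an uncacheable Process that reads the real
# .git directory via an absolute path outside the sandbox.
result = await Get(
    ProcessResult,
    Process(
        argv=["git", "tag", "--list"],
        env={"GIT_DIR": "/abs/path/to/repo/.git"},
        cache_scope=ProcessCacheScope.PER_SESSION,
        description="list git tags",
    ),
)
tags = result.stdout.decode().splitlines()
```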
c
If I were to ensure that an `InteractiveProcess` or a `Console.print` statement were executed before I invoke any side effects, that would "future-proof" the goal against the upcoming changes? A bit of a hack, but reasonable
w
yea
c
So I could
```python
@goal_rule
def my_goal(console: Console) -> MyGoal:
    console.print('something')
    some_op_with_side_effects_from_a_3rd_party_lib()
```
w
yea.
referring to that ticket.
e
FWIW you can use a Process just fine mod caching concerns: just use GIT_* env vars to point to where .git is.
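(To make that concrete, a small self-contained demo - plain `subprocess` rather than Pants' `Process`, and it assumes `git` is on the PATH: with `GIT_DIR`/`GIT_WORK_TREE` set, git operates on a repo from a completely unrelated working directory, which is exactly the sandbox situation.)

```python
import os
import subprocess
import tempfile

# Create a throwaway repo with one tag.
repo = tempfile.mkdtemp()
env = {
    **os.environ,
    "GIT_DIR": os.path.join(repo, ".git"),
    "GIT_WORK_TREE": repo,
}
subprocess.run(["git", "init", "-q", repo], check=True)
subprocess.run(
    ["git", "-c", "user.email=you@example.com", "-c", "user.name=you",
     "commit", "--allow-empty", "-q", "-m", "initial"],
    env=env, check=True,
)
subprocess.run(["git", "tag", "v1.0.0"], env=env, check=True)

# From a different cwd (think: a Pants sandbox), git still finds the repo,
# because GIT_DIR points at its database.
elsewhere = tempfile.mkdtemp()
tags = subprocess.run(
    ["git", "tag", "--list"],
    env=env, cwd=elsewhere, capture_output=True, text=True, check=True,
).stdout.split()
print(tags)
```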
c
So long as I first copy the git dir into the workspace?
e
No
It's not widely known, but you can run git in the wrong dir.
You just need to tell it where the db lives.
c
Ah, sorry, I thought Process couldn't read anything outside of the workspace
e
The code I initially pointed to does that.
Pants sandboxing is fake. We don't jail the fs.
We just place you in a hard to reason about tmpdir
You can still see the whole filesystem.
c
Ok, right, so with an absolute path to the git dir I can work around it
e
As Stu has discussed with you though, and to re-iterate, caching is the key thing to get right in all this. You're in delicate waters.
c
Any other things to bear in mind other than setting the ProcessCacheScope?
w
Re: the sandboxing being "fake" though: that's true until you try and use a `docker_environment`, at which point you would be trapped by `Process`.
e
Well ok. But that's not this.
w
Imo, do the GitPython thing, marked clearly.
We need to clean up the APIs that John linked to, so they wouldn't be my first choice.
e
What favors adding a GitPython dep over using Process, out of curiosity?
c
It's not a done-deal yet - on one side I could just use GitPython to avoid re-implementing some of the useful abstractions they provide, but then be a bit "out of bounds" from a Pants rule perspective, or use the more low-level approach of using bare git commands via Process. I'm going to run through the exact list of ops we need to perform on the git repo to quantify the effort of the Process approach, or whether we just use the lib
e
What do you mean by useful abstractions in this context? `git tag x` seems to need little abstraction!
c
Of course 🙂 That was just an example
e
So you have other uses?
c
Yes - we're currently porting our polyrepos to a monorepo, using pants as the main tooling for the build and release process. That includes version bumping based on git commits, with different artifacts within the repo being independently versioned. It's a bit semantic-release meets lerna, but for python. It ultimately involves quite a bit of git wrangling when you try to figure out exactly which asset needs to be version bumped due to transitive dependencies.
e
Pants determines all that for you, git aside
It hashes all inputs transitively for artifacts, etc.
c
Right, that's one of the attractive things about Pants
But we still need to inspect the git history to determine what type of version bump
But only the subset of commits that are relevant to assets in the transitive dependencies
e
So you use keywords or something in commits that hint at semver?
c
Exactly, like conventional commits
As I said, we may be able to reduce the number of git commands thanks to pants' dep-tracking, in which case the Process approach might be better
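(As an aside, the subject-line-to-bump mapping itself can be quite small - an illustrative sketch, with all names invented, handling only the `fix`/`feat`/breaking-change rules of the convention:)

```python
import re

# Illustrative only: map conventional-commit subject lines to a semver bump.
LEVELS = {None: 0, "patch": 1, "minor": 2, "major": 3}
TYPE_BUMP = {"fix": "patch", "feat": "minor"}

def bump_for(subjects):
    """Return the highest bump implied by a list of commit subjects."""
    level = None
    for subject in subjects:
        m = re.match(r"(\w+)(\([^)]*\))?(!)?:", subject)
        if not m:
            continue  # not a conventional commit; ignore it
        if m.group(3):  # a "!" before the colon marks a breaking change
            bump = "major"
        else:
            bump = TYPE_BUMP.get(m.group(1))
        if LEVELS[bump] > LEVELS[level]:
            level = bump
    return level

print(bump_for(["fix: bug 123", "feat: richer reports"]))  # minor
```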
e
I'm not sure what that means, but I think I get it a bit more now. You need `git tag x` and, roughly, `git log -- $(pants filedeps)`.
I'll still be shocked if 180K of re-implementation of git saves you much over a few git command lines.
c
Sorry, conventional commits is a commit message convention, like `fix: bug 123` or `feat: richer reports`
You're probably right
e
Ok. Gotcha. I had not heard of conventional commits TM.
c
It's more of a javascript thing - it came out of the Angular project. Not sure if it has much traction in the python community.
e
I'm a luddite either way and generally not up to date on technology.
Well, what you're doing is a bit of a dream of mine. It would be super cool if it was done without commit messages but with source analysis for API breaks, etc.
Guava in Java land almost has this automated.
c
Haha, that would be pretty impressive - maybe for a V2
e
It's a big hole in the software universe. Humans get semver wrong a lot.
c
Although if you have good enough static typing, you might be able to do something close..
e
Absolutely.
c
You'd need some kind of lightweight contract testing, with snapshots of the type sigs of the public APIs
Not completely far-fetched
e
Yeah, the early days of Toolchain involved indexing the symbols of all the libraries in PyPI and maven central to form a database of this sort of information for use on the consume side. The idea being you don't actually care about the version you consume, you care about the subset of the API you consume.
c
That makes sense, but seems ambitious!
Anyway, thanks again for your help! I definitely understand pants a little bit more.