does anyone have an example of a larger python repo using pants? Pants #general

# general

fast-photographer-12719

11/13/2023, 2:08 PM

does anyone have an example of a larger python repo using pants?

better-van-82973

11/13/2023, 2:11 PM

What constitutes “larger” to you / what kinds of examples are you looking for?

fast-photographer-12719

11/13/2023, 2:13 PM

the docs have two .py files. So maybe 20+ would be sufficient. I'm trying to understand how others have structured their tests. In all the examples i've seen, tests live in the same folder as the code they are testing, but our repo is set up with tests in a separate folder at the top level. This means we put the conftest.py in that separate folder. If i move the tests into the same folders as the code, then i'm less sure where to put the conftest.py so it's picked up by pytest.

fast-photographer-12719

11/13/2023, 2:14 PM

also, more generally, to see how they might solve the issues i've been having (like my earlier question about how to use dependencies from above the current folder)

better-van-82973

11/13/2023, 2:15 PM

Ah, gotcha. In our repo we structure it so that the tests are located in a separate folder called `tests` underneath the package dir. We put the `conftest.py` in the `tests` directory as well

better-van-82973

11/13/2023, 2:16 PM

I think this is the same as your setup now - if you put the tests in the same directory as the source files then presumably you’d want to keep `conftest.py` in that same directory as well

fast-photographer-12719

11/13/2023, 2:21 PM

just to make sure i understand you. I think you are doing this:

models
- tests
 L conftest.py
 L some_test.py
- some_code.py

and in my repo we're currently doing:

models
- tests
 L conftest.py
 L some_test.py
- code
 L some_code.py

so similar to you. However, when you try to use pants to test only the code that has changed, you need to have specific dependencies from the test to the specific code files being tested, i.e. some_test.py -> some_code.py. That way, when you change some_code.py it will identify that it needs to run the some_test.py file as part of the tests. To write that dependency, you have to put the test in the same folder as the code.

better-van-82973

11/13/2023, 2:26 PM

I don’t think that’s the case - here are the contents of my BUILD files in the two packages: `models/tests/BUILD`:

python_test_utils()

python_tests()

`models/BUILD`:

python_sources()

Pants does dependency inference: https://blog.pantsbuild.org/why-dependency-inference/ so you shouldn’t have to explicitly specify the dependency between `some_test.py` and `some_code.py`
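
To make the inference idea concrete, here is a toy sketch of how a tool could map a test file's imports to first-party files. This is not Pants's actual implementation; the function name `infer_dependencies`, the `predict` function, and the module map are assumptions made up for illustration.

```python
import ast


def infer_dependencies(source: str, module_to_file: dict[str, str]) -> set[str]:
    """Return the first-party files that `source` imports, using a module map."""
    deps = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            # "from models import some_code" may refer to the module models.some_code,
            # so try both the package and package.name forms.
            names = [node.module] + [f"{node.module}.{a.name}" for a in node.names]
        else:
            continue
        deps.update(module_to_file[n] for n in names if n in module_to_file)
    return deps


# Hypothetical layout matching the thread: models/some_code.py is imported
# by models/tests/some_test.py.
module_to_file = {"models.some_code": "models/some_code.py"}
test_source = (
    "from models.some_code import predict\n"
    "\n"
    "def test_predict():\n"
    "    assert predict() is not None\n"
)
inferred = infer_dependencies(test_source, module_to_file)
print(inferred)  # {'models/some_code.py'}
```

The point is only that the import statement already carries the some_test.py -> some_code.py edge, regardless of which folder the test lives in.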

happy-kitchen-89482

11/13/2023, 2:26 PM

Those dependencies are typically inferred automatically

happy-kitchen-89482

11/13/2023, 2:27 PM

In our large (but private) Pants-using repo we had the tests live alongside the code under test: `src/python/bar/foo_test.py` tests `src/python/bar/foo.py` and so on

happy-kitchen-89482

11/13/2023, 2:27 PM

I like that because it makes it really easy to find the test for a piece of code

happy-kitchen-89482

11/13/2023, 2:28 PM

And since Pants understands deps it can make sure not to package the tests up with the code for deployment (which is historically why people have separate tests/ folders)

happy-kitchen-89482

11/13/2023, 2:28 PM

And TBH not that packaging up the tests is such a big deal either...

happy-kitchen-89482

11/13/2023, 2:29 PM

And the conftest.py goes in `src/python/bar` or `src/python`, depending on what scope it's supposed to apply to
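
As a rough illustration of that scoping rule, the sketch below lists which conftest.py files would apply to a given test file: those in the test's own directory or any parent directory. It is a simplification of pytest's lookup (it ignores rootdir and confcutdir), and `applicable_conftests` and the `baz/qux_test.py` path are hypothetical names.

```python
from pathlib import PurePosixPath


def applicable_conftests(test_file: str, conftest_files: set[str]) -> list[str]:
    """List conftest.py files in the test's directory or any parent, outermost first."""
    found = []
    for directory in PurePosixPath(test_file).parents:
        candidate = str(directory / "conftest.py")
        if candidate in conftest_files:
            found.append(candidate)
    return list(reversed(found))


conftests = {"src/python/conftest.py", "src/python/bar/conftest.py"}

# A test under bar/ sees both the shared and the bar-specific conftest.
print(applicable_conftests("src/python/bar/foo_test.py", conftests))
# A test in a sibling package only sees the shared one.
print(applicable_conftests("src/python/baz/qux_test.py", conftests))
```

So fixtures meant for one package go next to that package's tests, and fixtures shared by everything go in the common parent.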

fast-photographer-12719

11/13/2023, 2:29 PM

Oh yeah i definitely prefer having the tests in their own folder. hmm, ok i've made a misstep then. When i change foo.py and run the test-changed-since command, it tests the entire repo, and i only want it to run test_foo.py

happy-kitchen-89482

11/13/2023, 2:30 PM

Do you have an explicit dependency somewhere that could be causing that?

happy-kitchen-89482

11/13/2023, 2:30 PM

You can investigate your dependencies with `pants dependencies --transitive`

happy-kitchen-89482

11/13/2023, 2:31 PM

or you can find a dependency path between two given targets with `pants paths`
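
Conceptually, finding a dependency path between two targets is a graph search. The sketch below does a breadth-first search over a small hand-written graph; it says nothing about how Pants implements the goal, and the graph contents (including the `3rdparty/python:some_lib` address) are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional


def find_path(graph: Dict[str, List[str]], start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search for one shortest dependency path from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for dep in graph.get(path[-1], []):
            if dep not in seen:
                seen.add(dep)
                queue.append(path + [dep])
    return None  # no dependency path exists


# Edges point from a target to the things it depends on (hypothetical data).
graph = {
    "models/tests/some_test.py": ["models/tests/conftest.py", "models/some_code.py"],
    "models/some_code.py": ["3rdparty/python:some_lib"],
}

print(find_path(graph, "models/tests/some_test.py", "3rdparty/python:some_lib"))
```

If an unexpected path shows up between a test and an unrelated file, that edge is a good candidate for why too many tests are selected.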

fast-photographer-12719

11/13/2023, 2:33 PM

Thank you, i will try `pants paths`.

fast-photographer-12719

11/13/2023, 2:33 PM

the dependencies look not quite right so likely to be that, i'll investigate. Thanks!

fresh-cat-90827

11/13/2023, 2:44 PM

I have a semi-toy project where I develop with Pants, please see https://github.com/AlexTereshenkov/cheeseshop-query/. It uses most of the widely used Pants features, so perhaps you can use it as a sandbox for your explorations.

👍 1

fast-photographer-12719

11/13/2023, 2:57 PM

Thanks Alexey, that is helpful. I'm now seeing the following command is always running all tests, even when no change to the repo has been made.

pants --changed-since=HEAD --changed-dependents=transitive test

When i removed `--changed-dependents` it ran nothing, as i'd expect. I'll try looking through the dependencies to see if there's anything strange in there but they looked correct to me earlier.

fresh-cat-90827

11/13/2023, 2:58 PM

interesting! What's your `git diff`, are you sure you don't have any changes in the working tree? 😕

fast-photographer-12719

11/13/2023, 2:58 PM

(base3.10) matt@DGX-2:~/mlcore$ git diff
(base3.10) matt@DGX-2:~/mlcore$

fresh-cat-90827

11/13/2023, 3:00 PM

I wonder if this is your first commit in the repo or something like that, so HEAD is always going to tell you there are changes (as it compares to "nothing"). But I am skeptical 😕

fresh-cat-90827

11/13/2023, 3:00 PM

I assume all tests are listed with the `pants --changed-since=HEAD --changed-dependents=transitive list` goal?

fast-photographer-12719

11/13/2023, 3:01 PM

this is on my repo rather than yours, so there are plenty of previous commits. Yes, all tests appear to be listed (there are a lot but i have no reason to believe any have been missed)

fast-photographer-12719

11/13/2023, 3:02 PM

I'm going to try on your cheeseshop repo to ensure it's not something wrong with my install of pants

✅ 1

fresh-cat-90827

11/13/2023, 3:20 PM

FWIW may be worth running `pants dependents --transitive <your-python-test-module.py>` on random files to see what kind of things they depend on

fast-photographer-12719

11/13/2023, 3:44 PM

Sorry Alexey, i misread your suggestion earlier. I have run `pants --changed-since=HEAD --changed-dependents=transitive list` and i see pretty much every file in the repo. Now when i remove the transitive option, i see only one target: the default lockfile.

(py39) matt@DGX-2:~/mlcore$ pants --changed-since=HEAD list
3rdparty/python/default.lock:_default-resolve_lockfile

I guess that's because the lockfile is younger than the commit perhaps? When i generate the lockfile on the cheeseshop repo and run the list or test again, it runs the tests every time. So i guess there is something about the lockfile's age?

fast-photographer-12719

11/13/2023, 3:50 PM

To recreate on the cheeseshop repo:
1. clone or get a clean state of the cheeseshop repo
2. run `pants --changed-since=HEAD --changed-dependents=transitive list` and you will see no targets
3. run `pants generate-lockfiles`
4. repeat step 2, but now you will see lots of dependencies, which means it will try to run all the tests

fast-photographer-12719

11/13/2023, 3:57 PM

Is there a flag i can set to ignore the lockfile creation time from the --changed-since logic?

fresh-cat-90827

11/13/2023, 4:08 PM

well, fundamentally, since the lockfile is different, all targets that depend on it should be considered changed. I am puzzled why you would like to ignore this fact? 😕
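
A small sketch of why a changed lockfile selects nearly everything when transitive dependents are included: start from the changed targets and walk the "is depended on by" edges until nothing new is found. The graph below is hypothetical and much simpler than a real Pants target graph; `transitive_dependents` is an illustrative name, not a Pants API.

```python
def transitive_dependents(deps: dict[str, set[str]], changed: set[str]) -> set[str]:
    """Return `changed` plus everything that depends on it, directly or indirectly."""
    # Invert the "depends on" edges so we can walk from a target to its dependents.
    dependents: dict[str, set[str]] = {}
    for target, its_deps in deps.items():
        for dep in its_deps:
            dependents.setdefault(dep, set()).add(target)

    result = set(changed)
    stack = list(changed)
    while stack:
        for parent in dependents.get(stack.pop(), ()):
            if parent not in result:
                result.add(parent)
                stack.append(parent)
    return result


# Hypothetical graph: a requirement hangs off the lockfile, code uses the
# requirement, and a test imports the code.
deps = {
    "3rdparty/python:some_lib": {"3rdparty/python/default.lock"},
    "models/some_code.py": {"3rdparty/python:some_lib"},
    "models/tests/some_test.py": {"models/some_code.py"},
    "scripts/standalone.py": set(),
}

print(sorted(transitive_dependents(deps, {"3rdparty/python/default.lock"})))
print(sorted(transitive_dependents(deps, {"models/some_code.py"})))
```

With the lockfile as the changed target, every target that uses any third-party requirement ends up selected; with only some_code.py changed, just the code and its test are.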

fast-photographer-12719

11/13/2023, 4:09 PM

the lockfile isn't different though

👀 1

fresh-cat-90827

11/13/2023, 4:09 PM

`pants generate-lockfiles`: it may not be obvious, but this command may produce a different output on subsequent runs even with no source code changes, because it fetches data from external resources and pip may resolve the 3rd party dependencies differently

fresh-cat-90827

11/13/2023, 4:11 PM

"the lockfile isn't different though": oh I see, let me try to repro locally and explore!

fast-photographer-12719

11/13/2023, 4:11 PM

i understand it might be different and i can see why you'd want to check it hadn't changed when using `--changed-since`. However, if i was using this in CI for example, then i'd generate the lockfile before the tests, so it would always identify the lockfile as newer i think?

fresh-cat-90827

11/13/2023, 4:13 PM

I wouldn't recommend following this approach. You would generate a lockfile once and keep it checked in. You may regenerate the lockfiles once in a while, at whatever cadence your organization finds appropriate (e.g. weekly, monthly). Regenerating a lockfile before every test run isn't going to be helpful 🙂

fast-photographer-12719

11/13/2023, 4:16 PM

i thought the lockfile had to be generated after cloning the repo though? I thought the lockfile isn't committed to the repo, and that it has to be generated on each platform so the resolves are correct for that platform? So if i were using it in CI, then i'd have to run generate-lockfiles first? Or does the test command generate its own lockfile?

fresh-cat-90827

11/13/2023, 4:20 PM

https://www.pantsbuild.org/docs/python-lockfiles there's no need to regenerate the lockfiles after cloning the repo. If you have conflicting requirements (you can't use a certain PyPI package on arm64, for instance), you could have multiple resolves, each having a separate lockfile. The `test` and other goals would normally use the lockfile that is already in the repo, they don't have to generate any.

fresh-cat-90827

11/13/2023, 4:20 PM

does this help at all?

fast-photographer-12719

11/13/2023, 4:23 PM

ok, so you would normally commit the lockfile to the repo?

👆 1

fresh-cat-90827

11/13/2023, 4:23 PM

100%. You want to make sure you run your tests against the transitively pinned dependencies and every user of the repo (CI, developers, deployment) does the same

👍 1

fresh-cat-90827

11/13/2023, 4:24 PM

e.g. https://github.com/pantsbuild/pants/blob/38cb6def3fa4d4a04baae03786dc016ae5d90154/3rdparty/python/user_reqs.lock Pants lockfile used for its own requirements

fast-photographer-12719

11/13/2023, 4:25 PM

ok great, for some reason i thought the lockfiles had to be generated each time. Great, that's helped immensely. Thank you! 😄

fresh-cat-90827

11/13/2023, 4:28 PM

no problem at all! There are many moving parts and the Python ecosystem is very... evolved 😄 and Pants just adds more entities to reason about (albeit with the idea to streamline the build). If it helps, you can think of a lockfile as a `go.sum` file in the Golang world. So your `go.mod` is the Python requirements file with direct dependencies (optionally with exact versions, but most often not) and `go.sum` is the resolved list of requirements with transitively pinned dependencies (with checksums)
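
The difference between the two files can be sketched as data: a short list of loose, direct requirements versus a full mapping of every resolved package to one exact version and a checksum. Everything below is made up for illustration (`examplepkg`, `exampledep`, the byte strings, and the dict layout); real Pants lockfiles use their own format.

```python
import hashlib

# What a human writes: direct dependencies, often with loose version ranges.
direct_requirements = ["examplepkg>=1.0"]

# Stand-ins for downloaded artifacts (hypothetical package names and contents).
fake_artifacts = {
    ("examplepkg", "1.4.2"): b"bytes of examplepkg 1.4.2",
    ("exampledep", "0.9.0"): b"bytes of exampledep 0.9.0",  # transitive dependency
}

# What a resolver writes: every package, direct or transitive, pinned to one
# version plus a checksum of the artifact.
lockfile = {
    name: {"version": version, "sha256": hashlib.sha256(data).hexdigest()}
    for (name, version), data in fake_artifacts.items()
}


def matches_lock(name: str, version: str, data: bytes) -> bool:
    """True only if the artifact is the exact version and content the lock pinned."""
    pinned = lockfile.get(name)
    return (
        pinned is not None
        and pinned["version"] == version
        and pinned["sha256"] == hashlib.sha256(data).hexdigest()
    )


print(matches_lock("examplepkg", "1.4.2", b"bytes of examplepkg 1.4.2"))  # True
print(matches_lock("examplepkg", "1.4.2", b"tampered bytes"))             # False
print(matches_lock("examplepkg", "1.5.0", b"bytes of examplepkg 1.4.2"))  # False
```

Because the pinned mapping is what makes installs repeatable, it only helps if everyone reads the same checked-in copy.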

👍 1

fresh-cat-90827

11/13/2023, 4:29 PM

FWIW https://github.com/golang/go/wiki/Modules#should-i-commit-my-gosum-file-as-well-as-my-gomod-file

fresh-cat-90827

11/13/2023, 4:30 PM

and https://github.com/jazzband/pip-tools#should-i-commit-requirementsin-and-requirementstxt-to-source-control if you are new to Python ecosystem - this is how many other organizations who are not using Pants or similar build system resolve their dependencies with the lockfiles being very similar to what you see in the Pants lockfiles

fast-photographer-12719

11/13/2023, 4:39 PM

Yeah, I'm familiar with python/requirements.txt. I was under the impression that the lockfiles were more bespoke to the system generating them. i.e. a lockfile generated on a mac, might not work for an ubuntu system etc, and therefore the lockfile should be generated on each new system cloning the repo. It sounds like i had the wrong impression though, so thanks for your help 👍

happy-kitchen-89482

11/13/2023, 4:56 PM

Ah no, the lockfile is cross-platform and is intended to be checked in. Possibly we should emphasize this more in the documentation.

👍 1

fast-photographer-12719

11/16/2023, 11:48 AM

Hi Alexey and Benjy. You were so helpful last time i thought it would be good to ask you directly: Is there any way to include a resource from above the directory of the current BUILD file? I have two use cases:
1. My repo needs to check git data, so it needs access to the .git folder. The code that needs this is further down the directory structure, so i can't use resources or files targets.
2. I want to build a pex file that includes CUDA libraries to run torch with GPU support. I hoped i could use something like resources to pull those in as well, but i know this is a huge task so it's probably not that simple even if i could get the directories included in the pex

happy-kitchen-89482

11/18/2023, 1:38 AM

A BUILD file can only own `sources=` that are below it in the filesystem tree, but you can have a separate BUILD file that has a `resources()` or `files()` target that owns those sources, and then the first target can have an explicit dependency on that target.
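
A sketch of what that could look like, shown as two BUILD files in one block. The directory names, the target name `shared_data`, and the glob are hypothetical; only the `files`, `python_sources`, `sources` and `dependencies` names are standard Pants.

```python
# BUILD at the repo root (hypothetical location and target name).
# This target owns files that live below the root but outside the package.
files(
    name="shared_data",
    sources=["shared/**/*"],
)

# some/nested/package/BUILD (hypothetical path).
# The explicit dependency points at the root-level target by its address.
python_sources(
    dependencies=["//:shared_data"],
)
```

The nested package never owns the files itself; it just depends on the target that does.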

👍 1

happy-kitchen-89482

11/18/2023, 1:41 AM

However Pants ignores `.git` (and all other root-level dirs starting with a dot) by default (see https://www.pantsbuild.org/docs/reference-global#pants_ignore), so you'll have to futz with that option

happy-kitchen-89482

11/18/2023, 1:41 AM

and I'm not sure what the implications are

happy-kitchen-89482

11/18/2023, 1:41 AM

Re pytorch in pex, that is a huge topic that has been discussed a LOT (search this slack for details)

happy-kitchen-89482

11/18/2023, 1:42 AM

Probably we should gather all that info into a documentation page
