https://pantsbuild.org/ logo
#development
Title
# development
h

hundreds-father-404

11/22/2019, 8:22 PM
Hi - in a unit test is it possible to register a
Digest
with the engine but not have it be created in the test build root? I’m rewriting
fmt-v2
to use
Workspace
and want to test that
fmt-v2
properly converts a
FmtResult.digest
into
workspace.materialize_directories()
. However, my test setup doesn’t work properly because when I call
self.scheduler.product_request(Digest, [InputFilesContent(..)]
in the test setup, the digest gets materialized into the build root prematurely https://github.com/pantsbuild/pants/pull/8691/files#diff-beb17e333eb84b51667078f6c4785f16R40 Is there a way for that
Digest
to instead be created in a temporary directory?
a

aloof-angle-91616

11/22/2019, 8:23 PM
materialize_directories
i thought was supposed to take a root dir param? if not i might hack it to do that just for testing, just a thought. back to keyboard in a few
h

hundreds-father-404

11/22/2019, 8:24 PM
Yes, it does. We implement it in
fmt.py
to use
path='./'
(for context,
binary.py
uses
path='dist/'
)
w

witty-crayon-22786

11/22/2019, 8:32 PM
you can request a digest of
FilesContent
h

hundreds-father-404

11/22/2019, 8:33 PM
Ohh is the issue using
InputFilesContent
as the subject type?
w

witty-crayon-22786

11/22/2019, 8:33 PM
..oh.
InputFilesContent
.
yea.
h

hundreds-breakfast-49010

11/22/2019, 8:35 PM
the point of InputFilesContent is to create a file on disk
👍 1
that type gets handled as an intrinsic and actually does file-writing
I'm trying to think about the right way to create a Digest that isn't yet materialized on disk, we definitely do this
👍 1
w

witty-crayon-22786

11/22/2019, 8:37 PM
er, that's what that does
the intrinsic that consumes
InputFilesContent
puts it directly into the database, and then gives you a
Digest
for it.
h

hundreds-father-404

11/22/2019, 8:38 PM
that’s what that does
What what does?
InputFilesContent -> Digest
writes to disk, which I do not want, and
FilesContent -> Digest
isn’t registered
w

witty-crayon-22786

11/22/2019, 8:38 PM
it won't actually ever go in a directory on disk (just into the database)
no, it does not.
@hundreds-breakfast-49010: i think you're confusing it with materialize
(or something)
sec.
@hundreds-father-404: you used this to create
__init__.py
files
h

hundreds-breakfast-49010

11/22/2019, 8:40 PM
@hundreds-father-404 something like the
workspace_console_rule
test in
fs_test.py
may be what you want
...implements that intrinsic.
given InputFileContents, it creates snapshots by storing each file into the database/Store, and then merges them into a single snapshot and returns its digest
none of that touches any particular file path on disk... it's all "in memory" (slash "in the database")
@hundreds-father-404: what are you giving to
InputFilesContent
?
h

hundreds-father-404

11/22/2019, 8:44 PM
Okay so…turns out this is an issue with the V1 chroot
Running
./pants clean-all
and then
./pants test src/python/pants/rules/core:tests -- -k test_non_union_member_noops
works, but then it fails to run
./pants test src/python/pants/rules/core:tests -- -k FmtTest
. I think this is because the other two tests are writing to the workdir and not cleaning up after themselves
w

witty-crayon-22786

11/22/2019, 8:44 PM
mm, yea, i wasn't seeing anything in your example
@hundreds-father-404: oh. wait, are you using the Workspace in this test...?
h

hundreds-breakfast-49010

11/22/2019, 8:46 PM
yeah, I don't think tha'ts directly related to getting a Digest from an InputFilesContent
but yeah, Workspace actually writes a file to disk
so if you don't manually clean it up, it should stay there when the next test runs
h

hundreds-father-404

11/22/2019, 8:46 PM
oh. wait, are you using the Workspace in this test...?
Yes, the rule requires a
Workspace
arg so I’m passing it
Workspace(self.scheduler)
h

hundreds-breakfast-49010

11/22/2019, 8:47 PM
I think in my tests for workspace I deliberately made workspace write to a tmpdir so that would happen automatically
👍 1
but if we change workspace to require that we write to the build root, that won't work anymore
it's also possible that what we want is actually a MockWorkspcae, much like the MockConsole
w

witty-crayon-22786

11/22/2019, 8:47 PM
hm. and
self.scheduler
in a
TestBase
test should have an "anonymous" buildroot
h

hundreds-breakfast-49010

11/22/2019, 8:47 PM
but Workspace, right now, should actually be writing something to disk, that's what it's for
👍 1
w

witty-crayon-22786

11/22/2019, 8:48 PM
oh. yea,
Workspace
should either take a buildroot arg, or get it from the scheduler.
h

hundreds-father-404

11/22/2019, 8:48 PM
but Workspace, right now, should actually be writing something to disk, that’s what it’s for
Agreed, and it’s important that the tests confirm we are actually writing to disk. But somehow we do need to isolate each individual unit test. Hm
w

witty-crayon-22786

11/22/2019, 8:48 PM
this is why
get_buildroot
is insidious
👍 1
...exceeept. it... doesn't call that.
wat
basically,
self.scheduler
should have an anonymous buildroot.
h

hundreds-father-404

11/22/2019, 8:51 PM
Isn’t it already?
Copy code
@memoized_method
  def _build_root(self):
    return os.path.realpath(mkdtemp(suffix='_BUILD_ROOT'))
w

witty-crayon-22786

11/22/2019, 8:51 PM
yes. but it's not being used in this case
so... are you giving
Workspace
absolute paths...?
mmm.
Copy code
DirectoryToMaterialize(path="./", directory_digest=result.digest),
👍 1
i think that we are not prefixing the buildroot in
Workspace.materialize_directory
, and should be ...?
or perhaps deeper, in the rust code
because
./
will result in "cwd", not the buildroot, unless it is prefixed with the right path
👍 1
h

hundreds-father-404

11/22/2019, 8:55 PM
Then that means
path=dist/
in
binary.py
will also use
cwd
w

witty-crayon-22786

11/22/2019, 8:55 PM
and this probably made the tests for
src/python/pants/rules/core/binary.py
a bit broken
yea
so yea. we need to require a relative path in
Workspace
, and we should probably either prefix in
Workspace.materialize_directory
or deeper in https://github.com/pantsbuild/pants/blob/3896fb088fb979057062615351954df7f12eee1b/src/rust/engine/fs/store/src/lib.rs#L682-L691
h

hundreds-breakfast-49010

11/22/2019, 8:58 PM
that makes sense
w

witty-crayon-22786

11/22/2019, 8:58 PM
i don't know if we have any code that is attempting to write outside of the buildroot
but if we don't, we might just want to require a "path relative to the buildroot" more deeply
👍 1
(i can't think of any cases where we'd want to write outside of the buildroot)
h

hundreds-breakfast-49010

11/22/2019, 8:59 PM
yeah that was something that you had brought up at one point
the only case I know of right now is in fs_test.py
where we deliberately write to a tmpdir so the files get cleaned up
w

witty-crayon-22786

11/22/2019, 9:01 PM
each scheduler has a new buildroot in TestBase, in a tempdir
👍 1
h

hundreds-father-404

11/22/2019, 9:03 PM
Right now, we’re not hooking up the temporary
build_root
to the
TestBase
scheduler. https://github.com/pantsbuild/pants/blob/master/src/python/pants/testutil/test_base.py#L388-L396 could take an argument
build_root
and without it uses
get_buildroot()
w

witty-crayon-22786

11/22/2019, 9:08 PM
hm... are you sure? that doesn't seem likely
i feel like that would break... everything
...oh. maybe because
--chroot
is on by default?
h

hundreds-father-404

11/22/2019, 9:15 PM
Yes, when `scheduler`’s constructor calls
get_buildroot()
, it will still operate outside the true buildroot inside a
.pants.d
folder, but not where we want it
w

witty-crayon-22786

11/22/2019, 9:15 PM
oof
h

hundreds-father-404

11/22/2019, 9:46 PM
Only require relative path for
Workspace
, right? Not for any arbitrary
materialize_directories()
call?
w

witty-crayon-22786

11/22/2019, 9:54 PM
well, see above. if we can get away with it, we should require for all calls
i can't really imagine a good usecase for using this API to write files outside the buildroot
a

aloof-angle-91616

11/22/2019, 9:55 PM
+1
w

witty-crayon-22786

11/22/2019, 9:55 PM
we might have some codepaths using absolute paths, but they shouldn't be
h

hundreds-father-404

11/22/2019, 9:57 PM
w

witty-crayon-22786

11/22/2019, 9:57 PM
yep!
...or, modify
DirectoryToMaterialize
h

hundreds-father-404

11/22/2019, 9:58 PM
I started working on this PR but got stuck figuring out how to access the build root. We had created a singleton, but this is a normal function and we can’t
await
in
scheduler.materialize_directories
w

witty-crayon-22786

11/22/2019, 9:58 PM
the scheduler has a field, maybe... ?
@hundreds-father-404: i think that if you validate that it's relative outside the rust code
...you can prepend it inside the rust code
👍 2
h

hundreds-father-404

11/22/2019, 10:01 PM
Do we simply validate in Python
not pathlib.Path(f).is_absolute()
or we have to validte that it’s relative to the build root? That is, can we avoid knowing what the build root is on the Python side
w

witty-crayon-22786

11/22/2019, 10:02 PM
any relative path is "valid"
👍 1
because you're creating it
☝️ 2
so yea,
not is_absolute()
works
h

hundreds-father-404

11/22/2019, 10:02 PM
Ahhh. That conceptually makes sense. Thank you!
w

witty-crayon-22786

11/22/2019, 10:08 PM
h

hundreds-father-404

11/22/2019, 10:09 PM
Okay, thank you for pointing me to the code. How would that know what the build_root is though? Pass it as a new param to that function?
w

witty-crayon-22786

11/22/2019, 10:10 PM
...hm. i thought that the
Store
already had a root path, but it doesn't. so not there, i suppose.
👍 1
h

hundreds-father-404

11/22/2019, 11:04 PM
woot, got the original tests passing with this change! Putting up PR nowish