# development
w
Question about dep inference (or maybe just common sense in general): I have a C/C++ header here: https://github.com/sureshjoshi/pants-plugins/tree/76-cc-compilation/examples/cc/core/include/core - which is pulled in from another code module in that repo. If I'm compiling `examples/cc/core/src/greeter.c`, the `cc` backend knows how to discover the associated public header (via some dep inference rules, using a common C++ convention). However, if I'm discovering that public header from another module (e.g. the "app" module), it discovers the header file correctly, but from the header, the dep inference doesn't currently know that it has to compile the associated C/C++ source file to actually be useful (nor does there necessarily need to be an associated source file for a given header). I guess what I'm asking is whether dep inference should be something that is more "targeted" in nature (we need exactly this header, so try to figure out which are the associated source files), or whether it should follow a kitchen-sink approach (find a source root, enumerate ALL source files and headers below it, mash it all together, and try to make something useful)? The "easy" solution, which I already have, is to ensure the "app" module has an explicit dependency on the "core" `cc_library` target, which knows how to package the library and public headers correctly. That feels like a cop-out, since compiling dependencies from source is a valid workflow.
Alternatively, is there some way to say "oh, you found `greeter.h`, which is inside the `core` module, so inspect the core `BUILD` file and see if there are any `cc_library` targets present, and then take a dep on those"? Still feels cop-outy, but šŸ¤·
TLDR-ish: Is it reasonable to transitively, and broadly, dep-infer based on source roots, even though that could mean enumerating many, many more targets overall?
Actually, well, it gets even wilder - because there is the reasonable question of whether you'd want to infer straight to source files in the first place, or whether you'd want to infer that there are one or more `cc_library` targets your header is a part of, and make sure you use those - as they might have special link/compile flags.
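To sketch what the "targeted" option could look like: a rough Python illustration of mapping a discovered header back to candidate implementation files by convention. The helper name, the extension list, and the `include/` ↔ `src/` layout swap are all assumptions for illustration, not the actual Pants rule API.

```python
from pathlib import PurePosixPath

# Conventional implementation-file extensions to probe for a header.
# Purely illustrative -- real projects vary wildly.
_SOURCE_EXTENSIONS = (".c", ".cc", ".cpp", ".cxx")

def candidate_sources_for_header(header: str) -> list[str]:
    """Given e.g. 'core/include/core/greeter.h', guess implementation files
    like 'core/src/greeter.c' using the common include/ <-> src/ layout."""
    path = PurePosixPath(header)
    candidates = []
    # Same directory: greeter.h -> greeter.c sitting next to it.
    for ext in _SOURCE_EXTENSIONS:
        candidates.append(str(path.with_suffix(ext)))
    # include/ -> src/ convention: swap the first 'include' component for 'src'.
    parts = list(path.parts)
    if "include" in parts:
        i = parts.index("include")
        src_dir = parts[:i] + ["src"]
        for ext in _SOURCE_EXTENSIONS:
            candidates.append(str(PurePosixPath(*src_dir, path.stem + ext)))
    return candidates
```

A targeted rule would then check which of these candidates actually exist under the source root, rather than enumerating everything below it.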
h
So the question is, what should the target of the inferred dep be if some code has `#include "path/to/greeter.h"`? For example, should this imply a dep on `path/to/greeter.cc`? This is where things get odd. There is no compile-time dep on `greeter.cc` implied by a dep on `greeter.h`. That is the whole point of header files! You can compile translation units completely independently.
But, then the question is how to do the right thing at link time
w
Exactly - it gets kinda nutty. CMake basically solves this by forcing explicit specification of everything, which, whatever šŸ¤· A reasonable approach, to me, seems like looking for the source root's `cc_library` targets and pulling those in. But it already doesn't account for a whole host of workflows, I think (even some of my own workflows, for that matter, between test and production code). I could probably make the argument that if there isn't a `cc_library`, this could be a use case for a synthetic target? Might not be needed though.
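To illustrate the "look for the source root's `cc_library`" fallback: a toy Python sketch that walks up from a header toward the source root and returns the first `cc_library` targets it finds. The `BUILD_FILES` dict is a hypothetical in-memory stand-in for parsed BUILD metadata, not the Pants API.

```python
# Hypothetical model of BUILD metadata -- not the real Pants API.
# Maps a directory to the cc_library target names declared in its BUILD file.
BUILD_FILES = {
    "examples/cc/core": ["core"],   # e.g. cc_library(name="core", ...)
    "examples/cc/app": [],          # the app module declares no cc_library
}

def owning_cc_libraries(header_path: str) -> list[str]:
    """Walk up from the header's directory toward the root, returning the
    first cc_library targets found, as 'path:name' target addresses."""
    parts = header_path.split("/")
    for end in range(len(parts) - 1, 0, -1):
        directory = "/".join(parts[:end])
        libs = BUILD_FILES.get(directory)
        if libs:
            return [f"{directory}:{name}" for name in libs]
    # No cc_library found anywhere above the header: this is where the
    # "synthetic target" idea from the discussion might kick in.
    return []
```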
> the question is how to do the right thing at link time
I didn't really think about compiling the missing files at link time - that feels like a weird time to do it, but I don't see why not, actually. C++ is a great case for the visibility code too, which will need to be adhered to. In my brain, I treat each of my `core` / `app` / whatever modules as standalone "things". That might not be the Pants way, but it's definitely the SJ's-brain way. From that perspective, not importing a `cc_library` would be weird, because you're potentially trying to grab an unfinished "thing". However, I can validly see someone using multiple source roots and just wanting to compile code.
So, that is to say - I'm perfectly fine running the full dep inference per source root, and kinda "reverse engineering" the .c files from headers (as dependents), but that would almost certainly lead to over-compilation.
And then of course... Modules (the C++ kind)...
h
My point is that in C/C++ land we have to be careful about what we mean by ā€œdependentsā€. There are compile-time deps and link-time deps, and they are different. If I modify a .cc file, I do not need to recompile anything that #includes the corresponding .h file. That is the whole point of .h files! But I do need to re-link the binary.
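That compile-time/link-time distinction can be made concrete with a toy invalidation model in Python. The `INCLUDES` map and helper are invented for illustration: editing a .c file re-runs only its own compile (plus the link), while editing a .h file re-runs the compile of every includer.

```python
# Toy model of compile-time vs link-time dependencies (illustrative only).
# Each .c file compiles to an object using only the headers it #includes;
# the final binary links all the objects together.
INCLUDES = {
    "app/main.c": ["core/greeter.h"],
    "core/greeter.c": ["core/greeter.h"],
}

def invalidated_by(changed: str) -> dict:
    """Which build steps must re-run after `changed` is edited?"""
    recompile = {src for src, hdrs in INCLUDES.items()
                 if src == changed or changed in hdrs}
    return {
        "recompile": sorted(recompile),
        # Any recompiled object forces a re-link of the binary.
        "relink": bool(recompile),
    }
```

Editing `core/greeter.c` recompiles only that file (but still relinks); editing `core/greeter.h` recompiles both translation units.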
w
Yeah, exactly - it gets hella muddled.
h
c
half-formed thought: It feels like this is a case where punning has generally worked before, but now doesn't. In Python, a file is both the exported symbols and the implementation, so referencing it as the target for all of that makes sense. For C/C++, these are not necessarily the same. There is also the fact that a file can be smaller than the smallest separable grain (the pathological case is many .h to many .c, but even a 1:1 .h-to-.c pair isn't always separable). So would it make sense to split the normally punned roles for a target into separate ones? Like `exported_symbols`, `source_code`, `translation_unit`, ... And then you could have different dependency rules for different purposes. So you could say that the `source_code` of "app/main.c" depends on the `exported_symbols` of "core/greeter.h"; but you can use different rules to infer that the `binary` of "app/main.c" depends on the `exported_symbols` of "core/greeter.h", so Pants knows it needs to find the `translation_unit` that includes those symbols and its associated `source_code`.
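To make that role-splitting idea concrete, here is a toy Python model. The role names (`source_code`, `exported_symbols`, `translation_unit`) come from the message above, but the graph shape and helper are invented for illustration; the point is that the compile-time closure stops at the header's symbols, while the link-time closure reaches through to the defining translation unit and its source.

```python
# Role graph: (role, file) -> set of (role, file) dependencies.
# Compile-time rule: compiling app/main.c needs only the header's symbols.
COMPILE_DEPS = {
    ("source_code", "app/main.c"): {("exported_symbols", "core/greeter.h")},
}
# Link-time rules extend that: exported symbols must be backed by the
# translation unit that defines them, which in turn needs its own source.
LINK_DEPS = {
    **COMPILE_DEPS,
    ("exported_symbols", "core/greeter.h"): {("translation_unit", "core/greeter.c")},
    ("translation_unit", "core/greeter.c"): {("source_code", "core/greeter.c")},
}

def closure(graph, root):
    """Transitive closure of the role graph from `root` -- what a `binary`
    target would traverse to find everything it must compile and link."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen
```

Under `COMPILE_DEPS`, the closure of `("source_code", "app/main.c")` never reaches `greeter.c`; under `LINK_DEPS`, it does - matching the earlier point that modifying `greeter.c` forces a re-link but not a recompile of `main.c`.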