# general
a
Hello! I am creating a PoC for a Go monorepo, and I am trying out Pants. I got things working, and I have a Go package where I run some tests. It is a single file with 9 quite simple tests, so nothing big. The dependencies are quite big, though. When I first ran the tests, it took quite a while, since analysing such big dependencies takes time. But when I rerun the exact same command without changing any code, it still takes 22 seconds, every time. The command I run is
pants test libs/generators/azure:azure
, and it says
libs/generators/azure:azure succeeded in 0.91s (cached locally).
, which I interpret as the actual tests taking less than a second, while the setup takes the other 21 seconds, which seems a bit much. Am I doing something wrong, or is this expected? I attached a 30 second video just to show it in action.
g
So, two things... Your pantsd is dying, which causes a lot of things to get invalidated unnecessarily. Which Pants version are you on? If you remove
.pants.d/pants.log
and then run it a few times, can you then share that log? It would be interesting to see why that happened. You might benefit from tuning the max memory; that is a common cause of spurious shutdowns. The log will say whether that is the reason, or something else. Second, I'm quite sure you're seeing https://github.com/pantsbuild/pants/issues/20274... If you can do two back-to-back runs with
pants -ldebug <...>
and share the logs, that would help diagnose it.
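Roughly something like this (run1.txt / run2.txt are just placeholder names I'm making up, and the exact log path may differ between versions):
rm .pants.d/pants.log   # then rerun the test a few times to repopulate the log
pants -ldebug test libs/generators/azure:azure > run1.txt 2>&1
pants -ldebug test libs/generators/azure:azure > run2.txt 2>&1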
a
Thank you for the reply! I think you are correct. I wasn't aware of the log, sorry. This is the
.pants.d/workdir/pants.log
after I deleted it and reran the command:
13:02:13.15 [INFO] pantsd 2.22.0 running with PID: 23200
13:02:13.18 [INFO] handling request: `test libs/generators/azure:azure`
13:02:22.19 [ERROR] The scheduler was invalidated: Exception('pantsd process 23200 was using 4189.11 MiB of memory (above the `--pantsd-max-memory-usage` limit of 4096.00 MiB).')
13:02:22.34 [ERROR] service failure for <pants.pantsd.service.scheduler_service.SchedulerService object at 0x10770b6a0>.
13:02:22.34 [INFO] Waiting for ongoing runs to complete before exiting...
13:02:33.64 [INFO] request completed: `test libs/generators/azure:azure`
13:02:33.64 [INFO] Server exiting with Ok(())
13:02:33.64 [INFO] Waiting for Sessions to complete before exiting...
13:02:33.64 [INFO] Waiting for shutdown of: ["scheduler_service_session", "store_gc_service_session", "pants_run_2024_10_02_13_02_14_324_d07f5bc721544172a468db293c35b6db"]
13:02:33.64 [INFO] Shutdown completed: "scheduler_service_session"
13:02:33.64 [INFO] Shutdown completed: "store_gc_service_session"
13:02:33.64 [INFO] Shutdown completed: "pants_run_2024_10_02_13_02_14_324_d07f5bc721544172a468db293c35b6db"
13:02:33.67 [INFO] Exiting pantsd
13:02:33.67 [WARN] File watcher exiting with: The watcher was shut down.
Raising the max memory to 8GiB instead of 4GiB solved the issue, and it now says
libs/generators/azure:azure succeeded in 0.91s (memoized).
instead.
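For reference, the option is the one from the error message (--pantsd-max-memory-usage); in pants.toml I believe the bump looks roughly like this:
[GLOBAL]
# raised from 4GiB; if your Pants version doesn't accept size suffixes, use the value in bytes instead
pantsd_max_memory_usage = "8GiB"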
The output from the runs with the
-ldebug
flag was quite large, but I added the logs as .txt attachments.
g
Thank you! Glad to hear the memory bump seems like a solution. Re the other two logs, nothing stands out there... If you have the time, can you repeat the same experiment with
-ltrace
? I'm hoping to see whether it is actually doing any work or just spending those 18 seconds retrieving cached info. I have a sneaking suspicion that most of those 4 GiB aren't actually used; they're just part of the dispatch for elidable work.
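Same shape as the -ldebug runs, e.g. (trace-run1.txt / trace-run2.txt are again just made-up names):
pants -ltrace test libs/generators/azure:azure > trace-run1.txt 2>&1
pants -ltrace test libs/generators/azure:azure > trace-run2.txt 2>&1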
a
Of course! Added the two new runs as attachments.
g
Awesome, thank you! Will take a peek later; I assume these are even longer.
a
They are both 13 megabytes, so yeah, quite a lot 🙂
😰 1
g
Ok, looked through it. As far as I can tell, both of those runs are pulling from the cache; there's no actual work being done. It's just dispatching a metric ton of Pants-internal analysis rules - ~15k of them.
(No new info really, but always good with more datapoints.)
a
Thank you for checking! Do you want me to add anything to the existing issue?
g
If the repo is open, another repro is always helpful, but I think it's mostly a time issue at this point: someone has to do the (probably significant) refactor of the code here.
a
It is not open, unfortunately 😞 Anyways, thanks for helping! Tell me if there is anything more you want me to test.