# general
h
hey guys, I’m setting up airflow and was wondering if there are any best practices or recommendations for getting airflow set up in our monorepo. specifically:
1. we’re using poetry to track all our common dependencies, but since airflow pins its dependencies and doesn’t support poetry, we can’t simply run
poetry add apache-airflow
since it leads to an unsatisfiable set of dependency constraints
2. because of this, airflow requires you to provide a
requirements.txt
with a list of 3rd-party libraries used in your DAGs. does pants have a way to generate a requirements file from a
python_sources
target? am i thinking about this the wrong way?
3. i haven’t started adding support for 1st-party code that’s imported into DAGs, but I have a feeling this is not going to be very straightforward. any recommendations here?
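For question 2, one possible direction is to post-process target metadata into a requirements file. The sketch below is not a built-in Pants feature: it assumes you already have JSON describing the transitive dependencies of your DAG targets, shaped as a list of objects with `target_type` and `requirements` keys (roughly how `pants peek` output is laid out, but verify this against your Pants version). The function names are hypothetical.

```python
import json


def requirements_from_peek(peek_json: str) -> list[str]:
    """Collect requirement strings from peek-style JSON, sorted and de-duplicated.

    ASSUMPTION: the input is a JSON list of objects, each with a "target_type"
    key and, for python_requirement targets, a "requirements" list of strings.
    """
    targets = json.loads(peek_json)
    found = set()
    for target in targets:
        # Only python_requirement targets carry third-party requirement strings;
        # first-party targets such as python_sources are skipped.
        if target.get("target_type") != "python_requirement":
            continue
        found.update(target.get("requirements", []))
    return sorted(found, key=str.lower)


def write_requirements(peek_json: str, path: str = "requirements.txt") -> None:
    """Write one requirement per line to the given path."""
    lines = requirements_from_peek(peek_json)
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(lines) + "\n")
```

The input could come from something along the lines of `pants dependencies --transitive path/to/dags:: | xargs pants peek > peek.json`, where `path/to/dags` is a placeholder for wherever your DAGs live.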
r
Hey, not that I have an answer for all these questions, but how were you using airflow without pants in terms of 1st-party dependencies? In the past I have used a docker image to publish all the dependencies needed to run a DAG.
h
i haven’t gotten to supporting 1st party code yet 😅
r
Where are you running airflow? Self managed?
h
MWAA
deployed through pulumi
@refined-addition-53644 do you have any links to docs on how you were using docker to publish your dependencies? currently i’m looking at bundling our code into a wheel to create a
plugins.zip
file and then generating a
requirements.txt
file semi-manually, but i can’t imagine this is the right way to do things https://docs.aws.amazon.com/mwaa/latest/userguide/configuring-dag-import-plugins.html#configuring-dag-plugins-airflow-ex
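As a rough illustration of the bundling step, here is a minimal sketch that zips first-party Python files into a `plugins.zip` using only the standard library. It zips source files directly rather than going through a wheel, and it assumes that packages placed at the root of the archive end up importable from DAGs once MWAA unpacks it; check that against the MWAA docs linked above. The function name and directory layout are hypothetical.

```python
import zipfile
from pathlib import Path


def build_plugins_zip(source_dir: str, zip_path: str = "plugins.zip") -> list[str]:
    """Zip every .py file under source_dir and return the archive member names.

    ASSUMPTION: source_dir contains the first-party packages to ship, and
    placing them at the zip root is what the target environment expects.
    """
    root = Path(source_dir)
    added = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for file in sorted(root.rglob("*.py")):
            # Store paths relative to source_dir so packages sit at the zip root.
            arcname = file.relative_to(root).as_posix()
            archive.write(file, arcname)
            added.append(arcname)
    return added
```

Calling `build_plugins_zip("src")` would write `plugins.zip` in the current directory; non-Python files are skipped, so data files would need separate handling.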
r
Sorry, I didn't add the disclaimer that I haven't used MWAA. I had used airflow (cloud composer) on GCP, which was running inside a k8s cluster, and we used to push a docker image to our private docker repo during CI/CD. Then you can use the kubernetes pod operator to run it. I googled around for docker support for MWAA and found this: https://medium.com/@sohflp/how-to-work-with-airflow-docker-operator-in-amazon-mwaa-5c6b7ad36976
h
thanks, that makes sense! and in that docker image you included your first-party code as well as the requirements file? how did you generate the requirements? was it simply the same set of reqs you have in your entire project?
r
Yeah, we had a common requirements.txt for the whole repo
w
@high-energy-55500 I'm looking for answers to your questions, since I'm running into the same situation. Were you able to generate the requirements.txt file automatically using pants? Also, I'm curious about how you packaged your first-party code: was it as a pex file or a tar file?
h
@white-twilight-61019 the gist of it is that airflow is a PITA to work with. we ended up needing a bunch of hacks to get airflow working. recently we've switched to multiple resolves, which addresses some of the pain points with the incompatible packages airflow requires, but adds a whole other layer of complexity. if I could go back, i would have preferred we'd gone with Prefect or some other software. Airflow is one of the most frustrating pieces of software i've ever dealt with