Monitoring a Dockerized Celery Cluster with Flower
A flower, sometimes known as bloom or blossom, is the reproductive structure found in flowering plants. Celery is a marshland plant in the family Apiaceae that has been cultivated as a vegetable since antiquity. A docker is a waterfront manual laborer who is involved in loading and unloading ships, trucks, trains or airplanes – Wikipedia.
There are only two hard things in Computer Science: cache invalidation and naming things. – Phil Karlton
What is Flower?
Flower is a web-based tool for monitoring Celery workers and task progress.
Install with pip:
# install flower via pip
pip install flower
# start flower, pass the message broker url and flower port
flower --broker=redis://localhost:6379/0 --port=8888
Open http://localhost:8888 in your browser.
Flower on Docker
Use the official mher/flower Docker image to dockerize Flower. Define the docker-compose flower service:
# docker-compose.yaml
flower:
  image: mher/flower
  command: ["flower", "--broker=redis://redis:6379/0", "--port=8888"]
  ports:
    - 8888:8888
This solution is not ideal. If you need to change the broker URL, you have to touch the flower command, and the configuration of your Celery workers as well, since they use the same broker. That gets messy if you run your app in more than one environment (say, QA and production). And this is not even a complex setup.
Stuff like broker url and flower port is configuration. The twelve-factor app stores config in environment variables. Environment variables are easy to change between deploys. Docker supports and encourages the use of environment variables for config. Both Celery and Flower support configuration via environment variables out of the box. Flower is (roughly speaking) a Celery extension and thus supports all Celery settings.
All Celery settings (listed in the Celery documentation) can be set via environment variables: write the setting name in capital letters and prefix it with CELERY_. For example, to set broker_url, use the CELERY_BROKER_URL environment variable. The Flower-specific settings can also be set via environment variables. The Flower documentation has the full list; uppercase the setting name and prefix it with FLOWER_. For instance, to configure port, use the FLOWER_PORT environment variable.
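The naming rule is mechanical enough to sketch in a few lines of Python. The two helpers below are hypothetical illustrations, not part of Celery or Flower: env_var_name builds the variable name for a setting, and settings_from_env goes the other way, assuming nothing more than the prefix-plus-uppercase convention described above.

```python
def env_var_name(prefix, setting):
    """Return the environment variable name for a setting such as broker_url."""
    return f"{prefix.upper()}_{setting.upper()}"


def settings_from_env(prefix, environ):
    """Collect every PREFIX_* variable and return it under its lowercase setting name."""
    marker = prefix.upper() + "_"
    return {
        name[len(marker):].lower(): value
        for name, value in environ.items()
        if name.startswith(marker)
    }


# Example environment using the values from this article (PATH is unrelated noise).
example_env = {
    "CELERY_BROKER_URL": "redis://redis:6379/0",
    "FLOWER_PORT": "8888",
    "PATH": "/usr/bin",
}

print(env_var_name("celery", "broker_url"))
print(settings_from_env("flower", example_env))
```

In a real deployment you would pass os.environ instead of the example dictionary. Note that values arrive as strings, so a port still needs converting to an integer by whoever consumes it.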
Refactor the docker-compose flower service:
# docker-compose.yaml
flower:
  image: mher/flower
  environment:
    - CELERY_BROKER_URL=redis://redis:6379/0
    - FLOWER_PORT=8888
  ports:
    - 8888:8888
Celery Worker on Docker
The Flower dashboard lists all Celery workers connected to the message broker. Celery assigns the worker name. The worker name defaults to celery@hostname. In a container environment, hostname is the container hostname. For what it’s worth, the container hostname is a meaningless string.
As long as you run only one type of Celery worker, this is not an issue. It becomes one when you run specialised workers in dedicated containers, with different workers processing different queues: you cannot tell which worker is which by looking at the Flower dashboard. All you see is a list of celery@ names followed by meaningless container hostnames.
One way to solve this is to control the hostname. Docker gives you control over the hostname via the hostname key of a docker-compose service:
# docker-compose.yaml
worker_1:
  hostname: worker_1
  command: ["celery", "worker", "--app=worker.app", "--loglevel=INFO"]
worker_2:
  hostname: worker_2
  command: ["celery", "worker", "--app=worker.app", "--loglevel=INFO"]
The Flower dashboard now shows these workers as celery@worker_1 and celery@worker_2. Unfortunately, this solution does not scale. Docker uses the same hostname for all containers that belong to the same service. Scaling worker_1 to two containers results in two workers named celery@worker_1. This is OK with Celery but not so much for Flower. Flower shows only one worker in the dashboard, arguably a bug (?). But even if it did show both workers with the same name, you would not be able to tell them apart.
There is an alternative solution hidden in the Celery docs. Celery provides a --hostname command line argument to set the worker name. The --hostname argument itself supports variables:
%h: hostname, including domain name
%n: hostname only
%d: domain name only
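To make the three variables concrete, here is a minimal Python sketch of how such a template expands. It is an illustration only, not Celery's own implementation: expand_worker_name is a hypothetical helper, the host names are made up, and it handles only the three placeholders listed above.

```python
def expand_worker_name(template, fqdn):
    """Expand %h, %n and %d in a --hostname template for the given host name."""
    # Split "host.domain.tld" into the short host name and the domain part.
    short_name, _, domain = fqdn.partition(".")
    replacements = {"%h": fqdn, "%n": short_name, "%d": domain}
    result = template
    for placeholder, value in replacements.items():
        result = result.replace(placeholder, value)
    return result


# A container hostname usually has no domain part, so %h and %n coincide.
print(expand_worker_name("worker_1@%h", "3f2a9c1b7d4e"))
# A fully qualified name shows the difference between the variables.
print(expand_worker_name("worker_2@%n", "web1.example.com"))
```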
Refactor the docker-compose worker services:
# docker-compose.yaml
worker_1:
  command: ["celery", "worker", "--app=worker.app", "--hostname=worker_1@%h", "--loglevel=INFO"]
worker_2:
  command: ["celery", "worker", "--app=worker.app", "--hostname=worker_2@%h", "--loglevel=INFO"]
Here, we set the names worker_1 and worker_2. You can make this meaningful, for example by assigning the name of the queue the worker subscribes to. The hostname %h is still container hostname gibberish, but it ensures a unique name, even at scale. Plus, it allows you to link back to the container, which can be useful for logging and debugging.
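If you name workers after their queues, the command for each service follows a fixed pattern. The Python sketch below generates it and splits a worker name back into its parts. Both helpers and the queue name emails are hypothetical, and the sketch assumes the argument order used in this article plus Celery's --queues option to select the queue.

```python
def worker_command(queue, app="worker.app", loglevel="INFO"):
    """Build a celery worker command whose worker name starts with the queue name."""
    return [
        "celery",
        "worker",
        f"--app={app}",
        f"--queues={queue}",
        # %h is left for Celery to expand to the container hostname at startup.
        f"--hostname={queue}@%h",
        f"--loglevel={loglevel}",
    ]


def split_worker_name(worker_name):
    """Split 'label@container_hostname' into the label and the container hostname."""
    label, _, container_hostname = worker_name.partition("@")
    return label, container_hostname


print(worker_command("emails"))
print(split_worker_name("emails@3f2a9c1b7d4e"))
```

The second helper is the "link back to the container" step: the part after the @ is the hostname you would look up in your container logs.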
Gotchas and shortfalls
Flower has no idea which Celery workers you expect to be up and running. The Flower dashboard shows workers as and when they turn up. When a Celery worker comes online for the first time, the dashboard shows it. When a Celery worker disappears, the dashboard flags it as offline.
When you run a Celery cluster on Docker that scales up and down quite often, you end up with a lot of offline workers. That’s a lot of dashboard clutter. As of now, the only solution is to restart Flower. There is an open GitHub issue for this.
Flower is the de facto monitoring tool for Celery. It is easy to set up and deploy into a containerized stack. But it takes some tweaking to make it work effectively in a microservices setup.