7 Tips for Working with Celery

If you’re looking to integrate a task queue into one of your development projects, you’ll likely be working with Celery, which is a fast, reliable option. Whether you’re new to running task queues or have simply never worked with Celery, you may have a bit of a challenge ahead of you. While Celery provides basic tutorials, they’re lacking in nuance, to say the least. Here are some tips for working with Celery that will help you avoid unnecessary frustration and middle-of-the-night panic.

Set up Task Timeout

If you didn’t already know: tasks don’t just time out on their own. If a task makes a network call that never returns, the process running it stays blocked, your queue will back up at some point, and backend processing goes down with it. On top of that, you’ll be hunting down an unknown cause. Setting a global default timeout, plus shorter timeouts on specific tasks, will help you avoid this problem altogether.
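Celery offers two kinds of limit: a soft time limit, which raises an exception inside the task so it can clean up, and a hard task time limit, which terminates the process running the task.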
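
Here is a minimal sketch of what the global defaults might look like, assuming the older uppercase setting names used elsewhere in this post (newer Celery releases spell them task_time_limit and task_soft_time_limit). The module name and the numbers are illustrative, not recommendations.

```python
# celeryconfig.py (hypothetical module name; values are illustrative)

# Hard limit in seconds: the process running the task is killed
# and replaced once a task runs longer than this.
CELERYD_TASK_TIME_LIMIT = 60

# Soft limit in seconds: SoftTimeLimitExceeded is raised inside the
# task first, giving it a chance to clean up before the hard limit.
CELERYD_TASK_SOFT_TIME_LIMIT = 45
```

Individual tasks can override these with the time_limit and soft_time_limit options on the task decorator, which is where the shorter, task-specific limits would go.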

Set the Number of Worker Processes

By default, Celery sets the number of worker processes equal to the number of cores on the machine. If you are running more than one worker per machine, it’s probably a good idea to explicitly set that number lower. This is a simple fix: use the CELERYD_CONCURRENCY setting, or handle it on the command line by passing -c.
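
One way to pick the number, sketched under the assumption that two workers share each machine (WORKERS_PER_MACHINE is a made-up name for this example, not a Celery setting), is to split the cores between them:

```python
import multiprocessing

# Assumption for illustration: two workers share this machine.
WORKERS_PER_MACHINE = 2

# Split the cores between the workers, but never go below one process.
CELERYD_CONCURRENCY = max(1, multiprocessing.cpu_count() // WORKERS_PER_MACHINE)
```

On the command line, the equivalent is passing -c with the same number when you start each worker.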

-Ofair Task Distribution for Preforking Workers

The problem with Celery’s default task distribution is that it hands tasks out to the worker’s child processes ahead of time, so a short task can end up waiting behind a long-running one while other processes sit idle. Using -Ofair for preforking workers avoids this by waiting to distribute work until a process is actually available for work. Although -Ofair distribution comes with added coordination cost, you’ll end up with more predictable behavior, especially if your tasks have varied execution times. In Celery 4.0 and later, fair scheduling is the default.
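
To see why this matters, here is a toy model in plain Python. It is not Celery’s actual scheduler, just a sketch that compares dealing tasks out up front with handing each task to whichever process frees up first. The function names and the task durations are invented for the example.

```python
import heapq


def preassigned_finish_times(durations, n_workers):
    """Toy model: tasks are dealt out round-robin up front, busy or not."""
    worker_clock = [0] * n_workers
    finish = []
    for i, duration in enumerate(durations):
        w = i % n_workers
        worker_clock[w] += duration
        finish.append(worker_clock[w])
    return finish


def fair_finish_times(durations, n_workers):
    """Toy model: each task goes to whichever worker becomes free first."""
    free_at = [(0, w) for w in range(n_workers)]
    heapq.heapify(free_at)
    finish = []
    for duration in durations:
        t, w = heapq.heappop(free_at)          # earliest available worker
        heapq.heappush(free_at, (t + duration, w))
        finish.append(t + duration)
    return finish


durations = [10, 1, 1, 1]  # one slow task followed by three quick ones
print(preassigned_finish_times(durations, 2))  # [10, 1, 11, 2]
print(fair_finish_times(durations, 2))         # [10, 1, 2, 3]
```

In the first case, the third task sits behind the slow one and finishes at time 11 even though the other process went idle at time 2. In the second, every quick task is done by time 3.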

Set Up Retry Delays

You know that old saying: if at first you don’t succeed, try, try again? When it comes to Celery, one try will do the trick, and then you wait. If a third-party service is down, retrying over and over in quick succession will only make matters worse. You can easily set up retry delays so that the wait between attempts grows exponentially. It’s a pretty simple exponential backoff equation: raise your desired base retry time to the power of the number of retries already used.
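
As a rough sketch of that equation, the helper below computes the delay from the number of retries already used. The function name, the base of 2 seconds, and the one-hour cap are assumptions made for this example, not Celery defaults.

```python
def retry_delay(retries_used, base_seconds=2, max_seconds=3600):
    """Seconds to wait before the next attempt: base ** retries, capped."""
    return min(base_seconds ** retries_used, max_seconds)


# Delays for the first five attempts: 1, 2, 4, 8, 16 seconds.
for retries_used in range(5):
    print(retries_used, retry_delay(retries_used))
```

Inside a bound task (one declared with bind=True), the number of retries already used is available as self.request.retries, and the result can be passed as the countdown argument to self.retry.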

JSON Serializer for Interoperability

If you have tasks coming from other languages, you’ll want to use the JSON serializer, so that code written in another language can build a message and insert it into the queue. To the worker, it looks just like a Celery task, and it works just fine in the queue.
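
A minimal config sketch, again assuming the older uppercase setting names, might look like the following. The example arguments at the bottom are hypothetical and only there to show the kind of data that survives a JSON round trip.

```python
import json

# Serialize task messages and results as JSON.
CELERY_TASK_SERIALIZER = "json"
CELERY_RESULT_SERIALIZER = "json"
# Only accept JSON content from the queue.
CELERY_ACCEPT_CONTENT = ["json"]

# Task arguments must survive a JSON round trip; plain numbers,
# strings, lists and dicts do. (Hypothetical example arguments.)
example_kwargs = {"user_id": 42, "tags": ["welcome", "email"]}
round_tripped = json.loads(json.dumps(example_kwargs))
```

The flip side is that arguments such as datetime objects or custom classes need to be converted to plain values before they are sent.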

Celery In Development

In production, Celery tasks are normally queued and run by workers on the backend, but in development you may not actually need a queue. Instead, you often just want to exercise the code paths. CELERY_ALWAYS_EAGER is a nifty setting for that: it makes Celery run each task inline, in the calling process, so you can check that the code paths are correct without a broker or a worker.
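
For a development-only settings file, a sketch could be as short as this. The second setting is an optional extra, and both use the older uppercase names.

```python
# Development settings only (illustrative); do not use in production.

# Run tasks inline in the calling process instead of sending them to a queue.
CELERY_ALWAYS_EAGER = True

# Let exceptions raised by eagerly executed tasks propagate to the caller,
# so failures show up immediately while you are developing.
CELERY_EAGER_PROPAGATES_EXCEPTIONS = True
```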

Configure a Broker

For Celery to work, it needs a broker: the message transport through which tasks are sent to workers, which process them as they are received. RabbitMQ is the most popular choice, and the one Celery recommends.

Start RabbitMQ:
$ rabbitmq-server
Activating RabbitMQ plugins …
0 plugins activated:

broker running

Stop it with: $ rabbitmqctl stop
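
With RabbitMQ running locally, pointing Celery at it is a one-line setting. This sketch assumes a stock local install with the default guest account, port, and virtual host. Adjust the URL for anything else.

```python
# Broker connection URL: amqp://user:password@host:port/vhost
# (assumes RabbitMQ's default guest account on localhost)
BROKER_URL = "amqp://guest:guest@localhost:5672//"
```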

Push notifications, email, backups, you name it: Celery can help. However, learning Celery can be frustrating. What we have listed above barely scratches the surface of the giant that is Celery, and there’s always more to learn. Hopefully these tips make learning Celery a bit faster and save you some sleep too. Happy queueing!