Celery Questions and Answers

 

Q1: What is a message broker?

A system that passes messages between producer and worker (Redis, RabbitMQ).

Q2: Difference between synchronous and asynchronous?

Synchronous blocks execution. Asynchronous runs in background.

Q3: How does Celery scale?

By adding more worker processes or containers.

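Q4: What happens if a worker crashes?

An unacknowledged task stays in the broker queue and can be redelivered. By default Celery acknowledges a task before running it, so a task that was mid-execution is lost unless acks_late is enabled.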

Q5: Difference between Celery and cron?

Cron schedules OS-level jobs. Celery manages distributed background jobs.

1️⃣ What is Celery?

Celery is a distributed task queue used to run asynchronous and background tasks in Python applications.

It helps execute time-consuming operations outside the request-response cycle.

2️⃣ Why do we need Celery in Django?

In Django:

Without Celery:

  • API waits until email/report/API call finishes

  • Slow response time

With Celery:

  • Tasks run in background

  • Faster API response

  • Better scalability

3️⃣ What is a Message Broker?

A message broker is an intermediary that sends tasks from the application to worker processes.

Common brokers:

  • Redis

  • RabbitMQ

4️⃣ What is a Worker?

A worker is a separate process that:

  • Listens to the broker

  • Picks up tasks

  • Executes them

Run worker:

celery -A project worker --loglevel=info

5️⃣ What is the difference between synchronous and asynchronous tasks?

Synchronous            Asynchronous
Blocks execution            Runs in background
User waits            User gets instant response
Slower APIs            Faster APIs

6️⃣ Explain Celery architecture.

Flow:

Client → Broker → Worker → Result Backend

  1. App sends task

  2. Broker stores task

  3. Worker consumes task

  4. Result stored (optional)
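
The four steps above can be sketched with the standard library alone. This is a toy model, not Celery's implementation: queue.Queue stands in for the broker, a thread for the worker, and a dict for the result backend. All names (send_task, worker, result_backend) are made up for illustration.

```python
import queue
import threading

broker = queue.Queue()   # stands in for Redis/RabbitMQ
result_backend = {}      # stands in for the result backend

def worker():
    """Consume tasks from the broker until a None sentinel arrives."""
    while True:
        message = broker.get()
        if message is None:
            break
        task_id, func, args = message
        result_backend[task_id] = func(*args)  # step 4: store the result

def send_task(task_id, func, *args):
    """Step 1: the app publishes a task; step 2: the broker holds it."""
    broker.put((task_id, func, args))

worker_thread = threading.Thread(target=worker)
worker_thread.start()            # step 3: the worker starts consuming

send_task("t1", pow, 2, 10)
send_task("t2", sum, [1, 2, 3])
broker.put(None)                 # tell the worker to stop
worker_thread.join()

print(result_backend)            # {'t1': 1024, 't2': 6}
```

The app never calls the function itself. It only publishes a message, which is why the request can return before the work is done.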

7️⃣ What is a Result Backend?

Stores task results and states.

Example:

CELERY_RESULT_BACKEND = "redis://localhost:6379/0"

Without a result backend → the task still runs, but its result and state are not stored.

8️⃣ What are Celery task states?

  • PENDING (waiting to run, or the task ID is unknown)

  • STARTED (reported only when task_track_started is enabled)

  • SUCCESS

  • FAILURE

  • RETRY

9️⃣ How do you retry a failed task?

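from celery import shared_task

@shared_task(bind=True, max_retries=3)
def call_api(self):
    try:
        risky_operation()  # placeholder for the call that may fail
    except Exception as exc:
        # Re-queue this task to run again in 5 seconds
        raise self.retry(exc=exc, countdown=5)

Retries up to 3 times, waiting 5 seconds before each retry. If the last retry also fails, the original exception is raised and the task ends in FAILURE.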

🔟 What is Celery Beat?

Celery Beat is a scheduler process that sends periodic tasks to the broker on a schedule, much like cron. Workers then execute them.

Example:

  • Daily reports

  • Cleanup jobs

  • Auto notifications
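
A minimal sketch of a Beat schedule as a Django settings fragment. It assumes the Celery app loads Django settings with namespace="CELERY" (the same style as the CELERY_RESULT_BACKEND example earlier). The task paths are hypothetical.

```python
from datetime import timedelta

# Django settings fragment. Assumes settings are loaded with namespace="CELERY".
# The task paths below are hypothetical placeholders.
CELERY_BEAT_SCHEDULE = {
    "daily-report": {
        "task": "reports.tasks.send_daily_report",
        "schedule": timedelta(days=1),    # run once every 24 hours
    },
    "hourly-cleanup": {
        "task": "core.tasks.cleanup_expired_sessions",
        "schedule": timedelta(hours=1),   # run once every hour
    },
}
```

Beat runs as its own process next to the workers:

celery -A project beat --loglevel=info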

1️⃣1️⃣ What is the difference between Celery and Cron?

Celery            Cron
Distributed            OS-level
Scalable            Runs on single machine
Works with queues            Simple scheduling

1️⃣2️⃣ What are chains in Celery?

Used to run tasks sequentially.

from celery import chain

chain(task1.s(), task2.s())()

task2 runs after task1 and receives task1's return value as its first argument.

1️⃣3️⃣ What are groups?

Used to execute tasks in parallel.

from celery import group

group(task.s(i) for i in range(10))()

1️⃣4️⃣ What is a chord?

Group of tasks + callback after all complete.

from celery import chord

chord([task1.s(), task2.s()])(callback.s())
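
A plain-Python analogy of how data flows through chain, group, and chord. This is not Celery code: the helpers run_chain, run_group, and run_chord are hypothetical and only mimic the data flow with threads. In Celery the same flow happens across workers, and a chord needs a result backend to collect the group results.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def run_chain(first_value, *steps):
    """Sequential: each step receives the previous step's return value."""
    return reduce(lambda value, step: step(value), steps, first_value)

def run_group(func, items):
    """Parallel: apply func to every item; results keep the input order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(func, items))

def run_chord(func, items, callback):
    """Group + callback: the callback gets the list of all group results."""
    return callback(run_group(func, items))

def double(x):
    return x * 2

def add_one(x):
    return x + 1

print(run_chain(3, double, add_one))      # 7
print(run_group(double, range(5)))        # [0, 2, 4, 6, 8]
print(run_chord(double, range(5), sum))   # 20
```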

1️⃣5️⃣ How does Celery scale?

By:

  • Adding more workers

  • Increasing concurrency

  • Using multiple queues

  • Running in containers (Docker/Kubernetes)
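
A sketch of routing tasks to multiple queues, again as a Django settings fragment with namespace="CELERY" assumed. The task-name patterns and queue names are hypothetical.

```python
# Django settings fragment. Assumes settings are loaded with namespace="CELERY".
# Task-name patterns and queue names are hypothetical.
CELERY_TASK_ROUTES = {
    "emails.tasks.*": {"queue": "email"},
    "reports.tasks.*": {"queue": "reports"},
}
CELERY_TASK_DEFAULT_QUEUE = "default"   # tasks that match no route go here

# Each queue can then get its own worker pool, for example:
#   celery -A project worker -Q email --concurrency=8
#   celery -A project worker -Q reports --concurrency=2
```

Separate queues keep slow jobs (reports) from delaying fast ones (emails), and each pool can be scaled on its own.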

1️⃣6️⃣ What happens if a worker crashes?

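  • If the task was not yet acknowledged, it remains in the broker

  • Another worker can pick it up

  • If it was acknowledged before the crash → it may be lost. By default Celery acknowledges a task just before running it, so this depends on the acks_late setting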

1️⃣7️⃣ What is acks_late=True?

If enabled:

  • Task acknowledged only after completion

  • Reduces the risk of losing a task when a worker crashes mid-execution

  • Trade-off: the task may run more than once, so it should be idempotent
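
One way to turn this on for every task is through settings. A sketch assuming Django settings with namespace="CELERY"; the same option can also be set per task with @shared_task(acks_late=True). The two extra settings are optional companions, not requirements.

```python
# Django settings fragment. Assumes settings are loaded with namespace="CELERY".
CELERY_TASK_ACKS_LATE = True              # acknowledge after the task finishes
CELERY_TASK_REJECT_ON_WORKER_LOST = True  # requeue if the worker process dies mid-task
CELERY_WORKER_PREFETCH_MULTIPLIER = 1     # reserve one message per worker process at a time
```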

1️⃣8️⃣ What is task idempotency and why is it important?

Idempotent task → running it multiple times has the same effect as running it once.

Important because:

  • Tasks may retry

  • Worker may crash

  • Network failures occur

Example:
Instead of:

balance += 100

Use:

if not already_processed:
    balance += 100
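
A runnable version of the same idea, assuming each task carries a unique transaction ID. The in-memory set and dict are stand-ins for a database table or cache; all names are hypothetical.

```python
processed_ids = set()     # in production this would live in a database or cache
balances = {"alice": 0}

def credit(account, amount, transaction_id):
    """Apply a credit at most once per transaction_id."""
    if transaction_id in processed_ids:
        return False                  # duplicate delivery: do nothing
    balances[account] += amount
    processed_ids.add(transaction_id)
    return True

credit("alice", 100, "tx-1")
credit("alice", 100, "tx-1")   # simulated retry of the same task
print(balances["alice"])       # 100
```

In a real system the check and the update should happen in one database transaction, so that two workers cannot both pass the check.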

1️⃣9️⃣ How do you handle large tasks?

Best practices:

  • Break into smaller subtasks

  • Use task chaining

  • Avoid passing large objects

  • Store files in S3 and pass reference
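
A small sketch of the first and third points: split the work into batches of IDs and send each batch as its own subtask. The chunked helper and the process_batch task named in the comment are hypothetical.

```python
def chunked(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

user_ids = list(range(1, 11))   # hypothetical IDs; pass IDs, not full objects

for batch in chunked(user_ids, 4):
    # In a Celery app each batch would be dispatched as its own subtask,
    # e.g. process_batch.delay(batch), where process_batch is a hypothetical task.
    print(batch)
```

Small messages keep the broker light, and a failed batch can be retried without redoing the rest.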

2️⃣0️⃣ How do you monitor Celery?

Common tools:

  • Flower (UI dashboard)

  • Logs

  • Prometheus + Grafana

  • Redis monitoring

🔹 REAL-TIME SCENARIO QUESTIONS

Scenario 1:

Emails are being sent twice. Why?

Possible reasons:

  • Task retried

  • Not idempotent

  • Worker restarted

  • acks_late misconfigured

Scenario 2:

Tasks stuck in PENDING state. Why?

  • Worker not running

  • Broker connection issue

  • Wrong queue

  • Task not registered

Scenario 3:

High memory usage in Celery workers. Why?

  • Large objects in tasks

  • Long-running worker processes are never recycled

  • Memory leak in code

Solution (recycle each worker process after 100 tasks):

celery -A project worker --max-tasks-per-child=100

❓ How do you deploy Celery in production?

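  • Run the web app, workers, and Beat in separate containers

  • Use a Redis/RabbitMQ cluster as the broker

  • Manage worker processes with Supervisor or systemd

  • Auto-scale the number of workers with load

  • Add health checks for workers and the broker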

❓ How do you ensure reliability?

  • acks_late=True

  • Retry with exponential backoff

  • Idempotent tasks

  • Dead letter queues
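
A minimal sketch of an exponential backoff delay. The function name and the base/cap values are assumptions chosen for illustration. Celery tasks also have built-in autoretry_for and retry_backoff options that cover the common case.

```python
def backoff_delay(retries, base=5, cap=300):
    """Seconds to wait before the next attempt: base * 2**retries, capped."""
    return min(cap, base * 2 ** retries)

# Inside a bound Celery task, this value could be passed as the countdown:
#   raise self.retry(exc=exc, countdown=backoff_delay(self.request.retries))
print([backoff_delay(n) for n in range(8)])
# [5, 10, 20, 40, 80, 160, 300, 300]
```

Growing delays give a failing dependency time to recover instead of being hit again every few seconds.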

❓ Can Celery work without Django?

Yes. It works with:

  • Flask

  • FastAPI

  • Pure Python apps

Concept            Key Point
Broker            Redis / RabbitMQ
Worker            Executes tasks
Beat            Scheduler
Chain            Sequential
Group            Parallel
Chord            Group + callback
Retry            Handles failure
Idempotency            Prevents duplicates

❓ How would you design a scalable background task system?

  1. Use Celery with Redis broker

  2. Separate worker containers

  3. Multiple queues (email, reports, payments)

  4. Idempotent tasks

  5. Monitoring (Flower)

  6. Retry + exponential backoff

  7. Horizontal scaling

