SnackBase includes a background job engine that powers reliable asynchronous execution for webhooks, scheduled hooks, workflow delays, and email delivery. Jobs are managed by superadmins through the Admin API.

Overview

The job system is the execution substrate that other automation features build on:
  • Webhook deliveries are dispatched as jobs with automatic retries
  • Scheduled hooks are executed via jobs on their cron schedule
  • Workflow delays (wait_delay steps) enqueue resume jobs
  • Email sending is handled by background jobs

Key Features

  • Priority Queue: Jobs execute in priority order (lower number = higher priority)
  • Automatic Retries: Failed jobs retry with exponential backoff
  • Stale Job Recovery: Jobs stuck in running state are automatically recovered
  • Job Statistics: Aggregate counts by status with failure rate
  • Retention Cleanup: Completed jobs are automatically purged after a configurable period

Job Lifecycle

pending ──> running ──> completed
   │           │
   │           └──> failed ──> retrying ──> running (retry)
   │                  │
   │                  └──> dead (retries exhausted)
   │
   └──> (cancelled by admin; only pending jobs can be cancelled)

Job Statuses

Status      Meaning
pending     Queued, waiting to execute
running     Currently executing
completed   Finished successfully
failed      Execution failed (may retry)
retrying    Waiting to retry after failure
dead        All retry attempts exhausted

Retry Logic

Failed jobs retry automatically with exponential backoff:

delay = retry_delay_seconds * 2^attempt_number

where attempt_number starts at 0 for the first retry. With the default base delay of 60 seconds:

Attempt     Delay
1st retry   60 seconds
2nd retry   120 seconds
3rd retry   240 seconds

Default max_retries is 3. Once all retries are exhausted, the job transitions to dead.
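
As a quick check on the formula, this small Python sketch (using the zero-based attempt numbering described above) computes the full schedule:

def backoff_schedule(retry_delay_seconds=60, max_retries=3):
    # delay = retry_delay_seconds * 2^attempt_number, for attempt_number = 0, 1, ...
    return [retry_delay_seconds * 2**attempt for attempt in range(max_retries)]

print(backoff_schedule())  # [60, 120, 240]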

Built-in Job Handlers

Handler            Purpose
webhook_delivery   Deliver outbound webhook payloads
send_email         Send emails via the configured provider
scheduled_task     Execute scheduled tasks
workflow_resume    Resume a workflow instance after a wait_delay
scheduled_hook     Execute a scheduled API-defined hook’s actions

Admin API

The Jobs API is restricted to superadmins; it provides a system-wide view across all accounts.

Job Statistics

Get aggregate counts and failure rate:
curl https://api.snackbase.dev/api/v1/admin/jobs/stats \
  -H "Authorization: Bearer {superadmin_token}"
Response:
{
  "pending": 12,
  "running": 3,
  "completed": 1547,
  "failed": 8,
  "retrying": 2,
  "dead": 1,
  "avg_duration_seconds": null,
  "failure_rate": 0.0058
}
The failure_rate is calculated as (failed + dead) / (completed + failed + dead); in the example above, (8 + 1) / (1547 + 8 + 1) ≈ 0.0058.
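
For example, a monitoring script might poll this endpoint and alert when the failure rate climbs. A minimal sketch using Python's standard library (the environment variable name and alert threshold are placeholders):

import json
import os
import urllib.request

TOKEN = os.environ["SNACKBASE_SUPERADMIN_TOKEN"]  # placeholder variable name
req = urllib.request.Request(
    "https://api.snackbase.dev/api/v1/admin/jobs/stats",
    headers={"Authorization": f"Bearer {TOKEN}"},
)
with urllib.request.urlopen(req) as resp:
    stats = json.load(resp)

# Alert if more than 5% of finished jobs failed or died (example threshold).
if stats["failure_rate"] > 0.05:
    print(f"High failure rate: {stats['failure_rate']:.2%} "
          f"({stats['failed']} failed, {stats['dead']} dead)")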

List Jobs

Filter by status, queue, or handler:
curl "https://api.snackbase.dev/api/v1/admin/jobs?status=failed&handler=webhook_delivery" \
  -H "Authorization: Bearer {superadmin_token}"

Retry a Job

Manually retry a dead, failed, or retrying job:
curl -X POST https://api.snackbase.dev/api/v1/admin/jobs/{job_id}/retry \
  -H "Authorization: Bearer {superadmin_token}"
This returns the job to pending and resets attempt_number to 0.
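
Combined with the list endpoint, this makes bulk requeueing straightforward. A sketch that retries every dead job (same assumption as above that the list response is a JSON array):

import json
import os
import urllib.request

TOKEN = os.environ["SNACKBASE_SUPERADMIN_TOKEN"]  # placeholder variable name
BASE = "https://api.snackbase.dev/api/v1/admin/jobs"
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

req = urllib.request.Request(f"{BASE}?status=dead", headers=HEADERS)
with urllib.request.urlopen(req) as resp:
    dead_jobs = json.load(resp)

for job in dead_jobs:
    retry = urllib.request.Request(f"{BASE}/{job['id']}/retry", headers=HEADERS, method="POST")
    urllib.request.urlopen(retry)
    print(f"requeued {job['id']}")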

Cancel a Job

Cancel a pending job (only pending jobs can be cancelled):
curl -X DELETE https://api.snackbase.dev/api/v1/admin/jobs/{job_id} \
  -H "Authorization: Bearer {superadmin_token}"

Job Fields

Field                 Type       Description
id                    string     UUID
queue                 string     Queue name (default: "default")
handler               string     Registered handler identifier
payload               object     JSON payload passed to the handler
status                string     Current job status
priority              integer    Execution priority (lower number = higher priority, default: 0)
run_at                datetime   Earliest execution time (null = immediately)
started_at            datetime   When execution began
completed_at          datetime   When the job completed successfully
failed_at             datetime   When the job last failed
error_message         string     Most recent error (truncated to ~5000 chars)
attempt_number        integer    Attempts made so far
max_retries           integer    Max retry attempts (default: 3)
retry_delay_seconds   integer    Base retry delay in seconds (default: 60)
account_id            string     Account context (null for system jobs)
created_by            string     User who enqueued the job (null for system jobs)

Worker Configuration

The job worker is configured via application settings:
Setting                    Description
job_worker_poll_interval   How often the worker checks for new jobs (seconds)
job_execution_timeout      Maximum execution time per job (seconds)
job_retention_days         How long to keep completed jobs before cleanup (days)
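
To illustrate how the first two settings interact, here is a conceptual poll loop; this is not SnackBase's actual worker code, and fetch_next_job and execute are hypothetical stand-ins:

import time

def worker_loop(fetch_next_job, execute, poll_interval, execution_timeout):
    """Conceptual sketch: poll for the highest-priority pending job every
    poll_interval seconds (job_worker_poll_interval) and bound each run
    by execution_timeout (job_execution_timeout)."""
    while True:
        job = fetch_next_job()
        if job is None:
            time.sleep(poll_interval)
            continue
        execute(job, timeout=execution_timeout)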

Graceful Shutdown

When the worker shuts down, any currently running job is reset to pending so it can be picked up again.

Stale Job Recovery

Jobs stuck in running state longer than the execution timeout are automatically recovered and reset to pending.
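
Conceptually, the staleness check reduces to comparing started_at against the timeout. A minimal, illustrative-only sketch:

from datetime import datetime, timedelta, timezone

def is_stale(started_at, execution_timeout_seconds, now=None):
    # A job still "running" past job_execution_timeout is considered
    # stale and is reset to pending by the recovery pass.
    now = now or datetime.now(timezone.utc)
    return now - started_at > timedelta(seconds=execution_timeout_seconds)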