Job Queue System and Workers

This document describes the Holistics Job Queue System: how Holistics processes requests whenever users open a dashboard.

High-level Mechanism

In Holistics, when a user opens a report, we construct a SQL query that is sent to the customer's data warehouse, wait for it to finish and visualize the results.

Since analytical SQL queries take time (seconds to minutes), handling them in synchronous web requests is usually not a good idea. A more scalable solution is to use a background job queue system.

A typical flow would look like:

  1. When a user views a report, a job is created and pushed into a job queue.
  2. A worker picks up the job, constructs the SQL queries, then runs them against the customer’s data warehouse.
  3. Once the query is finished, the result set is visualized and presented to the user’s browser.
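The flow above can be sketched in a few lines of Python. This is only an illustration of the general job queue pattern, not Holistics' actual implementation: the names `build_sql`, `run_query`, `worker` and `view_reports` are hypothetical, and the warehouse call is replaced by a stub.

```python
import queue
import threading

def build_sql(report_id):
    # Hypothetical stand-in for constructing the report's SQL query.
    return f"SELECT * FROM report_{report_id}"

def run_query(sql):
    # Hypothetical stand-in for running the query on the data warehouse.
    return {"sql": sql, "rows": []}

def worker(jobs, results):
    # Step 2: pick up the next job, build the SQL, run it.
    while True:
        report_id = jobs.get()
        if report_id is None:  # sentinel value: no more jobs
            jobs.task_done()
            break
        results[report_id] = run_query(build_sql(report_id))
        jobs.task_done()

def view_reports(report_ids):
    jobs = queue.Queue()
    results = {}
    thread = threading.Thread(target=worker, args=(jobs, results))
    thread.start()
    # Step 1: each report view creates a job and pushes it into the queue.
    for report_id in report_ids:
        jobs.put(report_id)
    jobs.put(None)
    jobs.join()   # wait until every job has been processed
    thread.join()
    # Step 3: the result sets are now ready to be visualized.
    return results
```

The web request only enqueues the job and returns; the slow query runs in the worker, which is what keeps the request path responsive.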

What kind of actions will create a job?

Usually, actions that involve running a SQL query against the customer’s data warehouse:

  • Users viewing dashboards
  • Email schedules triggered
  • Etc.

What is a Worker/Concurrent Worker?

A worker (or concurrent worker) is an actor that processes jobs pushed into the queue. Jobs in a queue are processed sequentially. An available worker will pick up the next job and process it. Once done, the worker releases the job and picks up the next one.

Think of workers as the staff behind the counters when you visit a bank: you queue in a line, and when it’s your turn, the next available bank officer attends to you.

Each Holistics customer (tenant) has a fixed number of workers, which can be adjusted or purchased. The more concurrent workers a tenant has, the higher the volume of concurrent queries it can run.
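To illustrate how the worker count caps concurrency, here is a minimal sketch that uses a thread pool as a stand-in for a tenant's workers. The function `run_jobs` and its parameters are hypothetical, and the sleep simulates a warehouse query; it is not how Holistics schedules jobs internally.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

def run_jobs(num_jobs, num_workers, job_seconds=0.01):
    """Run num_jobs on num_workers workers; return the peak number of jobs running at once."""
    lock = threading.Lock()
    state = {"running": 0, "peak": 0}

    def job(_):
        with lock:
            state["running"] += 1
            state["peak"] = max(state["peak"], state["running"])
        time.sleep(job_seconds)  # stands in for a warehouse query
        with lock:
            state["running"] -= 1

    # max_workers plays the role of the tenant's concurrent worker count.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        list(pool.map(job, range(num_jobs)))
    return state["peak"]
```

With two workers, ten jobs never run more than two at a time; the remaining jobs wait in the queue until a worker is free.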

Types of Job Queues

Each Holistics customer has their own job queue and workers. This ensures that one customer overloading their job queue has little to no effect on other customers’ systems.

Furthermore, depending on the nature of the job, it will be classified into different queues (or pools). For example, a Report job runs in a different queue than a Data Transform job.
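The isolation described above can be pictured as one queue per (tenant, job type) pair. The sketch below is an assumption-laden illustration: the class `QueueRouter`, the tenant names and the queue labels `"report"` and `"data_transform"` are hypothetical, not Holistics identifiers.

```python
from collections import defaultdict, deque

class QueueRouter:
    """Keeps a separate FIFO queue for every (tenant, job type) pair."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def push(self, tenant, job_type, payload):
        self._queues[(tenant, job_type)].append(payload)

    def pop(self, tenant, job_type):
        # Returns the oldest job in that queue, or None if it is empty.
        q = self._queues.get((tenant, job_type))
        return q.popleft() if q else None

    def depth(self, tenant, job_type):
        return len(self._queues.get((tenant, job_type), ()))
```

For example, if one tenant pushes many `"report"` jobs, `depth` for another tenant's `"report"` queue, and for the first tenant's `"data_transform"` queue, stays at zero.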

Life Cycle of a Job

New Job Statuses

We are rolling out new Job Statuses to make them more intuitive.
Please refer to this Community post for more details.
Note that Holistics APIs still use the old Job statuses (created and queued).

| Status | Description | New Status | New Description |
|---|---|---|---|
| created | When a job is first triggered (for example, when you click a Submit or Create button), it has the created status. | pending | This job is waiting for an available job worker in your workspace. |
| queued | Depending on the default slot limit of the specific queue in your tenant/company, if the slots are not fully occupied, the created job is queued. | starting | This job is being picked up by an available job worker and will be executed shortly. |
| running | Depending on the available slots of the concurrent workers, Holistics executes the jobs that are currently queued. | running | This job is being executed by a job worker. |
| success | If the job runs successfully, it has the success status. | | |
| failure | If the job runs unsuccessfully, it has the failure status. | | |
| cancelling | If you manually cancel a job while it is running, it has the cancelling status. | | |
| cancelled | If the job is cancelled successfully, it has the cancelled status. | | |
| already_existed | A job with this status coincides with a previous job that is already created, queued, or running. | | |
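Since Holistics APIs still return the old status names, a client that wants to display the new ones needs a small mapping. The sketch below encodes that mapping and the transitions described in this section. The helper names `display_status` and `can_transition` are hypothetical, and the transition table covers only what is stated here; for instance, whether a job can be cancelled before it starts running is not specified and is left out.

```python
# Old API status -> new status name, as listed in this section.
OLD_TO_NEW = {
    "created": "pending",
    "queued": "starting",
    "running": "running",
}

# Transitions described in this section (assumption: nothing beyond these).
TRANSITIONS = {
    "created": {"queued"},
    "queued": {"running"},
    "running": {"success", "failure", "cancelling"},
    "cancelling": {"cancelled"},
}

def display_status(api_status):
    """Translate an old API status to its new name; other statuses pass through unchanged."""
    return OLD_TO_NEW.get(api_status, api_status)

def can_transition(current, nxt):
    """Return True if the section describes a move from `current` to `nxt`."""
    return nxt in TRANSITIONS.get(current, set())
```

For example, `display_status("created")` gives `"pending"`, while `display_status("success")` is returned as-is because that status has no new name.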

Default slots for specific job queues

Refer to this doc for the detailed list of queues and their default worker counts for each account.