Job Queue System and Workers
This document describes the Holistics Job Queue System: how Holistics processes the requests generated whenever users open a dashboard.
High-level Mechanism
In Holistics, when a user opens a report, we construct a SQL query, send it to the customer's data warehouse, wait for it to finish, and visualize the results.
Since analytical SQL queries take time (seconds to minutes), handling them inside synchronous web requests is usually not a good idea: the request would stay open for the whole duration of the query. A more scalable solution is to use a background job queue system.
A typical flow would look like:
- When a user views a report, a job is created and pushed into a job queue.
- A worker picks up the job, constructs the SQL queries, then runs them against the customer’s data warehouse.
- Once the query is finished, the result set is visualized and presented in the user’s browser.
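The flow above can be sketched with Python's standard `queue` and `threading` modules. This is only an illustration of the pattern, not Holistics' actual implementation: `Job`, `build_sql`, and `run_query` are hypothetical names, and the warehouse call is replaced by a stub.

```python
import queue
import threading
from dataclasses import dataclass, field


@dataclass
class Job:
    report_id: int
    result: list = field(default_factory=list)
    done: threading.Event = field(default_factory=threading.Event)


def build_sql(report_id: int) -> str:
    # Hypothetical: stands in for constructing the report's SQL.
    return f"SELECT * FROM report_{report_id}"


def run_query(sql: str) -> list:
    # Hypothetical: stands in for the slow call to the data warehouse.
    return [("sql", sql)]


def worker(jobs: queue.Queue) -> None:
    while True:
        job = jobs.get()          # blocks until a job is available
        if job is None:           # sentinel value: shut the worker down
            jobs.task_done()
            break
        job.result = run_query(build_sql(job.report_id))
        job.done.set()            # signal that the result is ready
        jobs.task_done()


jobs = queue.Queue()
thread = threading.Thread(target=worker, args=(jobs,), daemon=True)
thread.start()

job = Job(report_id=42)
jobs.put(job)                     # the web request can return right after this
job.done.wait(timeout=5)          # a browser would poll for completion instead
jobs.put(None)
thread.join(timeout=5)
```

The key point is that `jobs.put(job)` returns immediately, so the slow query never blocks the request that created the job.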
What kind of actions will create a job?
Usually, any action that involves running a SQL query against the customer’s data warehouse, for example:
- A user viewing a dashboard
- An email schedule being triggered
What is a Worker/Concurrent Worker?
A worker (or concurrent worker) is an actor that processes jobs pushed into the queue. Jobs in a queue are picked up in order: an available worker takes the next job and processes it. Once done, the worker releases the job and picks up the next one.
Think of workers as the staff behind the counters when you visit a bank: you wait in line, and when it’s your turn, the next available bank officer attends to you.
Each Holistics customer (tenant) has a fixed number of workers, which can be increased by purchasing more. The more concurrent workers a tenant has, the more queries it can run at the same time.
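To illustrate how a fixed worker count caps concurrency, here is a minimal sketch using Python's `concurrent.futures.ThreadPoolExecutor`. The worker count of 2, the `process` function, and the `stats` dictionary are assumptions made for the example; the sleep stands in for a running query.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

WORKER_COUNT = 2  # assumed number of concurrent workers for this tenant

lock = threading.Lock()
stats = {"active": 0, "peak": 0}


def process(job_id: int) -> int:
    with lock:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
    time.sleep(0.05)  # stands in for a query running on the warehouse
    with lock:
        stats["active"] -= 1
    return job_id


# Six jobs are submitted, but at most WORKER_COUNT run at the same time;
# the rest wait in the pool's internal queue until a worker is free.
with ThreadPoolExecutor(max_workers=WORKER_COUNT) as pool:
    finished = list(pool.map(process, range(6)))
```

After the run, `stats["peak"]` never exceeds `WORKER_COUNT`, which is the bank-counter behaviour described above: adding workers raises that ceiling, while extra jobs simply wait longer.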
Types of Job Queues
Each Holistics customer has their own job queue and workers. This ensures that one customer overloading their job queue has little to no effect on other customers’ systems.
Furthermore, depending on the nature of the job, it is routed to a different queue (or pool). For example, a Report job runs in a different queue than a Data Transform job.
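One way to picture this isolation is a set of queues keyed by tenant and job type. The sketch below is an assumption about the general shape, not a description of Holistics internals; `QueueRouter`, the tenant names, and the job type strings are all hypothetical.

```python
from collections import defaultdict, deque


class QueueRouter:
    """Keeps one FIFO queue per (tenant, job type) pair."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def push(self, tenant: str, job_type: str, payload: str) -> None:
        self._queues[(tenant, job_type)].append(payload)

    def pop(self, tenant: str, job_type: str):
        q = self._queues.get((tenant, job_type))
        return q.popleft() if q else None

    def depth(self, tenant: str, job_type: str) -> int:
        return len(self._queues.get((tenant, job_type), ()))


router = QueueRouter()
for i in range(3):
    router.push("acme", "report", f"report-{i}")     # a busy tenant
router.push("acme", "data_transform", "transform-0")
router.push("globex", "report", "report-0")          # a different tenant

# The backlog for acme's reports does not delay globex or acme's transforms.
next_for_globex = router.pop("globex", "report")
```

Because each worker pool only reads from its own key, a backlog in one queue cannot delay jobs in another.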
Life Cycle of a Job
We are rolling out new Job Statuses to make them more intuitive.
Please refer to this Community post for more details.
Note that Holistics APIs still use the old Job statuses (created and queued).
|Status||New Status||Description|
|created||pending||This job is waiting for an available job worker in your workspace.|
|queued||starting||This job is being picked up by an available job worker and will be executed shortly.|
|running||running||This job is being executed by a job worker.|
|success|| ||The job ran successfully.|
|failure|| ||The job ran unsuccessfully.|
|cancelling|| ||The job was manually cancelled while it was running, and the cancellation is in progress.|
|cancelled|| ||The job was cancelled successfully.|
|already_existed|| ||The job coincides with a previous job that has already been created, queued, or is being run.|
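The life cycle can be read as a small state machine. The sketch below uses the API status names from the table; the transition map is an assumption inferred from the descriptions (for example, it only allows cancelling from `running`), and `already_existed` is left out because the table does not say how a job moves into or out of it. `Status`, `ALLOWED`, and `advance` are hypothetical names.

```python
from enum import Enum


class Status(str, Enum):
    CREATED = "created"
    QUEUED = "queued"
    RUNNING = "running"
    SUCCESS = "success"
    FAILURE = "failure"
    CANCELLING = "cancelling"
    CANCELLED = "cancelled"


# Assumed transitions, inferred from the status descriptions.
ALLOWED = {
    Status.CREATED: {Status.QUEUED},
    Status.QUEUED: {Status.RUNNING},
    Status.RUNNING: {Status.SUCCESS, Status.FAILURE, Status.CANCELLING},
    Status.CANCELLING: {Status.CANCELLED},
}


def advance(current: Status, new: Status) -> Status:
    """Return the new status, or raise if the transition is not allowed."""
    if new not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition: {current.value} -> {new.value}")
    return new


# Walk a job through a successful run.
status = Status.CREATED
for step in (Status.QUEUED, Status.RUNNING, Status.SUCCESS):
    status = advance(status, step)
```

Statuses with no entry in `ALLOWED` (`success`, `failure`, `cancelled`) are terminal in this sketch, so any further call to `advance` raises `ValueError`.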
Default slots for specific job queues
Refer to this doc for the detailed list of queues and the default worker count of each queue per account.