
Concurrent Workers

What is a Concurrent Worker?

'Concurrent worker' controls the number of concurrent database queries that are sent to the customer database at any point in time. It is a job-queueing function that lets SQL queries be sent to your database in an orderly way rather than all at once.

Holistics then builds visualizations from the query results. Concurrent Workers thus serve as mediators between the queries and the database.

How do Holistics Concurrent Workers work?

Each customer has their own job queue, so customers don't affect each other, and each job queue can have a different size (e.g. customer A has 5 slots and can run 5 concurrent jobs, while customer B has 3 slots and can run 3 concurrent jobs).

Therefore, the more concurrent workers one has, the higher volume of concurrent queries one can run. Easing up congestion is as easy as adding more Concurrent Workers!

Why do you need Holistics Concurrent Workers?

In our platform, whenever someone submits a request, we construct a SQL query, send it to the customer's database, wait for the results, and visualize the charts based on them.

Since an analytics SQL query takes time (from a few seconds to minutes), synchronous web requests are not a good fit. A background job queue system is needed to handle this: the Holistics Concurrent Worker.

If you'd like to learn more about the technical details, you can read more about it here (link).

Why are Concurrent Workers important?

Imagine an extreme scenario where 20 users access 100 charts at the same time. Without any control from the Holistics application, this will send 2000 database queries to the customers' database. If it's a production database, the sudden high incoming volume may crash it.

Holistics workers control the number of concurrent database queries that Holistics sends to our customer's database at any point in time. If we set the customer to have only 5 workers, then no more than 5 concurrent queries run at any point in time, while the other queries wait in a common queue.

Therefore, having more Concurrent Workers speeds up querying for both you and your customers. As your business scales, being charged based on Concurrent Workers is also far more cost-effective than being charged by the number of visualizations processed.
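The worker limit described above can be illustrated with a minimal Python sketch. This is not Holistics' actual implementation: the function name run_with_worker_limit, the use of a semaphore, and the sleep that stands in for the database are all assumptions made for illustration.

```python
import threading
import time


def run_with_worker_limit(num_queries, num_workers, query_seconds=0.01):
    """Run num_queries fake queries, never more than num_workers at once.

    Returns the highest number of queries that were in flight together.
    Hypothetical helper for illustration only.
    """
    slots = threading.Semaphore(num_workers)  # one slot per worker
    lock = threading.Lock()
    state = {"running": 0, "peak": 0}

    def run_query():
        with slots:  # wait in the queue until a worker slot is free
            with lock:
                state["running"] += 1
                state["peak"] = max(state["peak"], state["running"])
            time.sleep(query_seconds)  # stand-in for the database doing the work
            with lock:
                state["running"] -= 1

    threads = [threading.Thread(target=run_query) for _ in range(num_queries)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]


# 40 queries are submitted together, but only 5 worker slots exist.
peak = run_with_worker_limit(num_queries=40, num_workers=5)
```

However many queries are submitted, the number running against the database at the same moment never exceeds the number of worker slots; the rest wait their turn.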

Can you give me an analogy?

Imagine you're a tourist who just landed in your favorite country. There are many people like you who are also visiting this country. If there aren't any controls, it is likely that there will be chaos and the airport would shut down.

Customs officers are therefore in place to admit travellers into the country in an orderly way, ensuring a smooth process. The more customs officers available, the faster travellers can be admitted. In this case, the customs officers are Holistics Concurrent Workers, while the travellers are the queries being sent to the database.


Embedded Analytics - Embed Workers

Holistics' Embedded Analytics feature uses a specific worker type called an Embed Worker. These workers' sole purpose is to support embedding Holistics dashboards into your webpage or app.

As you scale your business, you will have more simultaneous dashboard viewers. This inevitably slows down querying, as multiple SQL queries are sent to your database at the same time. Embed Workers ensure that querying for these visualizations stays smooth for all your customers. The more Embed Workers you have, the faster your customers get the results they need.

Embed Workers can be toggled and increased from the Embedded Analytics Manager (link). The minimum number of Embed Workers required for our Embedded Analytics feature is 3, with a (current) maximum of 9.

Note:
  • Embed Workers are separate from internal workers: they only affect jobs for external embedded analytics.
  • Internal workers comprise a range of different worker types that cannot be manually increased by the user, while Embedded Analytics has its own set of Embed Workers that can be activated and increased by the user.

If you wish to find out more about Embedded Analytics, feel free to head to our documentation page (link). For a quick overview, you can watch our short 4-minute video guide.

Use case: How does caching work in different scenarios with our workers?

Holistics allows you to cache the results of the queries sent to your database, which reduces the workload on the workers when an embedded viewer accesses a previously accessed dashboard.

To understand more on this topic, let's walk through a simple example below:

Suppose company 'A' has 2000 viewers (A1, A2, ..., A2000). The initial default dashboard filter is the same for all users within the same company, and each user loads 5 widgets.

Option 1: If you embed a company-level dashboard (where the default filter on the dashboard is at the company level) that doesn't have a custom viewer-level filter, then:

  • User A1's first load populates the cache for everyone viewing the dashboard with the default filters (this triggers the cache, so the number of workers matters little here unless viewers start applying different filters).
  • This serves all 2000 viewers without affecting the initial load time of the dashboard, as the data is retrieved directly from the cache.

Option 2: If you embed a dashboard where the default filter is at the user level (i.e. you apply a filter at the viewer level), then:

  • Holistics dashboard filters add a "WHERE" clause to each widget's SQL (based on the specific filter parameters) before sending the final query to your database. If a widget's SQL query is unique (changed by the filter values) and cannot be found in our cache, that widget sends a fresh query to your database.
  • Assuming all users are viewing the dashboard simultaneously with different filters, this generates 10000 unique SQL queries (5 widgets x 2000 users), which are sent to your database for processing.
  • The load time of the dashboard then depends on the number of concurrent workers and the run time of each query.
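The difference between the two options can be sketched by counting how many queries miss the cache. The snippet below is a simplified illustration, not how Holistics stores its cache: the function count_database_queries, the table and column names, and the use of the SQL text as the cache key are all assumptions, and viewers are processed one after another rather than truly simultaneously.

```python
def count_database_queries(num_viewers, num_widgets, per_viewer_filter):
    """Count queries that reach the database when results are cached by SQL text.

    Hypothetical model: the cache key is the final SQL string, so two viewers
    share a cached result only if their filters produce identical SQL.
    """
    cache = {}
    sent_to_database = 0
    for viewer in range(num_viewers):
        for widget in range(num_widgets):
            # Company-level filter: same value for everyone.
            # Viewer-level filter: a different value per viewer.
            filter_value = f"viewer_{viewer}" if per_viewer_filter else "company_A"
            sql = f"SELECT * FROM widget_{widget} WHERE owner = '{filter_value}'"
            if sql not in cache:
                cache[sql] = f"result of {sql}"  # cache miss: query the database
                sent_to_database += 1
    return sent_to_database


company_level = count_database_queries(2000, 5, per_viewer_filter=False)  # 5
viewer_level = count_database_queries(2000, 5, per_viewer_filter=True)    # 10000
```

With a shared company-level filter, only the first viewer's 5 widget queries reach the database; with viewer-level filters, every one of the 10000 queries is unique and must be run.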

Now, diving into Option 2 (the more likely use case), suppose you have 5 widgets, each widget's query takes 2 seconds to run, and you have 5 concurrent workers:

  • Each user loads 5 widgets (i.e. 5 queries). Each query takes 2 seconds to run.
  • Since you have 5 workers and 5 widgets, it takes 2 seconds for one user to load a dashboard.
  • When all 2000 users view the dashboard at the same time, it takes 2 x 2000 = 4000 seconds, i.e. roughly 66.7 minutes, to load the dashboard for all the viewers.
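The arithmetic above can be written as a small estimator. This is a rough sketch under simplifying assumptions that are not stated in the original: every query takes exactly the same time, nothing is served from the cache, and workers pick up queries in full batches. The function name estimate_total_load_seconds is hypothetical.

```python
import math


def estimate_total_load_seconds(num_viewers, widgets_per_viewer,
                                seconds_per_query, num_workers):
    """Estimate how long it takes to finish every viewer's queries.

    Assumes uniform query run time and no cache hits (illustrative only).
    """
    total_queries = num_viewers * widgets_per_viewer
    batches = math.ceil(total_queries / num_workers)  # workers run one batch at a time
    return batches * seconds_per_query


total_seconds = estimate_total_load_seconds(2000, 5, 2, 5)   # 4000
total_minutes = total_seconds / 60                            # about 66.7
with_ten_workers = estimate_total_load_seconds(2000, 5, 2, 10)  # 2000
```

Under these assumptions, doubling the workers from 5 to 10 halves the total time to 2000 seconds, which is the trade-off between cost and viewer experience described here.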

Since load times depend on query run time and the number of concurrent workers, we let you decide how to balance cost against viewer experience and whether it makes sense to buy more workers.

How much time will it take to process these queries?

Outside of concurrency, our workers are typically not the bottleneck of the data loading time (their impact is negligible). They don't process or compute any data. They simply wait for your query results to be returned from your database before the results are visualized in your browser. The time it takes to process these queries depends on the time your database takes to process each query, which is mainly a function of (i) your data warehouse performance and (ii) whether the query is complex, expensive, or long-running.

In the case of the latter (long-running query), you can use Holistics to set up materialized views to automatically persist the results of the query (Transform Model) into your data warehouse physically (storage settings docs). So anyone querying a dataset or dashboard with this (persisted) model will query the results of the physical table instead of running the query at the time of accessing it. This might help in some cases to speed up query time.

Note: Within the Holistics app, you can get real-time data on how your jobs are performing at this.

Pricing by concurrent workers

Basically, our dashboarding solution is priced based on the number of things you build in Holistics and the number of users who will be able to log into your account. What you build is measured in tokens we call 'objects'; each type costs a different amount, as listed on our website and in-app. You can view our current pricing plan here.

There is no additional cash price to enable embedded analytics. Our pricing allows for unlimited embedded dashboard viewers and is based only on the number of concurrent workers that send queries to your database for processing.

Holistics Embedded Analytics starts with 3 concurrent workers, which cost 300 objects. The more workers you have, the faster dashboards load; you can add more workers at an additional price of 100 objects per worker.

Pricing by concurrent workers gives a good mix of predictability and usage-based pricing. For some of our customers using the embedded dashboard module, loading time is a hygiene factor rather than a delight factor. Customers who truly require faster dashboard loading for their own customers invest in more workers.
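As a quick illustration of the object pricing just described, the sketch below computes the object cost for a given number of embed workers. The function name embed_worker_cost_in_objects is hypothetical, and the only rules encoded are the ones stated here: a minimum of 3 workers for 300 objects, plus 100 objects for each additional worker.

```python
def embed_worker_cost_in_objects(num_workers):
    """Object cost for embed workers: 300 for the first 3, then 100 each.

    Hypothetical helper based only on the figures quoted in this document.
    """
    minimum_workers = 3
    if num_workers < minimum_workers:
        raise ValueError("Embedded Analytics starts with 3 concurrent workers")
    return 300 + 100 * (num_workers - minimum_workers)


starting_cost = embed_worker_cost_in_objects(3)  # 300 objects
five_workers = embed_worker_cost_in_objects(5)   # 500 objects
```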

FAQ

Can we reallocate some or all of the internal workers to be embedded workers?

Our core business model is internal self-service analytics, and embedded analytics is a complementary add-on. We have not catered, from a commercial perspective, for transferring workers or for focusing on embedded dashboards.

In addition, you can also set up to 8 concurrent workers for your embedded dashboards on your current plan which may be a good way for you to test the extent to which they can work for you. That way at least you'll have enough spare capacity that if a few customers are using the dashboard concurrently they're not waiting for workers to be freed up.

