4 architecture options for your multitenant analytics solution (2024)

Multitenant analytics is about delivering analytics to users in multiple organizations (tenants). The most common use case is customer-facing reports and dashboards embedded in a SaaS application.

Another frequent use case is an organization that provides analytics to its business partners: suppliers, distributors, resellers, franchises, etc.

Multitenant analytics is often delivered as a product. This involves the following high-level steps:

  • Deliver an initial analytical experience (data visualizations, reports, dashboards, etc.) to new tenants (e.g. organizations, customers, business partners)
  • Organizations customize their analytics with self-service tools
  • You release a new version of the analytics without breaking the customizations
  • Rinse and repeat …
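
One way to picture the "release without breaking customizations" step is to keep vendor-owned and tenant-owned metadata in separate layers and merge them at release time. The sketch below is a minimal illustration, not how any particular platform does it. The `custom.` prefix, the `apply_release` function, and the dictionary layout are all hypothetical.

```python
from copy import deepcopy

# Hypothetical naming convention: tenant-owned objects carry this prefix,
# so a vendor release can never overwrite them.
CUSTOM_PREFIX = "custom."


def apply_release(base_metadata, tenant_customizations):
    """Return the tenant's metadata: the new base release plus its own objects."""
    merged = deepcopy(base_metadata)
    for object_id, definition in tenant_customizations.items():
        if not object_id.startswith(CUSTOM_PREFIX):
            raise ValueError(
                f"tenant object {object_id!r} must use the {CUSTOM_PREFIX!r} prefix"
            )
        merged[object_id] = deepcopy(definition)
    return merged


# A new vendor release and one tenant's self-service customization (made-up data).
release_v2 = {"dashboard.sales": {"version": 2, "widgets": ["revenue", "margin"]}}
acme_custom = {"custom.top_customers": {"widgets": ["top_10_customers"]}}

acme_workspace = apply_release(release_v2, acme_custom)
```

After the merge, the tenant sees the upgraded `dashboard.sales` next to its own untouched `custom.top_customers` report.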

This article describes architecture options for multitenant analytics products.

Here are the key considerations for evaluating the multitenant analytics architecture options described below. Weigh them based on the architecture of your application and your users' needs:

  • Data and metadata privacy: privacy of each tenant's data and metadata (dashboards, reports, data models, metrics, etc.) must be strongly enforced.
  • Multi-domain analytics: the ability to cross-analyze data from other business domains (e.g. sales, marketing, product, shipments, support).
  • Performance and scalability: sub-second report computation latencies and the ability to scale from single-digit tenants to tens of thousands of tenants.
  • Realtime latencies: analytics uses fresh data with minimum delays.
  • Time to market: solution implementation complexity and cost, plus change management velocity (implementation of new versions and bug fixes).
  • Operational complexity & cost: solution operation complexity and cost. Provisioning new tenants, users, ACLs, permissions, etc. Releasing a new version and rolling it out to all tenants.

The first option, operational database analytics, uses the existing operational database that already serves the CRUD (Create-Read-Update-Delete) operations on the operational data. This approach works as long as there are few reports (a low number of executions) and little or no data aggregation. If you just need to serve plain lists of data or a few simple operational reports, this is the easiest option, and it provides the best realtime reporting capabilities.

However, when your analytical throughput grows (more data, users, or report execution numbers) or becomes unpredictable because of self-service analytics, you’ll need to separate the analytical queries from the operational transactions for performance and scalability reasons. The separation is more important in architectures where the operational database is shared across multiple (or all) tenants of your application.

You might want to invest in a better architecture right from the beginning rather than spend effort on a temporary solution. Trying to survive on this architecture for too long usually leads to significant overspending on the database layer.

This architecture also doesn’t scale in terms of additional data sources. Analytical use cases usually involve data from more domains (e.g. marketing, product, sales data, etc.). Pushing all this additional data into the operational database adds yet another data processing workload to it.

The per-tenant siloed architecture is probably the first that comes to mind when you are tasked with extending a single-tenant (internal) analytics solution to a multitenant solution. You simply take a single-tenant analytics solution and deploy it for every tenant. This option is ideal when your application already utilizes a similar siloed architecture.

The siloed approach is great for data and metadata privacy as each of your tenants uses its dedicated infrastructure. Similarly, you can scale individual tenants based on their size and needs.

Achieving close-to-realtime reports is hard, especially when your users need additional datasets that must be distributed to each silo. This applies to additional data (from different domains) as well as to benchmarks.

Operating and managing siloed multitenant analytics is hard and costly because you have to deploy, configure, upgrade, and manage every tenant individually. Distributed data management across many databases adds to the difficulty, as data must be distributed to each of them.

You also might need to invest in advanced virtualization to allocate hardware resources because you don’t want to dedicate the hardware to every tenant.

The shared analytical database architecture relies on the power of a central analytical engine that stores all data for all tenants and serves all queries. Metadata is also stored in a centralized, shared metadata store. The data and metadata access privacy is enforced at the application level using some configuration (e.g. ACLs, forced database filters, etc.).

Data and metadata privacy require special attention in this architecture as all tenants access the centralized data and metadata. In most cases, access to data and metadata is enforced using mandatory SQL WHERE filters appended to each query. Automation of all operation and configuration procedures is strongly recommended to prevent human errors that might result in a data breach.
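
The mandatory filter can be sketched as a single query helper that every report has to go through. This is a minimal illustration using Python's built-in `sqlite3` module. The `orders` table, the `tenant_id` column, the allow-list, and the `query_for_tenant` helper are hypothetical names chosen for the example.

```python
import sqlite3

# Hypothetical allow-list of tables and columns that reports may read.
ALLOWED_COLUMNS = {"orders": {"region", "amount"}}


def query_for_tenant(conn, tenant_id, table, columns):
    """Run a report query that always carries the tenant filter."""
    if (
        table not in ALLOWED_COLUMNS
        or not columns
        or not set(columns) <= ALLOWED_COLUMNS[table]
    ):
        raise ValueError("table or column not allowed")
    # Identifiers come from the allow-list; the tenant id is bound as a parameter.
    sql = f"SELECT {', '.join(columns)} FROM {table} WHERE tenant_id = ?"
    return conn.execute(sql, (tenant_id,)).fetchall()


# Shared table holding rows for all tenants (made-up data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (tenant_id TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("acme", "EU", 100.0), ("acme", "US", 50.0), ("globex", "EU", 999.0)],
)

acme_rows = query_for_tenant(conn, "acme", "orders", ["region", "amount"])
```

Because callers never write the WHERE clause themselves, forgetting the filter is not possible in this sketch. A real implementation would also need to cover joins, aggregations, and metadata access.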

The central analytical database can quickly become a bottleneck as it is used for both data transformation and low-latency analytics queries. Despite many vendors’ claims, there is an inevitable tradeoff to be made between query latency, concurrency, and data freshness. The key implication for you is that this architecture will soon require sharding the central database to avoid huge investments in hardware.
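
As a rough illustration of what sharding by tenant can look like, the sketch below maps each tenant to one of a fixed number of shards with a stable hash. The `shard_for_tenant` function and the shard count are assumptions made for this example. A production setup might prefer an explicit tenant-to-shard lookup table, since plain modulo hashing reassigns most tenants whenever the shard count changes.

```python
import hashlib


def shard_for_tenant(tenant_id: str, shard_count: int) -> int:
    """Map a tenant to a shard index that stays stable across processes."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # hashlib is used instead of hash() because Python's built-in string
    # hash is randomized per process.
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Hypothetical tenants routed across four shards.
assignments = {t: shard_for_tenant(t, 4) for t in ["acme", "globex", "initech"]}
```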

Cost and data privacy are the reasons for extending the previous, shared analytical database architecture with workspaces (aka namespaces). The extended architecture contains these two fundamental components:

  • Data warehouse (or data lake) that aggregates data for all tenants for shared data transformation and management purposes (e.g. machine learning, benchmark computation, shared datasets, etc.). Unlike in the previous architecture option, the low-latency analytical queries execute at the workspace level, so the data warehouse can be optimized for data transformation (ETL/ELT). This allows using more cost-efficient components like Apache Spark, Amazon Athena, or cloud object storage like Amazon S3 or Azure Blob Storage instead of costlier warehouses like Snowflake or Redshift.
  • Workspace (aka namespace) that contains private data and metadata for each tenant. There are important considerations regarding the workspace query implementation (e.g. in-memory cube, database instance, federated query with a caching layer, etc.).

This architecture is less brittle from a data privacy perspective than the previous one as the workspaces automate the private data distribution from the data warehouse. The workspace also isolates the tenant-private metadata (e.g. custom reports or dashboards).

The distributed nature of workspaces provides more flexibility for scaling. Because data volume is partitioned by tenant, you can use more cost-efficient or faster technology (e.g. in-memory or open-source databases). The workspace also isolates other tenants from the query workload of large tenants (those with a large number of users or a large data volume).

As stated above, this architecture requires heavy automation of data distribution (from the data warehouse to workspaces) and metadata distribution (releases). This automation requires additional investment (build vs. buy).
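
The data distribution step can be sketched as a job that reads the shared warehouse table and loads each tenant's rows into its own workspace. In this toy version, both the warehouse and the workspaces are in-memory SQLite databases. The `orders` table, its columns, and the `distribute_to_workspaces` function are hypothetical, and a real pipeline would load incrementally rather than copy everything on each run.

```python
import sqlite3


def distribute_to_workspaces(warehouse):
    """Copy each tenant's rows from the warehouse into a private workspace."""
    workspaces = {}
    rows = warehouse.execute("SELECT tenant_id, region, amount FROM orders")
    for tenant_id, region, amount in rows:
        workspace = workspaces.get(tenant_id)
        if workspace is None:
            # One isolated database per tenant; note there is no tenant_id column.
            workspace = sqlite3.connect(":memory:")
            workspace.execute("CREATE TABLE orders (region TEXT, amount REAL)")
            workspaces[tenant_id] = workspace
        workspace.execute("INSERT INTO orders VALUES (?, ?)", (region, amount))
    for workspace in workspaces.values():
        workspace.commit()
    return workspaces


# Shared warehouse with data for all tenants (made-up data).
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE orders (tenant_id TEXT, region TEXT, amount REAL)")
warehouse.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("acme", "EU", 100.0), ("acme", "US", 50.0), ("globex", "EU", 999.0)],
)

workspaces = distribute_to_workspaces(warehouse)
acme_total = workspaces["acme"].execute("SELECT SUM(amount) FROM orders").fetchone()[0]
```

Queries inside a workspace need no tenant filter because the workspace only ever contains that tenant's rows.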

There are many open-source technologies that you can leverage as building blocks for your multitenant analytics solution architectures described above.

Analytical database/data warehouse

There are many open-source and commercial databases that you can leverage. Postgres or traditional commercial databases like Oracle or MS SQL Server are probably the best options for the first architecture option that requires good handling of mixed-load workloads.

Postgres or MariaDB are great choices for the siloed architecture option unless you have tenants with larger data (>50GB) or many users (>100). For those tenants, you should again take a look at the commercial options. Vertica, with its community edition, might be an interesting option for larger tenants.
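
The rule of thumb above can be written down as a small helper, for example to flag tenants during provisioning. The thresholds come from the paragraph above. The `suggest_silo_database` function and its return strings are hypothetical and only label the two groups.

```python
def suggest_silo_database(data_gb: float, user_count: int) -> str:
    """Apply the >50GB or >100 users rule of thumb to a single tenant."""
    if data_gb > 50 or user_count > 100:
        return "commercial database or Vertica"
    return "Postgres or MariaDB"


small_tenant = suggest_silo_database(data_gb=5, user_count=20)
large_tenant = suggest_silo_database(data_gb=200, user_count=20)
```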

Snowflake, Google BigQuery, Dremio, Amazon Redshift, and Vertica are the best for the shared database option as they are optimized for the mixed load from low-latency queries and ETL/ELT micro-batches.

Apache Spark and Amazon Athena are more cost-efficient options for the data warehouse implementation. You can leverage them in combination with workspaces that handle low-latency queries.

Postgres, MariaDB, or Vertica (for larger tenants) are in my opinion the best options for workspace implementations.

Reports, dashboards, and data visualizations

Reports, dashboards, and data visualizations can be implemented via standard single-tenant BI tools. Many vendors use in-memory data processing for low latency (e.g. Power BI, Qlik). The in-memory approach usually doesn’t provide great realtime capabilities (limited data refresh frequency) and doesn’t scale in terms of data volume. Other BI tools like Tableau use a file-based query mechanism that scales to larger data volumes better than the in-memory alternatives.

Many BI tool vendors implement a direct query mechanism that executes queries at the database level. Be careful when you use BI tools to implement the most advanced, workspace-based architecture. Direct query either degrades this architecture to the shared analytical database option or forces you to implement the workspace storage and query layer manually.

If you don’t want to design and engineer your multitenant analytics architecture yourself, you can use an existing analytics platform. There are a couple of them available on the market. The GoodData analytics platform is in my opinion the best choice for multitenant analytics solutions. Let me briefly show how this platform implements the architectures above.

GoodData platform: fully managed service and local containers

The GoodData platform offers two deployment options:

  • Fully managed SaaS platform with workspaces that connects to your data warehouse and distributes data to GoodData-hosted workspaces.
  • Docker and Kubernetes container images that allow for deploying analytics to your on-premises data center or to a private or public cloud (e.g. AWS, Azure, or Google Cloud) side-by-side with your application. In this case, the GoodData platform connects to your local database.

The fully managed SaaS platform de facto implements the last, most advanced workspace-based architecture option described in this article.

The GoodData platform container images can be used for the implementation of the first three architectures described in this article: operational database analytics, per-tenant silos, or centralized analytical database. The locally deployed GoodData platform also provides virtual workspaces for multitenant management of your tenants’ metadata.

The unified analytics layer and the same analytics tools allow for easy hybrid deployment of your solution by combining the fully managed SaaS with public or private cloud deployment.

Multitenant analytics, such as customer-facing analytics, is hard and costly to implement and operate. I strongly recommend you plan your implementation at least 18 months ahead. Try to assess the future state of your analytical solution and design its architecture based on the future state’s requirements. Spend more effort on planning your engineering and operations budget to decide whether you want to build the solution in-house or adopt an existing analytics platform.

Author: Cheryll Lueilwitz
