By Richard Louden, Head of Technology (Data) at Nimble Approach
Many organisations are eager to adopt AI, with a number of the larger platform players marketing key functionality to accelerate this. This blog explores whether the reality of these features matches the marketing.
As AI functionality is integrated with data platforms, a potential evolution of dashboards and reporting has emerged where users can now answer questions around organisational data via natural language.
This blog reviews Databricks Genie Spaces, one of the current offerings in this area, to assess how it can deliver value to organisations and highlight the factors that should be considered when implementing it.
Evolving Organisational Insight Using LLMs
For the majority of business users, dashboards are the main approach to understanding what has happened within their organisation to support decision making. The development of these pre-determined reports follows a standard pattern of: understand the questions a user will ask of the data, develop underlying transformations, and present this data to help them answer said questions. Whilst effective at creating an informative output at first, this process falls down as users start to expand and evolve their questions. Addressing these evolving requirements often demands new data sources, visualisations, and, in some cases, entirely new dashboards, all of which require development effort and can impact the user experience.
As this process continues, it becomes clear that a single solution cannot serve two distinct use cases: quickly understanding an aspect of the organisation, and interrogating organisational data as a non-technical user. While pre-defined dashboards can satisfy the first requirement, they are less effective at addressing the second.
To bridge this gap, organisations are increasingly turning to alternative solutions, with Large Language Models (LLMs) emerging as a particularly promising option. Such tools have cemented themselves as effective coding assistants, evolving how engineers build applications and data pipelines. Extending similar capabilities to business users is a natural next step. Given access to the appropriate data and context, LLMs can translate business questions into underlying queries and deliver relevant, data-driven answers.
Databricks Genie Spaces: A Natural Language Overlay To Your Data
With the majority of organisations storing copies of their organisational data in one of the major platform solutions (Databricks, Snowflake & Fabric), it’s not surprising to see these providers develop products in this space. Databricks addresses this need through Genie Spaces, enabling organisations to create tailored environments in which users can explore and query data using natural language. On top of this core functionality, they have been woven into the Databricks ecosystem to provide a suite of supporting features, such as:
- Permission management through Unity Catalog, ensuring robust data governance and security.
- Use of metric views as a data source, to simplify investigation of key metrics.
- Mechanisms to provide metadata to support the underlying LLM agent in its tasks, such as join conditions and business synonyms.
- Ability to combine Genie Spaces with other Databricks AI offerings to create agent chains for more complex tasks.
On paper, this provides a secure way for users to query organisational data, reducing their reliance on curated dashboards and enabling a more flexible approach to data exploration. With this in mind, I wanted to test the technology firsthand to understand where it could deliver the greatest value within future Databricks platforms I work on.
Testing Genie Spaces
Space creation is supported through both the Databricks portal and API, but there is currently no support for provisioning Genie Spaces via Asset Bundles. To create a space via the API, you need to provide a serialized_space parameter. This is a large JSON string that defines all the elements for the space, such as data connections, semantics, and example queries. Given its complexity, I would recommend creating an initial space through the portal and then extracting the associated configuration if you need to replicate it across other workspaces.
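As a rough illustration of the API route described above, the sketch below builds (but does not send) a request that creates a space from an exported configuration. Only the serialized_space parameter comes from the text; the endpoint path, the other field names, and the helper name build_create_request are assumptions that should be checked against the current Databricks API reference.

```python
import json
import urllib.request

# Assumed REST path for Genie Spaces; verify against the Databricks API reference.
GENIE_SPACES_PATH = "/api/2.0/genie/spaces"


def build_create_request(host, token, title, warehouse_id, serialized_space):
    """Build (but do not send) a request that creates a space from an exported config."""
    # Fail early if the exported configuration is not valid JSON.
    json.loads(serialized_space)
    payload = {
        "title": title,                        # assumed field name
        "warehouse_id": warehouse_id,          # assumed field name
        "serialized_space": serialized_space,  # parameter named in the text above
    }
    return urllib.request.Request(
        url=host.rstrip("/") + GENIE_SPACES_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it. Keeping the build step separate makes it easy to inspect the payload before pointing it at a second workspace.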
Creation via the portal is incredibly simple – just locate the Genie Spaces page, click on ‘create new’ in the top right, and then add your data sources. This will give you a chat UI – similar to the one shown in Figure 1 – where you can pose your questions once you’ve connected your data sources.

Now that you have the basic elements, you can start adding organisational context to improve the responses. This includes specifying how tables join together, providing internal synonyms for column names, and adding instructions for the underlying LLM, such as ‘this means X’. Configuring a space through the portal requires stepping through several interfaces (Figure 2), which can take some time. However, this process improves response accuracy and produces a reusable template that can later be extracted via the API for use in additional spaces.

Figure 2 – Adding organisational context and semantics
Once you have your data and metadata organised, you can start using the space to investigate – with a lot more freedom than you would find in a standard dashboard (Figure 3). Once a question is posed, the underlying LLM will assess the data it has access to, build up a plan of how best to answer the query, and then develop and run code to answer it.
Alongside the query output and associated visualisations, users are able to access the actual code that was run (Figure 4), meaning those with an understanding of SQL and the data assets can assess how accurate the response is. This is very important during the set-up phase of Genie Spaces, as it allows data engineers and analysts to test known queries with expected outputs and tweak any of the Space’s metadata.

Figure 3 – Example of Genie output

Figure 4 – Underlying SQL query created and run by the LLM
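The set-up check described above, testing known questions against expected outputs, can be sketched as a small harness. Everything here is hypothetical scaffolding: ask stands in for whatever mechanism returns a space's result rows for a question, and BenchmarkCase and run_benchmarks are illustrative names rather than part of any Databricks tooling.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Rows = List[Tuple]


@dataclass
class BenchmarkCase:
    question: str
    expected_rows: Rows


def run_benchmarks(ask: Callable[[str], Rows], cases: List[BenchmarkCase]) -> List[str]:
    """Return the questions whose answers differ from the expected rows."""
    failures = []
    for case in cases:
        actual = ask(case.question)
        # Compare as sorted lists so row order does not cause false failures.
        if sorted(actual) != sorted(case.expected_rows):
            failures.append(case.question)
    return failures
```

Re-running a list of cases like this after each metadata tweak gives a quick signal on whether a change to joins, synonyms, or instructions has altered the answer to a question that previously passed.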
In addition to providing access to underlying tables, Genie can leverage metric views to improve the reliability of organisational metric calculations, rather than relying on the LLM to determine the appropriate data, filters, and aggregations. Metric views, defined with SQL or YAML files (Figure 5), have been implemented to move the mass of organisational context that exists in BI systems into the data platform, where it can be more tightly governed.
For example, definitions can be created for different organisational groups, to ensure that their specific variations on key metrics are not lost. These can then be enhanced through adding departmental context, such as comments and synonyms, that support the Genie Space LLM in converting variable natural language queries into the correct metric. An example of how the underlying queries differ is shown in Figure 6, where the same question was posed as in the previous example. When a metric view is provided as a data source, the LLM can leverage its associated metadata to retrieve defined metrics, improving accuracy compared to inferring calculations directly from raw data tables.

Figure 5 – Example of a Databricks metric view definition

Figure 6 – Example of a Genie Space query when relying on metric views as opposed to 1
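To illustrate why a governed definition helps, the toy sketch below keeps measure expressions in one place and renders a query that asks for a measure by name instead of re-deriving the aggregation. The dictionary layout, the sales_metrics view, and its columns are hypothetical and much simpler than a real metric view definition. The MEASURE() wrapper is intended to mirror how metric view measures are referenced in Databricks SQL, but treat the exact syntax as an assumption to verify.

```python
# Hypothetical, simplified stand-in for a metric view definition.
METRIC_VIEW = {
    "name": "sales_metrics",
    "dimensions": {"region": "region"},
    "measures": {"total_revenue": "SUM(amount)"},
}


def render_metric_query(view, measure, dimension):
    """Render a query that asks for a governed measure by name."""
    if measure not in view["measures"]:
        raise KeyError(f"Unknown measure: {measure}")
    if dimension not in view["dimensions"]:
        raise KeyError(f"Unknown dimension: {dimension}")
    # The aggregation itself lives in the definition, not in the query text.
    return (
        f"SELECT {dimension}, MEASURE({measure}) AS {measure} "
        f"FROM {view['name']} GROUP BY {dimension}"
    )
```

Because the query only names the measure, changing how total_revenue is calculated means editing the definition once, and requests for an undefined metric fail loudly.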
Closing Thoughts
After spending some time building out a Genie Space and working through a few of the enhancements, my conclusion is that there is clear value in the concept. Having spent years building dashboards for various key user groups, I understand the pains of their rigidity and the ease with which they can multiply as users find new questions to ask of their now-accessible data.
My opinion is that this value is found through using Genie Spaces to augment and rationalise organisational dashboards, rather than as a direct replacement. Dashboards still have a place as a cost-effective mechanism for users to understand key metrics and past performance, especially now that the semantic elements can be shifted into the data platform where they can be better managed. Genie Spaces can then enable users to directly investigate more specific areas, and support the core reporting layer through two key processes:
- Help rationalise the number of existing dashboards to reduce maintenance and operational costs by replacing those that do not align with agreed organisational metrics.
- Provide a governed space for non-technical users to query data, fulfilling future requirements that would otherwise require the development of new analytical processes or outputs.
As with adopting any new technology, however, consideration needs to be given to the more mundane areas that support the longevity of this approach, rather than just focusing on the interesting outcome. For Genie Spaces, this currently falls into three main areas: reproducibility, cost, and governance.
Reproducibility: Organisations need to consider both how Genie workspaces can be replicated across multiple environments and how responses to similar queries are tracked. The replication aspect can only be automated via the API at present, which will likely require new processes to be established but does significantly reduce manual effort. In terms of ensuring accuracy, workspace owners can establish benchmarks to test their spaces, alongside a number of monitoring tools to ensure that question X consistently returns answer Y.
Cost: As with most Databricks features, there is an underlying cost. For Genie Spaces, this is not based on LLM usage, but on the underlying SQL cluster that is used to access data and calculate responses. Giving users free rein to pose questions can therefore end up accruing a larger bill than organisations may have come to expect compared with dashboards that are updated at set intervals. As such, workspace administrators should establish reasonable budgets and policies on such resources to prevent overspending and provide clear usage statistics to end users.
Governance: Given Genie Spaces can be used to pull back row-level data, there needs to be a clear governance wrapper to prevent misuse or certain users accessing inappropriate data. This can be managed through Unity Catalog, but it needs these foundations in place to function correctly. Clear permission structures and attribute-based controls should be established on the underlying data, which can then be enforced by the Genie Space on a per-user basis.
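To make the cost point above more concrete, here is a back-of-the-envelope sketch comparing a compute resource that runs briefly for scheduled refreshes with one kept busy by ad hoc questions. The function name and all figures are hypothetical; actual consumption rates and prices depend on the compute size, cloud, and contract.

```python
def estimate_monthly_cost(dbu_per_hour, price_per_dbu, active_hours_per_day, days=30):
    """Rough monthly cost of the SQL compute serving a space or dashboard."""
    if min(dbu_per_hour, price_per_dbu, active_hours_per_day, days) < 0:
        raise ValueError("inputs must be non-negative")
    return dbu_per_hour * price_per_dbu * active_hours_per_day * days


# Hypothetical figures: 10 DBU/hour at 0.5 currency units per DBU.
scheduled = estimate_monthly_cost(10, 0.5, active_hours_per_day=1)  # 150.0
ad_hoc = estimate_monthly_cost(10, 0.5, active_hours_per_day=6)     # 900.0
```

Under these made-up inputs, the only thing that changes is how many hours the compute stays active, which is why usage policies and budgets matter more once users can ask questions at any time.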
Clearly there is a valuable niche for Genie Spaces to fill when it comes to helping users understand their organisation’s data, and it does this well. As with all new functionality, however, there need to be clear data and governance foundations in place from the outset to support both its accuracy and longevity post-implementation.