Comparing Databricks Data + AI Summit vs Fabric (inc. Build announcements)

As announcement season comes to an end, we can look at how this week's big announcements from Databricks line up with Fabric and the roadmap we know about today.

Before going into the features from each, it's clear that we are going to have a big tussle between the two.

I'll start with the announcements from Data + AI Summit and then look at any features the Fabric team have announced that weren't covered.


Databricks Data + AI Summit

Lakebase

First up, we have operational databases. Bringing operational systems into the same platform as insight and analytics is proving to be a big theme at the moment, and one that will simplify solutions and reduce both data movement and data duplication. That's a big win not only for customers, but also for the sustainability of our data solutions.

Databricks' offering is Lakebase, a Postgres solution underpinned by the Databricks storage platform. Over in Fabric, Microsoft has announced Cosmos DB as the equivalent.

Which of these is better is going to be very much down to your preferences and use cases across the wider platform.

Result: Draw

Agent Bricks

Before I go into this one, I'm going to admit that I'm not a data scientist. It's a space I understand at a high level to architect around, but leave to those with bigger brains than me to implement. So please do understand that these thoughts are based on not having practical experience of building these things.

But, for me, Databricks seems to have pulled a rabbit out of the hat with this one (especially with the Gemini tie-up).

Agent Bricks is effectively LLM-driven optimisation of agentic AI. The platform will auto-optimise based on human review of key points, and help work out the cost vs output quality trade-off - ensuring that the best model is being used for each use case.
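
To make the cost vs quality trade-off a bit more concrete, here is a minimal Python sketch of the general idea: pick the cheapest candidate model whose quality score clears a bar. This is purely illustrative and is not how Agent Bricks works under the hood; the model names, costs, and scores are all made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    name: str            # hypothetical model name
    cost_per_1k: float   # assumed cost per 1,000 requests
    quality: float       # assumed reviewer score between 0 and 1


def cheapest_good_enough(
    candidates: list[Candidate], min_quality: float
) -> Optional[Candidate]:
    """Return the lowest-cost candidate meeting the quality bar, or None."""
    eligible = [c for c in candidates if c.quality >= min_quality]
    return min(eligible, key=lambda c: c.cost_per_1k) if eligible else None


# Made-up numbers for illustration only.
CANDIDATES = [
    Candidate("large-model", cost_per_1k=12.0, quality=0.93),
    Candidate("medium-model", cost_per_1k=4.0, quality=0.88),
    Candidate("small-model", cost_per_1k=1.0, quality=0.71),
]

choice = cheapest_good_enough(CANDIDATES, min_quality=0.85)
print(choice.name if choice else "no model meets the bar")
```

With these made-up numbers, the mid-sized model is picked: it clears the quality bar at a third of the cost of the largest one.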

If Microsoft had the same thing, it would certainly appear in AI Foundry, and my understanding is that it doesn't.

Given this, Databricks definitely makes it easier to tune and optimise agentic AI platforms with smaller data science teams compared to the equivalent Microsoft platforms.

Result: Databricks wins

Lakeflow Designer

This is Databricks' no-code/low-code ETL tool - the equivalent to Fabric's Gen2 Dataflows and Fabric pipelines combined.

With the Databricks version, we're able to lower the technical entry point for those using the Databricks platform, meaning that those not familiar with pro-code tools are able to build data solutions. The good thing with Lakeflow Designer is that it still allows professional developers to go in and alter the SQL behind the scenes to ensure the results are as expected.

Turning to Fabric (and parking my dislike of Dataflows), we have two tools available. Whilst this makes it harder for users starting out on the platform, it does mean that those users can start building with Dataflows, learn to orchestrate them with Pipelines, and move on to using notebooks as their technical abilities increase - giving a nice path from low-code to pro-code for those that want to follow it. On top of that, we now have Copilot available to support developers regardless of SKU - further reducing the technical cost of entry.

Whilst the multiple options available in Fabric can cause confusion, the available learning curve combined with Copilot integration means that Fabric just nicks this one.

Result: Fabric wins

Databricks Free edition

At last, a major platform is providing a limited free edition of their product. This is a massive win for students and data professionals full stop.

Over my career, getting an instance to do non-client work in has always been a challenge. Databricks has just solved that challenge.

Whilst Fabric has a free trial, not being able to sign up with a personal email, and the environment only being guaranteed for a limited period (I know it doesn't always stop working), is a real limitation.

Result: Databricks wins

Databricks One

Exactly the same functionality as the current Fabric 'chat with your data'. It basically allows business users to use natural language to find what they are after.

Result: Draw

AI/BI

AI/BI is Databricks' built-in BI tool, providing BI functionality without additional cost. Think of it as a lightweight Power BI.

It isn't as feature rich as Power BI, and it probably isn't going to be right for most enterprises.

This one is a slam dunk for Fabric.

Result: Fabric wins

AI/BI Genie

This is Databricks' equivalent of a combination of Data Agents, chat with your data, and Copilot. The idea being that you can ask a question of your data and have an LLM translate your question into an answer based on your data.

At face value, these look very similar. But Databricks has already announced that they will go further than Fabric with Deep Research. With this feature, business users are able to ask how they can achieve goals and the platform will create a report on how to achieve them.

Behind the scenes, the two approaches are also very different. Databricks relies on domain experts marking results, which is then used to fine-tune the model. Fabric, however, relies on users developing and maintaining a semantic model that provides the LLM with domain expertise. Today it's hard to tell which approach is best given how new the technology is.

Scoring this one is hard as we'll only know the better approach once customers use the two products in anger. But based on the Deep Research functionality, I'm going to just give this to Databricks.

Result: Databricks wins

Unity Catalog

For those that don't know, Unity Catalog is Databricks' data governance offering. Fabric doesn't really come close as a standalone - and even when paired with Purview, for me Unity Catalog is still more powerful.

If Microsoft really want to compete, they should start giving Purview away free to Fabric subscribers with F64 and above.

Result: Databricks wins

Databricks Apps

This is a service for launching data-driven applications that are hosted in Databricks and built with common data science UI tools and general-purpose web tools like Node.js. It can then be combined with 3rd party tools to vibe code applications that are then exposed via a Databricks-hosted URL.

In the Microsoft stack, the closest thing we would see is Power Apps.

Which of these wins really depends on the capabilities in your business. Those with Power Apps experience will prefer the Microsoft approach and those with open-source experience will prefer the Databricks approach. The result is a draw.

Result: Draw

Lakebridge

Databricks have released Lakebridge, which is an LLM-based platform designed to accelerate migration from customers' previous platforms to Databricks.

In the Fabric space, so far we've only had tools helping to migrate from previous Microsoft products such as Synapse SQL pools to Fabric - and they tend to address a small slice of the migration path.

Result: Databricks wins

Clean rooms

We've seen an extension of clean rooms onto GCP. This feature allows data sets to be shared without revealing sensitive data - meaning businesses can quickly and easily share data with partners whilst reducing risk.

Currently Fabric doesn't have an equivalent.

Result: Databricks wins

Microsoft Build

Now we've covered off Databricks Data + AI Summit, let's look at features from Build that haven't been covered.

Digital twin builder

Digital twin builder allows us to create a graph DB digital twin of key business objects over our lakehouse without duplicating data. This means that the power of graph databases can be brought to use cases such as supply chain analysis, manufacturing bottleneck identification, etc.

Today, Databricks doesn't have an equivalent.

Result: Fabric wins

OneLake shortcuts

In Fabric, shortcuts allow us to reference data across Fabric objects without duplicating it. Whilst Databricks can reference data in the same workspace, an equivalent of shortcuts doesn't exist today.

Result: Fabric wins

Mirroring

Within Fabric, we can create zero-cost (within limits) replicas of specific data sources with a few clicks.

Today, Databricks doesn't have an equivalent.

Result: Fabric wins

GraphQL

Out of the box, Fabric provides native GraphQL APIs and the ability to quickly and easily create GraphQL objects that can be accessed from the API.

Again, whilst Databricks provides APIs over both Unity Catalog and Databricks itself, it's not the same native functionality for those that want to use GraphQL. If you want to use Databricks and GraphQL, you'll need to build your own.
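
For anyone who hasn't called a GraphQL API before, here is a rough Python sketch of what a client request looks like, using only the standard library. The endpoint URL, token, and the `customers` schema are all placeholders I've made up for illustration; a real Fabric GraphQL item will have its own endpoint, auth flow, and generated schema.

```python
import json
import urllib.request

# Placeholder values: replace with your own endpoint and access token.
ENDPOINT = "https://example.invalid/graphql"  # hypothetical URL
TOKEN = "<access-token>"                      # hypothetical credential

# Hypothetical schema: a `customers` type exposing `id` and `name`.
QUERY = """
query ($first: Int!) {
  customers(first: $first) {
    items { id name }
  }
}
"""


def build_request(query: str, variables: dict) -> urllib.request.Request:
    """Build a GraphQL-over-HTTP POST request (built here, not sent)."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {TOKEN}",
        },
    )


request = build_request(QUERY, {"first": 5})
# To actually call the API: urllib.request.urlopen(request)
```

The point is that the client only needs to send a query and variables as JSON; the work of exposing the data as a GraphQL schema is what Fabric does for you and what you'd have to build yourself on Databricks.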

Result: Fabric wins

Conclusion

The key question is: which should you go for?

Unfortunately, that's not a clear-cut decision as ultimately it depends on:
  • The skills you have in the business today.
  • Whether you already use Power BI or Synapse.
  • Your risk appetite around how mature the data platform is (Databricks is a lot more mature compared with Fabric).
  • The outcomes you are trying to achieve.

For me, if you end up with either platform, you aren't likely to regret your decision.
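
Tallying up the results from each section above shows just how close this is. A small Python sketch, with the outcomes copied straight from the Result lines in this post:

```python
from collections import Counter

# Outcome per feature, as scored in the sections above.
RESULTS = {
    "Lakebase": "Draw",
    "Agent Bricks": "Databricks",
    "Lakeflow Designer": "Fabric",
    "Databricks Free edition": "Databricks",
    "Databricks One": "Draw",
    "AI/BI": "Fabric",
    "AI/BI Genie": "Databricks",
    "Unity Catalog": "Databricks",
    "Databricks Apps": "Draw",
    "Lakebridge": "Databricks",
    "Clean rooms": "Databricks",
    "Digital twin builder": "Fabric",
    "OneLake shortcuts": "Fabric",
    "Mirroring": "Fabric",
    "GraphQL": "Fabric",
}

tally = Counter(RESULTS.values())
for outcome, count in sorted(tally.items()):
    print(f"{outcome}: {count}")
```

That comes out at six wins each with three draws, which is why the answer depends on your own circumstances.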

For those at uni, my advice would be to start with Databricks and learn all you can - with Fabric being built on many of the same open technologies as Databricks (such as Spark and Delta Lake), a lot of the skills you learn can be transferred to Fabric easily.
