Fabcon: Day 1 keynote
Less than an hour after this morning's keynote, I thought I'd put together some initial thoughts on what we heard at Fabcon.
After some initial fun with dancing robots, and an introduction from the team at Guy in a Cube, it was time for Arun to take to the stage and open the keynote.
As we've come to expect from Microsoft keynotes over the last 12 months, the key theme was AI, but with a critical recognition of how crucial getting your data estate in order is to taking advantage of the new developments we're seeing in technology today.
As all of us data experts know, the quality of the insight the business will obtain is only as good as the quality of the data that underpins it, typically referred to as garbage in / garbage out (GIGO).
Combine this with the fragmentation in the technology landscape, and businesses struggle to solve the dual challenges of interoperability and GIGO, challenges that Microsoft Fabric is trying to solve today.
For those who aren't familiar with Microsoft Fabric, Arun summarised it perfectly as doing for AI and BI what Office 365 has done for the rest of your business: a single suite of tools that your data engineering teams need in order to get your business ready for the age of AI.
Having had the recap of what Fabric is, the adoption rates being driven, and how the community is growing, we moved into the main part of the presentation.
Before we got into the product details, the first announcement was the new DP-700 exam that's launching next month. Unlike the current DP-600, this exam focuses on the data engineering persona, going deep into that part of the product and allowing engineers to demonstrate their capabilities within their specialisation.
Next up, the new product announcements. This time round they are split into three key areas:
1. AI powered development
2. AI powered data estates
3. AI powered insights
AI powered development
First up, in the next few weeks expect a complete overhaul of the UI. Gone are the individual workload interfaces.
In its place is a new UI focused around developers and insight analysis, with a straightforward switch between the two. This is the overhaul the product has needed since private preview, and it's great to see that the feedback from partners and the community has been listened to.
For developers, the interface will be based around task flows. For those who haven't seen task flows, these are effectively blueprints that development teams can follow to deliver common solutions, grouping associated Fabric components to make it easy to follow a solution in the future. Be it using the out-of-the-box task flows provided by Microsoft or task flows you build yourselves, it is clear these will become more and more important in Fabric.
Next, we moved onto Data Factory and the Copilot features that are coming soon. Copilot will get full integration with Data Factory, be it using natural language to build a new pipeline from scratch or debugging that failure in last night's data load. No more searching the internet to see if someone has had the same issue (yes, all good developers do this); the promise is that Copilot will do this for you and provide you with the answer to solve the issue.
From the demo we had, it looks like a great tool to potentially improve developer efficiency. But we will need to wait to see if it lives up to the promise, or if the fixes it suggests fail to solve the issue. One to definitely keep an eye on and trial once available.
For Power BI developers, more features will be coming to Copilot, with the ability to iterate on dashboards and deliver train-of-thought analysis front and centre. If they deliver to the same standard as Copilot for Power BI does today, this direction of travel has the potential to change the way we do BI in the future. Developer focus will shift from building reports to finding the insights that have the potential to deliver 10x improvements for the business.
For CI/CD and environment managers, we got confirmation that all the primary Fabric items will get Git support by the end of this year, meaning that the native code management solution will deliver on its initial promises. As of today, Fabric support in Terraform has gone into public preview, meaning we can ensure our Fabric environments are deployed as and when needed with the consistency we expect.
Fabric runtime 1.3 is now GA, delivering support for Spark 3.5 and Delta 3.2 and ensuring we keep up with the open source versions. On top of that, Microsoft has added a new native Spark execution engine that has gone into public preview. The new engine is currently benchmarked to deliver 4x the performance of the current GA version. Again, hands-on time will be needed to see if this is the case; but the best thing is that this new engine is being delivered at no additional cost.
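For anyone wanting to trial the new engine once it reaches their capacity, it can apparently be switched on per notebook session. A minimal sketch for a Fabric PySpark notebook, assuming the preview exposes a Spark property called spark.native.enabled (that property name is my reading of the preview material, so verify it against the official docs before relying on it):

    %%configure -f
    {
        "conf": {
            "spark.native.enabled": "true"
        }
    }

Queries the native engine can't handle are expected to fall back to standard Spark execution, so switching it on should be a fairly low-risk experiment.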
For data scientists, we move on to AI functions. These are designed to turn common AI workloads like translation, sentiment analysis, and classification into native Spark functions. No more building custom models; instead these are provided natively within Fabric without the need for additional services. This development moves these common tasks into the remit of data engineering teams, freeing up data scientists to focus on more complex and valuable work. Add to this native support in Notebooks for mirrored data sources and geospatial analysis, and the notebook experience is getting a pretty significant update in this space.
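To make that concrete, here's a hypothetical sketch of the call pattern described in the keynote. The function names (ai_translate, ai_analyze_sentiment), their import, and the table names are my own placeholders rather than the confirmed API, so treat this purely as an illustration of the idea:

    # Hypothetical sketch only: the import path and function names below are
    # placeholders for whatever the Fabric AI functions library actually exposes.
    # from fabric.ai.functions import ai_translate, ai_analyze_sentiment
    from pyspark.sql import functions as F

    # "spark" is the session pre-provided in a Fabric notebook;
    # "customer_reviews" is an illustrative lakehouse table.
    reviews = spark.read.format("delta").load("Tables/customer_reviews")

    # The announced pattern: common AI tasks as native Spark functions,
    # with no custom model hosting or external services required.
    enriched = (
        reviews
        .withColumn("review_en", ai_translate(F.col("review_text"), "en"))
        .withColumn("sentiment", ai_analyze_sentiment(F.col("review_en")))
    )

    enriched.write.format("delta").mode("overwrite").save("Tables/reviews_enriched")

If the real API looks anything like this, the appeal is obvious: enrichment that used to need a data scientist and a hosted model becomes a couple of column expressions in an ordinary pipeline.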
Lastly, the team looked at real-time data consumption and analysis. In this space we didn't really see anything new, more a recap of what has been delivered so far. Hopefully real time will get a wave of updates soon.
AI powered data estates
Focusing on OneLake, Microsoft revealed that their global storage requirements are currently doubling every 15 weeks, showing how popular Fabric is becoming worldwide.
Starting with Data Factory again, it was revealed that a new Copy job item is in public preview. Having analysed pipeline usage, Microsoft realised that a lot of pipelines focus on simple copies, and so have made it easier for developers to complete these common tasks.
Whilst only in public preview at the moment, it should certainly reduce delivery timelines on a number of projects, while also adding incremental loads to reduce processing overheads. It supports both tables and files, meaning you can now easily do incremental loads on both your structured and unstructured data.
Mirroring has had a couple of updates, be it mirroring with Snowflake going GA or the public preview of Databricks mirroring via Unity Catalog. With Unity Catalog, shortcuts are automatically synchronised with the catalog, making it easy to access new data sets. If Unity Catalog mirroring is of interest, please do read the Microsoft documentation carefully; with it being in public preview, I'm sure we'll see some pretty significant limitations at this stage.
Moving back to Snowflake, we had a demonstration of how Iceberg tables can be accessed via shortcuts, whether with Snowflake in AWS or hosted within OneLake for those running on Azure. When either of these options is applied, Fabric creates a virtual Delta log, allowing all Fabric items to talk to your Iceberg data natively. For me, this solves what can be a pretty big interoperability headache; the key will be to see it in action with real world use cases to work out how the total cost of ownership will be impacted.
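If it works as described, the practical upshot is that an Iceberg table surfaced through a shortcut should read like any other lakehouse table. A minimal sketch from a Fabric notebook, assuming a shortcut named iceberg_orders has already been created under the lakehouse Tables folder (the shortcut and column names are illustrative):

    # "iceberg_orders" is an illustrative shortcut name, not from the keynote.
    # Once Fabric has generated the virtual Delta log for the shortcut, the
    # Iceberg table should behave like any other Delta table in the lakehouse.
    orders = spark.read.format("delta").load("Tables/iceberg_orders")
    orders.groupBy("order_status").count().show()

    # Or query it through Spark SQL against the default lakehouse:
    spark.sql("""
        SELECT order_status, COUNT(*) AS order_count
        FROM iceberg_orders
        GROUP BY order_status
    """).show()

No format conversion, no duplicate copy of the data; that's the interoperability win, if the virtual Delta log holds up against real workloads.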
Last in this section was a high-level overview of the data governance features that have been delivered and are still to come. There wasn't much detail, and it was skipped over pretty quickly, but do go check out the conference blogs when they come out.
AI powered insights
This was the last section, and the one they clearly wanted to create some sizzle with, very much focused on Power BI as the interplay between Fabric and Office 365: