Having previously covered the main features announced in the keynote and the Power BI release, it's time to turn our attention to preview features.
As always with preview features, the general rule of thumb is not to use them in production environments, as they carry the risk of breaking changes being made. If you want to use them, then please do a risk-based analysis of each feature - for example, the longer a feature has been in preview, the lower the risk of a breaking change being made.
We'll start with the keynote announcements.
Keynote
Platform enhancements
OneLake security
This is a major change to the security approach that a lot of people have been waiting for. Given how large a change it is, I covered it in my keynote blog - if you want to know more, please do jump over there.
Synapse migration assistant
The next announcement was a migration assistant to help move the metadata and data from Synapse Warehouses across into a Fabric Warehouse using a Fabric-based wizard.
Once a migration completes, you can expect to see a list of errors that need to be resolved, along with Copilot support to write code to fix these. Looking at the list of things to come, it's very much an MVP.
But for me, this is an early indication that Synapse users should start planning a migration to Fabric to ensure they can migrate in their own time - rather than carrying on and waiting for the inevitable day that Microsoft serves notice on Synapse.
Command Line Interface (CLI)
We've now had a new CLI drop, at which point Microsoft have fully revealed their hand: the vision is "Fabric as the operating system for data", not just an insight platform. It was hinted at with Fabric databases and the introduction of the term "transanalytical" (and that's the only time I will ever use that term!).
Built over the REST APIs, it's already being used to do such things as:
- Build and test Fabric projects locally
- Script dev or demo environments
- Integrate with GitHub Actions/Azure DevOps
- Use bash/PowerShell to automate common tasks
Going forward, we can expect to see more functionality added and the CLI go open-source.
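Because the CLI is a wrapper over those same REST APIs, you can also script against them directly. Below is a minimal sketch (not the CLI itself) that lists the workspaces you can access - it assumes you have the requests and azure-identity packages installed and an identity with access to a Fabric tenant.

```python
# Minimal sketch: list Fabric workspaces via the REST API the CLI wraps.
# Assumes the requests and azure-identity packages and a signed-in identity.
import requests
from azure.identity import DefaultAzureCredential

# Acquire a Microsoft Entra token scoped to the Fabric API.
token = DefaultAzureCredential().get_token(
    "https://api.fabric.microsoft.com/.default"
).token

resp = requests.get(
    "https://api.fabric.microsoft.com/v1/workspaces",
    headers={"Authorization": f"Bearer {token}"},
)
resp.raise_for_status()

for ws in resp.json()["value"]:
    print(ws["id"], ws["displayName"])
```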
CI/CD improvements
A number of new CI/CD features have dropped as part of the keynote. These include:
- Variable Libraries. These allow variables to be defined at the workspace level and re-used across Fabric objects. They can be set up so that as deployment pipelines move the variables through environments, the values are updated accordingly. The big thing to be aware of is that these aren't secure - the values are stored as plain text in Git. They aren't designed to replace Azure Key Vault: if you need to store values securely, you'll still need Key Vault and the existing integration workarounds.
- Service principal support for GitHub. Speaks for itself - we no longer need dedicated user accounts for this.
- Additional deployment pipeline APIs (a sketch of calling one is below).
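To make that concrete, here's a hedged sketch of triggering a deployment between two stages programmatically. The deploy operation is part of the deployment pipelines API, but treat the request body field names here as assumptions and verify them against the current API reference before relying on this.

```python
# Hedged sketch: trigger a deployment between two pipeline stages.
# The body field names are assumptions - check the current API docs.
import requests

token = "<Entra ID token>"  # acquired as in the earlier workspace example
pipeline_id = "<deployment-pipeline-guid>"

resp = requests.post(
    f"https://api.fabric.microsoft.com/v1/deploymentPipelines/{pipeline_id}/deploy",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "sourceStageId": "<dev-stage-guid>",   # assumed field name
        "targetStageId": "<test-stage-guid>",  # assumed field name
        "note": "Automated deploy from CI",
    },
)
resp.raise_for_status()  # deployments run as long-running operations
```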
User data functions
These are designed to allow re-usable, custom Python functions to be built at a tenant level - meaning they're no longer hidden away in a maze of notebooks (looking at you, Synapse), and we're not ending up with code duplication. The trade-off is that you still have hidden code that might confuse those new to the platform.
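To give a feel for the shape of these, here's a minimal sketch of a function definition based on the skeleton the preview generates. The fabric.functions decorator API shown is as documented at the time of writing, but do verify against the current docs.

```python
# Minimal sketch of a user data function, following the documented skeleton.
import fabric.functions as fn

udf = fn.UserDataFunctions()

@udf.function()
def hello_fabric(name: str) -> str:
    """A trivial re-usable function, callable from other Fabric items."""
    return f"Welcome to Fabric Functions, {name}!"
```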
Data integration enhancements
Dataflow Gen2
The main announcement from the keynote blog is the ability to save a Dataflow Gen1 as a Dataflow Gen2. I covered this in my Power BI release notes - please have a look at that blog for more info.
Database mirroring
We've seen a number of changes in this space. These are:
- Mirroring of Azure Database for PostgreSQL
- Mirroring for data sources over on-premises and virtual network data gateways. I know this one is a feature many have been waiting for.
Orchestration improvements
We've now had several improvements, to the point that it's now possible to build metadata-driven pipelines that orchestrate Gen2 dataflows.
Real-time intelligence
The big announcement in this space is a set of new eventstream connectors. Otherwise, it was pretty light on news.
Data engineering and data science enhancements
Autoscale billing for Spark
For those using Spark, you can now ringfence capacity units (CUs) specifically for Spark activity and have them billed on a consumption basis - rather than using the general pool of CUs purchased via your capacity.
I've heard of some teams finding the capacity-based model too restrictive for their use; they would rather work on a consumption basis. The good news is that this is now an option.
For customers who are hitting capacity limits, this could be a viable option instead of moving up to the next SKU. What I would like MSFT to do is provide a calculator to work out the estimated cost of taking the Spark elements outside the current fixed-cost model.
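In the meantime, the back-of-envelope maths is simple enough to sketch yourself: take your Spark CU consumption from the Capacity Metrics app and multiply by the pay-as-you-go rate. The rate below is a placeholder - substitute the published autoscale price for your region.

```python
# Back-of-envelope estimate of consumption-based Spark cost.
# The rate is a placeholder - use the published price for your region.
def spark_autoscale_cost(cu_hours: float, rate_per_cu_hour: float) -> float:
    """Estimated pay-as-you-go cost for Spark CU consumption."""
    return cu_hours * rate_per_cu_hour

# e.g. 5,000 CU-hours in a month at a hypothetical $0.20 per CU-hour
print(f"${spark_autoscale_cost(5_000, 0.20):,.2f}")  # -> $1,000.00
```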
AI functions
These are a set of functions that are backed by AI and designed to make specific tasks faster and easier. The list of functions covered today is:
- Similarity - compare two sets of text strings
- Classify - classify text values based on a set of labels
- Sentiment - sentiment analysis
- Extract - find and extract specific types of information from input text
- Fix grammar - correct spelling, grammar, and punctuation
- Summarize - does what it says on the tin!
- Translate - convert from one language to another
- Generate response - generate a response from the LLM based on a provided prompt
These are all customisable, and links to the documentation are in the associated blog article.
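As a taster, here's a minimal sketch of the pandas-style usage in a Fabric notebook (on recent runtimes the libraries are preinstalled; older runtimes may need a pip install). The import and function names follow the docs at the time of writing - for example, analyze_sentiment rather than sentiment - so treat them as assumptions to verify.

```python
# Minimal sketch of AI functions in a Fabric notebook. Function names are
# as documented at the time of writing - verify against the current docs.
import pandas as pd
import synapse.ml.aifunc as aifunc  # registers the .ai accessor on pandas

df = pd.DataFrame({"review": ["Great product, fast delivery.", "Terrible support."]})

df["sentiment"] = df["review"].ai.analyze_sentiment()
df["summary"] = df["review"].ai.summarize()
df["french"] = df["review"].ai.translate("french")
print(df)
```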
Partner/ISV integrations
Whilst the toolkit for developing these is now GA, we've seen preview announcements from:
- Neo4j
- Esri ArcGIS
- Celonis
- CluedIn
- Lumel
- Statsig
- Striim
Purview
We've had a number of new features announced that increase the usefulness of Purview alongside Fabric. These are:
- Purview for Copilot in Power BI. This allows governance officers to discover data risks such as sensitive data in prompts and responses, investigate risky AI usage, and govern AI usage (audit, eDiscovery, retention, and non-compliant usage detection).
- Expansion of data loss prevention (DLP) policies. These will now cover KQL databases and mirrored databases.
- Data observability. Making it easier to identify the root cause of data quality issues.
Beyond this, we have a number of features announced in the Fabric March feature blog.
March blog
Esri ArcGIS
As mentioned in the keynote announcement, Esri have released a native integration to bring spatial analytics into Spark notebooks and Spark job definitions.
Deployment pipeline updates
Deployment pipelines now support the ability to move Spark job definitions between environments.
Environment sharing across workspaces
Previously, if you wanted to re-use a Spark environment from one workspace across others, you had to reconfigure the environment in each workspace. With this update, it becomes a define-once, re-use-everywhere approach. The limitation is that the workspaces have to be on the same capacity and have the same security settings.
Fabric data agent integration with Azure AI Agent Service
Fabric data agents (i.e. AI skills) can now integrate with the Azure AI Agent Service in Azure AI Foundry. The result is the ability to build custom conversational AI agents with domain knowledge across OneLake, AI Search, and SharePoint.
Fabric data agent SDK
A Python library to make it easier to develop and prototype AI assistants on the Fabric platform.
AI functions in data warehouse
Making the AI functions previously mentioned accessible from a Data Warehouse or Lakehouse SQL endpoint is currently in private preview.
User data functions in Data Warehouse/SQL endpoint
This provides the ability to write custom Python functions that can be invoked from T-SQL. This feature is currently in private preview.
Scalar SQL user-defined functions in Data Warehouse/SQL endpoint
Very much the same as user data functions, but written in T-SQL instead of Python. Again, in private preview.
SQL audit logs for data warehouse
Ensures an audit trail is created when specific events occur, using out-of-the-box functionality - rather than having to hand-crank something.