Data Factory in Microsoft Fabric: A powerful data integration tool for modern businesses

May 21, 2024

In the modern business landscape, organizations are continually seeking efficient ways to process, integrate, engineer, and analyze vast amounts of data. One of the most powerful tools for achieving this is Microsoft Fabric, particularly its Data Factory workload.

Microsoft Fabric represents a groundbreaking advancement in data and AI services by unifying these capabilities into a single, cohesive platform. This integration democratizes data analysis and empowers businesses to harness the power of AI more effectively. With Data Factory as a core component, Microsoft Fabric aims to streamline complex data integration needs, enabling businesses to embrace new digital transformation opportunities and drive innovation.

This blog explores Data Factory in Microsoft Fabric, which offers businesses a robust data integration experience through its two high-level features: Dataflows and data pipelines.

What is Data Factory?

Data Factory is a powerful integration service within the Microsoft Fabric platform that automates and orchestrates data workflows and handles the entire ETL (extract, transform, load) process.
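To make the three ETL stages concrete, here is a minimal sketch in plain Python. It is not Fabric code and does not use any Data Factory API; the sample CSV, the `orders` table, and the function names are all assumptions made up for illustration, with an in-memory SQLite database standing in for the destination.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real source would be a file, API, or database.
RAW_CSV = """order_id,customer,amount
1,alice,10.50
2,bob,
3,alice,4.25
"""


def extract(text):
    """Extract: read raw rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing amount and convert column types."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), row["customer"].title(), float(row["amount"]))
        )
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into a destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn):
    """Run the three stages in order and return (row count, total amount)."""
    load(transform(extract(RAW_CSV)), conn)
    return conn.execute("SELECT COUNT(*), SUM(amount) FROM orders").fetchone()
```

Calling `run_etl(sqlite3.connect(":memory:"))` runs all three stages against a throwaway database. A service like Data Factory takes over the scheduling, connectivity, and scaling that a hand-written script like this would otherwise have to handle.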

Microsoft Fabric Data Factory: A robust data integration tool within the Fabric environment

Data Factory in Fabric provides a modern data integration experience for data professionals to ingest, prepare, and transform data from disparate sources such as lakehouses, data warehouses, and databases. With this trusted data integration experience, data professionals can extract, load, and transform data for their organization. Moreover, its data orchestration capabilities let data and business users build simple or complex workflows tailored to their specific data integration needs.

Data Factory in Microsoft Fabric features Fast Copy, a capability available to both dataflows and data pipelines. With Fast Copy, you can move data to your preferred storage destinations quickly. Most importantly, Fast Copy lets you populate your Microsoft Fabric Lakehouse and Data Warehouse for data analysis, further improving the efficiency of your data operations.

High-level features of Data Factory in Microsoft Fabric

The two high-level features of Fabric Data Factory, Dataflows and data pipelines, together provide a comprehensive way to streamline data integration workflows. Let’s explore these features in detail and see how they change data management and processing for modern businesses.

Dataflows

Data Factory Dataflows in Microsoft Fabric provide a low-code data ingestion and transformation interface. This user-friendly environment lets users access data from hundreds of diverse sources. Over 300 built-in data transformation functions enable users to shape the data to their specific requirements. After transformation, the resulting data can be loaded into various destinations, including Lakehouse and Azure SQL databases. Dataflows offer flexible execution options, including manual or scheduled refresh and integration into broader data pipeline orchestrations.

The dataflows in Data Factory offer a visual Power Query experience that is also available across other Microsoft products and services, such as Excel, Power BI, Microsoft Dynamics 365 Insights applications, and more. This makes data ingestion and transformation accessible to data professionals of all skill levels. The low-code interface lets users perform essential tasks such as joins, aggregations, data cleansing, and custom transformations, all within a highly visual environment.
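For readers who think in code, the sketch below shows what a cleanse, join, and aggregate sequence amounts to, written in plain Python. It is only an analogy: Dataflows express these steps visually through Power Query, not through Python, and the sample tables and helper names (`cleanse`, `inner_join`, `total_by`) are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample tables, standing in for two ingested sources.
customers = [
    {"customer_id": 1, "name": "  Alice "},
    {"customer_id": 2, "name": "BOB"},
]
orders = [
    {"order_id": 100, "customer_id": 1, "amount": 20.0},
    {"order_id": 101, "customer_id": 1, "amount": 5.0},
    {"order_id": 102, "customer_id": 2, "amount": 7.5},
    {"order_id": 103, "customer_id": 3, "amount": 9.0},  # no matching customer
]


def cleanse(rows):
    """Data cleansing: trim whitespace and normalize the casing of names."""
    return [{**row, "name": row["name"].strip().title()} for row in rows]


def inner_join(left, right, key):
    """Join: keep left rows that have a match in right, merging their columns."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


def total_by(rows, group_key, value_key):
    """Aggregation: sum value_key for each distinct group_key."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_key]] += row[value_key]
    return dict(totals)


joined = inner_join(orders, cleanse(customers), "customer_id")
totals = total_by(joined, "name", "amount")
```

Here the order with no matching customer is dropped by the inner join, and `totals` holds one summed amount per cleaned customer name. In a dataflow, each of these functions would correspond to a transformation step picked from the visual editor.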

Data pipelines

Data Factory pipelines offer powerful, cloud-scalable workflow capabilities. This empowers you to create complex workflows to refresh dataflows and efficiently transfer massive datasets. Additionally, you can define sophisticated control flow pipelines that meticulously manage your data processing tasks.

Data Factory pipelines empower you to build robust ETL (Extract, Transform, Load) and broader data factory workflows. These workflows can handle a multitude of tasks at scale by using the built-in control flow capabilities of data pipelines.

Data Factory seamlessly combines low-code dataflow refreshes with configuration-driven copy activities within a single pipeline to easily build end-to-end ETL workflows. For advanced scenarios, the platform allows users to add code-first activities for Spark Notebooks, SQL scripts, stored procedures, and more to ensure maximum flexibility for your data processing needs.
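The idea of chaining different activity types with control flow can be sketched in a few lines of Python. This is a conceptual model only, not the Fabric pipeline engine or its API: the `Activity` class, the `run_pipeline` function, the activity names, and the status strings are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Activity:
    """One hypothetical pipeline step: a name, some work, and its dependencies."""
    name: str
    run: Callable[[], None]
    depends_on: List[str] = field(default_factory=list)


def run_pipeline(activities: List[Activity]) -> Dict[str, str]:
    """Run activities in listed order; skip any whose dependencies did not succeed."""
    status: Dict[str, str] = {}
    for activity in activities:
        if any(status.get(dep) != "Succeeded" for dep in activity.depends_on):
            status[activity.name] = "Skipped"
            continue
        try:
            activity.run()
            status[activity.name] = "Succeeded"
        except Exception:
            status[activity.name] = "Failed"
    return status


log: List[str] = []


def failing_script() -> None:
    """Stand-in for a code-first activity that fails."""
    raise RuntimeError("simulated failure")


pipeline = [
    Activity("copy_raw_data", lambda: log.append("copy")),
    Activity("refresh_dataflow", lambda: log.append("dataflow"),
             depends_on=["copy_raw_data"]),
    Activity("run_sql_script", failing_script, depends_on=["refresh_dataflow"]),
    Activity("run_notebook", lambda: log.append("notebook"),
             depends_on=["run_sql_script"]),
]
result = run_pipeline(pipeline)
```

In this run the copy and dataflow steps succeed, the simulated SQL script fails, and the notebook step is skipped because its dependency did not succeed. A real pipeline adds branching, looping, retries, and parallelism on top of this basic dependency model.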

Fabric Data Factory vs Azure Data Factory: Which one is better?

Both Azure Data Factory and Data Factory in Fabric offer robust capabilities for managing and transforming data. Fabric Data Factory aims to make the integration experience easy to use, powerful, and enterprise-grade. Because each platform has distinctive features and functionality, organizations should choose the one that best aligns with their specific data processing requirements.

Let’s compare Azure Data Factory and Microsoft Fabric Data Factory features to help you choose the best option for your organization.

| Azure Data Factory | Data Factory in Fabric | Description |
| --- | --- | --- |
| Pipeline | Data pipeline | Data pipelines in Fabric are better integrated with the unified data platform, including Lakehouse, Data Warehouse, and more. |
| Mapping dataflow | Dataflow Gen2 | Dataflow Gen2 provides an easier experience for building transformations. Microsoft is working to support more mapping dataflow functions in Dataflow Gen2. |
| Activities | Activities | Microsoft is working to support more ADF activities in Data Factory in Fabric. Data Factory in Fabric also adds new activities, such as the Office 365 Outlook activity. Details are in the Activity overview. |
| Dataset | Not applicable | Data Factory in Fabric has no dataset concept. Connections are used to connect to each data source and pull data. |
| Linked service | Connections | Connections serve a similar function to linked services, but are more intuitive to create in Fabric. |
| Triggers | Schedules (other triggers are in progress) | Fabric can use a schedule to run a pipeline automatically. Microsoft is adding support for more of the triggers available in ADF. |
| Publish | Save, Run | In Fabric, you don’t need to publish to save a pipeline’s content. Use the Save button to save it directly. Clicking Run saves the content before running the pipeline. |
| Auto-resolve and Azure integration runtime | Not applicable | Fabric has no concept of an integration runtime. |
| Self-hosted integration runtimes | On-premises Data Gateway (in design) | This capability in Fabric is still being designed. |
| Azure-SSIS integration runtimes | To be determined | The roadmap and design for this capability in Fabric have not been confirmed. |
| Managed VNet and private endpoint | To be determined | The roadmap and design for this capability in Fabric have not been confirmed. |
| Expression language | Expression language | The expression language is similar in ADF and Fabric. |
| Authentication type in linked service | Authentication kind in connection | Authentication kinds in Fabric pipelines already cover the popular authentication types in ADF, and more will be added. |
| CI/CD | CI/CD | CI/CD capability in Fabric Data Factory is coming soon. |
| Export and import ARM | Save as | Save as is available in Fabric pipelines to duplicate a pipeline. |
| Monitoring | Monitoring, Run history | The monitoring hub in Fabric offers more advanced functions and a modern experience, such as monitoring across different workspaces for better insights. |

Azure Data Factory Migration to Microsoft Fabric: What features are supported?

Microsoft offers several features to ease migration from Azure Data Factory (ADF) to Fabric Data Factory. As you upgrade from ADF to Microsoft Fabric, the following features are supported:

Data pipeline activities

Fabric Data Factory supports many of the activities that users already rely on in Azure Data Factory. New activities, such as Teams and Outlook, have been added for notifications.

OneLake/Lakehouse connector in Azure Data Factory

Azure Data Factory customers can now leverage Microsoft Fabric through the new OneLake/Lakehouse connector. This integration allows users to transfer data into Fabric OneLake effortlessly.

Azure Data Factory Mapping Dataflow to Fabric

For migrating Azure Data Factory (ADF) mapping dataflows to Microsoft Fabric Data Factory, the Fabric Customer Advisory Team (Fabric CAT) provides sample code. This code can convert existing dataflows into Spark code, ensuring a smooth transition to the Fabric environment.

Looking ahead: Other supporting features expected in future versions

Mounting of ADF in Fabric

This feature will allow users to connect to their existing Azure Data Factory (ADF) environment in Microsoft Fabric. This “mounting” capability will enable your ADF pipelines to continue running on Azure without interruption. At the same time, it will grant you access to explore and experiment with Microsoft Fabric functionalities.

Upgrade from Azure Data Factory pipelines to Fabric

Microsoft is working on an upgrade path for data pipelines from Azure Data Factory to Fabric to improve the upgrade experience. This will allow users to test their existing data pipelines in Fabric by mounting and upgrading them.

Simplify your data integration journey with Microsoft Fabric Data Factory

Data Factory in Microsoft Fabric stands out as a comprehensive data integration service that meets the needs of modern businesses. By combining the intuitive data transformation capabilities of Dataflows with the robust workflow orchestration of data pipelines, Fabric empowers you to automate complex data processes, gain valuable insights, and fuel data-driven decision-making. As organizations continue to embrace digital transformation, Data Factory in Fabric provides the capabilities they need to stay ahead in a data-driven world.

Is your data holding you back? Confiz is your trusted Microsoft Partner, specializing in data solutions that help you leverage the power of Data Factory within Microsoft Fabric. From consulting and implementation to migration and ongoing optimization and support, our Microsoft Fabric experts help you get the most from this unified analytics platform. Contact us today at marketing@confiz.com for Microsoft Fabric services with Confiz.