With the data preparation capability available in Power Apps, you can create a collection of data called a dataflow, which you can then use to connect to business data from various sources, clean the data, transform it, and then load it to Common Data Service or your organization’s Azure Data Lake Gen2 storage account.
A dataflow is a collection of entities (entities are similar to tables) that are created and managed in environments in the Power Apps service. You can add and edit entities in your dataflow, as well as manage data refresh schedules, directly from the environment in which your dataflow was created.
Once you create a dataflow in the Power Apps portal, you can get data from it using the Common Data Service connector or Power BI Desktop Dataflow connector, depending on which destination you chose when creating the dataflow.
There are three primary steps to using a dataflow:
- Author the dataflow in the Power Apps portal. You select the destination to load the output data to, the source to get the data from, and the Power Query steps to transform the data using Microsoft tools that are designed to make doing so straightforward.
- Schedule dataflow runs. This is the frequency at which the Power Platform Dataflow service should refresh the data that your dataflow will load and transform.
- Use the data you loaded to the destination storage. You can build apps, flows, and Power BI reports and dashboards, or connect directly to the dataflow’s Common Data Model folder in your organization’s lake using Azure data services like Azure Data Factory, Azure Databricks, or any other service that supports the Common Data Model folder standard.
The following sections look at each of these steps so you can become familiar with the tools provided to complete each step.
Create a dataflow
Dataflows are created in one environment. Therefore, you will only be able to see and manage them from that environment. In addition, individuals who want to get data from your dataflow must have access to the environment in which you created it.
1. Sign in to Power Apps, and verify which environment you’re in; you can find the environment switcher near the right side of the command bar.
2. On the left navigation pane, select the down arrow next to Data.
3. In the Data list, select Dataflows and then select New dataflow.
4. On the Select load target page, select the destination storage where you want entities to be stored. Dataflows can store entities in Common Data Service or in your organization’s Azure Data Lake storage account. Once you select a destination to load data to, enter a Name for the dataflow, and then select Create.
5. On the Choose data source page, select the data source where the entities are stored, and then select Create. The data sources displayed are those from which you can create dataflow entities.
6. After you select a data source, you’re prompted to provide the connection settings, including the account to use when connecting to the data source.
7. Once connected, you select the data to use for your entity. When you choose data and a source, the Power Platform Dataflow service will subsequently reconnect to the data source in order to keep the data in your dataflow refreshed, at the frequency you select later in the setup process.
Now that you’ve selected the data to use in the entity, you can use the dataflow editor to shape or transform that data into the format necessary for use in your dataflow.
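Shaping in the dataflow editor is done with Power Query transformations. As a rough illustration of the kind of shaping involved (written in Python with pandas rather than Power Query M, and using hypothetical column names), a filter-and-rename step might look like:

```python
import pandas as pd

# Hypothetical raw data pulled from a source table.
raw = pd.DataFrame({
    "cust_id": [1, 2, 3],
    "cust_name": ["Ada", "Bo", None],
    "region": ["West", "East", "West"],
})

# Shape the data: drop rows with missing names, rename columns,
# and keep only the fields the entity needs.
shaped = (
    raw.dropna(subset=["cust_name"])
       .rename(columns={"cust_id": "CustomerId", "cust_name": "Name"})
       [["CustomerId", "Name"]]
)
```

In the dataflow editor you perform the equivalent steps through the Power Query user interface, and each step is recorded so it can be replayed on every refresh.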
Dataflows and the Common Data Model
Dataflow entities include new tools to easily map your business data to the Common Data Model, enrich it with Microsoft and third-party data, and gain simplified access to machine learning. You can use these new capabilities to gain intelligent and actionable insights into your business data. Once you’ve completed any transformations in the edit queries step described below, you can map columns from your data source tables to standard entity fields as defined by the Common Data Model. Standard entities have a known schema defined by the Common Data Model.
For more information about this approach, and about the Common Data Model, see The Common Data Model.
To leverage the Common Data Model with your dataflow, select the Map to Standard transformation in the Edit Queries dialog. In the Map Entities screen that appears, select the standard entity that you want to map.
When you map a source column to a standard field, the following occurs:
- The source column takes on the standard field name (the column is renamed if the names are different).
- The source column gets the standard field data type.
To preserve the standard entity’s schema, all standard fields that are not mapped get Null values.
All source columns that are not mapped remain as is to ensure that the result of the mapping is a standard entity with custom fields.
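The mapping rules above can be sketched in code. The following is a minimal illustration of the semantics only, not the actual service implementation, and it assumes a hypothetical standard entity with fields `Name` and `Email`:

```python
def map_to_standard(row, mapping, standard_fields):
    """Apply Map to Standard semantics to one source row.

    mapping: source column name -> standard field name.
    Unmapped standard fields get None (Null); unmapped source
    columns are kept as custom fields.
    """
    result = {field: None for field in standard_fields}  # defaults to Null
    for col, value in row.items():
        if col in mapping:
            result[mapping[col]] = value   # renamed to the standard field
        else:
            result[col] = value            # kept as a custom field
    return result

row = {"full_name": "Ada Lovelace", "loyalty_tier": "Gold"}
mapped = map_to_standard(row, {"full_name": "Name"}, ["Name", "Email"])
# mapped == {"Name": "Ada Lovelace", "Email": None, "loyalty_tier": "Gold"}
```

Note how the result is still a valid standard entity (every standard field is present) while the unmapped source column survives as a custom field.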
Once you’ve completed your selections and your entity and its data settings are complete, you’re ready for the next step, which is selecting the refresh frequency of your dataflow.
Set the refresh frequency
Once your entities have been defined, you’ll want to schedule the refresh frequency for each of your connected data sources.
1. Dataflows use a data refresh process to keep data up to date. In the Power Platform Dataflow authoring tool, you can choose to refresh your dataflow manually or automatically on a scheduled interval of your choice. To schedule a refresh automatically, select Refresh automatically.
2. Enter the dataflow refresh frequency, start date, and time, in UTC.
3. Select Create.
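Scheduling amounts to picking a start time in UTC and a repeat interval; subsequent refreshes run at that interval from the start time. A quick sketch of how such a schedule yields run times (illustrative only, not the service’s implementation):

```python
from datetime import datetime, timedelta, timezone

def next_runs(start_utc, interval, count):
    """Yield the first `count` scheduled refresh times."""
    run = start_utc
    for _ in range(count):
        yield run
        run += interval

# A daily refresh starting at 06:00 UTC on 1 Jan 2024.
start = datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc)
runs = list(next_runs(start, timedelta(days=1), 3))
```

Because the schedule is expressed in UTC, refresh times do not shift with daylight saving changes in your local time zone.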