Different ways of getting Excel data using Power Automate


Anoop
7 min read · Jun 2, 2020

Introduction

In this blog post we will see 3 different ways (1 slow and 2 fast) of getting data from a table in an Excel file using Power Automate. We will start by looking at

  1. a Power Automate flow which uses simple actions but takes a long time to run,
  2. then a flow that combines simple actions with Pieter’s method and takes less time to run, and
  3. finally a flow that uses a couple of HTTP actions along with Pieter’s method and takes only a couple of seconds to run.

Credits

Pieter Veenstra and John Liu write amazing blogs on Power Automate. The content in their blogs is brilliant and helps a lot in creating fast Power Automate flows. We will be using the different techniques mentioned by them in this post.

Data that will be used

Our ultimate aim is to transform large Excel data into JSON. Say, for example, we have the Excel file below that has a table named tblWork with 1,500 rows of data (20k rows later), which shows 3 jobs (Painting, Gardening and Plumbing) that are completed in different houses.

We intend to get that data into 3 JSON objects as shown below — each representing the completed job with an array of ‘House number’, ‘Street’ and ‘City’. This data can then be used for further analysis/reporting.

Method 1 — Avoid if possible

A simple way to get the data is using the “List rows present in a table” action. We pass in the required details to this action, i.e. the location of the Excel file and the table (tblWork) in which the data is present, and we get the rows back as JSON.

The list rows action uses Microsoft Graph to get the data. So, to get the file dynamically, please refer to this post by John Liu.

We then loop through the rows (in parallel if needed) and, for each row, check whether Painting is not empty; if it is not, we collect the House number, Street and City for that row and append them to an array variable, as sketched below. We do the same for Gardening and Plumbing.
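As an illustrative sketch (the column names follow this post’s sample data, but the exact expressions in your flow may differ), the condition for Painting and the value appended to a varPainting array variable could look like this:

Condition:
not(empty(item()?['Painting']))

Value appended to varPainting:
json(concat('{"House number":"', item()?['House number'], '","Street":"', item()?['Street'], '","City":"', item()?['City'], '"}'))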

When the loop completes execution, the array variables will have the required rows and we can use, say, a Compose action to get the required data:

{"Painting":variables('varPainting')}

Time to execute

This is a simple method and works well with small data sets; however, if we have large data then the time this takes to execute is very long.

In the next methods we will see how to reduce the time taken for these actions.

Method 2

Let’s take a look at reducing the time taken by the ‘Apply to each row’ action. The main reason it takes so long is that the action uses variables inside it. As John Liu explains, whenever a task needs to be performed on a variable, Power Automate needs to obtain a lock on the variable, set its value and then release the lock. Hence we need a way of avoiding the variables.

To do that we will use the ‘Filter array’ and ‘Select’ actions, and we will get the required data using the brilliant Pieter’s method.

The flow

The flow explained below can be downloaded from the GitHub repository here.

We will start by using the “List rows present in a table” action to get the data from Excel. After that we create an array that represents the columns/headings in Excel:

["Painting", "Gardening work", "Plumbing"]

These column/heading names can be stored in a list in SharePoint or any other place if needed so that there wouldn’t be a need to edit the flow later.

Then we will loop through each of the columns/headings,

  1. filter the rows for which the current column is not empty and
  2. then select the House number, Street and City related to that row (example expressions for both actions are sketched below).
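As a rough sketch, assuming the loop over the headings array is the default ‘Apply to each’ (so items('Apply_to_each') returns the current column name; substitute your own loop name), the two actions could use:

Filter condition:
not(empty(item()?[items('Apply_to_each')]))

Select map:
House number: item()?['House number']
Street: item()?['Street']
City: item()?['City']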

This will give us the JSON array data for Painting, Gardening and Plumbing in that order, and it will be sorted in ascending order by row. We then use Pieter’s method to get the required JSON, i.e. we use the body of the Select action outside the loop.

When the Select action is referenced outside the loop, its outputs from all the iterations are concatenated. Since the loop runs for 3 columns/headings, the concatenated data will have 3 values. The first item of the concatenated data will be the Painting data:

{"Painting": body('Map_non_empty_values')[0]}

Similarly, for Gardening and Plumbing we need to get the second and the third item of the concatenated data i.e.

{ "Gardening": body('Map_non_empty_values')[1]}

{ "Plumbing": body('Map_non_empty_values')[2]}

By using Pieter’s method, we have reduced the execution time from 20 minutes to 1 second!

Extra info: In the “Apply to each column” loop, we can use the Select action first and then Filter the data if needed. We would instead need to write an expression like this in the Select action:

if(empty(item()?[outputs('current_column')]), null, json(concat('{"House number":"', item()?['House number'], '","Street":"', item()?['Street'], '","City":"', item()?['City'], '"}')))

Method 3

In the final method we will see how to reduce the time taken by the “List rows present in a table” action. We can use this method when the data we are dealing with is large and/or the execution time needs to be shorter. Note that this method uses the premium HTTP action along with user credentials.

The “List rows present in a table” action gets the rows present in a specified Excel worksheet table. We can instead call the Microsoft Graph API’s List Rows endpoint directly to get that data. (The other option is to use the range data, as Pieter mentions here.)

Since this endpoint requires delegated Files.ReadWrite permissions, we need a delegated access token. To get one, we register an application in Azure AD and grant that app registration the Files.ReadWrite delegated permission.

We will be using an Excel file in OneDrive as the source, so we will need an account with OneDrive and the credentials of that account. (Note: I did try to use SharePoint as the source for this API; however, I was not able to get the data from that file using the method below. Comments related to this are welcome.)

The flow

The flow explained below can be downloaded from the GitHub repository here.

The first step in the flow is to call the HTTP action with the following settings
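As a rough sketch, this HTTP action makes a resource owner password credentials (ROPC) request to the Azure AD token endpoint; the tenant ID, client ID, username and password below are placeholders for your own values:

Method: POST
URI: https://login.microsoftonline.com/{tenant-id}/oauth2/v2.0/token
Headers: Content-Type: application/x-www-form-urlencoded
Body: grant_type=password&client_id={client-id}&scope=https://graph.microsoft.com/Files.ReadWrite&username={username}&password={password}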

This step will return a delegated access token which, among other things, lets us access data in files stored in OneDrive.

We then use another HTTP action to call the Microsoft Graph List Rows endpoint, passing this access token in the Authorization header.
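As an indicative example (the file path and the name of the previous HTTP action are assumptions for illustration; substitute your own):

Method: GET
URI: https://graph.microsoft.com/v1.0/me/drive/root:/Book.xlsx:/workbook/tables/tblWork/rows
Headers: Authorization: Bearer @{body('HTTP_Get_Token')?['access_token']}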

This will return the data as JSON, a sample of which is shown below, and we are interested in the “values” data of each row.
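Roughly, each row object returned by Graph has an index and a values property (a two-dimensional array containing the single row), along these lines (illustrative values only):

{ "value": [
  { "index": 0, "values": [["1", "High Street", "London", "Yes", "", ""]] },
  { "index": 1, "values": [["2", "High Street", "London", "", "Yes", ""]] }
] }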

In the values data we know that the columns appear in the order house number, street, city, painting, gardening and plumbing. Using zero-based indexing, this means

  1. the value of “Painting” is the data at index 3,
  2. the value of “Gardening” is the data at index 4,
  3. the value of “Plumbing” is the data at index 5.

So we need to check the empty condition at these indices. To do that we create an array with these indices, [3,4,5], loop through it and check whether the row value at the current index is empty. Once we have filtered the values, all we need to do is use a Select action to map the data.
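As a rough sketch, assuming the loop over [3,4,5] is named ‘Apply_to_each_index’ (substitute the actual loop name), the filter condition on each Graph row could be:

not(empty(item()?['values'][0][items('Apply_to_each_index')]))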

In the Select action, House number will be obtained from item()['values'][0][0], Street will be obtained from item()['values'][0][1] and City will be obtained from item()['values'][0][2].

We then use Pieter’s method outside the loop, like we did earlier, to get the required JSON.
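For example, assuming the Select action in this flow is named ‘Select_rows’ (substitute the actual action name):

{"Painting": body('Select_rows')[0]}

{"Gardening": body('Select_rows')[1]}

{"Plumbing": body('Select_rows')[2]}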

Execution time

With these actions, the time to read the table rows is now reduced to about 1 second. That is, only about 1 second to read roughly 1,500 rows of data!

If we increase the data in the Excel file to, say, 20k rows, then by using this method we get the required data in approximately 10 seconds!

Conclusion

We should try to avoid using ‘for each’ loops with variables inside them whenever possible. Instead we can use the Filter array and Select actions, which are fast and help us get the required data in a few seconds. Combining those with Pieter’s method helps us make our flows really fast.


Anoop

Microsoft MVP. M365 Developer Architect at Content+Cloud.