Power Automate: How to parse a CSV file

Today I answered a question in the Power Automate Community, and one of the members posted an interesting question. How to parse a CSV file and get its elements? I see this question asked a lot, but the problem is always to use the external component X or Y, and you can do it.

New template to parse a CSV file without premium connectors

I wrote a new template, and there’s a lot of new stuff. You can now define if the file has headers, define what’s the separator character(s) and it now supports quotes.
It’s a huge upgrade from the other template, and I think you will like it. You can find the detail of all the changes here.
I’ll leave both links below so that you can follow the steps in this article, but if you want to jump to the new one, go right ahead.

Although some of the components offer free tiers, being dependent on an external connection to parse information is not the best solution. You may not be able to share it because you have confidential information or have the need to parse many CSV files, and the free tiers are not enough. Finally, we depend on an external service, and if something changes, our Power Automates will break.

The motivation

I want to answer this question with a complete answer. This will benefit the overall community, so I decided to build a CSV parser using only Power Automate’s actions.

The overall idea is to parse a CSV file, transform it into a JSON, and collect the information from the JSON by reference.

For example, if we have the file:

name,date
Manuel, 12-12-2020
Gomes, 13-12-2020

Teixeira, 12-1-2020

I inserted the space on purpose, but we’ll get to that.

We’ll get the following JSON:

[
  {
    "name": "Manuel",
    "date": " 12-12-2020"
  },
  {
    "name": "Gomes",
    "date": " 13-12-2020"
  },
  {
    "name": "Teixeira",
    "date": " 12-1-2020"
  }
]

And then, we can do a simple “Apply to each” to get the items we want by reference. Looks nice.

The template

The template may look complicated, but it isn’t. I’ll explain step by step, but here’s the overview.

You can trigger it inside a solution by calling the “Run Child Flow” and getting the JSON string. If you don’t know how to do it, here’s a step-by-step tutorial.

The trigger

The trigger is quite simple. We need to provide two parameters:

  1. The path of the file in OneDrive. With this, we make the Power Automate generic
  2. It’s important to know if the first row has the name of the columns. With this information, we’ll be able to reference the data directly.

Fetching the file

With the parameter in the trigger, we can easily fetch the information from the path.

The variables

The variables serve multiple purposes, so let’s go one by one.

Each Row

In this one, we break down the file into rows and get an array with the information.

The formula is quite simple:

split(outputs('Get_file_content_using_path')?['body'],'')

It’s not an error in the return between ‘. There would be the temptation to split by “,” but, for some reason, this doesn’t work. If you apply the formula above, you’ll get:

[
  "name,date",
  "Manuel, 12-12-2020",
  "Gomes, 13-12-2020",
  "",
  "Teixeira, 12-1-2020"
]

Looks good.

Helper variables

I use the other variables to control the flow of information and the result. You’ll see them in action in a bit.

The parsing

OK, let’s start with the fun stuff. Here we want to:

  1. Fetch the first row with the names of the columns.
  2. Check if we have at least two lines (1 for the column names and one with data)
  3. Get an array for each row separated by ‘,’
  4. Check if the array is not empty and has the same number of columns as the first one. If you have more or less, then we cannot do the mapping, for example:
[
  "name,date",
  "Manuel, 12-12-2020", this one doesn't have a column name
  "Gomes, 13-12-2020",
  "",
  "Teixeira, 12-1-2020"
]
  1. Get the row elements.
  2. Build the JSON element, for example:
{
    "name": "Manuel",
    "date": " 12-12-2020"
  }
  1. Add that to a JSON string (variable created above)

Looks complex? Well, a bit, but at least makes sense, right?

Check the column names and min requirements.

The first two steps we can do quickly and in the same expression. To check the number of elements of the array, you can use:

length(variables('EACH_ROW'))

Parsing the headers and checking the information

Now that we know that we have the headers in the first row and more than two rows, we can fetch the headers. To do so:

split(variables('EACH_ROW')[0],',')

We get the first element and split it by our separator to get an array of headers. Then we start parsing the rows. If it’s the beginning, then we don’t do anything because it contains the headers, and we already have them.

To check if the row has the same number of elements as the headers (second clause of the if), we have the following formulas:

Several elements in the headers:

length(variables('Headers'))

Several elements in the current item:

length(split(items('Apply_to_each'),','))

Now let’s parse the information.

Parsing the remaining rows

First, we get the array with the elements of the row split by ‘,’

split(items('Apply_to_each'),',')

Then we upgrade the iterator since we’re already parsing another row.

The application to each is a little bit more complicated, so let’s zoom in.

Generating the element

Since each row has multiple elements, we need to go through all of them.

First, let’s ad the start of the value with an if statement. This will check if we’re in the beginning and add an “{” or nothing.

if(equals(variables('CSV_ITERATOR'),0),'{','')

The formula is a concatenation:

concat('"',variables('Headers')[variables('CSV_ITERATOR')],'":"',items('Apply_to_each_2'),'"')

Let’s break it down:

  1. Add a “
  2. Go to position X of the headers and get the name and the current item. Generates
{"name": "Manuel"

We need to increase the element by one. The next column to parse and corresponding value. So that we can generate the second column and the second record:

"date": " 12-12-2020"

Now let’s bring them together:

if(equals(length(outputs('Array_with_each_element_in_the_CSV_line')),variables('CSV_ITERATOR')),'}',',')

Here we’re checking if we’re at the end of the columns. If we are, we close the element with }. Otherwise, we add a “,” and add the next value.

In the end, we get:

{
    "name": "Manuel",
    "date": " 12-12-2020"
  }

Closing the loop

Since it’s so complicated, we added a compose with the formula so that, in run time, we can check each value and see if something went wrong and what it was.

Finally, we reset the column counter for the next run and add what we get to the array:

if(equals(length(variables('EACH_ROW')),variables('ITERATION')),']',',')

If it’s the last line, we don’t add a “,” but close the JSON array “]”

Return the JSON

All we need to do now is return the value, and that’s it.

Final thoughts

Please note that you can, instead of a button trigger, have an “HTTP trigger.” With this, you can call this Power Automate from anywhere. Both the “HTTP trigger” and “Response” are Premium connectors, so be sure that you have the correct account.

If you want to call this, all you need to do is the following:

Call the Power Automate and convert the string into a JSON:

json(triggerBody()['text'])

Then all you have to do is go through all values and get the information that you need.

Again, you can find all of this already done in a handy template archive so that you can parse a CSV file in no time. There are other Power Automates that can be useful to you, so check them out. Download this template directly here. If you don’t know how to import a template, I have a step-by-step here.

Updates:
2021-05-10: New template and a new article explaining its features.
2020-11-17: Many thanks to dyl for spotting a missing formula in the template.

Have a suggestion of your own or disagree with something I said? Leave a comment or interact on Twitter and be sure to check out other Microsoft Power Automate-related articles here.

Photo by AbsolutVision on Unsplash

Manuel Gomes

I'm a Project Manager with experience in large projects and companies. I've worked in the past for companies like Bayer, Sybase (now SAP) and I'm currently working for Pestana Hotel Group.

View all posts by Manuel Gomes →
Loading comments...
%d bloggers like this: