Azure Functions: CSV to JSON

Recently I had to create an SFTP server and update tables in Azure SQL. I decided to use an Azure Logic App to check for new files, convert the data file from CSV to JSON, and then insert the rows into the database using the SQL Connector and the Insert Row action.

Azure Logic App using Azure Function CSVToJSON

If your files are small, you can call the CSV to JSON function from an Azure Logic App to process each file line by line. You could design your entire workflow without having to write any other code!

What are Azure Functions?

Azure Functions are a serverless offering that runs your code without you having to set up infrastructure resources. They automatically scale to support the workload and you only pay for them when they run. In addition, the first 1 million executions are FREE!

If you have ever created or used a third-party API, you can think of each API call as its own Azure Function. It is a way of creating a microservice architecture.

How can I use Azure Functions?

  • From other Azure Functions - This is useful to separate components and to make them scalable.
  • Azure Logic Apps - This is useful when you do not need to write code to perform a set of logical and repeatable steps. In the process flow, you may need to call an Azure Function to handle some specific set of logic.
  • Azure Web Apps/API - You may already have an Azure Web application or API that needs some new functionality, like resizing images. You could write a function to do this and call it from those applications.
  • JavaScript Code - You have a Web application or Mobile application that needs to send you error logs. You can write an Azure Function to accept the log contents and process them.
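To make the last two scenarios concrete, here is a minimal sketch of an HTTP-triggered Azure Function that a web or mobile app could post data to. The function name `EchoFunction` and the payload handling are illustrative assumptions, not part of the source project, and the in-process (`Microsoft.Azure.WebJobs`) programming model is assumed.

```csharp
using System.IO;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Extensions.Logging;

public static class EchoFunction
{
    //A minimal HTTP-triggered Azure Function (in-process model).
    //"EchoFunction" is a hypothetical name used only for this sketch.
    [FunctionName("EchoFunction")]
    public static async Task<IActionResult> Run(
        [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequest req,
        ILogger log)
    {
        //Read the raw request body posted by the caller.
        string body = await new StreamReader(req.Body).ReadToEndAsync();
        log.LogInformation("Received {Length} bytes.", body.Length);

        //Echo the content back as JSON.
        return new OkObjectResult(new { echoed = body });
    }
}
```

A caller (a Logic App HTTP action, another function, or client-side JavaScript) simply POSTs to the function's URL with its function key.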

The Code

The source code may be found here on GitHub.

This is the class that will be our return object. This will have the rows of the original CSV data and the source file name.

    
    using System.Collections.Generic;

    public class JsonResult
    {
        public JsonResult(string fileName)
        {
            Rows = new List<object>();
            FileName = fileName;
        }

        public string FileName { get; set; }
        public List<object> Rows { get; set; }
    }
    
    

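As a quick sanity check of what this class produces, here is a hedged sketch of serializing a `JsonResult` with Newtonsoft.Json. The camel-case contract resolver is an assumption made so the property names match the lowercase `fileName`/`rows` fields shown in the sample output later in this post; the sample row values are illustrative.

```csharp
using System.Collections.Generic;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
using Newtonsoft.Json.Serialization;

//Build a JsonResult (the class shown above) with one sample row.
var result = new JsonResult("MyTestCSVFile.csv");
var row = new JObject();
row["ID"] = "1";
row["Name"] = "Aaron";
row["Score"] = "99";
result.Rows.Add(row);

//Camel-case settings are assumed here so that FileName/Rows
//serialize as "fileName"/"rows".
string json = JsonConvert.SerializeObject(result, new JsonSerializerSettings
{
    ContractResolver = new CamelCasePropertyNamesContractResolver()
});
```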
This is an excerpt of the code from the Run method of our function class CSVToJSON. This shows how we instantiate the JsonResult object and add the data using the Rows property.

    
    string[] csvLines = ToLines(csvData);

    var headers = csvLines[0].Split(fieldSeperator).ToList<string>();
    JsonResult resultSet = new JsonResult(fileName);

    foreach (var line in csvLines.Skip(rowsToSkip))
    {
        //Check to see if a line is blank.
        //This can happen on the last row if improperly terminated.
        if (line.Trim().Length > 0)
        {
            var lineObject = new JObject();
            var fields = line.Split(fieldSeperator);

            for (int x = 0; x < headers.Count; x++)
            {
                lineObject[headers[x]] = fields[x];
            }

            resultSet.Rows.Add(lineObject);
        }
        else
        {
            lineSkipCounter += 1;
        }
    }
    
    

Finally, this is the function that converts the CSV data from one large string into a string array representing the rows of data in the original source.

    
    private static string[] ToLines(string dataIn)
    {
        char[] EOLMarkerR = new char[] { '\r' };
        char[] EOLMarkerN = new char[] { '\n' };
        char[] EOLMarker = EOLMarkerR;

        //Check to see if the file has both \r and \n end of line markers.
        //Common for files coming from Windows systems.
        if (dataIn.IndexOf('\n') > 0 && dataIn.IndexOf('\r') > 0)
        {
            //If we find both, just remove one of them.
            dataIn = dataIn.Replace("\n", "");
        }
        //If the file only has \n then we will use that as the EOL marker to separate the lines.
        else if (dataIn.IndexOf('\n') > 0)
        {
            EOLMarker = EOLMarkerN;
        }

        return dataIn.Split(EOLMarker);
    }
    
    

Function Input

The JSON the function accepts consists of the following fields.

rowsToSkip - This indicates the number of rows to skip in the conversion process.

fileName - This is the source file where the data came from. This is passed in for downstream processes that may need to know which file was processed.

csv - This is the raw data from the source file that you want converted to JSON. This content may contain \r\n or \n end of line markers. The function will detect them and process the CSV data accordingly.

    
    {
        "rowsToSkip": 1,
        "fileName": "MyTestCSVFile.csv",
        "csv": "ID,Name,Score\n1,Aaron,99\n2,Dave,55\n3,Susy,77\n"
    }
    
    

Function Output

The function will return the JsonResult object.

fileName - This is the source file where the data came from. This is passed in for downstream processes that may need to know which file was processed.

rows - list of the CSV data in JSON format, with the field name from the header row.

    
    {
        "fileName": "MyTestCSVFile.csv",
        "rows": [
            {
                "ID": "1",
                "Name": "Aaron",
                "Score": "99"
            },
            {
                "ID": "2",
                "Name": "Dave",
                "Score": "55"
            },
            {
                "ID": "3",
                "Name": "Susy",
                "Score": "77"
            }
        ]
    }
    
    

Final Thoughts

If you have larger files, like I did, you can use SQL OPENROWSET for bulk inserts or use the SqlBulkCopy class to upload the data. The processing of the CSV data was similar, and I used an Azure Function, BulkUpload, for that as well. I will cover that in another blog post.
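For reference, here is a minimal sketch of the SqlBulkCopy approach. The connection string, table name `dbo.Scores`, and column layout are placeholder assumptions for illustration, not details from the BulkUpload function.

```csharp
using System.Data;
using System.Data.SqlClient;

//Build a DataTable whose columns mirror the target SQL table.
//The table name and columns here are hypothetical.
var table = new DataTable();
table.Columns.Add("ID", typeof(int));
table.Columns.Add("Name", typeof(string));
table.Columns.Add("Score", typeof(int));
table.Rows.Add(1, "Aaron", 99);
table.Rows.Add(2, "Dave", 55);

using (var connection = new SqlConnection("<your-connection-string>"))
{
    connection.Open();
    using (var bulkCopy = new SqlBulkCopy(connection))
    {
        //Destination table must already exist in the database.
        bulkCopy.DestinationTableName = "dbo.Scores";
        bulkCopy.WriteToServer(table);
    }
}
```

Unlike row-by-row inserts through the Logic App SQL Connector, a single WriteToServer call streams all rows to SQL Server in one bulk operation.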

Azure Logic App Using Azure Function BulkUpload