Please note javascript is required for full website functionality.


Power Query: Combining Forces

24 July 2019

Welcome to our Power Query blog. Today, I look at a way to reduce overheads when combining numerous files.

I looked at combining files from a folder in Power Query: One Folder, One Query. In that query, I used the Graphical User Interface (GUI) functions available to combine files.

To get to this point, I used the ‘From Folder’ option which is in the dropdown from ‘From File’, located in the ‘New Query’ dropdown in the ‘Get & Transform’ section of the ‘Data’ tab in Excel.

However, there is another path I can take if I am concerned about the overhead of combining lots of files from a folder. Instead of combining the files, I can choose to ‘Transform Data’.

I don’t need the columns of my data at this point; all I want is to keep the Content column. I select this column and right click to remove the other columns. Now I have a single column, I choose to ‘Drill Down'.

This gives me a list of files:

This is useful to me because it means I can use an alternative method to combine my files, namely Table.Combine():

Table.Combine(tables as list, optional columns as any) as table

This returns a table that is the result of merging a list of tables.  The resulting table will have a row type structure defined by columns or by a union of the input types if columns is not specified.

Before I combine my files, there is another step I can take, using the M function List.Transform():

List.Transform(list as list, transformation as function)  as list 

This performs the transformation on each item in the list and returns the new list.

The particular transformation I want to use is Binary.Buffer:

Binary.Buffer(binary as nullable binary) as nullable binary

This buffers the binary value in memory.  The result of this call is a stable binary value, which means it will have a deterministic length and order of bytes.

What I am doing is to make the storage of my binary files more efficient.  I can apply the required M code to the formula bar.

This has no impact visually, since the function is working on the way the binaries are stored in memory.

I now have a list of binaries which are stored efficiently.  In order to use Table.Combine(), I need a list of tables.  The easiest way to achieve this is to create a transformation that List.Transform() can use.  I do this by creating a function.  I save my current query as ‘Combining_Files’ and create a new blank query.

My function is a version of Csv.Document() which receives a parameter:

Csv.Document(source as any, optional columns as any, optional delimiter as any, optional extraValues as nullable number, optional encoding as nullable number) as table

The M code I have created is:

=(salesfile as binary) => Csv.Document(salesfile, [Encoding=1252])

The “1252” encoding is explicitly requiring single-byte characters that are the defaults for English and other Western languages’ versions of Microsoft Windows. 

Having created my function, I can apply it to Combining_Files.

Once I click to execute this step, my column is transformed:

I now have a list of tables, and I can use Table.Combine().

When I execute this step, my files are combined.

I can now transform my data. I can now apply the usual transformations to remove blank rows / headers and fill down where necessary.

Come back next time for more ways to use Power Query!