Power Query: Keep on Running
3 February 2021
Welcome to our Power Query blog. This week, I revisit running totals.
I looked at running totals some time ago, Power Query: One Route to a Running Total. I used a combination of List() M functions, namely the List.Range() and List.Sum() functions. I did this so that I could get a list of all the amounts to the current point to be totalled and then simply add them up.
My example used some accounting data, so I will return to it once more:
I have extracted data from ACCT_Order_Charges_with_Group, a table on my MS Access database. I picked this data because there are more than 30,000 rows, which will be useful as I plan to look at how efficiently running totals can be calculated.
In order to use the List.Range() function, I needed a way to sequentially identify the rows containing the amounts so that they form a list. To assign a number to each row, I added an index column from the ‘Add Column’ tab, starting from 1 and not 0 (the default).
I needed to tell the List.Range() function to look at this list, and where to start reading from and (in this case) where to stop. I created a new custom column from the ‘Add Column’ tab.
The M code I used was:
= List.Range(#"Added Index"[Amount],0,[Index])
This looked at the amounts sequentially by row number, starting at the first row and ending with the index for the current row. The column created looks a little odd, as it just contains the word ‘List’ for each row (see below). However, if I clicked in the column next to (not on) the word ‘List’, then the contents appeared at the bottom left of the screen under the title ‘List’:
The next step was to add them all up. This is where List.Sum() came into play. The only parameter that List(Sum) requires is a list. Therefore, I created another custom column, with the following M code:
= List.Sum(List.Range(#"Added Index"[Amount],0,[Index]))
I loaded this data back into my workbook, but as I noted in the original blog, the load is not instantaneous. This is because for each running total calculation, List.Range() creates a list of all the amounts to that point and then List.Sum() sums them. I can add another list function which will speed up the process (subject to constraints which I will expand on later).
List.Buffer(list as list) as list
This buffers the list in memory. The result of this call is a stable list. This basically means that the list does not need to be recalculated, and in this case, can be added to.
I look at the M code I have in the Advanced Editor (I removed the custom column Running Total List as it was just to show how List.Range() worked).
The M code for the lines associated with the running total is:
#"Added Index" = Table.AddIndexColumn(#"Inserted Multiplication", "Index", 1, 1),
#"Added Custom" = Table.AddColumn(#"Added Index", "Running Total", each List.Sum(List.Range(#"Added Index"[Amount],0,[Index])))
I add a List.Buffer() step after the “Added Index” step:
BufferedValues = List.Buffer(#”Added Index”[Amount]),
Then, I change the “Added Custom” step to use BufferedValues instead of “Added Index”[Amount]
#"Added Custom" = Table.AddColumn(#"Added Index", "Running Total", each List.Sum(List.Range(BufferedValues,0,[Index])))
I make these changes in the Advanced Editor:
I click ‘Done’ and check my data.
The totals still look good, so I load the data to the workbook. I had issues with the time taken for this in the original blog – but no more!
There is an improvement, but this is not an ideal solution. If I had a query which used query folding for efficiency, then folding would be prevented because I have used a buffer function. Next week I will look at another, more complex way of creating a running total…
Come back next time for more ways to use Power Query!