# Monday Morning Mulling: August 2023 Challenge

28 August 2023

*On the final Friday of each month, we set an Excel / Power Pivot / Power
Query / Power BI problem for you to puzzle over for the weekend. On the Monday, we publish a solution. If you think there is an alternative answer,
feel free to email us. We’ll feel free
to ignore you.*

The challenge this month was to find favourite stores and most recently visited stores by customer from sales data, using only one [1] Excel formula for each task.

*The Challenge*

Last Friday, we challenged you to find out the favourite store of each customer using only one [1] Excel formula, and to find the last store that each customer visited with, again, “just” one [1] formula. You could download the question file here.

The
table **Data** contained the customer visits data, and the desired outputs were
to look like the following upon completion:

There were some requirements:

- each formula needed to be within one cell
- this was a formula challenge; no Power Query / Get & Transform or VBA.

** **

*Suggested Solution for Task 1: Most
Frequently Visited Store*

You can find our Excel file here, which shows our suggested solution. The steps are detailed below.

**COUNT THE VISITS**

We can begin by using **UNIQUE**** **and **TRANSPOSE** to get the lists of customers and stores from **Data**, and place them as
row and column headers for our array of counts.
The formula

**=UNIQUE(Data[Customer])**

provides an unsorted list of the customers down a column. The formula

**=TRANSPOSE(UNIQUE(Data[Store]))**

provides a list of the stores across a row
(hence the need for the **TRANSPOSE** function, which transforms a column vector to a row vector).

The number of visits by customer to a store
can be found using **COUNTIFS** with **Data[Store]** and **Data[Customer]** as the criteria ranges. Also, be careful with table references and
remember to use **CTRL + D** and **CTRL + R** to copy the formula down
and to the right. Then **COUNTIFS** produces the following matrix of counts:

Moreover, if we use the column headers
(stores) in the first criteria argument of **COUNTIFS** instead of single stores, and the row headers (customers) in the second
criteria argument, we can obtain the whole matrix as a spilled
array using only one formula:

**=COUNTIFS(Data[Store], F10#, Data[Customer],
E11#)**

This is the Dynamic Array equivalent of a PivotTable.

**FIND THE FAVOURITE
STORES**

Given the matrix of visit counts, we can
look for the most frequently visited store by each customer and output the
corresponding store name. An **XLOOKUP** formula for the maximum of each row with store names being the return array can
achieve that:

**=XLOOKUP(MAX($F11:$J11),
$F11:$J11, F$10#)**

Here it’s necessary to anchor the column
titles **F10#** at the row, to fill down for all customers. The output would look like the following:

So how do we look up for all customers in
one [1] formula? We can use the **BYROW** function. The **BYROW** function has two [2] arguments: an array and a **LAMBDA** function:

**=BYROW(array,
LAMBDA(parameter, calculation))**

The **LAMBDA** function takes each row of the array as its parameter and performs the
specified calculation on it. Given the
above matrix **F11# **of visit counts, we can use a **BYROW** function on the whole matrix, and write the **XLOOKUP** as the **LAMBDA** function inside:

**=BYROW(F11#,
LAMBDA(r, XLOOKUP(MAX(r), r, F10#)))**

Then, the **XLOOKUP** will be performed row-by-row (hence the name **BYROW**). For interested readers we have written blogs
with more details on **BYROW** and **LAMBDA**.

**COMBINE THE
FORMULAE**

In the previous two [2] steps, we first
obtained the matrix **F11#** of visit counts using **COUNTIFS** with two-dimensional conditions, and then we used **BYROW** on **F11#** to perform an **XLOOKUP** for the most frequently visited store by each customer. The two [2] steps can be combined by writing
the first formula in place of **F11# **inside the second formula:

**=BYROW(****COUNTIFS(Data[Store], F10#, Data[Customer], E11#),LAMBDA(r, XLOOKUP(MAX(r), r, F10#)))**

The **COUNTIFS** matrix is only a helper, so instead of referencing the column headers (**F10#**) and row headers (**E11#**), we can substitute
with the original **UNIQUE**** **and **TRANSPOSE** formulas:

**=BYROW(COUNTIFS(Data[Store],
TRANSPOSE(UNIQUE(Data[Store])),Data[Customer], UNIQUE(Data[Customer])),LAMBDA(r, XLOOKUP(MAX(r), r, TRANSPOSE(UNIQUE(Data[Store])))))**

For better readability, we can also use a **LET** function to name some variables for intermediate steps:

**=LET(Store, TRANSPOSE(UNIQUE(Data[Store])),Customer, UNIQUE(Data[Customer]),Matrix, COUNTIFS(Data[Store], Store, Data[Customer], Customer),BYROW(Matrix, LAMBDA(r, XLOOKUP(MAX(r), r, Store))))**

Here we name the intermediate
matrix **Matrix**, and the final calculation of the **LET** is a **BYROW** on **Matrix** to look-up the **MAX** from each row.

*Suggested Solution for Task 2: Most
Recently Visited Store*

You can find our Excel file __here__ which demonstrates our suggested solution.

**STARTING SIMILARLY…**

To find the lastly visited store by each
customer, we can use an approach very similar to the previous solution, by
sorting visit dates of each customer to each store with a **MAXIFS**:

**=MAXIFS(Data[Date],
Data[Store], F$10, Data[Customer], $E11)**

Again, use **CTRL + D** and **CTRL + R** to fill the formula down and to the right.
Then the output will look like the following:

We can look up the latest visit of each customer to return the corresponding store. Furthermore, following the same process we can build up a final formula:

**=LET(Store, TRANSPOSE(UNIQUE(Data[Store])),Customer, UNIQUE(Data[Customer]),Matrix, MAXIFS(Data[Date], Data[Store], Store, Data[Customer], Customer),BYROW(Matrix, LAMBDA(r, XLOOKUP(MAX(r), r, Store))))**

**A DIFFERENT APPROACH**

The above approach is arguably not optimal
for the second task, in the sense that it is sorting dates twice. The **MAXIFS** first sorts visit dates for each customer to each store, and then the **XLOOKUP** sorts the latest visits to each store by each customer. Conceptually, the optimal algorithm would be
sorting all visits of each customer by dates, and then returning the
corresponding stores of those latest visits.
That is, instead of a two-dimensional problem, this task is a two-layer
problem.

We can use the combination of **FILTER**, **SORT** and **INDEX**. For example, for the customer ‘Sam Ngo’, we
can get his most recently visited store with the following formula:

**=INDEX(SORT(FILTER(Data, Data[Customer] = "Sam
Ngo"), 1, -1), 1, 3)**

Here **FILTER** first filters for all store visits of this
customer. Then **SORT** sorts the first column, **Date**, in descending order, so the first
row of the spilled
array will contain the date, customer name and the
store location of the latest store visit of ‘Sam Ngo’. Lastly **INDEX** outputs the cell in the first row and the
third column (**Store**) of the table.

To cover all customers
in one formula, we can again use **BYROW**:

**=BYROW(UNIQUE(Data[Customer]), LAMBDA(name,
INDEX(SORT(FILTER(Data, Data[Customer]=name), 1, -1), 1, 3)))**

**COMPARING THE
APPROACHES**

The advantage of the
above approach is that **FILTER** preserves rows of the table **Data**.
Consider instead if we were to use **MAXIFS** instead of **FILTER** and **SORT**, to replicate the same idea of sorting **Date** for each **Customer** first and then return the corresponding **Store**. For each customer we would start with:

**=MAXIFS(Data[Date],
Data[Customer], $F11)**

Then the outcome would be a single date value, which is not necessarily unique:

That means we can’t
use the sorted-out dates alone to look up the corresponding **Store**. We have no choice but to combine it with the
list of customers. One way is to use the
ampersand (**&**) to concatenate a unique **Customer**-**Date** label, to look up from the table **Data**:

**=INDEX(Data[Store], MATCH($F11&" – "&$G11,
Data[Customer]&" – "&Data[Date], 0))**

In comparison, **FILTER** preserves the rows, and hence preserves the **Date**-**Customer**-**Store** connection. We just need to specify the
index to sort and to output.

*The Final Friday
Fix will return on Friday 29 September 2023 with a new Excel Challenge. In the meantime, please look out for the
Daily Excel Tip on our home page and watch out for a new blog every business
working day.*