Please note javascript is required for full website functionality.

Blog

Power Query: Comparing Lists

11 April 2018

Welcome to our Power Query blog. This week, I look at how to use the List function to compare lists.

In Power Query: If You Can’t Tell Them Apart, Join Them, I looked at how to compare data in different queries. Another way to do this is to extract the data to lists. I will start with the same example that I used in that blog.

I want to do three things:

  1. I want to see who is in Employee List 1 but not in Employee List 2
  2. I want to see who is in both lists
  3. I want to combine the lists.

1. Who is missing from Employee List 2? 

To find this out, I will create two lists in Power Query and use List.Difference to detect entries in the first list that do not appear in the second list. 

List.Difference(list1 as list, list2 as list,optional equationCriteria as any) as list

 

This returns the items in list 1 that do not appear in list 2.  Duplicate values are supported. 

On the whole, this is easy to follow.  I must enter two lists which will be compared.  When it says, “duplicates are supported”, this means that if ‘John Smith’ appears twice in list 1, but not at all in list 2, then ‘John Smith’ will appear twice in my results.  If I don’t want this, then I need to remove duplicates before I compare my lists.  I can either do this in the Query Editor interface or use List.Distinct.  The optional third parameter is equationCriteria, which I don’t need for my example.  If I wanted to ensure that upper and lower-case names were treated the same, then I could use this feature.  A detailed description with examples of how to use an equality equation is given on the Microsoft help page for the function List.Distinct

I begin by creating a list of names from Employee List 1.  My data is not already in a table in my worksheet, so a new table will be created when I choose the ‘From Table’ option in the ‘Get and Transform’ section of the ‘Data’ tab, with my all the data in the FULL_NAME column selected (but not the heading).

I am going to save this data in a query called ‘Name_List_1’:

I convert my data to a list using the option in the ‘Any Column’ section of the ‘Transform’ tab, and save it as connection only.

I do the same for my other list, so that I have ‘Name_List_2’:

I can now create a new blank query, ready for me to compare my lists.

The results are presented as a list of those names that appear in ‘Name_List_1’ but not ‘Name_List_2’.  The syntax I have used is: 

= List.Difference(Name_List_1, Name_List_2)

If there had not been anyone on the first list that wasn’t on the second, then the answer would have been a blank list.

2. Who is on Both Lists? 

This time I will use List.Intersect

List.Intersect(list as list,optional equationCriteria as any) as list 

This returns a list from a list of lists and intersects common items in individual lists.  Duplicate values are again supported. 

The first thing to notice about this function is that it refers to one list in the syntax, but that in fact is a ‘list of lists’.  This means that I can check multiple lists, not just two.  The equationCriteria parameter is optional again which would allow me to control what is classed as equal. 

I begin by creating a blank query, so that I can try out this function against the two lists I have already created.

The syntax I have used is: 

= List.Intersect({Name_List_1, Name_List_2})

I could easily to extend this to intersect a list of more lists.  In my case, I have a list of those employees on both lists.  Each employee only appears once in my case, however, if an employee appeared on both lists twice, then they could appear here twice.  If no one appeared on both lists, then the list would be empty.

3. Combine My Lists into One List 

For this I will use the List.Union function. 

List.Union(list as list,optional equationCriteria as any) as list

This returns a list from a list of lists and creates a union of the items in the individual lists.  The returned list contains all items in any input lists.  Duplicate values are matched as part of the union. 

This is fairly similar to the List.Intersect functionality, except that I want to join the lists instead of finding what is common to them.

The syntax I have used this time is: 

= List.Union({Name_List_1, Name_List_2})

which has joined the data in my lists.  Each of the names appears once unless a name appears twice on both lists.  I can extend this to include more lists by adding lists to the ‘list of lists’.

Want to read more about Power Query? A complete list of all our Power Query blogs can be found here.  Come back next time for more ways to use Power Query!

Newsletter