Please note javascript is required for full website functionality.


Power Query: Exceptional Cases

24 April 2019

Welcome to our Power Query blog. Today, I am going to look at how to protect specific data when transforming text case.

My imaginary salespeople have been providing me with information about possible sales contacts. Whilst their selling skills are excellent, their typing skills are not. I need to tidy up the data.

My first step is to upload the data to Power Query using the ‘From Table’ option on the ‘Get & Transform’ section of the ‘Data’ tab.

I can take the defaults to extract my data.

I begin by creating a new column which will convert the comments text to a ‘proper’ format. I am keeping my original column to check the results.

I can use the ‘From Text’ section in the ‘Add Column’ tab to create a new formatted version of the selected column. I choose to ‘Capitalize Each Word’.

This has improved some of my data, but other parts would be better in their original format, for example ‘USA’.  To tackle this, I am going to remove the ‘Inserted Capitalize Each Word’ step, and create a list of each word in Company.  (For more on List() functionality in M code, please see Power Query: Birthday Lists.

I have added a step using the following M code:

= Table.AddColumn(#"Changed Type", "Company_List", each Text.Split([Company], " "))

This splits each word in the Company column into a small list embedded in Company_List.  The next step is to create a new column of lists which I will convert to ‘Capitalize Each Word’ format.  This will be how I will recognise which words to substitute back to their original format.

The M code I have used is:

= Table.AddColumn(#"Custom1", "Company_Proper_list", each List.Transform([Company_List],Text.Proper)) 

This query is now ready to be substituted.  I save this query and go back to Excel to create a list of words that I want to keep in their original format:

I have a list of the words I want to preserve, now I need to extract these to Power Query using ‘From Table’ as before.

I will create a new column with the text formatted to ‘Capitalize Each Word’ so that I know which words I am looking for in the original query.

Now I am going to rename Capitalize Each Word to From, and Preserve to. I also swap the order and give my table a more useful name.

I am almost done; in order to substitute in a list, my new query must also be a list:

The M code I have used is:

= Table.AddColumn(#"Reordered Columns", "Substitute_List", each ({[From],[To]}))

I can get rid of the other columns and just have a list of lists, ready to format my original query.

I can now convert this to a list from the ‘Transform’ tab.

I go back to my first query and I can substitute using the ‘Substitute Company’ list query.

I can see the other list query, and I use it to create a new column.

The M code I have used is:

= Table.AddColumn(#"Custom2", "Company_Correct_List", each List.ReplaceMatchingItems([Company_Proper_list], Substitute_Company))

The last thing I need to do is glue my company information back together.

The M code I have used is:

= Table.AddColumn(#"Custom3", "Correct_Company", each Text.Combine([Company_Correct_List], " "))

I can remove the list columns I used along the way and I am left with the company data in the correct format.

Come back next time for more ways to use Power Query!