List Functions: an Elegant Way to Manipulate Text

***TIP: Subscribe to our newsletter. Don’t miss any new content.***

List functions can become your favorite set of functions if you often need to manipulate text (aka “keywords”). You are now probably wondering what “list” function even means in Power Query.

Basically, a “list” is a special object whose elements can be easily manipulated separately.

So let’s have a taste of what can be achieved. Assume you have these 4 keywords:

  • “I want to buy bananas”
  • “I want to buy apples”
  • “buy apples”
  • “buy bananas”

Your tasks are:

  1. From every keyword, pull the first 2 words
  2. From every keyword, pull the 2nd word
  3. Sort all words within the keyword in alphabetical order
  4. From every keyword, pull all the words starting with “w”
  5. From every keyword, pull all the words containing “b”
  6. In every keyword, always capitalize only the second word

The basic steps steps are:

  1. Convert the keyword to a list via Text.Split function
  2. Use list function(s) of your choice to manipulate the list to achieve the desired transformation
  3. Convert the list back to text with Text.Combine function

Text.Split Function

Text.Split function converts text to list.

Example: Text.Split([Keyword],” “)

First argument is the text which needs to be manipulated, usually a column. The second argument is the delimiter which will be used for splitting the keyword into the list of words. So it needs to be a space in our case.

How does it look in Power Query?

Updated on 2017-09-30: To be on safe side, you may want to Text.Lower() function first so all the text in [Keyword] is converted to lower case. Cases matter in e. g. in List.Sort function which will be used later. The function would then be written as Text.Split(Text.Lower([Keyword]),” “).

As usual, you are adding a custom column. What is the result?

Hmm… It’s a list. 🙂

Yes, it looks a bit weird. If you click into the “list” cell but not on the word “List”, you will see what each list contains:

Now, you are able to manipulate all these items in each row via list functions. The full list of list functions is here.

So let’s start with task 1:

From every keyword, pull the first 2 words

Function: Text.Combine(List.FirstN(Text.Split([Keyword],” “),2),” “)

Function in detail (using javascript notation in PSPad for easier visualization):

Result:

How does it work? You need to start from the middle:

  1. Text.Split splits the keyword by spaces.
  2. List.FirstN with 2 in argument selects the first 2 items from the list.
  3. Text.Combine joins the selected words back to “text” by using space.

From every keyword, pull the 2nd word

Function:
Text.Combine(List.Range(Text.Split([Keyword],” “),1,1),” “)

Result:

The tricky part here is the List.Range function. List.Range will select items starting at Xth offset. Note that counting starts at 0 offset, not at 1! That’s first argument. The second argument is used to tell how many items from the Xth offset to select. In my case, it is just 1 because I want only 1 word.

So what if you wanted to pull only the first word? The function would be:

Text.Combine(List.Range(Text.Split([Keyword],” “),0,1),” “)

Sort all words within the keyword in alphabetical order

Function:
Text.Combine(List.Sort(Text.Split([Keyword],” “)),” “)

Result:

List.Sort is useful step in deduplicating keywords with contain words in different word order 🙂

From every keyword, pull all the words starting with “w”

Level: Advanced

The function:
Text.Combine(List.Select(Text.Split([Keyword],” “), each Text.StartsWith(_,”w”)),” “)

Now back to function in detail:

The important piece here is the List.Select where you need to define the condition for selecting the right words. Don’t forget to use “each” and “_” to tell Power Query that it should operate over the “just created list”.

Result:

From every keyword, pull all the words containing “b”

Yes, this will be similar as the previous example. The only piece which needs to change is the “Text.StartsWith” part to “Text.Contains”.

The function:

Result:

Detail:

In every keyword, always capitalize only the second word

Level: Pro

Function:

——————–

if List.Count(Text.Split([Keyword],” “))>=2 then

Text.Combine(
List.ReplaceRange(Text.Split([Keyword],” “),1,1,

List.Transform(
List.Range(
Text.Split([Keyword],” “),
1,1),
each Text.Upper(_))
),
” “)

else “Keyword has less than 2 words”

——————–

Result:

Explanation: I will leave that to you 🙂

Summary

List functions are super powerful functions to manipulate text within “cells”. If you passed your “Power Query beginner” stage, definitely start learning List functions.

Some other useful List functions not mentioned in this article are:

  • List.First, List.Last
  • List.Distinct
  • List.Contains, List.ContainsAny
  • List.Count, List.IsEmpty

The sample file with all the functions applied is here.

That’s all!

DZ

***TIP: Subscribe to our newsletter. Don’t miss any new content.***

2 thoughts on “List Functions: an Elegant Way to Manipulate Text

  1. Great post! I really like the way you lay out the info and really go into the detail of why it’s doing what it’s doing!

    As I was following along one thing drove me crazy… the section, “Sort all words within the keyword in alphabetical order”
    I got how the function is operating, however the result was not what I expected:
    I bananas buy to want
    At first thought that is not alphabetical, however after searching I find that List.Sort, sorts first by Upper Case, then by Lower Case.

    I’m sure there are other ways around this but here’s one, add one more layer to the function, Text.Lower:
    Text.Combine(List.Sort(Text.Split(Text.Lower([Keyword]),” “)),” “))

    1. Hi Ted,
      Glad you find it useful. You are right, it did not occur to me when writing the article but I also tend to use Text.Lower(column) just to be on the safe side when working on real projects. I will update the article.

      DZ

Leave a Reply

Your email address will not be published. Required fields are marked *