Ever wondered how to iterate over a series of objects in Power Query? Let’s have a look on that. In this article, I will be covering how to create loops/iterations in Power Query. In our sample example, we will have a simple flat table with 3 columns: Country, Variant, Clicks. The goal is to determine which variant wins=has the highest number of clicks in all the 4 countries. The example is based on 13 rows only but it would be no different if there were 13,000 rows with more countries and variants.
You might be already thinking that creating loops can help to e. g. easy select winning ads across several thousand ad groups when it comes to PPC. You are right 🙂
Note that there is no special …for…in… statement in Power Query unlike in Python, for example. The main trick in Power Query is to use “Group” function wisely and to apply the right set of functions over the grouped objects.
Creating loops is relatively easy task with only few critical steps:
- Get your data
- Group as AllRows on desired level
- Apply custom function OR series of nested functions over the grouped tables.
- Expand the results
The critical step is #3 – this is where your needs have to be translated into formulas. My need is to see which variant has the highest number of clicks in every country, so we need to sort grouped subtables by clicks on country level in descending order. The second steps is to rank the rows in the sorted tables by adding an index column. The index column will tell us which variant is #1 row with the highest number of clicks.
#1 Getting the Data
In my example, I am working with a simple table which resides in the Excel file itself. However, you can create loops over e. g. 1,000 CSV or TXT or entire Excel files in similar fashion (leave a comment below the article if you are interested to see that).
#2 Grouping Your Dataset
The first step is to group your dataset on the desired level. Since I need to find the best performing variant in every country, I group only by “Country” and select option “All Rows”. This steps creates a column of “subtables” where every subtable contains data for all the variant in that specific geo:
#3 Applying Function(s) to Subtables
I will be using 2 nested formulas to do the trick, not a custom function. (If you are interested in custom function option, learn more about custom functions here). The goal is to sort the subtables by clicks and then add an index column.
I am using Table.Sort() function which takes a table as first argument and then sorting criteria as list of lists as second argument. So in my case, the column “GroupedRows” contains the subtables on country level, so I can use this column as first argument in the Table.Sort function.
Few comments about arguments of Table.AddIndexColumn:
- Table – the table where you want to add the index column – in my case, it’s the Table.Sort…. piece of code = the already sorted table by clicks in the descending order.
- New Column Name – how the new index column should be called – in my case, the name of new column is “Rank”.
- Starting number – the first index number in the series
- Increment – increment for every subsequent row.
When adding the nested formulas, it’s always good to preview the results by closing the “Custom column window” and clicking into few random rows within the new column. The results will be shown in the bottom section of the PQ window:
#4 Expanding the Results
So there you have it, we looped through X number of subtables & easily decided which is the winning variant in every country. This whole setup would take you 30 minutes max to develop according to this article.
Obviously, I am demonstrating the loop scenario on a very simple example – it could get way more complicating than this and you may need to use a custom function to do to the “looping” step.
The sample file from this article is here.