Finally, BigQuery users have option to save their custom functions so they can be re-used across the project. Up until now, you could only create a UDF function within a query but you could not save the query as a view for example. Luckily, the times are changing now. But why should you care?
Creating Persistent User Defined Functions in BigQuery
The problem with this approach is that when you need to do the same classification across many queries and views, you’d have to:
- Copy this CASE statement everywhere you want use this classification
- Edit all CASE statements in multiple places in case your classification is changing
Nowadays though, you can:
- Create a custom classification function.
- Save it just once.
- Reuse it across as many queries as you want.
- Propagate changes everywhere in case you edit your custom function.
Now, let’s decompose the elements of the statement:
- CREATE FUNCTION – you tell BQ that you want to create a new function. Alternatively, you can use CREATE FUNCTION IF NOT EXISTS.
- `test-project-excelinppccom.udfs.fruitFinder` – that’s where you want to save the function. It reads as `project.dataset.functionName`
- (some_string STRING) – that’s the argument of the function. In my case, it’s just 1 string. You can have no arguments, 1 argument, or multiple arguments. This depends on your use case. Basically, the argument is the “value” you want to somehow evaluate or act on with your custom function.
- AS ( …. ) part – that’s essentially the similar CASE which does the actual magic. Unlike in the standard SELECT … FROM …, I’m calling the argument instead of a regular column.
You will then see confirmation “This statement created a new function….” and the function will also appear in the dataset which you assigned it to.
You can now use the function in as many queries as you want 🙂