2024 Week 19 | Sigma: Can you clean a string?

Introduction

After last week’s awesome product launch, I was thinking about some of the Sigma features that we have not yet highlighted in Workout Wednesday. One of those is Custom Functions, and while the ability to create Custom Functions is currently reserved for admins, it’s a great feature for users to be aware about when you find yourself typing out / copying and pasting the same function repeatedly.

Let’s look at a puzzling example using the RegexReplace function, this involves cleaning up numbers that came through as strings, because many different numeric formats exist in the world of data. They lurk in the shadows, waiting to break your queries. We can debate about whether you should be doing this level of cleaning in Sigma or in the datawarehouse (likely the latter), but you may find this skill useful in your Sigma journey

The first part of this is going to be an intro to Regex. Regex is short for ‘Regular Expressions’, and Regex is typically used to search / match / replace strings. Regex is not everyone’s cup of tea, I know, but is *incredibly* useful to have in your pocket. The second part shows you how this complicated logic can be reduced down to a simple function by your admins.

If you are interested in learning more about Regex, I recommend https://regexlearn.com for an interactive guide. And if you’re way too excited about Regex: https://regexcrossword.com/

Need access to Sigma?

Note: You will only have view access to WOW Workbooks, Folders, and Workspaces, not edit access. Please create your WOW Workbooks under “My Documents.” We suggest creating a folder to organize all your workbooks. 

Requirements

  1. The setup
    • Create an input table and add the following values as individual rows of the input table:
      • 1,000,000.00
      • 1.000.000,00
      • 1’000’000,00
      • 1 000 000,000 000
      • 1 000 000.000 000
      • 1_000_000.000
    • Rename the column ‘String Number’
    • In the input table, add a new column of type ‘Calculation’, rename that to ‘Actual Number’
  2. [SPOILERS AHEAD] Creating your RegexpReplace() function
    • If you’d like to try to solve this on your own, know that you’ll be nesting Regex functions (or writing one really complex regex function)
    • First, paste this function in the function bar: RegexpReplace([String Number], ” “, “”)
      • This just replaces all spaces with no space, you should see rows 4 and 5 change
    • Next, we’re going to nest that first function in some other logic. You can paste the following: RegexpReplace(RegexpReplace([String Number], ” “, “”), “[,’_\\.]”, “,”)
      • This does our first step, then replaces any instances of apostrophes, underscores, or literal periods with commas
      • This step might also seem counter-intuitive, why would we change *everything* to a comma. Hold tight, we’ll get there.
    • Next, we’re going to nest the previous steps inside another regex function. You can copy and paste this over your existing function: RegexpReplace(RegexpReplace(RegexpReplace([String Number], ” “, “”), “[‘_\\.]”, “,”), “,(\\d+$)”, “.\\1“)
      • This step finds the comma that is followed by the last 1 or more digits (the dollar sign signals ‘end of a string’). It replaces that with a period and the ‘first capture group’.
    • Almost done! Lastly, we’re going to add one more nesting level. You can copy and paste this over your existing function: RegexpReplace(RegexpReplace(RegexpReplace(RegexpReplace([String Number], ” “, “”), “[‘_\\.]”, “,”), “,(\\d+$)”, “.\\1“), “,”, “”)
      • This step takes all the remaining commas and replaces them with nothing (effectively deleting them)
    • Now, wrap that whole function in a ‘Number()’ function to convert it into a number, this brings your final, unreadable function to: Number(RegexpReplace(RegexpReplace(RegexpReplace(RegexpReplace([String Number], ” “, “”), “[‘_\\.]”, “,”), “,(\\d+$)”, “.\\1“), “,”, “”))
    • Figuring out the Regex took some trial and error, and there may be better ways to do this (if you have a way, please share!), and if your head is spinning, that’s okay. Because working with Regex is not the point of this challenge
  3. Time to call an admin to create a Custom Function
    • We’ve added a custom function called StringToNumeric, and it’s time to take it for a spin
    • Create a new calculation column in the same input table (or replace the logic in the current one)
    • Use this function: StringToNumeric([String Number])
  4. Check that your function worked properly by looking at the ‘Actual Number’ column in your field picker on the left side of the screen. If it has a ‘123’, you’ve got a number!

Dataset

None, you’re creating an input table!

Share

After you finish your workout, share on LinkedIn, Sigma’s Community page, (or Twitter) using the hashtags #WOW2024 and #SigmaComputing, and tag Ashley Bennett, Eric Heidbreder, Katrina Menne, and Michal Shaffer!

Create an interactive, sharable version of your solution here.

Also, make sure to fill out the Submission Tracker so that we can count you as a participant this week to track our participation throughout the year.

Solution

Coming Soon!

Scroll to Top