This week I started doing Advent of Code for the first time ever. I looked at it last year, but felt intimidated by it, along with being really busy having 5 odd jobs at the time. This year Clean Coders has me pretty well-equipped to tackle the problems and actually have fun doing them, so I’ve been waking up an hour earlier to get part 1 knocked out, and doing part 2 before bed every night. I’ve been really enjoying it because some of the puzzles force you out of your comfort zone in coding a little bit.

Yesterday’s problem required me to get a little more comfortable with regular expressions (regex). While I’m still no expert on the subject, it definitely opened taught me quite a bit more than I knew before. While using Clojure, one function seemed to help more than ever while using regex, so I figured it’s worth a blog post about.

What is re-seq?

re-seq is a function in Clojure that scans a string for matches to a given regular expression and returns them as a lazy sequence. Its simplicity and flexibility make it a go-to tool for working with text.

(re-seq regex-pattern string)
  • regex-pattern: A regular expression, defined using Clojure’s #” syntax or java.util.regex.Pattern.
  • string: The input string to search.

The result is a lazy sequence of matches.

How Does re-seq Work?

When re-seq processes a string, it looks for all substrings that match the given pattern. If the pattern contains capturing groups, re-seq returns a sequence of vectors, where the first element is the full match and subsequent elements are the contents of the capturing groups.

Examples of re-seq in Action

1. Basic Matching

Let’s start with a simple example: finding all the digits in a string.

(re-seq #"\d" "a1b2c3")
;; => ("1" "2" "3")
2. Using Capturing Groups

Capturing groups let you extract specific parts of a match. For example, extracting key-value pairs:

(re-seq #"(\w+):(\d+)" "name:42 age:30")
;; => (["name:42" "name" "42"] ["age:30" "age" "30"])

Here, each match is a vector:

  • The first element is the full match (“name:42”, “age:30”).
  • The second and third elements are the contents of the capturing groups (“name” and “42”, etc.).
3. Working Without Capturing Groups

If the pattern doesn’t use capturing groups, re-seq returns the full matches as strings.

(re-seq #"\w+" "Clojure is fun!")
;; => ("Clojure" "is" "fun")

Tips and Tricks

  1. Use Capturing Groups Wisely: If you don’t need capturing groups, avoid them to keep results simpler.
  2. Combine with Higher-Order Functions: map, filter, and reduce pair beautifully with re-seq.
  3. Escape Special Characters: Regular expressions have special characters (e.g., ., *, +). Escape them with \ when needed.