Regular Expressions in Go
Guide to Go regular expressions: compiling patterns, matching, searching, finding submatches, replacing, and validating input with groups, quantifiers, anchors, and character classes.
Introduction to Regular Expressions
A regular expression (regex) is a pattern used to match, search, validate, or replace text. In Go, regex support lives in the `regexp` package.
1. What is a Regular Expression?
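A regex describes the shape of the text you are looking for (for example, four digits, a hyphen, then two digits) rather than one fixed string.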
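As a minimal sketch of the idea, the package-level `regexp.MatchString` function can check whether a string contains a date-shaped substring. The helper name `looksLikeDate` and the sample inputs are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// looksLikeDate (hypothetical helper) reports whether s contains a
// substring shaped like YYYY-MM-DD. It checks the shape only, not
// whether the date is valid.
func looksLikeDate(s string) bool {
	matched, err := regexp.MatchString(`\d{4}-\d{2}-\d{2}`, s)
	return err == nil && matched
}

func main() {
	fmt.Println(looksLikeDate("released 2024-01-15")) // true
	fmt.Println(looksLikeDate("released last year"))  // false
}
```

The package-level function recompiles the pattern on every call, so it suits one-off checks; repeated use is usually better served by a compiled `*regexp.Regexp`.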
Creating Regular Expressions
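Use `regexp.Compile` or `regexp.MustCompile`. `Compile` returns the compiled `*regexp.Regexp` together with an error value; `MustCompile` panics if the pattern is invalid, which makes it convenient for patterns that are fixed in the source code, such as package-level variables.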
1. MustCompile for Constant Patterns
Panics on invalid pattern; good for literals.
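A common arrangement, sketched here, compiles a literal pattern once into a package-level variable. The names `hexColorRE` and `isHexColor` are illustrative assumptions.

```go
package main

import (
	"fmt"
	"regexp"
)

// hexColorRE is compiled once when the package is initialized.
// An invalid literal would panic at startup instead of failing later.
var hexColorRE = regexp.MustCompile(`^#[0-9a-fA-F]{6}$`)

// isHexColor (hypothetical helper) reports whether s is a six-digit
// hex color such as "#1a2B3c".
func isHexColor(s string) bool {
	return hexColorRE.MatchString(s)
}

func main() {
	fmt.Println(isHexColor("#1a2B3c")) // true
	fmt.Println(isHexColor("#12345"))  // false
}
```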
2. Compile with Error Handling
Use Compile when the pattern is dynamic or may be invalid.
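When the pattern comes from user input or configuration, one possible approach is to wrap `regexp.Compile` and return the error to the caller. The function name `compileUserPattern` and the sample patterns are assumptions for this sketch.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileUserPattern (hypothetical helper) compiles a pattern that is
// not known in advance and wraps any syntax error with context.
func compileUserPattern(pattern string) (*regexp.Regexp, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, fmt.Errorf("invalid pattern %q: %w", pattern, err)
	}
	return re, nil
}

func main() {
	// The second pattern has an unclosed group and fails to compile.
	for _, p := range []string{`a+b`, `a(b`} {
		re, err := compileUserPattern(p)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("compiled:", re.String())
	}
}
```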
Regex Flags (Inline Modes)
Go does not use separate flag arguments for regex; instead, you embed mode modifiers in the pattern itself, such as `(?i)` for case-insensitive or `(?m)` for multiline.
1. Case-insensitive Mode (?i)
Use (?i) at the start of the pattern for case-insensitive matches.
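A short sketch of case-insensitive matching follows; `greetingRE` and `findGreeting` are names chosen for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// greetingRE matches "hello" in any mix of upper and lower case.
var greetingRE = regexp.MustCompile(`(?i)hello`)

// findGreeting (hypothetical helper) returns the first match exactly
// as it appears in s, or "" when there is none.
func findGreeting(s string) string {
	return greetingRE.FindString(s)
}

func main() {
	fmt.Println(findGreeting("Say HeLLo there")) // HeLLo
	fmt.Println(findGreeting("goodbye") == "")   // true
}
```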
2. Multiline Mode (?m)
Use (?m) so ^ and $ match at the start and end of each line, not only at the start and end of the whole text.
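The difference can be seen by running the same pattern with and without the flag. The variable and function names below are assumptions for this sketch.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Without (?m), ^ matches only at the start of the text.
	textStartRE = regexp.MustCompile(`^\w+`)
	// With (?m), ^ also matches right after each newline.
	lineStartRE = regexp.MustCompile(`(?m)^\w+`)
)

// firstWordOfText (hypothetical) returns words found at the start of the text.
func firstWordOfText(text string) []string {
	return textStartRE.FindAllString(text, -1)
}

// firstWordOfEachLine (hypothetical) returns the leading word of every line.
func firstWordOfEachLine(text string) []string {
	return lineStartRE.FindAllString(text, -1)
}

func main() {
	text := "alpha one\nbeta two\ngamma three"
	fmt.Println(firstWordOfText(text))     // [alpha]
	fmt.Println(firstWordOfEachLine(text)) // [alpha beta gamma]
}
```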
Matching and Searching
The `regexp.Regexp` type has methods like `MatchString`, `FindString`, `FindAllString`, and `FindStringIndex` for searching.
1. MatchString (check match)
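Reports whether the string contains any match of the pattern. Anchor the pattern with ^ and $ to require that the whole string matches.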
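Because `MatchString` succeeds on any matching substring, an unanchored pattern and an anchored one can give different answers for the same input. The sketch below uses hypothetical helper names to show both.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	fiveDigitsAnywhereRE = regexp.MustCompile(`\d{5}`)
	fiveDigitsOnlyRE     = regexp.MustCompile(`^\d{5}$`)
)

// containsFiveDigits (hypothetical) is true if five digits appear anywhere in s.
func containsFiveDigits(s string) bool {
	return fiveDigitsAnywhereRE.MatchString(s)
}

// isFiveDigits (hypothetical) is true only if s is exactly five digits.
func isFiveDigits(s string) bool {
	return fiveDigitsOnlyRE.MatchString(s)
}

func main() {
	fmt.Println(containsFiveDigits("abc123456")) // true
	fmt.Println(isFiveDigits("abc123456"))       // false
	fmt.Println(isFiveDigits("90210"))           // true
}
```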
2. FindString and FindAllString
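Extract the first match with FindString (an empty string means no match) or up to n matches with FindAllString; pass -1 as n to get all of them.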
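As an illustration, the following sketch pulls hashtag-like tokens out of a sentence. The pattern and the helper names `firstTag` and `tags` are assumptions.

```go
package main

import (
	"fmt"
	"regexp"
)

var tagRE = regexp.MustCompile(`#\w+`)

// firstTag (hypothetical) returns the first hashtag in s, or "".
func firstTag(s string) string {
	return tagRE.FindString(s)
}

// tags (hypothetical) returns at most n hashtags; n = -1 means no limit.
func tags(s string, n int) []string {
	return tagRE.FindAllString(s, n)
}

func main() {
	s := "Learning #golang and #regex today #fun"
	fmt.Println(firstTag(s)) // #golang
	fmt.Println(tags(s, -1)) // [#golang #regex #fun]
	fmt.Println(tags(s, 2))  // [#golang #regex]
}
```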
3. FindStringIndex
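Find the byte offsets of the first match. The result is a two-element slice holding the start and the end (exclusive), or nil if there is no match.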
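A small sketch of using the returned offsets to slice the original string; `firstNumberSpan` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

var digitsRE = regexp.MustCompile(`\d+`)

// firstNumberSpan (hypothetical) returns the byte offsets of the first
// run of digits in s. ok is false when s contains no digits.
func firstNumberSpan(s string) (start, end int, ok bool) {
	loc := digitsRE.FindStringIndex(s)
	if loc == nil {
		return 0, 0, false
	}
	return loc[0], loc[1], true
}

func main() {
	s := "item 42 of 100"
	if start, end, ok := firstNumberSpan(s); ok {
		fmt.Println(start, end, s[start:end]) // 5 7 42
	}
}
```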
Replacing Text with Regex
Use `ReplaceAllString` or `ReplaceAllStringFunc` to replace matches. These are useful for sanitizing or reformatting strings.
1. ReplaceAllString
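Replace all matches with a replacement string. The replacement is a template, not a pure literal: `$1` or `${name}` expands to the text of a captured group. Use `ReplaceAllLiteralString` when the replacement must be inserted verbatim.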
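The sketch below shows three cases: a plain replacement, a replacement that refers to captured groups, and `ReplaceAllLiteralString` for text containing `$`. All helper names and sample strings are assumptions.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	spacesRE = regexp.MustCompile(`\s+`)
	nameRE   = regexp.MustCompile(`(\w+) (\w+)`)
	numRE    = regexp.MustCompile(`\d+`)
)

// collapseSpaces (hypothetical) turns each run of whitespace into one space.
func collapseSpaces(s string) string {
	return spacesRE.ReplaceAllString(s, " ")
}

// lastFirst (hypothetical) rewrites "First Last" as "Last, First"
// using ${n} references to the captured groups.
func lastFirst(s string) string {
	return nameRE.ReplaceAllString(s, "${2}, ${1}")
}

// maskNumbers (hypothetical) inserts the text "$N" verbatim. With
// ReplaceAllString, "$N" would be read as a reference to a group named N.
func maskNumbers(s string) string {
	return numRE.ReplaceAllLiteralString(s, "$N")
}

func main() {
	fmt.Println(collapseSpaces("a  b \t c")) // a b c
	fmt.Println(lastFirst("Ada Lovelace"))   // Lovelace, Ada
	fmt.Println(maskNumbers("cost: 42"))     // cost: $N
}
```

Writing group references as `${1}` rather than `$1` avoids ambiguity when the reference is followed by letters, digits, or underscores.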
2. ReplaceAllStringFunc
Transform each match using a function.
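When the replacement depends on the matched text, a function can compute it. This sketch doubles every number in a string; `doubleNumbers` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

var numberRE = regexp.MustCompile(`\d+`)

// doubleNumbers (hypothetical) replaces every run of digits in s with
// twice its value.
func doubleNumbers(s string) string {
	return numberRE.ReplaceAllStringFunc(s, func(match string) string {
		n, err := strconv.Atoi(match)
		if err != nil {
			// Values too large for int are left unchanged.
			return match
		}
		return strconv.Itoa(n * 2)
	})
}

func main() {
	fmt.Println(doubleNumbers("3 apples and 10 pears")) // 6 apples and 20 pears
}
```

The callback receives only the matched text, not the captured groups, so it fits transformations that need nothing beyond the match itself.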
Character Classes and Anchors
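Use `\d`, `\w`, `\s` and custom classes like `[aeiou]`. In Go these three shorthand classes match ASCII characters only: `\d` is `[0-9]`, `\w` is `[0-9A-Za-z_]`, and `\s` is `[\t\n\f\r ]`. Use `^` and `$` to anchor to the start or end of the string.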
1. Common Character Classes
Digits, word chars, whitespace, and custom sets.
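One way to get a feel for the classes is to count how many characters each one matches in the same input. The helper `countMatches` and the sample string are assumptions for this sketch.

```go
package main

import (
	"fmt"
	"regexp"
)

// countMatches (hypothetical) returns how many non-overlapping matches
// of pattern occur in s. The pattern is assumed to be valid.
func countMatches(pattern, s string) int {
	return len(regexp.MustCompile(pattern).FindAllString(s, -1))
}

func main() {
	s := "Go 1.22 is out!\n"
	fmt.Println(countMatches(`\d`, s))      // 3: 1, 2, 2
	fmt.Println(countMatches(`\w`, s))      // 10: letters and digits
	fmt.Println(countMatches(`\s`, s))      // 4: three spaces and a newline
	fmt.Println(countMatches(`[aeiou]`, s)) // 4: o, i, o, u
}
```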
2. Anchors (^ and $)
Match at start or end of string.
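The following sketch anchors a pattern at the start, at the end, and at both. The helper names are illustrative assumptions.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	startsWithGoRE = regexp.MustCompile(`^Go`)
	endsWithBangRE = regexp.MustCompile(`!$`)
	allDigitsRE    = regexp.MustCompile(`^\d+$`)
)

// startsWithGo (hypothetical) is true when s begins with "Go".
func startsWithGo(s string) bool { return startsWithGoRE.MatchString(s) }

// endsWithBang (hypothetical) is true when s ends with "!".
func endsWithBang(s string) bool { return endsWithBangRE.MatchString(s) }

// allDigits (hypothetical) is true when s consists only of digits.
func allDigits(s string) bool { return allDigitsRE.MatchString(s) }

func main() {
	fmt.Println(startsWithGo("Go is fun"))  // true
	fmt.Println(startsWithGo("I like Go"))  // false
	fmt.Println(endsWithBang("Go is fun!")) // true
	fmt.Println(allDigits("12345"))         // true
	// Without (?m), $ matches only at the very end of the text,
	// so a trailing newline prevents the match.
	fmt.Println(allDigits("12345\n")) // false
}
```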
Quantifiers and Groups
Quantifiers like `*`, `+`, `?`, and `{n,m}` control repetition. Parentheses create groups; `FindStringSubmatch` returns the whole match plus captured groups.
1. Basic Quantifiers
*, +, ?, and {n,m}.
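Each quantifier can be tried against a few inputs. In this sketch the hypothetical helper `fullMatch` wraps the pattern in `^(?:...)$` so that the whole string must match.

```go
package main

import (
	"fmt"
	"regexp"
)

// fullMatch (hypothetical) reports whether the entire string s matches
// pattern. The pattern is assumed to be valid.
func fullMatch(pattern, s string) bool {
	return regexp.MustCompile(`^(?:` + pattern + `)$`).MatchString(s)
}

func main() {
	fmt.Println(fullMatch(`ab*c`, "ac"))        // true: * allows zero b's
	fmt.Println(fullMatch(`ab+c`, "ac"))        // false: + needs at least one b
	fmt.Println(fullMatch(`ab+c`, "abbbc"))     // true
	fmt.Println(fullMatch(`colou?r`, "color"))  // true: ? makes u optional
	fmt.Println(fullMatch(`colou?r`, "colour")) // true
	fmt.Println(fullMatch(`\d{2,4}`, "123"))    // true: between 2 and 4 digits
	fmt.Println(fullMatch(`\d{2,4}`, "12345"))  // false: too many digits
}
```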
2. Capturing Groups and Submatches
Use parentheses and FindStringSubmatch.
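A sketch of pulling the parts of a date out of a string with three capturing groups; `parseDate` and its return shape are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

var dateRE = regexp.MustCompile(`(\d{4})-(\d{2})-(\d{2})`)

// parseDate (hypothetical) returns the year, month, and day of the
// first YYYY-MM-DD shaped substring in s. ok is false if none is found.
func parseDate(s string) (year, month, day string, ok bool) {
	m := dateRE.FindStringSubmatch(s)
	if m == nil {
		return "", "", "", false
	}
	// m[0] is the whole match; m[1], m[2], m[3] are the groups in order.
	return m[1], m[2], m[3], true
}

func main() {
	year, month, day, ok := parseDate("released on 2024-01-15")
	fmt.Println(year, month, day, ok) // 2024 01 15 true
}
```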
Using Regex for Validation
Regex is useful for quick format checks. For complex rules (e.g. full RFC email), specialized libraries may be better.
1. Simple Email Pattern
Basic email-like validation.
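The pattern below is one possible email-like check, not an RFC-compliant validator: it accepts a local part, an `@`, and a domain with at least one dot. The pattern itself and the name `isEmailLike` are assumptions for this sketch.

```go
package main

import (
	"fmt"
	"regexp"
)

// emailLikeRE is a deliberately simple approximation. It rejects some
// valid addresses and accepts some invalid ones.
var emailLikeRE = regexp.MustCompile(`^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$`)

// isEmailLike (hypothetical) reports whether s has a basic email shape.
func isEmailLike(s string) bool {
	return emailLikeRE.MatchString(s)
}

func main() {
	fmt.Println(isEmailLike("user@example.com"))         // true
	fmt.Println(isEmailLike("a.b+tag@sub.example.org")) // true
	fmt.Println(isEmailLike("user@localhost"))           // false: no dot in domain
	fmt.Println(isEmailLike("not an email"))             // false
}
```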