5 ways I use regex in Linux (and why they're so essential)


Quick: If you shout "regular expressions" in a crowd of Linux users, what happens?
Answer: Everyone will tell you the right way to use them, and every answer will be different.
Regular expressions -- often called regex -- are sequences of characters that define a search pattern in text. That makes them sound like a one-trick pony, but you'd be surprised at how useful these things are.
Regular expressions can be used for pattern matching, text processing, data validation, and much more.
The one caveat to using regular expressions is that they can become very complex -- almost to the point of being their own language. Once you get the hang of regex, you'll find them invaluable. There are things you can do with regular expressions that you can't do with anything else, and they make interacting with the command line or even bash scripts so much more powerful.
Let me highlight five different ways I use regular expressions.
First, let's talk about regex patterns.
What makes a regular expression pattern?
There are four basic concepts you need to understand about regular expressions:
- Literal characters match the exact character(s) specified (e.g., "hello" matches only "hello").
- Character classes group characters within a set (e.g., [a-zA-Z] matches any single letter, lowercase or uppercase).
- Pattern matching uses metacharacters that stand for a whole category of characters (e.g., \w matches any word character: a letter, digit, or underscore).
- Quantifiers specify the number of times a pattern should be matched (e.g., * matches 0 or more occurrences, + matches 1 or more occurrences).
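To make the difference between the two quantifiers concrete, here's a minimal sketch using grep. The three candidate strings are made up for this illustration, and the ^ and $ anchors simply force each pattern to cover the whole line.

```shell
# Made-up candidate strings, one per line.
candidates=$(printf '%s\n' 'ac' 'abc' 'abbbc')

# b* allows zero or more b characters, so "ac" matches as well.
star_matches=$(printf '%s\n' "$candidates" | grep -cE '^ab*c$')

# b+ requires at least one b, so "ac" is rejected.
plus_matches=$(printf '%s\n' "$candidates" | grep -cE '^ab+c$')

echo "ab*c matches $star_matches lines; ab+c matches $plus_matches lines"
```

The -c option makes grep print a count of matching lines instead of the lines themselves, which keeps the comparison easy to read.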
Here's an overview of the regular expression syntax:
- ^ - start of string
- \ - escape character
- . - any single character
- [a-z] - any lowercase letter from 'a' to 'z'
- [A-Z] - any uppercase letter from 'A' to 'Z'
- [0-9] - any digit from '0' to '9'
- $ - end of string
Here are some basic examples:
- ZDNET - Matches the literal string "ZDNET".
- [a-zA-Z] - Matches any single letter, uppercase or lowercase.
- \d{5} - Matches exactly 5 digits (e.g., 01234). With grep, \d requires the -P option; [0-9]{5} with -E is the portable equivalent.
- ^ZDNET$ - Matches only when the entire string is "ZDNET" (not just part of another string).
- (abc) - Groups characters together for capturing and referencing later.
- \S+ - Matches one or more non-space characters.
For example, I could search a text file (named test) for the string ZDNET with grep, like so:
grep ZDNET test
Let's say I have Hello, ZDNET! at the top of that file. The above command would print out:
Hello, ZDNET!
But what if I used ^ZDNET$? In the above example, ZDNET is a part of a longer string, so it would produce no results. If, on the other hand, there was a line that contained only ZDNET, I could find it with the command:
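grep '^ZDNET$' test
The $ character anchors the match to the end of the line, just as ^ anchors it to the start, so nothing may come before or after ZDNET. The single quotes keep the shell from interpreting any of the special characters.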
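If you'd like to try the anchored and unanchored searches side by side, here's a small sketch. It assumes a throwaway file created with mktemp; the file and the extra "Zip code" line are my own additions for illustration, not part of the test file above.

```shell
# Build a throwaway sample file (contents are assumptions for this demo).
demo_file=$(mktemp)
printf '%s\n' 'Hello, ZDNET!' 'ZDNET' 'Zip code 01234' > "$demo_file"

# Unanchored: counts every line that contains ZDNET anywhere.
unanchored=$(grep -c 'ZDNET' "$demo_file")

# Anchored: only a line that is exactly ZDNET; -n prefixes the line number.
anchored=$(grep -n '^ZDNET$' "$demo_file")

# [0-9]{5} is the portable spelling of \d{5}; -o prints only the matched text.
digits=$(grep -Eo '[0-9]{5}' "$demo_file")

echo "unanchored count: $unanchored"
echo "anchored match:   $anchored"
echo "five digits:      $digits"

rm -f "$demo_file"
```

Quoting the pattern is a habit worth keeping even when the shell would let an unquoted version through, because characters such as $, *, and | mean something to the shell as well.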
Now, let's look at five different ways I use regular expressions in Linux.
1. File management (with the help of grep)
I've already demonstrated how regular expressions can be used with the grep command. But it's important to know that you can supercharge file management in Linux from the command line by employing the grep command. With regex and grep, you can search for patterns in text files with either simple or very complex patterns.
You could also use regex for matching whitespace and punctuation. This can be helpful for finding extra spaces after punctuation so you can remove them. Here's an example:
grep -E ' [^a-zA-Z0-9\.\?\!]' input.txt
Let's break this down:
- grep - the command for searching and printing lines matching a pattern.
- -E - Enables extended regular expression syntax.
- ' - The first of the two single quotes that surround the regular expression.
- (a space) - A literal space character, which matches a single space.
- [^a-zA-Z0-9\.\?\!] - Matches any single character that is not a lowercase letter (a-z), uppercase letter (A-Z), digit (0-9), period (.), question mark (?), or exclamation mark (!). The ^ at the start of the brackets negates the set. Inside brackets, these punctuation characters don't actually need the backslashes.
- ' - The second of the two single quotes that surround the regular expression.
- input.txt - The file to search.
Let's say you have the following lines in that file:
Hello, ZDNET!  How has your day been?
ZDNET
My name is Jack. What's yours?
Only the first line contains a space followed by a character outside the bracketed set (here, the second of two spaces after the exclamation mark), so the output would include only that line.
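Here's a sketch of how finding and then fixing those extra spaces might look. The sample file is created with mktemp, its first line deliberately has two spaces after the exclamation mark, and the sed cleanup step is my own addition rather than part of the command above. I've also dropped the backslashes inside the brackets, since they aren't needed there.

```shell
# Throwaway sample file; line 1 has two spaces after the "!".
spaced_file=$(mktemp)
printf '%s\n' \
  'Hello, ZDNET!  How has your day been?' \
  'ZDNET' \
  "My name is Jack. What's yours?" > "$spaced_file"

# Find lines where a space is followed by something other than a
# letter, digit, period, question mark, or exclamation mark.
flagged=$(grep -nE ' [^a-zA-Z0-9.?!]' "$spaced_file")
echo "$flagged"

# Collapse any run of spaces after ., ?, or ! down to one space.
# \1 re-inserts the punctuation mark captured by the parentheses.
cleaned=$(sed -E 's/([.?!]) +/\1 /g' "$spaced_file")
echo "$cleaned"

rm -f "$spaced_file"
```

Because sed is run without -i here, the original file is left alone and the cleaned text only goes to the terminal, which is a safe way to check a pattern before committing to it.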
2. Text editing (with Vim)
Vim is a powerful text editor and includes support for regex patterns. My editor of choice (nano) offers only a regex toggle in its search-and-replace prompt, so when I need more serious regex editing, I have to get serious with Vim.
Let's say I have a file that contains the following text:
The old cat ran quickly.
But I saw an old woman walking down the path.
New albums are really cool!
What if you want to replace every "old" with "new"? That's simple to do by hand if the file is only three lines long, but if it's much larger, you could use the following substitute command (within Vim) to make that change automatically:
:%s/\<old\>/new/g
Here's the breakdown of that command:
- :%s - Runs the substitute command (s) on every line in the file (%).
- \<old\> - Matches the entire word "old", not just part of another word. The \< matches the start of a word, and the \> matches the end of a word.
- new - The replacement text.
- g - Global flag, which applies the substitution to every occurrence on a line rather than only the first.
3. Text editing (with find and sed)
Another method of text editing is with the sed command. This is another great option for searching and replacing in text files. Let's use the same example as above and replace old with new using the following command:
find . -name "*.txt" -exec sed -E 's/old/new/g' {} \;
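The find command locates every .txt file under the current directory, and the -exec option runs sed on each one ({} stands for the file name, and \; ends the command), which helps illustrate how powerful regular expressions can be when applied to many files at once. Two things to note: as written, sed prints the edited text to the terminal rather than changing the files (with GNU sed, add the -i option to edit them in place), and the pattern old has no word boundaries, so it would also change the "old" inside "cold" or "older".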
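Here's a sketch of an in-place version you can run safely in a scratch directory. It assumes GNU sed, where -i edits files in place and \< and \> mark word boundaries; BSD and macOS sed behave differently. The directory, file names, and sample sentences are all made up for this illustration.

```shell
# Scratch directory with two made-up text files.
work_dir=$(mktemp -d)
printf '%s\n' 'The old cat ran quickly.' 'It was a cold day.' > "$work_dir/a.txt"
printf '%s\n' 'My old laptop still runs Linux.' > "$work_dir/b.txt"

# -i edits each file in place (GNU sed).
# \<old\> matches "old" only as a whole word, so "cold" is left alone.
find "$work_dir" -name '*.txt' -exec sed -E -i 's/\<old\>/new/g' {} \;

result_a=$(cat "$work_dir/a.txt")
result_b=$(cat "$work_dir/b.txt")
echo "$result_a"
echo "$result_b"

rm -rf "$work_dir"
```

Since -i overwrites files with no undo, it's worth running the command without -i first (or passing a backup suffix such as -i.bak) to confirm the pattern does what you expect.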
4. Network configuration with the ip command
Let's say I have a machine with multiple networking cards attached (which would indicate that it's a server connected to my LAN). There might be both internal and external network connections on the machine, and I only want to view the connections with IP addresses that start with 192.168.1. To do that, I use two commands and a regular expression. The two commands are ip and grep. The command looks like this:
ip addr | grep -Eo '192\.168\.1\.[0-9]{1,3}'
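In that pattern, each \. matches a literal dot, [0-9]{1,3} matches one to three digits, and -o tells grep to print only the matched text rather than the whole line. The sketch below runs the same pattern against a hypothetical, heavily trimmed sample of ip addr output (the interface names and addresses are invented), so you can see what it picks up without needing a second network card.

```shell
# Hypothetical, trimmed `ip addr` output; real output has more fields.
sample_ip_output='2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
    inet 192.168.1.42/24 brd 192.168.1.255 scope global eth0
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
    inet 10.0.0.5/24 brd 10.0.0.255 scope global eth1'

# Same pattern as above: it also picks up the broadcast (brd) address.
all_matches=$(printf '%s\n' "$sample_ip_output" \
  | grep -Eo '192\.168\.1\.[0-9]{1,3}')
echo "$all_matches"

# Require the "inet " prefix, then strip it, to keep only assigned addresses.
inet_only=$(printf '%s\n' "$sample_ip_output" \
  | grep -Eo 'inet 192\.168\.1\.[0-9]{1,3}' \
  | grep -Eo '[0-9.]+$')
echo "$inet_only"
```

The second pipeline is one possible refinement, not the only one; it simply shows how anchoring a pattern to nearby text can cut out matches you don't care about.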
5. Log viewing
I often follow logs with the tail command and can use regular expressions to see only what I need to see. For example, I might want to see only errors or warnings that appear in /var/log/syslog. I could simply tail that file (which will keep a real-time update of the last few entries written to the syslog file) like so:
tail -f /var/log/syslog
I would then have to comb through the output, looking for either error or warning. The better option would be to use regular expressions so that only entries with error or warning are displayed, which would use tail and grep like this:
tail -f /var/log/syslog | grep -E 'error|warning'
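One thing to watch for is that grep is case-sensitive by default, so a pattern of error|warning won't catch a line that says ERROR. Here's a sketch of the difference using a few invented log lines in place of a live syslog (the host names, timestamps, and messages are all assumptions for this demo).

```shell
# Invented log lines standing in for /var/log/syslog.
sample_log='Jan 10 09:00:01 host app[101]: started worker
Jan 10 09:00:02 host app[101]: warning: disk usage at 91%
Jan 10 09:00:03 host kernel: ERROR reading sector 12
Jan 10 09:00:04 host app[101]: request served'

# Case-sensitive: only the lowercase "warning" line is counted.
case_sensitive=$(printf '%s\n' "$sample_log" | grep -cE 'error|warning')

# -i ignores case, so the uppercase ERROR line is counted too.
case_insensitive=$(printf '%s\n' "$sample_log" | grep -ciE 'error|warning')

echo "case-sensitive matches:   $case_sensitive"
echo "case-insensitive matches: $case_insensitive"
```

Against a live log, the equivalent would be tail -f /var/log/syslog | grep -iE 'error|warning'. If you pipe grep's output into yet another command, GNU grep's --line-buffered option helps keep matches appearing as they happen.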
And that, my friends, is how I typically use regular expressions. I've only scratched the surface as to how they are used. If you're new to regular expressions, make sure to start small and build from there; otherwise, the confusion can mount quickly.