#!/bin/bash
INT=-5

if [[ "$INT" =~ ^-?[0-9]+$ ]]; then
  echo "INT is an integer."
else
  echo "INT is not an integer." >&2
  exit 1
fi

What does the leading ~ do in the starting regular expression?

Best Answer


The ~ is actually part of the =~ operator, which matches the string on its left against the extended regular expression on its right.

[[ "string" =~ pattern ]]

The string should be quoted, and the regular expression should not be. Any quoted part of the pattern is matched as a literal string rather than as a regular expression.
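To illustrate the effect of quoting the pattern, here is a minimal sketch, assuming bash 3.2 or later with default compatibility settings. The function names matches_regex and matches_literal are made up for this example.

```bash
#!/bin/bash
# Hypothetical helper names, used only for illustration.

# Unquoted pattern: interpreted as an extended regular expression,
# so the dot matches any single character.
matches_regex() {
  [[ "$1" =~ ^a.c$ ]]
}

# Quoted pattern: every character is matched literally,
# so the string must contain the three characters "a.c".
matches_literal() {
  [[ "$1" =~ "a.c" ]]
}

matches_regex "abc"     && echo "regex: abc matches"
matches_literal "abc"   || echo "literal: abc does not match"
matches_literal "xa.cx" && echo "literal: xa.cx matches"
```

The quoted form behaves like a substring test, which is rarely what is wanted when a regular expression was intended.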

Perl uses a similar operator, also written =~.

The regular expressions understood by bash are extended regular expressions, i.e. essentially the same set that GNU grep understands with the -E flag.
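As a small sketch of extended syntax in use (alternation with | and the + quantifier), the pattern below is stored in a variable and expanded unquoted so that it is still treated as a regular expression. The names re and classify are assumptions made for this example.

```bash
#!/bin/bash
# Hypothetical names: "re" holds the pattern, "classify" reports the result.
re='^(foo|bar)[0-9]+$'

classify() {
  # $re is deliberately left unquoted so it is used as a regular expression.
  if [[ "$1" =~ $re ]]; then
    echo "match"
  else
    echo "no match"
  fi
}

classify "foo42"   # prints: match
classify "bar"     # prints: no match (at least one digit is required)
classify "baz7"    # prints: no match (neither foo nor bar)
```

Keeping the pattern in a variable also avoids having to escape characters that are special to the shell inside [[ ... ]].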


Somewhat off-topic, but good to know.

When matching against a regular expression containing capturing groups, the part of the string captured by each group is available in the BASH_REMATCH array. The entry at index 0 corresponds to & in the replacement pattern of sed's substitution command (or $& in Perl), which is the part of the string that matches the whole pattern, while the entries at index 1 and onwards correspond to \1, \2, etc. in a sed replacement pattern (or $1, $2, etc. in Perl), i.e. the parts matched by each parenthesized group.

Example.

string=$( date +%T )

if [[ "$string" =~ ^([0-9][0-9]):([0-9][0-9]):([0-9][0-9])$ ]]; then
  printf 'Got %s, %s and %s\n' \
    "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}" "${BASH_REMATCH[3]}"
fi

This may output

Got 09, 19 and 14

if the current time happens to be 09:19:14.
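For a reproducible variant that also shows index 0, here is a sketch using a fixed string instead of the output of date. The string value and the variable names whole, hours, minutes and seconds are chosen only for illustration. The pattern is left unanchored so that the whole match differs from the input string.

```bash
#!/bin/bash
# Fixed input (an assumption for this example) so the result is predictable.
string="time=12:34:56 end"

if [[ "$string" =~ ([0-9][0-9]):([0-9][0-9]):([0-9][0-9]) ]]; then
  whole=${BASH_REMATCH[0]}     # the part matched by the entire pattern
  hours=${BASH_REMATCH[1]}     # first parenthesized group
  minutes=${BASH_REMATCH[2]}   # second parenthesized group
  seconds=${BASH_REMATCH[3]}   # third parenthesized group
  printf 'Whole match: %s\n' "$whole"
  printf 'Got %s, %s and %s\n' "$hours" "$minutes" "$seconds"
fi
```

This prints "Whole match: 12:34:56" followed by "Got 12, 34 and 56".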

The REMATCH bit of the BASH_REMATCH array name comes from "Regular Expression Match", i.e. "RE-Match".


In Bourne-like shells other than bash, one may also use expr for limited regular expression matching (basic regular expressions only). Note that expr always anchors the pattern at the start of the string.

A small example.

$ string="hello 123 world"
$ expr "$string" : ".*[^0-9]\([0-9][0-9]*\)"
123
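The integer test from the question could be approximated with expr along these lines. This is a sketch, not a drop-in replacement: the function name is_integer is made up here, and the leading "x" is a common idiom to keep expr from misreading values such as "-5" as something other than a plain string.

```sh
#!/bin/sh
# Hypothetical helper: succeeds if $1 is an optionally negative integer.
is_integer() {
  # Basic regular expression: optional "-", then one or more digits.
  # expr anchors at the start implicitly; the trailing $ anchors the end.
  # With no \( \) group, expr prints the match length, which is discarded.
  expr "x$1" : 'x-\{0,1\}[0-9][0-9]*$' >/dev/null
}

INT=-5

if is_integer "$INT"; then
  echo "INT is an integer."
else
  echo "INT is not an integer." >&2
fi
```

The exit status of expr is zero only when the match length is non-zero, which is what makes it usable directly as the condition.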