REGEXP(6)                                                    REGEXP(6)

NAME
     regexp - regular expression notation

DESCRIPTION
     A regular expression specifies a set of strings of characters.
     A member of this set of strings is said to be matched by the
     regular expression.  In many applications a delimiter
     character, commonly `/', bounds a regular expression.  In the
     following specification for regular expressions the word
     `character' means any character (rune) but newline.

     The syntax for a regular expression e0 is

          e3:  literal | charclass | '.' | '^' | '$' | '(' e0 ')'

          e2:  e3
           |   e2 REP

          REP: '*' | '+' | '?'

          e1:  e2
           |   e1 e2

          e0:  e1
           |   e0 '|' e1

     A literal is any non-metacharacter, or a metacharacter (one of
     .*+?[]()|\^$), or the delimiter preceded by `\'.

     A charclass is a nonempty string s bracketed [s] (or [^s]); it
     matches any character in (or not in) s.  A negated character
     class never matches newline.  A substring a-b, with a and b in
     ascending order, stands for the inclusive range of characters
     between a and b.  In s, the metacharacters `-', `]', an initial
     `^', and the regular expression delimiter must be preceded by a
     `\'; other metacharacters have no special meaning and may
     appear unescaped.

     A `.' matches any character.

     A `^' matches the beginning of a line; `$' matches the end of
     the line.

     The REP operators match zero or more (*), one or more (+), zero
     or one (?), instances respectively of the preceding regular
     expression e2.

     A concatenated regular expression, e1e2, matches a match to e1
     followed by a match to e2.

     An alternative regular expression, e0|e1, matches either a
     match to e0 or a match to e1.

     A match to any part of a regular expression extends as far as
     possible without preventing a match to the remainder of the
     regular expression.

SEE ALSO
     awk(1), ed(1), sam(1), sed(1), regexp(2)

Page 1                       Plan 9             (printed 12/21/24)
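
EXAMPLES
     The grammar and matching rules above can be sketched as a small
     recursive-descent parser and matcher.  This is an illustration
     in Python, not the Plan 9 regexp(2) implementation; the names
     parse, ends, and search are hypothetical.  It assumes that `\'
     before any character yields that character as a literal, it
     ignores the delimiter, and it reads the `as far as possible'
     rule as `take the longest match at the leftmost starting
     position'.

```python
def parse(pattern):
    """Parse pattern with the e0..e3 grammar; return a nested-tuple tree."""
    pos = 0
    def peek():
        return pattern[pos] if pos < len(pattern) else None
    def take():
        nonlocal pos
        if pos >= len(pattern):
            raise ValueError("unexpected end of pattern")
        pos += 1
        return pattern[pos - 1]
    def item():  # one charclass member, honoring a leading backslash
        c = take()
        return take() if c == "\\" else c
    def charclass():  # called just after '[' has been consumed
        negate = peek() == "^" and take() == "^"  # consume an initial '^'
        ranges = []
        while peek() != "]":
            lo = hi = item()
            if peek() == "-":
                take()
                hi = item()
            ranges.append((lo, hi))
        take()  # the closing ']'
        if not ranges:
            raise ValueError("empty character class")
        return ("char", lambda ch: any(lo <= ch <= hi for lo, hi in ranges) != negate)
    def e3():
        c = take()
        if c == "(":
            node = e0()
            take()  # the closing ')'; take() raises if the pattern ended
            return node
        if c == "[":
            return charclass()
        if c in ".^$":
            return {".": ("char", lambda ch: True), "^": ("bol",), "$": ("eol",)}[c]
        if c in "*+?)|]":
            raise ValueError("unescaped metacharacter " + c)
        want = take() if c == "\\" else c  # assumption: backslash quotes any character
        return ("char", lambda ch: ch == want)
    def e2():
        node = e3()
        while peek() in ("*", "+", "?"):
            node = ("rep", take(), node)
        return node
    def e1():
        node = e2()
        while peek() not in (None, "|", ")"):
            node = ("cat", node, e2())
        return node
    def e0():
        node = e1()
        while peek() == "|":
            take()
            node = ("alt", node, e1())
        return node
    tree = e0()
    if pos != len(pattern):
        raise ValueError("unbalanced ')'")
    return tree

def ends(node, s, i):
    """Return the set of positions where a match of node begun at i can end."""
    kind = node[0]
    if kind == "char":  # literal, '.', or charclass; newline is never a 'character'
        ok = i < len(s) and s[i] != "\n" and node[1](s[i])
        return {i + 1} if ok else set()
    if kind == "bol":
        return {i} if i == 0 or s[i - 1] == "\n" else set()
    if kind == "eol":
        return {i} if i == len(s) or s[i] == "\n" else set()
    if kind == "cat":
        return {k for j in ends(node[1], s, i) for k in ends(node[2], s, j)}
    if kind == "alt":
        return ends(node[1], s, i) | ends(node[2], s, i)
    op, sub = node[1], node[2]  # kind == "rep"
    seen = {i} if op == "*" else set(ends(sub, s, i))
    if op == "?":
        return seen | {i}
    todo = list(seen)
    while todo:  # keep applying sub until no new end position appears
        for k in ends(sub, s, todo.pop()):
            if k not in seen:
                seen.add(k)
                todo.append(k)
    return seen

def search(pattern, s):
    """Return (start, end) of the leftmost match, longest first, or None."""
    tree = parse(pattern)
    for i in range(len(s) + 1):
        found = ends(tree, s, i)
        if found:
            return i, max(found)
    return None
```

     Under these assumptions, search("a|ab", "xab") returns (1, 3):
     both alternatives match at position 1 and the longer one is
     kept.  search("[^a-c]+", "abcxyz\nq") returns (3, 6), since the
     negated class stops at the newline.  Collecting every possible
     end position keeps the sketch short but is not efficient; it is
     meant only to make the rules concrete.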