.TH REGEXP 7
.SH NAME
regexp \- Plan 9 regular expression notation
.SH DESCRIPTION
This manual page describes the regular expression
syntax used by the Plan 9 regular expression library
.MR regexp (3) .
It is the form used by
.MR egrep (1)
before
.I egrep
got complicated.
.PP
A
.I "regular expression"
specifies
a set of strings of characters.
A member of this set of strings is said to be
.I matched
by the regular expression.  In many applications
a delimiter character, commonly
.LR / ,
bounds a regular expression.
In the following specification for regular expressions
the word `character' means any character (rune) but newline.
.PP
The syntax for a regular expression
.B e0
is
.IP
.EX
e3:  literal | charclass | '.' | '^' | '$' | '(' e0 ')'

e2:  e3
  |  e2 REP

REP: '*' | '+' | '?'

e1:  e2
  |  e1 e2

e0:  e1
  |  e0 '|' e1
.EE
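.PP
The grammar above can be read as a recursive-descent recognizer, with one
function per rule.
The following Python sketch is illustrative only and is not the code of the
Plan 9 library; the names
parse, RegexpSyntaxError and METACHARS are hypothetical.
It follows the grammar as written, so an empty expression is rejected,
and it assumes any character may follow a backslash.

```python
METACHARS = set(".*+?[]()|\\^$")  # metacharacters listed in this page


class RegexpSyntaxError(ValueError):
    """Hypothetical error type for expressions the grammar rejects."""


def parse(src):
    """Parse src by the e0..e3 grammar and return a nested-tuple tree."""
    pos = 0

    def peek():
        return src[pos] if pos < len(src) else None

    def take():
        nonlocal pos
        ch = peek()
        if ch is None:
            raise RegexpSyntaxError("unexpected end of expression")
        pos += 1
        return ch

    def e0():  # e0: e1 | e0 '|' e1
        node = e1()
        while peek() == "|":
            take()
            node = ("alt", node, e1())
        return node

    def e1():  # e1: e2 | e1 e2
        node = e2()
        while peek() is not None and peek() not in "|)":
            node = ("cat", node, e2())
        return node

    def e2():  # e2: e3 | e2 REP
        node = e3()
        while peek() in ("*", "+", "?"):
            node = ("rep", take(), node)
        return node

    def e3():  # e3: literal | charclass | '.' | '^' | '$' | '(' e0 ')'
        ch = take()
        if ch == "(":
            node = e0()
            if peek() != ")":
                raise RegexpSyntaxError("missing ')'")
            take()
            return ("group", node)
        if ch == "[":
            return charclass()
        if ch in ".^$":
            return ("op", ch)
        if ch == "\\":  # assumption: any escaped character is a literal
            return ("lit", take())
        if ch in METACHARS:
            raise RegexpSyntaxError(f"unexpected {ch!r}")
        return ("lit", ch)

    def charclass():  # the opening '[' has already been consumed
        negated = peek() == "^"
        if negated:
            take()
        body = ""
        while peek() != "]":
            ch = take()  # raises if the class is never closed
            if ch == "\\":
                ch += take()
            body += ch
        take()  # consume ']'
        if not body:
            raise RegexpSyntaxError("empty character class")
        return ("class", negated, body)

    tree = e0()
    if pos != len(src):
        raise RegexpSyntaxError(f"unexpected {src[pos]!r}")
    return tree
```

The nesting of the returned tuples shows the precedence the grammar encodes:
repetition binds tighter than concatenation, which binds tighter than
alternation, so parse("a|b*") yields an alternation whose right branch is
the repeated literal.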
.PP
A
.B literal
is any non-metacharacter, or a metacharacter
(one of
.BR .*+?[]()|\e^$ ),
or the delimiter
preceded by
.LR \e .
.PP
A
.B charclass
is a nonempty string
.I s
bracketed
.BI [ \|s\| ]
(or
.BI [^ s\| ]\fR);
it matches any character in (or not in)
.IR s .
A negated character class never
matches newline.
A substring
.IB a - b\f1,
with
.I a
and
.I b
in ascending
order, stands for the inclusive
range of
characters between
.I a
and
.IR b .
In
.IR s ,
the metacharacters
.LR - ,
.LR ] ,
an initial
.LR ^ ,
and the regular expression delimiter
must be preceded by a
.LR \e ;
other metacharacters
have no special meaning and
may appear unescaped.
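.PP
As a rough illustration of character classes, the sketch below uses Python's
re module, which is a different engine from the Plan 9 library but accepts
the same bracket notation for these cases.
One difference is handled explicitly: a negated class in Python does match
newline, so the newline is listed in the class to mimic the rule above.
The names HEX_DIGIT, NOT_VOWEL, PUNCT and in_class are hypothetical.

```python
import re

HEX_DIGIT = re.compile(r"[0-9a-f]")    # two ascending ranges
NOT_VOWEL = re.compile(r"[^aeiou\n]")  # \n added by hand: Python's negated
                                       # classes would otherwise match newline
PUNCT = re.compile(r"[\^\-\]]")        # initial ^, -, and ] escaped with \


def in_class(pattern, ch):
    """Return True if the single character ch is matched by the class."""
    return pattern.fullmatch(ch) is not None


print(in_class(HEX_DIGIT, "c"))   # True
print(in_class(HEX_DIGIT, "g"))   # False
print(in_class(NOT_VOWEL, "\n"))  # False
print(in_class(PUNCT, "]"))       # True
```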
.PP
A
.L .
matches any character.
.PP
A
.L ^
matches the beginning of a line;
.L $
matches the end of a line.
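.PP
The line anchors and the any-character operator can be tried out with
Python's re module, used here only as a stand-in for the Plan 9 library.
Python needs the re.MULTILINE flag before its anchors apply to every line
rather than to the whole string; the helper name whole_line_matches is
hypothetical.

```python
import re


def whole_line_matches(pattern, text):
    """Return every match of pattern, with ^ and $ applying per line."""
    return re.findall(pattern, text, re.MULTILINE)


text = "abc\nxabc\na-c\nac\n"
# ^a.c$ needs a line that is exactly 'a', any one character, then 'c'.
print(whole_line_matches(r"^a.c$", text))  # ['abc', 'a-c']
```

The line "ac" is rejected because the any-character operator must consume
one character and, as stated earlier, a character is never a newline.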
.PP
The
.B REP
operators match zero or more
.RB ( * ),
one or more
.RB ( + ),
or zero or one
.RB ( ? )
instances, respectively, of the preceding regular expression
.BR e2 .
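.PP
A small sketch of the three repetition operators, again using Python's re
module as a stand-in on the assumption that it agrees with the Plan 9
library for these simple patterns.
The helper name full is hypothetical.

```python
import re


def full(pattern, s):
    """Return True if pattern matches the whole of s."""
    return re.fullmatch(pattern, s) is not None


print(full(r"ab*c", "ac"), full(r"ab*c", "abbc"))  # True True
print(full(r"ab+c", "ac"), full(r"ab+c", "abc"))   # False True
print(full(r"ab?c", "ac"), full(r"ab?c", "abbc"))  # True False

# A REP operator applies only to the e2 just before it, so parentheses
# are needed to repeat a longer expression.
print(full(r"ab*", "abab"), full(r"(ab)*", "abab"))  # False True
```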
.PP
A concatenated regular expression,
.BR "e1\|e2" ,
matches a match to
.B e1
followed by a match to
.BR e2 .
.PP
An alternative regular expression,
.BR "e0\||\|e1" ,
matches either a match to
.B e0
or a match to
.BR e1 .
.PP
A match to any part of a regular expression
extends as far as possible without preventing
a match to the remainder of the regular expression.
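.PP
To make the rule concrete, consider an expression made of two parts
matched against the string aaab.
The sketch below uses Python's re module, which happens to give the same
split for this pattern; Python tries alternatives from left to right, so
patterns that use alternation are not guaranteed to agree with the rule
above.
The helper name parts is hypothetical.

```python
import re


def parts(pattern, text):
    """Return the text matched by each parenthesized part of pattern."""
    m = re.fullmatch(pattern, text)
    return m.groups() if m else None


# (a*) would like all three a's, but must leave one for (ab) to match.
print(parts(r"(a*)(ab)", "aaab"))  # ('aa', 'ab')

# With nothing after it, a* is free to take every leading a.
print(re.search(r"a*", "aaab").group())  # aaa
```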
.SH "SEE ALSO"
.MR regexp (3)