.TH REGEXP9 7
.de EX
.nf
.ft B
..
.de EE
.fi
.ft R
..
.de LR
.if t .BR \\$1 \\$2
.if n .RB ` \\$1 '\\$2
..
.de L
.nh
.if t .B \\$1
.if n .RB ` \\$1 '
..
.SH NAME
regexp9 \- Plan 9 regular expression notation
.SH DESCRIPTION
This manual page describes the regular expression
syntax used by the Plan 9 regular expression library
.IR regexp9 (3).
It is the form used by
.IR egrep (1)
before
.I egrep
got complicated.
.PP
A
.I "regular expression"
specifies
a set of strings of characters.
A member of this set of strings is said to be
.I matched
by the regular expression.  In many applications
a delimiter character, commonly
.LR / ,
bounds a regular expression.
In the following specification for regular expressions
the word `character' means any character (rune) but newline.
.PP
The syntax for a regular expression
.B e0
is
.IP
.EX
e3:  literal | charclass | '.' | '^' | '$' | '(' e0 ')'

e2:  e3
  |  e2 REP

REP: '*' | '+' | '?'

e1:  e2
  |  e1 e2

e0:  e1
  |  e0 '|' e1
.EE
.PP
A
.B literal
is any non-metacharacter, or a metacharacter
(one of
.BR .*+?[]()|\e^$ ),
or the delimiter
preceded by
.LR \e .
.PP
A
.B charclass
is a nonempty string
.I s
bracketed
.BI [ \|s\| ]
(or
.BI [^ s\| ]\fR);
it matches any character in (or not in)
.IR s .
A negated character class never
matches newline.
A substring
.IB a - b\f1,
with
.I a
and
.I b
in ascending
order, stands for the inclusive
range of
characters between
.I a
and
.IR b .
In
.IR s ,
the metacharacters
.LR - ,
.LR ] ,
an initial
.LR ^ ,
and the regular expression delimiter
must be preceded by a
.LR \e ;
other metacharacters
have no special meaning and
may appear unescaped.
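.PP
As an illustration only, the character class rules above can be sketched in Python.
This is not the code in
.IR regexp9 (3);
the name
.B class_predicate
and its two-pass structure are assumptions made for the example,
and the regular expression delimiter is not modeled.
Backslash and newline are spelled with
.B chr
so that the listing contains no escape sequences.
.IP
.EX
```python
BACKSLASH = chr(92)  # spelled with chr() so the listing has no raw escapes
NEWLINE = chr(10)


def class_predicate(s):
    """Return a test for the class body s, the text between [ and ]."""
    negated = s.startswith("^")
    if negated:
        s = s[1:]

    # Pass 1: resolve backslash escapes into (character, was_escaped) pairs.
    items = []
    i = 0
    while i < len(s):
        if s[i] == BACKSLASH and i + 1 < len(s):
            items.append((s[i + 1], True))
            i += 2
        else:
            items.append((s[i], False))
            i += 1

    # Pass 2: fold a-b into an inclusive range; an escaped - stays literal.
    ranges = []
    j = 0
    while j < len(items):
        low = items[j][0]
        if j + 2 < len(items) and items[j + 1] == ("-", False):
            ranges.append((low, items[j + 2][0]))
            j += 3
        else:
            ranges.append((low, low))
            j += 1

    def accepts(c):
        inside = any(low <= c <= high for low, high in ranges)
        if negated:
            # Per the text, a negated class never matches newline.
            return c != NEWLINE and not inside
        return inside

    return accepts
```
.EE
.PP
The returned function takes a single character and reports
whether the class accepts it.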
.PP
.L .
matches any character.
.PP
.L ^
matches the beginning of a line;
.L $
matches the end of the line.
.PP
The
.B REP
operators match zero or more
.RB ( * ),
one or more
.RB ( + ),
or zero or one
.RB ( ? )
instances, respectively, of the preceding regular expression
.BR e2 .
.PP
A concatenated regular expression,
.BR "e1\|e2" ,
matches a match to
.B e1
followed by a match to
.BR e2 .
.PP
An alternative regular expression,
.BR "e0\||\|e1" ,
matches either a match to
.B e0
or a match to
.BR e1 .
.PP
A match to any part of a regular expression
extends as far as possible without preventing
a match to the remainder of the regular expression.
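.SH EXAMPLE
The grammar and the matching rule above can be illustrated by a small
Python sketch.
It is not the implementation in
.IR regexp9 (3):
the names
.BR parse ,
.BR ends ,
and
.B search
are hypothetical, character classes and backslash escapes are left out,
and only the extent of the whole match is modeled.
Starting the search at the leftmost position is an assumption of the sketch.
.IP
.EX
```python
NEWLINE = chr(10)  # spelled with chr() so the listing has no raw escapes


def parse(pattern):
    """Parse pattern with the e0..e3 grammar into nested tuples."""
    pos = 0

    def peek():
        return pattern[pos] if pos < len(pattern) else None

    def e0():  # e0: e1 | e0 '|' e1
        nonlocal pos
        node = e1()
        while peek() == "|":
            pos += 1
            node = ("alt", node, e1())
        return node

    def e1():  # e1: e2 | e1 e2
        node = e2()
        while peek() not in (None, "|", ")"):
            node = ("cat", node, e2())
        return node

    def e2():  # e2: e3 | e2 REP
        nonlocal pos
        node = e3()
        while peek() in ("*", "+", "?"):
            node = ("rep", pattern[pos], node)
            pos += 1
        return node

    def e3():  # e3: literal | '.' | '^' | '$' | '(' e0 ')'
        nonlocal pos
        c = peek()
        if c is None or c in "|)*+?":
            raise ValueError("unexpected %r at %d" % (c, pos))
        pos += 1
        if c == "(":
            node = e0()
            if peek() != ")":
                raise ValueError("missing )")
            pos += 1
            return node
        if c in ".^$":
            return ("op", c)
        return ("lit", c)

    tree = e0()
    if pos != len(pattern):
        raise ValueError("unexpected %r at %d" % (pattern[pos], pos))
    return tree


def ends(node, text, i):
    """Return every position where a match of node starting at i can end."""
    kind = node[0]
    if kind == "lit":
        return {i + 1} if text[i:i + 1] == node[1] else set()
    if kind == "op":
        if node[1] == ".":  # any character but newline
            return {i + 1} if i < len(text) and text[i] != NEWLINE else set()
        if node[1] == "^":
            here = i == 0 or text[i - 1] == NEWLINE
        else:
            here = i == len(text) or text[i] == NEWLINE
        return {i} if here else set()
    if kind == "alt":
        return ends(node[1], text, i) | ends(node[2], text, i)
    if kind == "cat":
        return {k for j in ends(node[1], text, i) for k in ends(node[2], text, j)}
    op, sub = node[1], node[2]  # kind is "rep"
    if op == "?":
        return {i} | ends(sub, text, i)
    start = {i} if op == "*" else ends(sub, text, i)
    reached, frontier = set(start), set(start)
    while frontier:  # repeat sub until no new end position appears
        frontier = {k for j in frontier for k in ends(sub, text, j)} - reached
        reached |= frontier
    return reached


def search(pattern, text):
    """Return the leftmost match, extended as far as possible, or None."""
    tree = parse(pattern)
    for i in range(len(text) + 1):
        found = ends(tree, text, i)
        if found:
            return text[i:max(found)]
    return None
```
.EE
.PP
Under these assumptions,
.B search
applied to the pattern
.L a|ab
and the text
.L xab
yields
.LR ab ,
since the alternative that extends further is preferred.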
.SH "SEE ALSO"
.IR regexp9 (3)