3 awk \- pattern-directed scanning and processing language
41 for lines that match any of a set of patterns specified literally in
43 or in one or more files
48 there can be an associated action that will be performed
52 Each line is matched against the
53 pattern portion of every pattern-action statement;
54 the associated action is performed for each matched pattern.
57 means the standard input.
62 is treated as an assignment, not a file name,
63 and is executed at the time it would have been opened if it were a file name.
68 is an assignment to be done before the program
72 options may be present.
75 option defines the input field separator to be the regular expression
78 An input line is normally made up of fields separated by white space,
79 or by regular expression
81 The fields are denoted
86 refers to the entire line.
89 is null, the input line is split into one field per character.
91 To compensate for inadequate implementation of storage management,
94 option can be used to set the maximum size of the input record,
97 option to set the maximum number of fields.
105 in which it is not allowed to
106 run shell commands or open files
107 and the environment is not made available
112 A pattern-action statement has the form
114 .IB pattern " { " action " }"
118 means print the line;
119 a missing pattern always matches.
120 Pattern-action statements are separated by newlines or semicolons.
122 An action is a sequence of statements.
123 A statement can be one of the following:
126 .ta \w'\fLdelete array[expression]'u
127 if(\fI expression \fP)\fI statement \fP\fR[ \fPelse\fI statement \fP\fR]\fP
128 while(\fI expression \fP)\fI statement\fP
129 for(\fI expression \fP;\fI expression \fP;\fI expression \fP)\fI statement\fP
130 for(\fI var \fPin\fI array \fP)\fI statement\fP
131 do\fI statement \fPwhile(\fI expression \fP)
134 {\fR [\fP\fI statement ... \fP\fR] \fP}
135 \fIexpression\fP #\fR commonly\fP\fI var = expression\fP
136 print\fR [ \fP\fIexpression-list \fP\fR] \fP\fR[ \fP>\fI expression \fP\fR]\fP
137 printf\fI format \fP\fR[ \fP,\fI expression-list \fP\fR] \fP\fR[ \fP>\fI expression \fP\fR]\fP
138 return\fR [ \fP\fIexpression \fP\fR]\fP
139 next #\fR skip remaining patterns on this input line\fP
140 nextfile #\fR skip rest of this file, open next, start at top\fP
141 delete\fI array\fP[\fI expression \fP] #\fR delete an array element\fP
142 delete\fI array\fP #\fR delete all elements of array\fP
143 exit\fR [ \fP\fIexpression \fP\fR]\fP #\fR exit immediately; status is \fP\fIexpression\fP
147 Statements are terminated by
148 semicolons, newlines or right braces.
153 String constants are quoted \&\fL"\ "\fR,
154 with the usual C escapes recognized within.
155 Expressions take on string or numeric values as appropriate,
156 and are built using the operators
158 (exponentiation), and concatenation (indicated by white space).
161 ! ++ \-\- += \-= *= /= %= ^= > >= < <= == != ?:
162 are also available in expressions.
163 Variables may be scalars, array elements
167 Variables are initialized to the null string.
168 Array subscripts may be any string,
169 not necessarily numeric;
170 this allows for a form of associative memory.
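A minimal sketch of string subscripts used as associative memory. The array name
count and the sample input are illustrative assumptions, not part of this page.

```shell
# Count how often each word occurs; the words themselves are the subscripts.
out=$(printf 'a b a\nc a b\n' | awk '
{ for (i = 1; i <= NF; i++) count[$i]++ }
END { printf "%d %d %d\n", count["a"], count["b"], count["c"] }')
printf '%s\n' "$out"
```

An element that has never been assigned is the null string, which counts as 0
under ++, so no explicit initialization is needed.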
171 Multiple subscripts such as
173 are permitted; the constituents are concatenated,
174 separated by the value of
179 statement prints its arguments on the standard output
184 is present or on a pipe if
186 is present), separated by the current output field separator,
187 and terminated by the output record separator.
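As an illustration of the two separators, the following sketch sets OFS and ORS
in a BEGIN action and prints two fields in swapped order. The separator values
and the sample input are assumptions chosen only to make the effect visible.

```shell
# OFS goes between the arguments of print, ORS goes after each record.
out=$(printf 'a b\nc d\n' | awk 'BEGIN { OFS = "-"; ORS = ";" } { print $2, $1 }')
printf '%s\n' "$out"
```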
191 may be literal names or parenthesized expressions;
192 identical string values in different statements denote
196 statement formats its expression list according to the format
199 The built-in function
201 closes the file or pipe
203 The built-in function
205 flushes any buffered output for the file or pipe
209 is omitted or is a null string, all open files are flushed.
211 The mathematical functions
220 Other built-in functions:
224 If its argument is a string, the string's length is returned.
225 If its argument is an array, the number of subscripts in the array is returned.
226 If no argument, the length of
231 random number on [0,1)
236 and returns the previous seed.
239 truncates to an integer value
242 converts its numerical argument, a character number, to a
246 .BI substr( s , " m" , " n\fL)
251 that begins at position
255 .BI index( s , " t" )
260 occurs, or 0 if it does not.
262 .BI match( s , " r" )
265 where the regular expression
267 occurs, or 0 if it does not.
272 are set to the position and length of the matched string.
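A small sketch of how the position and length reported by match can be fed to
substr to extract the matched text. The sample string and the variable s are
illustrative assumptions.

```shell
# Find the first run of digits, then cut it out with substr.
out=$(awk 'BEGIN {
    s = "error code 404 returned"
    if (match(s, /[0-9]+/))
        printf "%d %d %s\n", RSTART, RLENGTH, substr(s, RSTART, RLENGTH)
}')
printf '%s\n' "$out"
```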
274 .BI split( s , " a" , " fs\fL)
284 The separation is done with the regular expression
286 or with the field separator
291 An empty string as field separator splits the string
292 into one array element per character.
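The per-character behavior can be sketched as follows. The array name chars is
an assumption, and splitting on an empty separator is described here for this
implementation; other awk implementations may treat it differently.

```shell
# An empty separator yields one array element per character.
out=$(awk 'BEGIN {
    n = split("awk", chars, "")
    printf "%d %s %s %s\n", n, chars[1], chars[2], chars[3]
}')
printf '%s\n' "$out"
```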
294 .BI sub( r , " t" , " s\fL)
297 for the first occurrence of the regular expression
310 except that all occurrences of the regular expression
315 return the number of replacements.
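A short sketch contrasting the two substitution functions and their return
values. The string s and the replacement characters are illustrative
assumptions.

```shell
# sub replaces only the first match, gsub replaces every remaining match.
out=$(awk 'BEGIN {
    s = "a-b-c-d"
    n1 = sub(/-/, "+", s)     # s is now a+b-c-d
    n2 = gsub(/-/, ":", s)    # s is now a+b:c:d
    printf "%d %d %s\n", n1, n2, s
}')
printf '%s\n' "$out"
```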
317 .BI sprintf( fmt , " expr" , " ...\fL)
318 the string resulting from formatting
328 and returns its exit status
333 with all upper-case characters translated to their
334 corresponding lower-case equivalents.
339 with all lower-case characters translated to their
340 corresponding upper-case equivalents.
347 to the next input record from the current input file;
352 to the next record from
367 returns the next line of output from
371 returns 1 for a successful input,
372 0 for end of file, and \-1 for an error.
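The three return values can be observed with a loop like the one below. This is
a sketch only: the temporary file, the variable names file and line, and the
unreadable path are all assumptions made for the example.

```shell
# Read a three-line file until getline stops returning 1, then try a bad path.
tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"
out=$(awk -v file="$tmp" 'BEGIN {
    while ((r = (getline line < file)) > 0)   # 1: a record was read
        n++
    eof = r                                   # 0: end of file
    bad = (getline line < "/nonexistent/dir/file")   # hypothetical path, gives -1
    printf "%d %d %d\n", n, eof, bad
}')
rm -f "$tmp"
printf '%s\n' "$out"
```

Testing the result with > 0 rather than for truth matters, because the error
value \-1 is non-zero and would otherwise keep the loop running.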
374 Patterns are arbitrary Boolean combinations
377 of regular expressions and
378 relational expressions.
379 Regular expressions are as in
381 Isolated regular expressions
382 in a pattern apply to the entire line.
383 Regular expressions may also occur in
384 relational expressions, using the operators
389 is a constant regular expression;
390 any string (constant or variable) may be used
391 as a regular expression, except in the position of an isolated regular expression
394 A pattern may consist of two patterns separated by a comma;
395 in this case, the action is performed for all lines
396 from an occurrence of the first pattern
397 through an occurrence of the second.
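A sketch of a two-pattern range with no action, so each selected line is
printed. The words start and stop and the sample input are assumptions. Note
that a range opened by the first pattern and never closed by the second runs to
the end of the input.

```shell
# Select every line from a "start" line through the next "stop" line.
out=$(printf 'a\nstart\nb\nstop\nc\nstart\nd\n' | awk '/start/, /stop/')
printf '%s\n' "$out"
```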
399 A relational expression is one of the following:
401 .I expression matchop regular-expression
403 .I expression relop expression
405 .IB expression " in " array-name
407 .BI ( expr , expr,... ") in " array-name
411 is any of the six relational operators in C,
420 A conditional is an arithmetic expression,
421 a relational expression,
422 or a Boolean combination
429 may be used to capture control before the first input line is read
434 do not combine with other patterns.
436 Variable names with special meanings:
440 conversion format used when converting numbers
445 regular expression used to separate fields; also settable
450 number of fields in the current record
453 ordinal number of the current record
456 ordinal number of the current record in the current file
459 the name of the current input file
462 input record separator (default newline)
465 output field separator (default blank)
468 output record separator (default newline)
471 output format for numbers (default
475 separates multiple subscripts (default 034)
478 argument count, assignable
481 argument array, assignable;
482 non-null members are taken as file names
485 array of environment variables; subscripts are names.
488 Functions may be defined (at the position of a pattern-action statement) thus:
491 function foo(a, b, c) { ...; return x }
493 Parameters are passed by value if scalar and by reference if array name;
494 functions may be called recursively.
495 Parameters are local to the function; all other variables are global.
496 Thus local variables may be created by providing excess parameters in
497 the function definition.
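The excess-parameter idiom can be sketched as below. The function name max and
the variable names are hypothetical; the extra spaces in the parameter list are
only a common convention for marking the locals.

```shell
# i and best are extra parameters, so they are local to max.
out=$(awk '
function max(arr, n,    i, best) {
    best = arr[1] + 0
    for (i = 2; i <= n; i++)
        if (arr[i] + 0 > best) best = arr[i] + 0
    return best
}
BEGIN {
    i = 99                          # global i, untouched by the call below
    n = split("3 17 5", v, " ")
    printf "%d %d\n", max(v, n), i
}')
printf '%s\n' "$out"
```

The array v is passed by reference and n by value; the global i still holds 99
after the call because the function works on its own local i.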
502 Print lines longer than 72 characters.
506 Print first two fields in opposite order.
509 BEGIN { FS = ",[ \et]*|[ \et]+" }
514 Same, with input fields separated by comma and/or blanks and tabs.
518 END { print "sum is", s, " average is", s/NR }
522 Add up first column, print sum and average.
526 Print all lines between start/stop pairs.
529 BEGIN { # Simulate echo(1)
530 for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
540 A. V. Aho, B. W. Kernighan, P. J. Weinberger,
542 The AWK Programming Language,
543 Addison-Wesley, 1988. ISBN 0-201-07981-X
545 There are no explicit conversions between numbers and strings.
546 To force an expression to be treated as a number add 0 to it;
547 to force it to be treated as a string concatenate \&\fL""\fP to it.
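The two idioms can be compared in a short sketch. The variables a, b, s, n and
t are illustrative assumptions.

```shell
# The same two values compared as strings and as numbers.
out=$(awk 'BEGIN {
    a = "10"; b = "9"
    s = (a < b)             # both are strings: "10" sorts before "9", so 1
    n = (a + 0 < b + 0)     # adding 0 forces numbers: 10 < 9 is false, so 0
    t = ((10 "") < (9 ""))  # concatenating "" forces strings again, so 1
    printf "%d %d %d\n", s, n, t
}')
printf '%s\n' "$out"
```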
550 The scope rules for variables in functions are a botch;
553 UTF is not always dealt with correctly,
556 does make an attempt to do so.
559 function with an empty string as final argument now copes
560 with UTF in the string being split.