.TH SORT 1
.SH NAME
sort \- sort and/or merge files
.SH SYNOPSIS
.B sort
[
.BI -cmuMbdf\&inrwt x
]
[
.BI + pos1
[
.BI - pos2
] ...
] ...
[
.B -k
.I pos1
[
.I ,pos2
]
] ...
.br
\h'0.5in'
[
.B -o
.I output
]
[
.B -T
.I dir
\&...
]
[
.I option
\&...
]
[
.I file
\&...
]
.SH DESCRIPTION
.I Sort\^
sorts
lines of all the
.I files
together and writes the result on
the standard output.
If no input files are named, the standard input is sorted.
.PP
The default sort key is an entire line.
Default ordering is
lexicographic by runes.
The ordering is affected globally by the following options,
one or more of which may appear.
.TP
.B -M
Compare as months.
The first three
non-white space characters
of the field
are folded
to upper case
and compared
so that
.L JAN
precedes
.LR FEB ,
etc.
Invalid fields
compare low to
.LR JAN .
.TP
.B -b
Ignore leading white space (spaces and tabs) in field comparisons.
.TP
.B -d
`Phone directory' order:
only letters,
accented letters,
digits and white space
are significant in comparisons.
.TP
.B -f
Fold lower case
letters onto upper case.
Accented characters are folded to their
non-accented upper case form.
.TP
.B -i
Ignore characters outside the
.SM ASCII
range 040-0176
in non-numeric comparisons.
.TP
.B -w
Like
.BR -i ,
but ignore only tabs and spaces.
.TP
.B -n
An initial numeric string,
consisting of optional white space,
optional plus or minus sign,
and zero or more digits with optional decimal point,
is sorted by arithmetic value.
.TP
.B -g
Numbers, like
.B -n
but with optional
.BR e -style
exponents, are sorted by value.
.TP
.B -r
Reverse the sense of comparisons.
.TP
.BI -t x\^
`Tab character' separating fields is
.IR x .
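.PP
As a rough illustration of the global ordering options, the sketch below sorts the same lines with and without
.B -n
and
.BR -r ,
and a short list of month names with
.BR -M .
It assumes a POSIX-style
.I sort
that also accepts
.B -M
is on the path; the sample data and the names
.LR numbers ,
.LR months ,
and the result variables are invented for the example.

```shell
# Hypothetical sample data, one item per line.
numbers() {
	cat <<EOF
10
9
100
EOF
}
months() {
	cat <<EOF
mar
Jan
xyz
feb
EOF
}

lexical=$(numbers | sort)       # character order: 10, 100, 9
numeric=$(numbers | sort -n)    # arithmetic order: 9, 10, 100
reversed=$(numbers | sort -nr)  # arithmetic order reversed: 100, 10, 9
bymonth=$(months | sort -M)     # invalid field sorts first: xyz, Jan, feb, mar

echo "$lexical"
echo "$numeric"
echo "$reversed"
echo "$bymonth"
```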
.PP
The notation
.BI + "pos1\| " - pos2\^
restricts a sort key to a field beginning at
.I pos1\^
and ending just before
.IR pos2 .
.I Pos1\^
and
.I pos2\^
each have the form
.IB m . n\f1,
optionally followed by one or more of the flags
.BR Mbdfginr ,
where
.I m\^
tells a number of fields to skip from the beginning of the line and
.I n\^
tells a number of characters to skip further.
If any flags are present they override all the global
ordering options for this key.
A missing
.BI \&. n\^
means
.BR \&.0 ;
a missing
.BI - pos2\^
means the end of the line.
Under the
.BI -t x\^
option, fields are strings separated by
.IR x ;
otherwise fields are
non-empty strings separated by white space.
White space before a field
is part of the field, except under option
.BR -b .
A
.B b
flag may be attached independently to
.I pos1
and
.IR pos2 .
.PP
The notation
.B -k
.IR pos1 [, pos2 ]
is how POSIX
.I sort
defines fields:
.I pos1
and
.I pos2
have the same format but different meanings.
The value of
.I m\^
is origin 1 instead of origin 0
and a missing
.BI \&. n\^
in
.I pos2
is the end of the field.
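.PP
A minimal sketch of the
.B -k
form, assuming a POSIX-style
.I sort
on the path.
The colon-separated records and the names
.L users
and
.L byname
are made up for the example.
Because
.I m
counts from 1 here, the key
.L "-k 2,2"
selects the second field only, which in the older notation would be written
.LR "+1 -2" .

```shell
# Hypothetical colon-separated records: id, name, group.
users() {
	cat <<EOF
1:carol:staff
2:alice:adm
3:bob:staff
EOF
}

# Sort on the second field only; fields are separated by ':'.
byname=$(users | sort -t: -k 2,2)

echo "$byname"
```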
.PP
When there are multiple sort keys, later keys
are compared only after all earlier keys
compare equal.
Lines that otherwise compare equal are ordered
with all bytes significant.
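.PP
The effect of several keys can be sketched as follows, again assuming a POSIX-style
.I sort
and invented data.
The first key orders the lines by name; the second key is consulted only among lines whose names compare equal, and its own flags apply to that key alone.

```shell
# Hypothetical data: a count followed by a name.
stock() {
	cat <<EOF
3 pear
1 apple
2 pear
4 apple
EOF
}

# Name first, then count in ascending arithmetic order.
ascending=$(stock | sort -k 2,2 -k 1,1n)

# Name first, then count in descending order; the r flag
# is attached to the second key only.
descending=$(stock | sort -k 2,2 -k 1,1nr)

echo "$ascending"
echo "$descending"
```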
.PP
These option arguments are also understood:
.TP \w'\fL-z\fIrecsize\fLXX'u
.B -c
Check that the single input file is sorted according to the ordering rules;
give no output unless the file is out of sort.
.TP
.B -m
Merge; assume the input files are already sorted.
.TP
.B -u
Suppress all but one in each
set of equal lines.
Ignored bytes
and bytes outside keys
do not participate in
this comparison.
.TP
.B -o
The next argument is the name of an output file
to use instead of the standard output.
This file may be the same as one of the inputs.
.TP
.BI -T dir
Put temporary files in
.I dir
rather than in
.BR /var/tmp .
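.PP
The following sketch exercises
.BR -m ,
.BR -u ,
.BR -o ,
and
.B -c
together.
It assumes a POSIX-style
.I sort
and the
.I mktemp
utility for a scratch directory; the file names and variables are hypothetical.

```shell
# Scratch directory for the hypothetical input files.
dir=$(mktemp -d)

cat >"$dir/a" <<EOF
apple
cherry
EOF
cat >"$dir/b" <<EOF
banana
cherry
EOF

# Merge two already sorted files, keeping one copy of each equal line.
sort -m -u -o "$dir/merged" "$dir/a" "$dir/b"
merged=$(cat "$dir/merged")

# -c is silent and succeeds when its single input is in order.
if sort -c "$dir/merged"; then checked=sorted; else checked=unsorted; fi

# The output file may be one of the inputs: sort a file in place.
cat >"$dir/c" <<EOF
zebra
ant
EOF
sort -o "$dir/c" "$dir/c"
inplace=$(cat "$dir/c")

echo "$merged"
echo "$checked"
echo "$inplace"

rm -r "$dir"
```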
.ne 4
.SH EXAMPLES
.TP
.L sort -u +0f +0 list
Print in alphabetical order all the unique spellings
in a list of words
where capitalized words differ from uncapitalized.
.TP
.L sort -t: +1 /adm/users
Print the users file
sorted by user name
(the second colon-separated field).
.TP
.L sort -umM dates
Print the first instance of each month in an already sorted file.
Options
.B -um
with just one input file make the choice of a
unique representative from a set of equal lines predictable.
.TP
.L
grep -n '^' input | sort -t: +1f +0n | sed 's/[0-9]*://'
A stable sort: input lines that compare equal will
come out in their original order.
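.PP
Some other implementations of
.I sort
no longer accept the
.BI + pos
notation.
As a sketch under that assumption, the stable-sort pipeline can be written with
.B -k
keys instead:
.L +1f
becomes
.L "-k 2f"
and
.L +0n
becomes
.LR "-k 1n" .
The input lines and the names
.L input
and
.L stable
are invented for the example.

```shell
# Hypothetical input: pairs of lines that differ only in case.
input() {
	cat <<EOF
b
B
a
A
EOF
}

# Number the lines, sort on the text with case folded, break ties
# on the line number, then strip the numbers again.
stable=$(input | grep -n '^' | sort -t: -k 2f -k 1n | sed 's/[0-9]*://')

echo "$stable"
```

Lines that compare equal under folding keep their input order, so
.L a
stays ahead of
.L A
and
.L b
ahead of
.LR B .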
.SH FILES
.BI /var/tmp/sort. <pid>.<ordinal>
.SH SOURCE
.B \*9/src/cmd/sort.c
.SH SEE ALSO
.MR uniq (1) ,
.MR look (1)
.SH DIAGNOSTICS
.I Sort
comments and exits with non-null status for various trouble
conditions and for disorder discovered under option
.BR -c .
.SH BUGS
An external null character can be confused
with an internally generated end-of-field character.
The result can make a sub-field not sort
less than a longer field.
.PP
Some of the options, e.g.
.B -i
and
.BR -M ,
are hopelessly provincial.