Blame


1 0f4372e2 2022-10-03 op # diffstat
2 0f4372e2 2022-10-03 op
3 0f4372e2 2022-10-03 op Show diff statistics.
4 0f4372e2 2022-10-03 op
5 0f4372e2 2022-10-03 op #!/usr/bin/awk -f
6 0f4372e2 2022-10-03 op
7 c71d3308 2023-01-03 op AWK is great. All hail AWK!
8 0f4372e2 2022-10-03 op
9 36294a93 2023-01-09 op First, some utility functions. parsehdr extracts the number of lines
10 36294a93 2023-01-09 op (old or new) in the given hunk header line.
11 0f4372e2 2022-10-03 op
12 c71d3308 2023-01-03 op function parsehdr(s) {
13 c71d3308 2023-01-03 op s = gensub(".*,", "", 1, s)
14 c71d3308 2023-01-03 op s = gensub("^-", "", 1, s)
15 c71d3308 2023-01-03 op return s + 0
16 c71d3308 2023-01-03 op }
17 c71d3308 2023-01-03 op
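One case worth keeping in mind: the unified format omits the count when a range covers exactly one line (as in `@@ -5 +5 @@`), and then the only number left is the starting line, not a count. As written, parsehdr appears to return that starting line. Below is a minimal sketch of a variant that handles the omitted count, using POSIX awk's `sub` rather than gawk's `gensub`. The name `hunk_count` is hypothetical and not part of the script.

```shell
# hunk_count is a hypothetical helper, not part of the script above.
# It prints the line count encoded in one range of a hunk header.
hunk_count() {
	printf '%s\n' "$1" | awk '{
		# No comma: the count was omitted and defaults to 1.
		if ($0 !~ /,/) { print 1; exit }
		# Otherwise the count is whatever follows the comma.
		sub(/.*,/, "")
		print $0 + 0
	}'
}

hunk_count "-55,7"   # prints 7
hunk_count "+55,19"  # prints 19
hunk_count "-5"      # prints 1
hunk_count "+3,0"    # prints 0
```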
18 36294a93 2023-01-09 op filename extracts the name of the file from a "+++ path" or "--- path"
19 36294a93 2023-01-09 op line.
20 c94e7f28 2023-01-09 op
21 c94e7f28 2023-01-09 op function filename(s) {
22 c94e7f28 2023-01-09 op s = gensub("^... ", "", 1, s)
23 c94e7f28 2023-01-09 op
24 c94e7f28 2023-01-09 op These lines have an optional tab followed by extra information (the
25 c94e7f28 2023-01-09 op date, for example) that needs to be removed too.
26 c94e7f28 2023-01-09 op
27 c94e7f28 2023-01-09 op s = gensub("\t.*", "", 1, s)
28 c94e7f28 2023-01-09 op return s
29 c94e7f28 2023-01-09 op }
30 c94e7f28 2023-01-09 op
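To see the two substitutions at work outside the script, here is a small sketch that applies them to a single header line. It assumes POSIX awk and swaps `gensub` for `sub`; `patch_filename` is a hypothetical name used only for this illustration.

```shell
# patch_filename is a hypothetical helper mirroring the function above
# with POSIX awk sub() in place of gawk gensub().
patch_filename() {
	printf '%s\n' "$1" | awk '{
		sub(/^... /, "")   # drop the "+++ " or "--- " marker
		sub(/\t.*/, "")    # drop the tab and anything after it
		print
	}'
}

# A header with a trailing tab and date, then one without.
patch_filename "$(printf '+++ src/main.c\t2023-01-09 10:00:00')"   # prints src/main.c
patch_filename "--- /dev/null"                                      # prints /dev/null
```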
31 c71d3308 2023-01-03 op switchfile switches the current file to the one provided. It's a good
32 c71d3308 2023-01-03 op place to accumulate part of the summary shown at the end and to reset
33 c71d3308 2023-01-03 op the per-file counters.
34 c71d3308 2023-01-03 op
35 c71d3308 2023-01-03 op function switchfile(newfile) {
36 c71d3308 2023-01-03 op if (file != "") {
37 e03fd2f8 2023-01-09 op summary = sprintf("%s%4d+ %4d-\t%s\n",
38 c71d3308 2023-01-03 op summary, add, rem, file)
39 c71d3308 2023-01-03 op }
40 c71d3308 2023-01-03 op
41 c71d3308 2023-01-03 op add = 0
42 c71d3308 2023-01-03 op rem = 0
43 c71d3308 2023-01-03 op file = newfile
44 c71d3308 2023-01-03 op }
45 c71d3308 2023-01-03 op
46 36294a93 2023-01-09 op Now, the real "parser". Initialize the state to "out" since we're
47 36294a93 2023-01-09 op looking for the start of a diff.
48 c71d3308 2023-01-03 op
49 c71d3308 2023-01-03 op BEGIN {
50 c71d3308 2023-01-03 op state = "out"
51 c71d3308 2023-01-03 op }
52 c71d3308 2023-01-03 op
53 36294a93 2023-01-09 op Parse the changed file.
54 c71d3308 2023-01-03 op
55 c71d3308 2023-01-03 op state == "out" && /^\+\+\+ / {
56 c94e7f28 2023-01-09 op nfile = filename($0)
57 c71d3308 2023-01-03 op if (nfile == "/dev/null") {
58 c71d3308 2023-01-03 op
59 36294a93 2023-01-09 op When deleting a file, the name will be "/dev/null", but it's not a great
60 36294a93 2023-01-09 op name for the stats. Let's use the "old" name instead.
61 c71d3308 2023-01-03 op
62 c71d3308 2023-01-03 op nfile = delfile
63 c71d3308 2023-01-03 op }
64 c71d3308 2023-01-03 op
65 c71d3308 2023-01-03 op switchfile(nfile)
66 c71d3308 2023-01-03 op delfile = ""
67 c71d3308 2023-01-03 op }
68 c71d3308 2023-01-03 op
69 36294a93 2023-01-09 op Similarly, extract the "old" file name for when it's needed.
70 c71d3308 2023-01-03 op
71 c71d3308 2023-01-03 op state == "out" && /^--- / && file == "" {
72 c94e7f28 2023-01-09 op delfile = filename($0)
73 c71d3308 2023-01-03 op }
74 c71d3308 2023-01-03 op
75 36294a93 2023-01-09 op Match the start of a hunk.
76 c71d3308 2023-01-03 op
77 c71d3308 2023-01-03 op state == "out" && /^@@ / {
78 c71d3308 2023-01-03 op
79 c71d3308 2023-01-03 op This part is a bit complicated, but all it does is extract the number
80 c71d3308 2023-01-03 op of "new" and "old" lines shown in the hunk. A hunk header looks like this
81 c71d3308 2023-01-03 op (except for the initial '#' character):
82 c71d3308 2023-01-03 op
83 c71d3308 2023-01-03 op # @@ -55,7 +55,19 @@ ...
84 c71d3308 2023-01-03 op
85 c71d3308 2023-01-03 op So first extract the text inside the pair of "@@"
86 c71d3308 2023-01-03 op
87 c71d3308 2023-01-03 op s = gensub("@@ ", "", 1)
88 c71d3308 2023-01-03 op s = gensub(" @@.*", "", 1, s)
89 c71d3308 2023-01-03 op
90 c71d3308 2023-01-03 op and then parse each number.
91 c71d3308 2023-01-03 op
92 c71d3308 2023-01-03 op old = gensub(" .*", "", 1, s)
93 c71d3308 2023-01-03 op old = parsehdr(old)
94 c71d3308 2023-01-03 op
95 c71d3308 2023-01-03 op new = gensub(".* ", "", 1, s)
96 c71d3308 2023-01-03 op new = parsehdr(new)
97 c71d3308 2023-01-03 op
98 c71d3308 2023-01-03 op Don't forget to switch the state of the parser: now we're reading a
99 c71d3308 2023-01-03 op hunk.
100 c71d3308 2023-01-03 op
101 c71d3308 2023-01-03 op state = "in"
102 c71d3308 2023-01-03 op }
103 c71d3308 2023-01-03 op
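For comparison, the two ranges can also be pulled out with awk's default field splitting, since the header is space-separated. This is only a sketch of an alternative, not what the script does, and `hunk_ranges` is a hypothetical name.

```shell
# hunk_ranges is a hypothetical helper: it prints the old and new
# ranges of a unified-diff hunk header.
hunk_ranges() {
	printf '%s\n' "$1" | awk '/^@@ / {
		# With default splitting, $1 is "@@", $2 the old range
		# and $3 the new range; any trailing context is ignored.
		print $2, $3
	}'
}

hunk_ranges "@@ -55,7 +55,19 @@ int main(void)"   # prints -55,7 +55,19
```

Each range still needs the count extracted from it afterwards, which is the job parsehdr does in the script.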
104 c71d3308 2023-01-03 op Keep count of the added and removed lines. Also, decrement the "old" and
105 c71d3308 2023-01-03 op "new" lines when needed, to know when we're done with the hunk.
106 c71d3308 2023-01-03 op
107 c71d3308 2023-01-03 op state == "in" && /^ / {
108 c71d3308 2023-01-03 op old--
109 c71d3308 2023-01-03 op new--
110 c71d3308 2023-01-03 op }
111 c71d3308 2023-01-03 op
112 c71d3308 2023-01-03 op state == "in" && /^-/ {
113 c71d3308 2023-01-03 op old--
114 c71d3308 2023-01-03 op rem++
115 c71d3308 2023-01-03 op totrem++
116 c71d3308 2023-01-03 op }
117 c71d3308 2023-01-03 op
118 c71d3308 2023-01-03 op state == "in" && /^\+/ {
119 c71d3308 2023-01-03 op new--
120 c71d3308 2023-01-03 op add++
121 c71d3308 2023-01-03 op totadd++
122 c71d3308 2023-01-03 op }
123 c71d3308 2023-01-03 op
124 c71d3308 2023-01-03 op When there are no more "new" and "old" lines to read, go back to the
125 c71d3308 2023-01-03 op "out" state, ready to read another hunk or another file.
126 c71d3308 2023-01-03 op
127 c71d3308 2023-01-03 op state == "in" && old <= 0 && new <= 0 {
128 c71d3308 2023-01-03 op state = "out"
129 c71d3308 2023-01-03 op }
130 c71d3308 2023-01-03 op
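One practical effect of this bookkeeping (an observation, not something the text above states) is that a removed line whose content starts with "-- " shows up in the diff as "--- ...", which looks just like a file header. Counting down the hunk's lines lets the parser treat it as a removal. The sketch below condenses the hunk rules into a hypothetical `count_changes` helper; it assumes POSIX awk and hunk headers with both counts written out.

```shell
# count_changes is a hypothetical, condensed version of the hunk rules.
# It reads a diff on stdin and prints "<added>+ <removed>-".
count_changes() {
	awk '
	BEGIN { state = "out" }
	state == "out" && /^@@ / {
		# Assumes both counts are present, as in "@@ -1,2 +1,2 @@".
		old = $2; sub(/.*,/, "", old)
		new = $3; sub(/.*,/, "", new)
		state = "in"
	}
	state == "in" && /^ /  { old--; new-- }
	state == "in" && /^-/  { old--; rem++ }
	state == "in" && /^\+/ { new--; add++ }
	state == "in" && old <= 0 && new <= 0 { state = "out" }
	END { printf("%d+ %d-\n", add, rem) }
	'
}

# Inside the hunk, "--- old sig" and "+++ new sig" are a removal and
# an addition, even though they look like file headers.
printf '%s\n' \
	'--- a/notes.txt' \
	'+++ b/notes.txt' \
	'@@ -1,2 +1,2 @@' \
	' keep' \
	'--- old sig' \
	'+++ new sig' | count_changes   # prints 1+ 1-
```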
131 0f4372e2 2022-10-03 op Don't be a sink! Continue the pipeline so we can further save or apply
132 0f4372e2 2022-10-03 op the diff.
133 0f4372e2 2022-10-03 op
134 0f4372e2 2022-10-03 op // { print $0 }
135 0f4372e2 2022-10-03 op
136 36294a93 2023-01-09 op At the end, print the stats.
137 0f4372e2 2022-10-03 op
138 0f4372e2 2022-10-03 op END {
139 36294a93 2023-01-09 op
140 36294a93 2023-01-09 op It's better to flush the output here, otherwise the stats (printed to
141 36294a93 2023-01-09 op stderr and unbuffered) may be interleaved with the output (on stdout,
142 36294a93 2023-01-09 op buffered).
143 36294a93 2023-01-09 op
144 c71d3308 2023-01-03 op fflush()
145 36294a93 2023-01-09 op
146 36294a93 2023-01-09 op Generate the stat summary for the last processed file.
147 36294a93 2023-01-09 op
148 c71d3308 2023-01-03 op switchfile("")
149 c71d3308 2023-01-03 op
150 36294a93 2023-01-09 op Print the stat to the standard error, to avoid "changing" the patch.
151 36294a93 2023-01-09 op
152 36294a93 2023-01-09 op Unfortunately, there doesn't seem to be a "built-in" way of printing to
153 36294a93 2023-01-09 op stderr other than using the pseudo-device "/dev/stderr".
154 36294a93 2023-01-09 op
155 c71d3308 2023-01-03 op printf("%s", summary) > "/dev/stderr"
156 e03fd2f8 2023-01-09 op printf("%4d+ %4d-\ttotal\n", totadd, totrem) > "/dev/stderr"
157 0f4372e2 2022-10-03 op }
158 0f4372e2 2022-10-03 op
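The pass-through-plus-stderr pattern can be tried in isolation with a toy filter. The sketch below is a hypothetical stand-in for the real script: `passthrough_count` copies its input to stdout and writes a line count to stderr, assuming an awk that accepts `fflush()` without arguments and the "/dev/stderr" name (gawk and mawk both do).

```shell
# passthrough_count is a hypothetical stand-in for the real script: it
# copies stdin to stdout and reports a line count on stderr.
passthrough_count() {
	awk '
	{ print }
	END {
		# Flush stdout first so the report is not interleaved with it.
		fflush()
		printf("%d lines\n", NR) > "/dev/stderr"
	}'
}

# Discard one stream at a time to see what goes where.
printf 'a\nb\n' | passthrough_count 2>/dev/null   # stdout only: a and b
printf 'a\nb\n' | passthrough_count >/dev/null    # stderr only: 2 lines
```

Because the report never touches stdout, the next command in the pipeline receives the input byte for byte.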
159 36294a93 2023-01-09 op Some example usages:
160 0f4372e2 2022-10-03 op
161 ec64d77b 2023-01-09 op * cvs -q diff | diffstat > /tmp/diff
162 ec64d77b 2023-01-09 op * got diff | diffstat | ssh foo 'cd xyz && got patch'