AWK 文字处理
source link: http://blog.tangzhixiong.com/post-0012-awk.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
AWK 文字处理
AWK 文字处理
Notice:
感觉 The AWK Programming Language 讲得更清晰,虽然是进阶,但让我再来一遍,我就不看陈皓的简明教程了。本书不愧 9.5 的评分,入门就看看原书 第一章(我的笔记),进阶就继续看后几章。)
入门:AWK 简明教程
Data: netstat.txt
Proto Recv-Q Send-Q Local-Address Foreign-Address State
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:9000 0.0.0.0:* LISTEN
tcp 0 0 coolshell.cn:80 124.205.5.146:18245 TIME_WAIT
tcp 0 0 coolshell.cn:80 61.140.101.185:37538 FIN_WAIT2
tcp 0 0 coolshell.cn:80 110.194.134.189:1032 ESTABLISHED
tcp 0 0 coolshell.cn:80 123.169.124.111:49809 ESTABLISHED
tcp 0 0 coolshell.cn:80 116.234.127.77:11502 FIN_WAIT2
tcp 0 0 coolshell.cn:80 123.169.124.111:49829 ESTABLISHED
tcp 0 0 coolshell.cn:80 183.60.215.36:36970 TIME_WAIT
tcp 0 4166 coolshell.cn:80 61.148.242.38:30901 ESTABLISHED
tcp 0 1 coolshell.cn:80 124.152.181.209:26825 FIN_WAIT1
tcp 0 0 coolshell.cn:80 110.194.134.189:4796 ESTABLISHED
tcp 0 0 coolshell.cn:80 183.60.212.163:51082 TIME_WAIT
tcp 0 1 coolshell.cn:80 208.115.113.92:50601 LAST_ACK
tcp 0 0 coolshell.cn:80 123.169.124.111:49840 ESTABLISHED
tcp 0 0 coolshell.cn:80 117.136.20.85:50025 FIN_WAIT2
tcp 0 0 :::22 :::* LISTEN
Some Code Examples:
# retrive spefic fields
awk '{print $1, $4}' netstat.txt
# format fields
awk '{printf "%-8s %-8s %-8s %-18s %-22s %-15s\n",$1,$2,$3,$4,$5,$6}' netstat.txt
# apply filter
awk '$3==0 && $6=="LISTEN" ' netstat.txt
awk ' $3>0 {print $0}' netstat.txt
awk '$3==0 && $6=="LISTEN" || NR==1 ' netstat.txt # plus first row
built-in variables
variable 说明
$0
当前记录(这个变量中存放着整个行的内容)
$1~$n
当前记录的第n个字段,字段间由FS分隔
FS
Field Separator. 输入字段分隔符 默认是空格或Tab
NF
Number of Fields. 当前记录中的字段个数,就是有多少列
NR
Number of Lines Read so far. 已经读出的记录数,就是行号,从1开始,如果有多个文件话,这个值也是不断累加中。
FNR
当前记录数,与NR不同的是,这个值会是各个文件自己的行号
RS
Record Separator. 输入的记录分隔符, 默认为换行符
OFS
Output FS. 输出字段分隔符, 默认也是空格
ORS
Output RS. 输出的记录分隔符,默认为换行符
FILENAME
当前输入文件的名字
more examples
awk 'BEGIN{FS=":"} {print $1,$3,$6}' /etc/passwd
awk -F: '{print $1,$3,$6}' /etc/passwd
# multi FS
awk -F '[;:]'
awk -F: '{print $1,$3,$6}' OFS="\t" /etc/passwd
# regexpr
awk '$6 ~ /WAIT|TIME/ || NR==1 {print NR,$4,$5,$6}' OFS="\t" netstat.txt # match WAIT or TIME
# awk 拆分文件很简单,使用重定向就好了
awk 'NR!=1{print > $6}' netstat.txt
# bonus
awk 'NR!=1{if($6 ~ /TIME|ESTABLISHED/) print > "1.txt";
else if($6 ~ /LISTEN/) print > "2.txt";
else print > "3.txt" }' netstat.txt
for i in {1,2,3}.txt; do echo $i; cat $i; done
# sum file size
$ ls -l *.cpp *.c *.h | awk '{sum+=$5} END {print sum}'
awk 'NR!=1{a[$6]++;} END {for (i in a) print i ", " a[i];}' netstat.txt
awk -f ./call.awk score.txt
a complicated example
$ file cal.awk
#!/bin/awk -f
#运行前
BEGIN {
math = 0, english = 0, computer = 0
printf "NAME NO. MATH ENGLISH COMPUTER TOTAL\n"
printf "---------------------------------------------\n"
}
#运行中
{
math+=$3, english+=$4, computer+=$5
printf "%-6s %-6s %4d %8d %8d %8d\n", $1, $2, $3,$4,$5, $3+$4+$5
}
#运行后
END {
printf "---------------------------------------------\n"
printf " TOTAL:%10d %8d %8d \n", math, english, computer
printf "AVERAGE:%10.2f %8.2f %8.2f\n", math/NR, english/NR, computer/NR
}
$ score.txt
Marry 2143 78 84 77 239
Jack 2321 66 78 45 189
Tom 2122 48 77 71 196
Mike 2537 87 97 95 279
和环境变量交互:(使用 -v
参数和 ENVIRON
)
x=5 && y = 10 && export y
awk -v val=$x '{print $1, $2, $3, $4+val, $5+ENVIRON["y"]}' OFS="\t" score.txt
more tricks
# 打印99乘法表
seq 9 | sed 'H;g' | awk -v RS='' '{for(i=1;i<=NF;i++)printf("%dx%d=%d%s", i, NR, i*NR, i==NR?"\n":"\t")}'
The AWK Programming Language
写了点读书笔记,戳这里
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK