For example, the file 1.log contains:
id=1
a=1,b=2,c=3,d=4,e=5....,z=100
id=2
a=3,b=4,d=20,e=6,f=7,...,z=30
id=3
a=4,b=4,c=2,d=5,e=8,...,z=29
....
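Now I want to get the distribution of d in the log.
Is there a good way to do this? grep always outputs the whole line, so I can't extract the information for a single key.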
怪我咯2017-04-17 11:10:35
Awk solution:
#!/bin/bash
awk -F"," '
NF == 0 { next }                      # skip blank lines
NF == 1 { printf "%s ", $0; next }    # id line: print it without a newline
{                                     # data line: print the d=... field
    for (i = 1; i <= NF; i++) {
        split($i, a, "=");
        if (a[1] == "d") print $i;
    }
}
' 1.log
The results are as follows:
id=1 d=4
id=2 d=20
id=3 d=5
The advantage of awk is that it gives you fine-grained control over the input and output format.
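For readers who prefer to keep the id and the d value together as data rather than as printed text, the same pairing logic can be sketched in Python. This is only an illustration under assumptions: the function name pair_ids_with_d is made up, and the sample lines are the records from the question with the elided fields left out.

```python
def pair_ids_with_d(lines):
    """Return (id_line, d_value) pairs, following the same rules as the awk script."""
    pairs = []
    current_id = None
    for line in lines:
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        fields = line.split(",")
        if len(fields) == 1:
            if line.startswith("id="):
                current_id = line         # remember the most recent id line
            continue
        for field in fields:
            key, _, value = field.partition("=")
            if key == "d":
                pairs.append((current_id, value))
    return pairs


# Sample records from the question, with the "..." parts omitted.
sample = [
    "id=1",
    "a=1,b=2,c=3,d=4,e=5",
    "id=2",
    "a=3,b=4,d=20,e=6,f=7",
    "id=3",
    "a=4,b=4,c=2,d=5,e=8",
]

for record_id, d in pair_ids_with_d(sample):
    print(record_id, "d=" + d)
```

To run it against the real file, pass an open file object instead of the sample list, for example pair_ids_with_d(open("1.log")).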
ringa_lee2017-04-17 11:10:35
First filter out the id= lines, because id= also contains d=. Then use grep's -o option, which prints only the matched part of each line. To get just the number, pipe the result through awk or cut.
grep -v "id=[0-9]*" 1.log | grep -o "d=[0-9]*" | awk -F'=' '{ print $2 }'
Or, use egrep,
grep -v "id=[0-9]*" 1.log | egrep -o "d=[0-9]+" | cut -d '=' -f 2
There are many other ways to do this; sed would work too.
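The commands above print one d value per line. To get the distribution the question asks for, the values still have to be tallied. A minimal Python sketch of that counting step follows. The name d_distribution is made up for illustration, and the fourth sample record is hypothetical, added only so that one value occurs twice.

```python
import re
from collections import Counter

# \b stops the "d=" inside "id=" from matching.
D_VALUE = re.compile(r"\bd=(\d+)")


def d_distribution(lines):
    """Count how often each value of d occurs in the given lines."""
    counts = Counter()
    for line in lines:
        for value in D_VALUE.findall(line):
            counts[int(value)] += 1
    return counts


sample = [
    "id=1", "a=1,b=2,c=3,d=4,e=5",
    "id=2", "a=3,b=4,d=20,e=6,f=7",
    "id=3", "a=4,b=4,c=2,d=5,e=8",
    "id=4", "a=9,b=1,c=2,d=4,e=0",   # hypothetical record, not in the original log
]

# Print "value count" pairs in ascending order of value.
for value, count in sorted(d_distribution(sample).items()):
    print(value, count)
```

For the sample this prints 4 with a count of 2, and 5 and 20 with a count of 1 each. For a real log, d_distribution(open("1.log")) should work the same way, since a file object yields its lines one by one.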
PHP中文网2017-04-17 11:10:35
Here is another idea...
mv 1.log /opt/www/1.log
Then create a new script, 1.php, to process it. The script is as follows:
<?php
// Read the whole log and split it on commas.
$str = file_get_contents("1.log");
$arr = explode(",", $str);
$new_arr = [];
foreach ($arr as $k => $v) {
    $b = explode("=", trim($v));
    // Keep only the values of fields named d.
    if ($b[0] == "d") {
        $new_arr[] = $b[1];
    }
}
print_r($new_arr);
?>
ringa_lee2017-04-17 11:10:35
This job is better suited to awk or flex.
flex:
$ cat 1.l
%%
d=[0-9]*,    printf("%d\n", atoi(yytext + 2));
.|\n
$ flex 1.l && gcc lex.yy.c -lfl && ./a.out < 1.log
4
20
5
怪我咯2017-04-17 11:10:35
This kind of log processing can be done with awk, perl, or ruby. Here is a perl version:
perl -ne 'print "$1\n" if m/\bd=(\d+)/' your_log_file
PHP中文网2017-04-17 11:10:35
Use Python; it works well on any OS.
import re

# \b keeps the d in id= from matching
_re = re.compile(r'\bd=\d+')
with open('1.log') as f:
    for line in f:
        matched = _re.search(line)
        if matched:
            extracted = matched.group(0)
            print(extracted)
ringa_lee2017-04-17 11:10:35
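Use the cut command.
cut -d 'delimiter' -f 'field numbers to select'
There is also a -c option, which selects by character position instead of by field.
$ cut -s -d ',' -f 4 1.log
Here -s skips lines that contain no delimiter, such as the id lines. This only works if d is always the fourth field on every data line.
See man cut for the details.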
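One caveat with selecting by position: fields can be missing (the id=2 record in the question has no c), so the fourth field is not always d. A small Python sketch shows the difference between position-based and key-based lookup. The helper names field_by_position and field_by_key are made up for illustration, and the sample lines are the first two data lines from the question with the elided fields left out.

```python
def field_by_position(line, n):
    """Return the n-th comma-separated field (1-based), roughly like cut -d ',' -f n."""
    return line.split(",")[n - 1]


def field_by_key(line, key):
    """Return the value stored under key in a 'k=v,k=v' line, or None if absent."""
    for field in line.split(","):
        k, _, v = field.partition("=")
        if k == key:
            return v
    return None


lines = [
    "a=1,b=2,c=3,d=4,e=5",     # d is the fourth field here
    "a=3,b=4,d=20,e=6,f=7",    # c is missing, so d is the third field
]

for line in lines:
    print(field_by_position(line, 4), field_by_key(line, "d"))
```

On the second line the positional lookup returns e=6 while the key lookup still returns 20, which is why the key-based approaches in the other answers are safer for this log.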