Below is a log excerpt. I'd like to extract the content of each field with a regular expression; a regex in any language is fine.
<code>=====================[2016-03-03 14:56:36]==================
IP: 127.0.0.1
Agent: Dalvik/2.1.0 (Linux; U; Android 5.1.1; jacinto6evm Build/LMY48P)
URL: http://127.0.0.1/report?power={"charge_state":0,"battery":0.0}&location={"lat":0.0,"lng":0.0}&env={"air":{"pm25":0.0},"humidity":0.0}
POST: power={"charge_state":0,"battery":0.0}&location={"lat":0.0,"lng":0.0}&env={"air":{"pm25":0.0},"humidity":0.0}
COOKIE: C_TOKEN=98f1706e92ab-e3195b005d65c4aa7df00566be939841
ErrorCode: 0
Result: {"error_code":0,"error_msg":"","data":null,"time":1456988196}
=====================[2016-03-03 14:56:36]==================
IP:127.0.0.1
Agent: Dalvik/2.1.0 (Linux; U; Android 5.1.1; jacinto6evm Build/LMY48P)
URL: http://127.0.0.1/report?power={"charge_state":0,"battery":0.0}&location={"lat":0.0,"lng":0.0}&env={"air":{"pm25":0.0},"humidity":0.0}
POST: power={"charge_state":0,"battery":0.0}&location={"lat":0.0,"lng":0.0}&env={"air":{"pm25":0.0},"humidity":0.0}
COOKIE: C_TOKEN=98f1706e92ab-e3195b005d65c4aa7df00566be939841
ErrorCode: 0
Result: {"error_code":0,"error_msg":"","data":null,"time":1456988196}</code>
Thanks in advance. The pattern I wrote myself never extracts the fields correctly.
Here is what I wrote:
[^\[]+\[([^]]+)][^:]+:\s([^\n]+)[^:]+:\s([^\n]+)[^:]+:\s([^\n]+)[^:]+:\s([^\n]+)[^:]+:\s([^\n]+)[^:]+:\s([^\n]+)[^:]+:\s([^\n]+)[^:]+.*
Alibaba Cloud reports the pattern as valid, but it still extracts nothing, so I must be overlooking some detail.
Here is a Perl 5 version. I'm not very familiar with Perl 5, so it isn't well written:
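<code class="perl"># Match either the timestamp header line or a "Key: value" field line.
# (?:...) groups the alternatives; [...] would be a character class instead.
if ($line =~ m/^=+\[(\d+-\d+-\d+ \d+:\d+:\d+)\]=+|^(?:IP|Agent|URL|POST|COOKIE|ErrorCode|Result):\s*(.*)/) {
    # $1 holds the timestamp, $2 the field value; only one of them is set
    print((defined $1 ? $1 : $2) . "\n");
}</code>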
And here is a Perl 6 version:
<code class="perl">if $line ~~ /[ [\=]+\[(.*)\][\=]+ || [IP|Agent|URL|POST|COOKIE|ErrorCode|Result]\:(.*) ]/ { say $/; } </code>
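Since the question accepts any language, the same per-line idea can be sketched in Python. This is only an illustration under a few assumptions: each field sits on its own line, the field names are exactly the seven shown in the sample, and the helper name `match_line` and the `"time"` label are made up for this example. The pattern uses `:\s*` rather than `:\s` because the second sample record contains `IP:127.0.0.1` with no space after the colon.

```python
import re

# Header line: a run of '=', the bracketed timestamp, another run of '='
HEADER = re.compile(r"^=+\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]=+$")
# Field line: a known key, a colon, optional whitespace, then the value
FIELD = re.compile(r"^(IP|Agent|URL|POST|COOKIE|ErrorCode|Result):\s*(.*)$")


def match_line(line):
    """Return ("time", value), (key, value), or None for an unrecognised line."""
    line = line.strip()
    m = HEADER.match(line)
    if m:
        # "time" is a hypothetical label chosen for this sketch
        return ("time", m.group(1))
    m = FIELD.match(line)
    if m:
        return (m.group(1), m.group(2))
    return None
```

Anchoring the key at the start of the line means colons inside a value, such as the one in `http://`, are never mistaken for a field separator.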
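If the log file is very large, I wouldn't recommend a regular expression. You can instead read the file line by line and split each line as a string:
PHP code: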
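<code>$fp = fopen('xx.log', 'r');
$data = array();
while (($line = fgets($fp)) !== false) {
    $line = trim($line);
    // Skip blank lines
    if ($line === '') {
        continue;
    }
    // A line starting with ==== begins a new record
    if (strpos($line, '====') === 0) {
        if ($data) {
            // Process the previous record here
        }
        $data = array();
        // The header line is not a "key: value" pair, so do not split it
        continue;
    }
    // Split on the first colon only, so values such as URLs stay intact
    list($key, $value) = explode(':', $line, 2);
    // Store the field in the array
    $data[$key] = trim($value);
}
if ($data) {
    // Process the last record here
}
fclose($fp);
</code>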
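For comparison, here is a rough Python sketch of the same read-and-split approach, written as a generator so that only one record is held in memory at a time. It assumes every record starts with a line beginning with `====` and that each field is a `key: value` line; the function name `parse_records` and the `"time"` key are hypothetical choices for this example, not part of the log format.

```python
def parse_records(lines):
    """Yield one dict per log record from an iterable of lines."""
    record = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("===="):
            # A header line closes the previous record and opens a new one
            if record is not None:
                yield record
            # Strip the '=' runs and the brackets to keep just the timestamp
            record = {"time": line.strip("=[]")}
            continue
        if record is None:
            continue  # ignore anything that appears before the first header
        # Split on the first colon only, so URLs and JSON values stay intact
        key, sep, value = line.partition(":")
        if sep:
            record[key.strip()] = value.strip()
    # Emit the final record, which has no following header line
    if record is not None:
        yield record
```

An open file object can be passed directly, for example `for rec in parse_records(open("xx.log")): ...`, because iterating over a file yields its lines lazily.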