You can rotate the log hourly and parse each file with a PHP regular expression:
$logLine = '127.0.0.1 - - [22/May/2015:17:09:13 +0800] "GET /sale/images/y-select.png HTTP/1.1" 200 1095';
// Named groups capture the client IP, timestamp, URL, status code and response size.
$pattern = '/^(?P<ip>[0-9.]+) - - \[(?P<time>[^\]]+)\] "GET (?P<url>[^ ]+) HTTP\/1\.[012]" (?P<status>[0-9]{3}) (?P<size>[0-9]+|-)/i';
if (preg_match($pattern, $logLine, $match)) {
    //var_dump($match);
    $ip = $match['ip'];
    $time = strtotime($match['time']); // Unix timestamp
    $url = $match['url'];
    $status = $match['status'];
    $size = $match['size'];
    printf("IP: %s Access time: %s URL: %s Status: %s File size: %s\n", $ip, $time, $url, $status, $size);
}
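To apply the same pattern to a whole hourly log file, the match can be run line by line. The following is a minimal sketch, not a complete analyzer: the function name `countHitsByUrl` and the sample lines are hypothetical, and lines the pattern does not cover (for example, non-GET requests) are simply skipped.

```php
<?php
// Hypothetical helper: counts GET requests per URL across an iterable of log lines.
function countHitsByUrl(iterable $lines): array
{
    $pattern = '/^(?P<ip>[0-9.]+) - - \[(?P<time>[^\]]+)\] "GET (?P<url>[^ ]+) HTTP\/1\.[012]" (?P<status>[0-9]{3}) (?P<size>[0-9]+|-)/';
    $hits = [];
    foreach ($lines as $line) {
        if (preg_match($pattern, $line, $m) !== 1) {
            continue; // skip lines the pattern does not cover
        }
        $hits[$m['url']] = ($hits[$m['url']] ?? 0) + 1;
    }
    arsort($hits); // most requested URLs first
    return $hits;
}

// Sample lines made up for illustration.
$sample = [
    '127.0.0.1 - - [22/May/2015:17:09:13 +0800] "GET /sale/images/y-select.png HTTP/1.1" 200 1095',
    '127.0.0.1 - - [22/May/2015:17:09:14 +0800] "GET /sale/index.php HTTP/1.1" 200 5120',
    '127.0.0.1 - - [22/May/2015:17:09:15 +0800] "GET /sale/images/y-select.png HTTP/1.1" 304 -',
];
$hits = countHitsByUrl($sample);
print_r($hits);
```

For a real file, the lines could be loaded with `file($path, FILE_IGNORE_NEW_LINES)` and passed to the same function, where `$path` is whichever hourly log you want to analyze.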
Alternatively, you can use the approach described in the following article:
Use regular expressions to separate Apache log files
Source: www.MyException.Cn, shared by netizens on 2015-08-26
Example of an Apache log entry in Common Log Format:
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
Example of an Apache log entry in Combined Log Format:
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"
The fields, in order, are:
The IP address of the client.
The RFC 1413 identity of the client as determined by identd on the client's machine. A "-" in the output means the information is not available.
The user ID (userid) of the person requesting the page, as determined by HTTP authentication. If the page is not password protected, this field is "-".
The time the request was received.
The request line sent by the client: the method, the requested resource, and the protocol.
The status code returned by the server to the client.
The number of bytes returned to the client, excluding response headers. If no content is returned, this field is "-".
The "Referer" request header (Combined Log Format only).
The "User-Agent" request header (Combined Log Format only).
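Two of these fields usually need conversion before analysis: the timestamp and the byte count. The following is a minimal sketch, assuming the default Apache time format shown in the examples above; the helper names `parseLogTime` and `parseLogBytes` are hypothetical.

```php
<?php
// Hypothetical helper: converts "[10/Oct/2000:13:55:36 -0700]" to a DateTimeImmutable.
// Returns null when the string does not follow the assumed format.
function parseLogTime(string $raw): ?DateTimeImmutable
{
    $dt = DateTimeImmutable::createFromFormat('d/M/Y:H:i:s O', trim($raw, '[]'));
    return $dt === false ? null : $dt;
}

// Hypothetical helper: treats the "-" placeholder as zero bytes.
function parseLogBytes(string $raw): int
{
    return $raw === '-' ? 0 : (int) $raw;
}

$time = parseLogTime('[10/Oct/2000:13:55:36 -0700]');
echo $time->format(DATE_ATOM), "\n";
echo parseLogBytes('2326'), ' ', parseLogBytes('-'), "\n";
```

Keeping the time as a `DateTimeImmutable` rather than a bare Unix timestamp preserves the time zone offset recorded in the log, which can matter when logs from several servers are merged.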
The regular expression used to extract the fields is built from the following parts:
^: matches the beginning of the line.
([0-9.]+)\s: matches the IP address.
([\w.-]+)\s: matches the identity, consisting of letters, digits, underscores, dots or hyphens.
([\w.-]+)\s: matches the userid, using the same character set.
(\[[^\[\]]+\])\s: matches the time, including the surrounding square brackets.
"((?:[^"\\]|\\.)+)"\s: matches the request line. Escaped double quotes (\") may appear inside the double quotes.
(\d{3})\s: matches the status code.
(\d+|-)\s: matches the number of response bytes, or -.
"((?:[^"\\]|\\.)+)"\s: matches the "Referer" request header. Escaped double quotes may appear inside the double quotes.
"((?:[^"\\]|\\.)+)": matches the "User-Agent" request header. Escaped double quotes may appear inside the double quotes.
$: matches the end of the line.
The final expression is as follows:
^([0-9.]+)\s([\w.-]+)\s([\w.-]+)\s(\[[^\[\]]+\])\s"((?:[^"\\]|\\.)+)"\s(\d{3})\s(\d+|-)\s"((?:[^"\\]|\\.)+)"\s"((?:[^"\\]|\\.)+)"$
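In PHP, an expression of this shape could be applied with `preg_match`. The sketch below is not the article's own code. It writes the expression out with its backslash escapes (`\s`, `\w`, `\d`, `\[`), matches an escaped character inside quotes with `\\.`, and uses `~` as the delimiter. The constant `COMBINED_LOG_PATTERN`, the function `parseCombinedLogLine`, and the array keys are hypothetical names chosen for illustration.

```php
<?php
// A nowdoc keeps the backslashes exactly as the regex engine should see them.
const COMBINED_LOG_PATTERN = <<<'RE'
~^([0-9.]+)\s([\w.-]+)\s([\w.-]+)\s(\[[^\[\]]+\])\s"((?:[^"\\]|\\.)+)"\s(\d{3})\s(\d+|-)\s"((?:[^"\\]|\\.)+)"\s"((?:[^"\\]|\\.)+)"$~
RE;

// Hypothetical helper: splits one combined-format log line into named fields.
// Returns null when the line does not match the pattern.
function parseCombinedLogLine(string $line): ?array
{
    if (preg_match(COMBINED_LOG_PATTERN, rtrim($line), $m) !== 1) {
        return null;
    }
    $keys = ['ip', 'identity', 'userid', 'time', 'request',
             'status', 'bytes', 'referer', 'user_agent'];
    // $m[0] is the whole match; groups 1..9 map onto the nine fields.
    return array_combine($keys, array_slice($m, 1));
}

$line = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 '
      . '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"';
$fields = parseCombinedLogLine($line);
print_r($fields);
```

Because the expression is anchored at both ends and expects all nine fields, a line in Common Log Format (without the Referer and User-Agent fields) makes this sketch return `null`.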