
Play with Nginx logs purely manually

WBOY
2016-08-08 09:22:34
Nginx logs are an underused resource for most people. Drawing on my experience building a log analysis system, I would like to share a purely manual way of analyzing them. Two directives control Nginx logging: access_log and log_format. A typical configuration:

    log_format old '$remote_addr [$time_local] $status $request_time $body_bytes_sent '
                   '"$request" "$http_referer" "$http_user_agent"';
    access_log /data/logs/nginx-access.log old;

Most people who have used Nginx are familiar with this kind of log configuration and the output it produces. The format is readable, but it is hard to process programmatically. Nginx's flush policy is also configurable. For example, to write to disk only when a 32k buffer fills up, and to force a flush after 5 seconds if the buffer has not filled by then (the buffer and flush parameters follow the format name):

    access_log /data/logs/nginx-access.log old buffer=32k flush=5s;

This setting trades off how soon log entries become visible against the disk I/O that logging generates. Nginx can also record many variables that do not appear in the default configuration, for example:

  • Request data size: $request_length
  • Return data size: $bytes_sent
  • Request time: $request_time
  • Connection serial number: $connection
  • Number of requests made over the current connection: $connection_requests

The default format is hard to compute on. It needs to be converted into a format that tools can split reliably, for example by separating fields with the control character ^A (typed as ctrl+v ctrl+a in a terminal, for example on a Mac). The log_format then becomes:

    log_format new '$remote_addr^A$http_x_forwarded_for^A$host^A$time_local^A$status^A'
                   '$request_time^A$request_length^A$bytes_sent^A$http_referer^A$request^A$http_user_agent';

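To see how the fields of the `new` format line up with awk's field numbers, here is a minimal sketch. It assumes a POSIX shell and awk; the `SEP` variable name and every sample value are invented for illustration and do not come from a real log.

```shell
#!/bin/sh
# SEP holds the literal ^A (0x01) character; printf's \001 escape emits it.
SEP=$(printf '\001')

# One sample line in the "new" format. All values are invented.
sample_line="203.0.113.7${SEP}-${SEP}example.com${SEP}08/Aug/2016:09:00:12 +0800${SEP}200${SEP}0.153${SEP}512${SEP}2048${SEP}-${SEP}GET /index.html HTTP/1.1${SEP}curl/7.50.0"

# Print every field next to its awk index, one per line.
field_map=$(printf '%s\n' "$sample_line" |
  awk -F "$SEP" '{ for (i = 1; i <= NF; i++) printf "$%d = %s\n", i, $i }')
printf '%s\n' "$field_map"
```

With this layout, `$5` is the status, `$6` the request time, `$8` the bytes sent, `$10` the request line, and `$11` the user agent.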
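After that, the log can be analyzed with common Linux command-line tools. In the commands below, ^A stands for the literal control character (typed as ctrl+v ctrl+a), not a caret followed by the letter A: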
  • Find the most frequently requested URLs and their hit counts:

    cat access.log | awk -F '^A' '{print $10}' | sort | uniq -c | sort -rn | head

  • Find the requests in the current log file that returned a 500 error:

    cat access.log | awk -F '^A' '{if($5 == 500) print $0}'

  • Find the number of 500 errors in the current log file:

    cat access.log | awk -F '^A' '{if($5 == 500) print $0}' | wc -l

  • Count the 500 errors in a given minute (here 09:00):

    cat access.log | awk -F '^A' '{if($5 == 500) print $0}' | grep ':09:00:' | wc -l

  • Find slow requests that take more than 1s:

    tail -f access.log | awk -F '^A' '{if($6>1) print $0}'

  • To print only selected fields of those slow requests (here host and time):

    tail -f access.log | awk -F '^A' '{if($6>1) print $3"|"$4}'

  • Find the URLs with the most 502 errors:

    cat access.log | awk -F '^A' '{if($5==502) print $10}' | sort | uniq -c | sort -rn | head

  • Find 200 responses that are probably blank pages (fewer than 100 bytes sent):

    cat access.log | awk -F '^A' '{if($5==200 && $8 < 100) print $3"|"$4"|"$10"|"$6}'

  • View real-time log data flow

    tail -f access.log | cat -e

    or

    tail -f access.log | tr '^A' '|'
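
The commands above can be tried end to end on a small sample. The following is a sketch under stated assumptions, not a production script: it writes a few invented log lines in the `new` format to a temporary file, keeps the separator in a shell variable (`SEP`, a name chosen here) so that ^A never has to be typed by hand, and then runs three of the analyses. The `add_line` helper is hypothetical and exists only to build the sample data.

```shell
#!/bin/sh
# SEP holds the literal ^A character, so it can be passed to awk -F directly.
SEP=$(printf '\001')
log=$(mktemp)

# add_line: helper invented for this sketch. It appends one line in the
# "new" format. Arguments: status, request_time, bytes_sent, request line.
add_line() {
  printf '%s\n' "203.0.113.7${SEP}-${SEP}example.com${SEP}08/Aug/2016:09:00:12 +0800${SEP}$1${SEP}$2${SEP}300${SEP}$3${SEP}-${SEP}$4${SEP}curl/7.50.0" >> "$log"
}

add_line 200 0.020 2048 'GET /index.html HTTP/1.1'
add_line 500 1.250 512  'GET /api/orders HTTP/1.1'
add_line 200 0.031 2048 'GET /index.html HTTP/1.1'
add_line 502 0.004 166  'GET /api/orders HTTP/1.1'
add_line 500 0.900 512  'GET /api/users HTTP/1.1'
add_line 200 0.027 2048 'GET /index.html HTTP/1.1'

# Most requested URL: count request lines ($10), sort by count, keep the top.
top_request=$(awk -F "$SEP" '{ print $10 }' "$log" | sort | uniq -c | sort -rn | head -n 1)

# Number of responses with status 500 ($5).
count_500=$(awk -F "$SEP" '$5 == 500' "$log" | wc -l | tr -d ' ')

# Slow requests (request_time above 1s), printing host and time only.
slow=$(awk -F "$SEP" '$6 > 1 { print $3 "|" $4 }' "$log")

printf 'top request: %s\n500 count: %s\nslow: %s\n' "$top_request" "$count_500" "$slow"
rm -f "$log"
```

Passing the separator through a variable also makes the pipelines safe to copy and paste, since a literal control character is easily lost when text is copied from a web page.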

Summary

Following this approach, many other analyses are possible, such as the most frequent user agents, the IPs with the highest request rates, request latency analysis, and response size analysis. This is the prototype of a large-scale web log analysis system, and the format is also convenient for later large-scale batch and stream processing.
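
As a sketch of the further analyses mentioned in the summary, the following runs a few of them on invented sample data: the most active client IP, the most common user agent, the average request time, and the average response size. The `add_line` helper, the `SEP` variable, and all sample values are assumptions made for this example.

```shell
#!/bin/sh
# SEP holds the literal ^A character used as the field separator.
SEP=$(printf '\001')
log=$(mktemp)

# add_line: hypothetical helper that appends one line in the "new" format.
# Arguments: client IP, request_time, bytes_sent, user agent.
add_line() {
  printf '%s\n' "$1${SEP}-${SEP}example.com${SEP}08/Aug/2016:09:00:12 +0800${SEP}200${SEP}$2${SEP}300${SEP}$3${SEP}-${SEP}GET / HTTP/1.1${SEP}$4" >> "$log"
}

add_line 203.0.113.7  0.100 1000 'curl/7.50.0'
add_line 203.0.113.7  0.300 3000 'Mozilla/5.0'
add_line 198.51.100.2 0.200 2000 'Mozilla/5.0'
add_line 203.0.113.7  0.400 2000 'Mozilla/5.0'

# strip_count removes the leading count that uniq -c prints.
strip_count() { sed 's/^ *[0-9]* //'; }

# IP with the most requests ($1) and most common user agent ($11).
top_ip=$(awk -F "$SEP" '{ print $1 }' "$log" | sort | uniq -c | sort -rn | head -n 1 | strip_count)
top_ua=$(awk -F "$SEP" '{ print $11 }' "$log" | sort | uniq -c | sort -rn | head -n 1 | strip_count)

# Average request time ($6) in seconds and average bytes sent ($8).
avg_time=$(awk -F "$SEP" '{ sum += $6 } END { if (NR > 0) printf "%.3f", sum / NR }' "$log")
avg_bytes=$(awk -F "$SEP" '{ sum += $8 } END { if (NR > 0) printf "%d", sum / NR }' "$log")

printf 'top ip: %s\ntop ua: %s\navg time: %ss\navg bytes: %s\n' "$top_ip" "$top_ua" "$avg_time" "$avg_bytes"
rm -f "$log"
```

Averages hide outliers, so for latency work a sorted list of `$6` values or a count of requests above a threshold is often more telling than the mean alone.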
