Home  >  Article  >  Backend Development  >  Summary of high concurrency solutions for PHP read and write file conflicts

Summary of high concurrency solutions for PHP read and write file conflicts

伊谢尔伦
伊谢尔伦Original
2017-07-17 10:09:212346browse

In PHP, flock doesn't seem to work that well! In the case of multiple concurrency, it seems that resources are often monopolized and not released immediately, or not released at all, resulting in deadlock, which causes the server's CPU usage to be very high, and sometimes even causes the server to die completely. It seems that this happens in many linux/unix systems. Therefore, you must consider carefully before using flock.

If we use flock() properly, it is entirely possible to solve the deadlock problem. Of course, if you do not consider using the flock() function, there will also be a good solution to our problem. The solutions are roughly summarized as follows.
Option 1: Set a timeout when locking the file. The rough implementation is as follows:

if($fp=fopen($fileName,'a')){
 $startTime=microtime();
 do{
  $canWrite=flock($fp,LOCK_EX);
  if(!$canWrite){
   usleep(round(rand(0,100)*1000));
  }
 }while((!$canWrite)&&((microtime()-$startTime)<1000));
 if($canWrite){
  fwrite($fp,$dataToSave);
 }
 fclose($fp);
}

The timeout is set to 1ms. If the lock is not obtained within this time, it will be obtained repeatedly until the right to operate the file is obtained, of course. If the timeout limit has been reached, you must exit immediately and give up the lock to allow other processes to operate.

Option 2: Do not use the flock function and use temporary files to solve the problem of read and write conflicts. The general principle is as follows:
(1) Put a copy of the file that needs to be updated into our temporary file directory, save the last modification time of the file to a variable, and pick a random one for this temporary file, which is not easy Duplicate filename.
(2) After updating this temporary file, check whether the last update time of the original file is consistent with the previously saved time.
(3) If the last modification time is the same, rename the modified temporary file to the original file. In order to ensure that the file status is updated synchronously, the file status needs to be cleared.
(4) However, if the last modification time is consistent with the previously saved one, it means that the original file has been modified during this period. At this time, the temporary file needs to be deleted and then return false, indicating that the file has been modified at this time. There are other processes in progress.
The implementation code is as follows:

$dir_fileopen=&#39;tmp&#39;;
function randomid(){
    return time().substr(md5(microtime()),0,rand(5,12));
}
function cfopen($filename,$mode){
    global $dir_fileopen;
    
clearstatcache
();
    do{
  $id=md5(randomid(rand(),TRUE));
        $tempfilename=$dir_fileopen.&#39;/&#39;.$id.md5($filename);
    } while(
file_exists
($tempfilename));
    if(file_exists($filename)){
        $newfile=false;
        copy($filename,$tempfilename);
    }else{
        $newfile=true;
    }
    $fp=fopen($tempfilename,$mode);
    return $fp?array($fp,$filename,$id,@
filemtime
($filename)):false;
}
function cfwrite($fp,$string){
 return fwrite($fp[0],$string);
}
function cfclose($fp,$debug=&#39;off&#39;){
    global $dir_fileopen;
    $success=fclose($fp[0]);
    clearstatcache();
    $tempfilename=$dir_fileopen.&#39;/&#39;.$fp[2].md5($fp[1]);
    if((@filemtime($fp[1])==$fp[3])||($fp[4]==true&&!file_exists($fp[1]))||$fp[5]==true){
        rename($tempfilename,$fp[1]);
    }else{
        unlink($tempfilename);
  //说明有其它进程 在操作目标文件,当前进程被拒绝
        $success=false;
    }
    return $success;
}
$fp=cfopen(&#39;lock.txt&#39;,&#39;a+&#39;);
cfwrite($fp,"welcome to beijing.\n");
fclose($fp,&#39;on&#39;);

For the functions used in the above code, it is necessary to explain:
(1) rename(); to rename a file or a directory, this function is actually more like mv in linux. It is convenient to update the path or name of a file or directory. But when I test the above code in window, if the new file name already exists, a notice will be given saying that the current file already exists. But it works fine under linux.
(2) clearstatcache(); Clear the status of the file. PHP will cache all file attribute information to provide higher performance, but sometimes, multiple processes are deleting or updating files. , PHP has not had time to update the file attributes in the cache, which can easily lead to the fact that the last update time accessed is not real data. So here you need to use this function to clear the saved cache.

Option 3: Randomly read and write the operated files to reduce the possibility of concurrency.
This solution seems to be used more often when recording user access logs. Previously, we needed to define a random space. The larger the space, the smaller the possibility of concurrency. Assuming that the random read and write space is [1-500], then the distribution of our log files ranges from log1 to log500. Every time a user accesses, data is randomly written to any file between log1~log500. At the same time, there are two processes recording logs. Process A may be the updated log32 file, but what about process B? Then the update at this time may be log399. You must know that if you want process B to also operate log32, the probability is basically 1/500, which is almost equal to zero. When we need to analyze access logs, here we only need to merge these logs first and then analyze them. One benefit of using this solution to record logs is that the possibility of queuing process operations is relatively small, allowing the process to complete each operation very quickly.

Option 4: Put all processes to be operated into a queue. Then put a dedicated service to complete the file operation. Each excluded process in the queue is equivalent to the first specific operation, so for the first time our service only needs to obtain the specific operation items from the queue. If there are still a large number of file operation processes here, it doesn't matter. , just queue to the back of our queue. As long as you are willing to queue, it doesn’t matter how long the queue is.

For the previous solutions, each has its own benefits! It can be roughly divided into two categories:
(1) queuing is required (slow impact), such as options 1, 2, and 4
(2) no queuing is required. (Fast impact) Option 3
When designing the cache system, generally we will not use Option 3. Because the analysis program and the writing program of Plan 3 are not synchronized, when writing, the difficulty of analysis is not considered at all, as long as the writing is good. Just imagine, if we also use random file reading and writing when updating a cache, it seems that a lot of processes will be added when reading the cache. But options one and two are completely different. Although the writing time needs to wait (when acquiring the lock is unsuccessful, it will be acquired repeatedly), but reading the file is very convenient. The purpose of adding cache is to reduce data reading bottlenecks and thereby improve system performance.

The above is the detailed content of Summary of high concurrency solutions for PHP read and write file conflicts. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn