Home  >  Article  >  Backend Development  >  Can php use hadoop?

Can php use hadoop?

(*-*)浩
(*-*)浩Original
2019-10-14 15:07:164232browse

Hadoop is a distributed system infrastructure developed by the Apache Foundation. Users can develop distributed programs without understanding the underlying details of distribution. Make full use of the power of clusters for high-speed computing and storage.

Can php use hadoop?

Although Hadoop is written in java, Hadoop provides Hadoop stream. Hadoop stream provides an API that allows users to write map functions and reduce using any language. function. (Recommended learning: PHP video tutorial)

The key to Hadoop flow is that it uses UNIX standard flow as the interface between the program and Hadoop. Therefore, as long as any program can read data from the standard input stream and write data to the standard output stream, the map function and reduce function of the MapReduce program can be written in any language through the Hadoop stream.

For example:

bin/hadoop jar contrib/streaming/hadoop-streaming-0.20.203.0.jar -mapper /usr/local/hadoop/mapper.php -reducer /usr/local/hadoop/reducer.php -input test/* -output out4

The package introduced by Hadoop streaming: hadoop-streaming-0.20.203.0.jar. There is no hadoop-streaming.jar in the Hadoop root directory because streaming is a contrib. So you have to find it under contrib. Taking hadoop-0.20.2 as an example, it is here:

-input: Specify the path to the input hdfs file

-output: Specify the path to the output hdfs file

-mapper: Specify the map function

-reducer: Specify the reduce function

mapper function

mapper.php file, write Enter the following code:

#!/usr/local/php/bin/php
<?php
$word2count = array();
// input comes from STDIN (standard input)
// You can this code :$stdin = fopen(“php://stdin”, “r”);
while (($line = fgets(STDIN)) !== false) {
    // remove leading and trailing whitespace and lowercase
    $line = strtolower(trim($line));
    // split the line into words while removing any empty string
    $words = preg_split(&#39;/\W/&#39;, $line, 0, PREG_SPLIT_NO_EMPTY);
    // increase counters
    foreach ($words as $word) {
        $word2count[$word] += 1;
    }
}
// write the results to STDOUT (standard output)
// what we output here will be the input for the
// Reduce step, i.e. the input for reducer.py
foreach ($word2count as $word => $count) {
    // tab-delimited
    echo $word, chr(9), $count, PHP_EOL;
}
?>

The above is the detailed content of Can php use hadoop?. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn