
Using a PHP script in Hive to complete map/reduce

WBOY
2016-07-29 09:15:54

Hive SQL has a fairly powerful feature: it can hand rows to an external script to do map/reduce work. The syntax is:
TRANSFORM(...) USING '...' AS (...)
The command given to USING can be a PHP script. The example below shows the specific usage.
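As a rough sketch of the contract such a script follows: with the default settings, Hive writes each input row to the script's standard input as one line with the columns separated by tabs, and reads tab-separated lines back from its standard output. The helper name transformStream and the sample rows below are assumptions made for illustration; they are not part of Hive or of the article's script. The demo uses in-memory streams so that it runs without Hive.

```php
<?php
/**
 * Hypothetical helper (not part of Hive): reads tab-separated rows from $in,
 * passes each row's columns through $rowFn, and writes the result to $out.
 */
function transformStream($in, $out, callable $rowFn): void
{
    while (($line = fgets($in)) !== false) {
        $line = rtrim($line, "\r\n");
        if ($line === '') {
            continue; // skip blank lines
        }
        $columns = $rowFn(explode("\t", $line));
        fwrite($out, implode("\t", $columns) . "\n");
    }
}

// Demo input held in memory instead of STDIN (sample rows are illustrative).
$in = fopen('php://memory', 'w+');
fwrite($in, "1\tzhangsan\tzs1024\n2\tlisi\tls1991\n");
rewind($in);
$out = fopen('php://memory', 'w+');

// Example row function: upper-case the second column.
transformStream($in, $out, function (array $c): array {
    $c[1] = strtoupper($c[1]);
    return $c;
});

rewind($out);
$result = stream_get_contents($out);
echo $result; // the two rows, with the second column upper-cased
```

In a script run by Hive, the same function would presumably be called with STDIN and STDOUT in place of the in-memory streams.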

①, create a table. The fields terminated by "\t" clause is required; without it, the file import below would leave every value as NULL:

<code>hive> CREATE TABLE `member`(
    >   `id` int,
    >   `user_name` string,
    >   `passwd` string
    > )
    > row format delimited
    > fields terminated by "\t"
    > stored as textfile;</code>

②, prepare the following data in /tmp/member.dat (the columns must be separated by tab characters, to match the table definition):

<code>1     zhangsan     zs1024
2     lisi     ls1991
3     wangwu     ww2001
4     liumang     lm1234
5     linxing     lx1990</code>

③, import the data

<code>hive> load data local inpath '/tmp/member.dat' into table member;</code>

You can see:

<code>hive> select * from member;
OK
1       zhangsan        zs1024
2       lisi    ls1991
3       wangwu  ww2001
4       liumang lm1234
5       linxing lx1990</code>

The data is ready. Now we hash the third column (passwd) with md5, and we use a PHP script to do it. The code of the PHP script (/tmp/changePasswd.php) is as follows:

<code><?php
// Read the rows passed in by Hive line by line; fgets() returns false at EOF
while (($line = fgets(STDIN)) !== false) {
     $line = rtrim($line, "\n");
     if ($line === '') continue;          // blank line: skip this iteration
     $data = explode("\t", $line);        // columns: id, user_name, passwd
     if (isset($data[2])) {
          $data[2] = md5($data[2]);       // hash only the third column (passwd)
     }
     echo implode("\t", $data) . "\n";    // write the row back, tab-separated
}</code>
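One detail worth guarding against: with Hive's default TRANSFORM serialization, a NULL column reaches the script as the two-character string \N, so hashing it blindly would turn a NULL password into a hash. A minimal sketch of per-row logic that hashes only the third column and leaves that marker alone follows; the function name hashPasswdColumn and the row with id 6 are hypothetical and do not appear in the article's data.

```php
<?php
/**
 * Hypothetical helper: hash only the third column (passwd) of one row.
 * Assumes Hive's default TRANSFORM serialization, where NULL arrives as \N.
 */
function hashPasswdColumn(array $columns): array
{
    // '\N' in single quotes is a literal backslash followed by N.
    if (isset($columns[2]) && $columns[2] !== '\N') {
        $columns[2] = md5($columns[2]);
    }
    return $columns;
}

// A row from the article's data: only the last column is replaced by a hash.
$row = hashPasswdColumn(['1', 'zhangsan', 'zs1024']);

// A made-up row whose passwd is NULL: the \N marker is passed through as-is.
$nullRow = hashPasswdColumn(['6', 'nobody', '\N']);

echo implode("\t", $row) . "\n";
echo implode("\t", $nullRow) . "\n";
```

Keeping the id and user_name columns untouched matters here because the output is written back into a table whose id column is an int.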

④, add the PHP script to Hive:

<code>hive> add file /tmp/changePasswd.php;</code>

⑤, use the PHP script to execute map/reduce:

<code>hive> insert overwrite table member
    > select TRANSFORM(`id`, `user_name`, `passwd`) using "/usr/bin/php changePasswd.php"
    > as (`id`, `user_name`, `passwd`) from member;</code>

Finally, we can see that the data in the passwd column has changed:

<code>hive> select * from member;
OK
1       zhangsan        d03eed89429cc3006cc279322c2800c5
2       lisi    063401506c9d9f0e49a706e3779b7428
3       wangwu  ac5a8109dbbb46c9f69ffd5fc93c11f8
4       liumang fda8b97fd723bdbf6a754812b5ecec27
5       linxing 4035378ace8936e93d95aa77e7e224d4</code>

Copyright statement: This is an original article by the blogger; please cite the source when reprinting.

