Developing Hadoop 2.x Map/Reduce Projects in Eclipse


WBOY

Jun 07, 2016, 04:34 PM


This article demonstrates how to develop a Map/Reduce project in Eclipse.

1. Environment

Hadoop 2.2.0
Eclipse Juno SR2
Hadoop2.x-eclipse-plugin (for how to build, install, and configure the plugin, see http://www.micmiu.com/bigdata/hadoop/hadoop2-x-eclipse-plugin-build-install/)

2. Create the MR project

Click File → New → Other..., select "Map/Reduce Project", enter the project name micmiu_MRDemo, and create the new project.

[Screenshot: eclipse-mr-01]

3. Create the Mapper and Reducer

Click File → New → Other... and select Mapper. The generated class extends Mapper automatically.

[Screenshot: eclipse-mr-03]

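Creating a Reducer works the same way as creating a Mapper; you then implement your own business logic. This article uses the WordCount example that ships with Hadoop for testing: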

package com.micmiu.mr;
/**
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
public class WordCount {
  public static class TokenizerMapper 
       extends Mapper<Object, Text, Text, IntWritable>{
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();
    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }
  public static class IntSumReducer 
       extends Reducer<Text,IntWritable,Text,IntWritable> {
    private IntWritable result = new IntWritable();
    public void reduce(Text key, Iterable<IntWritable> values, 
                       Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
      System.err.println("Usage: wordcount <in> <out>");
      System.exit(2);
    }
    //conf.set("fs.defaultFS", "hdfs://192.168.6.77:9000");
    // Job.getInstance replaces the Job constructor, which is deprecated in Hadoop 2.x.
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
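
To make the data flow of the listing above easier to follow, here is a minimal plain-Java sketch of the same map, combine, and reduce steps. It uses no Hadoop classes; the class name WordCountFlowSketch and the methods map, sumByKey, and run are hypothetical names chosen for illustration, and each input string stands in for one input split. It is only a model of the idea, not how Hadoop schedules the work.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Plain-Java model of the WordCount data flow; no Hadoop classes are used.
class WordCountFlowSketch {

  // "Map" step: emit one (word, 1) pair per token, as TokenizerMapper does.
  static List<Map.Entry<String, Integer>> map(String text) {
    List<Map.Entry<String, Integer>> out = new ArrayList<>();
    StringTokenizer itr = new StringTokenizer(text);
    while (itr.hasMoreTokens()) {
      out.add(new AbstractMap.SimpleEntry<>(itr.nextToken(), 1));
    }
    return out;
  }

  // "Reduce" step: group pairs by key and sum the values, as IntSumReducer does.
  // The same function can act as the combiner because integer addition is
  // associative and commutative, so summing partial sums gives the same total.
  static Map<String, Integer> sumByKey(List<Map.Entry<String, Integer>> pairs) {
    Map<String, Integer> sums = new TreeMap<>();
    for (Map.Entry<String, Integer> p : pairs) {
      sums.merge(p.getKey(), p.getValue(), Integer::sum);
    }
    return sums;
  }

  // Map each split, combine per split, then reduce the combined partial sums.
  static Map<String, Integer> run(List<String> splits) {
    List<Map.Entry<String, Integer>> combined = new ArrayList<>();
    for (String split : splits) {
      combined.addAll(sumByKey(map(split)).entrySet());
    }
    return sumByKey(combined);
  }

  public static void main(String[] args) {
    // Two tiny made-up splits; prints {a=2, b=2, c=1}
    System.out.println(run(Arrays.asList("a b a", "b c")));
  }
}
```

This is why the listing can pass IntSumReducer to both setCombinerClass and setReducerClass: the reducer's input and output types match, and summing is insensitive to how the pairs are grouped beforehand.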

4. Prepare test data

micmiu-01.txt:

Hi Michael welcome to Hadoop 
more see micmiu.com

micmiu-02.txt:

Hi Michael welcome to BigData
more see micmiu.com

micmiu-03.txt:

Hi Michael welcome to Spark 
more see micmiu.com

Upload the three files whose names start with micmiu to HDFS:

micmiu-mbp:Downloads micmiu$ hdfs dfs -copyFromLocal micmiu-*.txt /user/micmiu/test/input
micmiu-mbp:Downloads micmiu$ hdfs dfs -ls /user/micmiu/test/input
Found 3 items
-rw-r--r--   1 micmiu supergroup         50 2014-04-15 14:53 /user/micmiu/test/input/micmiu-01.txt
-rw-r--r--   1 micmiu supergroup         50 2014-04-15 14:53 /user/micmiu/test/input/micmiu-02.txt
-rw-r--r--   1 micmiu supergroup         49 2014-04-15 14:53 /user/micmiu/test/input/micmiu-03.txt
micmiu-mbp:Downloads micmiu$

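5. Configure the run arguments

Choose Run As → Run Configurations…, then set the program arguments on the Arguments tab, for example the program's input arguments.

[Screenshot: eclipse-mr-05]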
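
The exact arguments are only visible in the screenshot, so the following is a hedged sketch of how WordCount.main interprets them: after GenericOptionsParser removes any generic Hadoop options, exactly two arguments must remain, the input path and the output path. The class WordCountArgsSketch and its describe method are hypothetical, and the output path /user/micmiu/test/output is an assumed example (only the input directory appears in this article).

```java
// Plain-Java model of the argument check in WordCount.main; no Hadoop classes are used.
class WordCountArgsSketch {

  // Returns the usage message unless exactly two arguments are given.
  static String describe(String[] args) {
    if (args.length != 2) {
      return "Usage: wordcount <in> <out>";
    }
    return "input=" + args[0] + ", output=" + args[1];
  }

  public static void main(String[] args) {
    // Input path is the directory used in the upload step;
    // the output path is a hypothetical example.
    String[] example = {"/user/micmiu/test/input", "/user/micmiu/test/output"};
    System.out.println(describe(example));
  }
}
```

Note that the output directory must not already exist when the job starts; FileOutputFormat rejects an existing output path, so delete it or pick a new one before rerunning.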

6. Run

Choose Run As → Run on Hadoop. When execution finishes you will see output like the following.

[Screenshot: eclipse-mr-06]
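
As a cross-check on the job result, the word counts for the three test files can be computed locally. This is a minimal plain-Java sketch, not Hadoop code: the class ExpectedWordCount and its methods are hypothetical, the file contents are copied from step 4 (including the trailing spaces implied by the 50/50/49 byte sizes), and the tab-separated, key-sorted layout assumes the default text output format with a single reducer.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Computes the word counts the job should produce for the three test files.
class ExpectedWordCount {

  // Contents of micmiu-01.txt, micmiu-02.txt, and micmiu-03.txt from step 4.
  static final String[] FILES = {
    "Hi Michael welcome to Hadoop \nmore see micmiu.com\n",
    "Hi Michael welcome to BigData\nmore see micmiu.com\n",
    "Hi Michael welcome to Spark \nmore see micmiu.com\n"
  };

  // Tokenize on whitespace (like StringTokenizer in the mapper) and count each token.
  static Map<String, Integer> count(String[] files) {
    Map<String, Integer> counts = new TreeMap<>();
    for (String content : files) {
      StringTokenizer itr = new StringTokenizer(content);
      while (itr.hasMoreTokens()) {
        counts.merge(itr.nextToken(), 1, Integer::sum);
      }
    }
    return counts;
  }

  // One "word<TAB>count" line per key, in sorted key order.
  static String format(Map<String, Integer> counts) {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, Integer> e : counts.entrySet()) {
      sb.append(e.getKey()).append('\t').append(e.getValue()).append('\n');
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.print(format(count(FILES)));
  }
}
```

Under those assumptions the sketch prints BigData 1, Hadoop 1, Hi 3, Michael 3, Spark 1, micmiu.com 3, more 3, see 3, to 3, welcome 3, one pair per line, which can be compared against the part file in the job's output directory.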

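This completes the demonstration of running an MR job from Eclipse against a local pseudo-distributed Hadoop 2.x installation.

P.S. Running the MR job against a cluster environment has kept failing, and I have not yet found the cause.

EOF @Michael Sun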

Original article: Developing Hadoop 2.x Map/Reduce Projects in Eclipse. Thanks to the original author for sharing.
