search
HomeDatabaseMysql Tutorialeclipse中开发Hadoop2.x的Map/Reduce项目

本文演示如何在Eclipse中开发一个Map/Reduce项目: 1、环境说明 Hadoop2.2.0 Eclipse?Juno SR2 Hadoop2.x-eclipse-plugin 插件的编译安装配置的过程参考:http://www.micmiu.com/bigdata/hadoop/hadoop2-x-eclipse-plugin-build-install/ 2、新建MR工程 依次

eclipse-mr-01本文演示如何在Eclipse中开发一个Map/Reduce项目: 1、环境说明
  • Hadoop2.2.0
  • Eclipse?Juno SR2
  • Hadoop2.x-eclipse-plugin 插件的编译安装配置的过程参考:http://www.micmiu.com/bigdata/hadoop/hadoop2-x-eclipse-plugin-build-install/
2、新建MR工程 依次点击 File →?New →?Ohter... ?选择 “Map/Reduce Project”,然后输入项目名称:micmiu_MRDemo,创建新项目: eclipse-mr-01 eclipse-mr-02 3、创建Mapper和Reducer 依次点击?File →?New →?Ohter... 选择Mapper,自动继承Mapper eclipse-mr-03 eclipse-mr-04 创建Reducer的过程同Mapper,具体的业务逻辑自己实现即可。 本文就以官方自带的WordCount为例进行测试:
package com.micmiu.mr;
/**
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
public class WordCount {
  public static class TokenizerMapper 
       extends Mapper<object text intwritable>{
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();
    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }
  public static class IntSumReducer 
       extends Reducer<text> {
    private IntWritable result = new IntWritable();
    public void reduce(Text key, Iterable<intwritable> values, 
                       Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
      System.err.println("Usage: wordcount <in> <out>");
      System.exit(2);
    }
    //conf.set("fs.defaultFS", "hdfs://192.168.6.77:9000");
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}</out></in></intwritable></text></object>
4、准备测试数据 micmiu-01.txt:
Hi Michael welcome to Hadoop 
more see micmiu.com
micmiu-02.txt:
Hi Michael welcome to BigData
more see micmiu.com
micmiu-03.txt:
Hi Michael welcome to Spark 
more see micmiu.com
把 micmiu 打头的三个文件上传到hdfs:
micmiu-mbp:Downloads micmiu$ hdfs dfs -copyFromLocal micmiu-*.txt /user/micmiu/test/input
micmiu-mbp:Downloads micmiu$ hdfs dfs -ls /user/micmiu/test/input
Found 3 items
-rw-r--r--   1 micmiu supergroup         50 2014-04-15 14:53 /user/micmiu/test/input/micmiu-01.txt
-rw-r--r--   1 micmiu supergroup         50 2014-04-15 14:53 /user/micmiu/test/input/micmiu-02.txt
-rw-r--r--   1 micmiu supergroup         49 2014-04-15 14:53 /user/micmiu/test/input/micmiu-03.txt
micmiu-mbp:Downloads micmiu$
5、配置运行参数 Run As →?Run Configurations… ,在Arguments中配置运行参数,例如程序的输入参数: eclipse-mr-05 6、运行 Run As -> Run on Hadoop ,执行完成后可以看到如下信息: eclipse-mr-06 到此Eclipse中调用Hadoop2x本地伪分布式模式执行MR演示成功。 ps:调用集群环境MR运行一直失败,暂时没有找到原因。 —————– ?EOF?@Michael Sun?—————–
Statement
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
MySQL's Role: Databases in Web ApplicationsMySQL's Role: Databases in Web ApplicationsApr 17, 2025 am 12:23 AM

The main role of MySQL in web applications is to store and manage data. 1.MySQL efficiently processes user information, product catalogs, transaction records and other data. 2. Through SQL query, developers can extract information from the database to generate dynamic content. 3.MySQL works based on the client-server model to ensure acceptable query speed.

MySQL: Building Your First DatabaseMySQL: Building Your First DatabaseApr 17, 2025 am 12:22 AM

The steps to build a MySQL database include: 1. Create a database and table, 2. Insert data, and 3. Conduct queries. First, use the CREATEDATABASE and CREATETABLE statements to create the database and table, then use the INSERTINTO statement to insert the data, and finally use the SELECT statement to query the data.

MySQL: A Beginner-Friendly Approach to Data StorageMySQL: A Beginner-Friendly Approach to Data StorageApr 17, 2025 am 12:21 AM

MySQL is suitable for beginners because it is easy to use and powerful. 1.MySQL is a relational database, and uses SQL for CRUD operations. 2. It is simple to install and requires the root user password to be configured. 3. Use INSERT, UPDATE, DELETE, and SELECT to perform data operations. 4. ORDERBY, WHERE and JOIN can be used for complex queries. 5. Debugging requires checking the syntax and use EXPLAIN to analyze the query. 6. Optimization suggestions include using indexes, choosing the right data type and good programming habits.

Is MySQL Beginner-Friendly? Assessing the Learning CurveIs MySQL Beginner-Friendly? Assessing the Learning CurveApr 17, 2025 am 12:19 AM

MySQL is suitable for beginners because: 1) easy to install and configure, 2) rich learning resources, 3) intuitive SQL syntax, 4) powerful tool support. Nevertheless, beginners need to overcome challenges such as database design, query optimization, security management, and data backup.

Is SQL a Programming Language? Clarifying the TerminologyIs SQL a Programming Language? Clarifying the TerminologyApr 17, 2025 am 12:17 AM

Yes,SQLisaprogramminglanguagespecializedfordatamanagement.1)It'sdeclarative,focusingonwhattoachieveratherthanhow.2)SQLisessentialforquerying,inserting,updating,anddeletingdatainrelationaldatabases.3)Whileuser-friendly,itrequiresoptimizationtoavoidper

Explain the ACID properties (Atomicity, Consistency, Isolation, Durability).Explain the ACID properties (Atomicity, Consistency, Isolation, Durability).Apr 16, 2025 am 12:20 AM

ACID attributes include atomicity, consistency, isolation and durability, and are the cornerstone of database design. 1. Atomicity ensures that the transaction is either completely successful or completely failed. 2. Consistency ensures that the database remains consistent before and after a transaction. 3. Isolation ensures that transactions do not interfere with each other. 4. Persistence ensures that data is permanently saved after transaction submission.

MySQL: Database Management System vs. Programming LanguageMySQL: Database Management System vs. Programming LanguageApr 16, 2025 am 12:19 AM

MySQL is not only a database management system (DBMS) but also closely related to programming languages. 1) As a DBMS, MySQL is used to store, organize and retrieve data, and optimizing indexes can improve query performance. 2) Combining SQL with programming languages, embedded in Python, using ORM tools such as SQLAlchemy can simplify operations. 3) Performance optimization includes indexing, querying, caching, library and table division and transaction management.

MySQL: Managing Data with SQL CommandsMySQL: Managing Data with SQL CommandsApr 16, 2025 am 12:19 AM

MySQL uses SQL commands to manage data. 1. Basic commands include SELECT, INSERT, UPDATE and DELETE. 2. Advanced usage involves JOIN, subquery and aggregate functions. 3. Common errors include syntax, logic and performance issues. 4. Optimization tips include using indexes, avoiding SELECT* and using LIMIT.

See all articles

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)
1 months agoBy尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. Best Graphic Settings
1 months agoBy尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. How to Fix Audio if You Can't Hear Anyone
1 months agoBy尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. Chat Commands and How to Use Them
1 months agoBy尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Safe Exam Browser

Safe Exam Browser

Safe Exam Browser is a secure browser environment for taking online exams securely. This software turns any computer into a secure workstation. It controls access to any utility and prevents students from using unauthorized resources.

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

MantisBT

MantisBT

Mantis is an easy-to-deploy web-based defect tracking tool designed to aid in product defect tracking. It requires PHP, MySQL and a web server. Check out our demo and hosting services.

VSCode Windows 64-bit Download

VSCode Windows 64-bit Download

A free and powerful IDE editor launched by Microsoft

WebStorm Mac version

WebStorm Mac version

Useful JavaScript development tools