Hadoop New Features, Improvements, Optimizations, and Bug Analysis Series 1: YARN-378

Jun 07, 2016, 04:30 PM
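Author: Dong | Sina Weibo: 西成懂 | Reposting is permitted, provided that a hyperlink identifies the original source, the author, and the copyright notice. URL: http://dongxicheng.org/mapreduce-nextgen/hadoop-jira-yarn-378/ Article collection for this blog: http://dongxicheng.org/recommend/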


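Big news: my new Hadoop book, 《Hadoop技术内幕:深入解析MapReduce架构设计与实现原理》 (Hadoop Internals: In-Depth Analysis of MapReduce Architecture Design and Implementation Principles), is now on sale at major online stores. Purchase links: Dangdang, JD.com, and Joyo. Official book homepage: http://hadoop123.com/.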

Hadoop JIRA link: https://issues.apache.org/jira/browse/YARN-378
Category (new feature, improvement, optimization, or bug): improvement
Fix version: 2.1.0-beta and later
Branch (Common, HDFS, YARN, or MapReduce): YARN
Modules involved: client, resourcemanager
JIRA title: "ApplicationMaster retry times should be set by Client"

1. Background

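In YARN, after a user submits an application to the ResourceManager, the ResourceManager first allocates resources to launch the application's ApplicationMaster. Once running, the ApplicationMaster handles the application's internal work: splitting it into tasks, monitoring them, and handling their failures. Because each application has exactly one ApplicationMaster, it is a single point of failure: if it dies, the whole application may fail. When the ResourceManager detects (through a heartbeat timeout) that an ApplicationMaster has failed, it tries to restart it on another node. A restarted ApplicationMaster usually restores its previous state, provided the previous attempt logged enough information to HDFS before it died. That recovery is the ApplicationMaster's own responsibility, and the ResourceManager does not interfere; all the ResourceManager does is notice the death, request resources again, and launch the ApplicationMaster on another node. The improvement discussed here concerns how to specify the number of ApplicationMaster retries for each application.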
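The heartbeat-timeout detection mentioned above can be illustrated with a small model. This is only a sketch of the general technique, not YARN's implementation: the class `LivenessMonitor`, its methods, the attempt ids, and the 10-second expiry are all hypothetical, and time is passed in explicitly so the example is deterministic.

```python
class LivenessMonitor:
    """Toy model of heartbeat-based failure detection (hypothetical, not YARN code)."""

    def __init__(self, expiry_seconds):
        self.expiry_seconds = expiry_seconds
        self.last_heartbeat = {}  # attempt id -> time of the most recent heartbeat

    def heartbeat(self, attempt_id, now):
        """Record that an ApplicationMaster attempt reported in at time `now`."""
        self.last_heartbeat[attempt_id] = now

    def expired(self, now):
        """Return the attempt ids whose last heartbeat is older than the expiry interval."""
        return sorted(
            attempt_id
            for attempt_id, seen in self.last_heartbeat.items()
            if now - seen > self.expiry_seconds
        )


monitor = LivenessMonitor(expiry_seconds=10)
monitor.heartbeat("app_1_attempt_1", now=0)
monitor.heartbeat("app_2_attempt_1", now=0)
monitor.heartbeat("app_2_attempt_1", now=8)  # app_2's AM keeps reporting
print(monitor.expired(now=12))               # ['app_1_attempt_1']
```

An attempt reported as expired is the point at which the ResourceManager would decide whether to launch a new attempt.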

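Before 2.1.0-beta, the ResourceManager alone determined the ApplicationMaster retry count for every application. Administrators set it with the configuration parameter yarn.resourcemanager.am.max-retries, which applied to all applications and could not be customized for an individual application. This improvement addresses exactly that limitation.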

2. Approach

First, to be clear, the goal of this improvement is to let users set the ApplicationMaster retry count for their own applications.

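Second, two components use this retry count: the ResourceManager and the ApplicationMaster. The ResourceManager uses it to decide whether to retry a failed ApplicationMaster. The ApplicationMaster uses it to decide whether to restore the state of the previous run (recovery applies from the second attempt onward), so that computation resumes from where it stopped.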
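The division of labor between the two components can be modeled with a minimal sketch. The function names `rm_should_retry`, `am_should_recover`, and `run_application` are hypothetical and do not come from the YARN code base; the sketch assumes attempts are numbered from 1 and that each attempt either succeeds or fails outright.

```python
def rm_should_retry(failed_attempts, max_app_attempts):
    """ResourceManager side: relaunch only while attempts remain."""
    return failed_attempts < max_app_attempts


def am_should_recover(attempt_number):
    """ApplicationMaster side: the first attempt starts fresh, later ones recover."""
    return attempt_number > 1


def run_application(max_app_attempts, failing_attempts):
    """Simulate the attempts of one application.

    `failing_attempts` is the set of attempt numbers that fail.
    Returns the final status and a log of (attempt number, start mode).
    """
    log = []
    attempt = 0
    while True:
        attempt += 1
        mode = "recover" if am_should_recover(attempt) else "fresh"
        log.append((attempt, mode))
        if attempt not in failing_attempts:
            return "SUCCEEDED", log
        if not rm_should_retry(attempt, max_app_attempts):
            return "FAILED", log


print(run_application(max_app_attempts=3, failing_attempts={1, 2}))
# ('SUCCEEDED', [(1, 'fresh'), (2, 'recover'), (3, 'recover')])
print(run_application(max_app_attempts=1, failing_attempts={1}))
# ('FAILED', [(1, 'fresh')])
```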

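An experienced developer might propose the following: put the user-supplied value into the Configuration and pass it to the ResourceManager and the ApplicationMaster through job.xml, which would require the smallest change. Unfortunately, only the ApplicationMaster reads the job.xml sent by the client; the ResourceManager does not.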

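The solution adopted in YARN 2.1.0-beta is as follows:

(1) After the client sets the retry count, the value is written to the new field maxAppAttempts of the Protocol Buffers object ApplicationSubmissionContextProto (defined in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto);

(2) When the client submits the application, the maxAppAttempts value is passed to the ResourceManager through an RPC call;

(3) The ResourceManager checks whether maxAppAttempts is 0. If it is, the ResourceManager substitutes its own global value, which is set by the property yarn.resourcemanager.am.max-attempts and defaults to 1;

(4) After allocating resources for the ApplicationMaster, the ResourceManager contacts the corresponding node to launch it. Before the launch, it passes the maxAppAttempts value to the ApplicationMaster through the environment variable "MAX_APP_ATTEMPTS";

(5) The ApplicationMaster reads the environment variable MAX_APP_ATTEMPTS in its main function and then starts running.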

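In this way, each application framework can expose its own parameter for the number of AM attempts. MapReduce, for example, uses mapreduce.am.max-attempts; once the user sets it, the value travels through the five steps above.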
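The five steps can be traced end to end in a simplified model. Everything below is an illustrative assumption rather than YARN code: `SubmissionContext` stands in for ApplicationSubmissionContextProto, `GLOBAL_AM_MAX_ATTEMPTS` stands in for yarn.resourcemanager.am.max-attempts, the RPC is reduced to a plain function call, and the launch environment is a plain dictionary.

```python
from dataclasses import dataclass

# Stand-in for yarn.resourcemanager.am.max-attempts (default 1, per the text above).
GLOBAL_AM_MAX_ATTEMPTS = 1


@dataclass
class SubmissionContext:
    """Hypothetical stand-in for the submission context sent by the client."""
    application_name: str
    max_app_attempts: int = 0  # 0 means the client did not set a value


def rm_resolve_max_attempts(ctx, global_max=GLOBAL_AM_MAX_ATTEMPTS):
    """Step 3: fall back to the global value when the client sent 0."""
    return global_max if ctx.max_app_attempts == 0 else ctx.max_app_attempts


def rm_build_am_environment(ctx):
    """Step 4: expose the resolved value to the AM as an environment variable."""
    return {"MAX_APP_ATTEMPTS": str(rm_resolve_max_attempts(ctx))}


def am_main(environment):
    """Step 5: the AM reads the value from its environment at startup."""
    return int(environment["MAX_APP_ATTEMPTS"])


# Steps 1 and 2: the client fills in the context and submits it.
custom = SubmissionContext("etl-job", max_app_attempts=4)
unset = SubmissionContext("adhoc-job")

print(am_main(rm_build_am_environment(custom)))  # 4
print(am_main(rm_build_am_environment(unset)))   # 1
```

Using 0 as the "not set" marker keeps the submission context backward compatible: a client that never fills in the field gets the administrator's global setting.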

3. What we learned

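(1) Make good use of environment variables to pass information; a parent process can pass them to its child processes;

(2) In YARN, code changes are usually chained: several components have to be modified one after another. In this example, the client, the ResourceManager, and the ApplicationMaster all had to change in turn. Before touching the code, plan the modification and estimate how large the change will be;

(3) When a new configurable ApplicationMaster parameter is needed, this JIRA can serve as a template. For example, suppose the ApplicationMaster were to support several fault-tolerance mechanisms (it currently does not), one of which tries to restart a dead ApplicationMaster on its original node whenever possible (usually needed when the ApplicationMaster runs a service). After such a change, users would have to specify which fault-tolerance mechanism their application uses.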
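Point (1) can be demonstrated with the Python standard library alone. The sketch below is not how YARN launches containers; it only shows a parent process handing a value to a child through the environment. The helper name `launch_child` is hypothetical, and the child is a one-line Python program that prints the variable it received.

```python
import os
import subprocess
import sys

# The child process: reads MAX_APP_ATTEMPTS from its environment and prints it.
CHILD_CODE = "import os; print(os.environ['MAX_APP_ATTEMPTS'])"


def launch_child(max_app_attempts):
    """Start a child process with MAX_APP_ATTEMPTS set and return what it prints."""
    env = dict(os.environ)  # copy the parent's environment, then extend it
    env["MAX_APP_ATTEMPTS"] = str(max_app_attempts)
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


print(launch_child(3))  # 3
```

Environment variables carry only strings, so the parent converts the number with `str` and the receiving side has to parse it back.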

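Original article. When reposting, please credit: reposted from Dong's blog

Link to this article: http://dongxicheng.org/mapreduce-nextgen/hadoop-jira-yarn-378/

Author: Dong. About the author: http://dongxicheng.org/about/

Article collection for this blog: http://dongxicheng.org/recommend/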

