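1. Resource isolation
1.1 Current state
a. Each queue is configured with 'Min Resources' and 'Max Resources'. While a queue is idle, other queues can take resources from it, cutting into its minimum resources, and a busy queue can exceed its maximum resources. If the idle queue then submits many jobs at once and resources run short, and the queues that took its resources do not release them within a certain period, their jobs are forcibly killed so that the resources are freed and returned to the formerly idle queue.
b. Setting mapreduce.job.queuename to a queue with more resources lets a user submit jobs across queues.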
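1.2 Solution
1.2.1 Prohibit cross-queue job submission, that is, neutralize the 'mapreduce.job.queuename' parameter.
1.2.2 Configuration steps
a. Modify the fair_scheduler.xml file and set the queue's ACL entries to:
dd001 --- here dd001 is the user dd001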
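Description: aclSubmitApps: the list of Linux users or groups allowed to submit applications to the queue. The default is "*", meaning any user may submit applications to the queue.
Note that this property is inherited: a sub-queue's list inherits the parent queue's list. When configuring it, separate users from each other (and groups from each other) with ",", and separate the user list from the group list with a single space, for example "user1,user2 group1,group2". Do not put a space after a comma, because the first space ends the user list.
aclAdministerApps: the list of administrators of the queue. A queue administrator can manage the queue's resources and applications, for example kill any application in it.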
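The ACL string format described above can be made concrete with a small parser. This is a minimal sketch, not Hadoop's own code: `QueueAcl`, `parse_acl`, and `may_submit` are hypothetical names introduced for illustration, and the inheritance rule is modelled simply as "access is granted if the queue or any ancestor grants it".

```python
from typing import FrozenSet, Iterable, NamedTuple


class QueueAcl(NamedTuple):
    allow_all: bool          # True when the ACL is "*"
    users: FrozenSet[str]
    groups: FrozenSet[str]


def _names(part: str) -> FrozenSet[str]:
    # Comma-separated names; empty entries are ignored.
    return frozenset(n.strip() for n in part.split(",") if n.strip())


def parse_acl(acl: str) -> QueueAcl:
    """Parse 'user1,user2 group1,group2' (users, one space, groups)."""
    if acl.strip() == "*":
        return QueueAcl(True, frozenset(), frozenset())
    # The first space separates the user list from the group list.
    user_part, _, group_part = acl.partition(" ")
    return QueueAcl(False, _names(user_part), _names(group_part))


def may_submit(acls_leaf_to_root: Iterable[QueueAcl],
               user: str, user_groups: Iterable[str]) -> bool:
    """Assumed inheritance model: any queue on the path to the root may grant access."""
    groups = frozenset(user_groups)
    return any(a.allow_all or user in a.users or groups & a.groups
               for a in acls_leaf_to_root)
```

For example, `may_submit([parse_acl("dd001"), parse_acl(" ")], "dd001", [])` checks user dd001 against a leaf queue restricted to dd001 under a parent that grants access to nobody. Note that for the leaf restriction to have any effect, the parent queues must not use the default "*".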
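2. Prohibit cross-queue job kills
2.1 Current state
a. yarn.admin.acl is set to '*', which means every user can kill other users' jobs.
2.2 Solution
2.2.1 Prohibit cross-queue job kills, so that apart from the superuser, users can kill only the jobs in their own queue.
2.2.2 Configuration steps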
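a. Add the following parameter to mapred-site.xml
mapreduce.cluster.acls.enabled true
b. Add the following parameters to yarn-site.xml
yarn.acl.enable true
yarn.admin.acl hadp
c. Add the following parameter to core-site.xml (this prevents access-permission problems in the front-end appcluser UI)
hadoop.http.staticuser.user hadp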
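The name/value pairs above go into each file as Hadoop `<property>` elements inside a `<configuration>` root. As a rough sketch, the helper below renders them in that form; `render_properties` and the `SETTINGS` dictionary are hypothetical names used only for illustration, and the output is meant to be merged into the existing files by hand rather than written over them.

```python
import xml.etree.ElementTree as ET

# The three groups of settings listed above, keyed by target file.
SETTINGS = {
    "mapred-site.xml": {"mapreduce.cluster.acls.enabled": "true"},
    "yarn-site.xml": {"yarn.acl.enable": "true", "yarn.admin.acl": "hadp"},
    "core-site.xml": {"hadoop.http.staticuser.user": "hadp"},
}


def render_properties(props: dict) -> str:
    """Render name/value pairs as a Hadoop <configuration> XML fragment."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    if hasattr(ET, "indent"):  # pretty-printing needs Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    for filename, props in SETTINGS.items():
        print(f"# {filename}")
        print(render_properties(props))
```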
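3. Storage isolation
3.1 Current state
a. Each user has write permission only under their own directory, but directory size has no upper limit. Some users may therefore write without bound while others are left with no space to write.
3.2 Solution
3.2.1 Configure a size limit on each user's directory according to business volume.
a. Attributes of a directory with no quota set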
[dd001@test_12123 ~]$ hdfs dfs -count -q hdfs://ns1/user/dd001/warehouse/test_lh
none inf none inf 1 0 0 hdfs://ns1/user/dd_edw/warehouse/test_lh
Columns, in order: file-count quota, remaining file-count quota, space quota, remaining space quota, directory count, file count, total size, file/directory name
b. Command to set a quota
[dd001@test_12123 ~]$ hdfs dfsadmin -setSpaceQuota 400 hdfs://ns1/user/dd001/warehouse/test_lh
c. Attributes after setting the quota
[dd001@test_12123 ~]$ hdfs dfs -count -q hdfs://ns1/user/dd001/warehouse/test_lh
none inf 400 400 1 0 0 hdfs://ns1/user/dd_edw/warehouse/test_lh
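For monitoring scripts it helps to turn a `hdfs dfs -count -q` output line into named fields. The following is a minimal sketch that assumes the eight-column layout shown above; `QuotaUsage` and `parse_count_q` are hypothetical names, and "none"/"inf" are mapped to `None` to mean that no quota is set.

```python
from typing import NamedTuple, Optional


class QuotaUsage(NamedTuple):
    name_quota: Optional[int]             # file-count quota
    name_quota_remaining: Optional[int]
    space_quota: Optional[int]            # bytes
    space_quota_remaining: Optional[int]  # bytes
    dir_count: int
    file_count: int
    content_size: int                     # bytes
    path: str


def _quota(field: str) -> Optional[int]:
    # "none" and "inf" appear when no quota is configured.
    return None if field in ("none", "inf") else int(field)


def parse_count_q(line: str) -> QuotaUsage:
    """Parse one output line of `hdfs dfs -count -q`."""
    fields = line.split(None, 7)  # the path is the last field and may contain spaces
    if len(fields) != 8:
        raise ValueError(f"expected 8 columns, got {len(fields)}: {line!r}")
    return QuotaUsage(
        _quota(fields[0]), _quota(fields[1]),
        _quota(fields[2]), _quota(fields[3]),
        int(fields[4]), int(fields[5]), int(fields[6]),
        fields[7],
    )
```

Applied to the two outputs above, `space_quota` is `None` before the quota is set and `400` afterwards.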
d. Test what happens when the directory exceeds its quota
[dd001@test_12123 ~]$ hdfs dfs -cp hdfs://ns1/user/dd001/warehouse/000026_0.lzo hdfs://ns1/user/dd001/warehouse/test_lh
14/10/04 17:54:14 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota of /user/dd_edw/warehouse/test_lh is exceeded: quota = 400 B = 400 B but diskspace consumed = 402653184 B = 384 MB
    at org.apache.hadoop.hdfs.server.namenode.INodeDirectoryWithQuota.verifyQuota(INodeDirectoryWithQuota.java:191)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyQuota(FSDirectory.java:2054)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCount(FSDirectory.java:1789)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCount(FSDirectory.java:1764)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addBlock(FSDirectory.java:357)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveAllocatedBlock(FSNamesystem.java:2847)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2508)
    at org.apache.hadoop.hd
The cp fails because the file is larger than the quota.
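The error reports 402653184 B consumed against a 400 B quota. That figure equals 3 x 128 MiB, which is consistent with HDFS charging a full block for every replica at the moment a block is allocated. The sketch below reproduces the arithmetic; the 128 MiB block size and replication factor 3 are assumptions inferred from the number in the log, not values stated in the original, and the function names are hypothetical.

```python
MIB = 1024 * 1024

# Assumed cluster settings, inferred from the 402653184 B figure in the log.
ASSUMED_BLOCK_SIZE = 128 * MIB
ASSUMED_REPLICATION = 3


def space_charged_for_new_block(block_size: int, replication: int) -> int:
    """Bytes counted against the space quota when one new block is allocated."""
    return block_size * replication


def block_fits_quota(space_quota: int, consumed: int,
                     block_size: int, replication: int) -> bool:
    """True if allocating one more block stays within the space quota."""
    charged = space_charged_for_new_block(block_size, replication)
    return consumed + charged <= space_quota


if __name__ == "__main__":
    charged = space_charged_for_new_block(ASSUMED_BLOCK_SIZE, ASSUMED_REPLICATION)
    print(charged, block_fits_quota(400, 0, ASSUMED_BLOCK_SIZE, ASSUMED_REPLICATION))
```

Under these assumptions, a space quota has to cover at least one full replicated block before even a small file can be written, so a quota of 400 bytes rejects every write.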
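e. Command to clear the quota
[dd001@test_12123 ~]$ hdfs dfsadmin -clrSpaceQuota hdfs://ns1/user/dd001/warehouse/test_lh
3.3 Monitoring
Raising a quota takes only one command. Limiting storage is a means, not the goal: the goal is to use resources more fully and to catch a directory before it exceeds its quota, rather than letting jobs fail. Good monitoring is therefore the first priority.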
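3.3.1 Resource allocation
Queue name | User machine count | Total machine quota (T) | Total machines allocated in cluster | Average quota (T) = total quota / total machines allocated in cluster | Disk reserve (T) | Actual quota (T) = (average quota - disk reserve) * machine count
dd001 | 20 | 21 | 20 | 20.9715 | 0.0488 | 418.454
a. Average quota = total quota / total machines allocated in the cluster.
Actual quota = (average quota - disk reserve) * machine count.
b. Alarm threshold = actual quota * 0.8.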
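The formulas above can be checked with a few lines of Python. This is a minimal sketch with hypothetical function names; the average quota (20.9715 T) and disk reserve (0.0488 T) are taken as listed in the dd001 row rather than recomputed from the other columns.

```python
def average_quota(total_quota_t: float, cluster_machines: int) -> float:
    """Average quota per machine (T) = total quota / machines allocated in the cluster."""
    return total_quota_t / cluster_machines


def actual_quota(avg_quota_t: float, disk_reserve_t: float, machines: int) -> float:
    """Actual quota (T) = (average quota - disk reserve) * machine count."""
    return (avg_quota_t - disk_reserve_t) * machines


def alarm_threshold(actual_quota_t: float, ratio: float = 0.8) -> float:
    """Usage level (T) at which an alarm should fire."""
    return actual_quota_t * ratio


if __name__ == "__main__":
    quota = actual_quota(20.9715, 0.0488, 20)   # values from the dd001 row
    print(round(quota, 3), round(alarm_threshold(quota), 4))
```

With the dd001 values this gives an actual quota of about 418.454 T, matching the table, and an alarm threshold of about 334.763 T.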
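3.3.2 Handling a disk alarm
a. Delete redundant data.
b. Add machines.
When adding machines, memory and CPU also need to be adjusted accordingly:
401 mb,19vcores 401 mb,19vcores
Adjust these two parameters to match. The commands for adjusting the quota are:
a. hdfs dfsadmin -clrSpaceQuota hdfs://ns1/user/dd001/warehouse/test_lh --- clear the existing quota
b. hdfs dfsadmin -setSpaceQuota 'actual quota' hdfs://ns1/user/dd001/warehouse/test_lh --- set the new quota.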
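c. How much quota to add, that is, how many machines to add
c.1 Average daily growth of directory storage usage = sum(daily growth) / count(1)
c.2 Machine count = (number of days the storage should last * average daily growth of directory storage usage) / (average quota - disk reserve)
c.3 Example:
Assume 'average daily growth of directory storage usage' = 0.5 T
Machine count = (90 * 0.5) / (18.4279 - 0.0488) ≈ 2.45, rounded up to 3 machines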
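The sizing rule in c.1 to c.3 can be sketched as follows. The function names are hypothetical, and rounding up to a whole machine is an assumption that matches the example's result of 3.

```python
import math
from typing import Sequence


def average_daily_growth(daily_growth_t: Sequence[float]) -> float:
    """c.1: mean of the observed daily growth values (T per day)."""
    return sum(daily_growth_t) / len(daily_growth_t)


def machines_needed(days: int, growth_t_per_day: float,
                    avg_quota_t: float, disk_reserve_t: float) -> int:
    """c.2: machines required so the storage lasts `days`, rounded up."""
    usable_per_machine = avg_quota_t - disk_reserve_t
    return math.ceil(days * growth_t_per_day / usable_per_machine)


if __name__ == "__main__":
    # c.3: 90 days at 0.5 T/day with 18.4279 T average quota and 0.0488 T reserve.
    print(machines_needed(90, 0.5, 18.4279, 0.0488))
```

For the example in c.3 the unrounded value is 45 / 18.3791, about 2.45, so the function returns 3.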
Original article: HADOOP资源/存储隔离 (Hadoop resource/storage isolation). Thanks to the original author for sharing.
