Big data operation and maintenance

1. HDFS distributed file system operation and maintenance

1. Recursively create the directory "1daoyun/file" under the root directory of the HDFS file system, upload the attached BigDataSkills.txt file to the 1daoyun/file directory, and use the relevant command to list the files in that directory.

hadoop fs -mkdir -p /1daoyun/file

hadoop fs -put BigDataSkills.txt /1daoyun/file

hadoop fs -ls /1daoyun/file

2. Recursively create the directory "1daoyun/file" under the root directory of the HDFS file system, upload the attached BigDataSkills.txt file to the 1daoyun/file directory, and use the HDFS file system check tool to check whether the file is damaged.

hadoop fs -mkdir -p /1daoyun/file

hadoop fs -put BigDataSkills.txt /1daoyun/file

hadoop fsck /1daoyun/file/BigDataSkills.txt

3. Recursively create the directory "1daoyun/file" under the root directory of the HDFS file system and upload the attached BigDataSkills.txt file to the 1daoyun/file directory, setting the replication factor of BigDataSkills.txt to 2 during the upload. Then use the fsck tool to check the number of replicas of its storage blocks.

hadoop fs -mkdir -p /1daoyun/file

hadoop fs -D dfs.replication=2 -put BigDataSkills.txt /1daoyun/file

hadoop fsck /1daoyun/file/BigDataSkills.txt

4. The root directory of the HDFS file system contains a directory /apps. Enable snapshot creation for this directory, create a snapshot of it named apps_1daoyun, and use the relevant command to list the snapshot files.

hdfs dfsadmin -allowSnapshot /apps

hadoop fs -createSnapshot /apps apps_1daoyun

hadoop fs -ls /apps/.snapshot

5. When a Hadoop cluster starts, it first enters safe mode, which it leaves automatically after 30 seconds by default. While the system is in safe mode, the HDFS file system is read-only: files cannot be written, modified, or deleted. Suppose the Hadoop cluster needs maintenance: put the cluster into safe mode and check its status.

hdfs dfsadmin -safemode enter

hdfs dfsadmin -safemode get
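
For scripted maintenance it can help to reduce the status line to a simple on/off value. The sketch below assumes the status text contains "Safe mode is ON" or "Safe mode is OFF", as Hadoop 2.x prints it; safemode_state is a hypothetical helper name, not part of Hadoop.

```shell
# Hypothetical helper: map the text printed by `hdfs dfsadmin -safemode get`
# to "on", "off", or "unknown". Assumes the Hadoop 2.x wording of that line.
safemode_state() {
  case "$1" in
    *"Safe mode is ON"*)  echo on ;;
    *"Safe mode is OFF"*) echo off ;;
    *)                    echo unknown ;;
  esac
}

# On a cluster node:
#   safemode_state "$(hdfs dfsadmin -safemode get)"
#   hdfs dfsadmin -safemode leave   # exit safe mode once maintenance is done
```

Matching on a substring rather than the whole line keeps the helper tolerant of extra text such as a host name appended to the status.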

6. To protect operators against accidental deletion, the HDFS file system provides a recycle bin (trash) function, but too many junk files take up a lot of storage space. In the web interface of the Xiandian big data platform, set the interval after which files in the HDFS recycle bin are permanently deleted to 7 days.

Advanced core-site: fs.trash.interval = 10080
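
The value 10080 follows from fs.trash.interval being expressed in minutes. A minimal sketch of the conversion, where days_to_minutes is a hypothetical helper name:

```shell
# Hypothetical helper: convert a retention period in days to the
# number of minutes expected by fs.trash.interval.
days_to_minutes() {
  echo $(( $1 * 24 * 60 ))
}

days_to_minutes 7   # prints 10080
```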

7. To protect operators against accidental deletion, the HDFS file system provides a recycle bin (trash) function, but too many junk files take up a lot of storage space. In the Linux shell, use the "vi" command to modify the corresponding configuration file and parameter so that the recycle bin function is turned off, then restart the corresponding service.

Advanced core-site: fs.trash.interval = 0

vi /etc/hadoop/2.4.3.0-227/0/core-site.xml

<property>
<name>fs.trash.interval</name>
<value>0</value>
</property>

sbin/stop-dfs.sh

sbin/start-dfs.sh

8. Hosts in a Hadoop cluster may go down or suffer system damage under certain circumstances. When that happens, data files in the HDFS file system will inevitably be damaged or lost. To ensure the reliability of the HDFS file system, change the cluster's redundancy replication factor to 5 in the web interface of the Xiandian big data platform.

General: Block replication = 5

9. Hosts in a Hadoop cluster may go down or suffer system damage under certain circumstances. When that happens, data files in the HDFS file system will inevitably be damaged or lost. To ensure the reliability of the HDFS file system, change the cluster's redundancy replication factor to 5 by using the "vi" command in the Linux shell to modify the corresponding configuration file and parameter, then restart the corresponding service.

vi /etc/hadoop/2.4.3.0-227/0/hdfs-site.xml

<property>
<name>dfs.replication</name>
<value>5</value>
</property>

/usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf stop {namenode/datanode}

/usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start {namenode/datanode}
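
After editing a site file by hand, it can be worth reading the value back before restarting services. The following is a rough sketch only: get_property is a hypothetical helper, and it assumes each property is written as a `<name>` element directly followed by its `<value>` element, which is the usual layout of Hadoop *-site.xml files. It is a text match, not a real XML parser.

```shell
# Hypothetical helper: print the <value> of one property from a Hadoop
# *-site.xml file. Assumes <name>...</name> is directly followed by
# <value>...</value>. Dots in the property name are treated as regex
# wildcards, which is good enough for a quick check.
get_property() {
  file=$1
  name=$2
  # Join the file into one line so name and value can sit on separate lines.
  tr -d '\n' < "$file" |
    sed -n "s|.*<name>$name</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p"
}

# On a cluster node:
#   get_property /etc/hadoop/2.4.3.0-227/0/hdfs-site.xml dfs.replication
```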

10. Use a command to view the number of directories, the number of files, and the total size of the files under the /tmp directory of the HDFS file system.
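
The `hadoop fs -count` command that answers this task prints one line per path with four unlabeled columns: DIR_COUNT, FILE_COUNT, CONTENT_SIZE (in bytes), and PATHNAME. A small sketch that labels them, where label_count is a hypothetical helper name:

```shell
# Hypothetical helper: label the columns of `hadoop fs -count` output,
# which are DIR_COUNT, FILE_COUNT, CONTENT_SIZE (bytes), and PATHNAME.
label_count() {
  awk '{ printf "dirs=%s files=%s bytes=%s path=%s\n", $1, $2, $3, $4 }'
}

# On a cluster node:
#   hadoop fs -count /tmp | label_count
```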

hadoop fs -count /tmp

2. MapReduce case questions

1. On the cluster node, the /usr/hdp/2.4.3.0-227/hadoop-mapreduce/ directory contains the example JAR package hadoop-mapreduce-examples.jar. Run the pi program in the JAR package to calculate an approximate value of π, using 5 Map tasks with 5 throws (samples) per Map task.

cd /usr/hdp/2.4.3.0-227/hadoop-mapreduce/

hadoop jar hadoop-mapreduce-examples-2.7.1.2.4.3.0-227.jar pi 5 5

2. On the cluster node, the /usr/hdp/2.4.3.0-227/hadoop-mapreduce/ directory contains the example JAR package hadoop-mapreduce-examples.jar. Run the wordcount program in the JAR package to count the words in the /1daoyun/file/BigDataSkills.txt file, write the results to the /1daoyun/output directory, and use the relevant command to query the word count results.

hadoop jar /usr/hdp/2.4.3.0-227/hadoop-mapreduce/hadoop-mapreduce-examples-2.7.1.2.4.3.0-227.jar wordcount /1daoyun/file/BigDataSkills.txt /1daoyun/output

hadoop fs -cat /1daoyun/output/part-r-00000
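
To sanity-check the job on a small file, the same count can be approximated locally. This sketch assumes the wordcount example splits its input on whitespace and writes one "word<TAB>count" line per word in byte order; local_wordcount is a hypothetical helper name and is not part of Hadoop.

```shell
# Hypothetical helper: count whitespace-separated words from standard input
# and print "word<TAB>count" lines sorted in byte order.
local_wordcount() {
  tr -s '[:space:]' '\n' |      # one word per line
    grep -v '^$' |              # drop the empty line left by leading whitespace
    LC_ALL=C sort |             # byte-order sort so equal words are adjacent
    uniq -c |                   # prefix each distinct word with its count
    awk '{ print $2 "\t" $1 }'  # reorder to word<TAB>count
}

# On a cluster node:
#   hadoop fs -cat /1daoyun/file/BigDataSkills.txt | local_wordcount
```

Comparing this output with the job's result file is a quick way to spot a wrong input path or a stale output directory.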

3. On the cluster node, the /usr/hdp/2.4.3.0-227/hadoop-mapreduce/ directory contains the example JAR package hadoop-mapreduce-examples.jar. Run the sudoku program in the JAR package to solve the Sudoku puzzle given in the task's table.

cat puzzle1.dta

hadoop jar hadoop-mapreduce-examples-2.7.1.2.4.3.0-227.jar sudoku /root/puzzle1.dta

4. On the cluster node, the /usr/hdp/2.4.3.0-227/hadoop-mapreduce/ directory contains the example JAR package hadoop-mapreduce-examples.jar. Run the grep program in the JAR package to count the number of times "Hadoop" appears in the /1daoyun/file/BigDataSkills.txt file in the file system, then query the result after the job completes.

hadoop jar hadoop-mapreduce-examples-2.7.1.2.4.3.0-227.jar grep /1daoyun/file/BigDataSkills.txt /output hadoop

hadoop fs -cat /output/part-r-00000
