What to learn about Java big data
May 27, 2019, 02:30 PM

Java big data learning process.


The first stage: static web page basics (HTML and CSS)

1. Difficulty level: one star

2. Covers: technical knowledge points, stage project tasks, and comprehensive ability

3. Main technologies include:

Common HTML tags, common CSS layouts, styles, positioning, and static page design and production methods, etc.

The second stage: JavaSE and JavaWeb

1. Difficulty level: two stars

2. Covers: technical knowledge points, stage project tasks, and comprehensive ability

3. Main technologies include:

Java basic syntax, Java object-oriented programming (classes, objects, encapsulation, inheritance, polymorphism, abstract classes, interfaces, common classes, inner classes, common modifiers, etc.), exceptions, collections, files, IO, MySQL (basic SQL statement operations, multi-table queries, subqueries, stored procedures, transactions, distributed transactions), JDBC, threads, reflection, socket programming, enumerations, generics, design patterns

4. The description is as follows:

This is what is usually called the Java fundamentals stage: technical points are taught from shallow to deep, real business project modules are analyzed, and multiple storage approaches are designed and implemented. It is the most important of the first four stages, because every later stage builds on it, and it is also the stage with the highest learning density on the path to big data. It is also the first time learners develop and deliver a real project with both a front end and a back end as a team (a comprehensive application of the technologies from the first and second stages).
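To make the object-oriented part of this stage concrete, here is a minimal, self-contained Java sketch covering encapsulation, inheritance, polymorphism, abstract classes, and interfaces; the Shape, Circle, and Rectangle classes are illustrative only and not part of any particular course project.

```java
// Interface: a contract that concrete shapes must fulfil.
interface Shape {
    double area();
}

// Abstract class: shares the encapsulated name field with subclasses.
abstract class AbstractShape implements Shape {
    private final String name;              // encapsulation: private field

    protected AbstractShape(String name) {
        this.name = name;
    }

    public String getName() {               // controlled access via a getter
        return name;
    }
}

class Circle extends AbstractShape {
    private final double radius;

    Circle(double radius) {
        super("circle");
        this.radius = radius;
    }

    @Override
    public double area() {
        return Math.PI * radius * radius;
    }
}

class Rectangle extends AbstractShape {
    private final double width, height;

    Rectangle(double width, double height) {
        super("rectangle");
        this.width = width;
        this.height = height;
    }

    @Override
    public double area() {
        return width * height;
    }
}

public class ShapeDemo {
    public static void main(String[] args) {
        // Polymorphism: both objects are handled through the Shape interface.
        Shape[] shapes = { new Circle(2.0), new Rectangle(3.0, 4.0) };
        for (Shape s : shapes) {
            System.out.printf("%s area = %.2f%n",
                    ((AbstractShape) s).getName(), s.area());
        }
    }
}
```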

The third stage: front-end framework

1. Difficulty level: two stars

2. Covers: technical knowledge points, stage project tasks, and comprehensive ability

3. Main technologies include:

Combined use of Java, jQuery, and annotations with reflection, XML and XML parsing with dom4j and JAXB, JDK 8 new features, SVN, Maven, EasyUI

4. The description is as follows:

Building on the first two stages, we can turn static pages into dynamic ones and make the content of our web pages much richer. Of course, viewed from the market's perspective there are professional front-end designers for this; our goal in designing this stage is that front-end technology should more directly exercise learners' thinking and design abilities. At the same time, we integrate the advanced features from the second stage into this stage, taking learners to the next level.
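For the JDK 8 new features listed above, the following is a small, self-contained sketch of lambdas, method references, and the Stream API; the list of technology names is made-up sample data.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class Jdk8FeaturesDemo {
    public static void main(String[] args) {
        List<String> tags = Arrays.asList("java", "hadoop", "hive", "hbase", "kafka", "spark");

        // Lambda + Stream API: filter, map and collect in one pipeline.
        List<String> shortTags = tags.stream()
                .filter(t -> t.length() <= 5)
                .map(String::toUpperCase)          // method reference
                .collect(Collectors.toList());
        System.out.println(shortTags);

        // Grouping with a downstream collector: count tags by length.
        Map<Integer, Long> byLength = tags.stream()
                .collect(Collectors.groupingBy(String::length, Collectors.counting()));
        System.out.println(byLength);
    }
}
```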

The fourth stage: enterprise-level development framework

1. Difficulty level: three stars

2. Covers: technical knowledge points, stage project tasks, and comprehensive ability

3. Main technologies include:

Hibernate, Spring, Spring MVC, log4j and slf4j integration, MyBatis, Struts2, Shiro, Redis, the Activiti process engine, the Nutch crawler, Lucene, web services with CXF, Tomcat clustering and hot standby, MySQL read/write splitting
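As a small taste of the framework work in this stage, here is a minimal sketch of Spring's Java-based configuration with @Configuration/@Bean and constructor injection; UserRepository and UserService are hypothetical stand-ins for a real DAO and service layer, which in practice would delegate to Hibernate or MyBatis rather than return a hard-coded value.

```java
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Hypothetical DAO; a real one would delegate to Hibernate or MyBatis.
class UserRepository {
    String findNameById(long id) {
        return "user-" + id;
    }
}

// Service layer that receives its dependency through the constructor.
class UserService {
    private final UserRepository repository;

    UserService(UserRepository repository) {
        this.repository = repository;
    }

    String greet(long id) {
        return "Hello, " + repository.findNameById(id);
    }
}

@Configuration
public class SpringIocDemo {

    @Bean
    UserRepository userRepository() {
        return new UserRepository();
    }

    // Spring resolves the UserRepository parameter from the container.
    @Bean
    UserService userService(UserRepository userRepository) {
        return new UserService(userRepository);
    }

    public static void main(String[] args) {
        try (AnnotationConfigApplicationContext ctx =
                     new AnnotationConfigApplicationContext(SpringIocDemo.class)) {
            System.out.println(ctx.getBean(UserService.class).greet(42));
        }
    }
}
```

Spring MVC, MyBatis, Shiro, and the rest of the stack plug into this same container in a similar way.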

The fifth stage: First introduction to big data

1. Difficulty level: three stars

2. Covers: technical knowledge points, stage project tasks, and comprehensive ability

3. Main technologies include:

Big data part 1 (what big data is, application scenarios, how to approach learning big data, virtual machine concepts and installation, etc.), common Linux commands (file management, system management, disk management), Linux shell programming (shell variables, loop control, applications), getting started with Hadoop (Hadoop components, stand-alone environment, directory structure, the HDFS interface, the MR interface, simple shell scripts, accessing Hadoop from Java), HDFS (introduction, shell, use of the IDEA development tool, fully distributed cluster construction), MapReduce applications (the intermediate computation process, operating MapReduce from Java, running programs, log monitoring), advanced Hadoop applications (introduction to the YARN framework, configuration items and optimization, introduction to CDH, environment construction), extensions (map-side optimization, how to use a combiner, TOP K, Sqoop export, VM snapshots, permission management commands, and the AWK and SED commands)

4. The description is as follows:

This stage is designed to give newcomers a broad picture of big data. How is big data handled? After studying Java in the prerequisite courses, you understand how a program runs on a single machine. What about big data? Big data is processed by programs running on clusters of many machines, and since big data is about processing data, data storage likewise moves from single-machine storage to large-scale storage across a cluster. (You ask what a cluster is? Well, suppose I have a big pot of rice. I could finish it by myself, but it would take a long time, so I ask everyone to eat together. One person is a person; many people are a crowd.) Big data can therefore be roughly divided into big data storage and big data processing. So at this stage our course covers the de facto standard for big data: Hadoop. And big data does not run on the Windows 7 or Windows 10 we use every day, but on the most widely used system for the job: Linux.
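To make "operating MapReduce from Java" concrete, here is the classic WordCount job written against the Hadoop MapReduce Java API, a minimal sketch that assumes a working Hadoop environment with the input and output HDFS paths passed as command-line arguments.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: split each input line into words and emit (word, 1).
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reducer: sum all counts received for each word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);   // map-side combiner, as mentioned in the extensions above
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Packaged into a jar, it would typically be launched with something like `hadoop jar wordcount.jar WordCount /input /output`.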

The sixth stage: big data database

1. Difficulty level: four stars

2. Covers: technical knowledge points, stage project tasks, and comprehensive ability

3. Main technologies include: getting started with Hive (introduction to Hive, Hive usage scenarios, environment construction, architecture description, working mechanism), Hive shell programming (table creation, query statements, partitioning and bucketing, index management and views), advanced Hive applications (DISTINCT implementation, GROUP BY, JOIN, SQL conversion principles, Java programming, configuration and optimization), introduction to HBase, HBase shell programming (DDL, DML, and table creation, queries, compression, and filters from Java), detailed description of HBase modules (Region, HRegionServer, HMaster, introduction to ZooKeeper, ZooKeeper configuration, HBase and ZooKeeper integration), advanced HBase features (read and write flows, data model, schema design, read/write hotspots, optimization and configuration)

4. The description is as follows:

This stage is designed to help everyone understand how big data handles large-scale data: it simplifies our programming work and increases read speed.

How is it simplified? In the previous stage, writing MR programs by hand for complex business joins and data mining is very laborious, so at this stage we introduce Hive, the data warehouse of the big data world. The keyword here is data warehouse. I know you are going to ask, so let me say up front that a data warehouse is used for data mining and analysis, and it is usually a very large data center. The data is stored in large databases such as Oracle and DB2, which are typically used for real-time online business. In short, analysis based on a data warehouse is relatively slow, but the convenience is that as long as you are familiar with SQL it is relatively easy to learn, and Hive is exactly such a tool: a SQL query tool on top of big data. This stage also includes HBase, a database in the big data world. Confused? Haven't we just learned about a data "warehouse" called Hive? Hive is based on MR, so its queries are quite slow; HBase, built on big data storage, can serve real-time queries. One is for analysis, the other for queries.
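A minimal sketch of the HBase Java client API mentioned above, doing a put and a point get by row key; it assumes a reachable cluster, and the user_profile table, the info column family, and the localhost ZooKeeper quorum are placeholders to replace with your own values.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseClientDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Placeholder ZooKeeper quorum address; adjust for your cluster.
        conf.set("hbase.zookeeper.quorum", "localhost");

        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("user_profile"))) {

            // Write one row: row key "u001", column family "info", column "name".
            Put put = new Put(Bytes.toBytes("u001"));
            put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Alice"));
            table.put(put);

            // Read it back by row key; this kind of point lookup is what HBase is designed for.
            Result result = table.get(new Get(Bytes.toBytes("u001")));
            byte[] name = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"));
            System.out.println("name = " + Bytes.toString(name));
        }
    }
}
```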

The seventh stage: real-time data collection

1. Difficulty level: four stars

2. Covers: technical knowledge points, stage project tasks, and comprehensive ability

3. Main technologies include:

Flume log collection, introduction to Kafka (message queues, application scenarios, cluster construction), Kafka in detail (partitions, topics, consumers and producers, integration with ZooKeeper, shell development, shell debugging), advanced use of Kafka (Java development, main configuration items, optimization projects), data visualization (introduction to graphics and charts, categories of charting tools, bar charts and pie charts, 3D charts and maps), introduction to Storm (design ideas, application scenarios, processing flow, cluster installation), Storm development (Storm Maven projects, writing local Storm programs), advanced Storm (Java development, main configuration items, optimization projects), the timeliness of Kafka asynchronous and batched sending, globally ordered Kafka messages, and Storm multi-concurrency optimization

4. The description is as follows:

The data sources in the previous stages were existing large-scale data sets, so the results of processing and analysis arrive with a certain delay; the data being processed is usually the previous day's data. Example scenarios: website anti-hotlinking, customer account anomaly detection, and real-time credit reporting. If these scenarios were analyzed against the previous day's data, would that not be too late? So in this stage we introduce real-time data collection and analysis, mainly: Flume for real-time data collection with a wide range of supported sources, Kafka for data reception and transmission, and Storm for real-time data processing, achieving second-level latency.
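A minimal sketch of the Kafka Java producer mentioned above, sending asynchronously with a callback; the localhost:9092 broker address and the click-events topic are placeholders.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class KafkaProducerDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder broker address; adjust for your cluster.
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        // acks=all waits for all in-sync replicas, trading latency for durability.
        props.put("acks", "all");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 10; i++) {
                ProducerRecord<String, String> record =
                        new ProducerRecord<>("click-events", "user-" + i, "page_view");
                // send() is asynchronous; the callback reports the partition/offset or an error.
                producer.send(record, (metadata, exception) -> {
                    if (exception != null) {
                        exception.printStackTrace();
                    } else {
                        System.out.printf("sent to partition %d, offset %d%n",
                                metadata.partition(), metadata.offset());
                    }
                });
            }
            producer.flush();
        }
    }
}
```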

The eighth stage: Spark data analysis

1. Difficulty level: five stars

2. Covers: technical knowledge points, stage project tasks, and comprehensive ability

3. Main technologies include: introduction to Scala (data types, operators, control statements, basic functions), intermediate Scala (data structures, classes, objects, traits, pattern matching, regular expressions), advanced Scala usage (higher-order functions, curried functions, partial functions, tail recursion, built-in higher-order functions, etc.), introduction to Spark (environment construction, infrastructure, running modes), Spark data sets and programming model, Spark SQL, intermediate Spark (DataFrame, Dataset, Spark Streaming principles, Spark Streaming supported sources, integration with Kafka and sockets, programming model), advanced Spark programming (Spark GraphX, Spark MLlib machine learning), advanced Spark applications (system architecture, main configuration and performance optimization, fault and stage recovery), the Spark ML K-means algorithm, and advanced Scala implicit conversion features

4. The description is as follows:

Let's also recap the earlier stages, mainly the Hadoop stage. Hadoop is relatively slow at analyzing large-scale data sets with MR, including for machine learning and artificial intelligence, and it is not well suited to iterative computation. Spark is positioned as a replacement for MR-based analysis. How does it replace it? Consider their execution models: Hadoop analysis is based on disk storage, while Spark analysis is based on memory. That may not mean much in the abstract, so more vividly: if you are travelling by rail from Beijing to Shanghai, MR is the slow green train and Spark is the high-speed rail or maglev. Spark is developed in the Scala language, which it naturally supports best, so the course teaches the Scala language first. What, another language to learn? No, no, no! Let me just say one thing: Scala is built on top of Java. From historical data storage and analysis (Hadoop, Hive, HBase) to real-time data collection (Flume, Kafka) and analysis (Storm, Spark), all of these depend on each other in real projects.
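Although this stage teaches Scala first, Spark also exposes a Java API; to stay consistent with the rest of this article, here is a minimal word-count sketch using it, assuming a local master and a plain-text file at the made-up path input.txt.

```java
import java.util.Arrays;

import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.sql.SparkSession;
import scala.Tuple2;

public class SparkWordCount {
    public static void main(String[] args) {
        // local[*] runs Spark in-process; on a real cluster the master usually comes from spark-submit.
        SparkSession spark = SparkSession.builder()
                .appName("word-count")
                .master("local[*]")
                .getOrCreate();

        // Placeholder input path; any plain text file works.
        JavaRDD<String> lines = spark.read().textFile("input.txt").javaRDD();

        JavaPairRDD<String, Integer> counts = lines
                .flatMap(line -> Arrays.asList(line.split("\\s+")).iterator())  // split lines into words
                .mapToPair(word -> new Tuple2<>(word, 1))                       // emit (word, 1)
                .reduceByKey(Integer::sum);                                     // sum the counts per word

        counts.collect().forEach(t -> System.out.println(t._1() + " : " + t._2()));
        spark.stop();
    }
}
```

The same pipeline sits in memory between stages, which is exactly the disk-versus-memory difference described above.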

The above is the detailed content of What to learn about Java big data. For more information, please follow other related articles on the PHP Chinese website!
