What is the difference and connection between big data and massive data?-Common Problem-php.cn

Jul 15, 2020 pm 06:02 PM

"Big data" includes the meaning of "massive data" and goes beyond it in content. In short, "big data" is "massive data" plus complex types of data. Big data includes all data sets, including transactional and interactive data sets, whose size or complexity exceeds the ability of commonly used technologies to capture, manage and process them within reasonable cost and time limits.

If the problem were only massive structured data, the solution would be relatively simple: users could resolve it by purchasing more storage devices, improving the efficiency of their storage devices, and so on. However, once people realize that the data in a database falls into three types (structured, unstructured and semi-structured) and that other complications arise, the problem is no longer that simple.
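To make the three data types concrete, here is a minimal Python sketch. The sample values (order IDs, user name, review text) are invented purely for illustration and do not come from the article.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields (hypothetical orders).
structured = io.StringIO("order_id,amount\n1001,25.50\n1002,99.00\n")
rows = list(csv.DictReader(structured))

# Semi-structured: self-describing, but fields may vary from record to record.
semi_structured = json.loads('{"user": "alice", "tags": ["sale", "mobile"]}')

# Unstructured: free text with no predefined schema.
unstructured = "Loved the new phone, but delivery took a whole week!"

# Each type needs a different kind of handling before it can be analyzed.
total_amount = sum(float(r["amount"]) for r in rows)  # direct column access
tag_count = len(semi_structured["tags"])              # navigate by key
word_count = len(unstructured.split())                # only crude text statistics
```

Structured rows can be queried by column straight away, while the free-text review needs further processing before it yields anything beyond simple counts, which is part of why mixed data is harder than merely large data.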

Big data is coming

When complex types of data come flooding in, the impact on users' IT systems has to be handled in a different way. Many industry experts and third-party research firms have concluded from market research data that the era of big data is coming. One survey found that 85% of this complex data is unstructured data, found widely in social networks, the Internet of Things, e-commerce and elsewhere. The generation of this unstructured data often goes hand in hand with the continuous emergence and adoption of new channels and technologies such as social networks, mobile computing and sensors.

There is also a lot of hype and uncertainty surrounding the concept of big data today. For this reason, the editor asked a number of industry experts to explain what big data is and is not, and how to deal with it. The results will be presented to readers as a series of articles.

Some people also refer to multi-terabyte data sets as "big data". According to statistics from market research company IDC, data usage is expected to grow 44-fold, with global data usage reaching approximately 35.2 ZB (1 ZB = 1 billion TB). The file size of individual data sets will also increase, so greater processing power will be needed to analyze and understand them.

EMC has said that more than 1,000 of its customers store more than 1 petabyte of data in its arrays, and that this number will grow to 100,000 by 2020. Within a year or two, some customers will begin using a thousand times more data: 1 exabyte (1 exabyte = 1 billion GB) or more.
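The unit relationships quoted above can be checked with a few lines of Python. This is a small sketch that assumes decimal (SI) units, where each step up is a factor of 1,000; the helper name `convert` is made up for this example.

```python
# Decimal (SI) storage units expressed in bytes (assumption: powers of 1,000).
UNITS = {"GB": 10**9, "TB": 10**12, "PB": 10**15, "EB": 10**18, "ZB": 10**21}

def convert(value, src, dst):
    """Convert a quantity from one decimal storage unit to another."""
    return value * UNITS[src] / UNITS[dst]

zb_in_tb = convert(1, "ZB", "TB")        # 1 ZB expressed in TB
eb_in_gb = convert(1, "EB", "GB")        # 1 EB expressed in GB
eb_in_pb = convert(1, "EB", "PB")        # 1 EB expressed in PB
forecast_tb = convert(35.2, "ZB", "TB")  # the IDC figure expressed in TB
```

Under these assumptions, 1 ZB and 1 EB each come out as one billion of the smaller unit, and a move from 1 PB to 1 EB is a thousandfold increase.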

For large enterprises, the rise of big data is partly because computing power is now available at lower cost and systems are capable of multitasking. In addition, the cost of memory is plummeting, so companies can process more data in memory than ever before, and it is becoming ever easier to aggregate computers into server clusters. IDC believes that the combination of these three factors gave rise to big data. IDC also stated that for a technology to count as a big data technology, it must first be affordable and, second, meet two of the three "V" criteria described by IBM: variety, volume and velocity.

The difference between big data and massive data

Variety means that the data should include both structured and unstructured data.

Volume means that the amount of data aggregated for analysis must be very large.

Velocity means that data processing must be very fast.

Big data does not always mean hundreds of TB. Depending on actual usage, sometimes hundreds of GB of data can also be called big data. This mainly depends on the third dimension, that is, velocity, or the time dimension.
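IDC's rule as described above (affordable, plus at least two of the three V's) can be written as a small check. The following Python sketch is only an illustration: the `Technology` class, its field names and the two sample entries are hypothetical, not taken from IDC or IBM.

```python
from dataclasses import dataclass

@dataclass
class Technology:
    """Hypothetical description of a candidate technology."""
    name: str
    affordable: bool
    handles_variety: bool
    handles_volume: bool
    handles_velocity: bool

def is_big_data_technology(tech: Technology) -> bool:
    """Affordable, and meets at least two of the three V criteria."""
    v_count = sum([tech.handles_variety, tech.handles_volume, tech.handles_velocity])
    return tech.affordable and v_count >= 2

# Invented examples: a fast system over a few hundred GB of mixed data,
# and a large but expensive archive of structured data only.
fast_mixed = Technology("fast-mixed", True, True, False, True)
big_archive = Technology("big-archive", False, False, True, False)

results = {t.name: is_big_data_technology(t) for t in (fast_mixed, big_archive)}
```

The first example qualifies without meeting the volume criterion, which matches the point that hundreds of GB can count as big data when velocity is the deciding dimension.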

Gartner said that global information volume is growing at an annual rate of more than 59%, and that volume is a significant challenge in managing data and business. IT leaders must focus on information volume, variety and velocity.

Volume: The increase in data volume within enterprise systems is driven by transaction volumes, other traditional data types and new data types. Excessive volume is a storage problem, but too much data is also a large-scale analysis problem.

Variety: IT leaders have always struggled to turn large amounts of transactional information into decisions. Now there are many more types of information to analyze, mainly from social media and mobile (context awareness). Varieties include tabular data (databases), hierarchical data, documents, email, metering data, video, still images, audio, stock ticker data, financial transactions and many more.

Velocity: This relates to data streams, the creation of structured records, and availability for access and delivery. Velocity means both how quickly data is generated and how quickly it must be processed to meet demand.

Although big data is a major issue, Gartner analysts said that the real issue is making big data more meaningful and finding patterns in big data to help organizations make better business decisions.

Zhuzishuijia talks about how to define "big data"

Although the term "Big Data" can be translated as either "big data" or "massive data", the two are not the same thing.

Definition 1: Big data = massive data and complex types of data

Dan Bin, Chief Product Consultant at Informatica China, believes that "big data" includes the meaning of "massive data" and goes beyond it in content. In short, "big data" is "massive data" plus complex types of data.

Dan Bin further pointed out that big data includes all data sets, including transaction and interaction data sets, whose scale or complexity exceeds the ability of commonly used technologies to capture, manage and process them within reasonable cost and time limits.

Big data results from the convergence of three major technological trends:

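Massive transaction data: In online transaction processing (OLTP) and analytics systems, from ERP applications to data warehouse applications, traditional relational data as well as unstructured and semi-structured information continue to grow. The picture becomes more complex as enterprises move more data and business processes to public and private clouds.

Massive interaction data: This new force consists of social media data from Facebook, Twitter, LinkedIn and other sources. It includes call detail records (CDR), device and sensor information, GPS and geolocation mapping data, massive image files transferred via Managed File Transfer protocols, web text and clickstream data, scientific information, email and more.

Massive data processing: The emergence of big data has given rise to architectures designed for data-intensive processing, such as the open-source Apache Hadoop, which runs on clusters of commodity hardware. For enterprises, the challenge is to get data into and out of Hadoop quickly and reliably in a cost-effective way.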
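Hadoop's processing model is MapReduce, and the idea can be sketched in plain Python. The code below is a single-process illustration of the map, shuffle and reduce steps using word counting. It does not use any Hadoop API, and the function names and sample records are invented for this example.

```python
from collections import defaultdict

def map_phase(record):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]

def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values emitted for one key into a single count."""
    return key, sum(values)

# Invented sample input; on a real cluster the records would be spread over many nodes.
records = ["big data is big", "massive data is not always big data"]
pairs = [pair for record in records for pair in map_phase(record)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

Because each record is mapped independently and each key is reduced independently, the same logic can be distributed across many commodity machines, which is what makes this style of architecture suited to data-intensive workloads.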

Definition 2: Big data includes three elements A, B, and C

How should big data be understood? Chen Wen, general manager of NetApp Greater China, believes that big data means getting information faster so as to do things differently and achieve breakthroughs. Big data is defined as large amounts of data (often unstructured) that require us to rethink how we store, manage and recover data. So how big is big? One way to think about it is that the data is so big that none of the tools we use today can handle it, so the key is how to digest the data and transform it into valuable insight and information.

Based on the workload requirements learned from customers, NetApp understands big data as comprising three elements, A, B and C: analytics (Analytics), bandwidth (Bandwidth) and content (Content).

1. Big Analytics helps gain insight: it refers to the requirement for real-time analysis of huge data sets, which can bring new business models, better customer service and better results.

2. Big Bandwidth helps you go faster: it refers to the requirement to process critical data at extremely high speeds, enabling fast and efficient digestion and processing of large data sets.

3. Big Content means no information is lost: it refers to highly scalable data storage that requires extremely high security and can be easily restored. It supports a manageable repository of information content, not just outdated data, and can span different continents.

Big data is a disruptive economic and technological force that introduces new infrastructure for IT support. Big data solutions remove traditional computing and storage limitations. With growing volumes of private and public data, an epoch-making new business model is emerging, one that is expected to bring big data customers substantial new sources of revenue growth and competitive advantage.
