
"Big data" includes the meaning of "massive data" but goes beyond it in content. In short, "big data" is "massive data" plus complex types of data. Big data covers all data sets, including transactional and interactive data sets, whose size or complexity exceeds the ability of commonly used technologies to capture, manage and process them at a reasonable cost and within a reasonable time.

What is the difference and connection between big data and massive data?

If the data were merely massive structured data, the solution would be relatively simple: users could resolve such issues by purchasing more storage devices, improving the efficiency of their storage devices, and so on. However, once people find that the data in a database falls into three types (structured, unstructured and semi-structured), along with other complications, the problem no longer seems so simple.
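The three data types named above can be made concrete with a small sketch. The samples below are invented for illustration and do not come from the article; the point is only that each type calls for a different way of getting at its contents.

```python
import csv
import io
import json

# Hypothetical samples, invented for illustration only.
structured = "order_id,amount\n1001,25.50\n1002,13.00\n"
semi_structured = '{"user": "alice", "tags": ["sale", "mobile"], "geo": {"lat": 31.2}}'
unstructured = "Loved the new phone, but delivery took far too long."

# Structured: every row has the same fixed columns, so a schema-driven
# parser can read it and aggregate directly.
rows = list(csv.DictReader(io.StringIO(structured)))
total = sum(float(row["amount"]) for row in rows)

# Semi-structured: self-describing, but fields may be nested or absent,
# so lookups have to tolerate missing keys.
doc = json.loads(semi_structured)
lat = doc.get("geo", {}).get("lat")
lon = doc.get("geo", {}).get("lon")  # not present in this sample

# Unstructured: there are no fields at all; any structure has to be
# derived, here by a naive split into lowercase words.
words = unstructured.lower().replace(",", "").replace(".", "").split()

print(total, lat, lon, len(words))
```

Adding storage helps with the first case, but the second and third need different processing, which is why the mix of types makes the problem harder than volume alone.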

Big data is coming

When complex types of data come flooding in, their impact on users' IT systems has to be handled in a different way. Many industry experts and third-party research firms have concluded from market research data that the era of big data is coming. One survey found that 85% of this complex data is unstructured data, which is widespread in social networks, the Internet of Things, e-commerce and elsewhere. The generation of this unstructured data often accompanies the continual emergence and adoption of new channels and technologies such as social networks, mobile computing and sensors.

There is also a lot of hype and a lot of uncertainty surrounding the concept of big data today. For this reason, the editor consulted several industry experts, asking them what big data is and is not and how to deal with it, and will present their views to readers in a series of articles.

Some people also refer to multi-terabyte data sets as "big data". According to statistics from the market research company IDC, data usage is expected to grow 44-fold, with global data usage reaching approximately 35.2 ZB (1 ZB = 1 billion TB). The file size of individual data sets will also increase, however, so more processing power will be needed to analyze and understand them.

EMC has said that more than 1,000 of its customers store more than 1 petabyte of data in its arrays, and that this number will grow to 100,000 by 2020. Within a year or two, some customers will start using a thousand times as much data: 1 exabyte (1 exabyte = 1 billion GB) or more.
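The unit conversions quoted in this section can be checked with a few lines of arithmetic. This is a minimal sketch that assumes decimal (SI) units, where each step is a factor of 1,000; the article does not say whether SI or binary units are meant, and the `convert` helper is a name made up for this example.

```python
# Decimal (SI) storage units, expressed in bytes. Using SI rather than
# binary (1024-based) units is an assumption made for this sketch.
GB = 10**9
TB = 10**12
PB = 10**15
EB = 10**18
ZB = 10**21


def convert(amount, from_unit, to_unit):
    """Convert an amount from one unit to another (units given in bytes)."""
    return amount * from_unit / to_unit


print(convert(1, ZB, TB))     # 1 ZB expressed in TB
print(convert(1, EB, GB))     # 1 EB expressed in GB
print(convert(1, EB, PB))     # 1 EB expressed in PB
print(convert(35.2, ZB, TB))  # the 35.2 ZB forecast expressed in TB
```

Under this assumption, 1 ZB is one billion TB and 1 EB is one billion GB, matching the figures in the text, and 1 EB is a thousand times 1 PB.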

For large enterprises, the rise of big data is partly due to the availability of computing power at lower cost and to systems that are now capable of multitasking. In addition, the cost of memory is plummeting, so companies can process more data in memory than ever before, and it is becoming ever easier to aggregate computers into server clusters. IDC believes that the combination of these three factors gave rise to big data. IDC also states that for a technology to count as a big data technology it must first be affordable, and second it must meet two of the three "V" criteria described by IBM: variety, volume and velocity.
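The rule attributed to IDC above (affordable, plus at least two of the three "V" criteria) can be written as a simple check. This is only an illustrative sketch: the function name and its boolean parameters are hypothetical, and deciding whether a given technology actually satisfies each criterion is left to the reader.

```python
def meets_idc_criteria(affordable, variety, volume, velocity):
    """Return True if a technology is affordable and satisfies at least
    two of the three "V" criteria (illustrative helper, not an IDC tool)."""
    v_count = sum([bool(variety), bool(volume), bool(velocity)])
    return bool(affordable) and v_count >= 2


# Affordable, handles large volume at high velocity, but only one data type.
print(meets_idc_criteria(True, variety=False, volume=True, velocity=True))

# Satisfies all three "V" criteria but is not affordable.
print(meets_idc_criteria(False, variety=True, volume=True, velocity=True))
```

The first call qualifies and the second does not, since affordability is a precondition and not one of the criteria being counted.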

The difference between big data and massive data

Variety means that the data should include both structured and unstructured data.

Volume means that the amount of data aggregated for analysis must be very large.

Velocity means that the data must be processed very quickly.

Big data does not always mean hundreds of TB. Depending on actual usage, hundreds of GB of data can sometimes also be called big data. This depends mainly on the third dimension, that is, the velocity or time dimension.
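A short calculation shows why the time dimension matters. The workloads and deadlines below are hypothetical numbers chosen for illustration, not figures from the article: the same few hundred GB is an easy job with a long batch window and a demanding one when the answer is needed within a minute.

```python
def required_throughput_gb_per_s(data_gb, deadline_s):
    """Sustained throughput (GB/s) needed to process data_gb within deadline_s."""
    if deadline_s <= 0:
        raise ValueError("deadline_s must be positive")
    return data_gb / deadline_s


# Hypothetical workloads: the same 300 GB under two different deadlines.
overnight = required_throughput_gb_per_s(300, 8 * 3600)  # 8-hour batch window
realtime = required_throughput_gb_per_s(300, 60)         # one-minute deadline

print(f"overnight batch: {overnight:.4f} GB/s")
print(f"one-minute deadline: {realtime:.1f} GB/s")
print(f"ratio: {realtime / overnight:.0f}x")
```

The volume is identical in both cases; only the deadline changes, and with it the throughput the system has to sustain.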

Gartner has said that the global volume of information is growing at an annual rate of more than 59%, and that volume is a significant challenge in managing data and business. IT leaders must pay attention to the volume, variety and velocity of information.

Volume: The growth of data volume within enterprise systems is driven by transaction volume, other traditional data types and new data types. Excessive volume is a storage problem, but too much data is also a problem for large-scale analysis.

Variety: IT leaders have always struggled to turn large amounts of transactional information into decisions, and now there are many more types of information to analyze, mainly from social media and mobile (context awareness). The types include tabular data (databases), hierarchical data, files, email, metered data, video, still images, audio, stock ticker data, financial transactions and many more.

Velocity: This concerns the flow of data, the creation of structured records, and the availability of access and delivery. Velocity means how quickly data is being generated and how quickly it must be processed to meet demand.

Although big data is a major issue, Gartner analysts said that the real issue is making big data more meaningful and finding patterns in big data to help organizations make better business decisions.

Different schools of thought on how to define "big data"

Although the English term "Big Data" can be translated as either "big data" or "massive data", there is a difference between the two.

Definition 1: Big data = massive data and complex types of data

Dan Bin, Chief Product Consultant at Informatica China, believes that "big data" includes the meaning of "massive data" but goes beyond it in content. In short, "big data" is "massive data" plus complex types of data.

Dan Bin further pointed out that big data covers all data sets, including transaction and interaction data sets, whose scale or complexity exceeds the ability of commonly used technologies to capture, manage and process them at a reasonable cost and within a reasonable time.

Big data is formed by the convergence of three major technology trends:

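Massive transaction data: In online transaction processing (OLTP) and analytical systems, from ERP applications to data warehouse applications, traditional relational data as well as unstructured and semi-structured information continue to grow. The picture becomes more complicated as enterprises move more data and business processes to public and private clouds.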

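Massive interaction data: This new force consists of social media data from Facebook, Twitter, LinkedIn and other sources. It includes call detail records (CDR), device and sensor information, GPS and geolocation mapping data, massive image files transferred via Managed File Transfer protocols, web text and clickstream data, scientific information, email and more.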

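Massive data processing: The emergence of big data has given rise to architectures designed for data-intensive processing, such as Apache Hadoop, which is open source and runs on clusters of commodity hardware. For enterprises, the challenge is getting data into and out of Hadoop quickly, reliably and cost-effectively.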

Definition 2: Big data includes three elements A, B, and C

How should big data be understood? Chen Wen, general manager of NetApp Greater China, believes that big data means obtaining information faster so that things can be done differently and breakthroughs achieved. Big data is defined as large amounts of data (often unstructured) that require us to rethink how we store, manage and recover data. So how big is too big? One way to think about it is that the data is so large that none of the tools we use today can handle it, so the key is how to digest the data and transform it into valuable insights and information.

Based on the workload requirements it has learned from customers, NetApp understands big data as comprising three elements, A, B and C: analytics (Analytics), bandwidth (Bandwidth) and content (Content).

1. Big Analytics helps gain insights: this refers to the requirement for real-time analysis of huge data sets, which can bring new business models, better customer service and better results.

2. Big Bandwidth helps go faster: this refers to the requirement for processing critical data at extremely high speed. It enables fast and efficient digestion and processing of large data sets.

3. Big Content means no information is lost: this refers to highly scalable data storage that requires extremely high security and can be easily restored. It supports a manageable repository of information content, not just outdated data, and can span different continents.

Big data is a disruptive economic and technological force that introduces new infrastructure for IT support. Big data solutions remove traditional computing and storage limitations. With the help of growing private and public data, groundbreaking new business models are emerging, which are expected to bring big data customers substantial new sources of revenue growth and competitive advantage.
