
Data and Artificial Intelligence Technology Forecast for the Second Half of 2022


Based on what we've seen so far in 2022, Datanami is confident it can make these five predictions for the rest of the year.


Data Observability Continues

The first half of the year has been huge for data observability, which lets customers see how data flows through their pipelines and track metrics on its quality. As data becomes more important to decision making, so does the health and availability of that data.
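As a rough illustration of the kinds of indicators these tools track, here is a minimal sketch of computing basic data-health metrics with pandas. It is not tied to any particular vendor, and the table, column names, and any alerting threshold are hypothetical.

```python
# Minimal sketch of table-level data-health metrics (freshness, null rate, row count).
# The "orders" table and its columns are hypothetical examples.
import pandas as pd

def health_metrics(df: pd.DataFrame, ts_col: str) -> dict:
    """Compute a few basic data-quality indicators for one table."""
    now = pd.Timestamp.now(tz="UTC")
    freshness = now - pd.to_datetime(df[ts_col], utc=True).max()  # time since newest record
    return {
        "row_count": len(df),
        "freshness_minutes": freshness.total_seconds() / 60,
        "null_rate": df.isna().mean().to_dict(),                  # per-column share of nulls
    }

orders = pd.DataFrame({
    "order_id": [1, 2, 3],
    "amount": [10.0, None, 7.5],
    "updated_at": ["2022-07-01T08:00:00Z", "2022-07-01T09:00:00Z", "2022-07-01T10:00:00Z"],
})

metrics = health_metrics(orders, ts_col="updated_at")
print(metrics)  # e.g. raise an alert if freshness_minutes exceeds an agreed SLA
```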

We’ve seen a number of data observability startups raise hundreds of millions of dollars in venture capital, including Cribl ($150 million Series D), Monte Carlo ($135 million Series D), Coralogix ($142 million Series D), and others. Other companies making news include Bigeye, which launched metadata metrics; StreamSets, acquired by Software AG for $580 million; and IBM, which acquired observability startup Databand last month.

This momentum will continue in the second half of 2022, as more data observability startups come out of the woodwork and existing startups seek to solidify their positions in this emerging market.

Real-Time Data Goes Mainstream

Real-time data has been on the back burner for years, serving niche use cases but never achieving broad adoption in mainstream enterprises. However, thanks to the COVID-19 pandemic and the restructuring of business plans that came with it over the past few years, conditions are now ripe for real-time data to enter the mainstream tech scene.

“I think streaming is finally happening,” Databricks CEO Ali Ghodsi said at the recent Data + AI Summit, noting a 2.5x increase in streaming workloads on the company’s cloud-based data platform. “They have more and more AI use cases that require real-time.”
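For readers unfamiliar with what such a streaming workload looks like, below is a minimal Spark Structured Streaming sketch. It is generic rather than Databricks-specific; the Kafka broker, topic name, and output paths are hypothetical, and the Spark-Kafka connector package must be available on the cluster.

```python
# Minimal Spark Structured Streaming sketch: read events from Kafka, land them as Parquet.
# Broker address, topic, and paths are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("streaming-sketch").getOrCreate()

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # hypothetical broker
    .option("subscribe", "clickstream")                  # hypothetical topic
    .load()
)

# Kafka records arrive as bytes; cast key and value to strings for downstream parsing.
parsed = events.select(col("key").cast("string"), col("value").cast("string"))

query = (
    parsed.writeStream.format("parquet")
    .option("path", "/tmp/clickstream")                  # hypothetical sink
    .option("checkpointLocation", "/tmp/clickstream_ckpt")
    .start()
)
query.awaitTermination()
```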

In-memory databases and in-memory data grids are also poised to benefit if a real-time renaissance materializes. RocksDB, the embedded key-value store that underpins event-streaming systems such as Kafka Streams, now has a drop-in replacement in Speedb. SingleStore, which combines OLTP and OLAP capabilities in a single relational framework, reached a $1.3 billion valuation in a funding round last month.

There is also StarRocks, which recently received funding for its fast new OLAP database based on Apache Doris; Imply, which completed a $100 million Series D round in May to continue its real-time analytics business built on Apache Druid; and DataStax, which added Apache Pulsar to its Apache Cassandra toolkit and raised $115 million to advance real-time application development. Datanami expects this focus on real-time data analytics to continue.

Regulatory Growth

It’s been four years since the GDPR came into effect, putting big data users in the spotlight and accelerating the rise of data governance as a necessary component of responsible data initiatives. In the United States, the task of regulating data access has fallen to the states, with California leading the way with the CCPA, which in many ways is modeled after the GDPR. But more states are likely to follow suit, complicating the data privacy equation for U.S. companies.

But GDPR and CCPA are just the beginning. We're also in the midst of the demise of third-party cookies, which will make it harder for companies to track users' online behavior. Google's decision to delay the end of third-party cookies on its platform until January 1, 2023 gives marketers some extra time to adapt, but the information those cookies provided will be difficult to replicate.

In addition to data regulations, we are also on the cusp of new rules governing the use of artificial intelligence. The EU introduced its Artificial Intelligence Act in 2021, and experts predict it could become law by the end of 2022 or early 2023.

Table format war

A classic technology war is shaping up over the new table formats that will determine how data is stored in big data systems, who can access it, and what can be done with it.

In recent months, Apache Iceberg has gained momentum as a potential new standard for data table formats. Cloud data warehouse giants Snowflake and AWS came out earlier this year to back Iceberg, which provides transactional and other data controls and has emerged from work at Netflix and Apple. Former Hadoop distributor Cloudera also backed Iceberg in June.

But the folks at Databricks offer an alternative: the Delta Lake table format, which provides similar functionality to Iceberg. The Apache Spark backer originally developed Delta Lake in a partly proprietary manner, leading to accusations that Databricks was locking customers in. But at the Data + AI Summit in June, the company announced that it would make the entire format open source, allowing anyone to use it.
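To make concrete what "transactional and other data controls" means in practice, here is a minimal sketch using the open-source delta-spark package: versioned, ACID writes to a table plus "time travel" reads of earlier versions. The paths and table contents are hypothetical, and Iceberg and Hudi expose comparable capabilities through their own APIs.

```python
# Minimal Delta Lake sketch: two versioned writes, then a read of the table as of version 0.
# Requires the delta-spark package; the table path and data are hypothetical.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("delta-sketch")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "val"])
df.write.format("delta").mode("overwrite").save("/tmp/demo_table")   # creates version 0

df2 = spark.createDataFrame([(3, "c")], ["id", "val"])
df2.write.format("delta").mode("append").save("/tmp/demo_table")     # creates version 1

# Time travel: read the table as it looked at an earlier version.
v0 = spark.read.format("delta").option("versionAsOf", 0).load("/tmp/demo_table")
v0.show()
```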

Lost in the shuffle is Apache Hudi, which also provides transactional consistency for data that resides in big data repositories and is accessed by various compute engines. Onehouse, a business backed by the creators of Apache Hudi, launched a Hudi-based lakehouse platform earlier this year.

The big data ecosystem loves competition, so it will be interesting to watch these formats evolve and compete throughout the rest of 2022.

Language AI continues to amaze

The frontiers of artificial intelligence are advancing every month, and today the spearhead of AI is large language models, which keep getting better. In fact, large language models have become so good that in June a Google engineer claimed that the company's LaMDA conversational system had become sentient.

Artificial intelligence is not yet sentient, but that doesn’t mean these models aren’t useful to businesses. As a reminder, Salesforce has a large language model (LLM) project called CodeGen, which is designed to understand source code and even generate its own code in different programming languages.
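As a rough illustration of how a code-generating model like this is used, here is a minimal sketch with the Hugging Face Transformers library. The checkpoint name below is one of the publicly released CodeGen sizes and is chosen only as an example, not as the specific model Salesforce describes in its research.

```python
# Rough sketch of prompting a code-generating language model with Hugging Face Transformers.
# The checkpoint below is a small, Python-focused CodeGen variant used purely for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "Salesforce/codegen-350M-mono"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Give the model the start of a function and let it complete the body.
prompt = "# Python function that returns the n-th Fibonacci number\ndef fib(n):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```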

Last month, Meta (the parent company of Facebook) launched a massive language model that can translate between 200 languages. We’ve also seen efforts to democratize AI through projects like the BigScience Large Open Science Open Access Multilingual Language Model, or BLOOM.

