ChatGPT application is booming. Where to find a secure big data base?

王林

May 21, 2023 pm 03:31 PM

There is no doubt that AIGC is bringing a profound change to human society.

Strip away its dazzling exterior, and the core of its operation turns out to be inseparable from massive amounts of data.

ChatGPT’s “intrusion” has caused concerns about content plagiarism in all walks of life, as well as increased awareness of network data security.

Although AI technology is neutral, that neutrality is no excuse for avoiding responsibilities and obligations.

Recently, the British intelligence agency Government Communications Headquarters (GCHQ) warned that ChatGPT and other artificial intelligence chatbots pose a new security threat.

Although ChatGPT has not been around for long, the threats it poses to network security and data security have already become a focus of the industry.

Regarding ChatGPT, which is still in its early stages of development, are such worries unfounded?

Security threats may already be occurring

At the end of last year, the startup OpenAI launched ChatGPT. This year, its investor Microsoft launched "Bing Chat", a chatbot based on ChatGPT technology.

Because this type of software can hold human-like conversations, the service has become popular all over the world.

GCHQ’s cybersecurity arm noted that companies that provide AI chatbots can see the content of queries entered by users. As far as ChatGPT is concerned, its developer OpenAI can see this.

ChatGPT is trained on large text corpora, and its deep learning capabilities rely heavily on the data behind it.

Due to concerns about information leakage, many companies and institutions have issued "ChatGPT bans".

City of London law firm Mishcon de Reya has banned its lawyers from entering client data into ChatGPT over concerns that legally privileged information could be compromised.

International consulting firm Accenture warned its 700,000 employees worldwide not to use ChatGPT for similar reasons, fearing that confidential client data could end up in the wrong hands.

Japan’s SoftBank Group, the parent company of British computer chip company Arm, also warned its employees not to enter company personnel’s identifying information or confidential data into artificial intelligence chatbots.

In February this year, JPMorgan Chase became the first Wall Street investment bank to restrict the use of ChatGPT in the workplace.

Citigroup and Goldman Sachs followed suit, with the former banning access to ChatGPT company-wide and the latter restricting employees from using the product on the trading floor.

Earlier, to prevent employees from leaking secrets when using ChatGPT, Amazon and Microsoft prohibited them from sharing sensitive data with it, because this information may be used as training data for further iterations.

In fact, behind these artificial intelligence chatbots are large language models (LLMs), and users' queries are stored and will be used at some point in the future to develop LLM services or models.

This means that the LLM provider can read related queries and possibly incorporate them into future releases in some way.

Although LLM operators should take steps to protect data, the possibility of unauthorized access cannot be completely ruled out. Therefore, enterprises need strict policies, backed by technical measures, to monitor the use of LLMs and minimize the risk of data exposure.
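
One way such a technical measure might look is a filter that redacts sensitive strings before a prompt leaves the company network. The sketch below is only an illustration under stated assumptions: the pattern set, the `CLIENT-` identifier format, and the function name `redact_prompt` are hypothetical and not taken from any vendor's tooling, and a real policy would need far broader coverage than two regular expressions.

```python
import re

# Hypothetical rules; a real enterprise policy would define its own patterns.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # Assumed internal client identifier format, e.g. CLIENT-123456.
    "client_id": re.compile(r"\bCLIENT-\d{6}\b"),
}


def redact_prompt(prompt: str) -> tuple[str, list[str]]:
    """Replace sensitive matches with placeholders and report which rules fired."""
    findings = []
    for label, pattern in SENSITIVE_PATTERNS.items():
        prompt, count = pattern.subn(f"[{label.upper()} REDACTED]", prompt)
        if count:
            findings.append(label)
    return prompt, findings


if __name__ == "__main__":
    clean, findings = redact_prompt(
        "Summarise the contract for alice@example.com, ref CLIENT-123456."
    )
    print(clean)      # redacted text that would be forwarded to the chatbot
    print(findings)   # rule names that could be written to an audit log
```

The returned list of rule names gives the monitoring side something to log without storing the sensitive text itself.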

In addition, although ChatGPT itself cannot directly attack network security or data security, its ability to generate and understand natural language means it can be used to fabricate false information, carry out social engineering attacks, and so on.

In addition, attackers can also use natural language to let ChatGPT generate corresponding attack code, malware code, spam, etc.

As a result, AI can enable people who previously lacked the ability to launch attacks to generate them, and it can greatly increase the success rate of attacks.

With the support of technologies and models such as automation, AI, and "attack as a service", network security attacks have shown a skyrocketing trend.

Before ChatGPT became popular, there had been many cyber attacks by hackers using AI technology.

In fact, it is not uncommon for users to steer an AI system off course. In 2016, Microsoft launched the chatbot Tay. At launch Tay was polite, but in less than 24 hours unscrupulous users had "led it astray": it used crude and offensive language constantly, and its output touched on racism, pornography, and Nazism, full of discrimination, hatred, and prejudice. Microsoft had to take it offline, ending its short life.

On the other hand, the risk closer to the user is that when using AI tools such as ChatGPT, users may inadvertently enter private data into the cloud model. This data may become training data and may also become part of the answers provided to others, leading to data breaches and compliance risks.

AI applications must lay a secure foundation

As a large language model, the core logic of ChatGPT is essentially the collection and processing of data and the output of the computed results.

In general, these links may be associated with risks in three aspects: technical elements, organizational management, and digital content.

Although OpenAI has stated that ChatGPT will strictly abide by privacy and security policies when storing the data required for training and running models, data security risks such as cyber attacks and data crawling cannot be ignored in the future.

Especially when it comes to the capture, processing, and combined use of national core data, important local and industry data, and personal privacy data, it is necessary to balance data security protection against data circulation and sharing.

In addition to the hidden dangers of data and privacy leaks, AI technology also has problems such as data bias, false information, and difficulty in interpreting models, which may lead to misunderstanding and distrust.

The trend has arrived, and the wave of AIGC is coming. Against the backdrop of a promising future, it is crucial to move forward and establish a data security protection wall.

Especially when AI technology gradually improves, it can not only become a powerful tool for productivity improvement, but it can also easily become a tool for illegal crimes.

Monitoring data from the Qi’anxin Threat Intelligence Center shows that from January to October 2022, more than 95 billion pieces of Chinese institutional data were illegally traded overseas, of which more than 57 billion pieces were personal information.

Therefore, how to ensure the security of data storage, calculation, and circulation is a prerequisite for the development of the digital economy.

From an overall perspective, top-level design and industrial development should go hand in hand. On the basis of the "Cybersecurity Law", the risk and responsibility analysis system should be refined and a security accountability mechanism should be established.

At the same time, regulatory authorities can carry out regular inspections, and companies in the security field can work together to build a full-process data security system.

Regarding the issues of data compliance and data security, especially after the introduction of the "Data Security Law", data privacy is becoming more and more important.

If data security and compliance cannot be guaranteed during the application of AI technology, it may cause great risks to the enterprise.

In particular, small and medium-sized enterprises have relatively little knowledge about data privacy security and do not know how to protect data from security threats.

Data security compliance is not a matter for a certain department, but the most important matter for the entire enterprise.

Enterprises should train employees so that everyone who uses data understands their obligation to protect it, including IT personnel, AI departments, data engineers, developers, and people who use reports. People and technology need to be integrated.

Faced with the aforementioned potential risks, how can regulators and relevant companies strengthen data security protection in the AIGC field from the institutional and technical levels?

Compared with regulatory measures that directly restrict use on the user side, it will be more effective to explicitly require companies that develop AI technology to follow the ethical principles of science and technology, because these companies can limit users' scope of use at the technical level.

At the institutional level, it is necessary to establish and improve a data classification and hierarchical protection system based on the characteristics and functions of the data required by AIGC's underlying technology.

For example, the data in a training data set can be classified by data subject, data processing level, data rights attributes, and so on, and then graded according to its value to the holder of the data rights and the degree of harm to the data subject if the data is tampered with, destroyed, or otherwise compromised.

On the basis of data classification and grading, establish data protection standards and sharing mechanisms that match the data type and security level.
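
The classify-then-grade idea above can be sketched in a few lines. Everything in this example is an assumption made for illustration: the four level names, the 1 to 3 scoring scale, the `grade` rules, and the `MAX_LEVEL_FOR_USE` policy table are hypothetical and do not come from any law or standard mentioned in this article.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical security levels, ordered from least to most sensitive."""
    PUBLIC = 1
    INTERNAL = 2
    PERSONAL = 3
    CORE = 4


# Hypothetical policy: the highest level permitted for each kind of use.
MAX_LEVEL_FOR_USE = {
    "model_training": Level.INTERNAL,
    "external_sharing": Level.PUBLIC,
}


@dataclass
class Record:
    name: str
    value_to_owner: int      # 1 (low) to 3 (high), assumed scale
    harm_if_tampered: int    # 1 (low) to 3 (high), assumed scale
    has_personal_info: bool


def grade(record: Record) -> Level:
    """Grade a record by its value to the owner and the harm if compromised."""
    score = max(record.value_to_owner, record.harm_if_tampered)
    if score >= 3:
        return Level.CORE
    if record.has_personal_info:
        return Level.PERSONAL
    if score == 2:
        return Level.INTERNAL
    return Level.PUBLIC


def is_allowed(record: Record, use: str) -> bool:
    """Check the record's level against the policy for the requested use."""
    return grade(record) <= MAX_LEVEL_FOR_USE[use]


if __name__ == "__main__":
    press_release = Record("press release", 1, 1, False)
    customer_list = Record("customer list", 2, 2, True)
    print(grade(press_release).name, is_allowed(press_release, "external_sharing"))
    print(grade(customer_list).name, is_allowed(customer_list, "model_training"))
```

Keeping the policy in a single table separates the grading rules from the sharing rules, so either can be tightened without touching the other.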

At the enterprise level, it is also necessary to accelerate the application of "privacy computing" technology in the AIGC field.

This type of technology allows multiple data owners to share, interoperate on, compute over, and model data by sharing an SDK or opening SDK permissions, without exposing the data itself. It ensures that AIGC can provide services normally while the data is not leaked to other participants.
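
The article does not say which technique sits behind such an SDK. As one simple member of the privacy-computing family, additive secret sharing shows how several owners can compute a joint total without any of them revealing its own number. The sketch below simulates all parties in a single process; the function names and the choice of `MODULUS` are assumptions made for illustration, and it is not a production protocol.

```python
import secrets

# Assumed modulus; it only needs to exceed any sum the parties could produce.
MODULUS = 2**61 - 1


def make_shares(value: int, n_parties: int) -> list[int]:
    """Split value into n random-looking shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(private_values: list[int]) -> int:
    """Simulate a joint sum in which no party sees another party's raw value."""
    n = len(private_values)
    # Each owner splits its value and hands one share to every participant.
    all_shares = [make_shares(v, n) for v in private_values]
    # Participant j only sees column j, which looks random on its own.
    partial_sums = [sum(row[j] for row in all_shares) % MODULUS for j in range(n)]
    # Only the partial sums are published and combined into the total.
    return sum(partial_sums) % MODULUS


if __name__ == "__main__":
    # Three hypothetical data owners, each holding one private count.
    print(secure_sum([120, 75, 300]))
```

Each individual share is uniformly random, so a participant learns nothing from the shares it receives; only the combined total is revealed.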

In addition, the importance of full-process compliance management has become increasingly prominent.

Enterprises should first check whether the data resources they use comply with legal and regulatory requirements. Second, they should ensure that the entire process of algorithm and model operation is compliant. Their innovative research and development should also meet the public's ethical expectations as far as possible.

At the same time, enterprises should formulate internal management standards and set up supervision departments to oversee data in every AI application scenario, so that data sources, data processing, and data output are all legal. This is how an enterprise ensures its own compliance.

The key to applying AI lies in weighing deployment method against cost. However, it must be noted that if security compliance and privacy protection are handled poorly, they may become an even greater risk point for the enterprise.

AI is a double-edged sword. If used well, enterprises will be even more powerful; if used improperly, neglecting security, privacy and compliance will bring greater losses to the enterprise.

Therefore, before AI can be applied, it is necessary to build a more stable "data base". As the saying goes, only stability can lead to long-term development.

Statement

This article is reproduced from 51CTO.COM. If there is any infringement, please contact admin@php.cn to have it deleted.
