
Application of text emotion recognition technology based on deep learning in the 5G bad information security management and control platform

王林
2023-04-09 16:41:06

Author | Sun Yue, Affiliation: China Mobile (Hangzhou) Information Technology Co., Ltd. | China Mobile Hangzhou R&D Center

Labs Introduction

As 5G networks continue to develop and gain popularity, a large number of users are beginning to come into contact with and use them. 5G networks can not only carry the voice, video, text, and other information of traditional networks, but can also serve more practical application scenarios thanks to lower latency and high-precision positioning capabilities, such as live battlefield information, satellite positioning, and navigation.

Internet information is often mixed with bad information, such as politics-related, pornographic, gang-related, fraudulent, and commercial advertising information. The volume of bad information is increasing year by year, causing serious harassment to users. To purify the network environment and effectively control the spread of bad information, China Mobile's 5G bad information security management and control platform came into being.


Data source: China Mobile Group Information Security Center

1. Application scenarios of the 5G bad information management and control platform

Faced with a complex network information environment, the platform takes messages of all kinds, such as text messages, voice messages, video messages, and rich media messages, and classifies them as politics-related, pornographic, gang-related, fraud-related, commercial advertising, normal, and so on. It then intercepts bad messages promptly through the corresponding strategies. Follow-up penalties are applied according to the severity of the bad message, so as to purify the network environment at the root and create a healthy cyberspace.


2. Key technical points of the existing 5G bad information management and control platform

The platform mainly intercepts bad information through the following methods:

① Set first-level keywords: First-level keywords are usually extremely sensitive words. If a user sends a message containing a first-level keyword, the message is intercepted immediately, its content is not delivered, and the user is marked.

② Set common keywords: Common keywords are moderately sensitive words. If a user sends messages containing common keywords and, within a certain period of time, the number of such messages exceeds the system's preset interception threshold, the system adds the user to the blacklist, and for a certain period the user cannot use full 5G network services.

③ Set complex text information monitoring: If a user sends a file that contains both text and pictures, such as a PDF, the text is extracted and filtered through the first-level and common keyword mechanisms, while the pictures are filtered through the rich media mechanism. Of the two filtering results, the stricter one is taken as the processing result for the file.
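The three mechanisms above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the platform's actual implementation: the class and function names, the threshold and window values, and the relative ordering of the `BLACKLIST` and `INTERCEPT` verdicts are all hypothetical, and the keyword match is a plain substring test.

```python
from collections import defaultdict, deque
from enum import IntEnum


class Verdict(IntEnum):
    """Possible outcomes, ordered from lightest to heaviest (assumed ordering)."""
    PASS = 0       # deliver normally
    COUNTED = 1    # common keyword hit, still below the threshold
    BLACKLIST = 2  # common keyword hits exceeded the threshold
    INTERCEPT = 3  # first-level keyword hit: block immediately


class KeywordFilter:
    def __init__(self, first_level, common, threshold=3, window_s=3600):
        self.first_level = set(first_level)
        self.common = set(common)
        self.threshold = threshold        # hypothetical interception threshold
        self.window_s = window_s          # hypothetical time window in seconds
        self.hits = defaultdict(deque)    # user -> timestamps of common hits
        self.marked = set()               # users who sent first-level keywords
        self.blacklist = set()

    def check(self, user, text, now):
        # Mechanism 1: any first-level keyword blocks the message at once.
        if any(word in text for word in self.first_level):
            self.marked.add(user)
            return Verdict.INTERCEPT
        # Mechanism 2: count common keyword hits inside a sliding window.
        if any(word in text for word in self.common):
            window = self.hits[user]
            window.append(now)
            while window and now - window[0] > self.window_s:
                window.popleft()
            if len(window) > self.threshold:
                self.blacklist.add(user)
                return Verdict.BLACKLIST
            return Verdict.COUNTED
        return Verdict.PASS


def file_verdict(text_verdict, image_verdict):
    # Mechanism 3: the heavier of the two results decides the whole file.
    return max(text_verdict, image_verdict)
```

For a file, `file_verdict(text_result, image_result)` simply returns the stricter of the two verdicts, which is one straightforward reading of the "heavier result wins" rule.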

3. Technical weaknesses of the existing 5G bad information management and control platform

The filtering mechanism of the existing 5G bad information security management and control platform can only filter a specified, limited set of phrases and short sentences. With the popularity of the Internet, new words emerge in large numbers every day, and manual addition alone can no longer keep the vocabulary library updated in a timely manner. Moreover, many of the text messages that users send today contain no illegal words at all, yet the thoughts and emotions they express may carry strong negative emotional tendencies. Matching on words and short sentences alone cannot intercept such negative emotional content. Therefore, using text sentiment analysis to submit sentences rich in negative emotional tendencies for review and interception can further strengthen bad information control and reduce the harm that spam does to users.

A text emotion library is built from popular Internet phrases and news messages. The emotions in the text are divided into three categories (positive, neutral, and negative), a corresponding label is added to each text, and a deep learning network is trained on the texts in the library. The trained model can then be used in the 5G bad information management and control platform to intercept messages with bad emotional content.

4. Technical implementation details of the 5G bad information management and control system based on deep learning

This technology consists of three main components: the jieba word segmentation system, phrase vectorization, and the text emotion recognition algorithm. The components interact as follows:

Interaction flow chart of each module

Crawler technology is used to collect Internet phrases and news messages as the original text. The original text is divided into a training set and a test set in a ratio of 8:2, and the text in the training set is labeled. The text is then segmented with the jieba word segmentation tool. For example, the sentence "He came to the Mobile Hangyan Building" is segmented into: He / came to / Mobile / Hangyan / Building. Finally, the segmented data are organized into a corpus. Because the training set and test set are very large (usually millions of records), the segmented corpus is larger still (tens of millions of records). Although the phrases can be stored in the corpus in numbered form, the sheer amount of data easily leads to the curse of dimensionality. Modal particles and similar words that appear in the text, such as "了", "的", and "我", occur very frequently but contribute little to the emotion expressed, so we remove them from the corpus to reduce its dimensionality.

We feed the vectorized phrases of the training set into the deep learning network for training to obtain the corresponding model, and then run the test set through the model to check the recognition results. Once the model reaches a satisfactory accuracy, it is connected to the 5G bad information management and control platform to filter the messages that users send end to end. If bad information is found during filtering, it is intercepted promptly, which makes the platform's interception of bad information more systematic and comprehensive.
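As a rough illustration of the preparation described above, the sketch below performs the 8:2 split, removes high-frequency low-information words, and numbers the remaining phrases. It is a minimal sketch under assumptions: the helper names and the stop-word set are hypothetical, and the token list is written by hand to stand in for the segmenter's output so that the example has no third-party dependency (in practice the jieba library, for example `jieba.lcut(sentence)`, would produce the tokens).

```python
import random
from collections import Counter

# Hypothetical stop-word set, taken from the examples in the text.
STOP_WORDS = {"了", "的", "我"}


def split_dataset(samples, train_ratio=0.8, seed=0):
    """Shuffle a copy of the samples and split it 8:2 by default."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop frequent words that carry little emotional information."""
    return [t for t in tokens if t not in stop_words]


def build_vocab(tokenized_sentences):
    """Number each distinct phrase; id 0 is reserved for unknown words."""
    counts = Counter(t for sent in tokenized_sentences for t in sent)
    return {word: i + 1 for i, (word, _) in enumerate(counts.most_common())}


# Hand-written tokens standing in for segmenter output (illustrative only).
tokens = ["他", "来到", "了", "移动", "杭研", "大厦"]
cleaned = remove_stop_words(tokens)
vocab = build_vocab([cleaned])
```

Removing stop words before building the vocabulary keeps the numbered corpus smaller, which is the dimensionality reduction the paragraph describes.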
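The "check accuracy before connecting the model" step can be expressed as a simple gate. This is a hedged sketch: the function names and the 0.9 threshold are assumptions made for illustration, and `predict` stands for whatever trained model is being evaluated.

```python
def accuracy(predicted, expected):
    """Fraction of test samples whose predicted label matches the annotation."""
    if not expected or len(predicted) != len(expected):
        raise ValueError("predicted and expected must be non-empty and equal in length")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


def ready_to_deploy(predict, test_set, min_accuracy=0.9):
    """Return True when the model meets the (hypothetical) accuracy threshold.

    test_set is a list of (text, label) pairs; predict maps text -> label.
    """
    texts, labels = zip(*test_set)
    predicted = [predict(text) for text in texts]
    return accuracy(predicted, list(labels)) >= min_accuracy
```

Only a model for which `ready_to_deploy` returns `True` would be connected to the platform; otherwise training continues.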


The specific steps are as follows:

  1. Crawl the original text corpus from the Internet and preprocess it: remove modal particles, delete punctuation marks and whitespace, and delete stop words, rare words, and specific words that appear in the text. Use the jieba library to perform word segmentation, cutting sentences accurately into separate phrases.
  2. Divide the crawled text data set into a training set and a test set in a certain proportion. Manually annotate the text sentences as positive, negative, or neutral emotion. Use the jieba library to segment the sentences in the training set and the test set respectively, and build the segmented training set into a corpus.
  3. Vectorize the phrases from step 1 so that each segmented word is mapped to a multi-dimensional, continuous-valued vector, yielding the word vector matrix of the entire data set.
  4. Extract emotions. First extract the clause in which the emotional word is located, which reduces the complexity of the sentence; then predict the position of the emotional object within the clause based on various features; finally, extract the emotion from that position. Emotion extraction obtains the valuable emotional information in the text and determines the role a word or phrase plays in emotional expression. It includes tasks such as identifying the emotion holder, the evaluation object, and the opinion words.
  5. Feed the emotion vectors obtained from the operations above into the deep learning network to obtain a text emotion recognition model, then feed the emotion vectors of the test set into the model and check the test results. Data that the model judges to be normal continue through regular policy filtering, such as text matching and rich media recognition.
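Steps 3 and 5 can be sketched as follows, assuming tokens have already been segmented and cleaned. This is not the authors' model: the embeddings here are random placeholders (a real system would learn them), the dimension and maximum length are arbitrary, and `classify` and `regular_filter` are hypothetical callables standing in for the trained emotion model and the existing keyword and rich media strategies.

```python
import random

EMBED_DIM = 4  # arbitrary, for illustration only


def init_embeddings(vocab, dim=EMBED_DIM, seed=0):
    """Map each word to a continuous-valued vector (random placeholder values)."""
    rng = random.Random(seed)
    return {word: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for word in vocab}


def sentence_matrix(tokens, embeddings, max_len=6, dim=EMBED_DIM):
    """Build a fixed-size word vector matrix: one row per token, zero-padded."""
    rows = [embeddings.get(t, [0.0] * dim) for t in tokens[:max_len]]
    rows += [[0.0] * dim for _ in range(max_len - len(rows))]
    return rows


def route(tokens, classify, regular_filter):
    """Intercept negative messages; pass the rest on to regular policy filtering."""
    if classify(tokens) == "negative":
        return "intercept"
    return regular_filter(tokens)
```

Placing the emotion check in front of the regular filter means the keyword and rich media strategies still run on every message the model considers normal, so the model supplements the existing mechanisms rather than replacing them.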

5. Advantages of 5G interception system incorporating deep learning

Compared with the existing 5G interception system, the 5G interception system integrated with deep learning has the following advantages:

  • Deep learning technology provides effective identification with high reliability;
  • Emotion recognition based on deep learning requires less manual intervention and improves work efficiency;
  • Text emotion recognition effectively compensates for the shortcomings of keyword interception;
  • With text emotion recognition, strategies can be updated automatically and supplemented with new entries in a timely manner, improving efficiency.

Closing remarks:

Deep learning is now applied in a very wide range of fields. Relying on repeated training and self-learning, it can greatly reduce manual workload and improve efficiency and accuracy. It is suitable not only for the bad information interception system described above; I believe that in the near future it will also shine in other emerging fields. Of course, deep learning itself is not perfect and cannot solve every thorny problem. For that very reason, we should keep applying deep learning technology to new scenarios and new fields in order to achieve new breakthroughs and create a better, smarter life in the future.


Statement:
This article is reproduced from 51cto.com. If there is any infringement, please contact admin@php.cn for removal.