
Jiuyan Intelligent Information Filtering: Big Data Technology Promotes Product Upgrades

phpcn_u1852 (Original)
2017-09-11 16:34:51

Platforms such as social networks, live streaming, forums, and e-commerce generate massive amounts of user-generated content (UGC) every day, and a large amount of junk text is inevitably mixed in. This content not only seriously degrades the user experience, but can also expose operators to compliance risks.

Filtering bad information and reviewing content have long been awkward problems for the Internet, and the development of artificial intelligence has finally offered a possible solution. The "Regulations on the Internet Protection of Minors" issued by the Cyberspace Administration of China in 2016 explicitly encourage and support the research, development, production, and promotion of online protection software for minors. With both technological progress and policy support, content review technology has finally reached its spring.

AI-based content review lets a machine apply deep learning to massive amounts of image, text, and video data, continuously increasing both the range of content the system can recognize and the accuracy of its judgments. In short, it applies deep learning techniques to textual and linguistic information. So far, hundreds of companies in the domestic artificial intelligence industry use "intelligent text mining" as their core technology. Within the narrower segment of "bad information filtering", however, there are not many domestic systems that integrate closely with business scenarios and can identify and filter illegal text such as "violent terrorism", "sensitive information", and "small advertisements". The Jiuyan intelligent filtering system is one of them. It integrates cutting-edge technologies from natural language understanding, artificial intelligence, big data analysis, and other fields, and it has three characteristics: it is intelligent, semantic, and real-time.

The Jiuyan Intelligent Filtering System is an intelligent content filtering system for complex, large-scale text data. It identifies common keyword variants, such as phonetic substitutions, deformed characters, and split words, in real time, and it performs precise semantic disambiguation. The system has a built-in knowledge base for the domestic (Chinese) context that is comprehensive and kept up to date in real time, and it is suited to intelligently filtering and discovering uncivilized content in multiple scenarios.
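The variant handling described above can be illustrated with a minimal Python sketch. It is not the Jiuyan implementation: the blocklist, the set of noise characters, and the function names are all assumptions made for illustration. It covers only full-width versus half-width folding, letter case, and inserted separator noise; phonetic substitutions and traditional versus simplified conversion would need additional mapping tables that are not shown here.

```python
import re
import unicodedata

# Hypothetical blocklist, for illustration only (not from the Jiuyan system).
BLOCKED = {"freecoins", "加微信"}

# Characters treated as interference noise between the characters of a
# keyword. This particular set is an assumption.
NOISE = re.compile(r"[\s\.\-_\*\|~·•,，。!！]+")


def normalize(text: str) -> str:
    """Fold common surface variants of a text into one canonical form."""
    # NFKC maps full-width letters and digits to their half-width forms.
    text = unicodedata.normalize("NFKC", text)
    # Case folding makes the comparison case-insensitive.
    text = text.casefold()
    # Remove separators that were inserted to split a keyword apart.
    return NOISE.sub("", text)


def contains_blocked(text: str) -> bool:
    """Return True if any blocked keyword appears in the normalized text."""
    canonical = normalize(text)
    return any(word in canonical for word in BLOCKED)


if __name__ == "__main__":
    print(contains_blocked("Ｆｒｅｅ　Ｃ-ｏ-ｉ-ｎ-ｓ"))
    print(contains_blocked("加 微*信"))
    print(contains_blocked("hello world"))
```

Stripping every separator is deliberately aggressive and will also join harmless neighbouring words, which raises the false-positive rate. That trade-off is one reason a filter of this kind is paired with a semantic disambiguation step.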

The three core technologies of Jiuyan intelligent filtering: intelligent variant identification, semantic disambiguation, and fast real-time processing

1. Intelligent variant identification: Jiuyan intelligent filtering uses a double-array trie for dictionary management and retrieval. The system automatically identifies variants such as deformed characters, phonetic substitutions, split words, traditional versus simplified Chinese, full-width versus half-width characters, and interference noise inserted in the middle of a keyword. The system also supports custom lexicons and can incrementally add lexicons with millions of entries.

2. Semantic disambiguation: Jiuyan intelligent filtering uses the NLPIR semantic word segmentation system and sentiment analysis system to identify and filter content accurately, excluding positive and harmless information and greatly reducing the misjudgment rate.

3. Fast and real-time: Jiuyan intelligent filtering uses a patented algorithm for fast scanning, with a single-machine speed of 30 MB/s. It supports single-machine multi-threading, multi-machine parallelism, and a Hadoop cloud service mode, enabling parallel, efficient online checking of PB-level content.
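To make the dictionary retrieval idea in point 1 concrete, here is a minimal Python sketch of multi-keyword matching with a trie. It uses a plain dictionary-based trie rather than a double-array trie, which is a more compact array encoding of the same structure, and it is not the patented Jiuyan algorithm. The class names, method names, and sample keywords are hypothetical.

```python
from typing import Dict, List, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.terminal = False  # True when a keyword ends at this node


class KeywordTrie:
    """Dictionary-based trie (not a double-array trie) for keyword lookup."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def add(self, word: str) -> None:
        """Insert one keyword; can be called at any time to grow the lexicon."""
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.terminal = True

    def find_all(self, text: str) -> List[Tuple[int, str]]:
        """Return (start_index, keyword) for every keyword found in text."""
        hits: List[Tuple[int, str]] = []
        for start in range(len(text)):
            node = self.root
            # Walk the trie as far as the text allows from this start offset.
            for end in range(start, len(text)):
                node = node.children.get(text[end])
                if node is None:
                    break
                if node.terminal:
                    hits.append((start, text[start:end + 1]))
        return hits


if __name__ == "__main__":
    trie = KeywordTrie()
    for word in ["spam", "spammer", "ad"]:
        trie.add(word)
    print(trie.find_all("no spammer ads"))
```

Because all keywords share one structure, each scan position walks the trie at most as deep as the longest keyword, no matter how many keywords the lexicon holds, and calling `add` again extends the lexicon without rebuilding it. Independent chunks of text can be scanned by separate workers, which is the general idea behind the multi-threaded and multi-machine modes mentioned in point 3.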

Content is the future development direction of Internet applications and platforms, and it plays a vital role in all walks of life, so better review mechanisms should be introduced to establish a healthy content environment. Now that bad information is prevalent, it may even become a means for competitors to frame a product. This makes more precise bad information filtering technology an urgent need.


