
Natural language processing is a science that integrates linguistics, computer science, and other disciplines

青灯夜游 (Original) · 2021-02-02 10:39:17

Natural language processing is a science that integrates linguistics, computer science, and mathematics. Its main applications include machine translation, public opinion monitoring, automatic summarization, opinion extraction, text classification, question answering, textual semantic comparison, speech recognition, and Chinese OCR.



Natural Language Processing (NLP) is a science that integrates linguistics, computer science, and mathematics.

Natural language processing refers to technology that lets humans interact with machines using the natural language people use to communicate, processing that language so that computers can read and understand it. Research on natural language processing began with humanity's exploration of machine translation. Although natural language processing involves many dimensions, such as pronunciation, grammar, semantics, and pragmatics, its basic task, simply put, is to segment the corpus to be processed, using resources such as ontology dictionaries, word-frequency statistics, and contextual semantic analysis, into lexical units that are based on the smallest parts of speech yet remain semantically rich.
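To make the segmentation idea concrete, here is a minimal sketch of dictionary-based segmentation using forward maximum matching, one common baseline for the kind of dictionary-driven segmentation the paragraph describes. The vocabulary, the token-level matching, and the example sentence are illustrative assumptions rather than anything from the original article; real Chinese segmenters usually match on characters and combine dictionaries with statistical models.

```python
# A minimal sketch of dictionary-based word segmentation using forward
# maximum matching. The tiny vocabulary and the test sentence are invented
# for illustration and are not from the original article.

VOCAB = {
    "natural", "language", "processing",
    "natural language", "natural language processing",
    "is", "fun",
}
MAX_LEN = 3  # length of the longest dictionary entry, counted in tokens


def max_match(tokens, vocab, max_len=MAX_LEN):
    """Greedily match the longest dictionary entry at each position."""
    segments = []
    i = 0
    while i < len(tokens):
        # Try the longest window first, then shrink it one token at a time.
        for j in range(min(len(tokens), i + max_len), i, -1):
            candidate = " ".join(tokens[i:j])
            if candidate in vocab or j == i + 1:
                # Fall back to the single token if nothing longer matches.
                segments.append(candidate)
                i = j
                break
    return segments


print(max_match("natural language processing is fun".split(), VOCAB))
# -> ['natural language processing', 'is', 'fun']
```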

Natural language processing takes language as its object and uses computer technology to analyze, understand, and process natural language. It treats the computer as a powerful tool for language research, carrying out quantitative study of linguistic information with computer support and providing language descriptions that both humans and computers can use. It comprises two parts: natural language understanding (NLU) and natural language generation (NLG). It is a typical interdisciplinary field, involving linguistics, computer science, mathematics, cognitive science, logic, and other areas, and it focuses on the interaction between computers and human (natural) language. The process of using computers to process natural language has, at different times or with different emphases, also been called natural language understanding (NLU), human language technology (HLT), computational linguistics, quantitative linguistics, and mathematical linguistics.

Realizing natural language communication between humans and computers means enabling computers both to understand the meaning of natural language text and to express given intentions and thoughts in natural language text. The former is called natural language understanding and the latter natural language generation, so natural language processing generally includes both parts. Historically, natural language understanding has received more research attention than natural language generation, but that balance has been shifting.


Neither natural language understanding nor natural language generation is as simple as people originally imagined; both are very difficult. Judging from the current state of theory and technology, a general-purpose, high-quality natural language processing system remains a long-term goal. For specific applications, however, practical systems with considerable natural language processing capability have already appeared, and some have been commercialized or even industrialized. Typical examples include natural language interfaces for multilingual databases and expert systems, various machine translation systems, full-text information retrieval systems, and automatic summarization systems.

Natural language processing, that is, realizing natural language communication between humans and machines, or realizing natural language understanding and natural language generation, is very difficult. The root cause of the difficulty is the wide variety of ambiguities that exist at every level of natural language text and dialogue.

There is a many-to-many relationship between the form of natural language (a string) and its meaning; in fact, this is exactly the charm of natural language. From the standpoint of computer processing, however, ambiguity must be eliminated, and some researchers regard disambiguation as the central problem of natural language understanding: converting potentially ambiguous natural language input into some unambiguous internal representation for the computer.
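As a toy illustration of mapping ambiguous input onto an unambiguous internal representation, the sketch below scores each candidate sense of a word against its surrounding context, in the spirit of a simplified Lesk-style overlap. The senses, glosses, and example sentence are invented assumptions for illustration and do not come from the original article.

```python
# A toy sketch of eliminating lexical ambiguity by scoring each candidate
# sense of a word against its surrounding context (a simplified Lesk-style
# overlap). The senses, glosses, and example sentence are invented for
# illustration and are not from the original article.

SENSES = {
    "bank": {
        "bank/finance": {"money", "deposit", "loan", "account"},
        "bank/river": {"river", "water", "shore", "fishing"},
    },
}


def disambiguate(word, context_words, senses=SENSES):
    """Return the sense label whose gloss overlaps most with the context."""
    candidates = senses.get(word)
    if not candidates:
        return word  # no recorded ambiguity for this word
    context = set(context_words)
    return max(candidates, key=lambda sense: len(candidates[sense] & context))


sentence = "she opened an account at the bank to deposit money".split()
print(disambiguate("bank", sentence))  # -> bank/finance
```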

Because ambiguity is so pervasive, eliminating it requires a large amount of knowledge and reasoning, which creates great difficulties for linguistics-based and knowledge-based methods. As a result, although these mainstream approaches to natural language processing research have produced many theoretical and methodological achievements over the past few decades, the results for building systems that can process large-scale real-world text are not significant: most of the systems developed so far are small-scale research demonstration systems.

The current problems have two aspects. On the one hand, grammar so far has been limited to analyzing isolated sentences, and there is still a lack of systematic research on how context and the conversational environment constrain and influence a sentence. As a result, there are no clear rules for problems such as ambiguity, word omission, and the different meanings the same sentence takes on in different situations or from different speakers, and research on pragmatics needs to be strengthened to solve them gradually. On the other hand, people understand a sentence not only through grammar but also by drawing on a large amount of relevant knowledge, including everyday knowledge and professional knowledge, and not all of this knowledge can be stored in a computer. Therefore, a written-language comprehension system can only be built within a limited range of vocabulary, sentence patterns, and specific topics; only after the storage capacity and processing speed of computers are greatly improved will it be possible to broaden that scope appropriately.

The problems above have become the main obstacles to applying natural language understanding in machine translation, and they are one reason why the translation quality of today's machine translation systems is still far from the ideal; translation quality, in turn, is the key to the success or failure of a machine translation system. The Chinese mathematician and linguist Professor Zhou Haizhong once pointed out in his classic paper "Fifty Years of Machine Translation" that to improve the quality of machine translation, the problems to solve first lie in the language itself rather than in programming, and that building a machine translation system on a handful of programs alone certainly cannot improve translation quality; moreover, as long as humans do not yet understand how the brain performs fuzzy recognition and logical judgment of language, it will hardly be possible for machine translation to reach the level of "faithfulness, expressiveness, and elegance."


The above is the detailed content of "Natural language processing is a science that integrates linguistics, computer science, and other disciplines". For more information, please follow other related articles on the PHP Chinese website!
