Python is the leading language for artificial intelligence
Which language will become the first development language of the AI and big data era? The question no longer needs debate. Three years ago, Matlab, Scala, R, Java and Python each still had a chance and the outcome was unclear. Three years on, the trend is very clear: especially since Facebook open sourced PyTorch two days ago, Python's position as the top language of the AI era is basically established, and the only remaining suspense is who will take second place.
Google's AI defeating a Go master is one measure of the sudden, rapid progress of artificial intelligence, and it also sheds light on how these technologies have developed and how they may develop in the future.
Artificial intelligence is a futuristic technology that is still building its own set of tools. A series of milestones has been reached in the past few years: driving more than 300,000 miles without an accident and becoming legal in three states marked a milestone for autonomous driving; IBM Watson beat two Jeopardy champions; statistical learning techniques now perform pattern recognition on complex data sets ranging from consumer interests to trillions of images. These developments are bound to increase interest in artificial intelligence among scientists and gurus, and they also help developers understand what building AI applications really involves. The first question to settle when developing them is:
Which programming language is suitable for artificial intelligence?
Any programming language you are proficient in can serve as a development language for artificial intelligence.
Artificial intelligence programs can be implemented in almost any programming language. The most common are Lisp, Prolog, C/C++ and, more recently, Java and Python.
LISP
High-level languages like LISP are highly favored in artificial intelligence because, after years of research at various universities, rapid prototyping was chosen over fast execution. Features such as garbage collection, dynamic typing, functions as data, uniform syntax, an interactive environment and extensibility make LISP very suitable for artificial intelligence programming.
PROLOG
This language combines a high level of abstraction with the traditional advantages of LISP, which is very useful for AI. Its strength is solving "logic-based problems": Prolog suits problems that are logical in nature, or whose solutions have a simple logical characterization. Its main disadvantage (IMHO) is that it is hard to learn.
C/C++
Like a cheetah, C/C++ is mainly used when execution speed matters. It is mostly used for simple programs; statistical artificial intelligence such as neural networks is a common example. Backpropagation takes only a few pages of C/C++ code, but it demands speed, so even a small speed-up from the programmer is worthwhile.
JAVA
A relative newcomer, Java borrows several concepts from LISP, most obviously garbage collection. Its portability makes it suitable for almost any program, and it comes with a set of built-in types. Java is not as high-level as LISP or Prolog, and it is not as fast as C, but if portability is required, it is the best choice.
PYTHON
Python is a language comparable to LISP and Java. According to the comparison of Lisp and Python in Norvig's article, the two languages are very similar, with only some minor differences. There is also JPython, which provides access to Java graphical user interfaces. This is why Peter Norvig chose JPython to translate the programs in his artificial intelligence book: JPython let him use portable GUI demonstrations and portable http/ftp/html libraries. Python is therefore very suitable as an artificial intelligence language.
Benefits of using Python over other programming languages for artificial intelligence
High-quality documentation
Platform independent; available on every *nix variant today
Easier and faster to learn than other object-oriented programming languages
Python has many libraries for imaging and visualization, such as Python Imaging Library, VTK and Maya 3D Visualization Toolkit, plus Numeric Python, Scientific Python and many other tools for numerical and scientific applications.
Python is well designed, fast, robust, portable and extensible, all of which are very important for artificial intelligence applications.
It is useful for a wide range of scientific programming tasks, from small shell scripts to entire website applications.
Finally, it is open source, and good community support is available for it.
Python libraries for AI
General AI libraries
AIMA: Python implementations of the algorithms from Russell and Norvig's "Artificial Intelligence: A Modern Approach"
pyDatalog: Logic programming engine in Python
SimpleAI: Python implementations of artificial intelligence algorithms described in the book "Artificial Intelligence: A Modern Approach". It focuses on providing an easy-to-use, well-documented and tested library.
EasyAI: A Python engine for two-player AI games (negamax, transposition tables, game solving)
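The negamax search mentioned for EasyAI can be illustrated with a minimal standard-library sketch. This is not EasyAI's API: the game (a small Nim variant in which players alternately take 1 to 3 sticks and whoever takes the last stick wins), the names negamax and best_move, and the use of lru_cache as a stand-in for a transposition table are all assumptions made for illustration.

```python
from functools import lru_cache

MOVES = (1, 2, 3)  # hypothetical rule: a player takes 1, 2 or 3 sticks per turn


@lru_cache(maxsize=None)  # the cache plays the role of a transposition table
def negamax(sticks):
    """Return +1 if the player to move can force a win, otherwise -1."""
    if sticks == 0:
        # The previous player took the last stick, so the player to move has lost.
        return -1
    # A position is worth the negation of the opponent's best reply.
    return max(-negamax(sticks - m) for m in MOVES if m <= sticks)


def best_move(sticks):
    """Pick the move that leaves the opponent in the worst position."""
    legal = [m for m in MOVES if m <= sticks]
    return max(legal, key=lambda m: -negamax(sticks - m))


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, negamax(n), best_move(n))
```

Because every position is scored from the point of view of the player to move, a single function serves both players; that is the whole idea of negamax. Caching results per position is what makes "solving" a small game like this one cheap.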
Machine learning libraries
PyBrain: A flexible, simple and effective modular machine learning library for Python that provides algorithms for machine learning tasks. It also provides a variety of predefined environments to test and compare your algorithms.
PyML: An interactive framework written in Python, focusing on SVMs and other kernel methods. It supports Linux and Mac OS X.
scikit-learn: Aims to provide simple yet powerful solutions that can be reused in different contexts, making machine learning a versatile tool for science and engineering. It is a Python module that integrates classic machine learning algorithms, closely tied to the scientific Python packages (numpy, scipy, matplotlib).
MDP-Toolkit: A Python data processing framework that can be easily extended. It collects supervised and unsupervised learning algorithms and other data processing units, which can be combined into data processing sequences or more complex feed-forward network structures. Implementing new algorithms is simple and intuitive. The set of available algorithms is steadily growing and includes signal processing methods (principal component analysis, independent component analysis, slow feature analysis), manifold learning methods (locally linear embedding), several classifiers, probabilistic methods (factor analysis, RBM), data preprocessing methods and more.
Natural Language and Text Processing Library
NLTK: Open source Python modules, linguistic data and documentation for research and development in natural language processing and text analysis. Versions are available for Windows, Mac OS X and Linux.
Case study
An experiment was conducted with software that uses artificial intelligence and the Internet of Things to analyze employee behavior. The software gives employees useful feedback on their emotional and behavioral distractions, thereby improving management and work habits.
Training used Python machine learning libraries, OpenCV and Haar cascades. A sample POC was built to detect basic emotions such as happiness, anger, sadness, disgust, suspicion, contempt, sarcasm and surprise, captured through wireless cameras placed at different locations. The collected data is centralized in a cloud database, and data for the entire office can be retrieved at the click of a button from an Android device or a desktop.
Developers are making progress in analyzing the emotional complexity of faces in greater depth and detail. With the help of deep learning and machine learning algorithms, this can support analysis of individual employee performance and appropriate employee/team feedback.
Conclusion
Python plays an important role in artificial intelligence because it provides good frameworks such as scikit-learn (machine learning in Python), which meets most needs in this field. D3.js (data-driven documents in JS) is one of the most powerful and easy-to-use tools for visualization. Beyond frameworks, Python's rapid prototyping makes it an important language that cannot be ignored. AI requires a lot of research, so nobody wants to write 500KB of Java boilerplate code just to test a new hypothesis. In Python, almost any idea can be tried out quickly in 20-30 lines of code (the same goes for JS and LISP). It is therefore a very useful language for artificial intelligence.
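As a rough illustration of the "20-30 lines" claim, here is a minimal sketch of a k-nearest-neighbors classifier using only the standard library. The function name knn_predict, the toy TRAIN data and the choice of k are assumptions made for this example; a real project would more likely reach for scikit-learn.

```python
from collections import Counter
from math import dist  # Euclidean distance, available since Python 3.8


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (point, label) pairs, where each point is a tuple of floats.
    """
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters labelled "a" and "b".
TRAIN = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]

if __name__ == "__main__":
    print(knn_predict(TRAIN, (1.1, 1.0)))
    print(knn_predict(TRAIN, (5.1, 5.0)))
```

Sorting the whole training set is the simplest thing that works for a quick hypothesis test; it is not meant to scale.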
For developers who want to join the AI and big data industry, putting their eggs in the Python basket is not only safe but a must. To put it another way: if you want to work in this industry in the future, don't overthink it, just close your eyes and learn Python first. Of course, Python is not without problems and shortcomings. You can and should pair Python with another language, or even several, but there is no doubt that Python will firmly hold the position of first language for data analysis and AI. I even think that, because Python has secured this position, because the industry will need a large number of practitioners, and because Python is quickly becoming the preferred teaching language for introductory programming courses in universities, middle schools and primary schools around the world, this open source dynamic scripting language has a great chance of becoming, in the near future, the first true Esperanto of programming.
Arguing over the pros and cons of programming languages has always been considered a flame-war topic and is looked down on by senior people. But I think the rise of Python this time is a big deal. Imagine that, fifteen years from now, all knowledge workers under the age of 40, from doctors to construction engineers, from office secretaries to film directors, from composers to salespeople, can use the same programming language for basic data processing, for calling artificial intelligence APIs in the cloud, for controlling intelligent robots, and then for communicating ideas with each other. The significance of such a collaborative network of universal programming would far exceed any programming language dispute. At present, Python appears to be the most promising candidate for this role.
Python's victory is surprising because its shortcomings are obvious. It has a syntax all its own, which makes many veterans uncomfortable; "bare" Python is very slow, from dozens to thousands of times slower than C depending on the task; because of the Global Interpreter Lock (GIL), a single Python process cannot execute Python code on multiple cores at the same time; Python 2 and Python 3 have run in parallel for a long time, and many modules have had to maintain two versions at once, causing developers a great deal of unnecessary confusion and trouble; and because it is not controlled by any company, no technology giant has ever been willing to fully back Python. As a result, compared with how widely Python is used, investment in and support for its core infrastructure is actually very weak. To this day, the 26-year-old Python has no official standard JIT compiler, whereas Java gained a standard JIT within the first three years of its release.
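The effect of the GIL can be observed with a small experiment. The sketch below is only an illustration under assumptions: the workload (count_primes), the helper names and the chosen limits are made up for this example, and the timings depend on the machine and interpreter. On a standard CPython build with the GIL, the threaded run of this CPU-bound work is typically no faster than the sequential one.

```python
import threading
import time


def count_primes(limit):
    """Deliberately CPU-bound pure-Python work: count primes below `limit`."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_sequential(limits):
    """Run the tasks one after another in the current thread."""
    return [count_primes(limit) for limit in limits]


def run_threaded(limits):
    """Run one thread per task; results are stored by task index."""
    results = [None] * len(limits)

    def worker(index, limit):
        results[index] = count_primes(limit)

    threads = [threading.Thread(target=worker, args=(i, limit))
               for i, limit in enumerate(limits)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    limits = [200_000] * 4  # hypothetical workload size
    _, sequential_time = timed(run_sequential, limits)
    _, threaded_time = timed(run_threaded, limits)
    print(f"sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s")
```

The usual workarounds are to move the hot loop into a C extension that releases the GIL, or to use separate processes (for example the multiprocessing module), each with its own interpreter.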
Another fact is even more telling. The core GIL code was written in 1992 by Guido van Rossum, the creator of the language. For the next eighteen years, nobody changed a single byte of this crucial code. Eighteen years! Only in 2010 did Antoine Pitrou make the first improvement to the GIL in nearly two decades, and it was applied only to Python 3.x. This means that for most developers using Python 2.7 today, every program they write is still tightly constrained by a piece of code written 26 years ago.
Python is a racer that charged to the front while carrying all these problems. Even a few years ago, not many people believed it had a chance to take the crown. Many believed Java's position was unshakable; some said all programs would be rewritten in JavaScript. But today we see that Python is already the first language for data analysis and AI and the first hacker language for network attack and defense, and it is becoming the first language for introductory programming teaching and for cloud computing system management. Python has long been one of the mainstream languages for web development, game scripting, computer vision, IoT management and robot development. With the expected growth in Python users, it has the chance to reach the top in multiple fields.
And don't forget that the vast majority of future Python users will not be professional programmers, but people who today are still using Excel, PowerPoint, SAS, Matlab and video editors. Take AI as an example. We must first ask: who makes up the bulk of the AI workforce? Looked at statically today, you might think the main force in AI is the AI scientists in research institutions, the machine learning experts and the algorithm experts with Ph.D.s. But Kai-Fu Lee's "AI Dividend Syllogism", which I mentioned last time, tells us clearly that if we take a longer view and look three to five years ahead, the workforce of the entire AI industry will gradually form a huge pyramid, with the AI scientists mentioned above just a small group at the top. 95% or more of AI technical staff will be AI engineers, application engineers and users of AI tools. I believe almost all of these people will be swept up by Python and become a huge reserve force for the Python camp. These potential Python users are still outside the technical community, but as AI applications develop, millions of teachers, company employees, engineers, translators, editors, doctors, salespeople, managers and civil servants will pour into the Python and AI tide, bringing the industry knowledge and data resources of their own fields, and will profoundly change the overall landscape of the entire IT, or DT (data technology), industry.
Why can Python catch up from behind?
Speaking in general terms, I could list some of Python's advantages, such as its simple and elegant language design, its friendliness to programmers and its high development efficiency. But I don't think these are the root cause, since some other languages do not do badly on these counts either.
Some people think Python's advantage lies in its rich resources: solid infrastructure for numerical algorithms, charting and data processing, and a very healthy ecosystem that has attracted a large number of scientists and domain experts, so that the snowball keeps growing. But I think this reverses cause and effect. Why was Python able to attract people to use it and to build such good infrastructure? Why doesn't PHP, "the best language in the world", have libraries like numpy, NLTK, scikit-learn, pandas and PyTorch? Why, after JavaScript's extreme prosperity, are its libraries of such uneven quality, while Python's libraries are both abundant and orderly and maintain a high standard?
I think there is only one fundamental reason: among the many mainstream languages, Python is the only one that has a clear strategic positioning and has always stuck to it. In contrast, too many languages keep using tactical, unprincipled diligence to erode and blur their strategic positioning, and in the end they can only settle for a lesser place.
What is Python's strategic positioning? It is actually very simple: to be a general-purpose glue language that is simple and easy to use yet professional and rigorous, so that ordinary people can get started easily, assemble various basic program components and make them work together.
It is precisely because it has adhered to this positioning that Python has always put the elegance and consistency of the language itself before clever tricks, developer efficiency before CPU efficiency, and the ability to expand horizontally before the ability to dive deep vertically. Long-term persistence in these strategic choices has given Python a rich ecosystem that other languages cannot match.
For example, anyone who is willing to learn can pick up the basics of Python in a few days and then do a great many things with it. This ratio of output to input may be unmatched by any other language. As another example, precisely because the Python language itself is slow, people write frequently used core libraries largely in C to work alongside it. As a result, real programs developed in Python run very fast, because the system quite possibly spends more than 80% of its time executing code written in C. Conversely, if Python had refused to accept this and insisted on competing on speed, the likely result would be that bare speed improved several times over, but nobody would then have the motivation to develop C modules for it. The final speed would be far inferior to the mixed mode, and the language would very likely become more complex as a result, ending up both slow and ugly.
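The mixed mode described above can be seen even inside the standard library. The following sketch compares a dot product written as an explicit Python loop with one that hands the whole loop to sum, map and operator.mul, which are implemented in C in CPython. The function names and input sizes are assumptions made for this illustration, and the actual speed-up varies by machine and Python version.

```python
import operator
import timeit


def dot_pure_python(xs, ys):
    """Dot product with the loop executed step by step by the interpreter."""
    total = 0
    for i in range(len(xs)):
        total += xs[i] * ys[i]
    return total


def dot_c_backed(xs, ys):
    """Same result, but iteration, multiplication and summation run in C code."""
    return sum(map(operator.mul, xs, ys))


if __name__ == "__main__":
    xs = list(range(100_000))  # hypothetical input size
    ys = list(range(100_000))
    for fn in (dot_pure_python, dot_c_backed):
        seconds = timeit.timeit(lambda: fn(xs, ys), number=20)
        print(f"{fn.__name__}: {seconds:.3f}s")
```

Libraries such as numpy push the same idea much further by keeping both the data and the loops on the C side.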
More importantly, Python has good wrapping capabilities, composability and embeddability. It can wrap all kinds of complexity inside Python modules and expose clean interfaces. Often a library is itself written in C/C++, yet calling it directly from C or C++ turns out to be very troublesome, from environment configuration to the interface calls themselves. Using a Python wrapper library layered on top is instead cleaner, faster and more pleasant. These characteristics have become Python's powerful advantages in the field of AI, and Python has in turn climbed to the top of the programming language food chain with the help of AI and data science. Python is bound together with AI. For the two of them, e-commerce, search engines, social networks and smart hardware will in the future be mere data cows, electronic nerves and execution tools further down the chain, all taking their orders.
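A minimal sketch of this kind of wrapping, using the standard ctypes module: the wrapper below tries to bind the C library's strlen and hides the type declarations and platform differences behind one small Python function. The function name utf8_length and the loading strategy (ctypes.CDLL(None), which exposes the symbols already loaded into the process on POSIX systems) are assumptions for illustration; where the C function cannot be loaded, the wrapper quietly falls back to pure Python.

```python
import ctypes


def _load_strlen():
    """Try to bind the C library's strlen; return None if it is unavailable."""
    try:
        libc = ctypes.CDLL(None)  # POSIX: symbols already loaded into this process
        strlen = libc.strlen
    except Exception:
        # For example on Windows, where this loading strategy does not apply.
        return None
    strlen.argtypes = [ctypes.c_char_p]  # declare the C signature once, here
    strlen.restype = ctypes.c_size_t
    return strlen


_c_strlen = _load_strlen()


def utf8_length(text):
    """Return the number of bytes in the UTF-8 encoding of `text`."""
    data = text.encode("utf-8")
    # strlen stops at the first NUL byte, so only use it when that is safe.
    if _c_strlen is not None and b"\x00" not in data:
        return _c_strlen(data)
    return len(data)


if __name__ == "__main__":
    print(utf8_length("hello"))
```

Callers see an ordinary Python function and never touch pointers, argument types or library loading, which is the point the paragraph above makes about wrapper libraries.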
People who are unfamiliar with the history of programming languages may feel that Python's strategic positioning is cynical and unambitious. But facts have shown that being simple yet rigorous, easy to use yet professional, is hard, and sticking to the positioning of a glue language is harder still.
Some languages were designed for academic rather than practical purposes from the start; their learning curve is too steep for ordinary people to approach. Some languages depend too heavily on the commercial support of the sponsors behind them; while in favor they flourish, but once sidelined even their survival is in question. Some languages were designed with a specific scenario in mind, whether handling large-scale concurrency, matrix operations or web page rendering templates; outside that scenario they are awkward to use. Many more languages, as soon as they achieve a little success, cannot wait to become the all-around champion and stretch their tentacles in every direction. Especially when it comes to increasing expressive power and improving performance, they are often overly aggressive and do not hesitate to change the core language beyond recognition, until it finally becomes a giant that no one can control. In contrast, Python is a successful example of modern programming language design and evolution.
The reason Python is so clear about its strategic positioning, and so firm in sticking to it, is ultimately that its community has built an exemplary decision-making and governance mechanism. This mechanism is centered on Guido van Rossum (the BDFL; Pythonistas all know what this means), David Beazley, Raymond Hettinger and others, with PEPs as the organizational platform, and it is democratic yet orderly, centralized yet enlightened. As long as the mechanism itself is maintained, Python will continue to grow steadily for the foreseeable future.
The language most likely to challenge Python is, of course, Java. Java has a large user base, and it is also a language with a clear and firm strategic positioning. But I don't think Java has much of a chance, because it is essentially designed for building large, complex systems. What is a large, complex system? It is a system that people explicitly describe and construct; its scale and complexity are exogenous, that is, imposed from outside. AI, in essence, is a self-learning, self-organizing system. Its scale and complexity are endogenous: a mathematical model grows by itself as it is fed data. Most of Java's language constructs are therefore of little use in big data processing and AI system development. What Java is good at cannot be put to use here, and what is needed here is awkward to do in Java. Python's simplicity and power in data processing have long been well known. Compare a Java and a Python machine learning program with the same functionality, and any normal person can judge at a glance that the Python program is more refreshing and enjoyable.
Around 2003 or 2004, I bought a Python book written by a Brazilian author. He said he had firmly chosen Python because, as a child, he often dreamed that the world of the future would be ruled by a giant python. At the time I felt sorry for the guy, having to dream such a terrifying scene. But looking at it today, perhaps he was like the programmer Anderson in The Matrix, who accidentally traveled to the future and glimpsed the truth of the world.