Understanding the Web Server
In the eyes of many users, the success or failure of a website depends mainly on the content and functions it provides. Few realize that the Web server supporting that content and those functions is the real hero behind the scenes. According to statistics, there are more than 5 million websites in the world, and a Web server runs behind every one of them. So what is a Web server, and how does it work?

From C/S to Web

The earliest network system was a simple host/terminal system. All applications were executed by the host, and the terminal merely ran the corresponding program on it. The arrival of the PC era brought rapid growth in computer networks and computer applications. As PC prices kept falling and performance kept improving, the application areas left to terminal-oriented mainframes steadily shrank. In particular, the rise of network operating systems such as NetWare and Windows NT, together with the emergence of network database systems, opened up a new model for network applications: the C/S (Client/Server) model.

The C/S model is a two-tier architecture. The first tier handles presentation logic and business logic on the client, and the second tier is a server system, such as a database, reached over the network. The C/S model splits up transaction processing and makes distributed computing on the network possible. For a long time it also helped enterprises build local area networks, improve internal business management, and raise work efficiency. However, the C/S model has obvious limitations in system integration and maintenance, consistency of the user interface, and scalability. Just as the host/terminal network was replaced by the C/S network system, newer system models were bound to appear in the Internet/Intranet environment. Internet/Intranet systems based on Web technology have come into wide use in recent years.
An Intranet is an internal enterprise network based on the TCP/IP protocol with the Web at its core. Through a low-cost, easy-to-use client browser, users can visit the enterprise's Web site at any time and from anywhere to look up the data they need. The consistent interface of the browser client avoids the variety of client programs found in the C/S model, while the open, standards-based connection on the server side makes it easy for the enterprise to reach the outside world through the Internet. At the same time, the dynamic and interactive way in which Web information is published fundamentally changes the quality of service an enterprise can offer and increases its business opportunities.

In the three-tier structure of Web technology, the database does not serve each client directly. Instead, it communicates with the Web server to deliver dynamic, real-time, and interactive information services to users. This is achieved through server applications such as CGI, ISAPI, NSAPI, and Java, as shown in Figure 1.

Figure 1 Web three-tier structure

What is a Web Server?

The distinguishing feature of Web technology is its use of hyperlinks and multimedia information. Web servers use HTML (HyperText Markup Language) to describe network resources and create Web pages for Web browsers to read. The defining characteristic of HTML documents is interactivity. Whether the content is plain text or graphics, links in the document can connect it to other documents on the server, so users can quickly find the information they want. HTML pages can also provide forms that users fill out and submit to a database through server applications. Such databases generally support multimedia data types. A Web browser is a client application used to retrieve and display documents, and it connects to the Web server through HTTP (HyperText Transfer Protocol).
The universal, low-cost browser saves the cost of developing and maintaining the client software that the two-tier C/S model requires. Besides basic document retrieval, display, and navigation, the popular Internet Explorer and Netscape Navigator also support advanced HTML display (such as tables and frames) as well as ActiveX, Java, JavaScript, and other features.

How does a Web server work?

A few years ago, when Web servers first appeared, the only applications they supported were simple browsing of HTML files and images. When the Web server receives a request for a Web page, such as http://www.ccidnet.com/index.html, it uses the URL (Uniform Resource Locator) to locate the corresponding host file server, finds the file index.html there, retrieves it, and transmits it to the Web browser over the HTTP protocol.

Of course, this is only the basic function, and the relationship between the Web server and the Web browser is far from that simple. One of the most important extensions of Web applications is the introduction of dynamic content. For example, a Web server can create a Web page, directly or indirectly, based on a request entered by the user and then return it to the Web browser. The earliest way to implement dynamic content was CGI (Common Gateway Interface), which gives a basic definition of how programs run on the Web server and how dynamic content is passed between the Web server and the Web browser.
This is shown in Figure 2.

Figure 2 CGI definition diagram

Another development in Web applications is the emergence of HTTPS (HyperText Transfer Protocol Secure), which secures the communication between Web servers and Web browsers and makes electronic transactions possible.

The Web server and the Web browser communicate through the HTTP protocol. So what is the HTTP protocol? Simply put, it is an application-layer protocol between a Web browser and a Web server. It is built on the TCP/IP protocol and is a generic, stateless, object-oriented protocol. It works in four steps:

Connection: The Web browser establishes a connection with the Web server and opens a virtual file called a socket. The creation of this file marks a successful connection.
Request: The Web browser submits a request to the Web server through the socket.
Response: The request travels to the Web server over HTTP. The Web server processes it and sends the result back to the Web browser over HTTP, and the browser displays the requested page.
Close the connection: When the response is complete, the Web browser and the Web server must disconnect so that other Web browsers can connect to the Web server.

The Web server's processing therefore forms a complete logical cycle: accept a connection, generate static or dynamic content and send it back to the browser, close the connection, accept the next connection, and so on. It is easy to see that with many visitors, a server handling connections one at a time will be overwhelmed. Two techniques can be used to solve this problem: multi-threading and multi-processing.
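
The four-step cycle above, combined with multi-threading, can be sketched in a few lines of Python. This is only an illustrative sketch, not how any particular Web server is implemented. The function names (handle_connection, serve, fetch, demo), the loopback address, and the fixed request count are all assumptions made for the example, and request parsing is deliberately reduced to a single read.

```python
import socket
import threading


def handle_connection(conn):
    """Serve one request on an accepted connection, then close it."""
    with conn:  # step 4: the connection is closed when this block exits
        # Step 2: read the request (simplified: assumes it fits in one read).
        request = conn.recv(4096).decode("latin-1")
        request_line = request.split("\r\n", 1)[0]
        body = f"You sent: {request_line}\n".encode("utf-8")
        headers = (
            "HTTP/1.0 200 OK\r\n"
            "Content-Type: text/plain; charset=utf-8\r\n"
            f"Content-Length: {len(body)}\r\n"
            "Connection: close\r\n"
            "\r\n"
        ).encode("ascii")
        conn.sendall(headers + body)  # step 3: send the response


def serve(listener, max_requests):
    """Accept connections (step 1) and hand each one to its own thread."""
    workers = []
    for _ in range(max_requests):
        conn, _addr = listener.accept()
        worker = threading.Thread(target=handle_connection, args=(conn,))
        worker.start()
        workers.append(worker)
    for worker in workers:
        worker.join()


def fetch(port, path):
    """Tiny client: connect, send one GET, read until the server closes."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii"))
        chunks = []
        while True:
            data = client.recv(4096)
            if not data:  # empty read: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8")


def demo():
    """Start the server on a free local port and issue two requests."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen()
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve, args=(listener, 2))
    server.start()
    responses = [fetch(port, "/index.html"), fetch(port, "/about.html")]
    server.join()
    listener.close()
    return responses
```

Calling demo() returns the two raw responses, each beginning with the status line and ending with the echoed request line. A multi-process server follows the same shape, with a process rather than a thread handling each accepted connection.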
A Web server may use the port-listening mode of Unix systems (a multi-process model), multi-threading, multi-processing, or a mixture of the two techniques.

Once a connection exists, how does the Web server provide content to the Web browser? The key is that the browser must be able to recognize and render the content. The main mechanism that determines how content is displayed is the MIME (Multipurpose Internet Mail Extensions) type. The MIME type tells the Web browser what kind of document is about to be sent, and it is not limited to simple image and HTML documents. For example, the mime.types configuration file of the Apache Web server contains 370 default MIME types, and even that is not all of them. A MIME type is written in type/subtype syntax and is associated with file suffixes. For example, a file containing MPEG video content has the suffix mpeg, mpg, or mpe.

The role of the Web server is ultimately reflected in the content it provides, especially dynamic content. This is also the fundamental difference between a Web server and an application server. When interacting with the Web browser, the Web server is mainly responsible for providing dynamically generated HTML documents. (Besides serving HTML documents, the Web server also provides application data in formats such as XML. In other words, the Web server can also connect to a wide range of data sources to provide richer content to the Web browser.)

There are many technologies for implementing dynamic content on the Web. The first is CGI, which generates and transmits HTML data dynamically according to the request entered by the user. CGI is not a programming language. It is only a protocol, and the Web server uses programs written for it to produce content.
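
To make the CGI idea concrete, here is a minimal sketch of a CGI-style program in Python. It is an illustration under stated assumptions rather than code from any real site: the function names run_cgi and render and the "name" query parameter are hypothetical. What it relies on from CGI itself is that the server passes the query string in the QUERY_STRING environment variable and that the program writes headers, a blank line, and then the body to standard output.

```python
import html
import io
import os
import sys
from urllib.parse import parse_qs


def run_cgi(environ, out):
    """Write a CGI response (headers, blank line, body) for one request."""
    # The Web server passes the query string in the QUERY_STRING variable.
    query = parse_qs(environ.get("QUERY_STRING", ""))
    # "name" is a hypothetical parameter chosen for this example.
    name = query.get("name", ["visitor"])[0]
    # The Content-Type header carries the MIME type of the body that follows.
    out.write("Content-Type: text/html; charset=utf-8\n")
    out.write("\n")  # a blank line ends the headers
    # Escape user input before placing it in the generated HTML.
    out.write(f"<html><body><p>Hello, {html.escape(name)}!</p></body></html>\n")


def render(query_string):
    """Run the program in-process and return its output as a string."""
    buffer = io.StringIO()
    run_cgi({"QUERY_STRING": query_string}, buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # When started by a Web server as a CGI program, the request arrives in
    # the process environment and the response goes to standard output.
    run_cgi(os.environ, sys.stdout)
```

For experimentation without a Web server, render("name=Ada") returns the full response text, including the Content-Type header that tells the browser which MIME type to expect.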
Because every request for dynamic content starts a new CGI program, which adds to the load on the Web server, a major drawback of CGI is that it easily slows the Web server down.

Microsoft ASP (Active Server Pages) technology is built around a VBScript interpreter embedded in IIS. It supports several scripting languages, including JavaScript, PerlScript, and VBScript. Because it is based on COM, it can easily access other server software components.

PHP, like JSP and ASP, consists of a set of additional code tags placed in an HTML document. The difference is that PHP was designed specifically for developing Web pages, so applications written in it tend to be simpler than the corresponding applications written in VBScript or JSP.

Web servers today generally support Perl acceleration solutions. Apache's free mod_perl embeds Perl in the Apache server. This speeds up the interpretation of Perl code, and mod_perl's caching greatly improves execution efficiency. mod_perl is also tightly integrated with Apache, so Perl developers can control the work of the Web server just as C developers do when writing low-level Apache API programs.

In operation, a Web server often has to sustain a large volume of concentrated user clicks and requests for dynamic content. Even with high-end server hardware, there is a limit to the number of visits that can be supported per unit time as the number of users grows. This is especially true for sites with a lot of dynamic content, because dynamic content requires frequent calls to databases and applications, which consume a lot of server resources. At that point the server load has to be distributed across multiple server machines or multiple sites.

There are many methods of load balancing. The simplest is to distribute the content of the website across different servers.
For example, static HTML pages are stored on one server, image files on another, and all CGI programs run on a third. This method is clearly not very efficient, however, because it cannot distribute content between hosts automatically. If one kind of content is in heavy demand, its server still becomes a load bottleneck.

The basic method of DNS (Domain Name System) load balancing is to place copies of the same site on different physical servers. The DNS server then maps the site's domain name to multiple IP addresses: it can return several IP addresses for the name, or return a different IP address for each request for the same name. Because it is hard to tell which IP address a given client will end up using, DNS can provide only basic load balancing. Moreover, because DNS responses stay in the caches of clients and other servers, the same client keeps accessing the same Web server. It is therefore possible for many heavy Internet users to end up on one IP address while light users end up on another, so the load is distributed unevenly. Another problem is that DNS cache entries do not last indefinitely, so a client may be switched to another of the site's IP addresses while it is still using the site. This can cause problems for dynamic websites, especially ones that need to accept and store data from the client.

Software and hardware load balancing is similar to DNS load balancing, but the website publishes only one IP address. A dedicated machine accepts the HTTP requests for that address and distributes them to the site's servers. The distribution typically happens at the TCP/IP routing level, which transparently maps the single source/destination IP address to a specific server. This technology