Search Engine Friendly URL Design
Copyright statement: You can reprint at will. When reprinting, please be sure to indicate the original source and author information of the article and this statement in the form of a hyperlink
http://www.chedong.com/tech/google_url.html
Keywords: "url rewrite" mod_rewrite isapirewrite path_info "search engine friendly"
Content summary:
As the content on the Internet grows at an alarming rate, search engines become more and more important. A website that wants to be indexed well should be designed to be not only user friendly but also search engine friendly. The more of a site's pages a search engine indexes, the greater the chance that users searching with different keywords will find it. The article "Google's Algorithm Survey" mentions that the number of pages Google indexes for a site also has some effect on its PageRank. Google favors the relatively static part of the web (it indexes comparatively few dynamic pages), so static pages with stable link addresses are better suited to being indexed. No wonder many large websites keep mailing list archives and monthly archives, which are easy to index. For this reason, many articles about search-engine-oriented URL design (URI Pretty) recommend using some mechanism to turn dynamic page parameters into a form that looks like a static page.
For example, you can change:
http://www.chedong.com/phpMan.php?mode=man&parameter=ls
into:
http://www.chedong.com/phpMan.php/man/ls
There are two main ways to implement it:
Based on url rewrite
Based on path_info
Pass URI address as parameter: URL REWRITE
The simplest approach is URL conversion based on the URL rewrite module available in various web servers. With it, a link like news.asp?id=234 can be mapped to news/234.html without modifying the program, so from the outside it looks the same as a static link. Apache has a module for this, mod_rewrite (not enabled by default); URL rewriting is powerful enough to fill a book.
To map news.asp?id=234 to news/234.html in Apache, a single rule is enough:
RewriteRule ^/news/(\d+)\.html$ /news.asp?id=$1 [NC,L]
Here [NC] makes the match case-insensitive and [L] stops further rule processing. When a request for /news/234.html arrives, the web server internally forwards it to /news.asp?id=234.
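The mapping can be tried outside the web server. The following is a minimal PHP sketch that applies the same kind of regular expression with preg_replace. The function name rewrite_news_url is hypothetical, and the code only imitates the mapping; it is not how mod_rewrite itself is invoked.

```php
<?php
// Hypothetical helper: imitates the mapping /news/<id>.html -> /news.asp?id=<id>.
// It illustrates the regular expression only, not the web server module.
function rewrite_news_url(string $path): string
{
    // (\d+) captures the numeric id; the dot before "html" is escaped so it is literal.
    // The "i" modifier makes the match case-insensitive.
    $rewritten = preg_replace('#^/news/(\d+)\.html$#i', '/news.asp?id=$1', $path);
    // preg_replace returns the subject unchanged when nothing matches, null on error.
    return $rewritten ?? $path;
}

echo rewrite_news_url('/news/234.html'), "\n"; // /news.asp?id=234
echo rewrite_news_url('/about.html'), "\n";    // /about.html (no match, unchanged)
```

Paths that do not match the pattern pass through untouched, which is also how a rewrite rule leaves unrelated requests alone.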
IIS has corresponding rewrite modules as well, for example ISAPI REWRITE and IIS REWRITE. Their syntax is also based on regular expressions, so the configuration is almost the same as for Apache's mod_rewrite.
For a simple application, the rule can be:
RewriteRule /news/(\d+)\.html /news/news\.php\?id=$1 [N,I]
In this way, http://www.chedong.com/news/234.html is mapped to http://www.chedong.com/news/news.php?id=234
A more general expression can map the parameters of all dynamic pages. It lets
http://www.myhost.com/foo.php?a=A&b=B&c=C
be written as
http://www.myhost.com/foo.php/a/A/b/B/c/C
RewriteRule (.*?\.php)(\?[^/]*)?/([^/]*)/([^/]*)(.+?)? $1(?2$2&:\?)$3=$4?5$5: [N,I]
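The idea behind the general rule, consuming the path two segments at a time as name and value, can be sketched in PHP as follows. The function name path_pairs_to_query is hypothetical. Unlike the rule above, this simplified sketch does not merge an existing query string, and it silently drops a trailing name that has no value.

```php
<?php
// Hypothetical helper: turns /foo.php/a/A/b/B/c/C into /foo.php?a=A&b=B&c=C.
function path_pairs_to_query(string $url): string
{
    // Split into the script part (up to and including ".php") and the trailing path.
    if (!preg_match('#^(.*?\.php)(/.*)?$#i', $url, $m) || empty($m[2])) {
        return $url; // no extra path segments: nothing to convert
    }
    $segments = explode('/', trim($m[2], '/'));
    $pairs = [];
    // Consume the segments two at a time: name first, then value.
    for ($i = 0; $i + 1 < count($segments); $i += 2) {
        $pairs[] = rawurlencode($segments[$i]) . '=' . rawurlencode($segments[$i + 1]);
    }
    // Without at least one complete pair, leave the URL as it was.
    return $pairs ? $m[1] . '?' . implode('&', $pairs) : $url;
}

echo path_pairs_to_query('http://www.myhost.com/foo.php/a/A/b/B/c/C'), "\n";
// http://www.myhost.com/foo.php?a=A&b=B&c=C
```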
Another advantage of URL rewriting is that it hides the back-end implementation, which is very useful when migrating the back-end platform: when moving from ASP to a Java platform, for example, front-end users will not notice any change.
For example, when the application is migrated from news.asp?id=234 to news.php?query=234, the public URL can remain news/234.html throughout. Separating the application from its public presentation keeps URLs stable, and mod_rewrite can even forward requests to other back-end servers.
URL beautification based on PATH_INFO
Another way to beautify URLs is based on PATH_INFO:
PATH_INFO is part of the CGI 1.1 standard. The "/value_1/value_2" segments that often follow a CGI script name in a URL are PATH_INFO parameters.
For example, in http://www.chedong.com/phpMan.php/man/ls: $PATH_INFO = "/man/ls"
Because PATH_INFO is a CGI standard, PHP, Servlets and others support it fully. For example, Servlets provide the request.getPathInfo() method.
Note: getPathInfo() for /myapp/servlet/Hello/foo returns /foo, and getPathInfo() for /myapp/dir/hello.jsp/foo returns /hello.jsp. This also shows that a JSP is in effect the PATH_INFO parameter of a Servlet. ASP does not support PATH_INFO.
An example of parameter parsing based on PATH_INFO in PHP is as follows:
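// Note: parameters are separated by "/", and the first element is empty
// because PATH_INFO starts with "/". This parses $param1 and $param2 out of /param1/param2.
if (isset($_SERVER["PATH_INFO"])) {
    // array_pad avoids an "undefined offset" notice when fewer parameters are present
    list($nothing, $param1, $param2) = array_pad(explode('/', $_SERVER["PATH_INFO"]), 3, null);
}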
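How to hide the application's extension (for example, .php):
One common Apache configuration, assuming PHP runs as an Apache module and the script is saved without an extension as app_name, forces that file to be handled as PHP:
<FilesMatch "^app_name$">
ForceType application/x-httpd-php
</FilesMatch>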
How to make the URL look more like a static page, for example app_name/my/app.html:
When parsing the PATH_INFO parameters, simply cut the last 5 characters, ".html", off the last parameter.
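The truncation step might look like the following minimal PHP sketch. The function name parse_path_info and its $suffix parameter are hypothetical; in a real request the input would come from $_SERVER['PATH_INFO'].

```php
<?php
// Hypothetical helper: splits a PATH_INFO string into its parameters,
// dropping a trailing ".html" from the last one.
function parse_path_info(string $pathInfo, string $suffix = '.html'): array
{
    // Remove the leading and trailing "/" so no empty element appears at either end.
    $trimmed = trim($pathInfo, '/');
    if ($trimmed === '') {
        return [];
    }
    // Cut the suffix only when the path really ends with it.
    if (substr($trimmed, -strlen($suffix)) === $suffix) {
        $trimmed = substr($trimmed, 0, -strlen($suffix));
    }
    return explode('/', $trimmed);
}

// For app_name/my/app.html the PATH_INFO part is "/my/app.html".
print_r(parse_path_info('/my/app.html'));
```

Checking for the suffix before cutting, rather than always removing five characters, keeps URLs without ".html" working as well.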
Note: depending on the handler, Apache 2 may reject PATH_INFO by default. In that case, set AcceptPathInfo On.
For users on virtual hosts who have no right to install or configure mod_rewrite, PATH_INFO is often the only choice.
So the next time you see a page like http://www.example.com/article/234, you will know that it may actually be a dynamic page generated by a PHP program such as article/show.php?id=234. Many sites appear to have many static directories, but most likely they use just one or two programs to publish their content. Many WikiWiki systems, for example, work this way: the whole system is one simple wiki program, and what looks like a directory is really the result of that program's query, with the trailing part of the address as its parameter.
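How a single program can stand behind what looks like a directory tree can be sketched with a tiny front controller in PHP. The function name dispatch, the "article" section, and the returned strings are all assumptions for illustration; a real system would render a page instead of returning a label.

```php
<?php
// Hypothetical front controller: one script serves what looks like many directories.
function dispatch(string $pathInfo): string
{
    // "/article/234" -> ['article', '234']
    $parts = explode('/', trim($pathInfo, '/'));
    $section = $parts[0] ?? '';
    $id = $parts[1] ?? '';

    // Only numeric article ids are accepted; anything else is treated as not found.
    if ($section === 'article' && ctype_digit($id)) {
        return "show article #$id"; // a real system would look up and render the article
    }
    return 'not found';
}

echo dispatch('/article/234'), "\n"; // show article #234
echo dispatch('/unknown'), "\n";     // not found
```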
Rebuilding an existing dynamic publishing system with MOD_REWRITE/PATH_INFO plus a cache server can also greatly reduce the cost of upgrading an old system to a new content management system, and it makes indexing by search engines easier.
Appendix: How to make PHP support PATH_INFO on IIS
Notes on installing PHP in ISAPI mode (only tried with php-4.2.3-Win32)
Unpacking directory
========
Unpack php-4.2.3-Win32.zip to c:\php
PHP.INI initialization file
=================
Copy c:\php\php.ini-dist to c:\winnt\php.ini
Configure file association
============
Configure file association as described in install.txt
Runtime library file
==========
Copy c:\php\php4ts.dll to c:\winnt\system32\php4ts.dll
After this setup, you will find that PHP maps PATH_INFO to a physical path:
Warning: Unknown(C:\CheDong\Downloads\ariadne\www\test.php\path): failed to create stream: No such file or directory in Unknown on line 0
Warning: Unknown(): Failed opening 'C:\CheDong\Downloads\ariadne\www\test.php\path' for inclusion (include_path='.;c:\php4\pear') in Unknown on line 0
Install ariadne's PATCH
==================
Stop the IIS service:
net stop iisadmin
Download ftp://ftp.muze.nl/pub/ariadne/win/iis/php-4.2.3/php4isapi.dll
and overwrite the original c:\php\sapi\php4isapi.dll with it.
Note:
ariadne is a content publishing system based on PATH_INFO.
The PATH_INFO handling of CGI mode has been fixed in PHP 4.3.2 RC2; just install it as usual.
References:
URL Rewrite documentation:
http://www.isapirewrite.com/docs/
http://httpd.apache.org/docs/mod/mod_rewrite.html
http://httpd.apache.org/docs-2.0/mod/mod_rewrite.html
Search engine friendly URL design
http://www.sitepoint.com/article/485
Behind the scenes, this URL may well be article.php?id=485
An open source content management system based on PATH_INFO
http://typo3.com/
What doesn't Google index?
http://www.microdocs-news.info/newsGoogle/2003/05/10.html
Google's PageRank description:
http://pr.efactory.de/