Solution to the 404 error returned when accessing an encoded Chinese URL (PHP tutorial)
I was working on a project yesterday. One requirement was that each picture has a short paragraph of text describing it. The usual approach is to create a new table and store the picture name and description in a database. After careful consideration, I decided this feature could be done without a database. My solution is to URL-encode the description text (URLENCODE) and use the result as the file name, so that when I read the file, URL-decoding the file name (URLDECODE) recovers the text description of the picture.
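The scheme described above can be sketched in PHP roughly as follows. This is a minimal illustration, not the author's code: the helper names `captionToFileName` and `fileNameToCaption`, the `.jpg` extension, and the sample caption are assumptions, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helpers; the names are illustrative, not from the original article.

// Turn a caption into a file name by URL-encoding it.
function captionToFileName(string $caption, string $ext = 'jpg'): string
{
    return urlencode($caption) . '.' . $ext;
}

// Recover the caption: strip the extension, then URL-decode the rest.
function fileNameToCaption(string $fileName): string
{
    return urldecode(pathinfo($fileName, PATHINFO_FILENAME));
}

$name = captionToFileName('博客');
echo $name, "\n";                    // %E5%8D%9A%E5%AE%A2.jpg
echo fileNameToCaption($name), "\n"; // 博客
```

Because the caption lives in the file name itself, listing the image directory is enough to rebuild every description without a database lookup.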
However, when I accessed the image through the browser, it reported that the file could not be found. For example, for a picture with the caption "Qiongtai Blog", the file name generated by URLENCODE is as follows
When I accessed that image through the browser, it said the file could not be found
Looking closely, I found that the browser automatically converted the file name back to Chinese when accessing it
[Screenshots: the address bar in Firefox, Chrome, IE, and Safari]
In IE and Safari the address bar did not show the name converted to Chinese characters, but both also reported that the file could not be found. I suspect the conversion still happens when the request is made and is simply not displayed in the address bar. I then checked the request status for the image in Nginx's access log
No anomalies turned up in how the request URL was processed. Finally, after repeatedly studying the encoded file names, I noticed that they consist entirely of percent signs and alphanumeric characters. My guess is that the browser applies some further conversion when it encounters a percent sign, which is why it reports that the file cannot be found when the URLENCODE'd name is accessed.
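The article does not confirm the root cause. One common explanation, offered here as an assumption rather than the author's finding, is that a percent-encoded URL path is decoded once before the file lookup, so a file whose name literally contains "%E5..." is never the file being searched for. The sketch below only simulates that with string functions; the sample caption is made up.

```php
<?php
// A file stored on disk under a percent-encoded name (hypothetical example).
$fileOnDisk = urlencode('博客') . '.jpg'; // the name contains literal "%" characters

// If the URL path is that name verbatim, one round of percent-decoding
// yields the Chinese name, which is not the name stored on disk.
$lookedUp = rawurldecode($fileOnDisk);

// Encoding the name once more turns each "%" into "%25", so one round of
// decoding gives back the literal on-disk name.
$urlPath = rawurlencode($fileOnDisk);

var_dump($lookedUp === $fileOnDisk);              // bool(false)
var_dump(rawurldecode($urlPath) === $fileOnDisk); // bool(true)
```

Under this assumption, linking to the image with the file name encoded a second time would also work, as an alternative to renaming the files.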
So I replaced all percent signs in the file names after URLENCODE with underscores
with
When I accessed the image through the browser again, the problem was solved
To recover the text description of the image, replace each "_" in the file name with "%" and then apply URLDECODE.
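The percent-to-underscore swap and its reverse might look like this in PHP. The function names are hypothetical and the sample caption is made up; only the swap itself comes from the text above.

```php
<?php
// Hypothetical helper names; only the "%" <-> "_" swap comes from the article.

function captionToSafeName(string $caption): string
{
    // URL-encode, then swap every "%" for "_" so the name has no percent signs.
    return str_replace('%', '_', urlencode($caption));
}

function safeNameToCaption(string $name): string
{
    // Reverse the swap, then URL-decode.
    return urldecode(str_replace('_', '%', $name));
}

$safe = captionToSafeName('博客');
echo $safe, "\n";                    // _E5_8D_9A_E5_AE_A2
echo safeNameToCaption($safe), "\n"; // 博客
```

One caveat with this approach: urlencode() leaves a literal "_" unencoded, so a caption that already contains an underscore would not decode back to the same text. Captions written purely in Chinese characters are not affected.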
Finally, note that file names on Linux have a length limit, just as they do on Windows. The most commonly used file system at present is ext3, which allows file names of up to 255 characters (bytes). After deducting about 5 characters for the extension, roughly 250 characters remain for the name itself. A Chinese character in UTF-8 is 9 characters long after URLENCODE, so at most 27 Chinese characters can be encoded into a file name.
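The arithmetic behind that limit can be checked with a few lines of PHP. The variable names are illustrative, and the source file is assumed to be saved as UTF-8 so that one Chinese character is three bytes.

```php
<?php
$maxNameLength   = 255; // ext3 file name limit
$extensionLength = 5;   // rough allowance used in the text, e.g. ".jpeg"

// One Chinese character is 3 bytes in UTF-8, and each byte becomes "%XX".
$encodedPerChar = strlen(urlencode('中'));

$maxChars = intdiv($maxNameLength - $extensionLength, $encodedPerChar);
echo $encodedPerChar, ' ', $maxChars, "\n"; // 9 27
```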
Although this method can store only a few Chinese characters, you could apply some encoding or encryption method to obtain a shorter string and then URLENCODE that string. I will not give a specific implementation here; work it out yourself!
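The article leaves the shorter encoding as an exercise. One possible option, chosen here for illustration and not taken from the article, is URL-safe Base64. It is an encoding rather than encryption, and it spends 4 characters per 3 bytes instead of 9. The helper names below are hypothetical.

```php
<?php
// Hypothetical helpers using URL-safe Base64 (an encoding, not encryption).

function captionToBase64Name(string $caption): string
{
    // Swap "+" and "/" for "-" and "_" and drop the "=" padding, so the
    // result needs no percent-encoding in a URL path or file name.
    return rtrim(strtr(base64_encode($caption), '+/', '-_'), '=');
}

function base64NameToCaption(string $name): string
{
    // Undo the character swap; base64_decode() tolerates missing padding.
    return base64_decode(strtr($name, '-_', '+/'));
}

$name = captionToBase64Name(str_repeat('中', 62));
echo strlen($name), "\n"; // 248
```

With this encoding, 62 Chinese characters fit in the roughly 250 available characters, compared with 27 under URLENCODE. Base64 names are case-sensitive, so this sketch assumes a case-sensitive file system such as ext3.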
404 is an HTTP status code. It means that the web page the link points to does not exist, that is, the original URL is no longer valid. This happens often and is hard to avoid, especially on large websites. Typical causes include changes to URL generation rules, renamed or moved page files, misspelled inbound links, and mistakes by editors or developers, any of which can leave the original URL inaccessible. When the web server receives such a request, it returns a 404 status code to tell the browser that the requested resource does not exist.
Generally speaking, the causes of this error fall into two groups. Problems with the website itself: the page URL has changed but the pages linking to it have not been updated in time; the page or the location of its file has changed but the back end has not been updated in time; external links are misspelled; or content administrators or developers have not handled URLs carefully, for example by adding link attributes in some places so that the URL can no longer be accessed normally. Problems attributed to the user's network environment: the website cannot be accessed on the requested port; a web service extension lockdown policy blocks the request; or a MIME map policy blocks the request.
For administrators of small websites with relatively little content, the dead link detection tool "xenu.exe" can find dead links on web pages so that incorrect links are discovered and handled promptly. For administrators of medium and large websites with more content, such a tool may take a long time to run. Instead, watch the status codes in the daily website log files to discover and fix 404 errors in a timely manner.
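Watching the log files for 404 responses can be automated with a short script. The sketch below assumes the "combined" access-log format, in which the quoted request line is followed by the status code. The function name `count404s` and the sample log lines are made up for illustration.

```php
<?php
// Count 404 responses per requested path from access-log lines.
// Assumes the "combined" log format, where the quoted request line is
// followed by the status code, e.g. "GET /a.jpg HTTP/1.1" 404 153
function count404s(iterable $lines): array
{
    $counts = [];
    foreach ($lines as $line) {
        if (preg_match('/"[A-Z]+ (\S+) [^"]*" 404 /', $line, $m)) {
            $counts[$m[1]] = ($counts[$m[1]] ?? 0) + 1;
        }
    }
    arsort($counts); // most frequently missing paths first
    return $counts;
}

// Made-up sample lines; a real run would read the server's own log file.
$sample = [
    '127.0.0.1 - - [01/Jan/2024:00:00:00 +0000] "GET /old.html HTTP/1.1" 404 153 "-" "curl/8.0"',
    '127.0.0.1 - - [01/Jan/2024:00:00:01 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "curl/8.0"',
    '127.0.0.1 - - [01/Jan/2024:00:00:02 +0000] "GET /missing.jpg HTTP/1.1" 404 153 "-" "curl/8.0"',
    '127.0.0.1 - - [01/Jan/2024:00:00:03 +0000] "GET /old.html HTTP/1.1" 404 153 "-" "curl/8.0"',
];
print_r(count404s($sample));
```

Sorting by count puts the most frequently requested missing paths first, which are usually the ones worth fixing or redirecting.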
Create a friendly 404 error page that tells users about the access error, guides them to the home page or a directory, and provides a site search function or the website administrator's contact information. For SEO (search engine optimization) staff, the article "Will 404 Errors Affect the Website" is recommended for more information.
Ordinary users can try the following: switch browsers or clear the browser cache (to rule out problems caused by browser controls or malicious plug-ins); check whether the current user has permission to use the network (some computers limit network access to certain times or restrict it); check whether the network environment is working normally (for example with security or anti-virus software); and check whether a restricting program is running on the computer (some computers use software to control network access and require a password to connect).
The 404 page is the page returned when the user follows an incorrect link. Its purpose is to tell the visitor that the requested page does not exist or that the link is incorrect, and to guide the visitor to other pages of the website instead of closing the window and leaving.
The impact of 404 on SEO
Customizing the 404 error page is good practice for improving the user experience, but its effect on search engines is often overlooked. For example, an incorrect server-side configuration may return a "200" status code, or a custom 404 page implemented with Meta Refresh may return a "302" status code. A properly configured custom 404 error page should not only display correctly but also return a "404" status code rather than "200" or "302". To a visiting user it makes no difference whether the HTTP status code is "404" or "200", but to a search engine it matters a great deal.
When a search engine spider receives a "404" response for a URL, it knows the URL is no longer valid, stops indexing the page, and reports back to the data center so that the page is deleted from the index database, although the deletion may take a long time. When it receives a "200" status code, it treats the URL as valid, indexes it, and includes it in the index database. The result is that different URLs have exactly the same content (the content of the custom 404 error page), which causes a duplicate-page problem. At best, search engines will lower your ranking; at worst, your website will be removed from the index.
How to implement a good 404 page
Changing the server's default error page achieves this. Here are some suggestions to make things easier for your visitors.
Philosophy to follow
Provide a concise description of the problem to eliminate visitor frustration. Provide reasonable solutions to assist visitors in completing their visit goals. Provide a personalized and friendly interface to enhance the access experience.
Usage Guide
Implementation method (arranged from simple to complex):
Give visitors somewhere to go instead of back. Include links to important parts of the site, such as the home page or site map. Don't just tell them to check their spelling. Use text links instead of images, because many visitors won't think of clicking on an image. Example: our site has a link back to the home page. This is the bare minimum for friendly feedback.
When the help you can offer is not enough, consider how visitor feedback can be used to correct the error. Include an email link to the webmaster, or a form for submitting missing links. Visitors prefer a submission form to sending an email.
Add a search box for searching the site. Example: MSN has a search box at the bottom of all pages, which also links to important parts of the site.
List links on the site that are close to the page the visitor asked for, to infer the page the visitor is looking for.
You don't need to follow all of the above suggestions, but they all serve the purpose of making visitors more likely to stay on your site. With these in place, a reasonable 404 error page is complete and gives visitors a lot of valuable information.
How to set up the 404 error page:
1. When existing page content can no longer be reached because its path changed, you can configure the 404 error in IIS to point to a dynamic page and use a 301 permanent redirect in that page to jump to the new address. In this case the server returns a 301 status code.
2. Point the 404 error to a designed HTML file. In this case the page returns the 404 status code. Most IDC providers today offer a 404 setting, so you can upload the file and configure it directly.
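The dynamic-page variant (a 301 for moved pages, a genuine 404 for everything else) might be sketched in PHP as below. The function name `resolveMissing`, the `$moved` mapping, and the use of `REQUEST_URI` to obtain the original path are all assumptions. How the original path reaches the error script depends on the server configuration.

```php
<?php
// Hypothetical 404 handler: paths that moved get a 301, anything else a real 404.
function resolveMissing(string $path, array $moved): array
{
    if (isset($moved[$path])) {
        return ['status' => 301, 'location' => $moved[$path]];
    }
    return ['status' => 404, 'location' => null];
}

$moved = ['/old-page.html' => '/new-page.html']; // made-up mapping

// Assumption: the originally requested path is available in REQUEST_URI.
$path   = parse_url($_SERVER['REQUEST_URI'] ?? '/', PHP_URL_PATH) ?: '/';
$result = resolveMissing($path, $moved);

// Send the real status code so search engines do not see a "200".
http_response_code($result['status']);
if ($result['status'] === 301) {
    header('Location: ' . $result['location']);
} else {
    echo 'Page not found. <a href="/">Back to the home page</a>';
}
```

Keeping the decision in a small function that returns the status makes it easy to check that unknown paths produce 404 rather than 200 or 302.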
Setting method in IIS: Open IIS Manager --> open the properties of the website for which you want to set a custom 404 --> click the Custom Errors option --> select the 404 page --> select and open Edit Properties --> set it to URL --> fill in "/err404.html" in the URL field --> press OK to exit, then upload the completed err404.html page to the root directory of the website. Here, be sure to select "File" or "Default Value" in "Message Type" instead of "...