There comes a time for every web developer when they have to do some type of input validation. A form isn't a blog post where a user can wax poetically about their love of Yahoo Mail in an email field. Eventually, there needs to be word count limits, checks for specific characters, and simple validation techniques that stops the user from sending a junk POST request.
However, what if you need to validate a URL? And to add another layer to the problem, what if you only want the hostname of a URL, no paths, no protocol, just the "dubya dubya dubya dot" (www.) and the .com.
Let's start with knowing if a URL is a URL. A link requires a second and top level domain, the walmart and .com in walmart.com respectively, and a scheme (https://). Without these parts, the link doesn't link to anything and becomes no different from a line of text.
But now that we know the parts of a URL, we reach a fork in or development path. Should the validation restrict the user at the field or sanitize the user input when the data is sent to the server?
There are merits and deficiencies in either options:
Validation Before Submission
If you restrict the user from submitting an invalid URL, it allows you to easily take the data on the server side without any extra work by forcing the user to submit the exact input structure you need. In this case, the pattern attribute for the input element combined with some regex would allow for some good old fashioned field validation.
Here's an example of this approach:
<input type="text" pattern="https?://.*"> <p>However, it comes with a downside of restricting the user. It requires the user to have specific parts to their input and if you just need there to be a .com, then the long regex pattern might be overkill.</p> <hr> <p><strong>Validation After Submission</strong></p> <p>On the other hand, if you choose to sanitize the data after the user submits it, it allows the user to type anything and lets the server decide what to do with the data. Javascript's URL constructor does the validation for you, returning a TypeError if the input is invalid and also allowing you to extract specific parts of the URL like the origin or hostname.</p> <p>Here's an example of this approach:<br> </p> <pre class="brush:php;toolbar:false">export const formatWebsiteAfterDomain = (website: string): string => { if (!website.trim().length) { return ''; } const regEx = /:\/\//; const websiteTrimmed = website.trim(); const hasProtocol = regEx.exec(websiteTrimmed); const updatedWebsite = hasProtocol ? websiteTrimmed : `https://${websiteTrimmed}`; try { const url = new URL(updatedWebsite); return hasProtocol ? url.origin : url.origin.replace('https://', ''); } catch (_err) { return websiteTrimmed; } };
However, because you give the user so much freedom in their input, it requires some compromises in what the server does with the data. If the user puts an invalid URL, what do you do with it? Do you use the TypeError response and notify the user or do you just allow the server to consume what the user sent? Furthermore, the URL constructor validates the input by checking if there is a scheme present (https:// or http://), which may be too little validation for your uses.
In the end, the path taken depends on the specific edge cases of your problem. A combination of both solutions might be the most comprehensive and versatile or one of the choices might be just enough. The user can put in any input and your solution will be determined on the amount of freedom you're willing to give the user. However, what remains universal is that the ability of the user to type anything will always force the user and developer to come to some sort of compromise (often the developer gets a specific input pattern and the user gets to use their application).
But since the peculiarities of user input are eternal, there will always be developers frantically pushing out solutions so their web apps don't break when users try to paste images in the URL field of a form.
The above is the detailed content of URL Validation or: How I Learned to Stop Worrying and Love the User. For more information, please follow other related articles on the PHP Chinese website!

JavaScript is the core language of modern web development and is widely used for its diversity and flexibility. 1) Front-end development: build dynamic web pages and single-page applications through DOM operations and modern frameworks (such as React, Vue.js, Angular). 2) Server-side development: Node.js uses a non-blocking I/O model to handle high concurrency and real-time applications. 3) Mobile and desktop application development: cross-platform development is realized through ReactNative and Electron to improve development efficiency.

The latest trends in JavaScript include the rise of TypeScript, the popularity of modern frameworks and libraries, and the application of WebAssembly. Future prospects cover more powerful type systems, the development of server-side JavaScript, the expansion of artificial intelligence and machine learning, and the potential of IoT and edge computing.

JavaScript is the cornerstone of modern web development, and its main functions include event-driven programming, dynamic content generation and asynchronous programming. 1) Event-driven programming allows web pages to change dynamically according to user operations. 2) Dynamic content generation allows page content to be adjusted according to conditions. 3) Asynchronous programming ensures that the user interface is not blocked. JavaScript is widely used in web interaction, single-page application and server-side development, greatly improving the flexibility of user experience and cross-platform development.

Python is more suitable for data science and machine learning, while JavaScript is more suitable for front-end and full-stack development. 1. Python is known for its concise syntax and rich library ecosystem, and is suitable for data analysis and web development. 2. JavaScript is the core of front-end development. Node.js supports server-side programming and is suitable for full-stack development.

JavaScript does not require installation because it is already built into modern browsers. You just need a text editor and a browser to get started. 1) In the browser environment, run it by embedding the HTML file through tags. 2) In the Node.js environment, after downloading and installing Node.js, run the JavaScript file through the command line.

How to send task notifications in Quartz In advance When using the Quartz timer to schedule a task, the execution time of the task is set by the cron expression. Now...

How to obtain the parameters of functions on prototype chains in JavaScript In JavaScript programming, understanding and manipulating function parameters on prototype chains is a common and important task...

Analysis of the reason why the dynamic style displacement failure of using Vue.js in the WeChat applet web-view is using Vue.js...


Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

VSCode Windows 64-bit Download
A free and powerful IDE editor launched by Microsoft

SublimeText3 Mac version
God-level code editing software (SublimeText3)

SecLists
SecLists is the ultimate security tester's companion. It is a collection of various types of lists that are frequently used during security assessments, all in one place. SecLists helps make security testing more efficient and productive by conveniently providing all the lists a security tester might need. List types include usernames, passwords, URLs, fuzzing payloads, sensitive data patterns, web shells, and more. The tester can simply pull this repository onto a new test machine and he will have access to every type of list he needs.

SublimeText3 English version
Recommended: Win version, supports code prompts!

Dreamweaver CS6
Visual web development tools