Free-form text input also highlights the importance of proper context-aware output encoding, and it clearly demonstrates that input validation is not the primary safeguard against Cross-Site Scripting: if your users want to type an apostrophe (') or a less-than sign (<), the application has to handle that input safely rather than reject it.
References: Input validation of free-form Unicode text in Python
Developing regular expressions can be complicated, and is well beyond the scope of this cheat sheet.
There are many resources on the internet about how to write regular expressions, including the OWASP Validation Regex Repository.
Unfortunately, this makes input harder to normalise and to match correctly to a user's intent.
If the input field comes from a fixed set of options, like a drop-down list or radio buttons, then the input needs to match exactly one of the values offered to the user in the first place.
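The exact-match rule above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the option set `ALLOWED_COUNTRIES` and the function name `validate_choice` are hypothetical and do not come from the cheat sheet.

```python
# Hypothetical values offered to the user in a drop-down list.
ALLOWED_COUNTRIES = frozenset({"AU", "NZ", "US"})


def validate_choice(submitted: str, allowed: frozenset = ALLOWED_COUNTRIES) -> str:
    """Return the submitted value only if it exactly matches an offered option."""
    # Exact, case-sensitive membership test: no trimming, no case folding,
    # no partial matching. Anything else is rejected outright.
    if submitted not in allowed:
        raise ValueError("value is not one of the offered options")
    return submitted
```

The check has to run on the server even when the client only renders the fixed options, because the submitted request can be altered before it arrives.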
Free-form text, especially with Unicode characters, is perceived as difficult to validate due to a relatively large space of characters that need to be whitelisted.
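One way to keep the allowed character space manageable is to normalise the text and then allow whole Unicode character categories instead of listing individual characters. The sketch below uses Python's standard `unicodedata` module. The policy itself (which categories are allowed, the `max_length` default, and the name `validate_free_text`) is an assumption chosen for illustration and should be adapted to the field being validated.

```python
import unicodedata

# Assumed policy for a hypothetical comment field: letters, marks, numbers,
# punctuation and symbols are allowed, plus ordinary spaces and newlines.
# Control, format, private-use, surrogate and unassigned code points are not.
ALLOWED_MAJOR_CATEGORIES = {"L", "M", "N", "P", "S"}
ALLOWED_EXACT_CATEGORIES = {"Zs"}  # space separators
ALLOWED_CONTROL_CHARS = {"\n"}     # line breaks are permitted explicitly


def validate_free_text(text: str, max_length: int = 1000) -> str:
    """Normalise text to NFC and reject characters outside the allowed categories."""
    normalized = unicodedata.normalize("NFC", text)
    if len(normalized) > max_length:
        raise ValueError("text is too long")
    for ch in normalized:
        if ch in ALLOWED_CONTROL_CHARS:
            continue
        category = unicodedata.category(ch)  # two-letter code, e.g. "Lu", "Po"
        if category[0] in ALLOWED_MAJOR_CATEGORIES or category in ALLOWED_EXACT_CATEGORIES:
            continue
        raise ValueError(f"disallowed character U+{ord(ch):04X} ({category})")
    return normalized
```

Note that an apostrophe and a less-than sign both pass this check, since they fall under punctuation and symbols. Validation of this kind narrows the input space but does not replace output encoding.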
Detailed information on XSS prevention is available here: OWASP XSS Prevention Cheat Sheet
Many websites allow users to upload files, such as a profile picture.
Many web applications do not treat email addresses correctly due to common misconceptions about what constitutes a valid address.
Specifically, it is completely valid to have a mailbox address which:
At the time of writing, RFC 5321 is the current standard defining SMTP and what constitutes a valid mailbox address.
Please note, email addresses should be considered to be public data.
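Because valid mailbox addresses are more permissive than many applications assume, a basic structural check is often safer than a strict pattern. The following Python sketch splits an address at its last `@` and applies the size limits given in RFC 5321 section 4.5.3.1 (64 octets for the local-part, 255 octets for the domain). The function name `split_mailbox` and the choice to perform no further syntax checks are assumptions made for illustration; this is not a complete validator.

```python
from typing import Tuple

# Size limits from RFC 5321 section 4.5.3.1.
MAX_LOCAL_PART_OCTETS = 64
MAX_DOMAIN_OCTETS = 255


def split_mailbox(address: str) -> Tuple[str, str]:
    """Split an address into (local_part, domain) with basic structural checks."""
    # Split at the LAST "@", since a quoted local-part may itself contain "@".
    local_part, separator, domain = address.rpartition("@")
    if not separator or not local_part or not domain:
        raise ValueError("address must have the form local-part@domain")
    if len(local_part.encode("utf-8")) > MAX_LOCAL_PART_OCTETS:
        raise ValueError("local-part is too long")
    if len(domain.encode("utf-8")) > MAX_DOMAIN_OCTETS:
        raise ValueError("domain is too long")
    # The local-part is returned unchanged: it is not lower-cased, and
    # characters such as "+" are not stripped or rejected.
    return local_part, domain
```

A structural check like this only shows that the address is plausibly well formed. It does not show that the mailbox exists or that it belongs to the user.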