Tutorial: Regular expression to allow a set of characters and disallow others



Question:

I want to restrict the users from entering the below special characters in a field:

  Å"çşÇŞ  ğĞščřŠŘŇĚŽĎŤČňěž  ůŮ  İťı  â€"¿„”*@  Newline  Carriage return  

A few more will be added to this list but I will have the complete restricted list eventually.

But the user can still enter certain foreign characters like äöüÄÖÜÿï, in addition to alphanumeric characters, the usual special characters, etc.

Is there an easy way to build a regex for this? Adding so many characters to the not-allowed list, like

  [^Å"çşÇŞ ğĞščřŠŘŇĚŽĎŤČňěž ůŮ Ä° Å¥ ı â€" ¿ „ ” * @]+  

does not seem to work.

And I do not have the complete list of allowed characters. It would be too long even if I try to get it and would include all chars like:

  ~`!#$%^&()[]{};':",.  

along with certain foreign chars.


Solution:1

You do not mention what "flavor" of regex you are using. Does the following work?

\A[^Å"çşÇŞ ğĞščřŠŘŇĚŽĎŤČňěž ůŮ Ä° Å¥ ı â€" ¿ „ ” * @]+\z  


Solution:2

A regular expression can be built to match the incorrect characters, e.g.:

[Å"çşÇŞ ğĞščřŠŘŇĚŽĎŤČňěž ůŮ Ä° Å¥ ı]  

(I didn't include all the characters; you get the idea!).

If any character matches, it's a fail.

Or, if you need a regular expression that matches valid input, simply add a caret to the front of the brackets like so:

[^Å"çşÇŞ ğĞščřŠŘŇĚŽĎŤČňěž ůŮ Ä° Å¥ ı]*  
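Both variants can be sketched in Python as follows. This is only an illustration: the `DISALLOWED` string is a partial, hypothetical version of the restricted list (without the separating spaces and without the characters that appear garbled above), and the function names are made up for the example.

```python
import re

# Hypothetical, partial list of restricted characters (assumption),
# including newline and carriage return.
DISALLOWED = "çşÇŞğĞščřŠŘŇĚŽĎŤČňěžůŮİťı¿„”*@\n\r"

# Variant 1: a class that matches any single disallowed character.
bad_char = re.compile("[" + re.escape(DISALLOWED) + "]")

# Variant 2: a negated class that has to cover the whole input.
only_good = re.compile("[^" + re.escape(DISALLOWED) + "]*")


def is_valid_by_search(text: str) -> bool:
    """Valid if no disallowed character occurs anywhere in the text."""
    return bad_char.search(text) is None


def is_valid_by_fullmatch(text: str) -> bool:
    """Valid if the entire text consists of characters outside the list."""
    return only_good.fullmatch(text) is not None
```

Note that the second variant only works as a validator when it is applied to the whole string (here via `fullmatch`); an unanchored `[^...]*` would match the empty string at the start of any input.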


Solution:3

You COULD use a regular expression for this, but why not just check whether any of the disallowed characters are in your string with a built-in method? For example, in the .NET world you could use .Contains().

Personally, I would create a list of allowed characters, then just check that your string doesn't have any characters that aren't in your list. Using a whitelist will ensure that you haven't forgotten any "bad" characters as well.
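The same idea without any regex, sketched in Python rather than .NET. Both character sets are hypothetical and incomplete (assumptions for the example); the helper names are invented here.

```python
import string

# Blacklist check: hypothetical, partial set of restricted characters.
DISALLOWED = set("çşÇŞğĞ*@\n\r")


def contains_disallowed(text: str) -> bool:
    """True if at least one restricted character occurs in the text."""
    return any(ch in DISALLOWED for ch in text)


# Whitelist check: hypothetical set of permitted characters.
ALLOWED = set(
    string.ascii_letters
    + string.digits
    + " ~`!#$%^&()[]{};':\",.<>"
    + "äöüÄÖÜÿï"
)


def unlisted_chars(text: str) -> list:
    """Return the characters in the text that are not on the whitelist."""
    return sorted(set(text) - ALLOWED)
```

With the whitelist, a character nobody thought about is rejected by default, which is the point made above about forgotten "bad" characters.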


Solution:4

> A few more will be added to this list but I will have the complete restricted list eventually.

> And I do not have the complete list of allowed characters (It would be too long even if I try to get it and would include all chars like ~`!#$%^&()[]{};':",.<> along with certain foreign chars)

So you will eventually have the list of disallowed characters, but probably not the list of allowed characters? You must have either the list of all allowed characters or the list of all disallowed characters; otherwise you cannot tell whether the input is legal. Furthermore, if you have one of the lists and the character set is known, you implicitly have the other one. Then just implement the shorter one.

Just guessing, but if you use Unicode, there will probably be many more characters you want to disallow than to allow. Think of all the Chinese and Japanese symbols. So I think you should really build a list of allowed characters and use ranges like a-z where possible.

If you really want to build the list of disallowed characters, you will have to build a regular expression like [^Å"çşÇŞ ğĞščřŠŘŇĚŽĎŤČňěž ůŮ Ä° Å¥ ı â€" ¿ „ ” * @]*. Do not forget to escape the characters if required and use ranges if possible.

> Adding so many chars in the not allowed list like [^Å"çşÇŞ ğĞščřŠŘŇĚŽĎŤČňěž ůŮ Ä° Å¥ ı â€" ¿ „ ” *@]+ does not seem to work.

There are spaces in your list. Are they in your code, too? I am not sure, but that might be the problem: a space inside the brackets is treated as just another disallowed character.
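A rough Python sketch of the points about escaping, ranges, and stray spaces. The helper `build_negated_class` and the sample characters are hypothetical; the range U+4E00 to U+9FFF is the CJK Unified Ideographs block, used here only as an example of a range.

```python
import re


def build_negated_class(chars: str, ranges=()) -> str:
    """Build '[^...]*' from single characters and (low, high) ranges.

    Every character is passed through re.escape so that '*', ']', '^',
    '-' and similar are taken literally inside the brackets.
    """
    parts = [re.escape(c) for c in chars]
    parts += [f"{re.escape(lo)}-{re.escape(hi)}" for lo, hi in ranges]
    return "[^" + "".join(parts) + "]*"


# A space inside the class makes the space itself a disallowed character.
with_space = re.compile(build_negated_class("*@ "))
without_space = re.compile(build_negated_class("*@"))

# A range covers a whole block of characters in one entry.
no_cjk = re.compile(build_negated_class("*@", ranges=[("\u4e00", "\u9fff")]))
```

For example, `with_space.fullmatch("two words")` returns `None`, while `without_space.fullmatch("two words")` succeeds, so a separator space copied into the pattern would reject any input that contains a space.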


Solution:5

It would be best to try and match any character that is not allowed by negating the allowed set. For example, if you only wanted to allow 'a' through 'z', you might do the following.

[^a-z]  

You cannot possibly know all of the characters that are not allowed, but you presumably know the ones that are allowed. So, build a regular expression like the one above that matches only one character that is not in the allowed set. If you get a match, you'll know that the string contains an invalid character.

If you can, try to use built-in character class escape codes if they're available.

For Perl regular expressions, they are listed in the documentation under "Character Classes and other Special Escapes". Using them may give you a shorter expression like this one.

[^\w\d  ..other individual chars..  ]  
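A minimal Python sketch of this approach, assuming a hypothetical allowed set. One caveat that matters for this question: in many engines `\w` is Unicode-aware by default and would then also accept letters such as ç or ş, so the sketch restricts `\w` to ASCII with `re.ASCII` and lists the permitted foreign characters explicitly.

```python
import re

# Hypothetical extra allowed characters besides [A-Za-z0-9_] (assumption).
EXTRA_ALLOWED = " ~`!#$%^&()[]{};':\",.<>" + "äöüÄÖÜÿï"

# Matches ONE character that is outside the allowed set.
# re.ASCII limits \w to [A-Za-z0-9_]; \d is already covered by \w.
invalid_char = re.compile(r"[^\w" + re.escape(EXTRA_ALLOWED) + "]", re.ASCII)


def find_invalid(text: str):
    """Return the first character that is not allowed, or None."""
    match = invalid_char.search(text)
    return match.group() if match else None
```

Returning the offending character, rather than just a yes/no answer, makes it easy to tell the user what to remove.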
