PHP Filter FILTER_VALIDATE_URL Limitations

Tuesday, April 8, 2008 - 16:48

I have previously talked about the filter functions available in PHP 5, but I failed to spot this limitation while researching those articles. It appears that the filter meant to validate URL strings, FILTER_VALIDATE_URL, is not really adequate to the task.

Take the following example of the filter in an if statement.

// filter_var() returns the filtered string on success and false on failure.
if (filter_var($url, FILTER_VALIDATE_URL) !== false) {
    return true;
} else {
    return false;
}

This will return true if the filter accepts the URL and false if it rejects it. To test this I plugged the following URL strings into the function and recorded each outcome.

$url = 'http://www.bbc.co.uk'; // true
$url = 'http://www.hashbangcode.com'; // true
$url = 'http://.com'; // true
$url = 'http://...'; // true
$url = 'http://'; // false
$url = 'http://i\'me really trying to break this url!!!"£$"%$&*()'; // false

As you can see, most of the URLs I tried passed the filter, even though only the first two should actually be valid. The filter seems to do little more than run the string through the parse_url() function and check whether an array is produced. This is clearly not good enough.
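One way to check that suspicion is to put the two results side by side. The following is only a rough sketch: compare_url_checks() is a made-up helper name, not part of PHP, and the outcomes may differ between PHP versions, so no expected output is shown.

```php
<?php
// Hypothetical helper (not a PHP built-in): reports, for one string, whether
// FILTER_VALIDATE_URL accepts it and whether parse_url() yields an array.
function compare_url_checks(string $url): array
{
    return [
        // true when the filter accepts the string
        'filter'    => filter_var($url, FILTER_VALIDATE_URL) !== false,
        // true when parse_url() returns an array rather than false
        'parse_url' => is_array(parse_url($url)),
    ];
}

// The same test strings used in the article.
$urls = [
    'http://www.bbc.co.uk',
    'http://www.hashbangcode.com',
    'http://.com',
    'http://...',
    'http://',
    'http://i\'me really trying to break this url!!!"£$"%$&*()',
];

foreach ($urls as $url) {
    $result = compare_url_checks($url);
    printf(
        "%-6s %-6s %s\n",
        var_export($result['filter'], true),
        var_export($result['parse_url'], true),
        $url
    );
}
```

If the two columns agree for every string on your PHP version, that supports the guess above. If they differ, the filter is doing more than a bare parse_url() call.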

All I can suggest for the time being is that you go back to using regular expressions to test URL validity. I picked this one up from the regular expressions library by doing a quick search for URLs.

^(http\:\/\/[a-zA-Z0-9_\-]+(?:\.[a-zA-Z0-9_\-]+)*\.[a-zA-Z]{2,4}(?:\/[a-zA-Z0-9_]+)*(?:\/[a-zA-Z0-9_]+\.[a-zA-Z]{2,4}(?:\?[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)?)?(?:\&[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)*)$
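As a minimal sketch of how the pattern above might be used, it can be wrapped in a small function around preg_match(). The function name matches_url_regex() is my own invention for illustration. The pattern itself is copied unchanged, with / added as the delimiter.

```php
<?php
// Hypothetical wrapper (the name is illustrative): returns true only when
// the whole string matches the pattern quoted in the article.
function matches_url_regex(string $url): bool
{
    // Pattern copied verbatim from the article, delimited with "/".
    $pattern = '/^(http\:\/\/[a-zA-Z0-9_\-]+(?:\.[a-zA-Z0-9_\-]+)*\.[a-zA-Z]{2,4}(?:\/[a-zA-Z0-9_]+)*(?:\/[a-zA-Z0-9_]+\.[a-zA-Z]{2,4}(?:\?[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)?)?(?:\&[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)*)$/';

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $url) === 1;
}

var_dump(matches_url_regex('http://www.bbc.co.uk'));  // bool(true)
var_dump(matches_url_regex('http://.com'));           // bool(false)
var_dump(matches_url_regex('http://...'));            // bool(false)
var_dump(matches_url_regex('https://www.bbc.co.uk')); // bool(false), only http:// is accepted
```

Bear in mind that this pattern is quite narrow. It rejects https URLs, port numbers, hyphens in paths and top-level domains longer than four letters, so you would probably need to loosen it before relying on it.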

Philip Norton

Phil is the founder and administrator of #! code and is an IT professional working in the North West of the UK.