Note: This post is over two years old and so the information contained here might be out of date. If you do spot something please leave a comment and we will endeavour to correct.
6th June 2009 - 4 minutes read time
If you want to incorporate a W3C validation check into an application then you can use the following class. It uses file_get_contents() to fetch the W3C validator's result page for a URL and then uses regular expressions to return either the number of errors or -1 if a problem occurs. If the document is valid the function returns 0.
<?php
/**
* W3cValidate
*
* For more information on this file and how to use the class please visit
* http://www.hashbangcode.com/blog/w3c-validation-php-class-1300.html
*
* @author Philip Norton
* @version 1.0
* @copyright 2009 #! code
*
*/
/**
* This class allows the retrieval of W3C HTML validation results.
*
* @package W3cValidate
*/
class W3cValidate{
/**
* The URL used in the test
*
* @var string
*/
private $url;
/**
* The W3C validation results for tested URL
*
* @var integer
*/
public $result;
/**
* Constructor
*
* @param string $url The URL that will be used in the test.
*/
public function __construct($url){
// Make sure the URL has a scheme in front of it, defaulting to http
if ( !preg_match('#^https?://#i', $url) ) {
$url = "http://".$url;
}
$this->url = $url;
}
/**
* Get the results from the W3C validation site and use regular expressions to return a result.
*
* @return integer The number of errors, or -1 if an error was encountered.
*/
public function getValidation(){
// Encode the URL so that any query string in it survives as a single parameter
$w3cvalidator = strip_tags(file_get_contents("http://validator.w3.org/check?uri=".urlencode($this->url)));
// Validator response is null
if ( $w3cvalidator == '' ) {
$this->result = -1;
return $this->result;
}
// Validator responded, check results
preg_match_all('/(?<=Result:)\s+(\d+)(?= errors?)/i', $w3cvalidator, $matches);
if ( isset($matches[1][0]) ) {
$this->result = (int) $matches[1][0];
return $this->result;
}
// Check for broken document
preg_match_all('/Sorry! This document can not be checked\./i', $w3cvalidator, $matches);
if ( isset($matches[0][0]) ) {
$this->result = -1;
return $this->result;
}
// Document is valid
$this->result = 0;
return $this->result;
}
}
You can use the class like this:
$w3c = new W3cValidate("http://www.hashbangcode.com");
echo $w3c->getValidation();
$w3c = new W3cValidate("http://www.amazon.co.uk");
echo $w3c->getValidation();
The first result for #! code returns 0 errors, whereas the second result for Amazon UK returns 1,565 errors. I used Amazon as an extreme example as I know that there are many HTML errors.
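The parsing step inside getValidation() can be tried on its own, without calling the validator. The sketch below is only an illustration: parseErrorCount() is a hypothetical helper that is not part of the class, and the sample strings are invented stand-ins for the tag-stripped validator output, not captured responses.

```php
<?php
/**
 * Hypothetical helper: extract the error count from tag-stripped validator text.
 * Returns null when no "Result: N error(s)" fragment is present.
 */
function parseErrorCount(string $text): ?int
{
    // Same idea as the class: digits after "Result:" and before "error" or "errors".
    if (preg_match('/(?<=Result:)\s+(\d+)(?= errors?)/i', $text, $matches)) {
        return (int) $matches[1];
    }
    return null;
}

// Invented sample inputs (assumptions, not real validator output).
var_dump(parseErrorCount("Result: 1 Error"));                 // int(1)
var_dump(parseErrorCount("Result: 12 Errors, 3 warning(s)")); // int(12)
var_dump(parseErrorCount("Result: Passed"));                  // NULL
```

Returning null for text that has no recognisable result makes it possible to tell "no error count found" apart from a genuine count, which helps when checking whether the validator's output format has changed.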
You can download the W3C validation code from the W3C site and install it on your own server, but this codebase is written in Perl. This class is intended as a shortcut to getting W3C validation results without having to integrate that codebase into your PHP application.
There are also two points to remember about this class. First, there is a chance that the W3C will alter the output on their page and therefore change the returned results. This will essentially break the class so it is always useful to double check the results once in a while. Secondly, if you use this class to do too much you will find yourself being blocked from the W3C validation site due to overuse.
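One way to reduce the risk of being blocked is to cache each result for a while, so that repeated checks of the same URL do not hit the validator again. The following is a minimal sketch under stated assumptions: cachedValidation(), the cache directory and the one-day lifetime are all hypothetical choices, and the callable stands in for a call to getValidation() on the class above.

```php
<?php
/**
 * Hypothetical wrapper: return a cached result for $url if it is younger
 * than $ttl seconds, otherwise call $fetch and store what it returns.
 */
function cachedValidation(string $url, callable $fetch, string $cacheDir, int $ttl = 86400): int
{
    $file = rtrim($cacheDir, '/') . '/w3c_' . md5($url) . '.txt';

    // Make sure the file checks below are not answered from PHP's stat cache.
    clearstatcache(true, $file);

    // Reuse a fresh cache entry instead of contacting the validator.
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        return (int) file_get_contents($file);
    }

    $result = (int) $fetch($url);

    // Do not cache failures (-1) so that the next call can retry.
    if ($result >= 0) {
        file_put_contents($file, (string) $result);
    }
    return $result;
}

// Usage with the class from this post (performs a real request):
// $errors = cachedValidation(
//     'http://www.hashbangcode.com',
//     function ($url) { $w3c = new W3cValidate($url); return $w3c->getValidation(); },
//     sys_get_temp_dir()
// );
```

Passing the fetch step in as a callable keeps the caching logic separate from the network call, so it can be exercised with a stub before being pointed at the real validator.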
Comments
Submitted by W3c Validation on Tue, 08/10/2010 - 13:33
One of the basic reasons why websites often do not pass the W3C validation check is due to the fact that there are several programs and scripts that are running on the web pages and many of them do not comply with the web standards.
Submitted by Free Stuff on Mon, 03/14/2011 - 12:19
The class works well, provided you make one tiny change. Currently the regular expression only matches 2 or more errors because it looks for the "s" at the end of errors. You could make this change to handle detecting a single error.
The question mark just makes the 's' optional. Works perfectly after that.
Submitted by Anonymous on Tue, 03/27/2012 - 15:12
Corrected. Thanks for the input! :)
Submitted by philipnorton42 on Tue, 03/27/2012 - 15:55
Gracias..
Submitted by Amigo on Wed, 03/28/2012 - 19:26
Submitted by d3iti on Tue, 08/19/2014 - 10:36
Submitted by Sandeep pattanaik on Tue, 07/28/2015 - 16:43