It is widely known that the visitor numbers Alexa reports are far from accurate, but Alexa also provides an XML feed that exposes all of the data it holds on a site, which is more than just visitor numbers. By passing the correct parameters to this feed, you can find out related links, contact and domain information, the Alexa rank, associated keywords and Dmoz listings.
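As an example, here is the feed URL for getting information about bbc.co.uk, the same address that the PHP code further down builds:
http://xml.alexa.com/data?cli=10&dat=nsa&ver=quirk-searchstatus&uid=19700101000000&userip=127.0.0.1&url=www.bbc.co.uk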
So to get information about any site, all you have to do is pass that site's address in the url parameter at the end of this feed URL.
To get this information in a usable form with PHP you can use the curl functions. To download the Alexa feed into PHP use the following code:
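    // The site to look up; it is URL-encoded and appended to the feed address.
    $url = 'www.bbc.co.uk';
    $querystring = 'http://xml.alexa.com/data?cli=10&dat=nsa&ver=quirk-searchstatus&uid=19700101000000&userip=127.0.0.1&url='.urlencode($url);
    $ch = curl_init();
    $user_agent = "Mozilla/4.0";
    curl_setopt($ch, CURLOPT_URL, $querystring);
    curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
    // Leave the HTTP response headers out so $alexaXml holds only the XML body.
    curl_setopt($ch, CURLOPT_HEADER, 0);
    // Return the response as a string instead of printing it.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_TIMEOUT, 120);
    $alexaXml = curl_exec($ch);
    // curl_exec() returns false on failure, so stop here with the cURL error.
    if ($alexaXml === false) {
        die('Alexa request failed: '.curl_error($ch));
    }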
You now have a variable called $alexaXml that contains all of the information you need. You could use one of the XML parsing options within PHP, but a simpler method is to extract the values you need using regular expressions. Here are a few examples.
To get the Alexa popularity.
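A minimal sketch of this extraction, assuming the feed reports the rank in a POPULARITY element with a TEXT attribute. The element name, the attribute layout and the sample values are assumptions rather than a captured response, and the hard-coded string stands in for the $alexaXml downloaded with curl:

```php
<?php
// Hypothetical fragment standing in for the downloaded feed; the element
// and attribute names are assumptions about the feed's layout.
$alexaXml = '<ALEXA><SD><POPULARITY URL="bbc.co.uk/" TEXT="44"/></SD></ALEXA>';

// Capture the digits held in the TEXT attribute of the POPULARITY element.
$popularity = null;
if (preg_match('/<POPULARITY[^>]*\bTEXT="(\d+)"/i', $alexaXml, $matches)) {
    $popularity = (int) $matches[1];
}

echo 'Alexa rank: ' . ($popularity === null ? 'not found' : $popularity) . "\n";
```

Leaving $popularity as null when nothing matches lets the calling code tell "no rank in the feed" apart from a real value.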
To get the Alexa links.
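One possible version, assuming "links" here means the number of sites linking in and that the feed exposes it as a LINKSIN element with a NUM attribute. Both the interpretation and the element name are assumptions, and the sample string is made up for illustration:

```php
<?php
// Hypothetical fragment standing in for the downloaded feed; LINKSIN and
// NUM are assumed names, not confirmed against a live response.
$alexaXml = '<ALEXA><SD><LINKSIN NUM="1234"/></SD></ALEXA>';

// Capture the digits held in the NUM attribute of the LINKSIN element.
$linksIn = null;
if (preg_match('/<LINKSIN[^>]*\bNUM="(\d+)"/i', $alexaXml, $matches)) {
    $linksIn = (int) $matches[1];
}

echo 'Sites linking in: ' . ($linksIn === null ? 'not found' : $linksIn) . "\n";
```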
To get the Dmoz categories.
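Because a site can sit in several categories, preg_match_all() is the natural fit here. The sketch below assumes each category appears as a CAT element whose ID attribute holds the Dmoz path; those names and the sample data are assumptions for illustration only:

```php
<?php
// Hypothetical fragment standing in for the downloaded feed; the DMOZ, CATS
// and CAT names and their attributes are assumptions about the layout.
$alexaXml = '<ALEXA><DMOZ><SITE BASE="bbc.co.uk/" TITLE="BBC"><CATS>'
    . '<CAT ID="Top/News" TITLE="News" CID="1"/>'
    . '<CAT ID="Top/Arts/Television" TITLE="Television" CID="2"/>'
    . '</CATS></SITE></DMOZ></ALEXA>';

// "<CAT\s" skips the enclosing <CATS> tag, and "\bID" avoids matching the
// "ID" inside the CID attribute.
$categories = array();
if (preg_match_all('/<CAT\s[^>]*\bID="([^"]*)"/i', $alexaXml, $matches)) {
    $categories = $matches[1];
}

foreach ($categories as $category) {
    echo $category . "\n";
}
```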
You can also see the data directly by printing off a couple of links.
    echo '<br />';
    echo '<a href="http://www.alexa.com/data/details/traffic_details/'.urlencode($url).'" title="Alexa Data">Data</a>';
There is more information available than this. To see everything that you can extract just copy the URL at the top into a browser window and view the output directly. I suggest doing this in Firefox because of the nice way in which it displays XML.
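If you would rather list everything from PHP than pick values out one regular expression at a time, the XML parsing route mentioned earlier can be sketched with SimpleXML. This assumes the SimpleXML extension is available and that the string holds only the XML body, since simplexml_load_string() cannot parse a response that still has HTTP headers in front of it (CURLOPT_HEADER would need to be 0). The collectAttributes() helper and the sample elements are hypothetical:

```php
<?php
// Hypothetical fragment standing in for the feed body (no HTTP headers).
$alexaXml = '<ALEXA><SD><POPULARITY URL="bbc.co.uk/" TEXT="44"/>'
    . '<LINKSIN NUM="1234"/></SD></ALEXA>';

// Walk the tree and collect "ELEMENT@ATTRIBUTE" => value pairs.
function collectAttributes(SimpleXMLElement $node, array &$found)
{
    foreach ($node->attributes() as $name => $value) {
        $found[$node->getName() . '@' . $name] = (string) $value;
    }
    foreach ($node->children() as $child) {
        collectAttributes($child, $found);
    }
}

$found = array();
$xml = simplexml_load_string($alexaXml);
if ($xml !== false) {
    collectAttributes($xml, $found);
}

foreach ($found as $key => $value) {
    echo $key . ' = ' . $value . "\n";
}
```

Keys are built from the element and attribute names, so a repeated element overwrites the earlier entry; for repeated data such as categories you would collect the values into arrays instead.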