Downloading Alexa Data With PHP
Published by philipnorton42 on Wed, 01/23/2008 - 11:01
It is widely known that Alexa's visitor numbers are far from accurate, but Alexa also provides an XML feed that exposes all of the data it holds on a site, which is more than just visitor numbers. By passing the correct parameters to this feed you can find out related links, contact and domain information, the Alexa rank, associated keywords and Dmoz listings.
As an example, here is a feed URL for getting information about the bbc.co.uk site.
http://xml.alexa.com/data?cli=10&dat=nsa&ver=quirk-searchstatus&uid=19700101000000&userip=127.0.0.1&url=www.bbc.co.uk
So to get information about any site, all you have to do is pass that site's address in the url parameter at the end of this address.
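As a minimal sketch of that idea, the feed address can be assembled from its parts. The function name alexaFeedUrl() is made up for this illustration; the parameter values are copied from the example URL above, and what each one means to Alexa is not documented here.

```php
<?php
// Hypothetical helper (not part of the original post): build the feed URL
// for a given site.
function alexaFeedUrl(string $site): string
{
    // Values copied verbatim from the example feed URL.
    $params = [
        'cli'    => 10,
        'dat'    => 'nsa',
        'ver'    => 'quirk-searchstatus',
        'uid'    => '19700101000000',
        'userip' => '127.0.0.1',
        'url'    => $site,
    ];

    // http_build_query() URL-encodes every value and joins the pairs with "&".
    return 'http://xml.alexa.com/data?' . http_build_query($params, '', '&');
}

echo alexaFeedUrl('www.bbc.co.uk'), "\n";
```

Building the query from an array keeps the encoding in one place, so a site name containing unusual characters cannot break the rest of the URL.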
To get this information in a usable form with PHP you can use the curl functions. To download the Alexa feed into PHP use the following code:
    $url = 'www.bbc.co.uk';
    // Feed address with the site to look up appended as the url parameter.
    $querystring = 'http://xml.alexa.com/data?cli=10&dat=nsa&ver=quirk-searchstatus&uid=19700101000000&userip=127.0.0.1&url='.urlencode($url);
    $ch = curl_init();
    $user_agent = "Mozilla/4.0";
    curl_setopt($ch, CURLOPT_URL, $querystring);
    curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
    // Include the HTTP response headers at the start of the returned string.
    curl_setopt($ch, CURLOPT_HEADER, 1);
    // Return the response from curl_exec() instead of printing it.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_TIMEOUT, 120);
    $alexaXml = curl_exec($ch);
    curl_close($ch);
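Because CURLOPT_HEADER is switched on, the string returned by curl_exec() starts with the HTTP headers and only then contains the XML. That does not matter for regular expressions, but it would for an XML parser. The sketch below shows one way to separate the two. The helper name splitResponse() and the sample response are invented for illustration; in a live request the header size would come from curl_getinfo($ch, CURLINFO_HEADER_SIZE), read before curl_close().

```php
<?php
// Hypothetical helper: split a raw cURL response into headers and body.
// $headerSize is the byte length of the header block.
function splitResponse(string $raw, int $headerSize): array
{
    return [
        'headers' => substr($raw, 0, $headerSize),
        'body'    => substr($raw, $headerSize),
    ];
}

// Made-up response, used only to demonstrate the split.
$headers = "HTTP/1.1 200 OK\r\nContent-Type: text/xml\r\n\r\n";
$raw     = $headers . '<ALEXA VER="0.9"></ALEXA>';

$parts = splitResponse($raw, strlen($headers));
echo $parts['body'], "\n";
```

Setting CURLOPT_HEADER to 0 instead would avoid the problem entirely if the headers are not needed.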
You now have a variable called $alexaXml that contains all of the information you need. You could use some of the XML parsing options within PHP, but a simpler method is to extract the information you need using regular expressions. Here are a few examples.
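For comparison, here is a rough sketch of the XML parsing route using SimpleXML. The sample document is an assumption: its element and attribute names are taken from the regular expressions used in this post, while the nesting and the values are made up, so the real feed may look different. The download code sets CURLOPT_HEADER, so the HTTP headers would have to be removed (or that option turned off) before passing the real response to the parser.

```php
<?php
// ASSUMED sample: tag and attribute names follow the regular expressions in
// this post; the nesting and the numbers are invented for illustration.
$sampleXml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<ALEXA>
  <SD>
    <POPULARITY URL="bbc.co.uk/" TEXT="44"/>
    <LINKSIN NUM="12345"/>
  </SD>
</ALEXA>
XML;

$doc = simplexml_load_string($sampleXml);
if ($doc === false) {
    exit("Could not parse the XML\n");
}

// The XPath "//NAME" finds an element wherever it sits in the tree, so the
// code does not depend on the exact nesting.
$popularity = $doc->xpath('//POPULARITY');
$linksIn    = $doc->xpath('//LINKSIN');

// Fall back to '0' when the element is missing, as the regex examples do.
$rank  = $popularity ? (string) $popularity[0]['TEXT'] : '0';
$links = $linksIn ? (string) $linksIn[0]['NUM'] : '0';

echo "Popularity: $rank\n";
echo "Links: $links\n";
```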
To get the Alexa popularity.
    // No U modifier: it inverts greediness, which would make (.*?) greedy.
    preg_match('/<POPULARITY URL="(.*?)" TEXT="(.*?)"\/>/i', $alexaXml, $match);
    echo "<p>Popularity: ";
    if (count($match) > 0) {
        echo $match[2];
    } else {
        echo 0;
    }
    echo '</p>';
To get the Alexa links.
    // No U modifier: it inverts greediness, which would make (.*?) greedy.
    preg_match('/LINKSIN NUM="(.*?)"/i', $alexaXml, $match);
    echo "<p>Links: ";
    if (count($match) > 0) {
        echo $match[1];
    } else {
        echo 0;
    }
    echo '</p>';
To get the Dmoz categories.
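    // With the U modifier, (.*) is non-greedy and stops at the first closing quote.
    preg_match_all('/CAT\sID="(.*)"/U', $alexaXml, $match);
    echo "<p>Dmoz cats: ";
    if (count($match[1])) {
        // The original listing is cut off after print_r(); the closing lines are reconstructed.
        echo '<pre>'.print_r($match[1], true).'</pre>';
    }
    echo '</p>';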
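The single-value examples share the same shape: run a pattern, print the captured value, or print 0 if nothing matched. One possible way to fold that into a helper is sketched below. The name firstMatch() and the sample string are invented for illustration, and the patterns use [^"]* rather than .* so that they do not depend on greediness settings.

```php
<?php
// Hypothetical helper: return the first capture group of $pattern,
// or $default when the pattern does not match.
function firstMatch(string $pattern, string $subject, string $default = '0'): string
{
    return preg_match($pattern, $subject, $m) === 1 ? $m[1] : $default;
}

// Made-up fragment standing in for the downloaded feed.
$alexaXml = '<POPULARITY URL="bbc.co.uk/" TEXT="44"/><LINKSIN NUM="12345"/>';

// [^"]* can never run past the closing quote of the attribute.
$popularity = firstMatch('/<POPULARITY\s[^>]*TEXT="([^"]*)"/i', $alexaXml);
$links      = firstMatch('/<LINKSIN\s[^>]*NUM="([^"]*)"/i', $alexaXml);
$missing    = firstMatch('/<MISSING\s[^>]*NUM="([^"]*)"/i', $alexaXml);

echo "<p>Popularity: $popularity</p>\n";
echo "<p>Links: $links</p>\n";
echo "<p>Missing: $missing</p>\n";
```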
You can also see the data directly by printing off a couple of links.
    echo '<a href="http://www.alexa.com/data/ds/linksin?q=link%3A'.urlencode($url).'&url=http%3A//'.urlencode($url).'/" title="Alexa Links">Links</a>';
    echo '<br />';
    echo '<a href="http://www.alexa.com/data/details/traffic_details/'.urlencode($url).'" title="Alexa Data">Data</a>';
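If these links are going into a page that needs to validate, the ampersand in the first address should be written as an HTML entity. A small sketch of one way to do that follows; the helper name alexaLink() is invented here, and the addresses are the same ones used above.

```php
<?php
// Hypothetical helper: build an anchor tag with every value HTML-escaped.
function alexaLink(string $href, string $text, string $title): string
{
    return sprintf(
        '<a href="%s" title="%s">%s</a>',
        htmlspecialchars($href, ENT_QUOTES, 'UTF-8'),
        htmlspecialchars($title, ENT_QUOTES, 'UTF-8'),
        htmlspecialchars($text, ENT_QUOTES, 'UTF-8')
    );
}

$url  = 'www.bbc.co.uk';
$site = urlencode($url);

$linksHtml = alexaLink(
    'http://www.alexa.com/data/ds/linksin?q=link%3A' . $site . '&url=http%3A//' . $site . '/',
    'Links',
    'Alexa Links'
);
$dataHtml = alexaLink(
    'http://www.alexa.com/data/details/traffic_details/' . $site,
    'Data',
    'Alexa Data'
);

echo $linksHtml, '<br />', $dataHtml, "\n";
```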
There is more information available than this. To see everything that you can extract just copy the URL at the top into a browser window and view the output directly. I suggest doing this in Firefox because of the nice way in which it displays XML.
Comments
I know about XML parsing in
philipnorton42 - Tue, 02/19/2008 - 16:04

You really need to learn
Indy (not verified) - Tue, 02/19/2008 - 13:25

Is it possible to get the web
Marty Martin (not verified) - Tue, 03/25/2008 - 15:52

Not that I know of. You can
philipnorton42 - Tue, 03/25/2008 - 21:01
http://www.alexa.com/data/details/traffic_details/{url}
So for hashbangcode.com this would be
http://www.alexa.com/data/details/traffic_details/hashbangcode.com
However, I don't see a way of getting to the traffic data itself. Not that the data is of much use anyway!

Not that I can see, although
philipnorton42 - Thu, 03/04/2010 - 10:00

can we get the keywords that
Mohamed Mahmoud (not verified) - Wed, 03/03/2010 - 18:13