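Getting either of these two figures is actually not hard: both services ship a browser toolbar, so any packet sniffer will show you the requests they make.
Why fetch them from a program? Google PageRank is a relatively important factor in Google's ranking, and looking up PageRank in bulk for a batch of site URLs gives a quick picture of how well linked those sites are. Alexa publishes its top 500 sites but nothing beyond that, so being able to fetch the Alexa rank of any domain from a program is also quite useful.
Below is some analysis of Google PR and Alexa, and how to fetch each.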

1 Google PageRank

http://toolbarqueries.google.com/search?client=navclient-auto&ch=CHECKSUM&ie=UTF-8&oe=UTF-8&features=Rank:FVN&q=info:http://YOURURL

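In the address above, CHECKSUM is a number computed from the string info:http://YOURURL that follows it; Google uses it to verify that the request really comes from the Toolbar.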
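
As a rough sketch, the template above can be filled in like this. The function name buildPageRankQuery is hypothetical, the checksum is assumed to have been computed already, and percent-encoding the target URL is my own assumption rather than something the template states.

```php
<?php
// Hypothetical helper: fills in the toolbar query template quoted above.
// $checksum is the full value printed by the checksum script (leading "6" included).
function buildPageRankQuery($pageUrl, $checksum)
{
    return 'http://toolbarqueries.google.com/search?client=navclient-auto'
        . '&ch=' . $checksum
        . '&ie=UTF-8&oe=UTF-8&features=Rank:FVN'
        . '&q=info:' . rawurlencode($pageUrl); // encoding the target URL is an assumption
}

echo buildPageRankQuery('http://www.example.com/', '6540747202'), "\n";
```

Keeping the template as one literal string makes it easy to compare against what a sniffer shows.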

The checksum algorithm is easy to find with a web search. The earliest and most widely circulated version is the following PHP code.

<?php
/*
    This code is released unto the public domain
*/
header("Content-Type: text/plain; charset=utf-8");
define('GOOGLE_MAGIC', 0xE6359A60);

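// Unsigned (zero-fill) right shift: shifts $a right by $b bits without sign extension
function zeroFill($a, $b)
{
    $z = 0x80000000; // sign bit of a 32-bit integer
    if ($z & $a)
    {
        // Sign bit set: shift once, clear the copied sign bit,
        // set bit 30 (the old sign bit), then finish the shift
        $a = ($a>>1);
        $a &= (~$z);
        $a |= 0x40000000;
        $a = ($a>>($b-1));
    }
    else
    {
        $a = ($a>>$b);
    }
    return $a;
}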


function mix($a,$b,$c) {
  $a -= $b; $a -= $c; $a ^= (zeroFill($c,13));
  $b -= $c; $b -= $a; $b ^= ($a<<8);
  $c -= $a; $c -= $b; $c ^= (zeroFill($b,13));
  $a -= $b; $a -= $c; $a ^= (zeroFill($c,12));
  $b -= $c; $b -= $a; $b ^= ($a<<16);
  $c -= $a; $c -= $b; $c ^= (zeroFill($b,5));
  $a -= $b; $a -= $c; $a ^= (zeroFill($c,3));  
  $b -= $c; $b -= $a; $b ^= ($a<<10);
  $c -= $a; $c -= $b; $c ^= (zeroFill($b,15));
  
  return array($a,$b,$c);
}

function GoogleCH($url, $length=null, $init=GOOGLE_MAGIC) {
    if(is_null($length)) {
        $length = sizeof($url);
    }
    $a = $b = 0x9E3779B9;
    $c = $init;
    $k = 0;
    $len = $length;
    while($len >= 12) {
        $a += ($url[$k+0] +($url[$k+1]<<8) +($url[$k+2]<<16) +($url[$k+3]<<24));
        $b += ($url[$k+4] +($url[$k+5]<<8) +($url[$k+6]<<16) +($url[$k+7]<<24));
        $c += ($url[$k+8] +($url[$k+9]<<8) +($url[$k+10]<<16)+($url[$k+11]<<24));
        $mix = mix($a,$b,$c);
        $a = $mix[0]; $b = $mix[1]; $c = $mix[2];
        $k += 12;
        $len -= 12;
    }

    $c += $length;
    switch($len)              /* all the case statements fall through */
    {
        case 11: $c+=($url[$k+10]<<24);
        case 10: $c+=($url[$k+9]<<16);
        case 9 : $c+=($url[$k+8]<<8);
          /* the first byte of c is reserved for the length */
        case 8 : $b+=($url[$k+7]<<24);
        case 7 : $b+=($url[$k+6]<<16);
        case 6 : $b+=($url[$k+5]<<8);
        case 5 : $b+=($url[$k+4]);
        case 4 : $a+=($url[$k+3]<<24);
        case 3 : $a+=($url[$k+2]<<16);
        case 2 : $a+=($url[$k+1]<<8);
        case 1 : $a+=($url[$k+0]);
         /* case 0: nothing left to add */
    }
    $mix = mix($a,$b,$c);
    /*-------------------------------------------- report the result */
    return $mix[2];
}

// Converts a string into an array of integers holding the numeric value of each char
function strord($string) {
    $result = array();
    for($i=0;$i<strlen($string);$i++) {
        $result[$i] = ord($string[$i]);
    }
    return $result;
}

// Example: http://www.example.com/ - Checksum: 6540747202
$url = 'info:'.$_GET['url'];
print("url:\t{$_GET['url']}\n");
$ch = GoogleCH(strord($url));
printf("ch:\t6%u\n",$ch);
?>
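
One caveat when running the listing today: it appears to have been written for 32-bit PHP, where the subtractions and left shifts in mix() stay within 32 bits. On a 64-bit build they do not, so a port would need to truncate explicitly. The helpers below are only a sketch of that idea; the names u32, sub32, shl32 and shr32 are hypothetical and are not part of the original code.

```php
<?php
// Hypothetical helpers (not in the original listing) that keep values inside
// the unsigned 32-bit range. Assumes a 64-bit PHP build (PHP_INT_SIZE === 8).

// Truncate to the low 32 bits, as a 32-bit register would.
function u32($x)
{
    return $x & 0xFFFFFFFF;
}

// Subtraction that wraps modulo 2^32.
function sub32($a, $b)
{
    return u32($a - $b);
}

// Left shift that discards bits pushed past bit 31.
function shl32($a, $n)
{
    return u32($a << $n);
}

// Logical (zero-fill) right shift; masking first means no sign bit is copied.
function shr32($a, $n)
{
    return u32($a) >> $n;
}

var_dump(sub32(0, 1) === 0xFFFFFFFF);   // wraps around instead of going negative
var_dump(shl32(0x80000001, 1) === 2);   // the top bit falls off
var_dump(shr32(0x80000000, 31) === 1);  // zeros are shifted in from the left
```

With helpers like these, each step of mix() could be rewritten as, for example, $a = sub32(sub32($a, $b), $c) ^ shr32($c, 13). I have not checked such a port against the sample checksum above.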

Source code for computing the checksum can also be found in VB and Pascal.

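Issuing a GET request to that address returns the PageRank of the URL directly. Note that the URL can be a bare domain or a full page address. With that, Google PageRank can be fetched entirely from code.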
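
The reply still has to be parsed. The sketch below assumes the service answers with a single line of the form Rank_1:1:6, where the last number is the PageRank. That format is my assumption and is not shown in this post, and parsePageRank is a hypothetical name.

```php
<?php
// Hypothetical parser. The "Rank_1:1:N" reply format is an assumption;
// adjust the pattern to whatever your sniffer actually shows.
function parsePageRank($body)
{
    if (preg_match('/^Rank_\d+:\d+:(\d+)\s*$/m', $body, $m)) {
        return (int) $m[1];
    }
    return null; // no rank found in the reply
}

var_dump(parsePageRank("Rank_1:1:6\n"));
var_dump(parsePageRank(""));
```

Returning null for an unrecognised reply keeps "no rank available" distinct from a genuine PageRank of 0.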

2 Alexa ranking data

http://data.alexa.com/data/+wQ411en8000lA?cli=10&dat=snba&ver=7.0&cdt=alx_vw%3D20%26wid%3D12206%26act%3D00000000000%26ss%3D1680x16t%3D0%26ttl%3D35371%26vis%3D1%26rq%3D4&url=spaces.msn.com

Simply GET the address above, replacing spaces.msn.com with the address you want to look up. The call returns XML like the following:
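
Swapping in the target address could look like the sketch below. buildAlexaQuery is a hypothetical name, and every parameter other than url is copied unchanged from the captured request above, since the post does not say what they mean.

```php
<?php
// Hypothetical helper: reuses the captured toolbar request and only swaps the url parameter.
function buildAlexaQuery($site)
{
    // Copied verbatim from the request above; these parameters are treated as opaque.
    $base = 'http://data.alexa.com/data/+wQ411en8000lA?cli=10&dat=snba&ver=7.0'
        . '&cdt=alx_vw%3D20%26wid%3D12206%26act%3D00000000000%26ss%3D1680x16t%3D0'
        . '%26ttl%3D35371%26vis%3D1%26rq%3D4';
    return $base . '&url=' . rawurlencode($site);
}

echo buildAlexaQuery('spaces.msn.com'), "\n";
```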

<?xml version="1.0" encoding="UTF-8"?>

<ALEXA VER="0.9" URL="spaces.msn.com/" HOME="0" AID="=">
<RLS TITLE="Related Links" PREFIX="http://" more ="389">
<RL HREF="mobile.msn.co.jp/" TYPE="link" SRC="NTrails" TITLE="Msn" CONF="034" />
<RL HREF="cnn.com/" TYPE="link" SRC="Siblinks" TITLE="CNN - Cable News Network" CONF="300" ASIN="B00006B48F"/>
<RL HREF="cbsnews.com/sections/home/main100.shtml" TYPE="link" SRC="Siblinks" TITLE="CBS News" CONF="300" ASIN="B00006DFEQ"/>
<RL HREF="abcnews.go.com/" TYPE="link" SRC="Siblinks" TITLE="ABC News" CONF="300" ASIN="B00006CBMR"/>
<RL HREF="altavista.com/" TYPE="link" SRC="Siblinks" TITLE="Altavista" CONF="300" ASIN="B00006CZ94"/>
<RL HREF="yahoo.com/" TYPE="link" SRC="UserEdit" TITLE="Yahoo!" CONF="300" ASIN="B00006D2TC"/>
<RL HREF="www.hotbot.com/" TYPE="link" SRC="UserEdit" TITLE="HotBot" CONF="300" ASIN="B00006BUYX"/>
<RL HREF="netscape.com/" TYPE="link" SRC="UserEdit" TITLE="Netscape" CONF="300" ASIN="B00006C6KQ"/>
<RL HREF="excite.com/" TYPE="link" SRC="UserEdit" TITLE="My Excite" CONF="300" ASIN="B00006E21K"/>
<RL HREF="aol.com/" TYPE="link" SRC="UserEdit" TITLE="AOL Anywhere" CONF="300" ASIN="B00006ARD3"/>
<RL HREF="www.geocities.com/" TYPE="link" SRC="Usertrails" TITLE="www.geocities.com/" CONF="000"/>
</RLS>
<SD TITLE="Alexa Site Data" FLAGS="DMOZ">
<AMZN ASIN="B000304FNA" URL="spaces.msn.com/"/>
<ADDR STREET="One Microsoft Way" CITY="Redmond" STATE="WA" ZIP="98052" COUNTRY="US"/>
<CREATED DATE="10-Nov-1994" DAY="10" MONTH="11" YEAR="1994"/>
<PHONE NUMBER="unlisted"/>
<OWNER NAME="www.msn.com"/>
<EMAIL ADDR="info@msn.com"/>
<POP RATE="13"/>
<DOS>
<DO DOMAIN="microsoft.com" TITLE="microsoft.com"/>
<DO DOMAIN="passport.com" TITLE="passport.com"/>
<DO DOMAIN="msnbc.com" TITLE="msnbc.com"/>
<DO DOMAIN="windowsmedia.com" TITLE="windowsmedia.com"/>
<DO DOMAIN="iechannelguide.com" TITLE="iechannelguide.com"/>
<DO DOMAIN="cooltravelassistant.com" TITLE="cooltravelassistant.com"/>
<DO DOMAIN="mstrav.com" TITLE="mstrav.com"/>
<DO DOMAIN="msnusers.com" TITLE="msnusers.com"/>
<DO DOMAIN="msimg.com" TITLE="msimg.com"/>
<DO DOMAIN="eshop.com" TITLE="eshop.com"/>
<DO DOMAIN="windowsupdate.com" TITLE="windowsupdate.com"/>
<DO DOMAIN="passportimages.com" TITLE="passportimages.com"/>
<DO DOMAIN="home-publishing.com" TITLE="home-publishing.com"/>
<DO DOMAIN="slate.com" TITLE="slate.com"/>
<DO DOMAIN="windows.com" TITLE="windows.com"/>
<DO DOMAIN="windows95.com" TITLE="windows95.com"/>
<DO DOMAIN="expediamaps.com" TITLE="expediamaps.com"/>
<DO DOMAIN="encarta.com" TITLE="encarta.com"/>
<DO DOMAIN="homeadvisor.com" TITLE="homeadvisor.com"/>
<DO DOMAIN="carpoint.com" TITLE="carpoint.com"/>
<DO DOMAIN="hotmai.com" TITLE="hotmai.com"/>
<DO DOMAIN="msn.net" TITLE="msn.net"/>
<DO DOMAIN="moneycentral.com" TITLE="moneycentral.com"/>
<DO DOMAIN="msretech.com" TITLE="msretech.com"/>
<DO DOMAIN="microsoftfrontpage.com" TITLE="microsoftfrontpage.com"/>
<DO DOMAIN="vworlds.org" TITLE="vworlds.org"/>
<DO DOMAIN="investor.com" TITLE="investor.com"/>
<DO DOMAIN="homail.com" TITLE="homail.com"/>
<DO DOMAIN="crimsonskies.com" TITLE="crimsonskies.com"/>
</DOS>
<TICKER SYMBOL="MSFT"/>
<LANG LEX="en"/>
<LINKSIN NUM="5558"/>
<SPEED TEXT="2537" PCT="30"/>
<REVIEWS AVG="4.0" NUM="21"/>
<POPULARITY URL="msn.com/" TEXT="2"/>
<CHILD SRATING="0"/>
<ASSOCS>
<ASSOC ID="start-buymusiclink"/></ASSOCS>
<REACH RANK="2"/>
</SD>

<KEYWORDS>
</KEYWORDS>
 </ALEXA>
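
To pull the numbers out of a reply like the one above, something along these lines should do. It is only a sketch: parseAlexaRank is a hypothetical name, and reading the TEXT attribute of POPULARITY as the traffic rank is my interpretation of the sample, not something Alexa documents here.

```php
<?php
// Hypothetical helper: extracts the ranking fields from a data.alexa.com reply.
// Treating POPULARITY/@TEXT as the traffic rank is an assumption.
function parseAlexaRank($xml)
{
    $doc = simplexml_load_string($xml);
    if ($doc === false || !isset($doc->SD)) {
        return null; // unparseable reply, or no site data block
    }
    $sd = $doc->SD;
    return array(
        'rank'    => isset($sd->POPULARITY['TEXT']) ? (int) $sd->POPULARITY['TEXT'] : null,
        'reach'   => isset($sd->REACH['RANK']) ? (int) $sd->REACH['RANK'] : null,
        'linksin' => isset($sd->LINKSIN['NUM']) ? (int) $sd->LINKSIN['NUM'] : null,
    );
}

// Trimmed copy of the sample reply above.
$sample = '<?xml version="1.0" encoding="UTF-8"?>'
    . '<ALEXA VER="0.9" URL="spaces.msn.com/" HOME="0" AID="=">'
    . '<SD TITLE="Alexa Site Data" FLAGS="DMOZ">'
    . '<LINKSIN NUM="5558"/>'
    . '<POPULARITY URL="msn.com/" TEXT="2"/>'
    . '<REACH RANK="2"/>'
    . '</SD></ALEXA>';

print_r(parseAlexaRank($sample));
```

Fields missing from a reply come back as null, so a site with no rank does not turn into rank 0.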

With this, a program can fetch the Google PR and Alexa rank of any address.