How to Log Search Engine Spider Crawls in WordPress


By 霜天

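On a new site, or when a site is having indexing problems, you may need to keep a close eye on how search engine spiders are crawling it. Whether content gets indexed promptly is something webmasters check every day. Daily output is limited, so we care all the more about whether what we write appeals to Baidu's spider and gets indexed quickly: a page indexed a day earlier has a better chance of picking up traffic, which in turn lowers the cost of monetizing it later. So how can WordPress record search engine spider crawls?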

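Opening the server access log every time is tedious, especially when the log file is large. It is far more convenient to view the crawl records directly online. This can be done without a plugin, using plain PHP code; the implementation follows.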

// Log search engine spider visits
function get_naps_bot() {
    // Some requests send no User-Agent header at all
    $useragent = strtolower($_SERVER['HTTP_USER_AGENT'] ?? '');
    // Lowercase User-Agent fragment => name written to the log
    $bots = array(
        'googlebot'        => 'Googlebot',
        'bingbot'          => 'Bingbot',
        'slurp'            => 'Yahoobot',
        'baiduspider'      => 'Baiduspider',
        'sogou web spider' => 'Sogouspider',
        'haosouspider'     => 'HaosouSpider',
        'yodaobot'         => 'YodaoBot',
    );
    foreach ($bots as $needle => $name) {
        if (strpos($useragent, $needle) !== false) {
            return $name;
        }
    }
    return false;
}

function nowtime() {
    // Format the current time in Asia/Shanghai without changing PHP's
    // global default timezone, which WordPress expects to remain UTC
    $now = new DateTime('now', new DateTimeZone('Asia/Shanghai'));
    return $now->format('Y-m-d G:i:s');
}

$searchbot = get_naps_bot();
if ($searchbot) {
    // HTTP_USER_AGENT is set here, otherwise get_naps_bot() would have returned false
    $tlc_thispage = addslashes($_SERVER['HTTP_USER_AGENT']);
    $addr = $_SERVER['REMOTE_ADDR'];
    $file = ABSPATH . 'robotslogs.txt'; // spider log file in the WordPress root directory
    $time = nowtime();
    $PR = $_SERVER['REQUEST_URI'];
    // Append one line per visit; LOCK_EX keeps concurrent requests from interleaving their writes
    file_put_contents($file, "[$time] - $addr - $PR - $searchbot $tlc_thispage\r\n", FILE_APPEND | LOCK_EX);
}

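Insert the code above into the theme's functions.php file and create a file named robotslogs.txt in the site root. The file name can be changed, as long as the name in the code is changed to match. Make sure robotslogs.txt is writable by the web server. Under some host configurations 755 is not enough and the write fails; 777 will work, but it makes the file world-writable, so use the tightest setting your host accepts. The code records the basic crawl information for each spider visit. Once it is in place, after roughly 24 hours robotslogs.txt should already be full of data.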
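Once the log has filled up, reading the raw file line by line becomes tedious too. As a minimal sketch, assuming each line has the shape `[time] - IP - URI - BotName user-agent` produced by the logging code, the following counts visits per spider. The function name `count_spider_hits` and the assumption that the script sits in the same directory as robotslogs.txt are illustrative choices, not part of the original code.

```php
<?php
// Hypothetical helper (not part of the original snippet): count logged visits per spider.
// Expects lines shaped like: [time] - IP - URI - BotName user-agent
function count_spider_hits(array $lines): array {
    $counts = array();
    foreach ($lines as $line) {
        // Groups: 1 = IP, 2 = URI, 3 = bot name.
        // Each separator is matched as "any non-space token", so a plain
        // hyphen and a typographic dash are both accepted.
        if (preg_match('/^\[[^\]]*\]\s+\S+\s+(\S+)\s+\S+\s+(\S+)\s+\S+\s+(\S+)/', $line, $m)) {
            $bot = $m[3];
            $counts[$bot] = ($counts[$bot] ?? 0) + 1;
        }
    }
    arsort($counts); // most active spider first
    return $counts;
}

// Assumption: this script lives next to robotslogs.txt; adjust the path otherwise.
$path = __DIR__ . '/robotslogs.txt';
if (is_readable($path)) {
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    foreach (count_spider_hits($lines) as $bot => $hits) {
        echo $bot, ': ', $hits, PHP_EOL;
    }
}
```

Lines that do not match the expected shape are skipped rather than treated as errors, so a partially written or hand-edited log still yields a usable summary.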
