不多BB,但是排面还是要的

前言

今天翻了翻handsome论坛,看见了Rabbit大佬的文章,为Typecho添加百度收录检测
我瞟了眼评论区,好像都是反应不能用(不管有没有收录,都显示已收录)
然后我又看了看代码,一看便知,像百度这种的搜索引擎,必然是天天有人去爬,所以需要加一些模拟UA之类的东西进去,绕过百度的反爬虫

过程

于是我提起键盘就开始喷...呸,开始写代码
首先试了试直接使用file_get_contents来获取,正如我所料
它来了,它来了,反爬虫验证踏着翔XXXXX

嘿,接下来就简单了,换成curl模拟一段header继续试

User-Agent:Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.162 Safari/537.36
Referer:https://www.baidu.com

一下子就出来了 So easy

食用教程

注意

不知道为啥,我这小几率会很慢很慢,用原来直接php执行肯定会出大问题
于是我将检测做成了API,获取收录使用Jquery,这下就没事了

修改

下面就是需要动手的地方了
1.在模板post.php加入(我是handsome,其他模板请查看更多->其他模板

<!--百度收录-->
<li class="meta-baidu"><span class="post-icons"><i class="glyphicon glyphicon-refresh" id="baidu_icon"></i></span><span class="meta-value" id="baidu_result">加载中</span></li>

2.还是在post.php内加入(必须引入jquery)

<script>
    function baidu_check(){
        $.getJSON("https://cn1.api.wfblog.net/baidu.php?domain="+window.location.href,function(result){ 
            if (result.code == 200) {
                $('#baidu_icon').removeClass('glyphicon-refresh');
                $('#baidu_icon').addClass('glyphicon-ok-circle');
                $('#baidu_result').text('百度已收录');
            }else if(result.code == 403){
                $('#baidu_icon').removeClass('glyphicon-refresh');
                $('#baidu_icon').addClass('glyphicon-info-sign');
                $('#baidu_result').text('百度未收录');
                baidu_push();
            }else{
                 $('#baidu_icon').removeClass('glyphicon-refresh');
                $('#baidu_icon').addClass('glyphicon-remove-circle');
                $('#baidu_result').text('查询收录失败');
            }
        });
    }
    function baidu_push(){
        var bp = document.createElement('script');
        var curProtocol = window.location.protocol.split(':')[0];
        if (curProtocol === 'https') {
            bp.src = 'https://zz.bdstatic.com/linksubmit/push.js';        
        } else {
            bp.src = 'http://push.zhanzhang.baidu.com/push.js';
        }
        var s = document.getElementsByTagName("script")[0];
        s.parentNode.insertBefore(bp, s);
    }
    baidu_check();
</script>

更多

其他模板

1.在模板functions.php末尾合适处加入以下代码

function baidu_check() {
    $url = baidu_url();
    $api = 'https://cn1.api.wfblog.net/baidu.php?domain='; //更改为你自己的API
    $result = json_decode(file_get_contents($api.$url));
    if($result['code'] == 200){
        echo '百度已收录';
    }elseif($result['code'] == 403){
        echo '<a style="color:red;" rel="external nofollow" title="点击提交收录" target="_blank" href="http://zhanzhang.baidu.com/sitesubmit/index?sitename='.$url.'">百度未收录</a>';
    }else{
        echo '查询收录失败';
    }
}
function baidu_url(){
    if((isset($_SERVER['HTTPS']) && $_SERVER['HTTPS'] == 'on') || (isset($_SERVER['HTTP_X_FORWARDED_PROTO']) && $_SERVER['HTTP_X_FORWARDED_PROTO'] == 'https')){
        return 'https'.'://'.$_SERVER['HTTP_HOST'].$_SERVER['REQUEST_URI'];
    }else{
        return 'http'.'://'.$_SERVER['HTTP_HOST'].$_SERVER['REQUEST_URI'];
    }
}

2.然后在你需要输出检测结果的地方加入<?php baidu_check(); ?>即可

API源码

我不在提供API了,撑不住了!!!

<?php
/**
 * Baidu
 * @editer: Weifeng
 * @link: https://wfblog.net
 * @version: 1.0
 */

error_reporting(0);
header("Access-Control-Allow-Origin:*");
header('Content-type: application/json');

$domain = @$_GET['domain'];
if(!isset($domain) || empty($domain) || $domain==''){
    $data = array(
        "code" => false,
        "msg" => "未传入请求参数!"
    );
    echo json_encode($data,JSON_UNESCAPED_UNICODE);
    exit;
}
if(substr($domain, -1) == '/'){
    $domain = substr($domain,0,strlen($domain)-1);
}

$data = checkBaidu($domain);
echo json_encode($data,JSON_UNESCAPED_UNICODE);

function checkBaidu($url){
    $header = array(
        "Host:www.baidu.com",
        "Content-Type:application/x-www-form-urlencoded",//post请求
        "Connection: keep-alive",
        "Referer:https://www.baidu.com",
        "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.162 Safari/537.36"
    );
    $url = 'https://www.baidu.com/s?ie=UTF-8&wd='.urlencode($url).'&usm=3&rsv_idx=2&rsv_page=1';
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt ($ch, CURLOPT_HTTPHEADER, $header);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION,1);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $output = curl_exec($ch);
    curl_close($ch);
    if(strpos($output, '没有找到') || strpos($output, '很抱歉')){
        $data = array(
            "code" => 403,
            "msg" => "该域名暂时未被百度收录!"
        );
    }else{
        $number = GetBetween($output,'<span class="nums_text">百度为您找到相关结果约','个</span>');
        if(empty($number) || $number == 0){
            $number = GetBetween($output,'<b>找到相关结果数约','个</b></p>');
            if(empty($number) || $number == 0){
                $data = array(
                    "code" => false,
                    "msg" => "获取百度收录失败!"
                );
                return $data;
            }
        }
        $data = array(
            "code" => 200,
            "msg" => "该域名已被百度收录!",
            "number" => str_replace(',','',$number)
        );
    }
    return $data;
}

function GetBetween($content,$start,$end){
    $r = explode($start, $content);
    if (isset($r[1])){
        $r = explode($end, $r[1]);
        return $r[0];
    }
}
?>

效果

请看本博客的文章标题下方

最后修改:2020 年 07 月 08 日 09 : 57 AM
如果觉得我的文章对你有用,请随意赞赏