
[Original] Google SEO Tutorial: Getting New Pages Crawled Right Away with the Google Indexing API

Author: 极客中心
Published: 2021-01-21 15:46:43


Contents

1 Google SEO Tutorial: Getting New Pages Crawled Right Away with the Google Indexing API
2 Get the Indexing API private key file (JSON format)
3 Note the service account email address
4 Grant the service account permissions in the site settings
5 Node.js code that calls the Google Indexing API
5.1 Solution

Alternative title: How to use the Google Indexing API with Node.js

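The previous article, "Google SEO动态之Request Indexing功能停用" (Google SEO news: the Request Indexing feature is disabled), noted that Google suspended the Request Indexing feature on October 14, 2020. I also promised to share a workaround, which is to use the Google Indexing API. This article walks through how to set it up.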

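Get the Indexing API private key file (JSON format)

Open the Google service accounts page:

https://console.cloud.google.com/projectselector2/iam-admin/serviceaccounts?supportedpurview=project

Service accounts are also listed on the credentials page at https://console.developers.google.com/apis/credentials.

In the "Service account details" view (from https://console.cloud.google.com/iam-admin/serviceaccounts/details/), click the Create Key button to download the file containing the API key. JSON format is recommended.

After downloading, rename the file to service_account.json so that the code below can load it.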

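Note the service account email address

On the Google service accounts page in Google Cloud, find the service account's email address (Email for Service account). It looks like this:

indexing-api-runner@xxx.iam.gserviceaccount.com

Write it down; you will need it in the next step.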
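The same address is also stored in the downloaded key file, so it can be read from there as a quick sanity check. The sketch below is only an illustration: the helper name getServiceAccountEmail is hypothetical, and it assumes the key file contains the client_email and private_key fields that the code later in this article reads.

```javascript
// Hypothetical helper: validate a parsed service_account.json object and
// return the service account email that must be authorized in Search Console.
function getServiceAccountEmail(key) {
    for (const field of ["client_email", "private_key"]) {
        if (typeof key[field] !== "string" || key[field].length === 0) {
            throw new Error("service account key is missing field: " + field);
        }
    }
    return key.client_email;
}

// Example with a placeholder key object (not a real credential).
const sampleKey = {
    client_email: "indexing-api-runner@xxx.iam.gserviceaccount.com",
    private_key: "-----BEGIN PRIVATE KEY-----\nplaceholder\n-----END PRIVATE KEY-----\n"
};
console.log(getServiceAccountEmail(sampleKey));
```

In a real script, pass require("./service_account.json") instead of the placeholder object.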

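Grant the service account permissions in the site settings

In Google Search Console, add the service account email address as an owner of your site.

If you skip this step, running the Node.js code later in this article returns the following error: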

{
  "error": {
    "code": 403,
    "message": "Permission denied. Failed to verify the URL ownership.",
    "status": "PERMISSION_DENIED"
  }
}
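When the call is scripted, it can help to turn that response into an actionable hint. The sketch below is one possible way to do so; explainIndexingError is a hypothetical helper, and it only assumes that the error body has the shape shown above.

```javascript
// Hypothetical helper: map an Indexing API error body to a short hint.
function explainIndexingError(body) {
    if (!body || !body.error) {
        return null; // not an error response
    }
    const { code, status, message } = body.error;
    if (code === 403 && status === "PERMISSION_DENIED") {
        return "Add the service account email as an owner of the site " +
            "in Google Search Console. (" + message + ")";
    }
    return "Request failed with " + code + " " + status + ": " + message;
}

// The 403 response from this article, used as sample input.
const denied = {
    error: {
        code: 403,
        message: "Permission denied. Failed to verify the URL ownership.",
        status: "PERMISSION_DENIED"
    }
};
console.log(explainIndexingError(denied));
```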

Node.js code that calls the Google Indexing API

Use the Node.js library google-api-nodejs-client to obtain an OAuth token.

Prepare the Node.js environment:

npm install googleapis
npm install request

The original code:

// indexing.js: notify Google that a page was added or updated.
const request = require("request");
const { google } = require("googleapis");
const key = require("./service_account.json");

// Authenticate as the service account with the Indexing API scope.
const jwtClient = new google.auth.JWT({
    email: key.client_email,
    key: key.private_key,
    scopes: ["https://www.googleapis.com/auth/indexing"]
});

jwtClient.authorize(function (err, tokens) {
    if (err) {
        console.error(err);
        return;
    }
    const options = {
        url: "https://indexing.googleapis.com/v3/urlNotifications:publish",
        method: "POST",
        // The request must include the Content-Type and auth headers.
        headers: {
            "Content-Type": "application/json"
        },
        auth: { "bearer": tokens.access_token },
        // Request body: the page URL and the notification type.
        json: {
            "url": "https://www.geekzl.com/why-name-jekyll.html",
            "type": "URL_UPDATED"
        }
    };
    request(options, function (error, response, body) {
        // Report network errors; otherwise print the API response.
        if (error) {
            console.error(error);
            return;
        }
        console.log(body);
    });
});
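The request body in the code above is hard-coded for a single page. A small sketch of how it could be built per page follows; buildNotification is a hypothetical helper, not part of any library. Besides URL_UPDATED, the Indexing API also accepts URL_DELETED for pages that have been removed.

```javascript
// Notification types accepted by urlNotifications:publish.
const NOTIFICATION_TYPES = ["URL_UPDATED", "URL_DELETED"];

// Hypothetical helper: build the JSON body for one page.
function buildNotification(pageUrl, type = "URL_UPDATED") {
    if (!NOTIFICATION_TYPES.includes(type)) {
        throw new Error("unsupported notification type: " + type);
    }
    // new URL() throws a TypeError if pageUrl is not an absolute URL.
    const parsed = new URL(pageUrl);
    return { url: parsed.href, type: type };
}

console.log(buildNotification("https://www.geekzl.com/why-name-jekyll.html"));
```

The returned object can be passed as the json option of the request.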

Running it directly with node ./indexing.js fails with the following error:

Error while trying to retrieve access token { FetchError: request to https://oauth2.googleapis.com/token failed, reason: connect ETIMEDOUT 216.58.200.10:443
    at ClientRequest.<anonymous> (/Users/hesk/Documents/localize-spreadsheet-bot/node_modules/node-fetch/lib/index.js:1453:11)
    at ClientRequest.emit (events.js:180:13)
    at TLSSocket.socketErrorListener (_http_client.js:395:9)
    at TLSSocket.emit (events.js:180:13)
    at emitErrorNT (internal/streams/destroy.js:64:8)
    at process._tickCallback (internal/process/next_tick.js:178:19)
  message: 'request to https://oauth2.googleapis.com/token failed, reason: connect ETIMEDOUT 216.58.200.10:443',
  type: 'system',
  errno: 'ETIMEDOUT',
  code: 'ETIMEDOUT',
  config:
   { method: 'POST',
     url: 'https://oauth2.googleapis.com/token',
     data: 'code=4%2FQgFCT-LEUxcnDljD1DMn9olKwYQVJ9bVxiaZJMmUgT7fPAyu5Gc14Ro&client_id=875537178561-5j56883h195m8e8lrggah3fes3gh253t.apps.googleusercontent.com&client_secret=6IkI8HtPvcmXU7XORCgKg7TR&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&grant_type=authorization_code&code_verifier=',
     headers:
      { 'Content-Type': 'application/x-www-form-urlencoded',
        'User-Agent': 'google-api-nodejs-client/3.1.2',
        Accept: 'application/json' },
     params: {},
     paramsSerializer: [Function: paramsSerializer],
     body: 'code=4%2FQgFCT-LEUxcnDljD1DMn9olKwYQVJ9bVxiaZJMmUgT7fPAyu5Gc14Ro&client_id=875537178561-5j56883h195m8e8lrggah3fes3gh253t.apps.googleusercontent.com&client_secret=6IkI8HtPvcmXU7XORCgKg7TR&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&grant_type=authorization_code&code_verifier=',
     validateStatus: [Function: validateStatus],
     responseType: 'json' } }

Solution:

Add an HTTP proxy to the Node.js code. You need the address of a proxy that can actually reach Google, and that address has to be set in the code:

process.env.http_proxy = 'http://10.179.8.31:9090';  /* Set proxy */
process.env.HTTPS_PROXY = 'http://10.179.8.31:9090';
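Hard-coding the proxy address ties the script to one machine. As an alternative sketch, the address could be taken from an environment variable that is already set, with a fallback used only when none is present. The helper resolveProxy and its fallback parameter are hypothetical.

```javascript
// Hypothetical helper: pick a proxy URL from the environment, if any.
// Checks the common upper- and lower-case variable names in order.
function resolveProxy(env, fallback) {
    const names = ["HTTPS_PROXY", "https_proxy", "HTTP_PROXY", "http_proxy"];
    for (const name of names) {
        if (env[name]) {
            return env[name];
        }
    }
    return fallback || null;
}

// Example with a plain object standing in for process.env.
const proxy = resolveProxy({ http_proxy: "http://10.179.8.31:9090" }, null);
console.log(proxy);
```

In the actual script, resolveProxy(process.env, "http://10.179.8.31:9090") would keep the address from this article as the default.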

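Of course, if your browser can reach Google (for example through a Chrome proxy extension), you can run the Node.js code directly on Repl.it instead.

Online Node.js environment:

Repl.it - Node.js Online Compiler and IDE - Fast, Powerful, Free

https://repl.it/languages/nodejs

File structure: indexing.js and service_account.json sit in the same directory.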

The improved Node.js code:

// indexing.js: same request as before, sent through a proxy.
const request = require("request");
const { google } = require("googleapis");
const key = require("./service_account.json");

// Route outgoing requests through a proxy that can reach Google.
process.env.http_proxy = 'http://10.179.8.31:9090';  /* Set proxy */
process.env.HTTPS_PROXY = 'http://10.179.8.31:9090';

// Authenticate as the service account with the Indexing API scope.
const jwtClient = new google.auth.JWT({
    email: key.client_email,
    key: key.private_key,
    scopes: ["https://www.googleapis.com/auth/indexing"]
});

jwtClient.authorize(function (err, tokens) {
    if (err) {
        console.error(err);
        return;
    }
    const options = {
        url: "https://indexing.googleapis.com/v3/urlNotifications:publish",
        method: "POST",
        // The request must include the Content-Type and auth headers.
        headers: {
            "Content-Type": "application/json"
        },
        auth: { "bearer": tokens.access_token },
        // Request body: the page URL and the notification type.
        json: {
            "url": "https://www.geekzl.com/jsdelivr-not-update.html",
            "type": "URL_UPDATED"
        }
    };
    request(options, function (error, response, body) {
        // Report network errors; otherwise print the API response.
        if (error) {
            console.error(error);
            return;
        }
        console.log(body);
    });
});

Run it again:

bravo@BR MINGW64 /d/coding/GitHub/google-index-api
$ node ./indexing.js

Result:

{
  urlNotificationMetadata: {
    url: 'https://www.geekzl.com/jsdelivr-not-update.html',
    latestUpdate: {
      url: 'https://www.geekzl.com/jsdelivr-not-update.html',
      type: 'URL_UPDATED',
      notifyTime: '2020-10-16T08:14:24.510420447Z'
    }
  }
}
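To use this result in a script rather than just printing it, the fields of interest can be pulled out of the response. The following is a minimal sketch that assumes the response has the shape shown above; summarizeNotification is a hypothetical helper.

```javascript
// Hypothetical helper: extract type, url and notify time from a publish response.
function summarizeNotification(body) {
    const meta = body && body.urlNotificationMetadata;
    if (!meta || !meta.latestUpdate) {
        return null; // unexpected shape, e.g. an error response
    }
    const { url, type, notifyTime } = meta.latestUpdate;
    return type + " " + url + " at " + notifyTime;
}

// The response from this article, used as sample input.
const response = {
    urlNotificationMetadata: {
        url: "https://www.geekzl.com/jsdelivr-not-update.html",
        latestUpdate: {
            url: "https://www.geekzl.com/jsdelivr-not-update.html",
            type: "URL_UPDATED",
            notifyTime: "2020-10-16T08:14:24.510420447Z"
        }
    }
};
console.log(summarizeNotification(response));
```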

References:

Google's officially supported Node.js client library for accessing Google APIs. Support for authorization and authentication with OAuth 2.0, API Keys and JWT (Service Tokens) is included.
From https://googleapis.dev/nodejs/googleapis/latest/

Auth error: ETIMEDOUT #283 - set proxy
From https://github.com/googleapis/google-auth-library-nodejs/issues/283#issuecomment-563285724

How to get new pages or site updates indexed by Google quickly
From https://builtvisible.com/how-do-you-get-new-pages-indexed-or-your-site-re-crawled/

How to request Google to re-crawl my website?
From https://stackoverflow.com/questions/9466360/how-to-request-google-to-re-crawl-my-website

Prerequisites for using the Google Indexing API
From https://developers.google.com/search/apis/indexing-api/v3/prereqs

Google Indexing API - 403 'Forbidden Response'

Index API: Permission denied. Failed to verify the URL ownership.
From https://support.google.com/webmasters/thread/4763732?hl=en

Understanding service accounts (official Google documentation)
From https://cloud.google.com/iam/docs/understanding-service-accounts#managing_service_account_keys


Originally published on the author's personal site/blog on 2020-11-29 and shared through the Tencent Cloud self-media sharing program (腾讯云自媒体分享计划).
