amtoaer

晓风残月

Automatically notify Google to update the sitemap after updating blog articles.

My previous blog post was published on December 29, 2020. Since then, I have searched for it on Google every day, only to find that it still has not been indexed.

(Which exposes the fact that I obsessively search for my own posts XD)

I checked the Google Search Console page by hand and found that the sitemap had last been updated on December 19... Which leads to today's question: how can I get Google to update the sitemap automatically?

Exploration Process

First, I needed a way to ask Google to update the sitemap. The documentation says:

Use the "ping" feature to request that we crawl your sitemap. Send an HTTP GET request like this:
http://www.google.com/ping?sitemap={complete_url_of_sitemap}
For example:
http://www.google.com/ping?sitemap=https://example.com/sitemap.xml

So the idea is clear: I just need to send a GET request to this URL after the blog is deployed.
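
As a minimal sketch of that request, the shell snippet below builds the ping URL from a sitemap address and sends it with curl. The function names build_ping_url and ping_google are made up for illustration, and the sketch assumes that curl is installed and that the ping endpoint quoted from the documentation is reachable.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of any Google tooling.

# Print the ping URL for a given sitemap address,
# following the pattern quoted from the documentation.
build_ping_url() {
  printf 'http://www.google.com/ping?sitemap=%s\n' "$1"
}

# Send the GET request. -f makes curl exit non-zero on an HTTP error,
# -sS hides the progress bar but still prints error messages.
ping_google() {
  curl -fsS -o /dev/null "$(build_ping_url "$1")"
}

# Usage (sends a real request, so it is left commented out):
# ping_google "https://example.com/sitemap.xml"
```

Quoting the URL matters here: an unquoted "?" is a glob character in the shell, so passing the URL as a quoted string is the safer habit.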

Solution

GitHub Actions

For users who deploy their blogs with GitHub Actions, this is very simple. Just add the line curl "http://www.google.com/ping?sitemap=<your sitemap>" after the deployment command in the yml file. (The quotes keep the shell from interpreting the "?" in the URL.)

Vercel

Because GitHub Pages is slow, I use Vercel instead, and Vercel does not provide anything like an "after deployment" command. This means I cannot hook into its deployment process precisely.

So I asked the NEU LUG group for help, and one member offered a simple, brute-force solution: never mind hooking, just ping it every once in a while!

Figure 1

I was originally going to do exactly that, but then I considered how rarely my blog is updated, along with this passage from the Google documentation:

Please only submit or ping your sitemap to us when you create or update your sitemap. If your sitemap doesn't change, there's no need to resubmit it multiple times.

To stay on the safe side, I dropped this approach. But it did give me an idea: I don't need to insist on a real hook as long as I get the same result.

After thinking about it, I came up with the following method:

When I update the blog (push to the master branch of the repository), the Vercel build and a GitHub Actions workflow are triggered at the same time. The workflow checks the commit message, and if it contains the phrase "更新博文" ("update blog post"), it runs the following steps:

  1. Wait for 3 minutes (the build on Vercel takes 1-2 minutes, so waiting 3 minutes leaves a safety margin).
  2. Send a GET request to notify Google to update the sitemap.

The specific yml workflow file is as follows:

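name: ping-google
on:
  push:
    branches:
      - master
jobs:
  ping-google:
    # Run only when the commit message contains "更新博文" (update blog post).
    if: "contains(github.event.head_commit.message, '更新博文')"
    runs-on: ubuntu-latest
    steps:
      # Give the Vercel build time to finish before pinging.
      - name: wait for build
        run: sleep 3m
      # Quote the URL so the shell does not treat "?" as a glob character.
      - name: ping google to update sitemap.xml
        run: curl "http://www.google.com/ping?sitemap=https://blog.allwens.work/sitemap.xml"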

It has been tested and works well.
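
For trying the same logic outside of Actions, here is a rough shell sketch of the two pieces: the commit-message check and the wait-then-ping step. The names KEYWORD, SITEMAP_URL, should_ping, and wait_and_ping are my own placeholders, and the substring match only approximates what contains(...) does in the workflow condition.

```shell
#!/usr/bin/env bash
# Sketch only: mirrors the workflow above as a standalone script.
# KEYWORD, SITEMAP_URL, and the function names are illustrative assumptions.

KEYWORD='更新博文'
SITEMAP_URL='https://blog.allwens.work/sitemap.xml'

# Succeed (exit status 0) if the commit message contains the keyword,
# roughly like contains(...) in the workflow's "if" condition.
should_ping() {
  case "$1" in
    *"$KEYWORD"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Wait for the Vercel build, then ping Google.
# The delay in seconds is the first argument (default 180, i.e. 3 minutes).
wait_and_ping() {
  sleep "${1:-180}"
  curl -fsS -o /dev/null "http://www.google.com/ping?sitemap=${SITEMAP_URL}"
}

# Usage (commented out because it sleeps and sends a real request):
# msg="$(git log -1 --pretty=%B)"
# if should_ping "$msg"; then wait_and_ping; fi
```

Keeping the check in its own function makes it easy to test a few commit messages by hand before relying on the workflow.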

This approach does have a downside: Github Actions has a usage limit of 3000 minutes per month for free users, and although the workflow does nothing during the three-minute sleep, that time still counts toward the limit. (But since I never use up those minutes anyway and my blog is rarely updated, the cost is acceptable, haha.)

References

  1. Checking the commit message in GitHub Actions: Support [skip ci] out of box with github actions
