title: Automatically Notify Google to Update Site Map After Blog Updates
date: 2021-01-04 21:57:37
The last post on this blog was published on December 29, 2020. Every day since then I have searched for it on Google, only to find that it still has not been indexed.
(Which exposes the fact that I search for myself obsessively XD)
I checked Google Search Console manually and found that the sitemap had last been updated on December 19... This leads to today's question: how to get Google to refresh the sitemap automatically.
First, we need a way to tell Google that the sitemap has changed. The documentation says:
Use the "ping" function to request that we crawl the sitemap. Send an HTTP GET request as follows:
http://www.google.com/ping?sitemap=<your sitemap>
So the idea is clear: send a GET request to this address after the blog is deployed.
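As a minimal sketch of that idea, the request can be wrapped in two small shell functions. The function names `build_ping_url` and `ping_google` are hypothetical helpers made up for illustration, and the example sitemap address is a placeholder; only the ping endpoint itself comes from the post.

```shell
#!/usr/bin/env bash

# Build the ping URL for a given sitemap address.
# The endpoint is the one quoted in this post; the function name is hypothetical.
build_ping_url() {
  local sitemap_url="$1"
  printf 'http://www.google.com/ping?sitemap=%s\n' "$sitemap_url"
}

# Send the GET request.
# -f makes curl exit non-zero on an HTTP error status,
# -sS hides the progress bar but still prints error messages.
ping_google() {
  curl -fsS "$(build_ping_url "$1")" > /dev/null
}

# Example call (placeholder sitemap address):
# ping_google "https://example.com/sitemap.xml"
```

Quoting the URL keeps the shell from treating the `?` as a glob character, and `-f` lets a failed ping show up as a failed command instead of passing silently.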
For those who deploy their blogs with GitHub Actions, the process is simple. Just add the line
curl "http://www.google.com/ping?sitemap=<your sitemap>"
after the deployment command in the yml file.
Because GitHub Pages is slow to access, I host the blog on Vercel, which does not offer anything like a post-deployment command. That left me with no way to hook precisely into its deployment process.
So I asked the NEU LUG group for help, and one member offered a simple, brute-force solution: forget the hook and just ping on a schedule!
I was going to do exactly that, but then I considered how rarely my blog is updated, and this passage in the Google documentation:
Please only send sitemap-related notifications to Google when creating or updating the sitemap. If there are no changes to the sitemap, please do not submit or ping the sitemap to us multiple times.
To avoid any issues, I dropped the idea. Still, it gave me an insight: there is no need to insist on a hook as long as the desired effect is achieved.
After some thought, I came up with the following method:
When the blog is updated (a push to the master branch of the repository), the push triggers a Vercel build and a GitHub Actions workflow at the same time. The workflow checks the commit message, and if it contains the phrase "更新博文" ("update blog post"), it runs these steps:
- Wait 3 minutes (my blog takes 1-2 minutes to build on Vercel, so 3 minutes leaves a margin for error).
- Send a GET request to notify Google that the sitemap has changed.
The complete yml workflow file is as follows:
name: ping-google
on:
  push:
    branches:
      - master
jobs:
  ping-google:
    if: "contains(github.event.head_commit.message, '更新博文')"
    runs-on: ubuntu-latest
    steps:
      - name: wait for build
        run: sleep 3m
      - name: ping google to update sitemap.xml
        run: curl http://www.google.com/ping?sitemap=https://blog.allwens.work/sitemap.xml
It has been tested and works well.
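The commit message gate can also be tried out locally before pushing. The following is a rough shell equivalent of the workflow's `contains(...)` condition, not part of the workflow itself; the function name `should_ping` is a hypothetical helper, and the keyword defaults to the one used in this post.

```shell
#!/usr/bin/env bash

# Succeed (exit status 0) when the commit message contains the keyword.
# $1: commit message, $2: keyword (defaults to the phrase used in this post).
# should_ping is a hypothetical name chosen for this sketch.
should_ping() {
  local message="$1"
  local keyword="${2:-更新博文}"
  case "$message" in
    *"$keyword"*) return 0 ;;  # keyword found anywhere in the message
    *) return 1 ;;             # keyword absent
  esac
}

# Example: check the most recent commit in the current repository.
# if should_ping "$(git log -1 --pretty=%B)"; then
#   echo "this commit would trigger the ping"
# fi
```

This shell match is case sensitive, which makes no difference for a keyword like 更新博文.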
This approach does have a drawback: GitHub Actions limits free accounts to 2,000 minutes per month. Even though nothing happens during the three-minute sleep, that time still counts toward your GitHub Actions usage. (But since I never use up those minutes anyway, and my blog is updated very rarely, the cost is acceptable, haha)
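To put a number on that cost, here is a back-of-the-envelope sketch. The function names are hypothetical, and the figures in the example are assumptions rather than measurements: it supposes each run is billed as 4 minutes (the 3-minute sleep plus runner startup, rounded up to a whole minute) and that the blog is updated 4 times a month.

```shell
#!/usr/bin/env bash

# Estimate the Actions minutes consumed in a month.
# $1: blog updates per month, $2: billed minutes per workflow run (assumed).
estimate_minutes() {
  local posts="$1"
  local per_run="$2"
  echo $(( posts * per_run ))
}

# Minutes left in the monthly allowance after that usage.
# $1: minutes used, $2: monthly allowance of your plan.
remaining_minutes() {
  local used="$1"
  local quota="$2"
  echo $(( quota - used ))
}

# Example with assumed figures: 4 updates a month at 4 billed minutes each.
# used=$(estimate_minutes 4 4)          # 16
# remaining_minutes "$used" "$quota"    # set quota to your plan's allowance
```

With these assumed figures the workflow uses 16 minutes a month, a tiny fraction of any monthly allowance measured in thousands of minutes.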