Actively pushing links to Bing with Python

 2 years ago
source link: https://cjh0613.com/20200602pythonBingUrlPush.html

Published 2020-06-02. Updated 2020-08-01. 535 words, about 5 minutes to read.

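A few days ago, every POST I made with the requests library failed with a format error. Today I found the fix: the payload has to be submitted with json=data. I read through the requests source code, though, and still could not find the cause of the earlier error.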
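One likely explanation, offered as an assumption rather than something the author confirmed: when requests receives a dict through data=, it form-encodes the body, whereas json= serializes the dict as a JSON document, and an endpoint that expects JSON will reject the form-encoded version. The sketch below reproduces the two body encodings with the standard library only; the payload values are placeholders.

```python
import json
from urllib.parse import urlencode

# Placeholder payload with the same keys the article submits.
payload = {
    "siteUrl": "https://example.com",
    "url": "https://example.com/index.html",
}

# Roughly what a dict passed as data= becomes: a form-encoded string.
form_body = urlencode(payload)

# Roughly what json= sends: a JSON document.
json_body = json.dumps(payload)

print(form_body)
print(json_body)
```

The first line printed is a key=value string joined with "&", which is not valid JSON; the second is the JSON object the API expects.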

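With Bing's active push, pages get indexed very quickly (update 2020-07-03: a new page was indexed 5 minutes after it went live). See also the article "Pushing links to Bing with curl: get Bing to index your pages in minutes".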

I use it together with the code that actively pushes to the Baidu Webmaster Platform whenever hexo updates the site.

import requests

def get_(data):
    # Submit one URL to Bing. json=data serializes the dict as a JSON body.
    headers = {'User-Agent': 'curl/7.12.1',
               'Content-Type': 'application/json'}
    try:
        # Replace YOUR_API_KEY with the key from Bing Webmaster Tools.
        r = requests.post(url='https://ssl.bing.com/webmaster/api.svc/json/SubmitUrl?apikey=YOUR_API_KEY',
                          json=data, headers=headers)
        print(r.status_code)
        print(r.content)
    except Exception as e:
        print(e)

cjhpush = {
    "siteUrl": "https://cjh0613.github.io",          # your site
    "url": "https://cjh0613.github.io/index.html"    # the page to submit
}
print(cjhpush)
get_(cjhpush)

After logging in to Bing Webmaster Tools, you can check the push results here:

https://www.bing.com/webmasters/submiturl?siteUrl=YOUR_SITE_URL
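A small sketch of filling in that address for a given site. Percent-encoding the siteUrl value is the generic practice for query parameters; whether Bing strictly requires it is an assumption, and results_url is a hypothetical helper name.

```python
from urllib.parse import urlencode

def results_url(site_url):
    """Build the results page address for one site (hypothetical helper)."""
    base = "https://www.bing.com/webmasters/submiturl?"
    # urlencode percent-encodes ":" and "/" inside the parameter value.
    return base + urlencode({"siteUrl": site_url})

link = results_url("https://example.com/")
print(link)
```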

Fetching links from the sitemap and pushing them

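This code targets Google-format sitemaps (with ISO timestamps such as 2020-05-28T10:54:43.663Z). With hexo, it works as soon as hexo-generator-sitemap is installed. It should also work for most other sites.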
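In this sitemap format, each page is a url element that holds its own loc and lastmod. As a minimal illustration of reading those pairs, here is a sketch that uses only the standard library instead of BeautifulSoup; the inline sample and the read_entries helper are made up for the example.

```python
import xml.etree.ElementTree as ET

# Tiny inline sitemap used only for illustration.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/a.html</loc>
    <lastmod>2020-05-28T10:54:43.663Z</lastmod>
  </url>
  <url>
    <loc>https://example.com/b.html</loc>
  </url>
</urlset>"""

# Sitemap elements live in this XML namespace.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def read_entries(xml_text):
    """Return (loc, lastmod) pairs; lastmod is None when an entry has none."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default=None, namespaces=NS)
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=NS)
        if loc:
            entries.append((loc.strip(), lastmod.strip() if lastmod else None))
    return entries

entries = read_entries(SAMPLE)
for loc, lastmod in entries:
    print(loc, lastmod)
```

Reading both values from the same url element keeps each link attached to its own timestamp even when some entries have no lastmod.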

It pushes the links of pages whose sitemap timestamp is within 600 seconds of the current time.
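The 600-second check can be sketched on its own as follows. This is a standard-library illustration, not the author's code: parse_lastmod and is_recent are hypothetical names, and timestamps without an offset are assumed to be UTC. It compares with total_seconds(), because the seconds attribute of a timedelta holds only the part below one day.

```python
from datetime import datetime, timezone

def parse_lastmod(text):
    """Parse an ISO timestamp such as 2020-05-28T10:54:43.663Z as aware UTC."""
    # Older fromisoformat() versions reject a trailing "Z", so spell out the offset.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return parsed

def is_recent(lastmod_text, now, window_seconds=600):
    """True when lastmod is at most window_seconds before now."""
    age = (now - parse_lastmod(lastmod_text)).total_seconds()
    return 0 <= age < window_seconds

# Fixed "now" so the example is reproducible.
now = datetime(2020, 5, 28, 11, 0, 0, tzinfo=timezone.utc)
print(is_recent("2020-05-28T10:54:43.663Z", now))  # about 316 seconds old
print(is_recent("2020-05-27T10:54:43.663Z", now))  # one day older than that
```

The first call prints True and the second prints False, although the seconds attribute alone would read 316 in both cases.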

import datetime
import sys
import time

import dateutil.parser
import requests
from bs4 import BeautifulSoup as bp

def get_(data):
    # Submit one URL to Bing. json=data serializes the dict as a JSON body.
    headers = {'User-Agent': 'curl/7.12.1',
               'Content-Type': 'application/json'}
    try:
        r = requests.post(url='https://ssl.bing.com/webmaster/api.svc/json/SubmitUrl?apikey=APIKEY',
                          json=data, headers=headers)
        print(r.status_code)
        print(r.content)
    except Exception as e:
        print(e)

print('start....')
time.sleep(0.5)

site_url = 'https://cjh0613.com/google-sitemap.xml'

try:
    print('Get sitemap....')
    data_ = bp(requests.get(site_url).content, 'lxml')
except Exception as e:
    print(e)
    sys.exit(1)  # nothing to push without the sitemap

list_url = []
now = datetime.datetime.now(datetime.timezone.utc)

print('---------------------------------')
# Walk each <url> entry so every <loc> stays paired with its own <lastmod>.
for x, entry in enumerate(data_.find_all('url')):
    loc = entry.find('loc')
    lastmod = entry.find('lastmod')
    if loc is None or lastmod is None:
        continue
    print(x, loc.string)
    startTime = dateutil.parser.parse(lastmod.string)
    if startTime.tzinfo is None:
        # Timestamps without an offset are treated as UTC.
        startTime = startTime.replace(tzinfo=datetime.timezone.utc)
    # total_seconds() also counts whole days, unlike timedelta.seconds.
    seconds = (now - startTime).total_seconds()
    if 0 <= seconds < 600:  # Can be modified
        list_url.append(loc.string)

print('---------------------------------')
print(list_url)
print('submit....')

for cjhurl in list_url:
    print('now:', cjhurl)

    cjhpush = {
        "siteUrl": "web",  # Needs modifying: your site URL
        "url": cjhurl
    }
    print(cjhpush)
    get_(cjhpush)
