[Python Web Scraping for Absolute Beginners] Scraping Baidu Images: Simple Enough for a Kid to Learn

First, the result (screenshot):
[image]
Required imports

import re
import requests
import os

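The scraper has to make network requests, so it needs the requests package; re and os ship with Python's standard library. If requests is missing, install it yourself with pip install requests.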

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.125 Safari/537.36'}

The full request, where name is the search keyword and i is the page index:

url = 'https://image.baidu.com/search/flip?tn=baiduimage&ie=utf-8&word=' + name + '&pn=' + str(i * 30)
result = requests.get(url, headers=headers)
dowmloadPic(result.content.decode(), name, i)
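
Gluing the URL together by hand is easy to get wrong, especially when the keyword contains Chinese characters or spaces. The sketch below shows one way to build the same query string with the standard library's urlencode. The helper name build_search_url and its page_step parameter are made up for this illustration; the query parameters tn, ie, word and pn are the ones from the post.

```python
from urllib.parse import urlencode

BASE_URL = 'https://image.baidu.com/search/flip'


def build_search_url(keyword, page, page_step=30):
    """Return the search URL for one result page (hypothetical helper)."""
    params = {
        'tn': 'baiduimage',
        'ie': 'utf-8',
        'word': keyword,         # urlencode percent-encodes non-ASCII text
        'pn': page * page_step,  # result offset, like str(i*30) in the post
    }
    return BASE_URL + '?' + urlencode(params)


print(build_search_url('cat', 0))
print(build_search_url('猫', 2))
```

The returned string can be passed to requests.get(url, headers=headers) exactly like the hand-built one.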

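Once we have the HTML, a regular expression pulls out the image addresses. Each one appears in the page source as the value of an "objURL" key: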

pic_url = re.findall(r'"objURL":"(.*?)",', html, re.S)
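
To see what the pattern actually captures, here is a small sketch that runs it on a made-up fragment instead of a live page. The sample text and the helper name extract_image_urls are invented for illustration; only the pattern itself comes from the post. The real page source may look different.

```python
import re

# Made-up fragment in the shape the post's pattern expects.
sample_html = (
    '{"thumbURL":"https://example.com/t1.jpg",'
    '"objURL":"https://example.com/full1.jpg","fromURL":"x"},'
    '{"thumbURL":"https://example.com/t2.jpg",'
    '"objURL":"https://example.com/full2.png","fromURL":"y"}'
)

# (.*?) is non-greedy, so each match stops at the first '",' after the URL.
OBJ_URL_PATTERN = re.compile(r'"objURL":"(.*?)",', re.S)


def extract_image_urls(html):
    """Return every objURL value found in the page source, in order."""
    return OBJ_URL_PATTERN.findall(html)


urls = extract_image_urls(sample_html)
print(urls)
```

Only the objURL values come back; the thumbURL entries are skipped because the pattern requires the exact key name.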

Finally, write the bytes of each downloaded image to a file:

with open(dir, 'wb') as fp:
    fp.write(pic.content)
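
The saving step can be tried out without any network access. The sketch below writes stand-in bytes into a temporary folder; the helper name save_image and the simplified file name keyword_index.jpg are assumptions for this example, not the post's exact naming.

```python
import os
import tempfile


def save_image(data, folder, keyword, index):
    """Write raw bytes to <folder>/<keyword>_<index>.jpg and return the path."""
    os.makedirs(folder, exist_ok=True)  # create the folder if it is missing
    path = os.path.join(folder, keyword + '_' + str(index) + '.jpg')
    with open(path, 'wb') as fp:        # 'wb' because image data is binary
        fp.write(data)
    return path


# Stand-in bytes; in the scraper this would be pic.content.
fake_bytes = b'\xff\xd8\xff\xe0 not a real image'
target_folder = os.path.join(tempfile.mkdtemp(), 'image')
saved = save_image(fake_bytes, target_folder, 'cat', 0)
print(saved, os.path.getsize(saved))
```

Using with closes the file even if the write fails, and os.path.join keeps the path working on systems other than Windows.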

Full code:

#!/usr/bin/python
# -*- coding: UTF-8 -*-
import re
import requests
import os


def dowmloadPic(html, keyword, i):
    # Each image address is stored in the page source as the value of "objURL".
    pic_url = re.findall(r'"objURL":"(.*?)",', html, re.S)

    # File numbering for page i starts at i*60 (about 60 images per page is assumed).
    abc = i * 60
    print('Found images for keyword: ' + keyword + ', starting download...')

    # Create the target folder once, before the loop.
    os.makedirs(r'D:\image', exist_ok=True)

    for each in pic_url:
        print('Downloading image ' + str(abc) + ', URL: ' + str(each))
        try:
            pic = requests.get(each, timeout=10)
        except requests.exceptions.RequestException:
            # Covers connection errors and timeouts alike.
            print('[Error] This image could not be downloaded')
            continue

        dir = r'D:\image\i' + keyword + '_' + str(abc) + '.jpg'
        with open(dir, 'wb') as fp:
            fp.write(pic.content)
        abc += 1


if __name__ == '__main__':
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.125 Safari/537.36'}
    name = input('Enter the keyword for the images to download: ')
    x = input('How many do you want to fetch? Enter n for about n*60 images: ')

    for i in range(int(x)):
        # pn is the result offset; the keyword goes in the word parameter.
        url = 'https://image.baidu.com/search/flip?tn=baiduimage&ie=utf-8&word=' + name + '&pn=' + str(i * 30)
        result = requests.get(url, headers=headers)
        dowmloadPic(result.content.decode(), name, i)
    print('Download complete')

If you want to learn web scraping, feel free to get in touch.
QQ: 2316773638
