Python Web Crawler Programming Ideas (52): Selecting Child Nodes with Beautiful Soup

2 years ago

source link: https://blog.csdn.net/nokiaguy/article/details/120721729

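When selecting nodes, you cannot always get every node you need in a single step; sometimes the selection has to be done in several steps. For example, the first step selects all child nodes of a node, and the second step applies some rule to pick specific nodes out of those children. This requires a way to obtain all the child nodes of a given node. In Beautiful Soup, getting child nodes falls into the following two cases.

1. Getting direct child nodes

2. Getting all descendant nodes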
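The two cases and the two-step selection described above can be sketched as follows. This is a minimal illustration, not the article's own listing: it assumes the `bs4` package is installed, and the HTML fragment, the `books` id, and the `featured` class are hypothetical names made up for the example. It relies on the Beautiful Soup attributes `contents` and `children` for direct children and `descendants` for all nested nodes.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical HTML fragment, written without whitespace between tags so
# that no whitespace-only text nodes appear among the children.
html = (
    '<ul id="books">'
    '<li class="item"><a href="/a">Book A</a></li>'
    '<li class="item featured"><a href="/b">Book B</a></li>'
    '</ul>'
)

soup = BeautifulSoup(html, "html.parser")
ul = soup.find("ul", id="books")

# Case 1: direct children. `contents` is a list, `children` is an iterator
# over the same nodes.
print(len(ul.contents))  # 2

# Step 1 of the two-step selection: keep only the direct children that are tags.
direct_children = [child for child in ul.children if isinstance(child, Tag)]

# Step 2: apply a rule to those children (here: the class list contains "featured").
featured = [li for li in direct_children if "featured" in li.get("class", [])]
print([li.a.get_text() for li in featured])  # ['Book B']

# Case 2: all descendants. `descendants` walks every nested node, text nodes
# included, so filter on Tag to keep only the elements.
descendant_tags = [node.name for node in ul.descendants if isinstance(node, Tag)]
print(descendant_tags)  # ['li', 'a', 'li', 'a']
```

With real pages the markup usually contains line breaks and indentation, so `contents` and `children` also yield whitespace-only text nodes; the `isinstance(child, Tag)` filter is one way to skip them.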
