Python is one of the easiest programming languages to pick up, and it makes writing crawlers very convenient. A couple of days ago a site owner asked me to help him scrape a novel. After a quick look at the site, it had no anti-scraping measures, so for simplicity I didn't even bother with multithreading. The Python source is below.
import requests
from bs4 import BeautifulSoup

def getdata(url):
    """Fetch one chapter page and return its text."""
    response = requests.get(url)
    response.encoding = 'UTF-8'
    # use the explicit parser name; the original passed "html", which is not a parser
    gksoup = BeautifulSoup(response.text, "html.parser")
    article = gksoup.find('div', attrs={'class': 'tagCol'})
    content = article.find('p').get_text()
    return content

# Fetch the novel's index page and collect the chapter links.
response = requests.get(url="https://www.huangdizhijia.com/novel-7283.html")
response.encoding = 'UTF-8'
gksoup = BeautifulSoup(response.text, "html.parser")
article = gksoup.find('div', attrs={'class': 'tagCol'})
mllist = article.find_all('a')

with open('output.txt', 'w', encoding='utf-8') as file:
    for chapter in mllist:
        url = 'https://www.huangdizhijia.com' + chapter.get("href")
        file.write(chapter.text + '\n')   # chapter title
        print(chapter.text)
        file.write(getdata(url) + '\n')   # chapter body
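The link-extraction step above (find the div with class tagCol, then collect its a tags) can also be sketched with only the standard library, which is handy for testing the logic offline. The HTML snippet below is made up to mirror the site's assumed structure, not copied from the real page:

```python
from html.parser import HTMLParser

# Hypothetical snippet mirroring the assumed index-page structure:
# a <div class="tagCol"> whose <a> tags point at chapter pages.
SAMPLE = """
<div class="tagCol">
  <a href="/novel-7283-1.html">Chapter 1</a>
  <a href="/novel-7283-2.html">Chapter 2</a>
</div>
"""

class ChapterLinkParser(HTMLParser):
    """Collect (title, absolute URL) pairs for <a> tags inside div.tagCol."""
    def __init__(self):
        super().__init__()
        self.in_tagcol = False
        self.current_href = None
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div" and attrs.get("class") == "tagCol":
            self.in_tagcol = True
        elif tag == "a" and self.in_tagcol:
            self.current_href = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "div":
            self.in_tagcol = False
        elif tag == "a":
            self.current_href = None

    def handle_data(self, data):
        # Text between <a> and </a> is the chapter title.
        if self.current_href and data.strip():
            self.links.append((data.strip(),
                               "https://www.huangdizhijia.com" + self.current_href))

parser = ChapterLinkParser()
parser.feed(SAMPLE)
for title, url in parser.links:
    print(title, url)
```

This is only a sketch of the extraction logic; the real script above relies on BeautifulSoup, which tolerates malformed HTML far better.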
That's all there is to it. While it runs, it prints each chapter title as that chapter is scraped, and when it finishes the whole novel is written to a txt file (output.txt) in the current directory.
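I skipped multithreading here, but for a larger site the per-chapter fetches could be parallelized with a thread pool. A minimal sketch, using a stand-in getdata so it runs without network access (the real version would fetch each chapter page):

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the real getdata(url), which fetches a chapter page.
def getdata(url):
    return "content of " + url

urls = ["https://www.huangdizhijia.com/novel-7283-%d.html" % i
        for i in range(1, 4)]

# map() preserves input order, so chapters land in the output file in sequence
# even though the fetches run concurrently.
with ThreadPoolExecutor(max_workers=4) as pool:
    chapters = list(pool.map(getdata, urls))

print(len(chapters))
```

Note that hammering a small site with many concurrent requests is impolite; a short delay between fetches is usually the kinder choice.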