Python crawler - Scraping Douban Movies with Python returns no content

Code:

#!/usr/bin/python
# coding: utf-8
__author__ = 'eyu Fanne'

import requests
from bs4 import BeautifulSoup

move_url = 'https://movie.douban.com/'

def Robot():
    # Fetch the Douban Movies home page
    res_url = requests.get(move_url)
    print(res_url.status_code)
    soup = BeautifulSoup(res_url.text, 'lxml')
    print(soup.title)
    # Look for every <a class="item"> element in the returned HTML
    soup_a = soup.find_all("a", class_="item")
    for i in soup_a:
        print(i)
    print(soup_a)


if __name__ == '__main__':
    Robot()

Result:
200
<title>

    豆瓣电影

</title>
[]

I am trying to scrape the values inside this tag:

<a class='item' ....>

but the result comes back empty. Why is that?

Asked by 阿神, 2742 days ago

Replies (2)

  • 大家讲道理, 2017-04-17 17:07:11

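    Check the page source: it contains no movie information. The movie list is rendered by JavaScript in the browser after the page loads, and requests does not execute scripts, so the <a class='item'> elements never appear in the HTML you receive and find_all returns an empty list.
    The data can be requested directly instead, for example from this link: https://movie.douban.com/j/search_subjects?type=movie&tag=%E7%83%AD%E9%97%A8&sort=recommend&page_limit=20&page_start=0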

  • 天蓬老师, 2017-04-17 17:07:11

    Douban Movies has a public API, so why scrape the page?
    http://developers.douban.com/wiki/?title=movie_v2

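The approach suggested in the first reply, requesting the data link directly rather than parsing the rendered page, might look roughly like the sketch below. It uses only the standard library (Python 3). The endpoint and query parameters are taken from the link in that reply; the helper names, the User-Agent value, and above all the response shape (a top-level "subjects" list whose entries carry "title" and "url") are assumptions that should be checked against what the server actually returns.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint taken from the link in the first reply
SEARCH_URL = 'https://movie.douban.com/j/search_subjects'


def build_search_url(tag='热门', page_limit=20, page_start=0):
    """Build the query URL; urlencode percent-encodes the Chinese tag as UTF-8."""
    params = [
        ('type', 'movie'),
        ('tag', tag),
        ('sort', 'recommend'),
        ('page_limit', page_limit),
        ('page_start', page_start),
    ]
    return SEARCH_URL + '?' + urlencode(params)


def extract_movies(payload):
    """Return a list of (title, url) pairs from a JSON string.

    ASSUMPTION: the JSON has the shape
    {"subjects": [{"title": ..., "url": ...}, ...]}.
    Missing keys yield None instead of raising.
    """
    data = json.loads(payload)
    return [(m.get('title'), m.get('url')) for m in data.get('subjects', [])]


def fetch_movies(**kwargs):
    """Download one page of results and parse it."""
    # ASSUMPTION: a browser-like User-Agent may be needed; the value is illustrative.
    req = Request(build_search_url(**kwargs),
                  headers={'User-Agent': 'Mozilla/5.0'})
    with urlopen(req, timeout=10) as resp:
        return extract_movies(resp.read().decode('utf-8'))


if __name__ == '__main__':
    for title, url in fetch_movies():
        print(title, url)
```

Called with its defaults, build_search_url() reproduces the link from the reply; raising page_start in steps of page_limit would presumably page through further results, though that behavior is not confirmed in the thread.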