
Scraping the Bing homepage background image with Python

/**
author: insun
title: Scraping the Bing homepage background image with Python
blog:http://yxmhero1989.blog.163.com/blog/static/112157956201311743439712/
**/
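Bing has never really counted among the search giants, but I recently read an article saying that its homepage background images are quite good.
They really are not bad, so I decided to practice some Python by scraping them.
Looking at the page source, there is a g_img={url: assignment, and the URL that follows it is the image address. Clicking the previous/next buttons in the lower-right corner switches the image.
FireBug in Firefox did not reveal the exact request path, so I captured the traffic with HttpFox instead.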
 
One request returns a chunk of JSON that carries a JPEG URL and related information:

http://cn.bing.com/HPImageArchive.aspx?format=js&idx=0&n=1&nc=1361089515117&FORM=HYLH1
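
The query string of that request can be assembled rather than typed by hand. The following is a minimal sketch in Python 3 (an assumption; the scripts in this article are Python 2). The function name `build_archive_url` is hypothetical, and the `nc` value, which looks like a millisecond timestamp, is left out here on the unverified assumption that it only serves as a cache-buster.

```python
from urllib.parse import urlencode

# Endpoint taken from the captured request above.
ARCHIVE_ENDPOINT = "http://cn.bing.com/HPImageArchive.aspx"


def build_archive_url(idx, n=1):
    """Build the archive URL for a given idx (hypothetical helper).

    idx selects which image to ask for (0 is the current one, per the
    article); n is kept at 1 because that is the value the article uses.
    """
    params = [
        ("format", "js"),
        ("idx", idx),
        ("n", n),
        ("FORM", "HYLH1"),
    ]
    # urlencode converts each value with str() and joins the pairs with '&'.
    return ARCHIVE_ENDPOINT + "?" + urlencode(params)


print(build_archive_url(0))
```

Keeping the parameters in a list makes it easy to vary `idx` in a loop without string concatenation.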

 

The JSON it returns:

{"images":[{"startdate":"20130216","fullstartdate":"201302161600","enddate":"20130217",
"url":"http://s.cn.bing.net/az/hprichbg/rb/LongJi_ZH-CN8658435963_1366x768.jpg",
"urlbase":"/az/hprichbg/rb/LongJi_ZH-CN8658435963",
"copyright":"桂林龙脊梯田 (© Yoshinori Kuwahara/Flickr/Getty Images)",
"copyrightlink":"http://cn.bing.com/search?q=%E9%BE%99%E8%84%8A%E6%A2%AF%E7%94%B0&go=&qs=bs&form=hpcapt",
"wp":false,"hsh":"e688c3f17a0b57306642188adcbf2187","drk":1,"top":1,"bot":1,
"hs":[{"desc":"童话故事中,莴苣姑娘可以放下长发作为绳索让王子爬入城堡,",
"link":"http://cn.bing.com/search?q=%E9%BB%84%E6%B4%9B%E7%91%B6%E5%AF%A8+%E9%95%BF%E5%8F%91%E6%9D%91&go=&qs=bs&form=hphot1",
"query":"而在龙脊梯田景区的黄洛瑶寨中,处处可见“莴苣姑娘”!","locx":11,"locy":41},{"desc":"这块土地上洒下了壮民和瑶民祖祖辈辈的血汗与生命,",
"link":"http://cn.bing.com/images/search?q=%e9%be%99%e8%84%8a%e6%a2%af%e7%94%b0&FORM=hphot2","query":"而如今,它变成了妩媚潇洒的曲线世界——龙脊梯田。","locx":46,"locy":49},{"desc":"层层叠叠,色彩斑斓,规模宏大,气势磅礴,","link":"http://cn.bing.com/search?q=%E6%9E%81%E7%BE%8E%E4%BB%99%E5%A2%83+%E4%B8%AD%E5%9B%BD%E4%B8%83%E5%A4%A7%E6%A2%AF%E7%94%B0&go=&qs=bs&form=hphot3","query":"美若仙境的梯田,中国不只七座。","locx":60,"locy":42},{"desc":"“七星伴月”是龙脊梯田的精华,由一块月亮田和七块大小山包所组成,关于它形成的缘由,","link":"http://cn.bing.com/search?q=%E9%BE%99%E8%84%8A%E6%A2%AF%E7%94%B0+%E4%B8%83%E6%98%9F%E4%BC%B4%E6%9C%88%E7%9A%84%E4%BC%A0%E8%AF%B4&go=&qs=bs&form=hphot4","query":"流传着一个凄美的爱情故事……","locx":77,"locy":40}],"msg":[{"title":"今日图片故事","link":"http://cn.bing.com/search?q=%E9%BE%99%E8%84%8A%E6%A2%AF%E7%94%B0&go=&qs=bs&form=pgbar1","text":"生机盎然的龙脊梯田"},{"title":"看图片,学英语","link":"http://cn.bing.com/dict/search?q=%E6%A2%AF%E7%94%B0&go=&qs=n&form=pgbar2","text":"用英语说梯田"}]}],
"tooltips":{"loading":"正在加载...","previous":"上一页","next":"下一页","walle":"此图片不能下载用作壁纸。","walls":"下载此图片。与 Facebook 连接以发挥必应 Bing 的最大功能。图片只能用作壁纸。"}}
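
As an alternative to pattern matching, a response of this shape can be read with the standard `json` module. This is only a sketch in Python 3 (an assumption; the article's scripts are Python 2): `SAMPLE` is a trimmed copy of the response above with most fields omitted, and `extract_images` is a hypothetical helper name.

```python
import json

# Trimmed copy of the response shown above; most fields are omitted.
SAMPLE = '''{"images":[{"startdate":"20130216","enddate":"20130217",
"url":"http://s.cn.bing.net/az/hprichbg/rb/LongJi_ZH-CN8658435963_1366x768.jpg",
"urlbase":"/az/hprichbg/rb/LongJi_ZH-CN8658435963"}]}'''


def extract_images(body):
    """Return a list of (startdate, url) pairs from a response body."""
    data = json.loads(body)
    # The article reports a body of "null" for an out-of-range idx;
    # json.loads turns that into None, which is treated as "no images".
    if not data:
        return []
    return [(img["startdate"], img["url"]) for img in data.get("images", [])]


for startdate, url in extract_images(SAMPLE):
    print(startdate, url)
```

Reading the fields by name avoids depending on the order in which `url` and `urlbase` happen to appear in the text.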


Originally I wrote Python code that scrapes the http://cn.bing.com/ page itself, which can only fetch the current day's image:
#!/usr/bin/env python
# -*- coding:utf-8 -*-
# -*- author:insun -*-
# Scrape the Bing homepage background image with Python (Python 2)

import urllib,re


def get_bing_backphoto():
    url = 'http://cn.bing.com/'
    html = urllib.urlopen(url).read()
    if not html:
        print 'open & read bing error!'
        return -1
    # The image URL follows g_img={url:' in the page source
    reg = re.compile(";g_img={url:'(.*?)'",re.S)
    text = reg.findall(html)
    # e.g. http://s.cn.bing.net/az/hprichbg/rb/LongJi_ZH-CN8658435963_1366x768.jpg
    for imgurl in text:
        # Save under the last path component of the URL
        savepath = imgurl[imgurl.rindex('/')+1:]
        urllib.urlretrieve(imgurl, savepath)


get_bing_backphoto()

 

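For the approach above, see also: http://www.isayme.org/python-get-bing-day-pic.html
The approach has since changed: scrape the AJAX link instead, and step idx through 0 to N to fetch earlier images. As far as I could tell, the n parameter in the link has to be 1; with any other value it just kept returning today's data, which anyone who has written programs will recognize.
There is no need to run the result through Python's json module either, because what read() returns is already a str (check it with type() if you don't believe me), so a regular expression can be applied to it directly.
The code then looks like this. You could also grab the date and add it to the image file name to record which day each image belongs to.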
#!/usr/bin/env python
# -*- coding:utf-8 -*-
# -*- author:insun -*-
# Scrape all Bing homepage background images with Python (Python 2)

import urllib,re,sys,os


def get_bing_backphoto():
    if not os.path.exists('photos'):
        os.mkdir('photos')

    # The pattern is the same for every response, so compile it once
    reg = re.compile('"url":"(.*?)","urlbase"',re.S)
    for i in range(0,1000):
        # Parentheses let the expression continue onto the next line
        url = ('http://cn.bing.com/HPImageArchive.aspx?format=js&idx='+str(i)
               +'&n=1&nc=1361089515117&FORM=HYLH1')
        html = urllib.urlopen(url).read()
        # The body is the literal text null when there is no data for this idx
        if html == 'null':
            print 'open & read bing error!'
            sys.exit(-1)
        text = reg.findall(html)
        # e.g. http://s.cn.bing.net/az/hprichbg/rb/LongJi_ZH-CN8658435963_1366x768.jpg
        for imgurl in text:
            # Save into photos/ under the last path component of the URL
            name = imgurl[imgurl.rindex('/')+1:]
            savepath = 'photos/'+ name
            urllib.urlretrieve(imgurl, savepath)
            print name + ' save success!'

get_bing_backphoto()
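
The idea of recording the date in the file name could look roughly like the sketch below. It is written for Python 3 (an assumption; the scripts above are Python 2), the helper name `dated_filename` is hypothetical, and it assumes the date is taken from the `startdate` field of the JSON response.

```python
import posixpath
from urllib.parse import urlsplit


def dated_filename(startdate, imgurl):
    """Prefix the image's base name with its start date (hypothetical helper)."""
    # Take the last path component of the URL, ignoring any query string.
    name = posixpath.basename(urlsplit(imgurl).path)
    return startdate + "_" + name


print(dated_filename(
    "20130216",
    "http://s.cn.bing.net/az/hprichbg/rb/LongJi_ZH-CN8658435963_1366x768.jpg",
))
```

With the date first, the saved files sort chronologically in a directory listing.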

 



Later I found that the JSON data is already null when idx is 21. Letting i run up to 1000 was needless worry and wishful thinking on my part.
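
Given that observation, the loop can simply stop at the first empty answer instead of relying on a large upper bound. The sketch below is Python 3 (an assumption; the article's scripts are Python 2). `collect_image_urls` and `fake_fetch` are hypothetical names, and `fake_fetch` is an offline stand-in for the network call that invents its own URLs, so nothing here reflects real Bing data.

```python
import json


def collect_image_urls(fetch, max_idx=1000):
    """Call fetch(idx) for idx = 0, 1, 2, ... and stop at the first empty answer.

    fetch is any callable that returns the response body as text.
    """
    urls = []
    for idx in range(max_idx):
        data = json.loads(fetch(idx))
        # A body of "null" parses to None; treat it as the end of the archive.
        if not data or not data.get("images"):
            break
        urls.extend(img["url"] for img in data["images"])
    return urls


def fake_fetch(idx):
    """Offline stand-in: pretends only idx 0..2 have data."""
    if idx >= 3:
        return "null"
    return json.dumps({"images": [{"url": "http://example.invalid/%d.jpg" % idx}]})


print(collect_image_urls(fake_fetch))
# → ['http://example.invalid/0.jpg', 'http://example.invalid/1.jpg', 'http://example.invalid/2.jpg']
```

Passing the fetch function in as an argument keeps the stopping logic testable without touching the network; a real run would supply a function that downloads the archive URL for the given idx and decodes the body.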
