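Preface
This article uses nodeJS to implement a simple web crawler.
Page source
Use the http.get() method to fetch the page source, taking the headline page of the hao123 site as the example:
http://tuijian.hao123.com/hotrank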
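
    var http = require('http');

    http.get('http://tuijian.hao123.com/hotrank', function (res) {
      var data = '';
      res.setEncoding('utf8'); // keep multi-byte characters intact across chunks
      res.on('data', function (chunk) { data += chunk; });
      res.on('end', function () { console.log(data); });
    }).on('error', function (err) { console.error(err.message); });
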
The result is as follows:



nbsp;html> <noscript><meta></noscript> <meta> <meta> <meta> <meta> <meta> <meta> <meta> <meta> <title>热点排行榜-头条新闻-hao123新闻导航_hao123上网导航</title> <link> <script> window.pageId = window.pageId || "hao123-xinwen-tuijian-hotrank"; window.pageVP = window.pageVP || "hao123-xinwen-tuijian-hotrank";</script> <!--[if lt IE 7]> <script src="http://s0.hao123img.com/res/js/common/dd_belatedpng.min.js?1.1.11"></script> <script>DD_belatedPNG.fix('#channelTitle');</script> <![endif]--> <script>window.HAO=window.HAO||{};window.HAO.https = false;window.HAO.httpsTrans = function(url){return url};</script> <link><link><link><link><link><link><link><link><link><link><link><link><link><link><link><link><link><link><script>window.aid = "nWRkrj61PjnYriYYrHfsrHbsnHb";</script><div> <div><div> <div> <a>hao123</a><a><img src="/static/imghwm/default1.png" data-src="http://s0.hao123img.com/res/img/xinwen.png" class="lazy" alt="nodeJS實作網頁爬蟲功能實例代碼" ></a> </div> <div> <a><i></i><em>导航</em><i></i></a><div> <div> <h3 id="休闲娱乐">休闲娱乐</h3> <div> <a>电影</a><a>动漫</a><a>综艺</a><a>搞笑</a><a>直播</a><a>视频</a><a>页游</a><a>明星</a><a>交友</a><a>体育</a><a>足球</a><a>NBA</a><a>星座</a><a>电视剧</a><a>小游戏</a> </div> </div> <div> <h3 id="生活服务">生活服务</h3> <div> <a>团购</a><a>银行</a><a>军事</a><a>房产</a><a>股票</a><a>基金</a><a>天气</a><a>菜谱</a><a>汽车</a><a>地图</a><a>招聘</a><a>儿童</a><a>母婴</a><a>健康</a><a>大学</a><a>手机</a> </div> </div> <div> <h3 id="其他类别">其他类别</h3> <div> <a>软件</a><a>邮箱</a><a>微博</a><a>公益</a><a>宠物</a><a>杀毒</a><a>设计</a><a>电脑</a><a>桌面</a><a>行业</a><a>摄影</a><a>英语</a><a>考试</a><a>学习</a><a>小清新</a> </div> </div> </div> </div> <div><form> <input><button></button><div></div> <div></div> </form></div> <div> <div> <a><i></i><em>一键登录</em><i></i></a><div> <a><i></i><em>VIP俱乐部</em></a><a><em>退出</em></a> </div> </div> <div> <a><i></i></a><div><img src="/static/imghwm/default1.png" data-src="http://s0.hao123img.com/res/r/image/2017-05-02/8efec295cd5f4ab991383422af14dcb8.png" class="lazy" alt="nodeJS實作網頁爬蟲功能實例代碼" ></div> </div> <a><i></i></a> </div> 
</div></div> <div><div><div><div><ul> <li><a>头条</a></li> <li><a>娱乐</a></li> <li><a>体育</a></li> <li><a>财经</a></li> <li><a>军事</a></li> <li><a>国内</a></li> <li><a>国际</a></li> <li><a>历史</a></li> <li><a>科技</a></li> <li><a>汽车</a></li> <li><a>教育</a></li> <li><a>游戏</a></li> <li><a>房产</a></li> <li><a>时尚</a></li> <li><a>热点排行</a></li> </ul></div></div></div></div> </div><div> <div> <div> <div> <div><div><div> <div> <div> <a><img class="slider-img lazy" src="/static/imghwm/default1.png" data-src="https://gss0.bdstatic.com/5bVWsj_p_tVS5dKfpU_Y_D3/res/r/image/2017-06-07/b1e7252d66852c27dd6c924b12290017.jpg" alt="nodeJS實作網頁爬蟲功能實例代碼" ></a><div></div> <div><a>送考车要讲究 毛坦厂中学送考规模庞大</a></div> </div> <div> <a><img class="slider-img lazy" src="/static/imghwm/default1.png" data-src="https://gss0.bdstatic.com/5bVWsj_p_tVS5dKfpU_Y_D3/res/r/image/2017-06-07/008195cae4d1336af0b63b31e5b01cdb.jpg" alt="nodeJS實作網頁爬蟲功能實例代碼" ></a><div></div> <div><a>江苏"拇指西瓜"上市 可连皮食用</a></div> </div> <div> <a><img class="slider-img lazy" src="/static/imghwm/default1.png" data-src="https://gss0.bdstatic.com/5bVWsj_p_tVS5dKfpU_Y_D3/res/r/image/2017-06-07/74f7b5a74615892839e3b21de8017bc8.jpg" alt="nodeJS實作網頁爬蟲功能實例代碼" ></a><div></div> <div><a>非洲女子嫁中国郎 2年后成广场舞明星</a></div> </div> <div> <a><img class="slider-img lazy" src="/static/imghwm/default1.png" data-src="https://gss0.bdstatic.com/5bVWsj_p_tVS5dKfpU_Y_D3/res/r/image/2017-06-07/5a85d3e78965666eb616b41e2981d24d.jpg" alt="nodeJS實作網頁爬蟲功能實例代碼" ></a><div></div> <div><a>广州一考生去错考场 交警蜀黍紧急送考</a></div> </div> <div> <a><img class="slider-img lazy" src="/static/imghwm/default1.png" data-src="https://gss0.bdstatic.com/5bVWsj_p_tVS5dKfpU_Y_D3/res/r/image/2017-06-07/ea646c52aa9197228d5d2f42899221ec.jpg" alt="nodeJS實作網頁爬蟲功能實例代碼" ></a><div></div> <div><a>福建小伙南非建安保公司 持AK47与劫匪激战</a></div> </div> <div> <a><img class="slider-img lazy" src="/static/imghwm/default1.png" data-src="https://gss0.bdstatic.com/5bVWsj_p_tVS5dKfpU_Y_D3/res/r/image/2017-06-07/b4331032b6375b7b6db10f7cdf19e86c.jpg" 
alt="nodeJS實作網頁爬蟲功能實例代碼" ></a><div></div> <div><a>老师拔河搞怪表情走红 拔河如戏全靠演技</a></div> </div> </div> <div> <a></a><a></a><a></a><a></a><a></a><a></a> </div> <a></a><a></a> </div></div></div> <div><div><div> <div> <h2 id="八卦热点">八卦热点</h2> <a>更多八卦>></a> </div> <div> <div><ul> <li><a><img class="imglink-img lazy" src="/static/imghwm/default1.png" data-src="http://s0.hao123img.com/res/r/image/2017-04-12/1be8ce1a1520e75f17f3299532855b56.jpg" alt="nodeJS實作網頁爬蟲功能實例代碼" ><span>男子上山寻宝 挖出这物吓坏了!</span></a></li> <li><a><img class="imglink-img lazy" src="/static/imghwm/default1.png" data-src="http://s0.hao123img.com/res/r/image/2017-04-14/55d08df0d7d1023179fca92200a96ce4.jpg" alt="nodeJS實作網頁爬蟲功能實例代碼" ><span>千年巨蛇镇守古墓竟借尸还魂</span></a></li> </ul></div> <div><ul> <li><a>地球是个监狱人类只是试验品!</a></li> <li><a>DNA检测是叔叔的可爸爸是独子</a></li> <li><a>出差两月打开电饭锅后惊呆了</a></li> <li><a>女孩中大奖4年后怒告彩票公司</a></li> <li><a>印度神牛竟拉出300多颗钻石!</a></li> <li><a>21岁男孩吞云吐雾成烟雾之神!</a></li> <li><a>继母让3孩子喝农药,继女死亡</a></li> <li><a>惊呆!实拍假鸡蛋制作的全过程</a></li> </ul></div> </div> </div></div></div> </div> <div><div><script>{di:"u0000",tn:"sitehao123_03",rsi0:"1190",rsi1:"150",type:"metro",version:"201",style:"lichun"}</script></div></div> <div><div> <div><div> <h2 id="实时热点">实时热点</h2> <div> <div> <span>排名</span><span>关键词</span><span>搜索指数</span> </div> <div> <div> <span>1</span><span><a>美国逮捕女斯诺登</a></span><span></span><span>35388</span><a></a> </div> <div> <span>2</span><span><a>成都隐秘母乳买卖</a></span><span></span><span>34497</span><a></a> </div> <div> <span>3</span><span><a>曝周杰伦青涩旧照</a></span><span></span><span>1457</span><a></a> </div> <div> <span>4</span><span><a>老头公交强吻女孩</a></span><span></span><span>103307</span><a></a> </div> <div> <span>5</span><span><a>王传君恋情曝光</a></span><span></span><span>26616</span><a></a> </div> <div> <span>6</span><span><a>杭州现奇葩窗口</a></span><span></span><span>26837</span><a></a> </div> <div> <span>7</span><span><a>忘带全班准考证</a></span><span></span><span>125127</span><a></a> </div> <div> 
<span>8</span><span><a>未成年持械拍网红</a></span><span></span><span>1672</span><a></a> </div> <div> <span>9</span><span><a>9秒揍儿子8拳</a></span><span></span><span>93193</span><a></a> </div> <div> <span>10</span><span><a>戴耳机穿轨道被撞</a></span><span></span><span>195745</span><a></a> </div> </div> </div> </div></div> <div><div> <h2 id="今日热点">今日热点</h2> <div> <div> <span>排名</span><span>关键词</span><span>搜索指数</span> </div> <div> <div> <span>1</span><span><a>北京回龙观大火</a></span><span></span><span>174225</span><a></a> </div> <div> <span>2</span><span><a>选美冠军车祸身亡</a></span><span></span><span>172447</span><a></a> </div> <div> <span>3</span><span><a>2017高考</a></span><span></span><span>136806</span><a></a> </div> <div> <span>4</span><span><a>成都老火锅店被查</a></span><span></span><span>121729</span><a></a> </div> <div> <span>5</span><span><a>陈浩民娇妻秀身材</a></span><span></span><span>115877</span><a></a> </div> <div> <span>6</span><span><a>海边直播发现浮尸</a></span><span></span><span>86157</span><a></a> </div> <div> <span>7</span><span><a>曝印小天遭妻骗婚</a></span><span></span><span>83749</span><a></a> </div> <div> <span>8</span><span><a>苹果开发者大会</a></span><span></span><span>78140</span><a></a> </div> <div> <span>9</span><span><a>6万斤鱼缺氧死亡</a></span><span></span><span>68984</span><a></a> </div> <div> <span>10</span><span><a>安以轩夏威夷大婚</a></span><span></span><span>56675</span><a></a> </div> </div> </div> </div></div> <div><div> <h2 id="民生热点">民生热点</h2> <div> <div> <span>排名</span><span>关键词</span><span>搜索指数</span> </div> <div> <div> <span>1</span><span><a>北京回龙观大火</a></span><span></span><span>174225</span><a></a> </div> <div> <span>2</span><span><a>2017高考</a></span><span></span><span>136806</span><a></a> </div> <div> <span>3</span><span><a>成都老火锅店被查</a></span><span></span><span>121729</span><a></a> </div> <div> <span>4</span><span><a>海边直播发现浮尸</a></span><span></span><span>86157</span><a></a> </div> <div> <span>5</span><span><a>苹果开发者大会</a></span><span></span><span>78140</span><a></a> </div> <div> 
<span>6</span><span><a>6万斤鱼缺氧死亡</a></span><span></span><span>68984</span><a></a> </div> <div> <span>7</span><span><a>北控外援训练猝死</a></span><span></span><span>50687</span><a></a> </div> <div> <span>8</span><span><a>武汉男子裸体捅人</a></span><span></span><span>45810</span><a></a> </div> <div> <span>9</span><span><a>多国与卡塔尔断交</a></span><span></span><span>44475</span><a></a> </div> <div> <span>10</span><span><a>美驻华外交官辞职</a></span><span></span><span>44394</span><a></a> </div> </div> </div> </div></div> <div><div> <h2 id="电影">电影</h2> <div> <div> <span>排名</span><span>关键词</span><span>搜索指数</span> </div> <div> <div> <span>1</span><span><a>神奇女侠</a></span><span></span><span>40981</span><a></a> </div> <div> <span>2</span><span><a>异星觉醒</a></span><span></span><span>15245</span><a></a> </div> <div> <span>3</span><span><a>新木乃伊</a></span><span></span><span>7183</span><a></a> </div> <div> <span>4</span><span><a>中国推销员</a></span><span></span><span>5890</span><a></a> </div> <div> <span>5</span><span><a>荡寇风云</a></span><span></span><span>3006</span><a></a> </div> <div> <span>6</span><span><a>异兽来袭</a></span><span></span><span>2566</span><a></a> </div> <div> <span>7</span><span><a>李雷和韩梅梅</a></span><span></span><span>1636</span><a></a> </div> <div> <span>8</span><span><a>北极星</a></span><span></span><span>1139</span><a></a> </div> <div> <span>9</span><span><a>美好的意外</a></span><span></span><span>971</span><a></a> </div> <div> <span>10</span><span><a>夏天19岁的肖像</a></span><span></span><span>783</span><a></a> </div> </div> </div> </div></div> <div><div> <h2 id="电视剧">电视剧</h2> <div> <div> <span>排名</span><span>关键词</span><span>搜索指数</span> </div> <div> <div> <span>1</span><span><a>龙珠传奇</a></span><span></span><span>999788</span><a></a> </div> <div> <span>2</span><span><a>楚乔传</a></span><span></span><span>538848</span><a></a> </div> <div> <span>3</span><span><a>欢乐颂2</a></span><span></span><span>257015</span><a></a> </div> <div> <span>4</span><span><a>欢乐颂</a></span><span></span><span>176799</span><a></a> </div> <div> 
<span>5</span><span><a>职场是个技术活</a></span><span></span><span>73102</span><a></a> </div> <div> <span>6</span><span><a>择天记</a></span><span></span><span>67290</span><a></a> </div> <div> <span>7</span><span><a>美食大冒险</a></span><span></span><span>61792</span><a></a> </div> <div> <span>8</span><span><a>废柴兄弟</a></span><span></span><span>50419</span><a></a> </div> <div> <span>9</span><span><a>人民的名义</a></span><span></span><span>46353</span><a></a> </div> <div> <span>10</span><span><a>三生三世十里桃花</a></span><span></span><span>24386</span><a></a> </div> </div> </div> </div></div> <div><div> <h2 id="综艺">综艺</h2> <div> <div> <span>排名</span><span>关键词</span><span>搜索指数</span> </div> <div> <div> <span>1</span><span><a>变形计</a></span><span></span><span>223319</span><a></a> </div> <div> <span>2</span><span><a>来吧冠军</a></span><span></span><span>151641</span><a></a> </div> <div> <span>3</span><span><a>拜托了冰箱</a></span><span></span><span>149596</span><a></a> </div> <div> <span>4</span><span><a>昆仑决</a></span><span></span><span>139633</span><a></a> </div> <div> <span>5</span><span><a>天生是优我</a></span><span></span><span>124472</span><a></a> </div> <div> <span>6</span><span><a>姐姐好饿</a></span><span></span><span>99619</span><a></a> </div> <div> <span>7</span><span><a>脑力男人时代</a></span><span></span><span>68735</span><a></a> </div> <div> <span>8</span><span><a>奔跑吧兄弟</a></span><span></span><span>61903</span><a></a> </div> <div> <span>9</span><span><a>我想和你唱</a></span><span></span><span>59249</span><a></a> </div> <div> <span>10</span><span><a>玫瑰之旅</a></span><span></span><span>50425</span><a></a> </div> </div> </div> </div></div> </div></div> </div> </div> </div> <div> <div> <div></div> <div><a>意见反馈</a></div> </div> <div> <div></div> <div><a>返回顶部</a></div> </div> </div> <div><div> <div> <a>hao123 上网导航第一品牌</a><div> <a>关于我们</a><a>常见问题</a><a>反馈意见</a><a>全站地图</a><span>京ICP证030173号</span> </div> </div> <div><div> <a><i></i><span>下载<br>手机端</span></a><a><i></i><span>收藏<br>本站</span></a> </div></div> 
</div></div><script></script> <script></script> <script>BigPipe.lazyPagelets = [];</script> <script>BigPipe.loadedResource(["5a7c104a8_7959","d8b3cc9ac_29e3","38645dd_f7dd","8d1d978b0_a316","6cca09af6_f07f","a0832ac19_fb25","25330c25d_ce62","deba0d4c0_c8fe","1c81d5fc6_a695","0c7877e81_8719","6e9548c75_e646","38645dd_0f3e","3f6d691_9321","4d7a174_ccfc","9e71d5b_bed3","b016c1d_d1a3","e073b71_9403","77f7c66_45f3","95a138325_0731"]);</script><script>BigPipe.hooks["__cb_0_1"]=function(){'use strict';var $ = require('fe:widget/js/base/jquery.js');var fixreferrer = require('fe:widget/js/base/fixreferrer.js'); HAO.https && fixreferrer.init($(document)); };</script> <script>BigPipe.hooks["__cb_0_2"]=function(){'use strict';var $ = require('fe:widget/js/base/jquery.js');$('div[data-hook="sitemap"]').on('mouseenter', function (e) {$(this).addClass('sitemap-hover');}).on('mouseleave', function (e) {$(this).removeClass('sitemap-hover');});};</script> <script>BigPipe.hooks["__cb_0_3"]=function(){'use strict';var $ = require('fe:widget/js/base/jquery.js');var Search = require('fe:widget/js/base/search.js');var headerSearchInstance = new Search($('form[data-hook="search-form"]'));};</script> <script>BigPipe.hooks["__cb_0_4"]=function(){'use strict';var $ = require('fe:widget/js/base/jquery.js');var events = require('fe:widget/js/lib/events.js');var login = require('fe:widget/js/base/login.js');var sethome = require('fe:widget/js/base/sethome.js');var $loginCon = $('div[data-hook="c-header-login"]');var $loginDrop = $('div[js-hook="popup-list"]');login.init();events.on('loginSuccess', function(userinfo) {$loginCon.addClass('success');$loginCon.find('.key .word').html(userinfo.userName);/* if ($loginCon.find('.key .word').width() >= 60) {$loginCon.find('.key .word').width(50);$loginDrop.outerWidth($loginCon.outerWidth());}*/$('[data-hook=login]').removeAttr('data-hook');});$loginCon.mouseenter(function() {if($(this).hasClass('success')) 
{$(this).addClass('hover');}}).mouseleave(function() {$(this).removeClass('hover');});$('div[data-hhok="qrcode"]').on('mouseenter', function () {$(this).children('div').show();}).on('mouseleave', function () {$(this).children('div').hide();}).on('click', function (ev) {if ($(this).children('div').length > 0) {return false;}});if($('[data-hook=setHome]').length) {sethome.init();}};</script> <script>BigPipe.hooks["__cb_0_5"]=function(){'use strict';var $ = require('fe:widget/js/base/jquery.js');var popupWidth;$('div[data-hook="nav-more"]').on('mouseenter', function () {popupWidth = $(this).children('div').width();$(this).addClass('nav-more-hover');}).on('mouseleave', function () {$(this).removeClass('nav-more-hover');});};</script> <script>BigPipe.hooks["__cb_0_6"]=function(){'use strict';var $ = require('fe:widget/js/base/jquery.js');var $v2Header = $('#erjiV2Header');var $fixedNav = $('#fixedNav');if ($v2Header.hasClass('v2-fixed') && !($.browser.msie && $.browser.version < 7)) {var offHeight = 0;$(window).scroll(function () {offHeight = $v2Header.offset().top + 60;if ($(window).scrollTop() >= offHeight) {if (!$fixedNav.hasClass('nav-v2-fixed')) {$fixedNav.addClass('nav-v2-fixed').find('li.cur').removeClass('cur').addClass('cur');}}else if ($fixedNav.hasClass('nav-v2-fixed')) {$fixedNav.removeClass('nav-v2-fixed').find('li.cur').removeClass('cur').addClass('cur');}});}};</script> <script>BigPipe.hooks["__cb_0_7"]=function(){'use strict';var $ = require('fe:widget/js/base/jquery.js');var Slider = require('fe:widget/js/util/slider.js');new Slider($('.slider'));};</script> <script>BigPipe.hooks["__cb_0_8"]=function(){'use strict';if(typeof BAIDU_SS_HHRUN!='function'){var d=document;(d.getElementsByTagName('head')[0]||d.body).appendChild(d.createElement('script')).src='http://su.bdimg.com/static/dspui/js/ls.js?v='+~(-new Date()/5600e5)}else{BAIDU_SS_HHRUN()}};</script> <script>BigPipe.hooks["__cb_0_9"]=function(){'use strict';var lifttop = 
require('tuijian:widget/lift/lifttop.js');lifttop();};</script> <script>BigPipe.hooks["__cb_0_10"]=function(){'use strict'; window._bd_share_config = { common : { bdText : '', bdDesc : '', bdUrl : '', bdPic : ''}, share : {"bdSize" : 24}, selectShare : [{"bdselectMiniList" : ['tsina','weixin','qzone'] }] }; (document.getElementsByTagName('head')[0]||document.body) .appendChild(document.createElement('script')).src='http://bdimg.share.baidu.com/static/api/js/share.js?v=89860593.js?cdnversion='+~(-new Date());var shareEvent = require('tuijian:widget/index/content/shareEvent.js'); shareEvent(); };</script> <script>BigPipe.hooks["__cb_0_11"]=function(){'use strict';var addBookmark = require('fe:widget/js/base/addbookmark.js');addBookmark.init();};</script> <script>BigPipe.hooks["__cb_0_12"]=function(){'use strict'; (function initTrack(o){var d = document;var x = d.createElement("script"); x.src = HAO.httpsTrans('http://s0.hao123img.com/res/js/track.js') + '?'+~(new Date/36e5);var a=[];if(o){ for(var i in o){ a.push(i + ":" + (o[i])) } var config = a.join(";"); x.setAttribute("data-log-config", config); var s = d.getElementsByTagName("script")[0].parentNode; var p= s || d.head; if(p) { setTimeout(function() { p.appendChild(x) }, 0); } } })({ pageId: window.pageId, page: window.pageId, level: 2, vp: window.pageVP || window.pageId, aid: window.aid || ''}); window.js_track_loaded = function (success) {if (success) { window.js_track_loaded = null;if (window.aid) {/* globals Monkey */Monkey && Monkey.set && Monkey.set('aid', window.aid); } } };// 跨站资源统计/* (function (doc) { var s = doc.createElement('script'); s.src = HAO.httpsTrans('http://s0.hao123img.com/res/js/fe/cspalog.js') + '?t=' + (+new Date); var parent = doc.getElementsByTagName('script')[0].parentNode; parent.appendChild(s); })(document); */};</script> <script>BigPipe.hooks["__cb_0_13"]=function(){'use strict'; 
require.defer(["fe:widget/js/base/jquery.js?1.1.11","fe:widget/js/base/detect.js?1.1.11","tuijian:widget/index/kuaixun.js?1.1.11"], function ($, detect, kuaixun) { $(document).ready(function() { detect(); kuaixun.init(); }); }); };</script> <script>BigPipe.setResourceMap({"d8b3cc9ac_29e3":{"src":"http:\/\/s1.hao123img.com\/resource\/fe\/pkg\/aio-eef856ab5.231bb088c.css?1.1.11","type":"css","deps":[],"mods":["fe:resource\/css\/base.less"]},"38645dd_f7dd":{"src":"http:\/\/s2.hao123img.com\/resource\/tuijian\/css\/hotrank.38645dd.css?1.1.11","type":"css","deps":[],"mods":["tuijian:resource\/css\/hotrank.less"]},"8d1d978b0_a316":{"src":"http:\/\/s1.hao123img.com\/resource\/fe\/widget\/ui\/header\/common\/v2\/header.8d1d978b0.css?1.1.11","type":"css","deps":[],"mods":["fe:widget\/ui\/header\/common\/v2\/header.less"]},"6cca09af6_f07f":{"src":"http:\/\/s0.hao123img.com\/resource\/fe\/widget\/ui\/header\/common\/v2\/logo\/logo.6cca09af6.css?1.1.11","type":"css","deps":[],"mods":["fe:widget\/ui\/header\/common\/v2\/logo\/logo.less"]},"a0832ac19_fb25":{"src":"http:\/\/s1.hao123img.com\/resource\/fe\/widget\/ui\/header\/common\/v2\/sitemap\/sitemap.a0832ac19.css?1.1.11","type":"css","deps":[],"mods":["fe:widget\/ui\/header\/common\/v2\/sitemap\/sitemap.less"]},"25330c25d_ce62":{"src":"http:\/\/s2.hao123img.com\/resource\/fe\/widget\/ui\/header\/common\/v2\/adv\/adv.25330c25d.css?1.1.11","type":"css","deps":[],"mods":["fe:widget\/ui\/header\/common\/v2\/adv\/adv.less"]},"deba0d4c0_c8fe":{"src":"http:\/\/s0.hao123img.com\/resource\/fe\/widget\/ui\/header\/common\/v2\/form\/form.deba0d4c0.css?1.1.11","type":"css","deps":[],"mods":["fe:widget\/ui\/header\/common\/v2\/form\/form.less"]},"1c81d5fc6_a695":{"src":"http:\/\/s0.hao123img.com\/resource\/fe\/widget\/ui\/header\/common\/v2\/tools\/tools.1c81d5fc6.css?1.1.11","type":"css","deps":[],"mods":["fe:widget\/ui\/header\/common\/v2\/tools\/tools.less"]},"0c7877e81_8719":{"src":"http:\/\/s2.hao123img.com\/resource\/fe\/widget\/ui\/
header\/common\/v2\/nav\/nav.0c7877e81.css?1.1.11","type":"css","deps":[],"mods":["fe:widget\/ui\/header\/common\/v2\/nav\/nav.less"]},"6e9548c75_e646":{"src":"http:\/\/s2.hao123img.com\/resource\/fe\/widget\/ui\/header\/common\/v2\/tuiguang\/tuiguang.6e9548c75.css?1.1.11","type":"css","deps":[],"mods":["fe:widget\/ui\/header\/common\/v2\/tuiguang\/tuiguang.less"]},"38645dd_0f3e":{"src":"http:\/\/s0.hao123img.com\/resource\/tuijian\/widget\/index\/hotrank\/hotrank.38645dd.css?1.1.11","type":"css","deps":[],"mods":["tuijian:widget\/index\/hotrank\/hotrank.less"]},"3f6d691_9321":{"src":"http:\/\/s2.hao123img.com\/resource\/tuijian\/widget\/index\/hotrank\/index\/slider\/slider.3f6d691.css?1.1.11","type":"css","deps":[],"mods":["tuijian:widget\/index\/hotrank\/index\/slider\/slider.less"]},"4d7a174_ccfc":{"src":"http:\/\/s0.hao123img.com\/resource\/tuijian\/widget\/index\/hotrank\/common\/slider\/slider.4d7a174.css?1.1.11","type":"css","deps":[],"mods":["tuijian:widget\/index\/hotrank\/common\/slider\/slider.less"]},"9e71d5b_bed3":{"src":"http:\/\/s0.hao123img.com\/resource\/tuijian\/widget\/index\/hotrank\/index\/news\/news.9e71d5b.css?1.1.11","type":"css","deps":[],"mods":["tuijian:widget\/index\/hotrank\/index\/news\/news.less"]},"b016c1d_d1a3":{"src":"http:\/\/s1.hao123img.com\/resource\/tuijian\/widget\/index\/hotrank\/index\/fyb\/fyb.b016c1d.css?1.1.11","type":"css","deps":[],"mods":["tuijian:widget\/index\/hotrank\/index\/fyb\/fyb.less"]},"e073b71_9403":{"src":"http:\/\/s0.hao123img.com\/resource\/tuijian\/widget\/index\/hotrank\/index\/top\/top.e073b71.css?1.1.11","type":"css","deps":[],"mods":["tuijian:widget\/index\/hotrank\/index\/top\/top.less"]},"77f7c66_45f3":{"src":"http:\/\/s0.hao123img.com\/resource\/tuijian\/widget\/lift\/lift.77f7c66.css?1.1.11","type":"css","deps":[],"mods":["tuijian:widget\/lift\/lift.less"]},"95a138325_0731":{"src":"http:\/\/s2.hao123img.com\/resource\/fe\/pkg\/aio-8155b5719.3dd99d32e.css?1.1.11","type":"css","deps":[],"mods":["fe
:widget\/ui\/footer\/common\/footer.less"]},"ed29b1dff_99f2":{"src":"http:\/\/s1.hao123img.com\/resource\/fe\/pkg\/aio-752ba7752.ed29b1dff.js?1.1.11","type":"js","deps":[],"mods":["fe:widget\/js\/base\/jquery.js?1.1.11"]},"499abaa0e_acda":{"src":"http:\/\/s0.hao123img.com\/resource\/fe\/pkg\/aio-eef856ab5.499abaa0e.js?1.1.11","type":"js","deps":["ed29b1dff_99f2","15f327f0a_5d72"],"mods":["fe:widget\/js\/base\/browser.js?1.1.11","fe:widget\/js\/base\/fixreferrer.js?1.1.11"]},"15f327f0a_5d72":{"src":"http:\/\/s0.hao123img.com\/resource\/fe\/pkg\/aio-95cc3013d.15f327f0a.js?1.1.11","type":"js","deps":["ed29b1dff_99f2"],"mods":["fe:widget\/js\/base\/cookie.js?1.1.11"]},"331938377_b942":{"src":"http:\/\/s0.hao123img.com\/resource\/fe\/pkg\/aio-1c2d6f9f2.2b182a527.css?1.1.11","type":"css","deps":[],"mods":["fe:widget\/ui\/header\/common\/header.less"]},"2009b1512_46d0":{"src":"http:\/\/s0.hao123img.com\/resource\/fe\/pkg\/aio-1c2d6f9f2.2009b1512.js?1.1.11","type":"js","deps":["ed29b1dff_99f2","15f327f0a_5d72","331938377_b942"],"mods":["fe:widget\/js\/base\/sethome.js?1.1.11","fe:widget\/js\/lib\/events.js?1.1.11","fe:widget\/js\/base\/login.js?1.1.11","fe:widget\/js\/third\/arttemplate\/template-native.js?1.1.11","fe:widget\/js\/base\/autocomplete.js?1.1.11","fe:widget\/js\/base\/search.js?1.1.11","fe:widget\/ui\/header\/common\/header.js?1.1.11"]},"9a092a7f1_2a6f":{"src":"http:\/\/s0.hao123img.com\/resource\/fe\/widget\/js\/util\/slider.9a092a7f1.js?1.1.11","type":"js","deps":["ed29b1dff_99f2"],"mods":["fe:widget\/js\/util\/slider.js?1.1.11"]},"f271c78_c7d7":{"src":"http:\/\/s0.hao123img.com\/resource\/tuijian\/widget\/lift\/lifttop.f271c78.js?1.1.11","type":"js","deps":["ed29b1dff_99f2"],"mods":["tuijian:widget\/lift\/lifttop.js?1.1.11"]},"4d39d64_93de":{"src":"http:\/\/s1.hao123img.com\/resource\/tuijian\/widget\/index\/content\/shareEvent.4d39d64.js?1.1.11","type":"js","deps":["ed29b1dff_99f2"],"mods":["tuijian:widget\/index\/content\/shareEvent.js?1.1.11"]},"3ac67f28c
_b365":{"src":"http:\/\/s2.hao123img.com\/resource\/fe\/pkg\/aio-8155b5719.3ac67f28c.js?1.1.11","type":"js","deps":["ed29b1dff_99f2"],"mods":["fe:widget\/js\/base\/addbookmark.js?1.1.11"]},"67402ee5d_d72b":{"src":"http:\/\/s2.hao123img.com\/resource\/fe\/widget\/js\/base\/track.67402ee5d.js?1.1.11","type":"js","deps":["ed29b1dff_99f2"],"mods":["fe:widget\/js\/base\/track.js?1.1.11"]},"f97e9ecfd_31c5":{"src":"http:\/\/s1.hao123img.com\/resource\/fe\/widget\/js\/base\/detect.f97e9ecfd.js?1.1.11","type":"js","deps":["67402ee5d_d72b"],"mods":["fe:widget\/js\/base\/detect.js?1.1.11"]},"2e29525_fe44":{"src":"http:\/\/s1.hao123img.com\/resource\/tuijian\/widget\/index\/kuaixun.2e29525.js?1.1.11","type":"js","deps":["ed29b1dff_99f2"],"mods":["tuijian:widget\/index\/kuaixun.js?1.1.11"]},"5a7c104a8_7959":{"src":"http:\/\/s2.hao123img.com\/resource\/fe\/js\/lib\/main.5a7c104a8.js?1.1.11","type":"js","deps":[],"mods":["fe:resource\/js\/lib\/main.js?1.1.11"]}});</script> <script>BigPipe.onPageletArrive({"id":null,"children":[],"renderMode":"default","parent":null,"deps":{"beforedisplay":["d8b3cc9ac_29e3","38645dd_f7dd","8d1d978b0_a316","6cca09af6_f07f","a0832ac19_fb25","25330c25d_ce62","deba0d4c0_c8fe","1c81d5fc6_a695","0c7877e81_8719","6e9548c75_e646","38645dd_0f3e","3f6d691_9321","4d7a174_ccfc","9e71d5b_bed3","b016c1d_d1a3","e073b71_9403","77f7c66_45f3","95a138325_0731"],"load":["ed29b1dff_99f2","499abaa0e_acda","2009b1512_46d0","9a092a7f1_2a6f","f271c78_c7d7","4d39d64_93de","3ac67f28c_b365"]},"hooks":{"load":["__cb_0_1","__cb_0_2","__cb_0_3","__cb_0_4","__cb_0_5","__cb_0_6","__cb_0_7","__cb_0_8","__cb_0_9","__cb_0_10","__cb_0_11","__cb_0_12","__cb_0_13"]}});</script> <!--24343361510346110218060803--> <script> var _trace_page_logid = 2434336151; </script>
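
The http.get() call shown earlier prints whatever the server returns, including error pages. As a minimal sketch (the helper name fetchPage and the decision to treat any non-200 status as a failure are assumptions for illustration, not part of the original article), the request can be wrapped in a Promise that checks the status code first:

```javascript
const http = require('http');

// Hypothetical helper (not from the original article): fetch a page over
// plain HTTP and resolve with its body as a UTF-8 string.
function fetchPage(url) {
  return new Promise((resolve, reject) => {
    http.get(url, (res) => {
      if (res.statusCode !== 200) {
        res.resume(); // discard the body so the socket is released
        reject(new Error('Request failed with status ' + res.statusCode));
        return;
      }
      res.setEncoding('utf8'); // decode multi-byte characters across chunk boundaries
      let data = '';
      res.on('data', (chunk) => { data += chunk; });
      res.on('end', () => resolve(data));
    }).on('error', reject);
  });
}
```

It would be called as fetchPage('http://tuijian.hao123.com/hotrank').then(console.log).catch(console.error); redirects are not followed in this sketch.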


# Relevant source code

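How can we extract useful data from this source? First, note that nodeJS does not provide the document object, so the browser DOM API is unavailable. The brute-force alternative is to process the text with regular expressions.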
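
To make the regular-expression route concrete, here is a minimal sketch. The sample markup and the helper name extractTitles are assumptions made for illustration; only the class name point-title is taken from the selectors used in this article's code.

```javascript
// A deliberately simple regex-based extractor, for comparison only.
// The sample markup below is hypothetical, not copied from the real page.
function extractTitles(html) {
  const pattern = /<a[^>]*class="point-title"[^>]*>([^<]*)<\/a>/g;
  const titles = [];
  let match;
  // exec() with the g flag walks through every match in turn
  while ((match = pattern.exec(html)) !== null) {
    titles.push(match[1].trim());
  }
  return titles;
}

const sample =
  '<div class="point-bd">' +
  '<a class="point-title" href="#">变形计</a>' +
  '<a class="point-title" href="#">来吧冠军</a>' +
  '</div>';

console.log(extractTitles(sample)); // [ '变形计', '来吧冠军' ]
```

A pattern like this stops matching as soon as the attribute quoting or the nesting of the markup changes, which is why a DOM-style parser is usually the more robust choice.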
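【Installation】
The examples below rely on the cheerio library, loaded with require('cheerio'). It is normally installed from npm with: npm install cheerio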
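【Usage】
cheerio is used much like jQuery, so it is very easy to pick up. As an example, the following gets the names of the top 10 variety shows by popularity:

    var http = require('http');
    var cheerio = require('cheerio');

    http.get('http://tuijian.hao123.com/hotrank', function (res) {
      var data = '';
      res.setEncoding('utf8'); // keep multi-byte characters intact across chunks
      res.on('data', function (chunk) { data += chunk; });
      res.on('end', function () { filter(data); });
    });

    function filter(data) {
      // Holds the titles of the 10 most-searched variety shows
      var result = [];
      // Convert the page source into a $ object
      var $ = cheerio.load(data);
      // Find the element that holds each variety-show title
      var temp_arr = $('[monkey="zy"]').find('.point-bd').find('.point-title');
      // Save the titles into the result array in order
      temp_arr.each(function (index, item) {
        result.push($(item).text());
      });
      // [ '变形计','来吧冠军','拜托了冰箱','昆仑决','天生是优我','姐姐好饿','脑力男人时代','奔跑吧兄弟','我想和你唱','玫瑰之旅' ]
      console.log(result);
    }

Crawler code
Next, we scrape the rankings of six sections of the hao123 page: '实时热点' (real-time hot topics), '今日热点' (today's hot topics), '民生热点' (livelihood hot topics), '电影' (movies), '电视剧' (TV series) and '综艺' (variety shows). Each ranking is stored as an array inside an object named 'result'. The short names for the six arrays are 'ss', 'jr', 'ms', 'dy', 'dsj' and 'zy'; note that the code below keys each array by the list title read from the page rather than by these short names.
【The code is as follows】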

    var http = require('http');
    var cheerio = require('cheerio');

    http.get('http://tuijian.hao123.com/hotrank', function (res) {
      var data = '';
      res.setEncoding('utf8'); // keep multi-byte characters intact across chunks
      res.on('data', function (chunk) { data += chunk; });
      res.on('end', function () { filter(data); });
    });

    function filter(data) {
      // Holds the top-10 titles of each list:
      // each key is a list name such as '实时热点',
      // each value is an array of 10 titles
      var result = {};
      // Convert the page source into a $ object
      var $ = cheerio.load(data);
      // Find the divs of the six lists: '实时热点', '今日热点', '民生热点', '电影', '电视剧', '综艺'
      var temp_div = $('.top-wrap');
      // Holds the list names
      var temp_title = [];
      temp_div.each(function (index, item) {
        // Find the list name and save it in the temp_title array
        temp_title.push($(item).find('h2').text());
        // Find the element that holds each title in this list
        var temp_arr = $(item).find('.point-bd').find('.point-title');
        // Initialize this list's entry in result as an array
        var innerResult = result[temp_title[index]] = [];
        // Save the titles into that list's array in order
        temp_arr.each(function (_index, _item) {
          innerResult.push($(_item).text());
        });
      });
      console.log(result);
    }

【The result is as follows】

    { '实时热点':
       [ '美国逮捕女斯诺登', '成都隐秘母乳买卖', '曝周杰伦青涩旧照', '老头公交强吻女孩', '王传君恋情曝光',
         '杭州现奇葩窗口', '忘带全班准考证', '未成年持械拍网红', '9秒揍儿子8拳', '戴耳机穿轨道被撞' ],
      '今日热点':
       [ '北京回龙观大火', '选美冠军车祸身亡', '2017高考', '成都老火锅店被查', '陈浩民娇妻秀身材',
         '海边直播发现浮尸', '曝印小天遭妻骗婚', '苹果开发者大会', '6万斤鱼缺氧死亡', '安以轩夏威夷大婚' ],
      '民生热点':
       [ '北京回龙观大火', '2017高考', '成都老火锅店被查', '海边直播发现浮尸', '苹果开发者大会',
         '6万斤鱼缺氧死亡', '北控外援训练猝死', '武汉男子裸体捅人', '多国与卡塔尔断交', '美驻华外交官辞职' ],
      '电影':
       [ '神奇女侠', '异星觉醒', '新木乃伊', '中国推销员', '荡寇风云',
         '异兽来袭', '李雷和韩梅梅', '北极星', '美好的意外', '夏天19岁的肖像' ],
      '电视剧':
       [ '龙珠传奇', '楚乔传', '欢乐颂2', '欢乐颂', '职场是个技术活',
         '择天记', '美食大冒险', '废柴兄弟', '人民的名义', '三生三世十里桃花' ],
      '综艺':
       [ '变形计', '来吧冠军', '拜托了冰箱', '昆仑决', '天生是优我',
         '姐姐好饿', '脑力男人时代', '奔跑吧兄弟', '我想和你唱', '玫瑰之旅' ] }
    [Finished in 0.7s]

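
An object shaped like the result printed above is easy to read but awkward to store or sort. As a rough sketch (the helper name toRankedRows, the row field names and the output file location are all assumptions for illustration, and the sample input is a two-entry excerpt of the printed result), it can be flattened into one row per title and written out as JSON:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: turn { listName: [title, ...] } into flat rows,
// with rank derived from each title's position in its array.
function toRankedRows(result) {
  const rows = [];
  for (const [list, titles] of Object.entries(result)) {
    titles.forEach((title, index) => {
      rows.push({ list, rank: index + 1, title });
    });
  }
  return rows;
}

// Excerpt of the crawler output, used here as sample input.
const sampleResult = {
  '电影': ['神奇女侠', '异星觉醒'],
  '综艺': ['变形计', '来吧冠军'],
};

const rows = toRankedRows(sampleResult);
// Assumed output location: a file in the system temp directory.
const outFile = path.join(os.tmpdir(), 'hotrank.json');
fs.writeFileSync(outFile, JSON.stringify(rows, null, 2), 'utf8');
console.log(rows.length + ' rows written to ' + outFile);
```

In the crawler itself, filter() would pass its own result object to such a helper instead of the sample.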