PHP采集利器:Snoopy 试用心得
?
$url = "http://www.taoav.com"; include("snoopy.php"); $snoopy = new Snoopy; $snoopy->fetch($url); //获取所有内容 echo $snoopy->results; //显示结果 $snoopy->fetchtext //获取文本内容(去掉html代码) $snoopy->fetchlinks //获取链接 $snoopy->fetchform //获取表单2 表单提交
$formvars["username"] = "admin"; $formvars["pwd"] = "admin"; $action = "http://www.taoav.com";//表单提交地址 $snoopy->submit($action,$formvars);//$formvars为提交的数组 echo $snoopy->results; //获取表单提交后的 返回的结果 $snoopy->submittext; //提交后只返回 去除html的 文本 $snoopy->submitlinks;//提交后只返回 链接?既然已经提交的表单 那就可以做很多事情 接下来我们来伪装ip,伪装浏览器
$formvars["username"] = "admin"; $formvars["pwd"] = "admin"; $action = "http://www.taoav.com"; include "snoopy.php"; $snoopy = new Snoopy; $snoopy->cookies["PHPSESSID"] = 'fc106b1918bd522cc863f36890e6fff7'; //伪装sessionid $snoopy->agent = "(compatible; MSIE 4.01; MSN 2.5; AOL 4.0; Windows 98)"; //伪装浏览器 $snoopy->referer = "http://www.only4.cn"; //伪装来源页地址 http_referer $snoopy->rawheaders["Pragma"] = "no-cache"; //cache 的http头信息 $snoopy->rawheaders["X_FORWARDED_FOR"] = "127.0.0.101"; //伪装ip $snoopy->submit($action,$formvars); echo $snoopy->results;?
-
原来我们可以伪装session 伪装浏览器 ,伪装ip, haha 可以做很多事情了。
$snoopy->proxy_host = "www.only4.cn"; $snoopy->proxy_port = "8080"; //使用代理 $snoopy->maxredirs = 2; //重定向次数 $snoopy->expandlinks = true; //是否补全链接 在采集的时候经常用到 // 例如链接为 /images/taoav.gif 可改为它的全链接 http://www.taoav.com/images/taoav.gif,这个地方其实可以在最后输出的时候用ereg_replace函数自己替换 $snoopy->maxframes = 5 //允许的最大框架数 //注意抓取框架的时候 $snoopy->results 返回的是一个数组 $snoopy->error //返回报错信息?上面的基本用法了解了,下面我就实例演示一次:
//echo var_dump($_SERVER); include("Snoopy.class.php"); $snoopy = new Snoopy; $snoopy->agent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; zh- CN; rv:1.9.0.5) Gecko/2008120122 Firefox/3.0.5 FirePHP/0.2.1";//这项是浏览器信 息,前面你用什么浏览器查看cookie,就用那个浏览器的信息(ps:$_SERVER可以查看到浏览器的信息) $snoopy->referer = "http://bbs.phpchina.com/index.php"; $snoopy->expandlinks = true; $snoopy->rawheaders["COOKIE"]="__utmz=17229162.1227682761.29.7.utmccn=(referral)|utmcsr=phpchina.com|utmcct=/html/index.html|utmcmd=referral; cdbphpchina_smile=1D2D0D1; cdbphpchina_cookietime=2592000; __utma=233700831.1562900865.1227113506.1229613449.1231233266.16; __utmz=233700831.1231233266.16.8.utmccn=(referral)|utmcsr=localhost:8080|utmcct=/test3.php|utmcmd=referral; __utma=17229162.1877703507.1227113568.1231228465.1231233160.58; uchome_loginuser=sinopf; xscdb_cookietime=2592000; __utmc=17229162; __utmb=17229162; cdbphpchina_sid=EX5w1V; __utmc=233700831; cdbphpchina_visitedfid=17; cdbphpchinaO766uPYGK6OWZaYlvHSuzJIP22VpwEMGnPQAuWCFL9Fd6CHp2e%2FKw0x4bKz0N9lGk; xscdb_auth=8106rAyhKpQL49eMs%2FyhLBf3C6ClZ%2B2idSk4bExJwbQr%2BHSZrVKgqPOttHVr%2B6KLPg3DtWpTMUI4ttqNNVpukUj6ElM; cdbphpchina_onlineusernum=3721"; $snoopy->fetch("http://bbs.phpchina.com/forum-17-1.html"); $n=ereg_replace("href=\"","href=\"http://bbs.phpchina.com/",$snoopy->results ); echo ereg_replace("src=\"","src=\"http://bbs.phpchina.com/",$n); ?>?这是模拟登陆PHPCHINA论坛的过程,首先要查看自己浏览器的信
$_SERVER['HTTP_USER_AGENT']后边的内容复制下来,粘在$snoopy->agent的地方,然后就是要查看自己的
COOKIE了,用自己在论坛的账号登陆论坛后,在浏览器地址栏里输入
javascript:document.write(document.cookie),回车,就可以看到自己的cookie信息,复制粘贴
到$snoopy->rawheaders["COOKIE"]=的后边。(我的cookie信息为了安全起见已经删除了一段内容)
?

Springboot内置tomcat禁止不安全HTTP方法1、在tomcat的web.xml中可以配置如下内容让tomcat禁止不安全的HTTP方法/*PUTDELETEHEADOPTIONSTRACEBASIC2、Springboot使用内置tomcat没有web.xml配置文件,可以通过以下配置进行,简单来说就是要注入到Spring容器中@ConfigurationpublicclassTomcatConfig{@BeanpublicEmbeddedServletContainerFacto

1.HttpURLConnection使用JDK原生提供的net,无需其他jar包,代码如下:importcom.alibaba.fastjson.JSON;importjava.io.BufferedReader;importjava.io.InputStream;importjava.io.InputStreamReader;importjava.io.OutputStream;importjava.net.HttpURLConnection;

一、前言#ssl写在443端口后面。这样http和https的链接都可以用listen443sslhttp2default_server;server_namechat.chengxinsong.cn;#hsts的合理使用,max-age表明hsts在浏览器中的缓存时间,includesubdomainscam参数指定应该在所有子域上启用hsts,preload参数表示预加载,通过strict-transport-security:max-age=0将缓存设置为0可以撤销hstsadd_head

随着互联网的不断发展和改善,Web服务器在速度和性能上的需求也越来越高。为了满足这样的需求,Nginx已经成功地掌握了HTTP2协议并将其融入其服务器的性能中。HTTP2协议要比早期的HTTP协议更加高效,但同时也存在着特定的安全问题。本文将为您详细介绍如何进行Nginx的HTTP2协议优化和安全设置。一、Nginx的HTTP2协议优化1.启用HTTP2在N

httpkeepalive在http早期,每个http请求都要求打开一个tpcsocket连接,并且使用一次之后就断开这个tcp连接。使用keep-alive可以改善这种状态,即在一次tcp连接中可以持续发送多份数据而不会断开连接。通过使用keep-alive机制,可以减少tcp连接建立次数,也意味着可以减少time_wait状态连接,以此提高性能和提高httpd服务器的吞吐率(更少的tcp连接意味着更少的系统内核调用,socket的accept()和close()调用)。但是,keep-ali

一、urllib概述:urllib是Python中请求url连接的官方标准库,就是你安装了python,这个库就已经可以直接使用了,基本上涵盖了基础的网络请求功能。在Python2中主要为urllib和urllib2,在Python3中整合成了urllib。Python3.x中将urllib2合并到了urllib,之后此包分成了以下四个模块:urllib.request:它是最基本的http请求模块,用来模拟发送请求urllib.error:异常处理模块,如果出现错误可以捕获这些异常urllib

一、概述在实际开发过程中,我们经常需要调用对方提供的接口或测试自己写的接口是否合适。很多项目都会封装规定好本身项目的接口规范,所以大多数需要去调用对方提供的接口或第三方接口(短信、天气等)。在Java项目中调用第三方接口的方式有:1、通过JDK网络类Java.net.HttpURLConnection;2、通过common封装好的HttpClient;3、通过Apache封装好的CloseableHttpClient;4、通过SpringBoot-RestTemplate;二、Java调用第三方

被动检查对于被动健康检查,nginx和nginxplus会在事件发生时对其进行监控,并尝试恢复失败的连接。如果仍然无法恢复正常,nginx开源版和nginxplus会将服务器标记为不可用,并暂时停止向其发送请求,直到它再次标记为活动状态。上游服务器标记为不可用的条件是为每个上游服务器定义的,其中包含块中server指令的参数upstream:fail_timeout-设置服务器标记为不可用时必须进行多次失败尝试的时间,以及服务器标记为不可用的时间(默认为10秒)。max_fails-设置在fai


Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

SAP NetWeaver Server Adapter for Eclipse
Integrate Eclipse with SAP NetWeaver application server.

Atom editor mac version download
The most popular open source editor

mPDF
mPDF is a PHP library that can generate PDF files from UTF-8 encoded HTML. The original author, Ian Back, wrote mPDF to output PDF files "on the fly" from his website and handle different languages. It is slower than original scripts like HTML2FPDF and produces larger files when using Unicode fonts, but supports CSS styles etc. and has a lot of enhancements. Supports almost all languages, including RTL (Arabic and Hebrew) and CJK (Chinese, Japanese and Korean). Supports nested block-level elements (such as P, DIV),

SecLists
SecLists is the ultimate security tester's companion. It is a collection of various types of lists that are frequently used during security assessments, all in one place. SecLists helps make security testing more efficient and productive by conveniently providing all the lists a security tester might need. List types include usernames, passwords, URLs, fuzzing payloads, sensitive data patterns, web shells, and more. The tester can simply pull this repository onto a new test machine and he will have access to every type of list he needs.
