Regular expression to filter html in c#

高洛峰
2017-01-13 17:23:45

This article shows a C# method that downloads the HTML source of a web page and retries when the download fails. The code is as follows:

using System;
using System.IO;
using System.Net;
using System.Text;
using System.Windows.Forms;

public static class DownLoad_HTML
{
    private static int FailCount = 0; // number of consecutive failed download attempts

    // Downloads the HTML source of the page at the given URL.
    public static string GetHtml(string url)
    {
        string str = string.Empty;
        try
        {
            WebRequest request = WebRequest.Create(url);
            request.Timeout = 10000; // download timeout in milliseconds
            request.Headers.Set("Pragma", "no-cache");
            // The using blocks release the connection and streams even if reading fails.
            using (WebResponse response = request.GetResponse())
            using (Stream streamReceive = response.GetResponseStream())
            {
                // Text encoding of the page; change to "utf-8" for UTF-8 pages.
                // On .NET Core and later, gb2312 is only available after registering CodePagesEncodingProvider.
                Encoding encoding = Encoding.GetEncoding("gb2312");
                using (StreamReader streamReader = new StreamReader(streamReceive, encoding))
                {
                    str = streamReader.ReadToEnd();
                }
            }
        }
        catch (Exception ex)
        {
            FailCount++;

            if (FailCount > 5)
            {
                // After more than five failures, ask the user whether to keep trying.
                var result = MessageBox.Show("Download has failed " + FailCount + " times. Keep trying?" + Environment.NewLine + ex.ToString(), "Data download error", MessageBoxButtons.YesNo, MessageBoxIcon.Error);
                if (result == DialogResult.Yes)
                {
                    str = GetHtml(url);
                }
                else
                {
                    MessageBox.Show("Failed to download HTML" + Environment.NewLine + ex.ToString(), "Failed to download HTML", MessageBoxButtons.OK, MessageBoxIcon.Error);
                    FailCount = 0; // reset so the next download starts with a fresh count
                    throw; // rethrow without losing the original stack trace
                }
            }
            else
            {
                str = GetHtml(url); // retry automatically
            }
        }

        FailCount = 0; // reaching this point means the download finally succeeded
        return str;
    }
}
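
The method above retries by calling itself recursively and keeps its failure counter in a static field, so all callers share one counter. As a rough sketch of an alternative, the retry can be written as a bounded loop. The `RetryHelper` class, its `Retry` method, and the `maxAttempts` and `delay` parameters are hypothetical names introduced here for illustration; they are not part of the original code or of .NET.

```csharp
using System;
using System.Threading;

// Hypothetical helper, not part of the original article.
public static class RetryHelper
{
    // Runs action up to maxAttempts times and returns its first successful result.
    // If every attempt fails, the exception from the last attempt propagates to the caller.
    public static T Retry<T>(Func<T> action, int maxAttempts, TimeSpan delay)
    {
        if (action == null) throw new ArgumentNullException(nameof(action));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Swallow the failure and wait before the next attempt.
                // On the final attempt the filter is false, so the exception is not caught.
                Thread.Sleep(delay);
            }
        }
    }
}
```

Assuming a single-attempt download function (called `DownloadOnce` here purely for illustration), a call might look like `RetryHelper.Retry(() => DownloadOnce(url), 5, TimeSpan.FromSeconds(1))`. The attempt counter is a local variable, so separate calls do not interfere with each other, and the caller decides how to report the final failure instead of the helper showing a dialog.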

I hope this article helps with your C# programming.
