Solutions for garbled text when scraping


Garbled text can have many causes, and the right fix depends on the specific situation. The solutions below are offered for reference only.

1. Use QueryList's built-in handling for garbled text

The Query method:

QueryList::Query(target page, collection rules [, area selector] [, output encoding] [, input encoding] [, whether to remove the header])

1. Set the input and output encodings

$html =<<<STR
<div>
   <p>这是内容</p>
</div>
STR;
$rule = array(
   'content' => array('div>p:last','text')
);
// Output UTF-8; treat the source page as GB2312
$data = QueryList::Query($html,$rule,'','UTF-8','GB2312')->data;

2. Set the input and output encodings, and set the last parameter to true
If setting the input and output encodings alone does not fix the garbled text, also set the last parameter to true (remove the header).

$html =<<<STR
<div>
   <p>这是内容</p>
</div>
STR;
$rule = array(
   'content' => array('div>p:last','text')
);
// The final true tells QueryList to remove the header
$data = QueryList::Query($html,$rule,'','UTF-8','GB2312',true)->data;
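
If "remove the header" refers to dropping the page's <head> element (an assumption; the text above does not spell it out), a likely reason it helps is that a <meta> charset declaration in the head can disagree with the bytes actually being parsed. A minimal sketch of doing the same thing by hand, where stripHead is a hypothetical helper and not part of QueryList:

```php
<?php
// Hypothetical helper (not part of QueryList): remove the <head>...</head>
// element so a stale <meta charset> declaration cannot mislead the parser.
function stripHead(string $html): string
{
    // \b keeps <header> from matching; "s" lets "." span newlines, "i" ignores case.
    $result = preg_replace('/<head\b[^>]*>.*?<\/head>/is', '', $html);

    // preg_replace returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

$page = '<html><head><meta charset="gb2312"></head><body><p>text</p></body></html>';
echo stripHead($page), PHP_EOL;
```

The stripped page could then be passed to QueryList::Query in place of the original.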

2. Search the QueryList forum for topics about garbled text

Search results for 乱码 (garbled text): http://querylist.cc/search/q-5Lmx56CB#all

3. Transcode the page manually, then pass it to QueryList

$html =<<<STR
<div>
    <p>这是内容</p>
</div>
STR;
$rule = array(
    'content' => array('div>p:last','text')
);
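// Convert the page to UTF-8 yourself (this assumes $html holds GB2312-encoded bytes),
// then pass the converted page to QueryList without any encoding arguments
$html = iconv('GB2312','UTF-8',$html);
$data = QueryList::Query($html,$rule)->data;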
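
When the source encoding is not known in advance, the manual transcoding step can be guarded so that pages already in UTF-8 pass through untouched. A minimal sketch using PHP's mbstring extension; toUtf8 and its GBK fallback are illustrative assumptions, not part of QueryList:

```php
<?php
// Hypothetical helper: return $html as UTF-8, converting only when needed.
// $fallbackEncoding is an assumed source encoding for non-UTF-8 pages.
function toUtf8(string $html, string $fallbackEncoding = 'GBK'): string
{
    // Valid UTF-8 input is returned unchanged to avoid double conversion.
    if (mb_check_encoding($html, 'UTF-8')) {
        return $html;
    }
    return mb_convert_encoding($html, 'UTF-8', $fallbackEncoding);
}

// Simulate a GBK-encoded page, then normalise it back to UTF-8.
$gbkPage = mb_convert_encoding('<p>这是内容</p>', 'GBK', 'UTF-8');
$utf8Page = toUtf8($gbkPage);
```

The result can then be handed to QueryList::Query($utf8Page, $rule) with no encoding arguments.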