PHP practice: crawling Bilibili barrage (danmaku) data
Bilibili is a video site popular in China, best known for its barrage (danmaku): viewer comments that scroll across the video as it plays. The site holds a wide range of data, and barrage data is especially valuable, so many data analysts and researchers want to collect it. This article shows how to crawl Bilibili barrage data with PHP.
Before crawling the barrage data, we need a PHP project to work in. This walkthrough uses Symfony 2, a general-purpose web framework rather than a dedicated crawler framework. Its legacy installer, which targets older Symfony releases, can be set up with the following commands:
$ curl -LsS https://symfony.com/installer -o /usr/local/bin/symfony
$ chmod a+x /usr/local/bin/symfony
After that, we use Composer to install the Guzzle HTTP client and the PHP-DI dependency injection container:
$ composer require guzzlehttp/guzzle php-di/php-di
Next, we need the numeric ID of the video whose comments we want; the API request takes it as the oid parameter. It can be found by opening the video page on Bilibili and inspecting the page with the browser's developer tools (F12).
After obtaining the video's ID, we can use Guzzle to send a GET request and retrieve the barrage list as XML. The following code fetches and parses it:
$client = new \GuzzleHttp\Client();
$res = $client->request('GET', "https://api.bilibili.com/x/v1/dm/list.so?oid={$oid}");
// Cast the body stream to a string before handing it to SimpleXML
$xml = simplexml_load_string((string) $res->getBody(), 'SimpleXMLElement', LIBXML_NOCDATA);
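A response body is not always valid XML (an error page or an empty reply, for instance), so it can help to parse it defensively. The following is a minimal sketch, not part of the original walkthrough: the helper name loadDanmakuXml and the sample body are hypothetical, and the sample assumes the feed wraps each comment in a d element with a p attribute.

```php
<?php
declare(strict_types=1);

/**
 * Parse a response body as XML, returning null instead of emitting
 * warnings when the body is not well-formed.
 * (Hypothetical helper name, used for illustration only.)
 */
function loadDanmakuXml(string $body): ?SimpleXMLElement
{
    // Collect libxml errors internally instead of raising PHP warnings
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($body, 'SimpleXMLElement', LIBXML_NOCDATA);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $xml === false ? null : $xml;
}

// Hypothetical sample body; the <i><d p="...">text</d></i> layout is an assumption
$sample = '<?xml version="1.0" encoding="UTF-8"?><i>'
    . '<d p="12.5,1,25,16777215,1600000000,0,abcd1234,1">hello</d>'
    . '<d p="13.0,1,25,16777215,1600000001,0,abcd5678,2">world</d>'
    . '</i>';

$xml = loadDanmakuXml($sample);
echo $xml === null ? "invalid\n" : count($xml->d) . " comments\n";
```

Returning null keeps the decision about what to do with a bad response (retry, skip, log) with the caller.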
After successfully obtaining the barrage list information, we encapsulate it into an array:
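$items = [];
// Each comment is a <d> element; its "p" attribute holds comma-separated metadata
foreach ($xml->d as $d) {
    // Only the first five fields are named here; the fifth is kept apart from $time
    [$time, $type, $size, $color, $sentAt] = explode(',', (string) $d['p']);
    $content = (string) $d;
    $items[] = [
        'time' => (float) $time,
        'content' => $content
    ];
}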
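The comma-separated p attribute is easier to work with once its fields have names and types. The sketch below is illustrative only: the function name parseDanmakuAttributes is hypothetical, and the meaning given to the first five fields (offset in seconds, display mode, font size, decimal colour, send timestamp) is an assumption about the feed format rather than something the article states.

```php
<?php
declare(strict_types=1);

/**
 * Split one "p" attribute into named, typed fields.
 * Field meanings are assumptions about the feed format.
 */
function parseDanmakuAttributes(string $p): array
{
    $parts = explode(',', $p);
    if (count($parts) < 5) {
        throw new InvalidArgumentException("Unexpected p attribute: {$p}");
    }

    return [
        'time'    => (float) $parts[0],                    // offset into the video, in seconds
        'mode'    => (int) $parts[1],                      // display mode
        'size'    => (int) $parts[2],                      // font size
        'color'   => sprintf('#%06X', (int) $parts[3]),    // decimal colour as hex
        'sent_at' => (int) $parts[4],                      // Unix timestamp
    ];
}

$fields = parseDanmakuAttributes('12.5,1,25,16777215,1600000000,0,abcd1234,1');
echo $fields['time'], ' ', $fields['color'], "\n";
```

Validating the field count up front avoids undefined-index notices if a row is truncated.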
Once we have the barrage data, we can save it to a database for later analysis:
$builder = $this->db->createQueryBuilder();
foreach ($items as $item) {
    // Bind values as named parameters rather than concatenating them into SQL
    $builder->insert('danmaku')
        ->values([
            '`time`' => ':time',
            '`content`' => ':content'
        ])
        ->setParameters([
            'time' => $item['time'],
            'content' => $item['content']
        ])
        ->execute();
}
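When many rows are inserted at once, preparing the statement once and wrapping the loop in a transaction is a common alternative. The following sketch uses plain PDO instead of the query builder shown above. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database purely as a stand-in. The saveDanmaku function and the table definition are hypothetical.

```php
<?php
declare(strict_types=1);

// Assumption: pdo_sqlite is installed; the in-memory database is a stand-in
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE danmaku (time REAL NOT NULL, content TEXT NOT NULL)');

/**
 * Insert all items in one transaction with a single prepared statement.
 * Returns the number of items written.
 */
function saveDanmaku(PDO $pdo, array $items): int
{
    $stmt = $pdo->prepare('INSERT INTO danmaku (time, content) VALUES (:time, :content)');

    $pdo->beginTransaction();
    try {
        foreach ($items as $item) {
            $stmt->execute([
                ':time' => $item['time'],
                ':content' => $item['content'],
            ]);
        }
        $pdo->commit();
    } catch (Throwable $e) {
        // Undo the partial batch and let the caller handle the error
        $pdo->rollBack();
        throw $e;
    }

    return count($items);
}

$saved = saveDanmaku($pdo, [
    ['time' => 12.5, 'content' => 'hello'],
    ['time' => 13.0, 'content' => 'world'],
]);
$total = (int) $pdo->query('SELECT COUNT(*) FROM danmaku')->fetchColumn();
echo "saved {$saved}, table now holds {$total} rows\n";
```

A single transaction means either every comment from the batch is stored or none is, which keeps later counts consistent.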
Next, we can analyze the collected barrage data and present it. We can use PHP together with Highcharts, a data visualization library, to chart the number of barrage comments over time. The following code queries the data and renders the chart:
$builder = $this->db->createQueryBuilder();
// Count comments per whole second of video time
$data = $builder->select('COUNT(*) as cnt, FLOOR(`time`) as time')
    ->from('danmaku')
    ->groupBy('floor(`time`)')
    ->execute()
    ->fetchAll(PDO::FETCH_ASSOC);

echo $twig->render('danmaku.html.twig', [
    'data' => $data
]);
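Highcharts.chart('container', {
    chart: { type: 'spline' },
    title: { text: 'Barrage count' },
    xAxis: { title: { text: 'Time' } },
    yAxis: { title: { text: 'Count' } },
    credits: { enabled: false },
    series: [{
        name: 'Barrage count',
        // Emit raw JSON (no HTML escaping) and turn each row into a numeric [x, y] pair
        data: {{ data|json_encode|raw }}.map(function (row) {
            return [Number(row.time), Number(row.cnt)];
        })
    }]
});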
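Database drivers often return aggregate columns as strings, while a chart series is usually built from numeric [x, y] pairs ordered by x. One option is to do that conversion on the server before rendering. The sketch below is a hypothetical helper (toSeries is not part of the article's code) that assumes rows with the cnt and time keys produced by the query above.

```php
<?php
declare(strict_types=1);

/**
 * Convert rows such as ['cnt' => '3', 'time' => '12'] into
 * numeric [second, count] pairs sorted by second.
 */
function toSeries(array $rows): array
{
    $points = [];
    foreach ($rows as $row) {
        $points[] = [(int) $row['time'], (int) $row['cnt']];
    }

    // Line-style charts generally expect points in ascending x order
    usort($points, fn (array $a, array $b): int => $a[0] <=> $b[0]);

    return $points;
}

// Hypothetical rows, shaped like the result of the grouped query
$rows = [
    ['cnt' => '3', 'time' => '12'],
    ['cnt' => '1', 'time' => '5'],
];
echo json_encode(toSeries($rows)), "\n";
```

With this shape the template only has to print the encoded array, and the JavaScript side needs no per-row conversion.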
In this article, we used PHP, with Symfony 2 as the project framework and Guzzle as the HTTP client, to crawl Bilibili barrage data, store it, and chart the number of barrage comments. Along the way, we saw how to send a GET request from PHP to obtain a video's barrage data and how to display the results with Highcharts.