How to use flv.js? Comprehensive interpretation of flv.js code

A disclaimer first: I don't know much about JavaScript and am only familiar with the audio and video processing side, so mistakes are inevitable. Corrections are welcome.

The flv.js codebase is fairly large. If you want to study it, I suggest starting with demux. Once you understand demux, you have mastered the key step of media data processing, and both the downloading of media data before it and the playback after it become easy to understand.

First, let’s spread some background knowledge. Why does HTML5 video playback use flv format?

Because of Flash. My title picture uses "flash RIP". Flash is dying, but its influence remains. Flash was the basic technology for Internet video over the past 10 years, and a large amount of infrastructure was built around it, such as CDNs that support RTMP and FLV over HTTP. To stay compatible with Flash playback on the Web, companies doing Internet live streaming almost invariably chose the FLV media format. During the transition from Flash to HTML5, it would be ideal if HTML5 could support the Flash-era protocols, which would allow a smooth migration. However, HTML5 does not natively support them. The flv.js project solves this problem by letting HTML5 play FLV streams. This is the historical background of the emergence and short-term popularity of flv.js.

The demux in flv.js is a set of parsers for the FLV media data format. If you want to understand the FLV format, the following documents must be read carefully.
Adobe’s official flv format description
http://www.adobe.com/content/dam/Adobe/en/devnet/flv/pdfs/video_file_format_spec_v10.pdf

Let's get to the point. flv.js code interpretation: the demux part

Open the code https://github.com/Bilibili/flv.js/blob/master/src/demux/flv-demuxer.js

static probe(buffer) {
    let data = new Uint8Array(buffer);
    let mismatch = {match: false};

    if (data[0] !== 0x46 || data[1] !== 0x4C || data[2] !== 0x56 || data[3] !== 0x01) {
        return mismatch;
    }

0x46, 0x4C and 0x56 are the ASCII codes of 'F', 'L' and 'V', the signature at the start of the FLV file header. The 0x01 that follows is the FLV version number. Together they are used to detect whether the data is in FLV format.

let hasAudio = ((data[4] & 4) >>> 2) !== 0;
let hasVideo = (data[4] & 1) !== 0;

Take the fifth byte (data[4]). Counting from the most significant bit, its sixth bit (mask 0x04) and eighth bit (mask 0x01) indicate whether audio and video data are present, respectively. The other bits are reserved and can be ignored.

parseChunks calls this probe. It waits until at least 13 bytes have been read, checks whether the data is FLV, and then continues with the rest of the parsing. Why 13? Because the FLV file header as treated here is 13 bytes long: the 9-byte header described under "The FLV header" in the PDF above, plus the four-byte size field that follows it. That size field holds the size of the previous tag, but since nothing precedes the first tag, the first size is always 0.
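To make the 13-byte layout concrete, here is a minimal sketch of a header parser. The function name parseFlvHeader and the shape of the returned object are made up for illustration and are not part of flv.js; only the byte layout comes from the FLV spec.

```javascript
// Hypothetical helper, not part of flv.js: parse the 9-byte FLV header
// plus the 4-byte PreviousTagSize0 that follows it (13 bytes in total).
function parseFlvHeader(buffer) {
    const data = new Uint8Array(buffer);
    if (data.length < 13) {
        return { match: false };  // not enough data to decide yet
    }
    // 'F' 'L' 'V' signature followed by version 1
    if (data[0] !== 0x46 || data[1] !== 0x4C || data[2] !== 0x56 || data[3] !== 0x01) {
        return { match: false };
    }
    const view = new DataView(data.buffer, data.byteOffset, data.byteLength);
    const flags = data[4];
    return {
        match: true,
        hasAudio: (flags & 0x04) !== 0,             // audio tags present
        hasVideo: (flags & 0x01) !== 0,             // video tags present
        headerSize: view.getUint32(5, false),       // big-endian, 9 for version 1
        previousTagSize0: view.getUint32(9, false)  // expected to be 0
    };
}
```

Passing in the first 13 bytes of a stream is enough for this sketch; a shorter buffer simply reports no match.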

The rest of parseChunks repeatedly parses tags. FLV calls each piece of media data a tag. Each tag has a type, and only three are used in practice: 8, 9 and 18, corresponding to audio, video and script data.

if (tagType !== 8 && tagType !== 9 && tagType !== 18) {
    Log.w(this.TAG, `Unsupported tag type ${tagType}, skipped`);
    // consume the whole tag (skip it)
    offset += 11 + dataSize + 4;
    continue;
}

This code checks the tag type. Note the number 11: the tag header is 11 bytes and is followed by the tag body of dataSize bytes and then a four-byte field holding the size of the tag just read. Adding all three to offset jumps to the start of the next tag.
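The skipping arithmetic can be sketched as a small loop that walks over complete tags. The helper name listTags and its return shape are assumptions made for this example; flv.js itself does this inline in parseChunks.

```javascript
// Hypothetical helper: list the complete tags found in `bytes` (a Uint8Array),
// starting at `offset`, the position right after the 13-byte file header.
function listTags(bytes, offset) {
    const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    const tags = [];
    while (offset + 11 + 4 <= bytes.length) {
        const tagType = view.getUint8(offset);
        // UI24 data size, big-endian
        const dataSize = (view.getUint8(offset + 1) << 16) |
                         (view.getUint8(offset + 2) << 8) |
                         view.getUint8(offset + 3);
        if (offset + 11 + dataSize + 4 > bytes.length) {
            break;  // incomplete tag, wait for more data
        }
        tags.push({ tagType, dataOffset: offset + 11, dataSize });
        // 11-byte tag header + body + 4-byte previous tag size
        offset += 11 + dataSize + 4;
    }
    return tags;
}
```

Stopping at the first incomplete tag mirrors how a streaming parser has to behave: the remaining bytes are kept and parsed again once more data arrives.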

The tag header has the following format. UI stands for unsigned integer, and the number after it is the width in bits.

UI8 tag type
UI24 data size
UI24 timestamp
UI8 TimestampExtended
UI24 StreamID

See how it adds up to exactly 11 bytes? To save traffic, Adobe never uses 32 bits where 24 bits will do, yet it still reserves an extension byte that stores the highest byte of the timestamp. This design is painful and leads to the odd code below: it first reads three bytes and converts them to an integer in big-endian order, then places the fourth byte in the high bits.

let ts2 = v.getUint8(4);
let ts1 = v.getUint8(5);
let ts0 = v.getUint8(6);
let ts3 = v.getUint8(7);
let timestamp = ts0 | (ts1 << 8) | (ts2 << 16) | (ts3 << 24);
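Putting the five header fields together, a whole tag header could be decoded roughly as follows. parseTagHeader is a hypothetical name used only for this sketch. Note that JavaScript bitwise operators work on signed 32-bit integers, so an extension byte of 0x80 or above would give a negative timestamp.

```javascript
// Hypothetical helper: decode the 11-byte tag header that starts at `offset`.
function parseTagHeader(bytes, offset) {
    const v = new DataView(bytes.buffer, bytes.byteOffset + offset, 11);
    // Read a big-endian unsigned 24-bit integer at position `pos`
    const readUI24 = (pos) =>
        (v.getUint8(pos) << 16) | (v.getUint8(pos + 1) << 8) | v.getUint8(pos + 2);
    return {
        tagType: v.getUint8(0),
        dataSize: readUI24(1),
        // lower 24 bits first, then the extension byte as bits 24..31
        timestamp: readUI24(4) | (v.getUint8(7) << 24),
        streamId: readUI24(8)
    };
}
```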

After parsing the tag header, different parsing functions are called according to different tag types.

switch (tagType) {
    case 8:  // Audio
        this._parseAudioData(chunk, dataOffset, dataSize, timestamp);
        break;
    case 9:  // Video
        this._parseVideoData(chunk, dataOffset, dataSize, timestamp, byteStart + offset);
        break;
    case 18:  // ScriptDataObject
        this._parseScriptData(chunk, dataOffset, dataSize);
        break;
}

TAG type: 8 audio

The audio structure is relatively simple. The first byte of AUDIODATA describes the audio format. In practice it is almost always AAC, 16-bit, stereo, 44.1 kHz, so the most common value is 0xAF, followed by AACAUDIODATA.
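To see why 0xAF is the usual value, the byte can be split into its four fields. This is an illustrative sketch; parseAudioSpec and its field names are not taken from flv.js.

```javascript
// Hypothetical helper: split the first byte of AUDIODATA into its four fields.
function parseAudioSpec(spec) {
    return {
        soundFormat: spec >>> 4,       // 4 bits, 10 means AAC
        soundRate: (spec >>> 2) & 3,   // 2 bits, 3 means 44 kHz
        soundSize: (spec >>> 1) & 1,   // 1 bit, 1 means 16-bit samples
        soundType: spec & 1            // 1 bit, 1 means stereo
    };
}
```

For 0xAF (binary 1010 1111) this yields format 10, rate 3, size 1 and type 1, which is the AAC, 44 kHz, 16-bit, stereo combination.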

TAG type: 9 video

The part that deserves the most attention is video.

let frameType = (spec & 240) >>> 4;
let codecId = spec & 15;

Two important values are extracted here. frameType indicates the frame type: 1 is a key frame and 2 is a non-key (inter) frame. codecId is the encoding type. Although FLV supports six video formats, Internet on-demand and live streaming in practice uses only H.264, so codecId is basically always 7. The author writes the masks in decimal here, but they are bit masks (240 is 0xF0 and 15 is 0x0F) and are easier to understand in hexadecimal.
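The same extraction written with hexadecimal masks might look like this. parseVideoSpec is a made-up name for the sketch; the logic is the same as the two lines quoted above.

```javascript
// Hypothetical helper: split the first byte of VIDEODATA into its two nibbles.
function parseVideoSpec(spec) {
    return {
        frameType: (spec & 0xF0) >>> 4,  // 1 = key frame, 2 = inter frame
        codecId: spec & 0x0F             // 7 = AVC (H.264)
    };
}
```

So a first byte of 0x17 is an H.264 key frame and 0x27 is an H.264 inter frame.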

_parseAVCVideoPacket is used to parse the AVCVIDEOPACKET structure, which is the H.264 video package

let packetType = v.getUint8(0);
let cts = v.getUint32(0, !le) & 0x00FFFFFF;

A word on CTS, the CompositionTime. The timestamp taken from the tag header earlier corresponds to the DTS of the video, the decoding timestamp. CTS is an offset: it is the difference between PTS and DTS, so PTS = DTS + CTS.

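There is a pitfall here. According to Adobe's document, CTS is a signed 24-bit integer, SI24, which means it can be negative. I therefore suspect that the flv.js code that parses cts has a bug, because it does not handle the negative case. When a negative 24-bit integer is converted to a 32-bit one, the sign bit in the high byte and the two's complement representation have to be handled by hand. (This is only a suspicion that I have not confirmed by debugging, but I did fall into this trap when processing YY live streaming data: a few videos containing B frames do have negative CTS values.)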
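One common way to sign-extend a 24-bit value in JavaScript is a pair of shifts. The sketch below is not flv.js code and readCompositionTime is a hypothetical name; it only illustrates the manual handling described above.

```javascript
// Hypothetical helper: read the SI24 CompositionTime that follows the
// one-byte packet type at `offset` and sign-extend it.
function readCompositionTime(view, offset) {
    const raw = (view.getUint8(offset + 1) << 16) |
                (view.getUint8(offset + 2) << 8) |
                view.getUint8(offset + 3);
    // Move the 24-bit value to the top of a 32-bit integer, then shift back
    // with the sign-propagating `>>` so that bit 23 becomes the sign bit.
    return (raw << 8) >> 8;
}
```

With the bytes FF FF D8 after the packet type this returns -40, whereas masking with 0x00FFFFFF alone would give 16777176.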

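Two packetType values matter here. 0 means AVCDecoderConfigurationRecord, the H.264 video information header that contains the SPS and PPS. The format of AVCDecoderConfigurationRecord is not defined by FLV but by the MPEG-4 standard for carrying H.264 (ISO/IEC 14496-15). If you decode with ffmpeg, this structure can be placed directly into the codec's extradata and handed to ffmpeg to interpret.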

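The flv.js author chose to parse this data structure by hand, and had little choice, because there is no ffmpeg in a JavaScript environment. The main purpose of parsing it is to extract the SPS and PPS. In theory there can be several SPS entries, but in practice there is usually just one.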

let config = SPSParser.parseSPS(sps);

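The PPS information is of little use, so the author only implemented an SPS parser. This shows that the author put a lot of effort into studying the H.264 standard: the Golomb decoding in it is fairly complex and not easy to get right. On PC and mobile platforms I have always used ffmpeg to do this parsing. The SPS contains important video information such as the resolution, frame rate, profile and level.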
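To give a feel for the Golomb decoding mentioned above, here is a minimal sketch of an Exp-Golomb reader for the unsigned ue(v) and signed se(v) codes used in the SPS. The BitReader class is hypothetical and is not the flv.js implementation. It also assumes the emulation prevention bytes (00 00 03) have already been removed from the input, which a real SPS parser has to do first.

```javascript
// Hypothetical minimal bit reader over a Uint8Array.
class BitReader {
    constructor(bytes) {
        this.bytes = bytes;
        this.bitPos = 0;  // index of the next bit to read
    }
    readBit() {
        if (this.bitPos >= this.bytes.length * 8) {
            throw new RangeError('read past end of data');
        }
        const byte = this.bytes[this.bitPos >> 3];
        const bit = (byte >> (7 - (this.bitPos & 7))) & 1;  // most significant bit first
        this.bitPos++;
        return bit;
    }
    readBits(count) {
        let value = 0;
        for (let i = 0; i < count; i++) {
            value = value * 2 + this.readBit();
        }
        return value;
    }
    // ue(v): count the leading zero bits, skip the 1, then read that many bits
    readUE() {
        let zeros = 0;
        while (this.readBit() === 0) {
            zeros++;
        }
        return Math.pow(2, zeros) - 1 + this.readBits(zeros);
    }
    // se(v): code numbers 1, 2, 3, 4 ... map to +1, -1, +2, -2 ...
    readSE() {
        const k = this.readUE();
        return (k & 1) ? (k + 1) / 2 : 0 - k / 2;
    }
}
```

For example, the bit patterns 1, 010, 011, 00100 and 00101 decode as ue(v) values 0, 1, 2, 3 and 4.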

A packetType of 1 means NALU. NALU stands for network abstraction layer unit, an H.264 concept. A simple way to think of it is as one frame of video data.

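There are two conventions for the NALU header. One begins with the four bytes 00 00 00 01, called a start code. The other, the MP4 style, begins with a four-byte big-endian size. FLV uses the latter, while the former is what we commonly see in raw H.264 streams.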
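The difference between the two conventions can be shown with a small converter. Both function names are hypothetical, and the sketch assumes a fixed four-byte size prefix as described above; in a real stream the prefix length is declared in the AVCDecoderConfigurationRecord.

```javascript
// Hypothetical helper: split a Uint8Array of length-prefixed NAL units
// (4-byte big-endian size before each unit) into separate views.
function splitLengthPrefixedNalus(bytes) {
    const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    const nalus = [];
    let offset = 0;
    while (offset + 4 <= bytes.length) {
        const size = view.getUint32(offset, false);  // big-endian
        offset += 4;
        if (offset + size > bytes.length) {
            break;  // truncated unit
        }
        nalus.push(bytes.subarray(offset, offset + size));
        offset += size;
    }
    return nalus;
}

// Rewrite the same units in start code style, each preceded by 00 00 00 01.
function toAnnexB(nalus) {
    const total = nalus.reduce((sum, n) => sum + 4 + n.length, 0);
    const out = new Uint8Array(total);
    let offset = 0;
    for (const nalu of nalus) {
        out.set([0, 0, 0, 1], offset);
        out.set(nalu, offset + 4);
        offset += 4 + nalu.length;
    }
    return out;
}
```

Calling toAnnexB(splitLengthPrefixedNalus(payload)) turns the FLV-style payload into the form usually found in raw H.264 streams.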

TAG type: 18 Script Data

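Besides audio and video data there is also ScriptData, an object description format that resembles a binary JSON. JavaScript is unlucky here and has to implement the parsing itself, while other platforms can reuse the librtmp code.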
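As a rough illustration of what such a "binary JSON" looks like, the sketch below decodes three of the simplest AMF0 value types. readAmf0Value is a hypothetical helper, not the flv.js AMF parser, and it leaves out objects, arrays and every other type.

```javascript
// Hypothetical helper: decode one AMF0 value of type Number, Boolean or
// String starting at `offset`. Returns the value and the bytes consumed.
function readAmf0Value(bytes, offset) {
    const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    const type = view.getUint8(offset);
    switch (type) {
        case 0:  // Number: 8-byte big-endian IEEE 754 double
            return { value: view.getFloat64(offset + 1, false), size: 9 };
        case 1:  // Boolean: one byte, non-zero means true
            return { value: view.getUint8(offset + 1) !== 0, size: 2 };
        case 2: {  // String: UI16 length followed by UTF-8 bytes
            const length = view.getUint16(offset + 1, false);
            const start = offset + 3;
            const text = new TextDecoder('utf-8').decode(bytes.subarray(start, start + length));
            return { value: text, size: 3 + length };
        }
        default:
            throw new Error('Unsupported AMF0 type ' + type);
    }
}
```

Each value starts with a one-byte type marker, which is what makes the format self-describing in the way JSON is.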

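I think that besides solving the FLV playback problem, the author also contributed basic building blocks to the front-end community, such as AMF parsing, SPS parsing and Golomb decoding, which can be reused in other projects.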

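After the FLV stream has been fetched over the transport protocol, demux separates out the properties and the packets of the audio and video data, which lays the foundation for playback later. Starting with demux is a good entry point for reading the code, and it must be read together with the FLV file format spec. Go over it several times until you know it by heart. I can now analyze FLV packets by hand from Wireshark capture data, which is a great help when debugging.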
