WebSocket+MSE: HTML5 live streaming technology analysis
Author | Liu Bo (UPYUN multimedia development engineer)
To meet the strong demand for live streaming on the mobile Web, a series of HTML5 live streaming technologies has developed rapidly.
Common live streaming technologies that can be used with HTML5 include HLS, WebSocket and WebRTC. This article introduces the key technical points of WebSocket and MSE, and finishes with a demo that shows how they are used in practice.
WebSocket protocol introduction
WebSocket Client/Server API introduction
MSE Introduction
fMP4 Introduction
Demo Display
Typical web applications are built around the HTTP request/response model. All HTTP communication is initiated by the client: the client sends a request, the server processes it and returns the result, and the client displays the data. Because this model cannot meet the needs of real-time applications, long-lived "server push" techniques such as SSE and Comet emerged.
WebSocket is a communication protocol built on TCP that provides full-duplex communication over a single TCP connection. The IETF standardized it as RFC 6455 in 2011, later updated by RFC 7936, and the W3C standardized the WebSocket API.
WebSocket is an independent protocol layered directly on TCP. HTTP concepts do not carry over to WebSocket; the only link is the handshake, which uses the HTTP 101 status code to switch protocols. By default it uses TCP port 80 (443 for wss://), which lets it pass through most firewalls.
To make deploying new protocols easier, HTTP/1.1 introduced the Upgrade mechanism, which lets the client and server use existing HTTP syntax to switch to another protocol. The mechanism is described in detail in RFC 7230, section 6.7 (Upgrade).
To initiate an HTTP/1.1 protocol upgrade, the client must specify these two fields in the request header ▽
> Connection: Upgrade
> Upgrade: protocol-name[/protocol-version]
If the server agrees to the upgrade, it responds like this ▽
> HTTP/1.1 101 Switching Protocols
> Connection: upgrade
> Upgrade: protocol-name[/protocol-version]
>
> [... data defined by new protocol ...]
As you can see, the status code of the HTTP Upgrade response is 101, and the response body can use the data format defined by the new protocol.
The WebSocket handshake takes advantage of this HTTP Upgrade mechanism. Once the handshake is complete, subsequent data transfer is done directly over TCP.
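As a rough illustration of the handshake, the sketch below shows how a server could derive the `Sec-WebSocket-Accept` header from the client's `Sec-WebSocket-Key`. These two headers and the fixed GUID come from RFC 6455 rather than from the text above, and the helper names `computeAccept` and `buildUpgradeResponse` are made up for this example. It assumes Node.js and its built-in `crypto` module.

```javascript
// Node.js sketch: deriving Sec-WebSocket-Accept as described in RFC 6455.
const crypto = require('crypto');

// Fixed GUID defined by RFC 6455; it is the same for every WebSocket server.
const WS_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// computeAccept is an illustrative helper name, not part of any library.
// It returns base64(SHA-1(key + GUID)).
function computeAccept(secWebSocketKey) {
  return crypto
    .createHash('sha1')
    .update(secWebSocketKey + WS_GUID)
    .digest('base64');
}

// Builds the 101 response head a server would write back before switching
// the connection over to WebSocket frames.
function buildUpgradeResponse(secWebSocketKey) {
  return [
    'HTTP/1.1 101 Switching Protocols',
    'Upgrade: websocket',
    'Connection: Upgrade',
    'Sec-WebSocket-Accept: ' + computeAccept(secWebSocketKey),
    '',
    '',
  ].join('\r\n');
}

// Sample key taken from the example in RFC 6455.
console.log(computeAccept('dGhlIHNhbXBsZSBub25jZQ=='));
// prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```

In practice a WebSocket library performs this step for you; the point here is only that the handshake is ordinary HTTP until the 101 response has been sent.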
Mainstream browsers now provide a WebSocket API, which can send messages (text or binary) to the server and receive response data in an event-driven way.
Step 1. Check whether the browser supports WebSocket
> if (window.WebSocket) {
>   // WebSocket code
> }
Step 2. Establish a connection
> var ws = new WebSocket('ws://localhost:8327');
Step 3. Register callback functions and send and receive data
Register the onopen, onclose, onerror and onmessage callback functions on the WebSocket object.
Send data with ws.send(). Besides strings, Blob and ArrayBuffer data can also be sent.
To receive binary data, set the binaryType of the connection object to blob or arraybuffer.
> ws.binaryType = 'arraybuffer';
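The three steps above can be combined into one small function. This is only a sketch: `openStream` is a hypothetical helper name, the `'start'` message is a made-up application message rather than part of any protocol, and the `WebSocketImpl` parameter exists only so the function can be exercised outside a browser with a stand-in class.

```javascript
// openStream is a hypothetical helper. WebSocketImpl defaults to the
// browser's WebSocket but can be replaced by a stand-in for testing.
function openStream(url, onChunk, WebSocketImpl = globalThis.WebSocket) {
  // Step 1: feature check.
  if (typeof WebSocketImpl !== 'function') {
    throw new Error('WebSocket is not supported in this environment');
  }

  // Step 2: establish the connection.
  const ws = new WebSocketImpl(url);
  // Deliver binary frames as ArrayBuffer instead of the default Blob.
  ws.binaryType = 'arraybuffer';

  // Step 3: register callbacks.
  ws.onopen = () => {
    ws.send('start'); // made-up message asking the server to begin sending
  };
  ws.onmessage = (event) => {
    // Only binary frames are forwarded; text frames are ignored here.
    if (event.data instanceof ArrayBuffer) {
      onChunk(new Uint8Array(event.data));
    }
  };
  ws.onerror = () => console.error('WebSocket error');
  ws.onclose = (event) => console.log('WebSocket closed, code', event.code);

  return ws;
}
```

In a browser, a call might look like `openStream('ws://localhost:8327', (chunk) => console.log(chunk.byteLength))`.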
WebSocket Golang API
For a server-side WebSocket library, I recommend Google's own http://golang.org/x/net/websocket, which works conveniently with net/http: websocket.Handler converts a WebSocket handler function into an http.Handler, so it can be used with the net/http library.
Data is then received with websocket.Message.Receive and sent with websocket.Message.Send.
The specific code can be found in the Demo section below.
Before introducing MSE, let's first look at the limitations of the HTML5 <video> and <audio> tags.
Limitations of the HTML5 <video> and <audio> tags
Streaming is not supported
DRM and encryption are not supported
Controls are difficult to customize and to keep consistent across browsers
Codec and container support differs between browsers
MSE addresses the streaming limitation of the HTML5 media tags.
Media Source Extensions (MSE) is a new Web API supported by mainstream browsers such as Chrome, Safari, and Edge. MSE is a W3C standard that allows JavaScript to dynamically construct media streams for <audio> and <video>. It defines objects that allow JavaScript to pass media stream segments to an HTMLMediaElement.
With MSE, you can modify media streams dynamically without any plug-ins. This lets front-end JavaScript do more: repackaging, processing, and even transcoding.
Although MSE cannot pass a stream directly to a media tag, it provides the core technology for building cross-browser players, letting JavaScript push audio and video to the media tag through its API.
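A minimal sketch of pushing segments through the MSE API follows. A SourceBuffer processes one `appendBuffer()` call at a time, so incoming segments are queued until the `updateend` event fires. `SegmentQueue` and `attach` are hypothetical names, and the default MIME string is an assumption that must match the actual stream.

```javascript
// SegmentQueue is a hypothetical helper. It serializes appendBuffer() calls
// because a SourceBuffer rejects appends while an earlier one is in progress.
class SegmentQueue {
  constructor(sourceBuffer) {
    this.sourceBuffer = sourceBuffer;
    this.pending = [];
    // 'updateend' fires when the previous append has finished.
    sourceBuffer.addEventListener('updateend', () => this.flush());
  }

  // Queue a media segment (ArrayBuffer or typed array) for appending.
  push(segment) {
    this.pending.push(segment);
    this.flush();
  }

  // Append the next segment if the SourceBuffer is idle.
  flush() {
    if (this.sourceBuffer.updating || this.pending.length === 0) return;
    this.sourceBuffer.appendBuffer(this.pending.shift());
  }
}

// Browser-only wiring; defined here but not called. The MIME string is an
// assumption and has to match the codecs of the stream being played.
function attach(video, mime = 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"') {
  const mediaSource = new MediaSource();
  video.src = URL.createObjectURL(mediaSource);
  return new Promise((resolve) => {
    mediaSource.addEventListener('sourceopen', () => {
      resolve(new SegmentQueue(mediaSource.addSourceBuffer(mime)));
    }, { once: true });
  });
}
```

In a page, `attach(document.querySelector('video')).then((queue) => { ws.onmessage = (e) => queue.push(e.data); })` would feed each binary WebSocket message into the player, assuming `ws` is an open WebSocket whose `binaryType` is `'arraybuffer'`.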
Check browser support for MSE on caniuse.
You can further check whether a codec MIME type is supported with MediaSource.isTypeSupported().
The most commonly used video container formats are WebM and fMP4.
WebM and WebP are sister projects, both sponsored by Google. Because WebM is a container format based on Matroska, it is inherently streamable and well suited to streaming media.
The following focuses on the fMP4 format.
An MP4 file is composed of a series of boxes. An ordinary MP4 has a nested structure: the client must load the file from the beginning before it can play it completely, and it cannot start playing from the middle.
fMP4, by contrast, consists of a series of fragments. If the server supports byte-range requests, the client can request these fragments independently and play them without loading the entire file.
To illustrate this more concretely, here are several commonly used tools for analyzing MP4 files.
gpac, formerly known as mp4box, is a media development framework; its source tree contains a large number of media analysis tools, which can be found under testapps;
mp4box.js, the JavaScript version of mp4box;
bento4, an analysis tool specifically for MP4;
MSE itself is designed to be independent of any specific codec or container format, but the level of support differs between browsers.
You can check support by passing a MIME type string to the static method MediaSource.isTypeSupported. For example ▽
> MediaSource.isTypeSupported('audio/mp3'); // false
> MediaSource.isTypeSupported('video/mp4'); // true
> MediaSource.isTypeSupported('video/mp4; codecs="avc1.4D4028, mp4a.40.2"'); // true
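One way to use this check is to walk a list of candidate MIME strings and keep the first one the browser accepts. This is a sketch only: `pickSupportedType` and `CANDIDATES` are hypothetical names, the candidate strings are examples, and the `mediaSourceImpl` parameter is there so the function can run where no MediaSource exists.

```javascript
// pickSupportedType is a hypothetical helper. mediaSourceImpl defaults to the
// browser's MediaSource and can be replaced by a stand-in object.
function pickSupportedType(candidates, mediaSourceImpl = globalThis.MediaSource) {
  if (!mediaSourceImpl || typeof mediaSourceImpl.isTypeSupported !== 'function') {
    return null; // MSE is not available in this environment
  }
  for (const type of candidates) {
    if (mediaSourceImpl.isTypeSupported(type)) {
      return type; // first candidate the browser reports as supported
    }
  }
  return null; // none of the candidates is supported
}

// Example candidates, ordered by preference. These strings are assumptions;
// use the codec strings reported for your own files.
const CANDIDATES = [
  'video/mp4; codecs="avc1.42E01E, mp4a.40.2"',
  'video/webm; codecs="vp8, vorbis"',
];

console.log(pickSupportedType(CANDIDATES));
```

The result depends on the browser, so no fixed output is shown; outside a browser the function returns null because MediaSource is not defined.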
The codec MIME string can be obtained with the online [mp4info](http://nickdesaulniers.github.io/mp4info), or on the command line with mp4info test.mp4 | grep Codecs, which produces output similar to the following ▽
> mp4info fmp4.mp4| grep Codec
> Codecs String: mp4a.40.2
> Codecs String: avc1.42E01E
Currently, the MP4 container with H.264 + AAC is supported by all browsers.
An ordinary MP4 file cannot be used with MSE; the MP4 must be fragmented first.
To check whether an MP4 is already fragmented ▽
> mp4dump test.mp4 | grep "\[m"
A non-fragmented file produces output like this ▽
> mp4dump nfmp4.mp4 | grep "\[m"
> [mdat] size=8+50873
> [moov] size=8+7804
> [mvhd] size=12+96
> [mdia] size=8+3335
> [mdhd] size=12+20
> [minf] size=8+3250
> [mdia] size=8+3975
> [mdhd] size=12+20
> [minf] size=8+3890
> [mp4a] size=8+82
> [meta] size=12+78
A fragmented file produces output similar to this ▽
> mp4dump fmp4.mp4 | grep "\[m" | head -n 30
> [moov] size=8+1871
> [mvhd] size=12+96
> [mdia] size=8+312
> [mdhd] size=12+20
> [minf] size=8+219
> [mp4a] size=8+67
> [mdia] size=8+371
> [mdhd] size=12+20
> [minf] size=8+278
> [mdia] size=8+248
> [mdhd] size=12+20
> [minf] size=8+156
> [mdia] size=8+248
> [mdhd] size=12+20
> [minf] size=8+156
> [mvex] size=8+144
> [mehd] size=12+4
> [moof] size=8+600
> [mfhd] size=12+4
> [mdat] size=8+138679
> [moof] size=8+536
> [mfhd] size=12+4
> [mdat] size=8+24490
> [moof] size=8+592
> [mfhd] size=12+4
> [mdat] size=8+14444
> [moof] size=8+312
> [mfhd] size=12+4
> [mdat] size=8+1840
> [moof] size=8+600
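The same check can be approximated in JavaScript by walking the top-level boxes and looking for a moof box, which appears only in fragmented files. This sketch relies on the basic MP4 box layout (a 4-byte big-endian size followed by a 4-character type); `listTopLevelBoxes` and `isFragmented` are hypothetical helper names, and the whole file is assumed to be in memory already.

```javascript
// listTopLevelBoxes is a hypothetical helper. It walks the top-level boxes of
// an MP4 held in a Uint8Array and returns their type, offset and size.
function listTopLevelBoxes(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const boxes = [];
  let offset = 0;
  while (offset + 8 <= bytes.byteLength) {
    let size = view.getUint32(offset); // big-endian by default
    const type = String.fromCharCode(
      bytes[offset + 4], bytes[offset + 5], bytes[offset + 6], bytes[offset + 7]);
    let headerSize = 8;
    if (size === 1) {
      // A size of 1 means a 64-bit size follows the type field.
      if (offset + 16 > bytes.byteLength) break;
      size = Number(view.getBigUint64(offset + 8));
      headerSize = 16;
    } else if (size === 0) {
      // A size of 0 means the box extends to the end of the file.
      size = bytes.byteLength - offset;
    }
    if (size < headerSize) break; // malformed box; stop instead of looping
    boxes.push({ type, offset, size });
    offset += size;
  }
  return boxes;
}

// A file is treated as fragmented if any top-level box is a moof.
function isFragmented(bytes) {
  return listTopLevelBoxes(bytes).some((box) => box.type === 'moof');
}
```

In Node.js, `isFragmented(require('fs').readFileSync('fmp4.mp4'))` would work because a Buffer is a Uint8Array. Reading the whole file is fine for a quick check but wasteful for large files.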
Converting a non-fragmented MP4 into a fragmented MP4.
You can use FFmpeg's -movflags option to do the conversion.
When the source file is not an MP4 ▽
> ffmpeg -i trailer_1080p.mov -c:v copy -c:a copy -movflags frag_keyframe+empty_moov bunny_fragmented.mp4
When the source file is already an MP4 ▽
> ffmpeg -i non_fragmented.mp4 -movflags frag_keyframe+empty_moov fragmented.mp4
Or use mp4fragment ▽
> mp4fragment input.mp4 output.mp4
DEMO TIME
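To finish, here are two demos: the MSE Vod Demo and the MSE Live Demo.
MSE Vod Demo
This demo uses MSE and WebSocket to implement a video-on-demand service.
The backend reads an fMP4 file and sends it over WebSocket to MSE for playback.
MSE Live Demo
This demo uses MSE and WebSocket to implement a live streaming service.
The backend proxies an HTTP-FLV live stream and sends it over WebSocket to MSE for playback.
The front-end MSE part does a lot of work, including remuxing FLV into fMP4 in real time; it builds on the videojs-flow implementation.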
Refs
WebSocket
rfc6455
HTTP Upgrade
WebSocket API
MDN WebSocket
videojs-flow
MSE
W3C
MDN MSE
HTML5 Codec MIME