
WebSocket+MSE——HTML5 live broadcast technology analysis


Author | Liu Bo (UPYUN Multimedia Development Engineer)

To meet the fast-growing demand for live streaming on the mobile Web, a series of HTML5 live streaming technologies have developed rapidly.

Common live streaming technologies usable with HTML5 include HLS, WebSocket, and WebRTC. This article introduces the technical points of WebSocket and MSE, and finally demonstrates their concrete usage through a demo.

Article outline

  • WebSocket protocol introduction

  • WebSocket Client/Server API introduction

  • MSE Introduction

  • fMP4 Introduction

  • Demo Display

WebSocket

Typical web applications are built around the HTTP request/response model: all HTTP communication is driven by the client, which sends a request to the server; the server processes it and returns a result, which the client then renders. Because this model cannot satisfy real-time applications, long-connection "server push" technologies such as SSE and Comet emerged.

WebSocket is a communication protocol built on TCP that provides full-duplex communication over a single TCP connection. It was standardized by the IETF as RFC 6455 in 2011 and supplemented by RFC 7936; the WebSocket API is standardized by the W3C.

WebSocket is a protocol built independently on top of TCP, and the concepts of the HTTP protocol do not apply to it. Its only relation to HTTP is that the handshake uses HTTP's 101 status code to switch protocols, and that it runs over TCP ports 80 and 443, which lets it pass through most firewall restrictions.

WebSocket Handshake

To make deploying new protocols easier, HTTP/1.1 introduced the Upgrade mechanism, which allows the client and server to upgrade an existing HTTP connection to a different protocol using ordinary HTTP syntax. This mechanism is described in detail in RFC 7230, Section 6.7 (Upgrade).

To initiate an HTTP/1.1 protocol upgrade, the client must specify these two fields in the request header ▽

> Connection: Upgrade
Upgrade: protocol-name[/protocol-version]

If the server agrees to the upgrade, it responds like this ▽

> HTTP/1.1 101 Switching Protocols
Connection: upgrade
Upgrade: protocol-name[/protocol-version]
[... data defined by new protocol ...]

As you can see, the status code of the HTTP Upgrade response is 101, and the response body can use the data format defined by the new protocol.

The WebSocket handshake takes advantage of this HTTP Upgrade mechanism. Once the handshake is complete, subsequent data transfer is done directly over TCP.
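
To make the mechanism concrete, here is a minimal sketch (plain Node.js, no external libraries; the port matches the client example below and is otherwise arbitrary) of how a server could complete the handshake: it hashes the client's Sec-WebSocket-Key together with the GUID fixed by RFC 6455 and replies with 101 Switching Protocols ▽

> const http = require('http');
const crypto = require('crypto');

// The GUID is fixed by RFC 6455; the server concatenates it with the
// client's Sec-WebSocket-Key and returns the SHA-1 hash in base64.
const WS_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

const server = http.createServer();

server.on('upgrade', (req, socket) => {
  const key = req.headers['sec-websocket-key'];
  const accept = crypto.createHash('sha1').update(key + WS_GUID).digest('base64');

  // Reply with 101 Switching Protocols; from this point on the TCP socket
  // carries WebSocket frames instead of HTTP messages.
  socket.write(
    'HTTP/1.1 101 Switching Protocols\r\n' +
    'Upgrade: websocket\r\n' +
    'Connection: Upgrade\r\n' +
    'Sec-WebSocket-Accept: ' + accept + '\r\n\r\n'
  );
});

server.listen(8327);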

WebSocket JavaScript API

Mainstream browsers now provide the WebSocket API, which can be used to send messages (text or binary) to the server and to receive event-driven responses.

Step1. Check whether the browser supports WebSocket

> if(window.WebSocket) {
    // WebSocket code
}

Step2. Establish a connection

> var ws = new WebSocket('ws://localhost:8327');

Step3. Register callback functions and send and receive data

Register the onopen, onclose, onerror and onmessage callback functions of the WebSocket object respectively.

Data is sent with ws.send(), which accepts not only strings but also Blob or ArrayBuffer data.

To receive binary data, set the binaryType of the connection object to 'blob' or 'arraybuffer'.

ws.binaryType = 'arraybuffer';
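
Putting these steps together, a minimal client sketch could look like the following (the URL and the message handling are placeholders for illustration) ▽

> if (window.WebSocket) {
  const ws = new WebSocket('ws://localhost:8327');
  ws.binaryType = 'arraybuffer'; // receive binary frames as ArrayBuffer

  ws.onopen = () => {
    ws.send('hello');                          // text frame
    ws.send(new Uint8Array([1, 2, 3]).buffer); // binary frame
  };

  ws.onmessage = (event) => {
    if (event.data instanceof ArrayBuffer) {
      console.log('binary message of', event.data.byteLength, 'bytes');
    } else {
      console.log('text message:', event.data);
    }
  };

  ws.onerror = (err) => console.error('WebSocket error', err);
  ws.onclose = () => console.log('connection closed');
}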

WebSocket Golang API

For the server-side WebSocket library, I recommend Google's golang.org/x/net/websocket package, which is very convenient to use together with net/http. A WebSocket handler function can be converted into an http.Handler with websocket.Handler, so that it works with the net/http library.

Data is then received with websocket.Message.Receive and sent with websocket.Message.Send.

The specific code can be found in the Demo section below.

MSE

Before introducing MSE, let's first look at the limitations of the HTML5 <video> and <audio> tags.

Limitations of the HTML5 <video> and <audio> tags

  • Streaming is not supported

  • DRM and encryption are not supported

  • Controls are difficult to customize and to keep consistent across browsers

  • Codec and container support differs between browsers

MSE solves the streaming problem of HTML5.

Media Source Extensions (MSE) is a new Web API supported by mainstream browsers such as Chrome, Safari, and Edge. MSE is a W3C standard that allows JavaScript to dynamically construct media streams for <audio> and <video>. It defines objects that allow JavaScript to feed media stream fragments to an HTMLMediaElement.

By using MSE, you can dynamically modify media streams without the need for any plug-ins. This allows front-end JavaScript to do more - repackaging, processing, and even transcoding in JavaScript.

Although MSE cannot transmit streams directly to media tags, MSE provides the core technology for building cross-browser players, allowing the browser to push audio and video to media tags through the JavaScript API.
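
As a minimal illustration of the API (the file URL and codec string below are assumptions), the following sketch fetches a fragmented MP4 and appends it to a SourceBuffer ▽

> const video = document.querySelector('video');
const mimeCodec = 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"';

if ('MediaSource' in window && MediaSource.isTypeSupported(mimeCodec)) {
  const mediaSource = new MediaSource();
  video.src = URL.createObjectURL(mediaSource);

  mediaSource.addEventListener('sourceopen', async () => {
    const sourceBuffer = mediaSource.addSourceBuffer(mimeCodec);
    const response = await fetch('fragmented.mp4'); // assumed file name
    const data = await response.arrayBuffer();

    sourceBuffer.addEventListener('updateend', () => {
      // All data appended: close the stream and start playback.
      mediaSource.endOfStream();
      video.play();
    });
    sourceBuffer.appendBuffer(data);
  });
} else {
  console.error('Unsupported MIME type or codec:', mimeCodec);
}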

Browser Support

Use caniuse to check whether the browser supports it.

You can further check whether the codec MIME type is supported through MediaSource.isTypeSupported().

fMP4

The more commonly used video encapsulation formats are WebM and fMP4.

WebM and WebP are two sister projects, both sponsored by Google. Since WebM is a container format based on Matroska, it is inherently streaming and is very suitable for use in the field of streaming media.

The following focuses on the fMP4 format.

We know that an MP4 file is composed of a series of boxes. An ordinary MP4 has a nested structure, and the client has to load the whole file from the beginning before it can be played; it cannot start playing from a segment in the middle.

fMP4, by contrast, consists of a series of fragments. If the server supports byte-range requests, these fragments can be requested independently by the client and played back without loading the entire file.

In order to illustrate this point more vividly, below I introduce several commonly used tools for analyzing MP4 files.

  • gpac, formerly known as mp4box, is a media development framework; a large number of media analysis tools are available in its source code (testapps);

  • mp4box.js, the JavaScript version of mp4box;

  • bento4, an analysis toolkit dedicated to MP4;

  • mp4parser, an online MP4 file analysis tool.

Fragmented MP4 vs. non-fragmented MP4

The following is a screenshot of a fragmented MP4 file analyzed by mp4parser (Online MPEG4 Parser) ▽

The following is a screenshot of a non-fragmented MP4 file analyzed by mp4parser ▽

We can see that a non-fragmented MP4 has very few top-level box types, while a fragmented MP4 is composed of a sequence of moof + mdat segments. Each segment already contains enough metadata and data on its own, so playback can seek directly to it and start there. In other words, fMP4 is a streaming container format that is better suited to network streaming, because it does not depend on the metadata in the file header.

Apple announced at WWDC 2016 that HLS would support fMP4 on iOS 10, tvOS, and macOS, which suggests that fMP4 has a very promising future.

It is worth mentioning that fMP4, CMAF, and ISOBMFF are actually similar things.

MSE JavaScript API

At a high level, MSE provides:

  • a set of JavaScript APIs for building media streams

  • a splicing and buffering model

  • identification of several byte stream types:

    • WebM

    • ISO Base Media File Format

    • MPEG-2 Transport Streams

MSE Internal Structure




MSE itself is designed to be independent of any particular codec or container format, but the level of support differs between browsers.

You can check support by passing a MIME type string to the static method MediaSource.isTypeSupported(). For example ▽

> MediaSource.isTypeSupported('audio/mp3'); // false
MediaSource.isTypeSupported('video/mp4'); // true
MediaSource.isTypeSupported('video/mp4; codecs="avc1.4D4028, mp4a.40.2"'); // true

You can obtain the codec MIME string with the online mp4info tool, or from the command line with mp4info test.mp4 | grep Codecs, which produces output like the following ▽

> mp4info fmp4.mp4| grep Codec
    Codecs String: mp4a.40.2
    Codecs String: avc1.42E01E

Currently, MP4 containers with H.264 + AAC are supported by all major browsers.

An ordinary MP4 file cannot be used with MSE; the MP4 must first be fragmented.

To check whether an MP4 is already fragmented ▽

> mp4dump test.mp4 | grep "\[m"

For a non-fragmented MP4, the output looks like this ▽

> mp4dump nfmp4.mp4 | grep "\[m"
[mdat] size=8+50873
[moov] size=8+7804
  [mvhd] size=12+96
    [mdia] size=8+3335
      [mdhd] size=12+20
      [minf] size=8+3250
    [mdia] size=8+3975
      [mdhd] size=12+20
      [minf] size=8+3890
            [mp4a] size=8+82
    [meta] size=12+78
If it is already fragmented, the output looks similar to this ▽
>  mp4dump fmp4.mp4 | grep "\[m" | head -n 30
[moov] size=8+1871
  [mvhd] size=12+96
    [mdia] size=8+312
      [mdhd] size=12+20
      [minf] size=8+219
            [mp4a] size=8+67
    [mdia] size=8+371
      [mdhd] size=12+20
      [minf] size=8+278
    [mdia] size=8+248
      [mdhd] size=12+20
      [minf] size=8+156
    [mdia] size=8+248
      [mdhd] size=12+20
      [minf] size=8+156
  [mvex] size=8+144
    [mehd] size=12+4
[moof] size=8+600
  [mfhd] size=12+4
[mdat] size=8+138679
[moof] size=8+536
  [mfhd] size=12+4
[mdat] size=8+24490
[moof] size=8+592
  [mfhd] size=12+4
[mdat] size=8+14444
[moof] size=8+312
  [mfhd] size=12+4
[mdat] size=8+1840
[moof] size=8+600

To convert a non-fragmented MP4 into a fragmented MP4, you can use FFmpeg's -movflags option.

If the source file is not an MP4 ▽

> ffmpeg -i trailer_1080p.mov -c:v copy -c:a copy -movflags frag_keyframe+empty_moov bunny_fragmented.mp4

If the source file is already an MP4 ▽

> ffmpeg -i non_fragmented.mp4 -movflags frag_keyframe+empty_moov fragmented.mp4

Or use mp4fragment ▽

> mp4fragment input.mp4 output.mp4

DEMO TIME

Finally, here are two demos: the MSE Vod Demo and the MSE Live Demo.

MSE Vod Demo

This demo shows a video-on-demand service built with MSE and WebSocket.

The backend reads an fMP4 file and sends it to MSE over WebSocket for playback; a simplified backend sketch follows.
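
The article's backend is written in Go (see the WebSocket Golang API section above); purely as an illustration of the same flow, here is a simplified Node.js sketch using the ws package, with the file name and port as assumptions ▽

> const fs = require('fs');
const WebSocket = require('ws'); // npm package "ws"

const wss = new WebSocket.Server({ port: 8327 });

wss.on('connection', (socket) => {
  // Stream the fragmented MP4 to the client chunk by chunk as binary frames.
  const stream = fs.createReadStream('fragmented.mp4'); // assumed file name

  stream.on('data', (chunk) => socket.send(chunk));
  stream.on('end', () => socket.close());
  socket.on('close', () => stream.destroy());
});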

MSE Live Demo

This demo shows a live streaming service built with MSE and WebSocket.

The backend proxies an HTTP-FLV live stream and sends it to MSE over WebSocket for playback.

The front-end MSE part does a lot of work, including remuxing the FLV stream into fMP4 in real time; the implementation here draws on videojs-flow. A simplified sketch of this receive-and-append flow follows.
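
In the sketch below, the WebSocket endpoint and codec string are assumptions, and the FLV-to-fMP4 remuxing step is omitted (videojs-flow handles that in the real demo) ▽

> const video = document.querySelector('video');
const mimeCodec = 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"';

const mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener('sourceopen', () => {
  const sourceBuffer = mediaSource.addSourceBuffer(mimeCodec);
  const queue = []; // segments waiting while the SourceBuffer is busy

  const appendNext = () => {
    if (!sourceBuffer.updating && queue.length > 0) {
      sourceBuffer.appendBuffer(queue.shift());
    }
  };
  sourceBuffer.addEventListener('updateend', appendNext);

  const ws = new WebSocket('ws://localhost:8327/live'); // assumed endpoint
  ws.binaryType = 'arraybuffer';
  ws.onmessage = (event) => {
    queue.push(new Uint8Array(event.data));
    appendNext();
  };
  // Playback can then be started with video.play(), which may require a
  // user gesture depending on the browser's autoplay policy.
});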

Refs

WebSocket

  • rfc6455

  • HTTP Upgrade

  • WebSocket API

  • MDN WebSocket

  • videojs-flow

MSE

  • W3C

  • MDN MSE

  • HTML5 Codec MIME

UPYUN Live Cloud is a one-stop live streaming solution, from the push (ingest) side to the playback side, built on the UPYUN content delivery network, offering ultra-low latency, high bitrate, and high concurrency. It includes real-time transcoding, real-time recording, distribution acceleration, watermarking, screenshots, second-level stream blocking, delayed broadcast, and other features. The live origin can be a self-hosted origin or a UPYUN origin, and to support playback on different terminals, RTMP, HLS, and HTTP-FLV output are supported.


