WebSocket+MSE——HTML5 live broadcast technology analysis

Jun 23, 2017 pm 02:45 PM
Author | Liu Bo (multimedia development engineer at UPYUN)

To meet the strong demand for live streaming on the mobile Web, a series of HTML5 live streaming technologies has developed rapidly.

Common live streaming technologies that can be used with HTML5 include HLS, WebSocket, and WebRTC. Today I will introduce the technical points related to WebSocket and MSE, and finally demonstrate their use through examples.

Article outline

  • WebSocket protocol introduction

  • WebSocket Client/Server API introduction

  • MSE Introduction

  • fMP4 Introduction

  • Demo Display

WebSocket

Traditional web applications are built around the HTTP request/response model, in which all communication is driven by the client: the client sends a request, the server processes it and returns the result, and the client displays the data. Because this model cannot meet the needs of real-time applications, "server push" long-connection technologies such as SSE and Comet emerged.

WebSocket is a communication protocol layered on TCP that provides full-duplex communication over a single TCP connection. The IETF standardized it as RFC 6455 in 2011, later updated by RFC 7936, and the WebSocket API was standardized by the W3C.

WebSocket is an independent protocol built on TCP, and the concepts of HTTP do not carry over to it. The only connection is the handshake: it is performed as an HTTP Upgrade request answered with status code 101, and WebSocket uses the same default TCP ports as HTTP (80 for ws://, 443 for wss://), which lets it pass through most firewalls.

WebSocket Handshake

To make new protocols easier to deploy, HTTP/1.1 introduced the Upgrade mechanism, which lets the client and server use existing HTTP syntax to switch to another protocol. The mechanism is described in detail in RFC 7230, section 6.7 (Upgrade).

To initiate an HTTP/1.1 protocol upgrade, the client must specify these two fields in the request header ▽

> Connection: Upgrade
Upgrade: protocol-name[/protocol-version]

If the server agrees to the upgrade, it responds like this ▽

> HTTP/1.1 101 Switching Protocols
Connection: upgrade
Upgrade: protocol-name[/protocol-version]
[... data defined by new protocol ...]

As you can see, the status code of an HTTP Upgrade response is 101, and everything after the response headers can use the data format defined by the new protocol.

The WebSocket handshake takes advantage of this HTTP Upgrade mechanism. Once the handshake is complete, all further data is exchanged as WebSocket frames over the same TCP connection, with no more HTTP involved.
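
The response check described above can be sketched in a few lines. This is only an illustration under simplifying assumptions, not a full HTTP parser: `isUpgradeAccepted` is a hypothetical helper name, and the sample response is the generic one shown earlier with `websocket` filled in as the protocol name.

```javascript
// Hypothetical helper: decide whether a raw HTTP response head accepts an
// Upgrade to `protocol`. Illustrative only; not a complete HTTP parser.
function isUpgradeAccepted(responseHead, protocol) {
  const lines = responseHead.split('\r\n');
  // The status line looks like "HTTP/1.1 101 Switching Protocols".
  const status = Number(lines[0].split(' ')[1]);
  if (status !== 101) return false;

  // Collect headers; field names are case-insensitive.
  const headers = {};
  for (const line of lines.slice(1)) {
    if (line === '') break; // a blank line ends the header block
    const idx = line.indexOf(':');
    if (idx === -1) continue;
    headers[line.slice(0, idx).trim().toLowerCase()] = line.slice(idx + 1).trim();
  }

  // The Connection header may carry several comma-separated tokens.
  const connection = (headers['connection'] || '')
    .split(',')
    .map((token) => token.trim().toLowerCase());
  const upgrade = (headers['upgrade'] || '').toLowerCase();
  return connection.includes('upgrade') && upgrade === protocol.toLowerCase();
}

const accepted = isUpgradeAccepted(
  'HTTP/1.1 101 Switching Protocols\r\nConnection: upgrade\r\nUpgrade: websocket\r\n\r\n',
  'websocket'
);
console.log(accepted); // true
```

A real WebSocket handshake carries additional headers defined in RFC 6455; browsers and server libraries handle those for you.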

WebSocket JavaScript API

Mainstream browsers now provide the WebSocket API, which can send text or binary messages to the server and receive responses through event-driven callbacks.

Step1. Check whether the browser supports WebSocket

> if(window.WebSocket) {
    // WebSocket code
}

Step2. Establish a connection

> var ws = new WebSocket('ws://localhost:8327');

Step3. Register callback functions and send and receive data

Register the onopen, onclose, onerror and onmessage callback functions of the WebSocket object respectively.

Send data with ws.send(). Besides strings, it can also send Blob or ArrayBuffer data.

To receive binary data, set the binaryType property of the connection object to 'blob' or 'arraybuffer'.

ws.binaryType = 'arraybuffer';
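
Steps 1 to 3 can be put together as in the sketch below. The function name `connect`, the handler names, and the injectable constructor parameter are assumptions made for illustration; only the `WebSocket` properties and callbacks themselves come from the browser API.

```javascript
// Sketch of steps 1-3. `connect` and the handler names are hypothetical.
// The constructor is injectable so the function can be exercised outside
// a browser.
function connect(url, handlers, WebSocketImpl = globalThis.WebSocket) {
  // Step 1: check for WebSocket support.
  if (!WebSocketImpl) {
    throw new Error('WebSocket is not supported in this environment');
  }
  // Step 2: establish the connection.
  const ws = new WebSocketImpl(url);
  ws.binaryType = 'arraybuffer'; // receive binary messages as ArrayBuffer

  // Step 3: register the callbacks.
  ws.onopen = () => handlers.onOpen && handlers.onOpen(ws);
  ws.onmessage = (event) => {
    // event.data is a string for text messages, an ArrayBuffer for binary.
    if (typeof event.data === 'string') {
      handlers.onText && handlers.onText(event.data);
    } else {
      handlers.onBinary && handlers.onBinary(new Uint8Array(event.data));
    }
  };
  ws.onerror = (event) => handlers.onError && handlers.onError(event);
  ws.onclose = (event) => handlers.onClose && handlers.onClose(event);
  return ws;
}

// In a browser:
//   connect('ws://localhost:8327', {
//     onOpen: (ws) => ws.send('hello'),
//     onBinary: (bytes) => console.log('received', bytes.length, 'bytes'),
//   });
```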

WebSocket Golang API

For the server-side WebSocket library, I recommend Google's own WebSocket package, which is very convenient to use with net/http. You can convert a WebSocket handler function into an http.Handler through websocket.Handler, so that it works with the net/http library.

Then receive data through websocket.Message.Receive and send data through websocket.Message.Send.

The specific code can be found in the Demo section below.

MSE

Before introducing MSE, let's first look at the limitations of the HTML5 <audio> and <video> tags.

Limitations of HTML5 <audio> and <video>

  • Streaming is not supported

  • DRM and encryption are not supported

  • Controls are hard to customize while keeping cross-browser consistency

  • Codec and container support differs across browsers

MSE solves the streaming problem of the HTML5 media tags.

Media Source Extensions (MSE) is a new Web API supported by mainstream browsers such as Chrome, Safari, and Edge. MSE is a W3C standard that allows JavaScript to dynamically construct media streams for <audio> and <video>.

By using MSE, you can dynamically modify media streams without the need for any plug-ins. This allows front-end JavaScript to do more - repackaging, processing, and even transcoding in JavaScript.

Although MSE cannot transmit streams directly to media tags, MSE provides the core technology for building cross-browser players, allowing the browser to push audio and video to media tags through the JavaScript API.

Browser Support

Use caniuse to check whether the browser supports it.

You can further check whether the codec MIME type is supported through MediaSource.isTypeSupported().

fMP4

The most commonly used video container formats are WebM and fMP4.

WebM and WebP are sister projects, both sponsored by Google. Because WebM is a container format based on Matroska, it is inherently streamable and well suited to streaming media.

The following focuses on the fMP4 format.

As we know, an MP4 file is made up of a series of boxes, which can be nested. With an ordinary MP4, the client must load the file from the beginning before it can be played completely; it cannot start playing from a middle section.

fMP4, by contrast, consists of a series of fragments. If the server supports byte-range requests, the client can request these fragments independently and play them without loading the entire file.

To illustrate this more concretely, here are several commonly used tools for analyzing MP4 files:

  • gpac, the media development framework behind MP4Box; its source tree includes a large number of media analysis tools under testapps;

  • mp4box.js, the JavaScript version of mp4box;

  • bento4, an analysis toolkit specifically for MP4;

  • mp4parser, an online MP4 file analysis tool.

Fragmented MP4 vs. non-fragmented MP4

[Screenshot: a fragmented MP4 file analyzed with mp4parser (Online MPEG4 Parser)]

[Screenshot: a non-fragmented MP4 file analyzed with mp4parser]

We can see that a non-fragmented MP4 has very few top-level box types, while a fragmented MP4 is composed of a sequence of moof+mdat pairs. Each pair already contains enough metadata and media data that a player can seek directly to it and start playing. In other words, fMP4 is a streaming container format, better suited to network streaming because playback does not depend on a complete index in the file header.
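
The difference in top-level boxes can also be checked in code. The sketch below walks the top-level boxes of a buffer, relying on the ISO BMFF box header layout (a 32-bit big-endian size followed by a four-character type). The helper names `listTopLevelBoxes`, `isFragmented`, and `emptyBox` are hypothetical, and the sample data is synthetic: header-only boxes rather than a real file.

```javascript
// Minimal top-level box walker for ISO BMFF data held in a Uint8Array.
function listTopLevelBoxes(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const boxes = [];
  let offset = 0;
  while (offset + 8 <= bytes.byteLength) {
    let size = view.getUint32(offset); // 32-bit big-endian box size
    const type = String.fromCharCode(
      bytes[offset + 4], bytes[offset + 5], bytes[offset + 6], bytes[offset + 7]
    );
    if (size === 1) {
      // A size of 1 means a 64-bit "largesize" follows the type field.
      if (offset + 16 > bytes.byteLength) break;
      size = Number(view.getBigUint64(offset + 8));
    } else if (size === 0) {
      // A size of 0 means the box extends to the end of the data.
      size = bytes.byteLength - offset;
    }
    if (size < 8) break; // malformed box; stop rather than loop forever
    boxes.push({ type, offset, size });
    offset += size;
  }
  return boxes;
}

// Treat the presence of a top-level moof box as the sign of fragmentation.
function isFragmented(bytes) {
  return listTopLevelBoxes(bytes).some((box) => box.type === 'moof');
}

// Build a header-only box for the synthetic demonstration below.
function emptyBox(type) {
  const box = new Uint8Array(8);
  new DataView(box.buffer).setUint32(0, 8);
  for (let i = 0; i < 4; i++) box[4 + i] = type.charCodeAt(i);
  return box;
}

const sample = new Uint8Array(
  ['ftyp', 'moov', 'moof', 'mdat'].flatMap((t) => Array.from(emptyBox(t)))
);
console.log(listTopLevelBoxes(sample).map((b) => b.type).join(' ')); // ftyp moov moof mdat
console.log(isFragmented(sample)); // true
```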

Apple announced at WWDC 2016 that HLS would support fMP4 on iOS 10, tvOS, and macOS, so the prospects for fMP4 look very good.

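It is worth mentioning that fMP4, CMAF, and ISOBMFF are closely related: ISOBMFF is the base file format that MP4 builds on, fMP4 is its fragmented form, and CMAF is a standardized format built on fMP4.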

MSE JavaScript API

At a high level, MSE provides:

  • A set of JavaScript APIs to build media streams

  • A splicing and caching model

  • Identification of some byte stream types:

      • WebM

      • ISO Base Media File Format

      • MPEG-2 Transport Streams
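
One practical consequence of the buffering model is that a SourceBuffer processes one append at a time: calling appendBuffer() while its updating flag is true throws an error. A common way to handle this is a small queue, sketched below. The class name `AppendQueue` and the variables in the commented usage (`video`, `initSegment`) are hypothetical; this is a minimal illustration rather than a complete player.

```javascript
// Hypothetical helper: serialise appendBuffer() calls on an MSE SourceBuffer.
// Segments that arrive while an append is in progress wait in a queue until
// the 'updateend' event fires.
class AppendQueue {
  constructor(sourceBuffer) {
    this.sourceBuffer = sourceBuffer;
    this.pending = [];
    sourceBuffer.addEventListener('updateend', () => this.flush());
  }

  // Queue a segment and try to append it right away.
  push(segment) {
    this.pending.push(segment);
    this.flush();
  }

  // Append the next segment unless the SourceBuffer is still busy.
  flush() {
    if (this.sourceBuffer.updating || this.pending.length === 0) return;
    this.sourceBuffer.appendBuffer(this.pending.shift());
  }
}

// In a browser, the queue would wrap a real SourceBuffer, for example:
//   const mediaSource = new MediaSource();
//   video.src = URL.createObjectURL(mediaSource);
//   mediaSource.addEventListener('sourceopen', () => {
//     const mime = 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"';
//     const queue = new AppendQueue(mediaSource.addSourceBuffer(mime));
//     queue.push(initSegment); // then each moof+mdat fragment in order
//   });
```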

MSE Internal Structure




MSE itself is designed to be independent of any particular codec or container format, but the level of support differs between browsers.

You can check support by passing a MIME type string to the static method MediaSource.isTypeSupported. For example ▽

> MediaSource.isTypeSupported('audio/mp3'); // false
MediaSource.isTypeSupported('video/mp4'); // true
MediaSource.isTypeSupported('video/mp4; codecs="avc1.4D4028, mp4a.40.2"'); // true

To get the codec MIME string, you can use the online mp4info tool, or run mp4info test.mp4 | grep Codecs on the command line, which gives a result like the following ▽

> mp4info fmp4.mp4| grep Codec
    Codecs String: mp4a.40.2
    Codecs String: avc1.42E01E
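
The codec strings reported by mp4info can be combined into the MIME type that MediaSource.isTypeSupported expects. The helper names `buildMimeType` and `canPlay` below are hypothetical; the sketch assumes the video/mp4 container and returns false when MSE is not available at all.

```javascript
// Hypothetical helpers: join codec strings (such as those reported by
// mp4info) into a MIME type and ask MSE whether it is supported.
function buildMimeType(container, codecs) {
  return `${container}; codecs="${codecs.join(', ')}"`;
}

function canPlay(mime, MediaSourceImpl = globalThis.MediaSource) {
  // Outside a browser, or in a browser without MSE, there is nothing to ask.
  if (!MediaSourceImpl) return false;
  return MediaSourceImpl.isTypeSupported(mime);
}

const mime = buildMimeType('video/mp4', ['avc1.42E01E', 'mp4a.40.2']);
console.log(mime); // video/mp4; codecs="avc1.42E01E, mp4a.40.2"
console.log(canPlay(mime)); // false without MSE; otherwise depends on the browser
```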

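Currently, the MP4 container with H.264 + AAC is supported in all browsers.

An ordinary MP4 file cannot be used with MSE; the MP4 must first be fragmented.

To check whether an MP4 is already fragmented ▽

> mp4dump test.mp4 | grep "\[m"

A non-fragmented file shows output like this ▽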

> mp4dump nfmp4.mp4 | grep "\[m"
[mdat] size=8+50873
[moov] size=8+7804
  [mvhd] size=12+96
    [mdia] size=8+3335
      [mdhd] size=12+20
      [minf] size=8+3250
    [mdia] size=8+3975
      [mdhd] size=12+20
      [minf] size=8+3890
            [mp4a] size=8+82
    [meta] size=12+78

If the file is already fragmented, it shows output similar to this ▽

> mp4dump fmp4.mp4 | grep "\[m" | head -n 30
[moov] size=8+1871
  [mvhd] size=12+96
    [mdia] size=8+312
      [mdhd] size=12+20
      [minf] size=8+219
            [mp4a] size=8+67
    [mdia] size=8+371
      [mdhd] size=12+20
      [minf] size=8+278
    [mdia] size=8+248
      [mdhd] size=12+20
      [minf] size=8+156
    [mdia] size=8+248
      [mdhd] size=12+20
      [minf] size=8+156
  [mvex] size=8+144
    [mehd] size=12+4
[moof] size=8+600
  [mfhd] size=12+4
[mdat] size=8+138679
[moof] size=8+536
  [mfhd] size=12+4
[mdat] size=8+24490
[moof] size=8+592
  [mfhd] size=12+4
[mdat] size=8+14444
[moof] size=8+312
  [mfhd] size=12+4
[mdat] size=8+1840
[moof] size=8+600

Converting a non-fragmented MP4 into a fragmented MP4.

You can use FFmpeg's -movflags option for the conversion.

When the original file is not an MP4 ▽

> ffmpeg -i trailer_1080p.mov -c:v copy -c:a copy -movflags frag_keyframe+empty_moov bunny_fragmented.mp4

When the original file is already an MP4 ▽

> ffmpeg -i non_fragmented.mp4 -movflags frag_keyframe+empty_moov fragmented.mp4

Or use mp4fragment ▽

> mp4fragment input.mp4 output.mp4

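DEMO TIME

Finally, here are two demos: the MSE Vod Demo and the MSE Live Demo.

MSE Vod Demo

This demo implements a video-on-demand service using MSE and WebSocket.

The backend reads an fMP4 file and sends it over WebSocket to MSE for playback.

MSE Live Demo

This demo implements a live streaming service using MSE and WebSocket.

The backend proxies an HTTP-FLV live stream and sends it over WebSocket to MSE for playback.

The front-end MSE part does a lot of work, including remuxing FLV into fMP4 in real time; this uses the implementation from videojs-flow.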
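
The front-end side of such a pipeline can be sketched roughly as follows. This is not the demos' actual code: `playOverWebSocket` and its `env` parameter are hypothetical, and the sketch assumes every binary WebSocket message already contains fMP4 data (an initialization segment first, then moof+mdat fragments), so it does not cover the FLV remuxing step.

```javascript
// Hypothetical sketch: feed binary WebSocket messages (fMP4 data) into an
// MSE SourceBuffer in arrival order. `env` stands in for the browser
// globals (MediaSource, WebSocket, URL) so they can be swapped out.
function playOverWebSocket(video, url, mime, env = globalThis) {
  const mediaSource = new env.MediaSource();
  const pending = []; // segments waiting for the SourceBuffer
  let sourceBuffer = null;

  // Append the next queued segment unless an append is still in progress.
  const flush = () => {
    if (!sourceBuffer || sourceBuffer.updating || pending.length === 0) return;
    sourceBuffer.appendBuffer(pending.shift());
  };

  video.src = env.URL.createObjectURL(mediaSource);
  mediaSource.addEventListener('sourceopen', () => {
    sourceBuffer = mediaSource.addSourceBuffer(mime);
    sourceBuffer.addEventListener('updateend', flush);

    // Open the socket only once the SourceBuffer exists.
    const ws = new env.WebSocket(url);
    ws.binaryType = 'arraybuffer';
    ws.onmessage = (event) => {
      pending.push(event.data);
      flush();
    };
  });
  return mediaSource;
}

// In a browser (the URL and MIME string are placeholders):
//   playOverWebSocket(document.querySelector('video'), 'ws://localhost:8327',
//     'video/mp4; codecs="avc1.42E01E, mp4a.40.2"');
```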

Refs

WebSocket

  • rfc6455

  • HTTP Upgrade

  • WebSocket API

  • MDN WebSocket

  • videojs-flow

MSE

  • W3C

  • MDN MSE

  • HTML5 Codec MIME
