Vite Learning: A Deep Analysis of Dependency Scanning
This article will give you an in-depth explanation of the implementation details of dependency scanning in Vite. The final scanning result is an object containing the names of multiple modules. It does not involve the pre-building process or how the pre-built products are used.
When we run Vite for the first time, it performs dependency pre-building in order to be compatible with CommonJS and UMD modules and to improve performance.
To pre-build dependencies, two questions must be answered first:
What is pre-built? In other words, which modules need to be pre-built?
How are the modules that need to be pre-built found?
These two questions are really about what dependency scanning looks for and how it is implemented.
If you are interested in the pre-building process itself or in how the pre-built output is used, you can follow me and wait for subsequent articles.
Only bare imports (bare dependencies) are pre-built
What is a bare import? Look at the example below:
    // vue is a bare import
    import xxx from "vue"
    import xxx from "vue/xxx"
    // The following are not bare dependencies
    import xxx from "./foo.ts"
    import xxx from "/foo.ts"
A simple way to tell them apart: a module accessed by name is a bare import, while a module accessed by path is not.
Take a typical Vue project as an example. The result of its dependency scan is as follows:
    [ "vue", "axios" ]
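As a rough illustration of how an import specifier could be classified, here is a minimal sketch. It reuses the filter regular expression that appears in the scan plugin code later in this article; the helper name `isBareImport` and the sample specifiers are assumptions made for this example and are not part of Vite's API.

```typescript
// Same pattern as the scan plugin's filter shown later in this article:
// the first character is a word character or "@", and the second character
// is not ":" (so Windows paths such as "D:/foo.ts" are not matched).
const bareImportRE = /^[\w@][^:]/;

// Hypothetical helper, for illustration only.
function isBareImport(specifier: string): boolean {
  return bareImportRE.test(specifier);
}

const specifiers = [
  "vue",
  "vue/xxx",
  "@vitejs/plugin-vue",
  "./foo.ts",
  "/foo.ts",
  "D:/foo.ts",
];

// Keeps only "vue", "vue/xxx" and "@vitejs/plugin-vue".
const bareSpecifiers = specifiers.filter(isBareImport);
```

Note that this check only looks at how the specifier is written. Whether a bare import is actually pre-built also depends on where it resolves to, which is discussed below.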
Why are only bare imports pre-built?
Node.js defines how a bare import is resolved: it searches node_modules in the current directory and, if the module is not found there, moves on to node_modules in the parent directory, repeating this until it reaches the root path and cannot go any higher.
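The lookup order can be sketched with a small helper that lists the node_modules directories to try, from the current directory up to the root. This is a simplified sketch: the function name `nodeModulesCandidates` is hypothetical, it only handles POSIX-style absolute paths, and it leaves out the rest of the Node.js resolution algorithm (package.json fields, file extensions, global folders).

```typescript
// Hypothetical helper: lists the node_modules directories that would be
// searched for a bare import, starting from a POSIX-style absolute directory.
function nodeModulesCandidates(fromDir: string): string[] {
  const parts = fromDir.split("/").filter((p) => p.length > 0);
  const candidates: string[] = [];
  for (let i = parts.length; i >= 0; i--) {
    // Node.js does not append node_modules to a directory that is itself
    // named node_modules.
    if (i > 0 && parts[i - 1] === "node_modules") continue;
    const dir = "/" + parts.slice(0, i).join("/");
    candidates.push(dir === "/" ? "/node_modules" : dir + "/node_modules");
  }
  return candidates;
}

// For "/home/me/app/src" the search order is:
//   /home/me/app/src/node_modules
//   /home/me/app/node_modules
//   /home/me/node_modules
//   /home/node_modules
//   /node_modules
const searchOrder = nodeModulesCandidates("/home/me/app/src");
```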
A bare import is usually a module installed through npm. It is a third-party module rather than code we wrote ourselves, and it is generally not modified. Building these modules ahead of time therefore helps performance.
By contrast, if the developer's own code were pre-built and bundled into a chunk file, every code change would require rebuilding and re-bundling that chunk, which would actually hurt performance.
Will modules in a monorepo also be pre-built? No. In a monorepo some modules are imported as bare imports, but they are written by the developers themselves and are not third-party modules, so Vite does not pre-build them.
In practice, Vite checks whether the real path of the module is inside node_modules: a module whose real path is inside node_modules is a third-party module and is pre-built, while a module whose real path is outside node_modules is the developer's own code and is not pre-built.
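A minimal sketch of that path check follows. The helper name `shouldPrebundle` and the sample paths are assumptions for illustration. The scan plugin code shown later in this article uses a plain `resolved.includes('node_modules')` test; the segment-based comparison here is a slightly stricter variant chosen for the example.

```typescript
// Hypothetical helper: a dependency counts as third-party only when its
// resolved real path contains a node_modules path segment.
function shouldPrebundle(realPath: string): boolean {
  // Normalize Windows separators so both path styles are handled.
  const normalized = realPath.replace(/\\/g, "/");
  return normalized.split("/").indexOf("node_modules") !== -1;
}

// Installed from npm: the real path is inside node_modules.
const fromNpm = shouldPrebundle(
  "/repo/node_modules/.pnpm/vue@3.2.37/node_modules/vue/dist/vue.runtime.esm-bundler.js"
); // true

// Linked monorepo package: the real path points back into the repository.
const fromWorkspace = shouldPrebundle("/repo/packages/shared/src/index.ts"); // false
```

The check only works on the real path. A linked package is reachable through node_modules, so the symlink has to be resolved first (in Node.js, for example, with `fs.realpathSync`) before the path is inspected.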
Implementation ideas
Let's take another look at the module dependency tree. To find all bare imports, the entire dependency tree has to be traversed, which means a depth-first traversal of the tree.
When discussing tree traversal, we generally focus on two points: when to stop going deeper, and how to handle leaf nodes. Once all leaf nodes have been traversed, the recorded bare import object is the result of the dependency scan.
The idea behind dependency scanning is easy to understand, but the actual processing is not simple. Let's look at how each kind of leaf node is handled:
Bare import: it can be identified by the module id. A module whose id is not a path is a bare import. When such a module is encountered, record the dependency and do not traverse any deeper.
Other non-JS modules: they can be identified by the file extension, for example *.css. These modules need no processing and no further traversal.
JS modules: to obtain the submodules that the JS code depends on, convert the code into an AST and collect the modules introduced by import statements, or match all imported modules with a regular expression, and then continue traversing those modules in depth.
HTML-type modules: these are more complicated. For example, HTML or Vue files contain some JS. That JS code has to be extracted and then analyzed like a JS module, and the modules it imports are traversed in depth. Only the JS part matters here, because the other parts do not import modules.
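The traversal described above can be sketched on a toy, in-memory project. Everything in this sketch is an assumption made for illustration: the `files` map stands in for the file system, imports are found with a deliberately simple regular expression rather than an AST, and HTML-type modules are left out. It is not how Vite implements the scan, as the next section explains.

```typescript
// Toy in-memory "file system": module id -> source code (hypothetical data).
const files: Record<string, string> = {
  "/src/main.ts":
    'import { createApp } from "vue"\nimport "./style.css"\nimport { get } from "./api.ts"',
  "/src/api.ts": 'import axios from "axios"\nexport const get = axios.get',
  "/src/style.css": "body { margin: 0 }",
};

// Resolve a relative or absolute specifier against the importing file.
function resolvePath(importer: string, spec: string): string {
  if (spec.charAt(0) === "/") return spec;
  const stack = importer.split("/").slice(0, -1); // directory of the importer
  for (const part of spec.split("/")) {
    if (part === "..") stack.pop();
    else if (part !== ".") stack.push(part);
  }
  return stack.join("/");
}

// Depth-first scan that collects bare imports reachable from the entry.
function scanDeps(entry: string, sources: Record<string, string>): string[] {
  const deps: string[] = [];
  const visited: Record<string, boolean> = {};

  function visit(id: string): void {
    if (visited[id]) return;
    visited[id] = true;
    // Leaf node: non-JS modules are not traversed any further.
    if (!/\.(js|ts|jsx|tsx|mjs)$/.test(id)) return;
    const code = sources[id];
    if (code === undefined) return;

    // Simplified matcher for static import statements.
    const importRE = /import\s+(?:[\w*\s{},]+\s+from\s+)?["']([^"']+)["']/g;
    let m: RegExpExecArray | null;
    while ((m = importRE.exec(code)) !== null) {
      const spec = m[1] ?? "";
      if (/^[\w@][^:]/.test(spec)) {
        // Leaf node: record the bare import and stop descending.
        if (deps.indexOf(spec) === -1) deps.push(spec);
      } else {
        // JS module imported by path: keep traversing in depth.
        visit(resolvePath(id, spec));
      }
    }
  }

  visit(entry);
  return deps;
}

const scanned = scanDeps("/src/main.ts", files); // ["vue", "axios"]
```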
Specific implementation
We already know the idea behind dependency scanning. The idea itself is not complicated. What is complicated is the processing, especially for HTML, Vue and similar modules.
Vite takes a cleverer approach here: it bundles the code with esbuild.
Why can esbuild bundling replace the depth-first traversal?
Essentially, bundling is also a depth-first traversal of modules. The two correspond as follows:
Depth-first traversal | esbuild bundling |
---|---|
Handling leaf nodes | esbuild processes each module (leaf node) in two stages, resolve and load. Both stages can be extended through plugins to add special logic, for example converting HTML into JS during load. |
Not descending into a module | During resolve, the current module can be marked as external, and esbuild will then not resolve or load it any further. |
Descending into a module | Resolve the module normally (do nothing, which is esbuild's default behavior) and return the real path of the module's file. |
It doesn't matter if this part is not clear yet. There are examples later.
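To make the mapping concrete, here is a minimal sketch of a scan plugin in the shape of an esbuild plugin. The types at the top are local stand-ins for the small subset of esbuild's plugin API used here (`setup` and `onResolve` with a `filter`), declared so that the sketch runs without esbuild installed. The names `scanPlugin` and `runResolve` are hypothetical, and `runResolve` is only a tiny mock driver for trying the callbacks, not something esbuild provides.

```typescript
// Local stand-ins for the subset of esbuild's plugin API used below.
interface ScanResolveArgs { path: string; importer: string }
interface ScanResolveResult { path?: string; external?: boolean }
type ScanResolveCallback = (args: ScanResolveArgs) => ScanResolveResult | undefined;
interface ScanBuild {
  onResolve(options: { filter: RegExp }, callback: ScanResolveCallback): void;
}
interface ScanPlugin { name: string; setup(build: ScanBuild): void }

// Hypothetical scan plugin: records bare imports and stops descending into them.
function scanPlugin(deps: Record<string, true>): ScanPlugin {
  return {
    name: "toy-dep-scan",
    setup(build) {
      // Non-JS files: mark as external so they are not loaded.
      build.onResolve({ filter: /\.(css|less|json)$/ }, ({ path }) => ({
        path,
        external: true,
      }));
      // Bare imports: record the dependency, then mark it as external.
      build.onResolve({ filter: /^[\w@][^:]/ }, ({ path }) => {
        deps[path] = true;
        return { path, external: true };
      });
      // Anything else: no callback matches, so the bundler resolves and
      // loads the module normally, which continues the traversal.
    },
  };
}

// Mock driver: runs the registered callbacks in order, like a bundler would,
// and returns the first result.
function runResolve(plugin: ScanPlugin, path: string): ScanResolveResult | undefined {
  const handlers: { filter: RegExp; callback: ScanResolveCallback }[] = [];
  plugin.setup({
    onResolve: (options, callback) => {
      handlers.push({ filter: options.filter, callback });
    },
  });
  for (const h of handlers) {
    if (h.filter.test(path)) {
      const result = h.callback({ path, importer: "" });
      if (result) return result;
    }
  }
  return undefined;
}
```

With real esbuild, the object returned by `scanPlugin` would be passed in the `plugins` option of a build, and esbuild itself would call the callbacks while it walks the module graph.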
Processing of various modules
Type | Example | Handling |
---|---|---|
Bare import | vue | During resolve, save the bare dependency to the deps object and mark it as external |
Other non-JS modules | less files | During resolve, mark as external |
JS modules | ./main.ts | Resolve and load normally; esbuild can handle JS by itself |
HTML-type modules | index.html, app.vue | During load, convert these modules into JS |
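esbuild can handle JS syntax by itself, so JS modules need no special processing. esbuild analyzes the dependencies in a JS file and goes on to process those dependencies in depth.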
    // external urls
    build.onResolve({ filter: /^(https?:)?\/\// }, ({ path }) => ({
      path,
      external: true
    }))

    // external css and similar files
    build.onResolve(
      {
        filter: /\.(css|less|sass|scss|styl|stylus|pcss|postcss|json|wasm)$/
      },
      ({ path }) => ({ path, external: true })
    )

    // other non-JS modules omitted
This part is very simple: match the module directly, then mark it as external.
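    build.onResolve(
      {
        // The first character is a letter or @, and the second character is not a colon,
        // e.g. vite, @vite/plugin-vue
        // Purpose: avoid matching Windows paths such as D:/xxx
        filter: /^[\w@][^:]/
      },
      async ({ path: id, importer, pluginData }) => {
        // If this id has already been recorded in depImports, mark it as external directly
        if (depImports[id]) {
          return externalUnlessEntry({ path: id })
        }
        // Convert the module id into a real path; this actually calls container.resolveId
        const resolved = await resolve(id, importer, {
          custom: {
            depScan: { loader: pluginData?.htmlType?.loader }
          }
        })
        // If a path is resolved, the dependency was found.
        // If not, the dependency is missing and is recorded so an error can be reported later.
        if (resolved) {
          if (shouldExternalizeDep(resolved, id)) {
            return externalUnlessEntry({ path: id })
          }
          // If the module is in node_modules, record the bare import
          if (resolved.includes('node_modules')) {
            depImports[id] = resolved
            return { path: id, external: true }
          }
          // isScannable checks whether the file can be scanned (JS, html, vue and so on),
          // because the entry of a bare dependency may be a non-JS file such as css
          else if (isScannable(resolved)) {
            // The real path is not in node_modules, so this is a monorepo and the code
            // actually lives in the user's directory. It is the user's own code and
            // should not be external.
            return { path: path.resolve(resolved) }
          } else {
            // Other modules cannot be scanned: ignore them and mark as external
            return { path: id, external: true }
          }
        } else {
          // The dependency could not be resolved: record it as missing
          missing[id] = normalizePath(importer)
        }
      }
    )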
HTML-type modules, for example index.html and app.vue:
    const htmlTypesRE = /\.(html|vue|svelte|astro)$/

    // html types: extract script tags
    build.onResolve({ filter: htmlTypesRE }, async ({ path, importer }) => {
      // Convert the module path into the real path of the file
      const resolved = await resolve(path, importer)
      if (!resolved) return
      // Files inside node_modules are not processed
      if (resolved.includes('node_modules')) {
        return
      }
      return {
        path: resolved,
        // Mark the namespace as html
        namespace: 'html'
      }
    })
The resolve step is simple: it only filters out modules that do not need processing and marks the namespace as html.
The real work happens in the load stage:
    // Matches, for example: <script type=module></script>
    const scriptModuleRE =
      /(<script\b[^>]*type\s*=\s*(?:"module"|'module')[^>]*>)(.*?)<\/script>/gims
    // Matches, for example: <script></script>
    export const scriptRE = /(<script\b(?:\s[^>]*>|>))(.*?)<\/script>/gims

    build.onLoad(
      { filter: htmlTypesRE, namespace: 'html' },
      async ({ path }) => {
        // Read the source
        let raw = fs.readFileSync(path, 'utf-8')
        // Strip comments so that they are not matched later
        raw = raw.replace(commentRE, '<!---->')
        const isHtml = path.endsWith('.html')
        // scriptModuleRE: <script type=module></script>
        // scriptRE: <script></script>
        // For html files only module scripts are matched, because only they can use import
        const regex = isHtml ? scriptModuleRE : scriptRE
        // Reset the regex index. The same RegExp object is reused, and lastIndex changes
        // after every match, so it is reset to 0 to start matching from the first character.
        regex.lastIndex = 0
        // Return value of the load hook: the JS code after loading
        let js = ''
        let scriptId = 0
        let match: RegExpExecArray | null
        // Match the script tags in the source; a while loop is used because
        // an html file may contain several script tags
        while ((match = regex.exec(raw))) {
          // openTag: for example <script>
          // content: the content of the script tag
          const [, openTag, content] = match
          // Extract the type and lang attributes from openTag
          const typeMatch = openTag.match(typeRE)
          const type =
            typeMatch && (typeMatch[1] || typeMatch[2] || typeMatch[3])
          const langMatch = openTag.match(langRE)
          const lang =
            langMatch && (langMatch[1] || langMatch[2] || langMatch[3])
          // Skip type="application/ld+json" and other non-JS types
          if (
            type &&
            !(
              type.includes('javascript') ||
              type.includes('ecmascript') ||
              type === 'module'
            )
          ) {
            continue
          }
          // The esbuild load hook can set the corresponding loader
          let loader: Loader = 'js'
          if (lang === 'ts' || lang === 'tsx' || lang === 'jsx') {
            loader = lang
          } else if (path.endsWith('.astro')) {
            loader = 'ts'
          }
          // Extract the src attribute of the script tag
          const srcMatch = openTag.match(srcRE)
          // A src attribute means this is an external script
          if (srcMatch) {
            const src = srcMatch[1] || srcMatch[2] || srcMatch[3]
            // External script: bring it in with an import statement instead
            js += `import ${JSON.stringify(src)}\n`
          } else if (content.trim()) {
            // Inline script: its content is turned into a virtual module.
            // Cache the content of the virtual module. An html file may contain
            // several scripts, so scriptId is used to tell them apart.
            const key = `${path}?id=${scriptId++}`
            scripts[key] = {
              loader,
              contents: content,
              pluginData: {
                htmlType: { loader }
              }
            }
            // Path of the virtual module, e.g. virtual-module:D:/project/index.html?id=0
            const virtualModulePath = virtualModulePrefix + key
            js += `export * from ${JSON.stringify(virtualModulePath)}\n`
          }
        }
        return {
          loader: 'js',
          contents: js
        }
      }
    )
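The load stage mainly does the following: it reads the file source, matches every script tag with a regular expression, rewrites each external script as an import statement, turns each inline script into a virtual module whose content is cached in the scripts object, and finally returns the generated JS.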
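The load-stage flow can be sketched as a small pure function. This is a simplified sketch under stated assumptions, not Vite's code: the function name `htmlToJs` is hypothetical, comments are not stripped, the `type` and `lang` attributes are ignored, and only double-quoted `src` attributes are recognized.

```typescript
interface ExtractedScripts {
  js: string; // JS code returned from the load step
  scripts: Record<string, string>; // virtual module key -> inline script content
}

// Hypothetical, simplified version of the load step.
function htmlToJs(filePath: string, html: string): ExtractedScripts {
  const scriptTagRE = /<script\b([^>]*)>([\s\S]*?)<\/script>/gi;
  const srcAttrRE = /\bsrc\s*=\s*"([^"]+)"/i;
  const scripts: Record<string, string> = {};
  let js = "";
  let scriptId = 0;
  let match: RegExpExecArray | null;

  while ((match = scriptTagRE.exec(html)) !== null) {
    const attrs = match[1] ?? "";
    const content = match[2] ?? "";
    const srcMatch = srcAttrRE.exec(attrs);
    if (srcMatch) {
      // External script: import the referenced file.
      js += "import " + JSON.stringify(srcMatch[1]) + "\n";
    } else if (content.trim()) {
      // Inline script: cache its content and re-export a virtual module.
      const key = filePath + "?id=" + scriptId++;
      scripts[key] = content;
      js += "export * from " + JSON.stringify("virtual-module:" + key) + "\n";
    }
  }
  return { js, scripts };
}

const page =
  '<script type="module" src="/src/main.ts"></script>\n' +
  '<script type="module">import "./inline.ts"</script>';

// converted.js contains one import line and one export line;
// converted.scripts holds the inline script under "/index.html?id=0".
const converted = htmlToJs("/index.html", page);
```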
What is srcMatch[1] || srcMatch[2] || srcMatch[3] for?
Let's look at the regular expression being matched:
const srcRE = /\bsrc\s*=\s*(?:"([^"]+)"|'([^']+)'|([^\s'">]+))/im
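This is because src can be written in three ways: with double quotes (src="xxx"), with single quotes (src='xxx'), or without quotes (src=xxx).
Only one of the three cases occurs for a given tag, which is why there are three capture groups.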
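A short example of the three capture groups in action. The regular expression is the srcRE shown above; the helper name `getSrc` and the sample tags are assumptions made for this example.

```typescript
const srcRE = /\bsrc\s*=\s*(?:"([^"]+)"|'([^']+)'|([^\s'">]+))/im;

// Hypothetical helper: returns the src value whichever quoting style was used.
function getSrc(openTag: string): string | undefined {
  const m = openTag.match(srcRE);
  // Exactly one of the three groups is filled when there is a match.
  return m ? m[1] || m[2] || m[3] : undefined;
}

const doubleQuoted = getSrc('<script src="/a.js">'); // "/a.js", from group 1
const singleQuoted = getSrc("<script src='/b.js'>"); // "/b.js", from group 2
const unquoted = getSrc("<script src=/c.js>"); // "/c.js", from group 3
const noSrc = getSrc("<script>"); // undefined
```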
How is a virtual module loaded as the corresponding script code?
    export const virtualModuleRE = /^virtual-module:.*/

    // Match all virtual modules and mark their namespace as script
    build.onResolve({ filter: virtualModuleRE }, ({ path }) => {
      return {
        // Strip the prefix:
        // virtual-module:D:/project/index.html?id=0 => D:/project/index.html?id=0
        path: path.replace(virtualModulePrefix, ''),
        namespace: 'script'
      }
    })

    // The inline script content was saved to the scripts object earlier
    // and is retrieved when the virtual module is loaded
    build.onLoad({ filter: /.*/, namespace: 'script' }, ({ path }) => {
      return scripts[path]
    })
Loading a virtual module is simple: the previously cached content is read directly from the scripts object.
After this, modules of the HTML type can be converted into JS.
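The round trip of a virtual module can be sketched without a bundler. The function names `resolveVirtual` and `loadVirtual` and the cached entry are hypothetical and only mirror the two hooks above: the resolve step strips the prefix and tags the namespace, and the load step looks the content up in the cache.

```typescript
const virtualModulePrefix = "virtual-module:";

interface CachedScript { loader: string; contents: string }

// Cache filled during the HTML load step (hypothetical sample entry).
const scriptCache: Record<string, CachedScript> = {
  "/project/index.html?id=0": { loader: "js", contents: 'import "vue"' },
};

// Resolve step: strip the prefix and tag the module with the "script" namespace.
function resolveVirtual(id: string): { path: string; namespace: string } | undefined {
  if (id.indexOf(virtualModulePrefix) !== 0) return undefined;
  return { path: id.slice(virtualModulePrefix.length), namespace: "script" };
}

// Load step: return whatever was cached for that path.
function loadVirtual(path: string): CachedScript | undefined {
  return scriptCache[path];
}

const resolvedVirtual = resolveVirtual("virtual-module:/project/index.html?id=0");
const loadedVirtual = resolvedVirtual ? loadVirtual(resolvedVirtual.path) : undefined;
// loadedVirtual is the cached entry, whose contents import "vue", so the
// bare import inside the inline script is seen by the scan.
```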
Scan result
Below is an example of a depImports object:
    {
      "vue": "D:/app/vite/node_modules/.pnpm/vue@3.2.37/node_modules/vue/dist/vue.runtime.esm-bundler.js",
      "vue/dist/vue.d.ts": "D:/app/vite/node_modules/.pnpm/vue@3.2.37/node_modules/vue/dist/vue.d.ts",
      "lodash-es": "D:/app/vite/node_modules/.pnpm/lodash-es@4.17.21/node_modules/lodash-es/lodash.js"
    }
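Dependency scanning is a very important step before pre-building, because it determines which dependencies Vite needs to pre-build.
This article described what Vite pre-builds, then analyzed the basic idea behind dependency scanning: a depth-first traversal of the dependency tree that handles each type of module appropriately. It then showed how Vite cleverly uses esbuild to carry out this process, and finally walked through the relevant source code.
The depImports object obtained at the end records each dependency together with its real path.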