This article shows how to build a decision tree in JavaScript and points out what to watch for along the way. Let's walk through a practical example.
Decision tree algorithm: code implementation
1. Prepare test data
As an example, suppose a young woman at the company is deciding which blind-date candidates to meet. The following data records who she agreed to meet and who was eliminated (part of the data was generated with mock.js):
// Keys: 姓名 = name, 年龄 = age, 长相 = looks, 体型 = build, 收入 = income, 见面 = meet
// Values: 帅 / 不帅 = handsome / not handsome, 瘦 / 胖 = thin / heavy, 高 / 低 = high / low, 见 / 不见 = meet / do not meet
var data = [
    { "姓名": "余夏", "年龄": 29, "长相": "帅", "体型": "瘦", "收入": "高", 见面: "见" },
    { "姓名": "豆豆", "年龄": 25, "长相": "帅", "体型": "瘦", "收入": "高", 见面: "见" },
    { "姓名": "帅常荣", "年龄": 26, "长相": "帅", "体型": "胖", "收入": "高", 见面: "见" },
    { "姓名": "王涛", "年龄": 22, "长相": "帅", "体型": "瘦", "收入": "高", 见面: "见" },
    { "姓名": "李东", "年龄": 23, "长相": "帅", "体型": "瘦", "收入": "高", 见面: "见" },
    { "姓名": "王五五", "年龄": 23, "长相": "帅", "体型": "瘦", "收入": "低", 见面: "见" },
    { "姓名": "王小涛", "年龄": 22, "长相": "帅", "体型": "瘦", "收入": "低", 见面: "见" },
    { "姓名": "李缤", "年龄": 21, "长相": "帅", "体型": "胖", "收入": "高", 见面: "见" },
    { "姓名": "刘明", "年龄": 21, "长相": "帅", "体型": "胖", "收入": "低", 见面: "不见" },
    { "姓名": "红鹤", "年龄": 21, "长相": "不帅", "体型": "胖", "收入": "高", 见面: "不见" },
    { "姓名": "李理", "年龄": 32, "长相": "帅", "体型": "瘦", "收入": "高", 见面: "不见" },
    { "姓名": "周州", "年龄": 31, "长相": "帅", "体型": "瘦", "收入": "高", 见面: "不见" },
    { "姓名": "李乐", "年龄": 27, "长相": "不帅", "体型": "胖", "收入": "高", 见面: "不见" },
    { "姓名": "韩明", "年龄": 24, "长相": "不帅", "体型": "瘦", "收入": "高", 见面: "不见" },
    { "姓名": "小吕", "年龄": 28, "长相": "帅", "体型": "瘦", "收入": "低", 见面: "不见" },
    { "姓名": "李四", "年龄": 25, "长相": "帅", "体型": "瘦", "收入": "低", 见面: "不见" },
    { "姓名": "王鹏", "年龄": 30, "长相": "帅", "体型": "瘦", "收入": "低", 见面: "不见" },
];
2. Build the basic functions of the decision tree
Code:
function DecisionTree(config) {
    if (typeof config == "object" && !Array.isArray(config)) this.training(config);
};
DecisionTree.prototype = {
    // Split predicates
    _predicates: {},
    // Count how many times each value of an attribute occurs in the data set
    countUniqueValues(items, attr) {},
    // Get the key with the largest value, e.g. counter = {a: 9, b: 2} returns "a"
    getMaxKey(counter) {},
    // Find the most frequent value of a given attribute
    mostFrequentValue(items, attr) {},
    // Split the data set by an attribute
    split(items, attr, predicate, pivot) {},
    // Compute entropy
    entropy(items, attr) {},
    // Build the decision tree
    buildDecisionTree(config) {},
    // Initialize and build the decision tree
    training(config) {},
    // Predict / test
    predict(data) {},
};
var decisionTree = new DecisionTree();
3. Implement the functions
Some of the functions are simple enough that I will not explain them.
The complete code is in JS Simple Implementation of Decision Tree (ID3 Algorithm)_demo.html, which contains comments and tests for each function.
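For readers who want to see what the skipped helpers might look like, here is a rough standalone sketch. The names getMaxKey, mostFrequentValue, split and the "==" / ">=" predicate keys come from the skeleton and from the buildDecisionTree listing further down; the function bodies are assumptions for illustration, not the author's code. The three sample records are taken from the test data.

```javascript
// Assumed implementations of the helpers the article does not explain.
// Only the names and signatures come from the article; the bodies are a sketch.
const predicates = {
  "==": (a, b) => a == b, // categorical attributes
  ">=": (a, b) => a >= b, // numeric attributes
};

// Count how often each value of `attr` occurs in `items`.
function countUniqueValues(items, attr) {
  const counter = {};
  for (const item of items) {
    counter[item[attr]] = (counter[item[attr]] || 0) + 1;
  }
  return counter;
}

// Return the key with the largest value, e.g. {a: 9, b: 2} gives "a".
function getMaxKey(counter) {
  let maxKey = null;
  for (const key in counter) {
    if (maxKey === null || counter[key] > counter[maxKey]) maxKey = key;
  }
  return maxKey;
}

// Most frequent value of `attr` in `items`.
function mostFrequentValue(items, attr) {
  return getMaxKey(countUniqueValues(items, attr));
}

// Partition `items` by whether predicate(item[attr], pivot) holds.
function split(items, attr, predicate, pivot) {
  const result = { match: [], notMatch: [] };
  for (const item of items) {
    (predicate(item[attr], pivot) ? result.match : result.notMatch).push(item);
  }
  return result;
}

const sample = [
  { 姓名: "余夏", 年龄: 29, 长相: "帅", 见面: "见" },
  { 姓名: "红鹤", 年龄: 21, 长相: "不帅", 见面: "不见" },
  { 姓名: "李理", 年龄: 32, 长相: "帅", 见面: "不见" },
];
console.log(getMaxKey({ a: 9, b: 2 })); // a
console.log(mostFrequentValue(sample, "见面"));
console.log(split(sample, "年龄", predicates[">="], 29));
```

Returning an object with match and notMatch arrays mirrors how currSplit.match and currSplit.notMatch are used later in the article.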
Code:
//......omitted
// Count how many times each value of an attribute occurs in the data set
countUniqueValues(items, attr) {
    var counter = {}; // distinct values and their occurrence counts
    for (var i of items) {
        if (!counter[i[attr]]) counter[i[attr]] = 0;
        counter[i[attr]] += 1;
    }
    return counter;
},
//......omitted
// Compute entropy
entropy(items, attr) {
    var counter = this.countUniqueValues(items, attr); // occurrence count of each value
    var p, entropy = 0; // H(S) = entropy = -∑(P(Xi) * log2(P(Xi)))
    for (var i in counter) {
        p = counter[i] / items.length; // P(Xi), the probability of the value
        entropy += -p * Math.log2(p); // entropy += -(P(Xi) * log2(P(Xi)))
    }
    return entropy;
},
//......omitted
var decisionTree = new DecisionTree();
console.log("countUniqueValues test:");
console.log(" 长相", decisionTree.countUniqueValues(data, "长相")); // test
console.log(" 年龄", decisionTree.countUniqueValues(data, "年龄")); // test
console.log(" 收入", decisionTree.countUniqueValues(data, "收入")); // test
console.log("entropy test:");
console.log(" 长相", decisionTree.entropy(data, "长相")); // test
console.log(" 年龄", decisionTree.entropy(data, "年龄")); // test
console.log(" 收入", decisionTree.entropy(data, "收入")); // test
3.2. Information gain
Formula: IG(A, S) = H(S) - ∑ p(t) H(t)
According to the formula, to compute the information gain we need:
H(S): the entropy of the training set
p(t): the proportion of elements that fall into branch t
H(t): the entropy of the data set in branch t
Each split divides the data into two subsets t, match (suitable) and not match (unsuitable), so H(t) is one of:
H(match): the entropy of the matching subset after the split
H(not match): the entropy of the non-matching subset after the split
So the information gain is G = H(S) - (p(match) H(match) + p(not match) H(not match)).
Because p(match) = number of matches / total number of items in the data set, and likewise for p(not match), this becomes:
G = H(S) - ((number of matches) x H(match) + (number of non-matches) x H(not match)) / total number of items in the data set
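As a quick worked example of the formula, the sketch below computes the information gain of splitting the 17 test records on 长相 == "帅". Only the two attributes involved are kept: 8 records are 帅 and 见, 6 are 帅 and 不见, and 3 are 不帅 and 不见. The standalone entropy and informationGain functions are hypothetical helpers written for this illustration, not part of the article's DecisionTree object.

```javascript
// Entropy of the values of `attr` in `items`: H = -sum(p * log2(p)).
function entropy(items, attr) {
  const counter = {};
  for (const item of items) {
    counter[item[attr]] = (counter[item[attr]] || 0) + 1;
  }
  let h = 0;
  for (const key in counter) {
    const p = counter[key] / items.length;
    h -= p * Math.log2(p);
  }
  return h;
}

// G = H(S) - (|match| * H(match) + |notMatch| * H(notMatch)) / |S|
// for the equality split item[attr] == pivot (hypothetical helper).
function informationGain(items, attr, pivot, categoryAttr) {
  const match = items.filter((item) => item[attr] == pivot);
  const notMatch = items.filter((item) => item[attr] != pivot);
  const weighted =
    (match.length * entropy(match, categoryAttr) +
      notMatch.length * entropy(notMatch, categoryAttr)) / items.length;
  return entropy(items, categoryAttr) - weighted;
}

// The 长相 and 见面 columns of the 17 test records, grouped by value.
const rows = [
  ...Array(8).fill({ 长相: "帅", 见面: "见" }),
  ...Array(6).fill({ 长相: "帅", 见面: "不见" }),
  ...Array(3).fill({ 长相: "不帅", 见面: "不见" }),
];

const hS = entropy(rows, "见面");
const gain = informationGain(rows, "长相", "帅", "见面");
console.log("H(S) =", hS.toFixed(4));
console.log("G    =", gain.toFixed(4));
```

Here H(S) comes out at roughly 0.998 and the gain at roughly 0.186. The non-matching subset contains only 不见, so its entropy is 0 and all of the remaining uncertainty sits in the 14 matching records.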
//......omitted
buildDecisionTree(config) {
    var trainingSet = config.trainingSet; // training set
    var categoryAttr = config.categoryAttr; // category attribute used for classification
    //......omitted
    // Initial entropy of the training set
    var initialEntropy = this.entropy(trainingSet, categoryAttr); // H(S)
    //......omitted
    // The listing is garbled at this point. The two loop headers and the start of the skip
    // condition are reconstructed from the variables used below; ignoredAttributes,
    // alreadyChecked and bestSplit are assumed to be set up in the omitted code.
    for (var item of trainingSet) {
        for (var attr in item) {
            if ((attr == categoryAttr) || (ignoredAttributes.indexOf(attr) >= 0)) continue;
            var pivot = item[attr]; // value of the current attribute
            var predicateName = ((typeof pivot == 'number') ? '>=' : '=='); // choose the predicate by data type
            var attrPredPivot = attr + predicateName + pivot;
            if (alreadyChecked.indexOf(attrPredPivot) >= 0) continue; // skip splits that were already evaluated
            alreadyChecked.push(attrPredPivot); // record this split
            var predicate = this._predicates[predicateName]; // look up the split predicate
            var currSplit = this.split(trainingSet, attr, predicate, pivot);
            var matchEntropy = this.entropy(currSplit.match, categoryAttr); // H(match): entropy of the matching subset
            var notMatchEntropy = this.entropy(currSplit.notMatch, categoryAttr); // H(not match): entropy of the non-matching subset
            // Information gain:
            // IG(A, S) = H(S) - ∑ P(t) H(t)
            // t ranges over the split subsets match and not match
            // P(match) = length of match / length of the data set
            // P(not match) = length of not match / length of the data set
            var iGain = initialEntropy - ((matchEntropy * currSplit.match.length + notMatchEntropy * currSplit.notMatch.length) / trainingSet.length);
            // Keep the node information for the best gain seen so far
            if (iGain > bestSplit.gain) {
                //......omitted
            }
        }
    }
    //......recursively compute the branches
}
After working through this example, you should have a good grasp of the method.
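The recursion and the prediction step are omitted from the listing. As one possible way to picture them, here is a minimal self-contained sketch. It is not the author's implementation: buildTree, predict, the node shape, and the stopping rules (stop when a subset is pure or when no split improves the gain) are all assumptions, and the training set is a six-record subset of the test data.

```javascript
// Hypothetical ID3-style sketch: recursive tree building plus prediction.
const predicates = {
  "==": (a, b) => a == b, // categorical values
  ">=": (a, b) => a >= b, // numeric values
};

function countUniqueValues(items, attr) {
  const counter = {};
  for (const item of items) counter[item[attr]] = (counter[item[attr]] || 0) + 1;
  return counter;
}

function entropy(items, attr) {
  const counter = countUniqueValues(items, attr);
  let h = 0;
  for (const key in counter) {
    const p = counter[key] / items.length;
    h -= p * Math.log2(p);
  }
  return h;
}

function mostFrequentValue(items, attr) {
  const counter = countUniqueValues(items, attr);
  return Object.keys(counter).reduce((a, b) => (counter[b] > counter[a] ? b : a));
}

function buildTree(items, categoryAttr, ignored = []) {
  const initialEntropy = entropy(items, categoryAttr);
  // Pure subset: every item has the same category, so this is a leaf.
  if (initialEntropy < 1e-9) return { category: items[0][categoryAttr] };

  let best = { gain: 0 };
  const checked = new Set();
  for (const item of items) {
    for (const attr in item) {
      if (attr === categoryAttr || ignored.includes(attr)) continue;
      const pivot = item[attr];
      const name = typeof pivot === "number" ? ">=" : "==";
      const key = attr + name + pivot;
      if (checked.has(key)) continue; // this split was already evaluated
      checked.add(key);
      const match = items.filter((x) => predicates[name](x[attr], pivot));
      const notMatch = items.filter((x) => !predicates[name](x[attr], pivot));
      if (match.length === 0 || notMatch.length === 0) continue; // not a real split
      const gain = initialEntropy -
        (match.length * entropy(match, categoryAttr) +
          notMatch.length * entropy(notMatch, categoryAttr)) / items.length;
      if (gain > best.gain) best = { gain, attr, name, pivot, match, notMatch };
    }
  }
  // No split improves on the current node: fall back to the majority category.
  if (!best.attr) return { category: mostFrequentValue(items, categoryAttr) };

  return {
    attr: best.attr,
    predicateName: best.name,
    pivot: best.pivot,
    match: buildTree(best.match, categoryAttr, ignored),
    notMatch: buildTree(best.notMatch, categoryAttr, ignored),
  };
}

// Walk from the root to a leaf, following the branch each predicate selects.
function predict(tree, item) {
  let node = tree;
  while (node.category === undefined) {
    const ok = predicates[node.predicateName](item[node.attr], node.pivot);
    node = ok ? node.match : node.notMatch;
  }
  return node.category;
}

const trainingSet = [
  { 姓名: "余夏", 年龄: 29, 长相: "帅", 体型: "瘦", 收入: "高", 见面: "见" },
  { 姓名: "王五五", 年龄: 23, 长相: "帅", 体型: "瘦", 收入: "低", 见面: "见" },
  { 姓名: "李缤", 年龄: 21, 长相: "帅", 体型: "胖", 收入: "高", 见面: "见" },
  { 姓名: "红鹤", 年龄: 21, 长相: "不帅", 体型: "胖", 收入: "高", 见面: "不见" },
  { 姓名: "李理", 年龄: 32, 长相: "帅", 体型: "瘦", 收入: "高", 见面: "不见" },
  { 姓名: "韩明", 年龄: 24, 长相: "不帅", 体型: "瘦", 收入: "高", 见面: "不见" },
];
const tree = buildTree(trainingSet, "见面", ["姓名"]);
console.log(JSON.stringify(tree, null, 2));
console.log(predict(tree, { 年龄: 26, 长相: "帅", 体型: "瘦", 收入: "高" }));
```

On this small subset the root split is on 长相, followed by an age threshold. With only six records the tree memorizes the sample, so it illustrates the mechanics rather than producing a useful model.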
