XGBoost is a popular machine learning algorithm that frequently tops the leaderboards of Kaggle and other data science competitions. What sets XGBoost apart is its ability to combine many weak models (in this case, decision trees) into a single strong model. It does this through a technique called gradient boosting, which helps make the algorithm robust and highly effective for a wide range of prediction tasks.
XGBoost uses gradient boosting, which means it builds trees sequentially, with each new tree trying to correct the mistakes of the previous ones. Here is a simplified view of the process:

- The first tree makes an initial prediction from the training data.
- The residuals (the errors the ensemble still makes) are computed.
- The next tree is trained to predict those residuals, and its output is added to the ensemble, scaled by a learning rate.
- This repeats for a fixed number of rounds; the final prediction is the sum of all the trees' contributions.
For example, if we are predicting house prices: the first tree might predict $200,000 for a house that actually sold for $250,000, the second tree then learns to predict part of that remaining $50,000 error, and each later tree keeps chipping away at whatever error is left.
This process, combined with some clever math and optimization, is what makes XGBoost both accurate and fast.
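To make the residual-fitting loop concrete, here is a toy sketch in JavaScript. It is not how XGBoost is implemented (the real library builds full decision trees with regularization and second-order gradients); it only illustrates the additive "each tree corrects the last" idea, using one-split stumps on a single feature.

```js
// Toy gradient boosting for regression: each round fits a one-split "stump"
// to the residuals of the current ensemble and adds a damped correction.
function fitStump(xs, residuals) {
  const mean = arr => arr.reduce((s, v) => s + v, 0) / arr.length;
  let best = null;
  const sorted = [...xs].sort((a, b) => a - b);
  for (let i = 1; i < sorted.length; i++) {
    const threshold = (sorted[i - 1] + sorted[i]) / 2;
    const left = [], right = [];
    xs.forEach((x, j) => (x < threshold ? left : right).push(residuals[j]));
    if (!left.length || !right.length) continue;
    const leftMean = mean(left), rightMean = mean(right);
    const error = xs.reduce((sum, x, j) => {
      const pred = x < threshold ? leftMean : rightMean;
      return sum + (residuals[j] - pred) ** 2;
    }, 0);
    if (!best || error < best.error) best = { threshold, leftMean, rightMean, error };
  }
  if (!best) return () => 0; // all x values identical: nothing to split on
  return x => (x < best.threshold ? best.leftMean : best.rightMean);
}

function boost(xs, ys, rounds = 50, eta = 0.3) {
  const base = ys.reduce((s, v) => s + v, 0) / ys.length; // start from the mean
  const trees = [];
  let preds = xs.map(() => base);
  for (let r = 0; r < rounds; r++) {
    const residuals = ys.map((y, i) => y - preds[i]); // what we still get wrong
    const tree = fitStump(xs, residuals);             // the next tree learns those errors
    trees.push(tree);
    preds = preds.map((p, i) => p + eta * tree(xs[i]));
  }
  return x => trees.reduce((p, tree) => p + eta * tree(x), base);
}

// Predict price (in thousands) from square footage alone:
const predictPrice = boost([800, 1000, 1200, 1500], [180, 210, 250, 320]);
console.log(predictPrice(1100)); // approaches 250, the price of the similar 1200 sqft example
```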
Although XGBoost was originally implemented as a C++ library, bindings exist for languages such as Python and R, which makes it accessible to the broad community of developers who typically specialize in data and machine learning.
I recently had a project with a hard requirement on Node.js, so I saw an opportunity to close that gap by writing bindings for Node.js. I hope this helps open more doors to ML for JavaScript developers.
In this article, we'll take a closer look at how to use XGBoost in a Node.js application.
Before you begin, make sure you have Node.js and npm installed.

Install the XGBoost Node.js bindings with npm:
npm install xgboost_node
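If you want to verify that the binding installed correctly before writing any real code, a small smoke test is enough. Run it as an ES module (for example, save it as check.mjs), since the examples below use `import`:

```js
// check.mjs - verifies that the xgboost_node binding loads
import xgboost from 'xgboost_node';
console.log('xgboost_node loaded:', typeof xgboost);
```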
Before diving into the code, let's look at what our features represent in the house price prediction example:
```js
// Each feature array represents:
// [square_feet, property_age, total_rooms, has_parking, neighborhood_type, is_furnished]
// Example: [1200, 8, 10, 0, 1, 1]
```
Here's what each feature means:

- square_feet: the size of the home in square feet
- property_age: the age of the property in years
- total_rooms: the total number of rooms
- has_parking: 1 if the home has parking, 0 if not
- neighborhood_type: the neighborhood category encoded as a number (e.g. 1 for residential)
- is_furnished: 1 if the home is furnished, 0 if not
And the corresponding labels array contains house prices in thousands (e.g., 250 means $250,000).
If you have raw data in a different format, here's how to transform it for XGBoost:
```js
// Let's say you have data in this format:
const rawHouses = [
  {
    address: "123 Main St",
    sqft: 1200,
    yearBuilt: 2015,
    rooms: 10,
    parking: "Yes",
    neighborhood: "Residential",
    furnished: true,
    price: 250000
  },
  // ... more houses
];

// Transform it to XGBoost format:
const features = rawHouses.map(house => [
  house.sqft,
  new Date().getFullYear() - house.yearBuilt,   // Convert year built to age
  house.rooms,
  house.parking === "Yes" ? 1 : 0,              // Convert Yes/No to 1/0
  house.neighborhood === "Residential" ? 1 : 2, // Convert category to number
  house.furnished ? 1 : 0                       // Convert boolean to 1/0
]);

const labels = rawHouses.map(house => house.price / 1000); // Convert price to thousands
```
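Before training, it is worth a quick sanity check that every row has the same length and contains only numbers; garbage in the feature matrix is a common source of confusing errors. The `assertValidFeatures` helper below is hypothetical (not part of xgboost_node), just a sketch of the kind of check you might add:

```js
// Hypothetical sanity check for the transformed data (not part of xgboost_node)
function assertValidFeatures(features, labels) {
  if (features.length !== labels.length) {
    throw new Error('features and labels must have the same length');
  }
  const width = features[0].length;
  for (const row of features) {
    if (row.length !== width) throw new Error('inconsistent feature vector length');
    if (row.some(v => typeof v !== 'number' || Number.isNaN(v))) {
      throw new Error('feature values must be numeric');
    }
  }
}

assertValidFeatures(features, labels);
```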
Here's a complete example that shows how to train a model and make predictions:
```js
import xgboost from 'xgboost_node';

async function test() {
  const features = [
    [1200, 8, 10, 0, 1, 1],
    [800, 14, 15, 1, 2, 0],
    [1200, 8, 10, 0, 1, 1],
    [1200, 8, 10, 0, 1, 1],
    [1200, 8, 10, 0, 1, 1],
    [800, 14, 15, 1, 2, 0],
    [1200, 8, 10, 0, 1, 1],
    [1200, 8, 10, 0, 1, 1],
  ];
  const labels = [250, 180, 250, 180, 250, 180, 250, 180];

  const params = {
    max_depth: 3,
    eta: 0.3,
    objective: 'reg:squarederror',
    eval_metric: 'rmse',
    nthread: 4,
    num_round: 100,
    min_child_weight: 1,
    subsample: 0.8,
    colsample_bytree: 0.8,
  };

  try {
    await xgboost.train(features, labels, params);
    const predictions = await xgboost.predict([
      [1000, 0, 1, 0, 1, 1],
      [800, 0, 1, 0, 1, 1],
    ]);
    console.log('Predicted value:', predictions[0]);
  } catch (error) {
    console.error('Error:', error);
  }
}

test();
```
The example above shows how to:

- prepare the features as arrays of numbers and the labels as target prices (in thousands)
- configure the training parameters
- train a model with xgboost.train
- make predictions on new feature vectors with xgboost.predict
- handle training and prediction errors with try/catch
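One small follow-up: because the labels are in thousands, a predicted value of 250 means $250,000. If you want to display the price in dollars, you could add a couple of lines right after the predict call inside the try block of the example above:

```js
// Labels were given in thousands, so scale the prediction back up for display
const priceInDollars = predictions[0] * 1000;
console.log(`Estimated price: $${priceInDollars.toLocaleString()}`);
```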
XGBoost provides straightforward methods for saving and loading models:
```js
// Save model after training
await xgboost.saveModel('model.xgb');

// Load model for predictions
await xgboost.loadModel('model.xgb');
```
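In practice you would usually train once, save the model, and load it later in a separate process that only serves predictions. Here is a minimal sketch of that pattern, reusing only the loadModel and predict calls shown above:

```js
import xgboost from 'xgboost_node';

async function predictFromSavedModel() {
  // Load the model trained and saved earlier; no retraining needed
  await xgboost.loadModel('model.xgb');
  const predictions = await xgboost.predict([[1000, 5, 8, 1, 1, 0]]);
  console.log('Predicted price (thousands):', predictions[0]);
}

predictFromSavedModel().catch(console.error);
```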
You may have noticed the model takes quite a few parameters. I'd advise reading the XGBoost documentation to understand how to choose and tune them for your data. Here's what some of these parameters are trying to achieve:
```js
const params = {
  max_depth: 3,                  // Controls how deep each tree can grow
  eta: 0.3,                      // Learning rate - how much we adjust for each tree
  objective: 'reg:squarederror', // For regression problems
  eval_metric: 'rmse',           // How we measure prediction errors
  nthread: 4,                    // Number of parallel processing threads
  num_round: 100,                // Number of trees to build
  min_child_weight: 1,           // Minimum amount of data in a leaf
  subsample: 0.8,                // Fraction of data to use in each tree
  colsample_bytree: 0.8,         // Fraction of features to consider for each tree
};
```
These parameters significantly impact your model's performance and behavior. For example:

- a larger max_depth lets each tree capture more complex patterns, but makes overfitting more likely
- a smaller eta makes learning more gradual and usually more robust, but needs more num_round trees to compensate
- subsample and colsample_bytree values below 1.0 introduce randomness that helps the model generalize
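As an illustration of one such trade-off (not a recipe for every dataset), lowering the learning rate usually calls for more boosting rounds to reach the same accuracy. This sketch just builds on the params object shown above:

```js
// Illustrative only: a slower-learning configuration derived from the params above
const conservativeParams = {
  ...params,
  eta: 0.05,      // smaller correction per tree
  num_round: 500, // more trees to compensate for the smaller steps
};
```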
This guide provides a starting point for using XGBoost in Node.js. For production use, I recommend validating your input data, evaluating the model on data it was not trained on, and tuning the parameters for your own dataset.
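As a starting point for the evaluation step, here is a minimal hold-out sketch. It uses only the train and predict calls shown earlier; the 80/20 split and RMSE computation are generic code, not part of xgboost_node, and a real project would want proper shuffling and cross-validation.

```js
import xgboost from 'xgboost_node';

// Naive random 80/20 split (illustrative; use a proper splitter in production)
function trainTestSplit(features, labels, testRatio = 0.2) {
  const indices = features.map((_, i) => i).sort(() => Math.random() - 0.5);
  const testIdx = new Set(indices.slice(0, Math.floor(features.length * testRatio)));
  const pick = (arr, keep) => arr.filter((_, i) => keep(i));
  return {
    trainX: pick(features, i => !testIdx.has(i)),
    trainY: pick(labels, i => !testIdx.has(i)),
    testX: pick(features, i => testIdx.has(i)),
    testY: pick(labels, i => testIdx.has(i)),
  };
}

async function evaluate(features, labels, params) {
  const { trainX, trainY, testX, testY } = trainTestSplit(features, labels);
  await xgboost.train(trainX, trainY, params);
  const preds = await xgboost.predict(testX);
  const rmse = Math.sqrt(
    testY.reduce((sum, y, i) => sum + (y - preds[i]) ** 2, 0) / testY.length
  );
  console.log('Hold-out RMSE (thousands):', rmse);
}

// Usage: evaluate(features, labels, params);
```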
Jonathan Farrow
@farrow_jonny