Project - Supervised Learning with Python - Let's Use Logistic Regression to Predict the Chances of Having a Heart Attack

DDD

Jan 18, 2025 pm 10:14 PM

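This tutorial demonstrates a machine learning project that uses Python and the LogisticRegression algorithm to predict the likelihood of a heart attack. A dataset sourced from Kaggle is analyzed to build the predictive model.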

Key concepts:

Logistic regression
StandardScaler (sklearn.preprocessing)
fit_transform()
train_test_split()
model.predict()
model.predict_proba()
classification_report()
roc_auc_score()

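Project goal:

This project illustrates a practical application of logistic regression: predicting heart attack risk from patient data. We will use Python to build and evaluate the predictive model.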

The Jupyter Notebook and dataset are available here:

Notebook: https://www.php.cn/link/aa3f874fb850d8908be9af3a69af4289

Dataset: https://www.php.cn/link/4223a1d5b9e017dda51515829140e5d2 (Kaggle source: https://www.php.cn/link/5bb77e5c6d452aee283844d47756dc05)

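Future plans:

Future tutorials will explore other machine learning concepts, focusing on supervised and unsupervised learning, as outlined in the Kaggle roadmap: https://www.php.cn/link/4bea9e07f447fd088811cc81697a4d4e [#Machine Learning Engineer Roadmap for 2025]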

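Target audience:

This tutorial is designed for Python enthusiasts interested in learning machine learning, especially newcomers to the field. It builds on a previous tutorial that covered linear regression.

Feel free to experiment with the notebook and explore different machine learning models!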

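Step-by-step guide:

Step 1: Data loading

import pandas as pd

# Read the CSV into a DataFrame and preview the first five rows
data = pd.read_csv('heart-disease-prediction.csv')
print(data.head())

This loads the dataset with pandas and prints the first few rows as a quick sanity check.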

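Step 2: Exploratory data analysis (EDA)

# info() prints its summary directly and returns None, so no print() wrapper is needed
data.info()

This summarizes the dataset's structure: column names, non-null counts, and data types.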
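Beyond info(), two further checks are often useful at this stage: summary statistics for each column and the balance of the target classes. The sketch below runs on a tiny made-up DataFrame rather than the Kaggle file; the values and the names df, summary, and class_share are assumptions for illustration only.

```python
import pandas as pd

# Tiny synthetic stand-in for the real dataset (values are made up for illustration).
df = pd.DataFrame({
    "age": [39, 46, 48, 61, 46, 43, 63, 45],
    "BMI": [26.9, 28.7, 25.3, 28.6, 23.1, 30.3, 33.1, 21.7],
    "TenYearCHD": [0, 0, 0, 1, 0, 0, 1, 0],
})

# Summary statistics (count, mean, std, quartiles) for every numeric column.
summary = df.describe()
print(summary)

# Share of each class in the target column.
class_share = df["TenYearCHD"].value_counts(normalize=True)
print(class_share)
```

When one class is much rarer than the other, accuracy alone can look good for a model that rarely predicts the rare class, which is one reason to also look at per-class metrics and ROC AUC when evaluating.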

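Step 3: Handling missing data

# Count missing values per column before imputation
print(data.isnull().sum())
# Fill gaps in each numeric column with that column's mean
data.fillna(data.mean(numeric_only=True), inplace=True)
# Check the counts again after imputation
print(data.isnull().sum())

Missing values are counted per column, filled with each column's mean, and then counted again to verify the result.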
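To make the mean-imputation step concrete, here is a small worked example on made-up numbers (not the Kaggle data). The column names mirror the dataset, but the values and the variable names col_means, missing_before, and missing_after are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Made-up values; NaN marks a missing measurement.
df = pd.DataFrame({
    "glucose": [80.0, np.nan, 100.0, 120.0],
    "BMI": [22.0, 26.0, np.nan, 30.0],
})

missing_before = df.isnull().sum()

# Column means skip NaN by default: glucose -> 100.0, BMI -> 26.0.
col_means = df.mean(numeric_only=True)

# Each NaN is replaced by the mean of its own column.
df = df.fillna(col_means)

missing_after = df.isnull().sum()
print(missing_before.to_dict(), missing_after.to_dict())
print(df)
```

Mean imputation keeps every row but pulls the filled values toward the center of the distribution, so it is a simple default rather than the only reasonable choice.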

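Step 4: Data preprocessing

# Feature columns used as model inputs
X = data[['age', 'totChol', 'sysBP', 'diaBP', 'cigsPerDay', 'BMI', 'glucose']]
# Binary target column
y = data['TenYearCHD']

This selects the relevant features (X) and the target variable (y).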

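Step 5: Data standardization

from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
# Learn each feature's mean and standard deviation, then rescale in one call
X = scaler.fit_transform(X)

StandardScaler rescales each feature to zero mean and unit variance, so that features measured on very different scales (for example age and cholesterol) contribute comparably to the model.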
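As a check on what fit_transform() computes, the following sketch compares StandardScaler's output with the z-score formula z = (x - mean) / std applied by hand. The four rows are made-up values, and the names X_demo, scaler_demo, X_scaled, and X_manual are assumptions for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up ages and systolic blood pressures for four patients.
X_demo = np.array([[40.0, 120.0],
                   [50.0, 130.0],
                   [60.0, 140.0],
                   [70.0, 150.0]])

scaler_demo = StandardScaler()
X_scaled = scaler_demo.fit_transform(X_demo)

# The same result by hand, using the population standard deviation (ddof=0).
X_manual = (X_demo - X_demo.mean(axis=0)) / X_demo.std(axis=0)

print(np.allclose(X_scaled, X_manual))
print(X_scaled.mean(axis=0), X_scaled.std(axis=0))
```

After scaling, every column has mean 0 and standard deviation 1, and the fitted scaler stores the learned means in scaler_demo.mean_ so the same transformation can be applied to new rows with transform().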

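Step 6: Data splitting

from sklearn.model_selection import train_test_split

# test_size=0.2 gives the 80/20 split; random_state is an assumption added for reproducibility
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

The dataset is split into training and test sets (80/20 split).

Step 7: Model training

from sklearn.linear_model import LogisticRegression
model = LogisticRegression().fit(X_train, y_train)  # default hyperparameters (an assumption)

A logistic regression model is trained on the training data.

Step 8: Model evaluation

from sklearn.metrics import classification_report, roc_auc_score

y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]  # probability of the positive class
print(classification_report(y_test, y_pred))
print('ROC AUC:', roc_auc_score(y_test, y_prob))

The model's performance is evaluated with classification_report and roc_auc_score.

Step 9: Model prediction

# Hypothetical patient (illustrative values only), in the same column order as X
new_patient = pd.DataFrame([[55, 250, 140, 90, 10, 28.5, 95]], columns=['age', 'totChol', 'sysBP', 'diaBP', 'cigsPerDay', 'BMI', 'glucose'])
new_scaled = scaler.transform(new_patient)
print(model.predict(new_scaled), model.predict_proba(new_scaled)[:, 1])

The trained model is used to predict the heart disease risk of a new patient.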
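The steps above scale the full dataset before splitting it, so the test rows influence the scaler's mean and variance. One common alternative, sketched here on synthetic data rather than the Kaggle file, wraps scaling and the classifier in a scikit-learn Pipeline so the scaler is fit on the training rows only. The feature values, the coefficients used to generate labels, and the names X_syn, y_syn, pipe, and new_rows are all assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the Kaggle dataset): risk rises with both features.
rng = np.random.default_rng(0)
n = 500
age = rng.normal(50, 9, n)
sys_bp = rng.normal(132, 22, n)
logit = 0.08 * (age - 50) + 0.03 * (sys_bp - 132) - 1.5
y_syn = rng.binomial(1, 1 / (1 + np.exp(-logit)))
X_syn = np.column_stack([age, sys_bp])

# stratify keeps the class ratio similar in the training and test sets.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_syn, y_syn, test_size=0.2, random_state=42, stratify=y_syn
)

# fit() fits the scaler on the training rows only, then trains the classifier.
pipe = make_pipeline(StandardScaler(), LogisticRegression())
pipe.fit(X_tr, y_tr)

# Evaluate on the held-out rows.
y_hat = pipe.predict(X_te)
p_hat = pipe.predict_proba(X_te)[:, 1]
print(classification_report(y_te, y_hat, zero_division=0))
auc = roc_auc_score(y_te, p_hat)
print("ROC AUC:", round(auc, 3))

# New rows pass through the same scaling automatically.
new_rows = np.array([[65.0, 160.0]])
new_prob = pipe.predict_proba(new_rows)[:, 1]
print(new_prob)
```

Because the pipeline carries its own fitted scaler, there is no separate scaler.transform() call to remember when scoring a new patient.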

Additional patient data is provided for further practice.
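When scoring several patients at once, it can help to look at the predicted probabilities rather than only the hard labels. The sketch below, trained on synthetic one-feature data (not the Kaggle dataset), shows how model.predict() and model.predict_proba() relate: the probability is the logistic function of the linear score, and predict() corresponds to a 0.5 cutoff. The batch values, the 0.3 threshold, and the names clf, batch, and labels_custom are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic one-feature training data (illustrative only).
rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, size=(300, 1))
y = rng.binomial(1, 1 / (1 + np.exp(-(1.5 * x[:, 0] - 0.5))))

clf = LogisticRegression().fit(x, y)

# A small batch of hypothetical, already-standardized inputs to score.
batch = np.array([[-2.0], [0.0], [2.0]])

proba = clf.predict_proba(batch)[:, 1]       # P(class 1) for each row
scores = clf.decision_function(batch)        # linear score w*x + b
sigmoid = 1.0 / (1.0 + np.exp(-scores))      # logistic function of the score

labels_default = clf.predict(batch)          # matches a 0.5 probability cutoff
labels_custom = (proba >= 0.3).astype(int)   # 0.3 is an arbitrary example threshold

print(proba, labels_default, labels_custom)
```

Lowering the threshold flags more patients as at risk, trading more false alarms for fewer missed cases; the right cutoff depends on the cost of each kind of error.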
