
ClassiSage

A machine learning model built with AWS SageMaker and its Python SDK for HDFS log classification, with the infrastructure setup automated using Terraform.

Link: GitHub
Languages: HCL (Terraform), Python

Contents

  • Overview: project overview.
  • System Architecture: system architecture diagram.
  • ML Model: model overview.
  • Getting Started: how to run the project.
  • Console Observations: changes to instances and infrastructure that can be observed while running the project.
  • Ending and Cleanup: making sure no additional charges are incurred.
  • Auto-created Objects: files and folders created during execution.

  • Follow the directory structure first for a smoother project setup.
  • For reference and better understanding, consult the ClassiSage project repository uploaded to GitHub.

Overview

  • The model performs HDFS log classification with AWS SageMaker, using S3 to store the dataset, the Notebook file (containing the code for the SageMaker instance), and the model output.
  • The infrastructure setup is automated with Terraform, a tool created by HashiCorp that provides infrastructure as code.
  • The dataset used is HDFS_v1.
  • The project implements the SageMaker Python SDK with the XGBoost model, version 1.2.

System Architecture

(Architecture diagram: ClassiSage, a Terraform IaC automated, AWS SageMaker based HDFS log classification model)

Machine Learning Model

  • Image URI
  # Looks up the XGBoost image URI and builds an XGBoost container.
  # Specify the repo_version depending on preference.
  import boto3
  from sagemaker.amazon.amazon_estimator import get_image_uri

  container = get_image_uri(boto3.Session().region_name,
                            'xgboost',
                            repo_version='1.0-1')


  • Initializing the hyperparameters and the estimator call for the container
  import sagemaker

  hyperparameters = {
        "max_depth":"5",                ## Maximum depth of a tree. Higher means more complex models but a risk of overfitting.
        "eta":"0.2",                    ## Learning rate. Lower values make learning slower but more precise.
        "gamma":"4",                    ## Minimum loss reduction required to make a further partition on a leaf node. Controls model complexity.
        "min_child_weight":"6",         ## Minimum sum of instance weight (hessian) needed in a child. Higher values prevent overfitting.
        "subsample":"0.7",              ## Fraction of training data used. Reduces overfitting by sampling part of the data.
        "objective":"binary:logistic",  ## Specifies the learning task and objective. binary:logistic is for binary classification.
        "num_round":50                  ## Number of boosting rounds, i.e. how many trees are added in sequence.
        }
  # A SageMaker estimator that calls the xgboost-container
  estimator = sagemaker.estimator.Estimator(image_uri=container,                  # Points to the XGBoost container set up above; tells SageMaker which algorithm container to use.
                                          hyperparameters=hyperparameters,      # Passes the defined hyperparameters that guide the training process.
                                          role=sagemaker.get_execution_role(),  # IAM role SageMaker assumes during the training job; grants access to AWS resources like S3.
                                          train_instance_count=1,               # Number of training instances; a single instance here.
                                          train_instance_type='ml.m5.large',    # Instance type for training. ml.m5.large is a general-purpose instance with a balance of compute, memory, and network resources.
                                          train_volume_size=5,                  # Size of the storage volume attached to the training instance, in GB. Here, 5 GB.
                                          output_path=output_path,              # Where the model artifacts and training job output are saved in S3.
                                          train_use_spot_instances=True,        # Uses spot instances (spare EC2 capacity), which can be significantly cheaper than on-demand instances.
                                          train_max_run=300,                    # Maximum runtime for the training job, in seconds (5 minutes).
                                          train_max_wait=600)                   # Maximum time to wait for the job, including waiting for spot capacity, in seconds (10 minutes).
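Since the objective above is binary:logistic, the booster's raw margin is mapped to a probability through the logistic (sigmoid) function; a minimal plain-Python illustration of that mapping:

```python
import math

def sigmoid(margin):
    # binary:logistic converts the raw boosted margin into a probability in (0, 1)
    return 1.0 / (1.0 + math.exp(-margin))

# Margins above 0 map to probabilities above 0.5 (the usual decision threshold)
probability = sigmoid(2.0)
```

A predicted probability above 0.5 is then read as the positive class when computing the classification metrics.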


  • Training job
  estimator.fit({'train': s3_input_train, 'validation': s3_input_test})


  • Deployment
  xgb_predictor = estimator.deploy(initial_instance_count=1, instance_type='ml.m5.large')


  • Validation
  # Illustrative sketch; the notebook's actual validation cell may differ.
  # Sends CSV-serialized test rows to the deployed endpoint and reads back predictions.
  from sagemaker.predictor import csv_serializer
  xgb_predictor.content_type = 'text/csv'
  xgb_predictor.serializer = csv_serializer
  predictions = xgb_predictor.predict(test_data_array).decode('utf-8')  # test_data_array is a placeholder name


Getting Started

  • Clone the repository / download the .zip file / fork the repository using Git Bash.
  • Go to your AWS Management Console, click your account profile in the top-right corner, and select My Security Credentials from the dropdown.
  • Create an access key: in the Access keys section, click Create New Access Key; a dialog will appear with your Access Key ID and Secret Access Key.
  • Download or copy the keys (important): download the .csv file or copy the keys to a secure location. This is the only time you can view the secret access key.
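The downloaded credentials file is a small two-column CSV; a minimal sketch of reading it back (the header names below follow AWS's usual accessKeys.csv export and are an assumption, as is the sample row):

```python
import csv
import io

def read_aws_keys(csv_text):
    # Assumed headers, matching AWS's typical accessKeys.csv export
    row = next(csv.DictReader(io.StringIO(csv_text)))
    return row["Access key ID"], row["Secret access key"]

sample = "Access key ID,Secret access key\nAKIAEXAMPLE,wJalrXUtnFEMI/EXAMPLEKEY\n"
access_key, secret_key = read_aws_keys(sample)
```

Keeping the parsing in one helper makes it easy to feed the keys into the terraform.tfvars file created in the next step without ever pasting them into the notebook.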
  • Open the cloned repository in VS Code.
  • Under ClassiSage, create a file named terraform.tfvars holding your credentials and region, along these lines (the variable names are illustrative; match the ones declared in the project's variables.tf, and never commit this file):
  access_key = "<your-access-key-id>"
  secret_key = "<your-secret-access-key>"
  region     = "<your-aws-region>"
  • Download and install all dependencies for using Terraform and Python.
  • In the terminal, type/paste terraform init to initialize the backend.

  • Then type/paste terraform plan to view the plan, or simply terraform validate to make sure there are no errors.

  • Finally, type/paste terraform apply --auto-approve in the terminal.

  • This displays two outputs, one as bucket_name and another as pretrained_ml_instance_name (the third resource is the variable name given to the bucket, since buckets are global resources).
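The two outputs can also be read programmatically. A small sketch that parses the JSON printed by `terraform output -json` (the output names follow the bullet above; the sample string stands in for the real command output):

```python
import json

def parse_tf_outputs(raw):
    # `terraform output -json` prints {"<name>": {"value": ..., ...}, ...}
    outputs = json.loads(raw)
    return (outputs["bucket_name"]["value"],
            outputs["pretrained_ml_instance_name"]["value"])

sample = ('{"bucket_name": {"value": "data-bucket-123"},'
          ' "pretrained_ml_instance_name": {"value": "pretrained-ml-instance"}}')
bucket, instance = parse_tf_outputs(sample)
```

This avoids copy-paste errors when the bucket name is needed again inside the notebooks below.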


  • After the terminal shows that the command has completed, navigate to ClassiSage/ml_ops/function.py and, on line 11 of the file, set the path to wherever your project directory lives, then save. An illustrative placeholder (the actual variable name may differ):
  path = "<absolute-path-to>/ClassiSage"  # line 11: replace with your local project path

  • Then, in ClassiSage/ml_ops/data_upload.ipynb, run all code cells up to cell number 25 to upload the dataset to the S3 bucket. An illustrative upload call (the notebook's actual cell may differ):
  boto3.client('s3').upload_file(local_dataset_path, bucket_name, 'final_dataset.csv')  # placeholder names

  • Output of the code cell execution


  • After executing the notebook, reopen your AWS Management Console.
  • Search for the S3 and SageMaker services; you will see an instance of each launched service (an S3 bucket and a SageMaker Notebook).

An S3 bucket named "data-bucket-" with 2 objects uploaded: the dataset, and the pretrained_sm.ipynb file containing the model code.



  • Go to the Notebook instance in AWS SageMaker, click the created instance, and click Open Jupyter.
  • Then click "New" in the top-right corner of the window and select "Terminal".
  • This creates a new terminal.

  • Paste the following into the terminal (substituting the bucket_name output shown in the VS Code terminal). The command below is an illustrative aws s3 copy; the actual command in the repository may differ:
  aws s3 cp s3://<bucket_name>/pretrained_sm.ipynb .

Terminal command to upload pretrained_sm.ipynb from S3 to the Notebook's Jupyter environment.



  • Return to the open Jupyter instance, click the pretrained_sm.ipynb file to open it, and assign it the conda_python3 kernel.
  • Scroll down to the fourth cell and replace the value of the variable bucket_name with the bucket_name output from the VS Code terminal:
  bucket_name = "<bucket_name-from-terraform-output>"  # e.g. "data-bucket-..."

Output of the code cell execution



  • Go to the Kernel tab at the top of the file and restart the kernel.
  • Execute the notebook through code cell number 27.
  • You will get the expected result: the data is fetched, split into train and test sets after being adjusted for labels and features with the defined output path; then, using SageMaker's Python SDK, the model is trained, deployed as an endpoint, and validated to report various metrics.
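The split step described above can be sketched as a plain 80/20 shuffle-and-cut (a hypothetical stand-in for the notebook's actual preprocessing, which also separates labels from features):

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=42):
    # Shuffle a copy deterministically, then cut off the final test_fraction
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]

train, test = train_test_split(range(100))
```

Fixing the seed keeps the split reproducible across notebook restarts, which matters when comparing metrics between runs.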

Console Observation Notes

Executing the 8th cell

  # Illustrative; the notebook's actual cell may differ:
  output_path = 's3://{}/{}/output'.format(bucket_name, prefix)
  • Sets up the output path in S3 where the model data is stored.


Executing the 23rd cell

  estimator.fit({'train': s3_input_train, 'validation': s3_input_test})
  • The training job starts; you can view it under the Training tab.


  • After a while (expect about 3 minutes), it completes and shows as such.


Executing the 24th code cell

  xgb_predictor = estimator.deploy(initial_instance_count=1, instance_type='ml.m5.large')
  • The endpoint is deployed under the Inference tab.


Additional console observations:

  • An endpoint configuration is created under the Inference tab.


  • A model is also created under the Inference tab.



Ending and Cleanup

  • Back in VS Code, in data_upload.ipynb, execute the last 2 code cells to download the S3 bucket's data to the local system.
  • The folder will be named downloaded_bucket_content, holding the directory structure of the downloaded bucket.


  • You will get a log of the downloaded files in the output cell. It will contain the original pretrained_sm.ipynb, final_dataset.csv, and a model output folder named "pretrained-algo" with the execution data of the SageMaker code file.
  • Finally, go into pretrained_sm.ipynb inside the SageMaker instance and execute the last 2 code cells. The endpoint and the resources within the S3 bucket are deleted, ensuring no additional charges are incurred.
  • Deleting the endpoint
  # Illustrative; the notebook's actual cell may differ:
  xgb_predictor.delete_endpoint()


  • Clearing S3 (required before the instances can be destroyed):
  # Illustrative; empties the data bucket so Terraform can destroy it
  # (the notebook's actual cell may differ):
  boto3.resource('s3').Bucket(bucket_name).objects.all().delete()
  • Go back to the VS Code terminal for the project files, then type/paste terraform destroy --auto-approve
  • All the created resource instances will be deleted.

Auto-created Objects

ClassiSage/downloaded_bucket_content
ClassiSage/.terraform
ClassiSage/ml_ops/__pycache__
ClassiSage/.terraform.lock.hcl
ClassiSage/terraform.tfstate
ClassiSage/terraform.tfstate.backup

Note:
If you liked the idea and the implementation of this machine learning project, which classifies HDFS logs using AWS Cloud's S3 and SageMaker and automates the infrastructure setup with Terraform (IaC), please consider liking this post and starring the project repository after checking it out on GitHub.
