The Overfitting Problem of Machine Learning Models and Its Solutions
In the field of machine learning, model overfitting is a common and challenging problem. When a model performs well on the training set but poorly on the test set, it is overfitting: it has memorized details of the training data instead of learning patterns that generalize. This article introduces the causes of overfitting and several ways to address it, with concrete code examples.
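To make the symptom concrete, below is a minimal sketch (using scikit-learn on synthetic data; the library choice and all numbers are illustrative, not from the original article) in which a high-capacity polynomial model fits the training set almost perfectly but generalizes poorly:

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Synthetic noisy data: y = sin(x) + noise
rng = np.random.RandomState(0)
X = rng.uniform(0, 6, 30).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.3, 30)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A degree-15 polynomial has enough capacity to memorize the training noise
model = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
model.fit(X_train, y_train)

# Typical outcome: training error near zero, test error much larger
print("train MSE:", mean_squared_error(y_train, model.predict(X_train)))
print("test MSE:", mean_squared_error(y_test, model.predict(X_test)))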
2.1 Data Augmentation
Data augmentation refers to generating more training samples by applying a series of transformations to the existing training set. For example, in image classification tasks, images can be rotated, scaled, or flipped to augment the data. This increases the effective size of the training set and helps the model generalize better.
The following is sample code for image data augmentation using the Keras library:
from keras.preprocessing.image import ImageDataGenerator

# Define the augmenter (applied to the training images only)
train_datagen = ImageDataGenerator(
    rotation_range=20,        # random rotation range in degrees
    width_shift_range=0.1,    # horizontal shift range (fraction of width)
    height_shift_range=0.1,   # vertical shift range (fraction of height)
    shear_range=0.2,          # shear transformation range
    zoom_range=0.2,           # zoom range
    horizontal_flip=True,     # random horizontal flips
    fill_mode='nearest'       # how to fill pixels created by the transforms
)
# The test set should be left untransformed
test_datagen = ImageDataGenerator()

# Load the image datasets from directories
train_data = train_datagen.flow_from_directory("train/", target_size=(224, 224),
                                               batch_size=32, class_mode='binary')
test_data = test_datagen.flow_from_directory("test/", target_size=(224, 224),
                                             batch_size=32, class_mode='binary')

# Train the model (assumes a compiled Keras model named `model`);
# model.fit accepts generators directly, and fit_generator is deprecated
model.fit(train_data, steps_per_epoch=len(train_data), epochs=10,
          validation_data=test_data, validation_steps=len(test_data))
2.2 Regularization
Regularization reduces the risk of overfitting by adding a regularization term to the model's loss function, penalizing model complexity. Common methods include L1 regularization, which adds a penalty on the sum of the absolute values of the weights, and L2 regularization, which adds a penalty on the sum of their squares.
The following is sample code for L2 regularization using the PyTorch library:
import torch
import torch.nn as nn

# Define the model
class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.fc1 = nn.Linear(10, 10)
        self.fc2 = nn.Linear(10, 1)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x

model = MyModel()

# Placeholder training data (replace with a real dataset)
inputs = torch.randn(32, 10)
labels = torch.randn(32, 1)

# Define the loss function
criterion = nn.MSELoss()

# Define the optimizer; the weight_decay argument is the L2 regularization coefficient
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=0.001)

# Train the model
for epoch in range(100):
    optimizer.zero_grad()
    outputs = model(inputs)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()
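Note that weight_decay implements L2 regularization only; PyTorch optimizers have no built-in L1 option, but an L1 penalty can be added to the loss by hand. A minimal sketch reusing the model, criterion, optimizer, and placeholder data above (the coefficient value is illustrative):

l1_lambda = 0.001  # L1 regularization coefficient (illustrative value)
for epoch in range(100):
    optimizer.zero_grad()
    outputs = model(inputs)
    # Add the sum of absolute weight values as an L1 penalty
    l1_penalty = sum(p.abs().sum() for p in model.parameters())
    loss = criterion(outputs, labels) + l1_lambda * l1_penalty
    loss.backward()
    optimizer.step()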
2.3 Dropout
Dropout is a commonly used regularization technique that randomly drops neurons during training to reduce the risk of overfitting. Specifically, in each training iteration, we randomly select some neurons and discard their outputs with a certain probability p.
The following is sample code for Dropout using the TensorFlow library:
import tensorflow as tf

# Define the model
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(10, activation=tf.nn.relu, input_shape=(10,)),
    tf.keras.layers.Dropout(0.5),  # dropout rate of 0.5
    tf.keras.layers.Dense(1)
])

# Compile the model
model.compile(optimizer='adam',
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True))

# Train the model (assumes x_train, y_train, x_test, y_test are already loaded)
model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test))
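Keras applies dropout only during training and uses inverted scaling (surviving activations are divided by 1 - p), so no adjustment is needed at inference time. A small sketch illustrating this behavior:

import tensorflow as tf

drop = tf.keras.layers.Dropout(0.5)
x = tf.ones((1, 4))
print(drop(x, training=True))   # roughly half the entries zeroed, survivors scaled by 1/(1-0.5)=2
print(drop(x, training=False))  # identity: dropout is inactive at inference time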