Rumah >Peranti teknologi >AI >Bagaimana untuk melakukan carian grid hiperparameter untuk model PyTorch menggunakan scikit-learn?
scikit-learn ialah perpustakaan pembelajaran mesin terbaik dalam Python, dan PyTorch memberikan kami operasi yang mudah untuk membina model. Bolehkah kelebihannya disepadukan? Dalam artikel ini, kami akan membincangkan cara menggunakan fungsi carian grid dalam scikit-belajar untuk menala hiperparameter model pembelajaran mendalam PyTorch:
Salah satu cara paling mudah untuk menjadikan model PyTorch tersedia dalam scikit-learn ialah menggunakan pakej skorch. Pakej ini menyediakan API serasi pembelajaran scikit untuk model PyTorch. Dalam skorch, terdapat NeuralNetClassifier untuk klasifikasi rangkaian saraf dan NeuralNetRegressor untuk rangkaian neural regresi.
pip install skorch
Untuk menggunakan pembalut ini, anda mesti menentukan model PyTorch anda sebagai kelas menggunakan nn.Module dan kemudian hantar nama kelas kepada parameter modul semasa membina kelas NeuralNetClassifier. Contohnya:
class MyClassifier(nn.Module): def __init__(self): super().__init__() ... def forward(self, x): ... return x # create the skorch wrapper model = NeuralNetClassifier( module=MyClassifier )
Pembina kelas NeuralNetClassifier boleh mendapatkan parameter yang dihantar kepada panggilan model.fit() (kaedah memanggil gelung latihan dalam model scikit-learn), seperti nombor daripada zaman dan saiz kelompok. Sebagai contoh:
model = NeuralNetClassifier( module=MyClassifier, max_epochs=150, batch_size=10 )
Pembina kelas NeuralNetClassifier juga boleh menerima parameter baharu Parameter ini boleh dihantar kepada pembina kelas model anda Keperluan ialah modul__ (dua garis bawah) mesti ditambah hadapannya. Parameter baharu ini mungkin mempunyai nilai lalai dalam pembina, tetapi ia akan ditindih apabila pembungkus menjadikan model tersebut. Contohnya:
import torch.nn as nn from skorch import NeuralNetClassifier class SonarClassifier(nn.Module): def __init__(self, n_layers=3): super().__init__() self.layers = [] self.acts = [] for i in range(n_layers): self.layers.append(nn.Linear(60, 60)) self.acts.append(nn.ReLU()) self.add_module(f"layer{i}", self.layers[-1]) self.add_module(f"act{i}", self.acts[-1]) self.output = nn.Linear(60, 1) def forward(self, x): for layer, act in zip(self.layers, self.acts): x = act(layer(x)) x = self.output(x) return x model = NeuralNetClassifier( module=SonarClassifier, max_epochs=150, batch_size=10, module__n_layers=2 )
Kami boleh mengesahkan keputusan dengan memulakan model dan mencetak:
print(model.initialize()) #结果如下: <class 'skorch.classifier.NeuralNetClassifier'>[initialized]( module_=SonarClassifier( (layer0): Linear(in_features=60, out_features=60, bias=True) (act0): ReLU() (layer1): Linear(in_features=60, out_features=60, bias=True) (act1): ReLU() (output): Linear(in_features=60, out_features=1, bias=True) ), )
Grid search It is teknologi pengoptimuman hiperparameter model. Ia hanya menghabiskan semua kombinasi hiperparameter dan mencari satu yang memberikan skor terbaik. Dalam scikit-learn, kelas GridSearchCV menyediakan teknik ini. Apabila membina kelas ini, kamus hiperparameter mesti disediakan dalam parameter param_grid. Ini ialah peta nama parameter model dan tatasusunan nilai untuk dicuba.
Lalainya ialah menggunakan ketepatan sebagai skor untuk pengoptimuman, tetapi markah lain boleh ditentukan dalam parameter skor pembina GridSearchCV. GridSearchCV akan membina model untuk setiap gabungan parameter untuk dinilai. Dan gunakan pengesahan silang 3 kali ganda lalai, yang boleh ditetapkan melalui parameter.
Berikut ialah contoh mentakrifkan carian grid mudah:
param_grid = { 'epochs': [10,20,30] } grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3) grid_result = grid.fit(X, Y)
Dengan menetapkan parameter n_jobs dalam pembina GridSearchCV kepada -1 bermakna semua teras pada mesin akan digunakan. Jika tidak, proses carian grid hanya akan berjalan dalam satu utas, yang lebih perlahan dalam CPU berbilang teras.
Selepas berjalan, anda boleh mengakses hasil carian grid dalam objek hasil yang dikembalikan oleh grid.fit(). best_score memberikan skor terbaik yang diperhatikan semasa pengoptimuman dan best_params_ menerangkan gabungan parameter yang mencapai hasil terbaik.
Semua contoh kami akan ditunjukkan pada set data pembelajaran mesin standard yang kecil, iaitu set data klasifikasi permulaan diabetes. Ini adalah set data kecil dan semua atribut berangka mudah dikendalikan.
Dalam contoh mudah pertama, kami akan memperkenalkan cara menala saiz kelompok dan bilangan zaman yang digunakan semasa memasang rangkaian.
Kami hanya akan menilai saiz kelompok dari 10 hingga 100, penyenaraian kod adalah seperti berikut:
import random import numpy as np import torch import torch.nn as nn import torch.optim as optim from skorch import NeuralNetClassifier from sklearn.model_selection import GridSearchCV # load the dataset, split into input (X) and output (y) variables dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',') X = dataset[:,0:8] y = dataset[:,8] X = torch.tensor(X, dtype=torch.float32) y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1) # PyTorch classifier class PimaClassifier(nn.Module): def __init__(self): super().__init__() self.layer = nn.Linear(8, 12) self.act = nn.ReLU() self.output = nn.Linear(12, 1) self.prob = nn.Sigmoid() def forward(self, x): x = self.act(self.layer(x)) x = self.prob(self.output(x)) return x # create model with skorch model = NeuralNetClassifier( PimaClassifier, criterinotallow=nn.BCELoss, optimizer=optim.Adam, verbose=False ) # define the grid search parameters param_grid = { 'batch_size': [10, 20, 40, 60, 80, 100], 'max_epochs': [10, 50, 100] } grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3) grid_result = grid.fit(X, y) # summarize results print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_)) means = grid_result.cv_results_['mean_test_score'] stds = grid_result.cv_results_['std_test_score'] params = grid_result.cv_results_['params'] for mean, stdev, param in zip(means, stds, params): print("%f (%f) with: %r" % (mean, stdev, param))
Hasilnya adalah seperti berikut:
Best: 0.714844 using {'batch_size': 10, 'max_epochs': 100} 0.665365 (0.020505) with: {'batch_size': 10, 'max_epochs': 10} 0.588542 (0.168055) with: {'batch_size': 10, 'max_epochs': 50} 0.714844 (0.032369) with: {'batch_size': 10, 'max_epochs': 100} 0.671875 (0.022326) with: {'batch_size': 20, 'max_epochs': 10} 0.696615 (0.008027) with: {'batch_size': 20, 'max_epochs': 50} 0.714844 (0.019918) with: {'batch_size': 20, 'max_epochs': 100} 0.666667 (0.009744) with: {'batch_size': 40, 'max_epochs': 10} 0.687500 (0.033603) with: {'batch_size': 40, 'max_epochs': 50} 0.707031 (0.024910) with: {'batch_size': 40, 'max_epochs': 100} 0.667969 (0.014616) with: {'batch_size': 60, 'max_epochs': 10} 0.694010 (0.036966) with: {'batch_size': 60, 'max_epochs': 50} 0.694010 (0.042473) with: {'batch_size': 60, 'max_epochs': 100} 0.670573 (0.023939) with: {'batch_size': 80, 'max_epochs': 10} 0.674479 (0.020752) with: {'batch_size': 80, 'max_epochs': 50} 0.703125 (0.026107) with: {'batch_size': 80, 'max_epochs': 100} 0.680990 (0.014382) with: {'batch_size': 100, 'max_epochs': 10} 0.670573 (0.013279) with: {'batch_size': 100, 'max_epochs': 50} 0.687500 (0.017758) with: {'batch_size': 100, 'max_epochs': 100}
Anda boleh lihat' batch_size': 10, 'max_epochs': 100 mencapai hasil terbaik kira-kira 71% ketepatan.
Mari kita lihat cara melaraskan pengoptimum Kami tahu bahawa terdapat banyak pengoptimum untuk dipilih, seperti SDG, Adam, dll., jadi macam mana nak pilih?
Kod lengkap adalah seperti berikut:
import numpy as np import torch import torch.nn as nn import torch.optim as optim from skorch import NeuralNetClassifier from sklearn.model_selection import GridSearchCV # load the dataset, split into input (X) and output (y) variables dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',') X = dataset[:,0:8] y = dataset[:,8] X = torch.tensor(X, dtype=torch.float32) y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1) # PyTorch classifier class PimaClassifier(nn.Module): def __init__(self): super().__init__() self.layer = nn.Linear(8, 12) self.act = nn.ReLU() self.output = nn.Linear(12, 1) self.prob = nn.Sigmoid() def forward(self, x): x = self.act(self.layer(x)) x = self.prob(self.output(x)) return x # create model with skorch model = NeuralNetClassifier( PimaClassifier, criterinotallow=nn.BCELoss, max_epochs=100, batch_size=10, verbose=False ) # define the grid search parameters param_grid = { 'optimizer': [optim.SGD, optim.RMSprop, optim.Adagrad, optim.Adadelta, optim.Adam, optim.Adamax, optim.NAdam], } grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3) grid_result = grid.fit(X, y) # summarize results print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_)) means = grid_result.cv_results_['mean_test_score'] stds = grid_result.cv_results_['std_test_score'] params = grid_result.cv_results_['params'] for mean, stdev, param in zip(means, stds, params): print("%f (%f) with: %r" % (mean, stdev, param))
Output adalah seperti berikut:
Best: 0.721354 using {'optimizer': <class 'torch.optim.adamax.Adamax'>} 0.674479 (0.036828) with: {'optimizer': <class 'torch.optim.sgd.SGD'>} 0.700521 (0.043303) with: {'optimizer': <class 'torch.optim.rmsprop.RMSprop'>} 0.682292 (0.027126) with: {'optimizer': <class 'torch.optim.adagrad.Adagrad'>} 0.572917 (0.051560) with: {'optimizer': <class 'torch.optim.adadelta.Adadelta'>} 0.714844 (0.030758) with: {'optimizer': <class 'torch.optim.adam.Adam'>} 0.721354 (0.019225) with: {'optimizer': <class 'torch.optim.adamax.Adamax'>} 0.709635 (0.024360) with: {'optimizer': <class 'torch.optim.nadam.NAdam'>}
Ia boleh dilihat bahawa algoritma pengoptimuman Adamax adalah yang terbaik untuk model dan set data kami, Ketepatan adalah kira-kira 72%.
Walaupun pelan kadar pembelajaran dalam pytorch membolehkan kami melaraskan kadar pembelajaran secara dinamik mengikut pusingan, sebagai contoh, kami menggunakan kadar pembelajaran dan parameter kadar pembelajaran sebagai grid Cari parameter untuk ditunjukkan. Dalam PyTorch, menetapkan kadar dan momentum pembelajaran adalah seperti berikut:
optimizer = optim.SGD(lr=0.001, momentum=0.9)
Dalam pakej skorch, gunakan pengoptimum awalan__ untuk menghalakan parameter ke pengoptimum.
import numpy as np import torch import torch.nn as nn import torch.optim as optim from skorch import NeuralNetClassifier from sklearn.model_selection import GridSearchCV # load the dataset, split into input (X) and output (y) variables dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',') X = dataset[:,0:8] y = dataset[:,8] X = torch.tensor(X, dtype=torch.float32) y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1) # PyTorch classifier class PimaClassifier(nn.Module): def __init__(self): super().__init__() self.layer = nn.Linear(8, 12) self.act = nn.ReLU() self.output = nn.Linear(12, 1) self.prob = nn.Sigmoid() def forward(self, x): x = self.act(self.layer(x)) x = self.prob(self.output(x)) return x # create model with skorch model = NeuralNetClassifier( PimaClassifier, criterinotallow=nn.BCELoss, optimizer=optim.SGD, max_epochs=100, batch_size=10, verbose=False ) # define the grid search parameters param_grid = { 'optimizer__lr': [0.001, 0.01, 0.1, 0.2, 0.3], 'optimizer__momentum': [0.0, 0.2, 0.4, 0.6, 0.8, 0.9], } grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3) grid_result = grid.fit(X, y) # summarize results print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_)) means = grid_result.cv_results_['mean_test_score'] stds = grid_result.cv_results_['std_test_score'] params = grid_result.cv_results_['params'] for mean, stdev, param in zip(means, stds, params): print("%f (%f) with: %r" % (mean, stdev, param))
Keputusan adalah seperti berikut:
Best: 0.682292 using {'optimizer__lr': 0.001, 'optimizer__momentum': 0.9} 0.648438 (0.016877) with: {'optimizer__lr': 0.001, 'optimizer__momentum': 0.0} 0.671875 (0.017758) with: {'optimizer__lr': 0.001, 'optimizer__momentum': 0.2} 0.674479 (0.022402) with: {'optimizer__lr': 0.001, 'optimizer__momentum': 0.4} 0.677083 (0.011201) with: {'optimizer__lr': 0.001, 'optimizer__momentum': 0.6} 0.679688 (0.027621) with: {'optimizer__lr': 0.001, 'optimizer__momentum': 0.8} 0.682292 (0.026557) with: {'optimizer__lr': 0.001, 'optimizer__momentum': 0.9} 0.671875 (0.019918) with: {'optimizer__lr': 0.01, 'optimizer__momentum': 0.0} 0.648438 (0.024910) with: {'optimizer__lr': 0.01, 'optimizer__momentum': 0.2} 0.546875 (0.143454) with: {'optimizer__lr': 0.01, 'optimizer__momentum': 0.4} 0.567708 (0.153668) with: {'optimizer__lr': 0.01, 'optimizer__momentum': 0.6} 0.552083 (0.141790) with: {'optimizer__lr': 0.01, 'optimizer__momentum': 0.8} 0.451823 (0.144561) with: {'optimizer__lr': 0.01, 'optimizer__momentum': 0.9} 0.348958 (0.001841) with: {'optimizer__lr': 0.1, 'optimizer__momentum': 0.0} 0.450521 (0.142719) with: {'optimizer__lr': 0.1, 'optimizer__momentum': 0.2} 0.450521 (0.142719) with: {'optimizer__lr': 0.1, 'optimizer__momentum': 0.4} 0.450521 (0.142719) with: {'optimizer__lr': 0.1, 'optimizer__momentum': 0.6} 0.348958 (0.001841) with: {'optimizer__lr': 0.1, 'optimizer__momentum': 0.8} 0.348958 (0.001841) with: {'optimizer__lr': 0.1, 'optimizer__momentum': 0.9} 0.444010 (0.136265) with: {'optimizer__lr': 0.2, 'optimizer__momentum': 0.0} 0.450521 (0.142719) with: {'optimizer__lr': 0.2, 'optimizer__momentum': 0.2} 0.348958 (0.001841) with: {'optimizer__lr': 0.2, 'optimizer__momentum': 0.4} 0.552083 (0.141790) with: {'optimizer__lr': 0.2, 'optimizer__momentum': 0.6} 0.549479 (0.142719) with: {'optimizer__lr': 0.2, 'optimizer__momentum': 0.8} 0.651042 (0.001841) with: {'optimizer__lr': 0.2, 'optimizer__momentum': 0.9} 0.552083 (0.141790) with: {'optimizer__lr': 0.3, 'optimizer__momentum': 0.0} 0.348958 (0.001841) with: {'optimizer__lr': 0.3, 'optimizer__momentum': 0.2} 0.450521 (0.142719) with: {'optimizer__lr': 0.3, 'optimizer__momentum': 0.4} 0.552083 (0.141790) with: {'optimizer__lr': 0.3, 'optimizer__momentum': 0.6} 0.450521 (0.142719) with: {'optimizer__lr': 0.3, 'optimizer__momentum': 0.8} 0.450521 (0.142719) with: {'optimizer__lr': 0.3, 'optimizer__momentum': 0.9}
Untuk SGD, keputusan terbaik diperoleh menggunakan kadar pembelajaran 0.001 dan momentum 0.9, dengan ketepatan kira-kira 68 %.
Fungsi pengaktifan mengawal ketaklinieran neuron tunggal. Kami akan menunjukkan penilaian beberapa fungsi pengaktifan yang tersedia dalam PyTorch.
import numpy as np import torch import torch.nn as nn import torch.nn.init as init import torch.optim as optim from skorch import NeuralNetClassifier from sklearn.model_selection import GridSearchCV # load the dataset, split into input (X) and output (y) variables dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',') X = dataset[:,0:8] y = dataset[:,8] X = torch.tensor(X, dtype=torch.float32) y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1) # PyTorch classifier class PimaClassifier(nn.Module): def __init__(self, activatinotallow=nn.ReLU): super().__init__() self.layer = nn.Linear(8, 12) self.act = activation() self.output = nn.Linear(12, 1) self.prob = nn.Sigmoid() # manually init weights init.kaiming_uniform_(self.layer.weight) init.kaiming_uniform_(self.output.weight) def forward(self, x): x = self.act(self.layer(x)) x = self.prob(self.output(x)) return x # create model with skorch model = NeuralNetClassifier( PimaClassifier, criterinotallow=nn.BCELoss, optimizer=optim.Adamax, max_epochs=100, batch_size=10, verbose=False ) # define the grid search parameters param_grid = { 'module__activation': [nn.Identity, nn.ReLU, nn.ELU, nn.ReLU6, nn.GELU, nn.Softplus, nn.Softsign, nn.Tanh, nn.Sigmoid, nn.Hardsigmoid] } grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3) grid_result = grid.fit(X, y) # summarize results print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_)) means = grid_result.cv_results_['mean_test_score'] stds = grid_result.cv_results_['std_test_score'] params = grid_result.cv_results_['params'] for mean, stdev, param in zip(means, stds, params): print("%f (%f) with: %r" % (mean, stdev, param))
Keputusannya adalah seperti berikut:
Best: 0.699219 using {'module__activation': <class 'torch.nn.modules.activation.ReLU'>} 0.687500 (0.025315) with: {'module__activation': <class 'torch.nn.modules.linear.Identity'>} 0.699219 (0.011049) with: {'module__activation': <class 'torch.nn.modules.activation.ReLU'>} 0.674479 (0.035849) with: {'module__activation': <class 'torch.nn.modules.activation.ELU'>} 0.621094 (0.063549) with: {'module__activation': <class 'torch.nn.modules.activation.ReLU6'>} 0.674479 (0.017566) with: {'module__activation': <class 'torch.nn.modules.activation.GELU'>} 0.558594 (0.149189) with: {'module__activation': <class 'torch.nn.modules.activation.Softplus'>} 0.675781 (0.014616) with: {'module__activation': <class 'torch.nn.modules.activation.Softsign'>} 0.619792 (0.018688) with: {'module__activation': <class 'torch.nn.modules.activation.Tanh'>} 0.643229 (0.019225) with: {'module__activation': <class 'torch.nn.modules.activation.Sigmoid'>} 0.636719 (0.022326) with: {'module__activation': <class 'torch.nn.modules.activation.Hardsigmoid'>}
Fungsi pengaktifan ReLU memperoleh hasil terbaik, dengan ketepatan kira-kira 70%.
在本例中,我们将尝试在0.0到0.9之间的dropout百分比(1.0没有意义)和在0到5之间的MaxNorm权重约束值。
import numpy as np import torch import torch.nn as nn import torch.nn.init as init import torch.optim as optim from skorch import NeuralNetClassifier from sklearn.model_selection import GridSearchCV # load the dataset, split into input (X) and output (y) variables dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',') X = dataset[:,0:8] y = dataset[:,8] X = torch.tensor(X, dtype=torch.float32) y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1) # PyTorch classifier class PimaClassifier(nn.Module): def __init__(self, dropout_rate=0.5, weight_cnotallow=1.0): super().__init__() self.layer = nn.Linear(8, 12) self.act = nn.ReLU() self.dropout = nn.Dropout(dropout_rate) self.output = nn.Linear(12, 1) self.prob = nn.Sigmoid() self.weight_constraint = weight_constraint # manually init weights init.kaiming_uniform_(self.layer.weight) init.kaiming_uniform_(self.output.weight) def forward(self, x): # maxnorm weight before actual forward pass with torch.no_grad(): norm = self.layer.weight.norm(2, dim=0, keepdim=True).clamp(min=self.weight_constraint / 2) desired = torch.clamp(norm, max=self.weight_constraint) self.layer.weight *= (desired / norm) # actual forward pass x = self.act(self.layer(x)) x = self.dropout(x) x = self.prob(self.output(x)) return x # create model with skorch model = NeuralNetClassifier( PimaClassifier, criterinotallow=nn.BCELoss, optimizer=optim.Adamax, max_epochs=100, batch_size=10, verbose=False ) # define the grid search parameters param_grid = { 'module__weight_constraint': [1.0, 2.0, 3.0, 4.0, 5.0], 'module__dropout_rate': [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9] } grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3) grid_result = grid.fit(X, y) # summarize results print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_)) means = grid_result.cv_results_['mean_test_score'] stds = grid_result.cv_results_['std_test_score'] params = grid_result.cv_results_['params'] for mean, stdev, param in zip(means, stds, params): print("%f (%f) with: %r" % (mean, stdev, param))
结果如下:
Best: 0.701823 using {'module__dropout_rate': 0.1, 'module__weight_constraint': 2.0} 0.669271 (0.015073) with: {'module__dropout_rate': 0.0, 'module__weight_constraint': 1.0} 0.692708 (0.035132) with: {'module__dropout_rate': 0.0, 'module__weight_constraint': 2.0} 0.589844 (0.170180) with: {'module__dropout_rate': 0.0, 'module__weight_constraint': 3.0} 0.561198 (0.151131) with: {'module__dropout_rate': 0.0, 'module__weight_constraint': 4.0} 0.688802 (0.021710) with: {'module__dropout_rate': 0.0, 'module__weight_constraint': 5.0} 0.697917 (0.009744) with: {'module__dropout_rate': 0.1, 'module__weight_constraint': 1.0} 0.701823 (0.016367) with: {'module__dropout_rate': 0.1, 'module__weight_constraint': 2.0} 0.694010 (0.010253) with: {'module__dropout_rate': 0.1, 'module__weight_constraint': 3.0} 0.686198 (0.025976) with: {'module__dropout_rate': 0.1, 'module__weight_constraint': 4.0} 0.679688 (0.026107) with: {'module__dropout_rate': 0.1, 'module__weight_constraint': 5.0} 0.701823 (0.029635) with: {'module__dropout_rate': 0.2, 'module__weight_constraint': 1.0} 0.682292 (0.014731) with: {'module__dropout_rate': 0.2, 'module__weight_constraint': 2.0} 0.701823 (0.009744) with: {'module__dropout_rate': 0.2, 'module__weight_constraint': 3.0} 0.701823 (0.026557) with: {'module__dropout_rate': 0.2, 'module__weight_constraint': 4.0} 0.687500 (0.015947) with: {'module__dropout_rate': 0.2, 'module__weight_constraint': 5.0} 0.686198 (0.006639) with: {'module__dropout_rate': 0.3, 'module__weight_constraint': 1.0} 0.656250 (0.006379) with: {'module__dropout_rate': 0.3, 'module__weight_constraint': 2.0} 0.565104 (0.155608) with: {'module__dropout_rate': 0.3, 'module__weight_constraint': 3.0} 0.700521 (0.028940) with: {'module__dropout_rate': 0.3, 'module__weight_constraint': 4.0} 0.669271 (0.012890) with: {'module__dropout_rate': 0.3, 'module__weight_constraint': 5.0} 0.661458 (0.018688) with: {'module__dropout_rate': 0.4, 'module__weight_constraint': 1.0} 0.669271 (0.017566) with: {'module__dropout_rate': 0.4, 'module__weight_constraint': 2.0} 0.652344 (0.006379) with: {'module__dropout_rate': 0.4, 'module__weight_constraint': 3.0} 0.680990 (0.037783) with: {'module__dropout_rate': 0.4, 'module__weight_constraint': 4.0} 0.692708 (0.042112) with: {'module__dropout_rate': 0.4, 'module__weight_constraint': 5.0} 0.666667 (0.006639) with: {'module__dropout_rate': 0.5, 'module__weight_constraint': 1.0} 0.652344 (0.011500) with: {'module__dropout_rate': 0.5, 'module__weight_constraint': 2.0} 0.662760 (0.007366) with: {'module__dropout_rate': 0.5, 'module__weight_constraint': 3.0} 0.558594 (0.146610) with: {'module__dropout_rate': 0.5, 'module__weight_constraint': 4.0} 0.552083 (0.141826) with: {'module__dropout_rate': 0.5, 'module__weight_constraint': 5.0} 0.548177 (0.141826) with: {'module__dropout_rate': 0.6, 'module__weight_constraint': 1.0} 0.653646 (0.013279) with: {'module__dropout_rate': 0.6, 'module__weight_constraint': 2.0} 0.661458 (0.008027) with: {'module__dropout_rate': 0.6, 'module__weight_constraint': 3.0} 0.553385 (0.142719) with: {'module__dropout_rate': 0.6, 'module__weight_constraint': 4.0} 0.669271 (0.035132) with: {'module__dropout_rate': 0.6, 'module__weight_constraint': 5.0} 0.662760 (0.015733) with: {'module__dropout_rate': 0.7, 'module__weight_constraint': 1.0} 0.636719 (0.024910) with: {'module__dropout_rate': 0.7, 'module__weight_constraint': 2.0} 0.550781 (0.146818) with: {'module__dropout_rate': 0.7, 'module__weight_constraint': 3.0} 0.537760 (0.140094) with: {'module__dropout_rate': 0.7, 'module__weight_constraint': 4.0} 0.542969 (0.138144) with: {'module__dropout_rate': 0.7, 'module__weight_constraint': 5.0} 0.565104 (0.148654) with: {'module__dropout_rate': 0.8, 'module__weight_constraint': 1.0} 0.657552 (0.008027) with: {'module__dropout_rate': 0.8, 'module__weight_constraint': 2.0} 0.428385 (0.111418) with: {'module__dropout_rate': 0.8, 'module__weight_constraint': 3.0} 0.549479 (0.142719) with: {'module__dropout_rate': 0.8, 'module__weight_constraint': 4.0} 0.648438 (0.005524) with: {'module__dropout_rate': 0.8, 'module__weight_constraint': 5.0} 0.540365 (0.136861) with: {'module__dropout_rate': 0.9, 'module__weight_constraint': 1.0} 0.605469 (0.053083) with: {'module__dropout_rate': 0.9, 'module__weight_constraint': 2.0} 0.553385 (0.139948) with: {'module__dropout_rate': 0.9, 'module__weight_constraint': 3.0} 0.549479 (0.142719) with: {'module__dropout_rate': 0.9, 'module__weight_constraint': 4.0} 0.595052 (0.075566) with: {'module__dropout_rate': 0.9, 'module__weight_constraint': 5.0}
可以看到,10%的Dropout和2.0的权重约束获得了70%的最佳精度。
单层神经元的数量是一个需要调优的重要参数。一般来说,一层神经元的数量控制着网络的表示能力,至少在拓扑的这一点上是这样。
理论上来说:由于通用逼近定理,一个足够大的单层网络可以近似任何其他神经网络。
在本例中,将尝试从1到30的值,步骤为5。一个更大的网络需要更多的训练,至少批大小和epoch的数量应该与神经元的数量一起优化。
import numpy as np import torch import torch.nn as nn import torch.nn.init as init import torch.optim as optim from skorch import NeuralNetClassifier from sklearn.model_selection import GridSearchCV # load the dataset, split into input (X) and output (y) variables dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',') X = dataset[:,0:8] y = dataset[:,8] X = torch.tensor(X, dtype=torch.float32) y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1) class PimaClassifier(nn.Module): def __init__(self, n_neurnotallow=12): super().__init__() self.layer = nn.Linear(8, n_neurons) self.act = nn.ReLU() self.dropout = nn.Dropout(0.1) self.output = nn.Linear(n_neurons, 1) self.prob = nn.Sigmoid() self.weight_constraint = 2.0 # manually init weights init.kaiming_uniform_(self.layer.weight) init.kaiming_uniform_(self.output.weight) def forward(self, x): # maxnorm weight before actual forward pass with torch.no_grad(): norm = self.layer.weight.norm(2, dim=0, keepdim=True).clamp(min=self.weight_constraint / 2) desired = torch.clamp(norm, max=self.weight_constraint) self.layer.weight *= (desired / norm) # actual forward pass x = self.act(self.layer(x)) x = self.dropout(x) x = self.prob(self.output(x)) return x # create model with skorch model = NeuralNetClassifier( PimaClassifier, criterinotallow=nn.BCELoss, optimizer=optim.Adamax, max_epochs=100, batch_size=10, verbose=False ) # define the grid search parameters param_grid = { 'module__n_neurons': [1, 5, 10, 15, 20, 25, 30] } grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3) grid_result = grid.fit(X, y) # summarize results print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_)) means = grid_result.cv_results_['mean_test_score'] stds = grid_result.cv_results_['std_test_score'] params = grid_result.cv_results_['params'] for mean, stdev, param in zip(means, stds, params): print("%f (%f) with: %r" % (mean, stdev, param))
结果如下:
Best: 0.708333 using {'module__n_neurons': 30} 0.654948 (0.003683) with: {'module__n_neurons': 1} 0.666667 (0.023073) with: {'module__n_neurons': 5} 0.694010 (0.014382) with: {'module__n_neurons': 10} 0.682292 (0.014382) with: {'module__n_neurons': 15} 0.707031 (0.028705) with: {'module__n_neurons': 20} 0.703125 (0.030758) with: {'module__n_neurons': 25} 0.708333 (0.015733) with: {'module__n_neurons': 30}
你可以看到,在隐藏层中有30个神经元的网络获得了最好的结果,准确率约为71%。
Atas ialah kandungan terperinci Bagaimana untuk melakukan carian grid hiperparameter untuk model PyTorch menggunakan scikit-learn?. Untuk maklumat lanjut, sila ikut artikel berkaitan lain di laman web China PHP!