Project 3
Classification and inference with machine learning
This notebook is arranged in cells. Text is usually written in the markdown cells, where you can use HTML
tags (to make it bold, italic, colored, etc.). You can double click on this cell to see the formatting.
Ellipses (...) are provided where you are expected to write your solution, but feel free to change the template
(not too much) if this style is not to your taste.
Hit "Shift-Enter" on a code cell to evaluate it. Double click a Markdown cell to edit it.
Link Okpy
In [1]:
from client.api.notebook import Notebook
ok = Notebook('Project3_U.ok')
_ = ok.auth(inline = True)

=====================================================================
Assignment: Project 3
OK, version v1.12.5
=====================================================================
Open the following URL:
https://okpy.org/client/login/
After logging in, copy the code from the web page and paste it into the box.
Then press the "Enter" key on your keyboard.
Paste your code here:
Successfully logged in as ljma@berkeley.edu
Problem 1 - Using Keras - MNIST
The goal of this notebook is to introduce deep neural networks (DNNs) and convolutional neural networks
(CNNs) using the high-level Keras package and to become familiar with how to choose its architecture, cost
function, and optimizer in Keras. We will also learn how to train neural networks.
We will once again work with the MNIST dataset of handwritten digits introduced in HW8. The goal is to find a
statistical model which recognizes and distinguishes between the ten handwritten digits (0-9).
The MNIST dataset comprises handwritten digits, each of which comes in a square image, divided into a
28 × 28 pixel grid. Every pixel can take on 256 nuances of the gray color, interpolating between white and
black, and hence each data point assumes any value in the set {0, 1, …, 255}. Since there are 10 categories
in the problem, corresponding to the ten digits, this problem represents a generic classification task.
In this Notebook, we show how to use the Keras python package to tackle the MNIST problem with the help of
deep neural networks.
Imports
In [2]:
import numpy as np
from scipy.integrate import quad
#For plotting
import matplotlib.pyplot as plt
%matplotlib inline
import warnings
warnings.filterwarnings('ignore')

Creating DNNs with Keras
Constructing a Deep Neural Network to solve ML problems is a multiple-stage process. Quite generally, one
can identify the key steps as follows:
step 1: Load and process the data
step 2: Define the model and its architecture
step 3: Choose the optimizer and the cost function
step 4: Train the model
step 5: Evaluate the model performance on the unseen test data
step 6: Modify the hyperparameters to optimize performance for the specific data set
We would like to emphasize that, while it is always possible to view steps 1-5 as independent of the particular
task we are trying to solve, it is only when they are put together in step 6 that the real gain of using deep
learning over less sophisticated methods, such as regression models, is revealed. With this remark in mind, we
shall focus predominantly on steps 1-5 below, and then show how one can use grid search methods to
find optimal hyperparameters in step 6.
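Before going through the steps one by one, here is a condensed sketch of how steps 1-5 fit together in Keras, using the same Sequential/compile/fit/evaluate calls that are developed in detail in the sections below; treat it as a preview rather than the final model.

from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout
import keras

# step 1: load and preprocess the data
(X_train, Y_train), (X_test, Y_test) = mnist.load_data()
X_train = X_train.reshape(-1, 28*28).astype('float32') / 255
X_test = X_test.reshape(-1, 28*28).astype('float32') / 255
Y_train = keras.utils.to_categorical(Y_train, 10)
Y_test = keras.utils.to_categorical(Y_test, 10)

# step 2: define the architecture
model = Sequential()
model.add(Dense(400, input_shape=(28*28,), activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

# step 3: choose the optimizer and the cost function
model.compile(loss='categorical_crossentropy', optimizer='SGD', metrics=['accuracy'])

# step 4: train the model
model.fit(X_train, Y_train, batch_size=64, epochs=10, validation_data=(X_test, Y_test))

# step 5: evaluate on the unseen test data
score = model.evaluate(X_test, Y_test)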
Step 1: Load and Process the Data
Keras can download the MNIST data from the web automatically. All we need to do is import the mnist
module and call the load_data() function, and it will create the training and test data sets for us.
The MNIST set has pre-defined test and training sets, in order to facilitate the comparison of the performance
of different models on the data.
Once we have loaded the data, we need to format it in the correct shape (N_samples, N_features).
The size of each sample, i.e. the number of bare features used, is N_features (which is 784, because we have a
28 × 28 pixel grid), while the number of potential classification categories is num_classes (which is 10, the
number of digits).
Each pixel contains a greyscale value quantified by an integer between 0 and 255. To standardize the dataset,
we normalize the input data to the interval [0, 1].

In [3]:
1. Make a plot of one MNIST digit (2D plot using X data - make sure to reshape it into a 28 × 28 matrix) and
label it (which digit does it correspond to?).
Using TensorFlow backend.
from __future__ import print_function
import keras,sklearn
# suppress tensorflow compilation warnings
import os
import tensorflow as tf
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
seed=0
np.random.seed(seed) # fix random seed
tf.set_random_seed(seed)
from keras.datasets import mnist
# input image dimensions
num_classes = 10 # 10 digits
img_rows, img_cols = 28, 28 # number of pixels
# the data, shuffled and split between train and test sets
(X_train, Y_train), (X_test, Y_test) = mnist.load_data()
X_train = X_train[:40000]
Y_train = Y_train[:40000]
# reshape data, depending on Keras backend
X_train = X_train.reshape(X_train.shape[0], img_rows*img_cols)
X_test = X_test.reshape(X_test.shape[0], img_rows*img_cols)

# cast floats to single precision
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
# rescale data in interval [0,1]
X_train /= 255
X_test /= 255

In [4]:
Last, we cast the label vectors y to binary class matrices (a.k.a. one-hot format).
plt.imshow(X_train[0].reshape((28,28)), cmap = plt.cm.gray)
plt.title('Label = %d' %Y_train[0])
plt.show()

In [23]:
Here in this template, we use 40000 training samples and 10000 test samples. Remember that we
preprocessed the data into the shape (N_samples, N_features).
In [24]:
before conversion -
y vector : [5 0 4 1 9 2 1 3 1 4]
after conversion -
y vector : [[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
[1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]]
X_train shape: (40000, 784)
Y_train shape: (40000, 10)
40000 train samples
10000 test samples
# convert class vectors to binary class matrices
print("before conversion - ")
print("y vector : ", Y_train[0:10])
Y_train = keras.utils.to_categorical(Y_train, num_classes)
Y_test = keras.utils.to_categorical(Y_test, num_classes)
print("after conversion - ")
print("y vector : ", Y_train[0:10])
print('X_train shape:', X_train.shape)
print('Y_train shape:', Y_train.shape)
print()
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')

Step 2: Define the Neural Net and its Architecture
We can now move on to construct our deep neural net. We shall use Keras's Sequential() class to
instantiate a model, and will add different deep layers one by one.
Let us create an instance of Keras' Sequential() class, called model. As the name suggests, this class
allows us to build DNNs layer by layer (https://keras.io/getting-started/sequential-model-guide/).
In [4]:
We use the add() method to attach layers to our model. For the purposes of our introductory example, it
suffices to focus on Dense layers for simplicity (https://keras.io/layers/core/).
Every Dense() layer accepts as its first required argument an integer which specifies the number of neurons.
The type of activation function for the layer is defined using the activation optional argument, the input of
which is the name of the activation function in string format. Examples include relu , tanh , elu ,
sigmoid , softmax .
In order for our DNN to work properly, we have to make sure that the numbers of input and output neurons for
each layer match. Therefore, we specify the shape of the input in the first layer of the model explicitly using the
optional argument input_shape=(N_features,) . The sequential construction of the model then allows
Keras to infer the correct input/output dimensions of all hidden layers automatically. Hence, we only need to
specify the size of the softmax output layer to match the number of categories.
First, add a Dense layer with 400 output neurons and relu activation function.
In [26]:
Add another layer with 100 output neurons. Then, we will apply "dropout," a regularization scheme that has
been widely adopted in the neural networks literature: during the training procedure, neurons are randomly
"dropped out" of the neural network with some probability p, giving rise to a thinned network. It prevents
overfitting by introducing a randomization procedure that reduces spurious correlations between neurons
within the network.
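Concretely, a minimal sketch of the standard "inverted dropout" formulation (our notation; Keras' Dropout layer behaves this way for the purposes of this notebook): during training, each activation $h_i$ of the layer is replaced by

$$\tilde{h}_i = \frac{m_i}{1-p}\, h_i, \qquad m_i \sim \mathrm{Bernoulli}(1-p),$$

so that on average $\mathbb{E}[\tilde{h}_i] = h_i$, while at test time the layer simply passes its inputs through unchanged.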
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
# instantiate model
model = Sequential()
model.add(Dense(400, input_shape=(img_rows*img_cols,), activation='relu'))

In [27]:
Lastly, we need to add a softmax layer since we have a multi-class output.
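For reference, with ten output neurons $z_1, \ldots, z_{10}$ the softmax activation returns

$$\mathrm{softmax}(z)_k = \frac{e^{z_k}}{\sum_{j=1}^{10} e^{z_j}}, \qquad k = 1, \ldots, 10,$$

so the outputs are positive, sum to one, and can be read as predicted probabilities for the ten digit classes.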
In [28]:
Step 3: Choose the Optimizer and the Cost Function
Next, we choose the loss function according to which to train the DNN. For classification problems, this is the
cross entropy, and since the output data was cast in categorical form, we choose the
categorical_crossentropy defined in Keras' losses module. Depending on the problem of interest
one can pick any other suitable loss function. To optimize the weights of the net, we choose SGD. This
algorithm is already available under Keras' optimizers module (https://keras.io/optimizers/), but we could use
Adam() or any other built-in optimizer as well. The parameters for the optimizer, such as lr (learning rate) or
momentum, are passed using the corresponding optional arguments of the SGD() function.
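Written out explicitly (standard definition, in our notation), with $y_{n,c}$ the one-hot labels and $\hat{y}_{n,c}$ the softmax outputs for the $N$ training samples and 10 classes, the categorical cross entropy is

$$L = -\frac{1}{N}\sum_{n=1}^{N}\sum_{c=1}^{10} y_{n,c}\,\log \hat{y}_{n,c},$$

which is minimized when the predicted probability of the correct digit is close to one for every sample.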
While the loss function and the optimizer are essential for the training procedure, to test the performance of the
model, one may want to look at a particular metric of performance. For instance, in categorical tasks one
typically looks at the accuracy, which is defined as the percentage of correctly classified data points.
To complete the definition of our model, we use the compile() method, with optional arguments for the
optimizer , loss , and the validation metric as follows:
In [29]:
model.add(Dense(100, activation='relu'))
# apply dropout with rate 0.5
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
# compile the model
model.compile(loss=keras.losses.categorical_crossentropy, optimizer='SGD', metrics=['accuracy'])

Step 4: Train the model
We train our DNN in minibatches, shuffling the training data during training to improve the stability of the
model, and we train over a number of training epochs.
(The number of epochs is the number of complete passes through the training dataset, and the batch size is a
number of samples propagated through the network before the model is updated.)
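As a quick sanity check on these numbers: with the 40000 training samples used here and batch_size = 64, one epoch corresponds to 40000 / 64 = 625 parameter updates, so 10 epochs perform 6250 updates in total.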
Training the DNN is a one-liner using the fit() method of the Sequential class. The first two required
arguments are the training input and output data. As optional arguments, we specify the mini- batch_size ,
the number of training epochs , and the test or validation data. To monitor the training procedure for every
epoch, we set verbose=True .
Let us set batch_size = 64 and epochs = 10.

In [30]:
Step 5: Evaluate the Model Performance on the Unseen Test Data
Next, we evaluate the model and read off the loss and the accuracy on the unseen test data using the
evaluate() method.
Train on 40000 samples, validate on 10000 samples
Epoch 1/10
40000/40000 [==============================] - 4s - loss: 1.2012 - acc
: 0.6446 - val_loss: 0.5087 - val_acc: 0.8839
Epoch 2/10
40000/40000 [==============================] - 4s - loss: 0.5895 - acc
: 0.8318 - val_loss: 0.3646 - val_acc: 0.9065
Epoch 3/10
40000/40000 [==============================] - 4s - loss: 0.4755 - acc
: 0.8646 - val_loss: 0.3081 - val_acc: 0.9193
Epoch 4/10
40000/40000 [==============================] - 3s - loss: 0.4100 - acc
: 0.8814 - val_loss: 0.2755 - val_acc: 0.9243
Epoch 5/10
40000/40000 [==============================] - 4s - loss: 0.3716 - acc
: 0.8975 - val_loss: 0.2527 - val_acc: 0.9288
Epoch 6/10
40000/40000 [==============================] - 4s - loss: 0.3445 - acc
: 0.9030 - val_loss: 0.2338 - val_acc: 0.9342
Epoch 7/10
40000/40000 [==============================] - 4s - loss: 0.3185 - acc
: 0.9105 - val_loss: 0.2203 - val_acc: 0.9383
Epoch 8/10
40000/40000 [==============================] - 4s - loss: 0.2991 - acc
: 0.9171 - val_loss: 0.2060 - val_acc: 0.9413
Epoch 9/10
40000/40000 [==============================] - 3s - loss: 0.2815 - acc
: 0.9213 - val_loss: 0.1972 - val_acc: 0.9437
Epoch 10/10
40000/40000 [==============================] - 4s - loss: 0.2656 - acc
: 0.9265 - val_loss: 0.1874 - val_acc: 0.9457
# training parameters
batch_size = 64
epochs = 10
# train DNN and store training info in history
history=model.fit(X_train, Y_train, batch_size=batch_size, epochs=epochs,
verbose=1, validation_data=(X_test, Y_test))

In [32]:
9632/10000 [===========================>..] - ETA: 0s
Test loss: 0.18736902387440205
Test accuracy: 0.9457
# evaluate model
score = model.evaluate(X_test, Y_test, verbose=1)
# print performance
print('Test loss:', score[0])
print('Test accuracy:', score[1])
# look into training history
# summarize history for accuracy
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.ylabel('model accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='best')
plt.show()
# summarize history for loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.ylabel('model loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='best')
plt.show()

Step 6: Modify the Hyperparameters to Optimize Performance of the Model
Last, we show how to use the grid search option of scikit-learn
(https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html)
to optimize the hyperparameters of our model.
First, define a function for creating a DNN:
In [9]:
With epochs = 1 and batch_size = 64, do grid search over the following optimization schemes: ['SGD',
'RMSprop', 'Adagrad', 'Adadelta', 'Adam', 'Adamax', 'Nadam'].
In [34]:
def create_DNN(optimizer=keras.optimizers.Adam()):
    model = Sequential()
    model.add(Dense(400, input_shape=(img_rows*img_cols,), activation='relu'))
    model.add(Dense(100, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(num_classes, activation='softmax'))
    model.compile(loss=keras.losses.categorical_crossentropy,
                  optimizer=optimizer,
                  metrics=['accuracy'])
    return model
from sklearn.model_selection import GridSearchCV
from keras.wrappers.scikit_learn import KerasClassifier
batch_size = 64
epochs = 1
model_gridsearch = KerasClassifier(build_fn=create_DNN,
epochs=epochs, batch_size=batch_size, verbose=1)
# list of allowed optional arguments for the optimizer, see `create_DNN()`
optimizer = ['SGD', 'RMSprop', 'Adagrad', 'Adadelta', 'Adam', 'Adamax', 'Nadam']

Epoch 1/1
30000/30000 [==============================] - 3s - loss: 1.3620 - acc
: 0.5839
28928/30000 [===========================>..] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 2s - loss: 1.4106 - acc
: 0.5763
28800/30000 [===========================>..] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 2s - loss: 1.3413 - acc
: 0.6017
29056/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 2s - loss: 1.3520 - acc
: 0.5856
29632/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 3s - loss: 0.4102 - acc
: 0.8784
28736/30000 [===========================>..] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 3s - loss: 0.4258 - acc
: 0.8723
28608/30000 [===========================>..] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 3s - loss: 0.4126 - acc
: 0.8776
29312/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 3s - loss: 0.4097 - acc
: 0.8779
29824/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 3s - loss: 0.3794 - acc
: 0.8887
29824/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 4s - loss: 0.4031 - acc
: 0.8816
29184/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 3s - loss: 0.3895 - acc
: 0.8866
29824/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 4s - loss: 0.3752 - acc
: 0.8911
29632/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 4s - loss: 0.5778 - acc
: 0.8325
29184/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 4s - loss: 0.5900 - acc
: 0.8278
28864/30000 [===========================>..] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 4s - loss: 0.5922 - acc
: 0.8268
# define parameter dictionary
param_grid = dict(optimizer=optimizer)
# call scikit grid search module
# 4-fold cross-validation (consistent with the 30000/10000 splits in the output above)
grid = GridSearchCV(estimator=model_gridsearch, param_grid=param_grid, n_jobs=1, cv=4)
grid_result = grid.fit(X_train, Y_train)

Show the mean test score of all optimization schemes and determine which scheme gives the best accuracy.
29184/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 4s - loss: 0.5868 - acc
: 0.8266
29952/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 4s - loss: 0.4324 - acc
: 0.8721
28864/30000 [===========================>..] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 4s - loss: 0.4337 - acc
: 0.8714
29440/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 4s - loss: 0.4356 - acc
: 0.8717
29568/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 4s - loss: 0.4332 - acc
: 0.8727
28608/30000 [===========================>..] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 4s - loss: 0.4958 - acc
: 0.8500
29312/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 4s - loss: 0.4888 - acc
: 0.8595
28928/30000 [===========================>..] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 4s - loss: 0.4658 - acc
: 0.8621
28736/30000 [===========================>..] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 4s - loss: 0.4754 - acc
: 0.8601
30000/30000 [==============================] - 1s
Epoch 1/1
30000/30000 [==============================] - 4s - loss: 0.3584 - acc
: 0.8929
28992/30000 [===========================>..] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 4s - loss: 0.3585 - acc
: 0.8944
29248/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 4s - loss: 0.3631 - acc
: 0.8917
29824/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 4s - loss: 0.3624 - acc
: 0.8931
28800/30000 [===========================>..] - ETA: 0sEpoch 1/1
40000/40000 [==============================] - 6s - loss: 0.3168 - acc
: 0.9073

In [35]:
2. Create a DNN with one Dense layer having 200 output neurons. Do the grid search over any 5 different
activation functions from https://keras.io/activations/. Let epochs = 1, batch_size = 64,
p_dropout = 0.5, and optimizer=keras.optimizers.Adam(). Make sure to print the mean test score of each
case and determine which activation function gives the best accuracy.
Doing the grid search requires quite a bit of memory. Please restart the kernel ("Kernel"-"Restart") and re-load
the data before doing a new grid search.
In [10]:
Best: 0.951650 using {'optimizer': 'Nadam'}
0.850700 (0.013746) with: {'optimizer': 'SGD'}
0.947125 (0.001248) with: {'optimizer': 'RMSprop'}
0.946550 (0.003741) with: {'optimizer': 'Adagrad'}
0.925900 (0.002684) with: {'optimizer': 'Adadelta'}
0.947200 (0.001200) with: {'optimizer': 'Adam'}
0.934825 (0.002807) with: {'optimizer': 'Adamax'}
0.951650 (0.000865) with: {'optimizer': 'Nadam'}
# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
print("%f (%f) with: %r" % (mean, stdev, param))
model = Sequential()
def create_DNN(activation):
    model = Sequential()
    model.add(Dense(200, input_shape=(img_rows*img_cols,), activation=activation))
    model.add(Dropout(0.5))
    model.add(Dense(num_classes, activation='softmax'))
    model.compile(loss=keras.losses.categorical_crossentropy,
                  optimizer='Adam',
                  metrics=['accuracy'])
    return model
from sklearn.model_selection import GridSearchCV
from keras.wrappers.scikit_learn import KerasClassifier
batch_size = 64
epochs = 1
model_gridsearch = KerasClassifier(build_fn=create_DNN,
epochs=epochs, batch_size=batch_size, verbose=1)

Epoch 1/1
30000/30000 [==============================] - 3s - loss: 0.5012 - acc
: 0.8506
29120/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 3s - loss: 0.4967 - acc
: 0.8492
29184/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 2s - loss: 0.4963 - acc
: 0.8530
29376/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 2s - loss: 0.4939 - acc
: 0.8559
29440/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 2s - loss: 0.4717 - acc
: 0.8611
29120/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 2s - loss: 0.4722 - acc
: 0.8604
30000/30000 [==============================] - 0s
Epoch 1/1
30000/30000 [==============================] - 2s - loss: 0.4762 - acc
: 0.8587
29760/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 2s - loss: 0.4679 - acc
: 0.8637
29376/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 2s - loss: 0.4831 - acc
: 0.8567
29952/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 2s - loss: 0.4629 - acc
: 0.8612
28864/30000 [===========================>..] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 3s - loss: 0.4722 - acc
: 0.8579
28992/30000 [===========================>..] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 3s - loss: 0.4746 - acc
: 0.8580
29248/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 3s - loss: 0.7890 - acc
: 0.7656
28480/30000 [===========================>..] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 3s - loss: 0.7785 - acc
: 0.7720
28992/30000 [===========================>..] - ETA: 0sEpoch 1/1
# list of activation functions to search over, passed to `create_DNN()`
activation = ['relu', 'tanh', 'elu', 'sigmoid', 'softmax']
# define parameter dictionary
param_grid = dict(activation=activation)
# call scikit grid search module
grid = GridSearchCV(estimator=model_gridsearch, param_grid=param_grid, n_jobs=1, cv=4)
grid_result = grid.fit(X_train, Y_train)

In [11]:
30000/30000 [==============================] - 3s - loss: 0.7647 - acc
: 0.7724
29312/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 3s - loss: 0.7746 - acc
: 0.7727
29184/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 3s - loss: 1.9335 - acc
: 0.5028
29760/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 3s - loss: 1.9408 - acc
: 0.5048
28928/30000 [===========================>..] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 3s - loss: 1.9342 - acc
: 0.4760
29824/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 3s - loss: 1.9398 - acc
: 0.4810
10000/10000 [==============================] - 0s
28928/30000 [===========================>..] - ETA: 0sEpoch 1/1
40000/40000 [==============================] - 4s - loss: 0.4484 - acc
: 0.8681
Best: 0.932725 using {'activation': 'relu'}
0.932725 (0.002930) with: {'activation': 'relu'}
0.912850 (0.003827) with: {'activation': 'tanh'}
0.913725 (0.005063) with: {'activation': 'elu'}
0.895375 (0.005523) with: {'activation': 'sigmoid'}
0.839475 (0.016595) with: {'activation': 'softmax'}
# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
print("%f (%f) with: %r" % (mean, stdev, param))3. Now, do the grid search over different combination of batch sizes (10, 30, 50, 100) and number of epochs (1,
2, 5). Make sure to print the mean test score of each case and determine which activation functions gives the
best accuracy. Here, you have a freedom to create your own DNN (assume an arbitrary number of Dense layers,
optimization scheme, etc).
Doing the grid search requires quite a bit of memory. Please restart the kernel ("Kernel"-"Restart") and re-load
the data before doing a new grid search.
Hint: To do the grid search over both batch_size and epochs, you can do:
param_grid = dict(batch_size=batch_size, epochs=epochs)

In [13]:
Epoch 1/1
30000/30000 [==============================] - 14s - loss: 0.3851 - ac
c: 0.8832
29830/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 15s - loss: 0.3897 - ac
c: 0.8836
29960/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 15s - loss: 0.3945 - ac
c: 0.8820
29750/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 17s - loss: 0.3827 - ac
c: 0.8821
29980/30000 [============================>.] - ETA: 0sEpoch 1/2
30000/30000 [==============================] - 16s - loss: 0.3935 - ac
c: 0.8827
Epoch 2/2
30000/30000 [==============================] - 15s - loss: 0.2171 - ac
c: 0.9341
29930/30000 [============================>.] - ETA: 0sEpoch 1/2
model = Sequential()
def create_DNN():
    model = Sequential()
    model.add(Dense(200, input_shape=(img_rows*img_cols,), activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(num_classes, activation='softmax'))
    model.compile(loss=keras.losses.categorical_crossentropy,
                  optimizer='Adam',
                  metrics=['accuracy'])
    return model
from sklearn.model_selection import GridSearchCV
from keras.wrappers.scikit_learn import KerasClassifier
batch_size = 64
epochs = 1
model_gridsearch = KerasClassifier(build_fn=create_DNN,
epochs=epochs, batch_size=batch_size, verbose=1)
# batch sizes and numbers of epochs to search over
batch_size = [10,30,50,100]
epochs = [1,2,5]
# define parameter dictionary
param_grid = dict(batch_size=batch_size, epochs=epochs)
# call scikit grid search module
grid = GridSearchCV(estimator=model_gridsearch, param_grid=param_grid, n_jobs=1, cv=4)
grid_result = grid.fit(X_train, Y_train)

In [14]:
4. Do the grid search over the number of neurons in the Dense layer and make a plot of the mean test score as a
function of num_neurons. Again, you have the freedom to create your own DNN.
Doing the grid search requires quite a bit of memory. Please restart the kernel ("Kernel"-"Restart") and re-load
the data before doing a new grid search.
In [8]:
Best: 0.967475 using {'batch_size': 10, 'epochs': 5}
0.944350 (0.001052) with: {'batch_size': 10, 'epochs': 1}
0.956850 (0.002544) with: {'batch_size': 10, 'epochs': 2}
0.967475 (0.000476) with: {'batch_size': 10, 'epochs': 5}
0.939225 (0.003904) with: {'batch_size': 30, 'epochs': 1}
0.951700 (0.002840) with: {'batch_size': 30, 'epochs': 2}
0.967075 (0.000536) with: {'batch_size': 30, 'epochs': 5}
0.933700 (0.002487) with: {'batch_size': 50, 'epochs': 1}
0.949700 (0.001885) with: {'batch_size': 50, 'epochs': 2}
0.965475 (0.002815) with: {'batch_size': 50, 'epochs': 5}
0.927900 (0.002847) with: {'batch_size': 100, 'epochs': 1}
0.942250 (0.001410) with: {'batch_size': 100, 'epochs': 2}
0.960925 (0.002025) with: {'batch_size': 100, 'epochs': 5}
# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
print("%f (%f) with: %r" % (mean, stdev, param))
model = Sequential()
def create_DNN(number):
    model = Sequential()
    model.add(Dense(number, input_shape=(img_rows*img_cols,), activation='relu'))
    model.add(Dense(100, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(num_classes, activation='softmax'))
    model.compile(loss=keras.losses.categorical_crossentropy,
                  optimizer='Adam',
                  metrics=['accuracy'])
    return model
from sklearn.model_selection import GridSearchCV
from keras.wrappers.scikit_learn import KerasClassifier
batch_size = 64
epochs = 1

Epoch 1/1
30000/30000 [==============================] - 2s - loss: 0.5748 - acc
: 0.8241
29696/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 2s - loss: 0.5448 - acc
: 0.8359
28672/30000 [===========================>..] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 2s - loss: 0.5499 - acc
: 0.8358
28864/30000 [===========================>..] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 2s - loss: 0.5480 - acc
: 0.8343
30000/30000 [==============================] - 0s
Epoch 1/1
30000/30000 [==============================] - 3s - loss: 0.4889 - acc
: 0.8569
29376/30000 [============================>.] - ETA: 0sEpoch 1/1
30000/30000 [==============================] - 3s - loss: 0.4796 - acc
: 0.8577
28544/30000 [========================
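The output of this cell is cut off above. For completeness, here is a minimal sketch of the remaining steps of problem 4; the list of neuron counts and the cv value are illustrative assumptions (they are not visible in the original cells), but the pattern mirrors the grid searches over optimizer and activation earlier in the notebook.

# Sketch of the remaining problem-4 steps; the neuron counts below are assumed values
model_gridsearch = KerasClassifier(build_fn=create_DNN,
                                   epochs=epochs, batch_size=batch_size, verbose=1)
num_neurons = [50, 100, 200, 400]   # assumed grid of layer sizes
param_grid = dict(number=num_neurons)
grid = GridSearchCV(estimator=model_gridsearch, param_grid=param_grid, n_jobs=1, cv=4)
grid_result = grid.fit(X_train, Y_train)

# plot the mean test score as a function of num_neurons
means = grid_result.cv_results_['mean_test_score']
plt.plot(num_neurons, means, 'o-')
plt.xlabel('num_neurons')
plt.ylabel('mean test score')
plt.show()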