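Can you tell me why predict keeps returning the same values for different input images, given that the neural network has been trained to some extent?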
model.predict(np.array([images[20]]))
Out[45]: array([[0.48986772, 0.25879842]], dtype=float32)
model.predict(np.array([images[17]]))
Out[46]: array([[0.48986772, 0.25879842]], dtype=float32)
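# Imports assumed from the names used below (standalone Keras API)
import os
import cv2
import numpy as np
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.optimizers import Adam

images = load_data(data_dir)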
images = np.asarray(images, dtype=np.float32)
images /= 255
answers = np.asarray(predicts, dtype=np.float32)
maxval = np.amax(answers)
answers /= maxval
images = images.reshape(images.shape[0], 80, 60, 1)
model=Sequential()
model.add(Conv2D(32, (3, 3), padding='same', activation='relu', input_shape=(80, 60,1)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(256, activation='softmax'))
model.add(Dropout(0.25))
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.25))
#model.add(Dense(2, activation='softmax'))
model.add(Dense(2))
model.summary()
batch_size = 256
epochs = 15
model.compile(optimizer=Adam(lr=0.05), loss='mean_squared_error', metrics=['accuracy'])
history = model.fit(images, answers, batch_size=batch_size, epochs=epochs, verbose=1, validation_split=0.1)
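The symptom described above was observed on two samples only. A quick way to check it over a whole batch of predictions is to look at how much each output varies across samples. This is a minimal sketch: `predictions_collapsed` is a hypothetical helper name and the tolerance is an assumed value, not something from the original code.

```python
import numpy as np

def predictions_collapsed(preds, tol=1e-6):
    """Return True if every row of `preds` is (almost) identical.

    `preds` is a 2-D array of shape (n_samples, n_outputs), for example the
    result of model.predict(images). `tol` is an assumed tolerance.
    """
    preds = np.asarray(preds, dtype=np.float64)
    # Spread of each output column across samples; ~0 means a constant output.
    spread = preds.max(axis=0) - preds.min(axis=0)
    return bool(np.all(spread <= tol))

# The two predictions quoted at the top of the post, stacked into one array.
quoted = np.array([[0.48986772, 0.25879842],
                   [0.48986772, 0.25879842]], dtype=np.float32)
print(predictions_collapsed(quoted))
```

If this returns True for the full training set as well, the network is ignoring its input entirely rather than merely predicting poorly.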
Update:
Well, look, this is the "universal" algorithm; it was tailored to loading data for classification:
def load_data(data_dir):
    directories = [d for d in os.listdir(data_dir)
                   if os.path.isdir(os.path.join(data_dir, d))]
    labels = []
    images = []
    category = 0
    for d in directories:
        label_dir = os.path.join(data_dir, d)
        file_names = [os.path.join(label_dir, f)
                      for f in os.listdir(label_dir)
                      if f.endswith(".jpg")]
        for f in file_names:
            img = cv2.imread(f, cv2.IMREAD_GRAYSCALE)
            img = cv2.resize(img, (80, 60))
            #plt.figure()
            #plt.imshow(img)
            images.append(img)
            labels.append(category)
        category += 1
    return images, labels
Update:
def load_data(data_dir):
    images = []
    file_names = [os.path.join(data_dir, f)
                  for f in os.listdir(data_dir)
                  if f.endswith(".jpg")]
    for f in file_names:
        img = cv2.imread(f, cv2.IMREAD_GRAYSCALE)
        img = cv2.resize(img, (80, 60))
        images.append(img)
    return images
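One detail in the loader is worth checking: cv2.resize takes its target size as (width, height), so cv2.resize(img, (80, 60)) returns an array of shape (60, 80), while the model code reshapes the data to (N, 80, 60, 1). reshape does not swap axes; it only re-cuts the same flat buffer, so each image row ends up split across the new rows. The sketch below uses plain NumPy with a synthetic array standing in for a real image (an assumption, to keep it self-contained).

```python
import numpy as np

# Stand-in for one grayscale image as returned by cv2.resize(img, (80, 60)):
# 60 rows (height) by 80 columns (width).
img = np.arange(60 * 80, dtype=np.float32).reshape(60, 80)

reshaped = img.reshape(80, 60)   # what images.reshape(N, 80, 60, 1) does per image
transposed = img.T               # an actual axis swap, for comparison

print(reshaped.shape, transposed.shape)
print(np.array_equal(reshaped, transposed))   # the two layouts differ

# Row 0 of the original image is 80 pixels long; after the reshape its first
# 60 pixels form row 0 and the remaining 20 spill into the start of row 1.
row0_head_matches = np.array_equal(reshaped[0], img[0, :60])
row0_tail_spills = np.array_equal(reshaped[1, :20], img[0, 60:])
print(row0_head_matches, row0_tail_spills)
```

Keeping the layout consistent, for example input_shape=(60, 80, 1) with a matching reshape, avoids the scrambling. Whether this contributes to the constant-output problem is not something the post establishes.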
Update:
Epoch 00024: loss did not improve from 148.46680
Epoch 00024: early stopping
392.36053
Y_test
Out[15]:
array([[ 754., 85.],
[ 214., 528.],
[ 697., 218.],
[ 830., 365.],
[ 299., 145.],
[1314., 222.],
[ 302., 439.],
[1449., 738.],
[ 856., 406.],
[ 759., 584.],
[ 336., 427.],
[ 285., 754.],
[ 373., 577.]], dtype=float32)
Y_pred
Out[16]:
array([[1200.4126 , 298.62018],
[1210.8347 , 338.11783],
[1216.1664 , 304.6189 ],
[1329.8218 , 368.26013],
[1166.9604 , 292.44904],
[1309.2661 , 352.29883],
[1195.6471 , 318.59082],
[1449.1544 , 401.64136],
[1292.0201 , 333.70294],
[1320.844 , 363.69574],
[1190.2806 , 319.49582],
[1272.7736 , 377.27615],
[1275.2628 , 351.26425]], dtype=float32)
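The error between Y_test and Y_pred above can be computed directly with NumPy. The post does not say which error measure was used, so the mean absolute error per coordinate and the Euclidean distance per sample shown here are assumptions. Only the first three rows of each array are copied in to keep the sketch short.

```python
import numpy as np

# First three rows of Y_test and Y_pred from the output above.
y_test = np.array([[754., 85.],
                   [214., 528.],
                   [697., 218.]], dtype=np.float32)
y_pred = np.array([[1200.4126, 298.62018],
                   [1210.8347, 338.11783],
                   [1216.1664, 304.6189]], dtype=np.float32)

abs_err = np.abs(y_pred - y_test)                    # per-sample, per-coordinate error
mean_err = abs_err.mean(axis=0)                      # mean error for each coordinate
dist_err = np.linalg.norm(y_pred - y_test, axis=1)   # Euclidean distance per sample

print(abs_err)
print(mean_err)
print(dist_err)
```

Note how little the first predicted coordinate changes from row to row compared with the targets; a per-coordinate view makes that easier to see than a single loss value.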
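After some trial and error, this is what I came up with:
Mean error of the predicted coordinates (in pixels):
Error of the predicted coordinates (in pixels):
Validation set of coordinates:
Predicted set of coordinates:
Code: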
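P.S. To make the task easier for the ANN, I trained the model on cropped regions containing the eyes (see eye_top_left and eye_bottom_right).
UPDATE1: I'll try to answer some of the questions from the comments: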
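Yes, only the region of the photo that contains the eyes is selected here. The rest of the photo carries no relevant information.
Many popular CNN architectures group several Conv2D layers into blocks and add a MaxPool2D layer after each block; I decided to try something similar.
From the documentation:
LeakyReLU is a variant of the ReLU activation function that, for negative inputs, returns alpha * x (alpha is usually chosen from the range (0, 0.1]) instead of 0 (as regular ReLU does). From Wikipedia: Leaky ReLUs allow a small, positive gradient when the unit is not active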
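The difference between the two activations is easy to see on a handful of values. This is a minimal NumPy sketch of the formulas described above, not the Keras layer itself; alpha=0.01 is an assumed value picked from the (0, 0.1] range.

```python
import numpy as np

def relu(x):
    # Standard ReLU: negative inputs are clamped to 0.
    return np.maximum(x, 0.0)

def leaky_relu(x, alpha=0.01):
    # Leaky ReLU: negative inputs are scaled by alpha instead of zeroed.
    # alpha=0.01 is an assumption, chosen from the (0, 0.1] range.
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))
print(leaky_relu(x))
```

For the negative inputs, relu returns 0 while leaky_relu returns -0.02 and -0.005, so a unit that is "off" still passes a small gradient back during training.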
There may be several reasons: