
Question

alex-rudenkiy

Asked: 2020-09-17 05:47:53 +0000 UTC

model.predict always returns the same value


Why does predict keep returning the same value for different input images, even though the neural network has been trained to some extent?

model.predict(np.array([images[20]]))
Out[45]: array([[0.48986772, 0.25879842]], dtype=float32)
model.predict(np.array([images[17]]))
Out[46]: array([[0.48986772, 0.25879842]], dtype=float32)

images = load_data(data_dir)
images = np.asarray(images, dtype=np.float32)
images /= 255
answers = np.asarray(predicts, dtype=np.float32)
maxval = np.amax(answers)
answers /= maxval
images = images.reshape(images.shape[0], 80, 60, 1)

model=Sequential()

model.add(Conv2D(32, (3, 3), padding='same', activation='relu', input_shape=(80, 60,1)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(256, activation='softmax'))
model.add(Dropout(0.25))
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.25))

#model.add(Dense(2, activation='softmax'))

model.add(Dense(2))

model.summary()

batch_size = 256
epochs = 15
model.compile(optimizer=Adam(lr=0.05), loss='mean_squared_error', metrics=['accuracy'])

history = model.fit(images, answers, batch_size=batch_size, epochs=epochs, verbose=1, validation_split=0.1)

UPD:

OK, look: this is a "generic" loader that was originally tailored for loading classification data.

def load_data(data_dir):

    directories = [d for d in os.listdir(data_dir)
                   if os.path.isdir(os.path.join(data_dir, d))]

    labels = []
    images = []

    category = 0
    for d in directories:
        label_dir = os.path.join(data_dir, d)
        file_names = [os.path.join(label_dir, f)
                      for f in os.listdir(label_dir)
                      if f.endswith(".jpg")]

        for f in file_names:
            img = cv2.imread(f, cv2.IMREAD_GRAYSCALE)
            img = cv2.resize(img, (80, 60))
            #plt.figure()
            #plt.imshow(img)
            images.append(img)
            labels.append(category)

        category += 1

    return images, labels

UPD:

def load_data(data_dir):
    images = []
    file_names = [os.path.join(data_dir, f)
        for f in os.listdir(data_dir)
        if f.endswith(".jpg")]

    for f in file_names:
        img = cv2.imread(f, cv2.IMREAD_GRAYSCALE)
        img = cv2.resize(img, (80, 60))
        images.append(img)
    return images

Update:

Epoch 00024: loss did not improve from 148.46680
Epoch 00024: early stopping
392.36053

Y_test
Out[15]: 
array([[ 754.,   85.],
       [ 214.,  528.],
       [ 697.,  218.],
       [ 830.,  365.],
       [ 299.,  145.],
       [1314.,  222.],
       [ 302.,  439.],
       [1449.,  738.],
       [ 856.,  406.],
       [ 759.,  584.],
       [ 336.,  427.],
       [ 285.,  754.],
       [ 373.,  577.]], dtype=float32)

Y_pred
Out[16]: 
array([[1200.4126 ,  298.62018],
       [1210.8347 ,  338.11783],
       [1216.1664 ,  304.6189 ],
       [1329.8218 ,  368.26013],
       [1166.9604 ,  292.44904],
       [1309.2661 ,  352.29883],
       [1195.6471 ,  318.59082],
       [1449.1544 ,  401.64136],
       [1292.0201 ,  333.70294],
       [1320.844  ,  363.69574],
       [1190.2806 ,  319.49582],
       [1272.7736 ,  377.27615],
       [1275.2628 ,  351.26425]], dtype=float32)

1 Answer

Voted

MaxU - stop genocide of UA · Answer 1 · 2020-09-22T21:56:59Z

After some trial and error, this is what I came up with:

Mean error of the predicted coordinates (in pixels):

In [11]: Y_pred = model.predict(X_test)
    ...: print(np.abs(Y_pred - Y_test).mean())
146.96552

Per-coordinate error of the predictions (in pixels):

In [12]: np.abs(Y_test - Y_pred)
Out[12]:
array([[128.76062  , 143.46924  ],
       [  3.2285156, 105.75409  ],
       [172.33173  , 399.3662   ],
       [ 42.2726   , 153.21255  ],
       [377.16882  , 341.63898  ],
       [ 59.04413  , 131.2898   ],
       [114.95325  , 369.33795  ],
       [  7.8912354,  55.595795 ],
       [ 69.320145 , 139.88315  ],
       [251.82434  ,  33.904816 ],
       [171.96564  ,  87.72238  ],
       [280.917    ,  76.77453  ],
       [ 68.9115   ,  34.564117 ]], dtype=float32)

Coordinates from the test set:

In [13]: Y_test
Out[13]:
array([[1301.,  437.],
       [ 708.,  543.],
       [ 993.,  909.],
       [ 129.,  362.],
       [1445.,  768.],
       [ 530.,  486.],
       [ 451.,  832.],
       [ 316.,   99.],
       [ 130.,  384.],
       [1309.,  119.],
       [ 832.,  231.],
       [ 299.,  145.],
       [ 756.,  295.]], dtype=float32)

Predicted coordinates:

In [14]: Y_pred
Out[14]:
array([[1172.2394 ,  293.53076],
       [ 711.2285 ,  437.2459 ],
       [ 820.6683 ,  509.6338 ],
       [ 171.2726 ,  208.78745],
       [1067.8312 ,  426.36102],
       [ 589.0441 ,  354.7102 ],
       [ 565.95325,  462.66205],
       [ 323.89124,  154.5958 ],
       [ 199.32014,  244.11685],
       [1057.1757 ,  152.90482],
       [1003.96564,  143.27762],
       [ 579.917  ,   68.22547],
       [ 687.0885 ,  260.43588]], dtype=float32)
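
The error figures above can be reproduced with plain NumPy. The following is a minimal sketch that uses only the first three rows of Y_test and Y_pred as printed above; the per-axis mean and the per-sample Euclidean distance are extra summaries added for illustration and were not computed in the original answer.

```python
import numpy as np

# First three rows of Y_test and Y_pred as printed above, (x, y) in pixels.
Y_test = np.array([[1301., 437.],
                   [708., 543.],
                   [993., 909.]], dtype=np.float32)
Y_pred = np.array([[1172.2394, 293.53076],
                   [711.2285, 437.2459],
                   [820.6683, 509.6338]], dtype=np.float32)

abs_err = np.abs(Y_test - Y_pred)      # per-coordinate error, shape (3, 2)
mae_total = abs_err.mean()             # the single number the script prints
mae_per_axis = abs_err.mean(axis=0)    # illustrative extra: mean error for x and y
dist = np.linalg.norm(Y_test - Y_pred, axis=1)  # illustrative extra: miss distance per sample

print(abs_err)
print(mae_total, mae_per_axis, dist)
```

Looking at the x and y errors separately can show whether one coordinate is predicted noticeably worse than the other, which a single averaged number hides.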

Program code:

import os
import json
from datetime import datetime
import numpy as np
import cv2
from pathlib import Path
from natsort import natsorted
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.model_selection import train_test_split
from keras import Sequential
from keras.layers import *
from keras.layers.advanced_activations import LeakyReLU
from keras.optimizers import Adam, RMSprop
from keras.callbacks import EarlyStopping, ModelCheckpoint, TensorBoard
from keras.models import load_model

data_dir = Path(r'D:\temp\.data\882077-Keras_CNN\data')
model_name = str(data_dir.parent / 'model.h5')
timestamp = datetime.now().strftime("%Y-%m-%d_%H%M%S")
tensorboard_log_dir = str(data_dir.parent / f'logs/{timestamp}')


BATCH_SIZE = 32
EPOCHS = 30

INPUT_SHAPE = (80,80,1)
eye_top_left = (41, 103)                # (x, y)
eye_bottom_right = (41+313, 103+168)    # (x, y)



def load_images(data_dir, target_shape=INPUT_SHAPE):
    images = []
    p = Path(data_dir)
    for f in natsorted(p.glob('*.jpg'), key=lambda x: str(x)):
        # image coordinates: (rows, columns), i.e. (y, x)
        img = cv2.imread(str(f), cv2.IMREAD_GRAYSCALE)
        # crop everything except the eye
        img = img[eye_top_left[1]:eye_bottom_right[1],
                  eye_top_left[0]:eye_bottom_right[0]]
        # resize using 'target_shape'
        img = cv2.resize(img, target_shape[:2])
        images.append(img)
    return images

def load_data(data_dir, target_shape=INPUT_SHAPE):
    images = load_images(data_dir, target_shape=target_shape)
    # stack the list of 2D grayscale images into a 3D array of shape (y, x, N)
    images = np.dstack(images)
    # move the sample axis to the front: (N, y, x); the channel axis is added below
    images = np.moveaxis(images, 2, 0)
    images = (images / 255.).astype('float32')
    return images.reshape(images.shape + (1,))

def _conv2d(model, name_suffix=1, filters=32, kernel_size=(3,3),
            padding='same', use_bias=False, max_pool_size=None,
            #input_shape=None,
            **kwargs):
    model.add(Conv2D(filters=filters,
                     kernel_size=kernel_size,
                     padding=padding,
                     use_bias=use_bias,
                     name=f'conv_{name_suffix}',
                     #input_shape=input_shape,
                     **kwargs))
    model.add(BatchNormalization(name=f'norm_{name_suffix}'))
    model.add(LeakyReLU(alpha=0.1))
    if max_pool_size is not None:
        model.add(MaxPool2D(pool_size=max_pool_size))
    return model

def build_model(input_shape=INPUT_SHAPE):
    model = Sequential()
    model = _conv2d(model, 1, 32, (3,3), 'same', max_pool_size=(2,2),
                    input_shape=input_shape)
    model = _conv2d(model, 2, 64, (3,3), 'same', max_pool_size=(2,2))
    model = _conv2d(model, 3, 128, (3,3), 'same', max_pool_size=None)
    model = _conv2d(model, 4, 64, (3,3), 'same', max_pool_size=None)
    model = _conv2d(model, 5, 128, (3,3), 'same', max_pool_size=(2,2))
    model = _conv2d(model, 6, 256, (3,3), 'same', max_pool_size=None)
    model = _conv2d(model, 7, 128, (3,3), 'same', max_pool_size=None)
    model = _conv2d(model, 8, 256, (3,3), 'same', max_pool_size=(2,2))
    model.add(Flatten())
    model.add(Dense(1024, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(1024, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(2))
    model.compile(optimizer=Adam(lr=0.05), loss='logcosh',
                  metrics=['mse', 'logcosh', 'acc'])
    return model


#os.chdir(str(data_dir))

# Keras callbacks...
early_stop = EarlyStopping(monitor='loss', min_delta=0.001,
                           patience=5, verbose=1, mode='auto')
chkpt = ModelCheckpoint(model_name, 
                        monitor='loss', 
                        verbose=1, 
                        save_best_only=True, 
                        mode='auto')
os.makedirs(tensorboard_log_dir, exist_ok=True)
tensorboard = TensorBoard(log_dir=tensorboard_log_dir)
callbacks = [early_stop, chkpt, tensorboard]


# loading data
images = load_data(data_dir)
predicts = np.array(json.loads((data_dir / "coords.json").read_text()), dtype='float32')
#scaler = StandardScaler()
#Y = scaler.fit_transform(predicts)
Y = predicts

# split data into training and test data sets
X_train, X_test, Y_train, Y_test = train_test_split(images, Y, test_size=0.04)


# build model
model = build_model(input_shape=INPUT_SHAPE)
model.summary()  # summary() prints the table itself and returns None

# fit model
history = model.fit(X_train, Y_train,
                    epochs=EPOCHS, batch_size=BATCH_SIZE,
                    #validation_data=(X_test, Y_test)
                    callbacks=callbacks)

model = load_model(model_name)

Y_pred = model.predict(X_test)
print(np.abs(Y_pred - Y_test).mean())

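P.S. To make the task easier for the network, I trained the model on the cropped eye region (see eye_top_left and eye_bottom_right).

UPDATE1: I'll try to answer some of the questions from the comments: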

Could you explain what your
img = img[eye_top_left[1]:eye_bottom_right[1],
          eye_top_left[0]:eye_bottom_right[0]]
does? Are you cropping out the eye yourself?

Yes, only the region of the photo that contains the eye is selected here. The rest of the photo carries no relevant information.
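
The crop relies on NumPy indexing images as [rows, columns], i.e. [y, x], while the corner points are stored as (x, y). A small sketch of that slicing follows; the 480x640 frame is a hypothetical stand-in for the array that cv2.imread would return, since the real photo size is not stated in the answer.

```python
import numpy as np

eye_top_left = (41, 103)                   # (x, y), as in the answer
eye_bottom_right = (41 + 313, 103 + 168)   # (x, y), as in the answer

# Hypothetical 480x640 grayscale frame (assumed size, not from the answer).
frame = np.zeros((480, 640), dtype=np.uint8)

# Rows are selected with the y values, columns with the x values.
crop = frame[eye_top_left[1]:eye_bottom_right[1],
             eye_top_left[0]:eye_bottom_right[0]]

print(crop.shape)  # (168, 313): 168 rows (height) by 313 columns (width)
```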

Why do you sometimes leave max_pool_size unset (None)?

Many popular CNN architectures group several Conv2D layers into a block and add a MaxPool2D layer after each block; I decided to try something similar.
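
To see what this block layout does to the feature map, the spatial size can be tracked by hand. This is a sketch in plain Python, assuming stride-1 'same'-padded convolutions (which keep the size) and non-overlapping 2x2 pooling (which halves it); feature_map_size is a hypothetical helper, not part of Keras.

```python
def feature_map_size(input_size, pools):
    """Return the side length of the feature map after the given conv layers.

    pools holds one entry per conv layer: the pooling window, or None.
    """
    size = input_size
    for pool in pools:
        # A stride-1 'same'-padded convolution leaves the size unchanged,
        # so only the pooling layers need to be accounted for.
        if pool is not None:
            size //= pool
    return size

# Pool placement copied from build_model: layers 1, 2, 5 and 8 are pooled.
pools = [2, 2, None, None, 2, None, None, 2]
side = feature_map_size(80, pools)
flat_features = side * side * 256   # 256 filters in the last conv layer

print(side, flat_features)  # 5 6400
```

With an 80x80 input, four pooling steps leave a 5x5 map, so Flatten passes 6400 values to the first Dense layer.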

Why did you choose the logcosh loss instead of the usual mean_squared_error?

From the documentation:

This means that `logcosh` works mostly like the `mean_squared_error`,
but will not be so strongly affected by the occasional wildly incorrect prediction.
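
The quoted behaviour can be checked numerically. The sketch below implements log(cosh(e)) directly in NumPy rather than calling Keras: for small errors it is close to e²/2, and for large errors it grows like |e| - log(2) instead of quadratically. The sample error values are arbitrary.

```python
import numpy as np

def logcosh(err):
    # log(cosh(e)) = log((exp(e) + exp(-e)) / 2), written with logaddexp
    # so that large |e| does not overflow.
    return np.logaddexp(err, -err) - np.log(2.0)

def squared(err):
    return err ** 2

for e in [0.1, 1.0, 10.0, 400.0]:   # arbitrary sample errors
    print(f"error={e:7.1f}  logcosh={logcosh(e):10.4f}  squared={squared(e):10.1f}")
```

Because the targets in this answer are raw pixel coordinates, typical errors are in the tens or hundreds, so in that range logcosh behaves roughly like the absolute error.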

Is LeakyReLU a relu activation function?

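LeakyReLU is a variant of the ReLU activation function: for negative inputs it returns alpha * x (alpha is usually chosen from the range (0, 0.1]) instead of 0, as the ordinary ReLU does. From Wikipedia: Leaky ReLUs allow a small, positive gradient when the unit is not active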
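
The difference is easy to see on a few sample values. A minimal NumPy sketch of both functions, using alpha=0.1 as in the model above (these are standalone illustrations, not the Keras layer itself):

```python
import numpy as np

def relu(x):
    # Ordinary ReLU: negative inputs are clipped to 0.
    return np.maximum(x, 0.0)

def leaky_relu(x, alpha=0.1):
    # Leaky ReLU: negative inputs are scaled by alpha instead of being zeroed.
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 3.0])
print(relu(x))        # negative inputs become 0
print(leaky_relu(x))  # negative inputs become alpha * x
```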

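Do you know why my predictions don't come out the same as yours?

There may be several reasons:

The split into training and test sets is random, so your model was most likely trained on a different subset of the data
The initial weights of the neural network are also chosen randomly, which affects the result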
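
The first source of randomness can be removed by fixing a seed. The sketch below shows the idea with NumPy only; split_indices is a hypothetical helper and the sample count of 325 is an assumption chosen for illustration. In the script above the equivalent step would be passing a random_state value to train_test_split.

```python
import numpy as np

def split_indices(n_samples, test_size, seed):
    """Shuffle sample indices with a fixed seed and split off a test part."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = max(1, int(round(n_samples * test_size)))
    return order[n_test:], order[:n_test]

# Same seed: both runs pick exactly the same test samples.
train_a, test_a = split_indices(325, 0.04, seed=0)
train_b, test_b = split_indices(325, 0.04, seed=0)
print(np.array_equal(test_a, test_b))

# Different seed: the partition will generally differ.
train_c, test_c = split_indices(325, 0.04, seed=1)
print(np.array_equal(test_a, test_c))
```

Fixing the split does not make training itself deterministic; the random weight initialisation would need to be seeded separately in the deep learning framework.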
