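I want to write a neural network in Keras that draws the bottom half of an image given its top half. The dataset is MNIST digits, 28x28, grayscale. I know this is a hard task for a neural network and the error will be fairly high whatever I do, but I don't need the result to be perfect. The main thing is that the output should not be blurry, and blurry output is exactly what I get. Here is the training code: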
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D, Flatten, Reshape
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# Load MNIST through tf.keras so that only one Keras implementation is used
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
# Merge the train and test splits to get more samples
x_train = np.concatenate([x_train, x_test])
y_train = np.concatenate([y_train, y_test])
# Keep only images of the digit 1
x_train = x_train[y_train == 1]
# Image dimensions
img_height, img_width = 28, 28
# The network sees only the top half: 14 rows by 28 columns, 1 channel
input_img = Input(shape=(img_height // 2, img_width, 1))
# Encoder
x = Conv2D(64, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
encoded = Flatten()(x)
# Decoder
# Restore the (7, 14, 64) feature map that Flatten unrolled
x = Reshape(target_shape=(7, 14, 64))(encoded)
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
# Build the model
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer=Adam(learning_rate=0.001), loss='binary_crossentropy')
# Scale pixels to [0, 1] and add the channel axis
x_train = x_train.astype('float32') / 255.0
x_train = np.reshape(x_train, (x_train.shape[0], 28, 28, 1))
# Input: top 14 rows; target: bottom 14 rows
autoencoder.fit(x_train[:, :14], x_train[:, 14:], epochs=10, batch_size=128, shuffle=True)
autoencoder.save("model.h5")
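To look at what the model produces, the predicted bottom half can be stacked under the real top half. This is only a minimal sketch: the helper name `complete_digits` and its `predict_fn` argument are my own illustration, not part of Keras, and it assumes inputs shaped `(n, 14, 28, 1)` with values in [0, 1], as in the training code.

```python
import numpy as np


def complete_digits(top_halves, predict_fn):
    """Stack each real top half on the bottom half predicted for it.

    top_halves: array of shape (n, 14, 28, 1) with values in [0, 1].
    predict_fn: hypothetical callable that maps the top halves to predicted
                bottom halves of the same shape, e.g. autoencoder.predict.
    Returns an array of shape (n, 28, 28) that can be plotted directly.
    """
    top_halves = np.asarray(top_halves, dtype="float32")
    bottoms = np.asarray(predict_fn(top_halves), dtype="float32")
    if bottoms.shape != top_halves.shape:
        raise ValueError(
            f"expected predictions of shape {top_halves.shape}, got {bottoms.shape}"
        )
    # Axis 1 is the row axis, so this puts the prediction below the input
    full = np.concatenate([top_halves, bottoms], axis=1)
    # Drop the trailing channel axis
    return full[..., 0]
```

With the model above this would be called as `complete_digits(x_train[:8, :14], autoencoder.predict)`.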
I train the model only on ones to keep things simple, since 1 is the simplest digit. Even so, the result is not very impressive:

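With other digits the result is usually catastrophic. I would be grateful to anyone who can tell me which architecture and hyperparameters to choose, or whether something else entirely is needed. I could artificially increase the brightness of the bottom half, since it always comes out darker, but some distortion would still remain. What I have tried so far: a smaller learning rate (did not help), a more complex architecture (helped partially), training on all digits (results got worse).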
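For concreteness, the brightness workaround mentioned above could look roughly like this. It is a sketch under my own assumption that "brightness" means the peak pixel value of each image. The function name `match_brightness` is hypothetical, and the rescaling only fixes the overall intensity, not the blur or distortion.

```python
import numpy as np


def match_brightness(top, bottom, eps=1e-6):
    """Rescale each predicted bottom half so its peak matches the top half.

    top, bottom: arrays of identical shape (n, 14, 28, 1), values in [0, 1].
    eps: guard against division by zero for an all-black prediction.
    """
    top = np.asarray(top, dtype="float32")
    bottom = np.asarray(bottom, dtype="float32")
    # Reduce over every axis except the batch axis, one peak per image
    axes = tuple(range(1, top.ndim))
    top_peak = top.max(axis=axes, keepdims=True)
    bottom_peak = bottom.max(axis=axes, keepdims=True)
    scale = top_peak / np.maximum(bottom_peak, eps)
    # Clip so the result stays a valid image
    return np.clip(bottom * scale, 0.0, 1.0)
```

It would be applied to the model output before stitching the halves together, for example `match_brightness(x_train[:8, :14], autoencoder.predict(x_train[:8, :14]))`.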