Today, I wanted to work with the MNIST dataset in the browser, so I needed to first put it in a workable format. The process involved a few steps: grab the dataset, trim and normalize it, and export it to JSON. I used tensorflow.keras.datasets for convenience. Code to grab the dataset:
import numpy as np
import json
from tensorflow.keras.datasets import mnist
# Load the MNIST dataset
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
For my experiments, I only needed a subset of the data, so I limited it to the first 10,000 training images and 100 test images.
limit = 10000
# Limit the dataset
train_images = train_images[:limit]
train_labels = train_labels[:limit]
test_images = test_images[:int(limit * 0.01)]
test_labels = test_labels[:int(limit * 0.01)]
# Convert to vector arrays
train_images = train_images.reshape((limit, 28 * 28))
test_images = test_images.reshape((int(limit * 0.01), 28 * 28))
# Normalize pixel values
train_images = train_images.astype('float32') / 255
test_images = test_images.astype('float32') / 255
# Combine each image with its label into a record
train_data = [{'image': train_images[i].tolist(), 'label': int(train_labels[i])} for i in range(len(train_images))]
test_data = [{'image': test_images[i].tolist(), 'label': int(test_labels[i])} for i in range(len(test_images))]
# Export to JSON
with open('train_data.json', 'w') as f:
json.dump(train_data, f)
with open('test_data.json', 'w') as f:
json.dump(test_data, f)
This code snippet does the following:

- Loads the MNIST dataset from tensorflow.keras.datasets.
- Limits the train_images, train_labels, test_images, and test_labels arrays to the first 10,000 and 100 images, respectively.
- Reshapes each 28x28 image into a flat 784-element vector and normalizes the pixel values from the 0-255 range down to 0-1.
- Combines each image with its label and exports the results to JSON files (train_data.json and test_data.json).
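With the JSON files written, getting the data into the browser is a single fetch call. Here's a minimal sketch of loading and inspecting the exported records, assuming train_data.json is served from the same directory as the page (the loadData helper name is mine):

const loadData = async (url) => {
  // Fetch the exported JSON and parse it into an array of
  // { image, label } records
  const response = await fetch(url)
  return response.json()
}

loadData('train_data.json').then((trainData) => {
  console.log(trainData.length) // 10000
  console.log(trainData[0].label) // label of the first digit
})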
We can convert the training data back into images using HTML5 Canvas. The key function required is outlined below with explanations:
const drawImage = (data, canvasId) => {
  const canvas = document.getElementById(canvasId)
  const ctx = canvas.getContext('2d')
  // Create a new ImageData object with a width and height
  // of 28 pixels (the size of the MNIST images).
  const imageData = ctx.createImageData(28, 28)
  // Loop through the data.image array:
  data.image.forEach((pixel, index) => {
    // Scale the pixel value (which is originally between 0 and 1) to
    // between 0 and 255, the range for color values in the ImageData object
    const value = pixel * 255
    // Set the pixel values in the imageData object, starting with
    // the red channel of the pixel.
    imageData.data[index * 4 + 0] = value
    // Set the green channel of the pixel.
    imageData.data[index * 4 + 1] = value
    // Set the blue channel of the pixel.
    imageData.data[index * 4 + 2] = value
    // Set the alpha channel of the pixel (255 means fully opaque).
    imageData.data[index * 4 + 3] = 255
  })
  ctx.putImageData(imageData, 0, 0)
}
The function does the following:

- Creates a new ImageData object representing a 28x28 pixel area (the size of MNIST images).
- Loops through the data.image array (the image vector), scaling each pixel value from the normalized range (0-1) back to the standard color range (0-255).
- Writes that value into the red, green, and blue channels of each pixel, and sets the alpha channel to fully opaque, in the imageData object.
- Puts the imageData onto the canvas at position (0, 0).

Live demo of the images: https://jsbin.com/mowoham/edit?js,output
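To tie it together, here's a usage sketch. It assumes the page has a <canvas id="digit" width="28" height="28"></canvas> element and that train_data.json is served alongside it:

fetch('train_data.json')
  .then((response) => response.json())
  .then((trainData) => {
    // Draw the first training digit and log its label
    drawImage(trainData[0], 'digit')
    console.log('label:', trainData[0].label)
  })

At 28x28 the rendering is tiny; scaling the canvas up with CSS (e.g. width: 112px with image-rendering: pixelated) keeps the digits legible without blurring.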