
Porting Micrograd To Create Neuroflow

Recently, I've been experimenting with various deep learning frameworks, but I felt the itch to dive deeper into the underlying math. I stumbled upon Andrej Karpathy's micrograd library a few years ago and found it incredibly helpful. While working with PyTorch, I often wished I could work directly with the simple building blocks Andrej built up from, to better understand the internals. But building arbitrary parts of PyTorch from scratch is difficult when it already has over 200K lines of code.

So I decided it'd be fun to port micrograd to JavaScript and then extend it to more complex architectures like transformers. JavaScript is a nice choice because it makes it easy to create visualizations that run directly in the browser, which helps me see what's happening inside these networks. My priority is readability and understanding over performance. The idea is to RTFM (Read The F***ing Manual) in a sense: go straight to the source and reimplement the key ideas behind these networks, grasping all of their underlying mathematical glory, even if it's just with toy examples.

Porting to JS, Creating Neuroflow

One of the biggest API differences between Python and JavaScript is that JavaScript doesn't support operator overloading, so I relied on method chaining instead.
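
For example, where micrograd's Python API lets you write d = a * b + c via operator overloading, the same expression graph in the port reads as a method chain (the Value class is walked through below):

const a = new Value(2)
const b = new Value(-3)
const c = new Value(10)

// Python (micrograd): d = a * b + c
const d = a.mul(b).add(c) // d.data === 4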

Another challenge I encountered is that JavaScript's Math.random() doesn't accept a seed. To overcome this, I added a pseudo-random number generator (PRNG) based on the splitmix32 algorithm, which ensures results are reproducible when required.
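
Here's a minimal sketch of a seedable generator in that style (this is the widely used 32-bit splitmix32 construction; the exact code in neuroflow may differ slightly):

// splitmix32: a tiny seedable PRNG returning floats in [0, 1)
function splitmix32(seed) {
  let a = seed >>> 0
  return function () {
    a = (a + 0x9e3779b9) | 0
    let t = a ^ (a >>> 16)
    t = Math.imul(t, 0x21f0aaad)
    t ^= t >>> 15
    t = Math.imul(t, 0x735a2d97)
    return ((t ^ (t >>> 15)) >>> 0) / 4294967296
  }
}

const rand = splitmix32(42)
rand() // same sequence on every run with the same seed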

To visualize the architecture, I used the HTML5 canvas element, reaching for the p5.js library where it made things easier. This combination made it possible to create interactive, informative visualizations that showcase the inner workings of the models.

Porting the Autograd Engine

The key part of micrograd is the autograd engine, which implements backpropagation over a directed acyclic graph (DAG). Here's a brief walkthrough of the ported code:

Autograd Engine

The first step was to port engine.py. The Value class represents a scalar value and its gradient, and supports operations like addition, multiplication, and activation functions. Here's a simplified version of the code:

class Value {
  // Stores a single scalar and the gradient of the output with respect to it
  constructor(data, _children = [], _op = '') {
    this.data = data
    this.grad = 0
    // Set by each operation to propagate gradients to this node's inputs
    this._backward = () => {}
    this._prev = new Set(_children)
    this._op = _op // the operation that produced this node (for debugging)
  }

  // Wraps plain numbers in a Value so operations can mix numbers and Values
  static ensureValue(other) {
    return other instanceof Value ? other : new Value(other)
  }

  add(_other) {
    const other = Value.ensureValue(_other)
    const out = new Value(this.data + other.data, [this, other], '+')
    out._backward = () => {
      // d(a + b)/da = 1, so the incoming gradient passes through unchanged
      this.grad += out.grad
      other.grad += out.grad
    }
    return out
  }

  mul(_other) {
    const other = Value.ensureValue(_other)
    const out = new Value(this.data * other.data, [this, other], '*')
    out._backward = () => {
      // d(a * b)/da = b, so each input scales the gradient by the other's value
      this.grad += other.data * out.grad
      other.grad += this.data * out.grad
    }
    return out
  }

  relu() {
    const out = new Value(this.data < 0 ? 0 : this.data, [this], 'ReLU')
    out._backward = () => {
      // Gradient flows through only where the output was positive;
      // += accumulates across multiple uses of this node
      this.grad += out.data > 0 ? out.grad : 0
    }
    return out
  }

  // ... Other methods like tanh, exp, etc.

  backward() {
    // Build a topological ordering of the graph rooted at this node
    const topo = []
    const visited = new Set()
    const buildTopo = (v) => {
      if (!visited.has(v)) {
        visited.add(v)
        v._prev.forEach((child) => buildTopo(child))
        topo.push(v)
      }
    }
    buildTopo(this)
    // Apply the chain rule one node at a time, starting from the output
    this.grad = 1
    topo.reverse().forEach((v) => v._backward())
  }

  // ... Other debugging helpers
}
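
With just these pieces in place, we can build a small expression graph and backpropagate through it:

const a = new Value(3)
const b = new Value(4)
const c = a.mul(b).add(1) // c = a*b + 1 = 13
c.backward()

console.log(a.grad) // dc/da = b.data = 4
console.log(b.grad) // dc/db = a.data = 3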

Using Value Engine to Create a Neuron

Next, I implemented the Neuron class. Each neuron in a neural network computes a weighted sum of its inputs and passes this through an activation function.
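
Like micrograd's nn.py, the classes below extend a tiny Module base class that collects parameters and resets gradients between training steps. A minimal sketch of that base (method names here follow the rest of the port):

class Module {
  // Resets every parameter's gradient before the next backward pass
  zeroGrad() {
    this.parameters().forEach((p) => (p.grad = 0))
  }

  // Overridden by subclasses to expose their trainable parameters
  parameters() {
    return []
  }
}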

class Neuron extends Module {
  // Initializes a neuron with a given number of inputs and an activation function
  constructor({ numOfInputs, activation = 'relu', weights, rand = Math.random }) {
    super()
    this.weights =
      weights ||
      // Randomly initialize weights between -1 and 1
      Array.from({ length: numOfInputs }, () => new Value(rand() * 2 - 1))

    this.bias = new Value(0)
    this.activation = activation
  }

  // Performs the forward pass for the neuron
  forward(inputs) {
    // Compute the weighted sum of inputs and bias
    const activation = this.weights.reduce(
      (sum, weight, i) => sum.add(weight.mul(inputs[i])),
      this.bias,
    )

    if (this.activation === 'relu') return activation.relu()
    if (this.activation === 'tanh') return activation.tanh()
    if (this.activation === 'linear') return activation
    throw new Error(`Unsupported activation function: ${this.activation}`)
  }

  // Returns the list of parameters (weights and bias)
  parameters() {
    return [...this.weights, this.bias]
  }

  // Returns a string representation of the neuron
  toString() {
    return `${this.activation.toUpperCase()}Neuron(${this.weights.length})`
  }
}
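
A quick sanity check of a single neuron, reusing the splitmix32 helper from earlier for reproducible weights:

const rand = splitmix32(1337)
const n = new Neuron({ numOfInputs: 2, activation: 'relu', rand })
console.log(n.toString()) // RELUNeuron(2)

const out = n.forward([0.5, -0.5]) // a Value
console.log(out.data) // weighted sum of inputs plus bias, passed through ReLU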

Using Neuron to Create a Layer

A Layer is simply a collection of neurons.

class Layer extends Module {
  // Initializes a layer with a given number of inputs and outputs, and an activation function
  constructor({ numOfInputs, numOfNeurons, activation = 'relu', neurons }) {
    super()

    // Create an array of neurons
    this.neurons =
      neurons ||
      Array.from(
        { length: numOfNeurons },
        () => new Neuron({ numOfInputs, activation }),
      )
  }

  // Performs the forward pass for the layer
  forward(inputs) {
    // Forward pass through each neuron
    const outputs = this.neurons.map((neuron) => neuron.forward(inputs))
    // Return a single output if there is only one neuron, otherwise return an array of outputs
    return outputs.length === 1 ? outputs[0] : outputs
  }

  // Returns the list of parameters for all neurons in the layer
  parameters() {
    return this.neurons.flatMap((neuron) => neuron.parameters())
  }

  // Returns a string representation of the layer
  toString() {
    return `Layer of [${this.neurons.map((neuron) => neuron.toString()).join(', ')}]`
  }
}

Using Layer to Create a Multi-Layer Perceptron

Finally, I created a Sequential class, which allows us to stack layers together to form a complete model. Here I deviated from the micrograd API: defining a network is more verbose, but each step of the construction is explicit, and it's easy to make arbitrary changes to each layer.

class Sequential extends Module {
  constructor({ layers = [] } = {}) {
    super()
    this.layers = layers
  }

  // Appends a layer to the end of the model
  add(layer) {
    this.layers.push(layer)
  }

  // Feeds the inputs through each layer in order
  forward(inputs) {
    return this.layers.reduce((input, layer) => layer.forward(input), inputs)
  }

  // Returns the parameters of all layers
  parameters() {
    return this.layers.flatMap((layer) => layer.parameters())
  }

  // Snapshot of every weight's data and gradient, used by the visualizations
  weights() {
    return this.layers.map((layer) =>
      layer.neurons.map((neuron) =>
        neuron.weights.map((n) => ({ data: n.data, grad: n.grad })),
      ),
    )
  }

  toString() {
    return `Sequential of [${this.layers.map((layer) => layer.toString()).join(', ')}]`
  }
}
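
For example, a small two-layer MLP for three-dimensional inputs looks like this:

// 3 inputs -> 4 hidden ReLU neurons -> 1 linear output
const model = new Sequential({
  layers: [
    new Layer({ numOfInputs: 3, numOfNeurons: 4, activation: 'relu' }),
    new Layer({ numOfInputs: 4, numOfNeurons: 1, activation: 'linear' }),
  ],
})

const out = model.forward([2.0, 3.0, -1.0]) // a single Value
out.backward() // fills in grad on every parameter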

Demos

With this engine, I was able to create a couple of demos:

Demo 1: Graphing Calculator & Function Approximation

Plot a function f(x), then train a neural network to approximate it.
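
Under the hood, the demos run a plain gradient-descent loop over the Value graph. Here's a minimal sketch of what the function-approximation training loop looks like (the target function, architecture, and learning rate are illustrative, and zeroGrad comes from the Module sketch above):

// Illustrative: fit f(x) = x^2 on a few sample points with plain SGD
const model = new Sequential({
  layers: [
    new Layer({ numOfInputs: 1, numOfNeurons: 8, activation: 'tanh' }),
    new Layer({ numOfInputs: 8, numOfNeurons: 1, activation: 'linear' }),
  ],
})

const xs = [-2, -1, 0, 1, 2]
const ys = xs.map((x) => x * x)

for (let epoch = 0; epoch < 200; epoch++) {
  // Sum of squared errors over the training points
  const loss = xs.reduce((acc, x, i) => {
    const pred = model.forward([x])
    const diff = pred.add(-ys[i]) // pred - target
    return acc.add(diff.mul(diff))
  }, new Value(0))

  model.zeroGrad()
  loss.backward()

  // Gradient descent step
  model.parameters().forEach((p) => {
    p.data -= 0.01 * p.grad
  })
}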

Demo 2: Binary Classification using Hinge Loss

Classify (x,y) points using a neural network.
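
The hinge loss penalizes points on the wrong side of the margin. With the Value API it's a one-liner (a sketch, assuming labels y are ±1 and score is the model's raw output):

// max(0, 1 - y * score), written with the chaining API
const hingeLoss = (score, y) => score.mul(-y).add(1).relu()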