Posts tagged with 'ml'
-
Why Cross-Entropy Still Rules Loss Functions
Negative log-likelihood beats MSE for classification: here's an intuitive recap.
-
Exploring Two-Tower Ranking Models
Generating synthetic data and exploring the two-tower ranking model setup with margin and InfoNCE losses.
-
Style vs Substance in Chatbot Arena
How LMSYS corrects for "pretty formatting" bias when ranking LLMs.
-
Most LLMs Still Lean on AdamW
From BERT to today's open-source GPT clones, AdamW's decoupled weight-decay trick remains the default. (T5 is the main outlier, preferring Adafactor for pre-training.)
-
Key Concepts Behind QLoRA Fine-Tuning
Quantization + low-rank adapters let you fine-tune huge LLMs on a single GPU.
-
Relationship Between L1/L2 Norms and Regularization
Exploring how L1 and L2 norms form the basis of L1 (lasso) and L2 (ridge) regularization, with concrete examples and geometric intuition.
-
Floating-Point Precision & Exploding Gradients
Floating-point rounding errors in backpropagation can accumulate and magnify across layers, compounding the exploding-gradient problem in deep networks.
-
Understanding Perplexity in Language Model Evaluation
A concise guide to the perplexity metric: how it's calculated and why it matters for evaluating LLMs.
-
KL Divergence and Cross-Entropy Loss
How cross-entropy loss is just KL divergence in disguise, and when to use each.
-
Curse of Dimensionality
Why high-dimensional data quickly becomes sparse, distances stop making sense, and ML algorithms struggle.
-
When to Use PowerTransformer vs StandardScaler
A concise comparison of StandardScaler and PowerTransformer: their purposes, their effects, and when to choose each in a scikit-learn pipeline.
-
Dot Product's Role in Similarity, Projections, and Gradients
Exploring how the dot product measures vector alignment, underpinning similarity metrics, projections, and gradients in ML and calculus.
-
Neuroflow Optimizations for the MNIST Dataset
The changes required to optimize Neuroflow for the MNIST dataset.
-
Working with the MNIST Dataset in JavaScript
Notes on working with the famous MNIST dataset in JavaScript.
-
Supporting Multi-class Classification with Neuroflow
Extending Neuroflow to support multi-class classification by implementing the softmax activation function and cross-entropy loss.
-
Porting Micrograd To Create Neuroflow
Creating Neuroflow by porting Andrej Karpathy's micrograd to JavaScript.
-
Understanding PyTorch Transforms - ToTensor() and Normalize()
A look at ToTensor() and Normalize(), two common PyTorch transforms for image preprocessing.
-
Working with the Dot Product and Cosine Similarity for Unit Vectors
Why the dot product equals cosine similarity when the vectors have unit norm.
-
Understanding Array Dimensions in NumPy
Clarifying what "dimension" really means in NumPy arrays and broadcasting.
-
PyTorch Basics & Quick Reference
A quick walk-through and overview of using PyTorch.