Posts tagged with 'ml'
-
Why Cross-Entropy Still Rules Loss Functions
Negative log-likelihood beats MSE for classification: here's an intuitive recap.
-
Exploring Two-Tower Ranking Models
Generating synthetic data and exploring the two-tower ranking model setup with margin and InfoNCE losses.
-
Style vs Substance in Chatbot Arena
How LMSYS corrects for "pretty formatting" bias when ranking LLMs.
-
Most LLMs Still Lean on AdamW
From BERT to today's open-source GPT clones, AdamW's decoupled weight-decay trick remains the default. (T5 is the main outlier, preferring Adafactor for pre-training.)
-
Key Concepts Behind QLoRA Fine-Tuning
Quantization + low-rank adapters let you fine-tune huge LLMs on a single GPU.
-
Relationship Between L1/L2 Norms and Regularization
Exploring how L1 and L2 norms form the basis of L1 (lasso) and L2 (ridge) regularization, with concrete examples and geometric intuition.
-
Floating-Point Precision & Exploding Gradients
Floating-point rounding errors in backpropagation can accumulate and magnify across layers, compounding the exploding-gradient problem in deep networks.
-
Understanding Perplexity in Language Model Evaluation
A concise guide to the perplexity metric: how it's calculated and why it matters for evaluating LLMs.
-
KL Divergence and Cross-Entropy Loss
How cross-entropy loss is just KL divergence in disguise, and when to use each.
-
Curse of Dimensionality
Why high-dimensional data quickly becomes sparse, distances stop making sense, and ML algorithms struggle.
-
When to Use PowerTransformer vs StandardScaler
A concise comparison of StandardScaler and PowerTransformer: their purposes, their effects, and when to choose each in a scikit-learn pipeline.
-
Dot Product's Role in Similarity, Projections, and Gradients
Exploring how the dot product measures vector alignment, underpinning similarity metrics, projections, and gradients in ML and calculus.
-
Neuroflow Optimizations for the MNIST Dataset
The changes required to optimize Neuroflow for the MNIST dataset.
-
Working with the MNIST Dataset in JavaScript
Notes on working with the famous MNIST dataset in JavaScript.
-
Supporting Multi-class Classification with Neuroflow
Extending Neuroflow to support multi-class classification by implementing the softmax activation function and cross-entropy loss.
-
Porting Micrograd To Create Neuroflow
Creating Neuroflow by porting Andrej Karpathy's micrograd to JavaScript.
-
Understanding PyTorch Transforms - ToTensor() and Normalize()
A look at ToTensor() and Normalize(), two common PyTorch transforms for image preprocessing.
-
Working with the Dot Product and Cosine Similarity for Unit Vectors
Why the dot product equals cosine similarity when the vectors have unit norm.
-
Understanding Array Dimensions in NumPy
Clarifying what "dimension" really means in NumPy arrays and broadcasting.
-
PyTorch Basics & Quick Reference
A quick walk-through and overview of using PyTorch.