Posts tagged with 'ml'
- 
        Why Cross‑Entropy Still Rules Loss Functions
        Negative log‑likelihood beats MSE for classification—here’s an intuitive recap 
- 
        Exploring Two-Tower Ranking Models
        Generating synthetic data and exploring the two-tower ranking model setup with margin and InfoNCE losses
- 
        Style vs Substance in Chatbot Arena
        How LMSYS corrects for "pretty formatting" bias when ranking LLMs. 
- 
        Most LLMs Still Lean on AdamW
        From BERT to today's open-source GPT clones, the decoupled weight-decay trick in AdamW remains the default. (T5 is the main outlier, preferring Adafactor during pretraining.)
- 
        Key Concepts Behind QLoRA Fine-Tuning
        Quantization + low-rank adapters let you fine-tune huge LLMs on a single GPU. 
- 
        Relationship Between L1 Norm and L1 Regularization
        Exploring how L1 and L2 norms form the basis of L1 (lasso) and L2 (ridge) regularization, with concrete examples and geometric intuition. 
- 
        Floating-Point Precision & Exploding Gradients
        Floating-point rounding errors in backpropagation can accumulate and magnify across layers, leading to exploding gradients in deep networks. 
- 
        Understanding Perplexity in Language Model Evaluation
        A concise guide to the perplexity metric, its calculation, and its significance for LLMs.
- 
        KL divergence and cross-entropy loss
        How cross-entropy loss is just KL divergence in disguise—and when to use each. 
- 
        Curse of Dimensionality
        Why high-dimensional data quickly becomes sparse, distances stop making sense, and ML algorithms struggle 
- 
        When to Use PowerTransformer vs StandardScaler
        A concise comparison of StandardScaler and PowerTransformer, their purposes, effects, and when to choose each in a Python ML pipeline. 
- 
        Dot Product's Role in Similarity, Projections, and Gradients
        Exploring how the dot product's measure of vector alignment underpins similarity metrics, projections, and gradients in ML and calculus 
- 
        Neuroflow Optimizations for the MNIST Dataset
        Updates required to optimize Neuroflow for the MNIST dataset 
- 
        Working with the MNIST Dataset in JavaScript
        Notes on working with the famous MNIST dataset in JavaScript
- 
        Supporting Multi-class Classification with Neuroflow
        Extend Neuroflow to support multi-class classification by implementing the softmax activation function and cross-entropy loss. 
- 
        Porting Micrograd To Create Neuroflow
        Creating Neuroflow by porting Andrej Karpathy's micrograd to JavaScript 
- 
        Understanding PyTorch Transforms - ToTensor() and Normalize()
        A look at common PyTorch transforms for image preprocessing. 
- 
        Working With Dot Product And Cosine Similarity For Unit Vectors
        Why the dot product equals cosine similarity when working with unit vectors
- 
        Understanding Array Dimensions in NumPy
        Clarifying what "dimension" really means in NumPy arrays and broadcasting. 
- 
        PyTorch Basics & Quick Reference
        A quick walk-through and overview of using PyTorch