Posts tagged with 'ai'
-
Style vs Substance in Chatbot Arena
How LMSYS corrects for "pretty formatting" bias when ranking LLMs.
-
Most LLMs Still Lean on AdamW
From BERT to today's open-source GPT clones, the decoupled weight-decay trick in AdamW remains the default. (T5 is the main outlier, preferring Adafactor at pre-train time.)
-
Dead Internet Theory - Bots Run the Web
TIL about the claim that bots now outnumber humans online—and what the numbers actually say.
-
Key Concepts Behind QLoRA Fine-Tuning
Quantization + low-rank adapters let you fine-tune huge LLMs on a single GPU.
-
Why logits.exp() Equals Counts
Interpreting logits as log-counts: exponentiating recovers counts, and working in the log domain turns multiplicative interactions into additive ones.
-
Understanding Perplexity in Language Model Evaluation
A concise guide to the perplexity metric: how it's calculated and why it matters for LLMs.
-
KL Divergence and Cross-Entropy Loss
How cross-entropy loss is just KL divergence in disguise—and when to use each.
-
Why sharing GPU power between AI servers and personal PCs doesn't really work
A practical look at why latency, hardware heterogeneity, and security concerns make distributed GPU computing impractical.
-
Chain of Draft to Speed Up LLM Reasoning
Chain of Draft (CoD) prompts LLMs to use short, minimal reasoning steps, achieving near-CoT accuracy with far lower token use and latency.
-
FAISS vs pgvector - why one's a library and the other's a database
FAISS is a rocket-fast in-memory index; pgvector is Postgres with vectors. Here's when to pick each, with code.