FAISS vs pgvector - why one's a library and the other's a database
Today I bumped into FAISS (Facebook AI Similarity Search) and wondered how it stacks up against my default choice, pgvector.
Turns out they live in different zip codes:
| | FAISS | pgvector |
|---|---|---|
| What it is | C++/Python library | Postgres extension |
| Lives | RAM (you serialise manually) | Durable tables |
| Query API | Python / C++ | SQL |
| Metadata filters | DIY | WHERE … |
| Scale-out | Shard yourself | Standard Postgres tooling |
| Sweet spot | Ultra-low-latency top-K look-ups inside an ML pipeline | Product features that need storage + vector search |
Toy problem - modified MNIST
“Given a query digit, return the 5 nearest neighbours that share its label.”
Source data lives at
https://datasets.r2pi.co/mnist_train.json
Each record is:
{
  "image": [0, 0, …, 0],  // length-196 float/int
  "label": 7              // 0-9
}
1. FAISS (Python)
# for Colab
# %pip install faiss-cpu tabulate
import json
import requests
import numpy as np
import faiss
from tabulate import tabulate
url = "https://datasets.r2pi.co/mnist_train.json"
items = json.loads(requests.get(url).text)
vecs = np.array([x["image"] for x in items], dtype="float32")
labels = np.array([x["label"] for x in items], dtype="int64")
vecs[123], labels[123]
# (array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
# 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
# 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
# 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 1., 1., 0.,
# 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 1., 1., 0., 0., 0., 0.,
# 0., 0., 0., 0., 0., 0., 0., 0., 1., 1., 0., 0., 0., 0., 0., 0., 0.,
# 0., 0., 0., 0., 1., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
# 1., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 1., 0., 0.,
# 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 1., 0., 0., 0., 0., 0., 0.,
# 0., 0., 0., 0., 0., 1., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
# 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 1.,
# 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32),
# np.int64(7))
# (optional) L2-normalise so inner-product ≈ cosine
faiss.normalize_L2(vecs)
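After normalisation every row has unit length, so inner product equals cosine similarity (cos θ = a·b / (|a||b|) = a·b when both norms are 1). A quick sanity check, assuming no all-zero images (true for MNIST):
# every row should now have norm ≈ 1
assert np.allclose(np.linalg.norm(vecs, axis=1), 1.0)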
# --- build / persist index ---------------------------------------------
index = faiss.IndexFlatIP(196)        # 196-dim, exact inner-product search
index.add(vecs)                       # lives only in RAM
faiss.write_index(index, "mnist.ip")  # FAISS indexes aren't picklable; use write_index
# reload later with: index = faiss.read_index("mnist.ip")
# --- query --------------------------------------------------------------
q = vecs[123:124]                # pretend row-123 is our query, label 7
dists, ids = index.search(q, 5)  # returns (distances, indices), each shape (1, 5)
# --- check top-5 ----------------------------------------------------------
table_data = []
for i in ids[0]:  # ids has shape (n_queries, k)
    table_data.append({"id": i, "label": labels[i]})
# use tabulate to pretty-print the hits
print(tabulate(table_data, headers="keys", tablefmt="grid"))
+------+---------+
| id | label |
+======+=========+
| 123 | 7 |
+------+---------+
| 3788 | 7 |
+------+---------+
| 5253 | 7 |
+------+---------+
| 1513 | 7 |
+------+---------+
| 911 | 7 |
+------+---------+
This ran blazing fast in a Colab notebook (sub-millisecond per query on CPU or GPU), but storage and filtering are on you.
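Note the task said "share its label", and FAISS has no WHERE clause: nothing guaranteed the top-5 above would all be 7s. The usual workaround is to over-fetch and post-filter in Python; a minimal sketch, continuing from the snippet above:
# DIY metadata filter: fetch extra candidates, keep only the query's label
dists, ids = index.search(q, 50)                       # over-fetch
hits = [int(i) for i in ids[0] if labels[i] == 7][:5]  # post-filter, trim to 5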
2. pgvector (SQL)
-- one-time setup
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE digits (
  id        BIGSERIAL PRIMARY KEY,
  label     INT,
  embedding VECTOR(196)
);
# load data from Python (snippet) ------------
import psycopg2, json, requests

conn = psycopg2.connect("postgresql://…")
cur = conn.cursor()

url = "https://datasets.r2pi.co/mnist_train.json"
items = json.loads(requests.get(url).text)
for it in items:
    cur.execute(
        "INSERT INTO digits (label, embedding) VALUES (%s, %s)",
        (it["label"], str(it["image"]))  # pgvector parses the '[x, y, …]' text format
    )
conn.commit()
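One INSERT per row is slow for tens of thousands of vectors; psycopg2's execute_values can batch them into far fewer round trips. Same insert as above, batched:
from psycopg2.extras import execute_values

execute_values(
    cur,
    "INSERT INTO digits (label, embedding) VALUES %s",
    [(it["label"], str(it["image"])) for it in items],
)
conn.commit()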
-- single-round-trip query (<-> is L2 distance; use <=> for cosine distance)
SELECT id, label, embedding <-> '[…query 196-vec…]' AS dist
FROM digits
WHERE label = 7
ORDER BY dist
LIMIT 5;
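At MNIST scale a sequential scan is fine, but once the table grows you'd add an approximate-nearest-neighbour index (HNSW here, which assumes pgvector ≥ 0.5):
-- trades a little recall for a lot of speed on large tables
CREATE INDEX ON digits USING hnsw (embedding vector_l2_ops);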
What you get for free: durable storage, ACID transactions, WHERE label = … in the same call, standard backups & auth.
When to reach for which
- FAISS - lightning-fast, GPU-friendly inside an ML pipeline when you're happy to own persistence + filtering.
- pgvector - your vectors are the data and need SQL joins, backups, migrations, or RBAC.
- Hybrid - store embeddings in Postgres for reliability, dump them to FAISS for offline crunching when latency is king (sketched below).
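A minimal sketch of that hybrid flow, using the same table and dataset as above (IndexIDMap is standard FAISS and keeps the Postgres primary keys attached to search results):
import json
import numpy as np
import faiss
import psycopg2

conn = psycopg2.connect("postgresql://…")
cur = conn.cursor()

# pull the embeddings back out of Postgres…
cur.execute("SELECT id, embedding FROM digits")
rows = cur.fetchall()
pg_ids = np.array([r[0] for r in rows], dtype="int64")
embs = np.array([json.loads(r[1]) for r in rows], dtype="float32")  # pgvector returns '[…]' text

# …and load them into an in-RAM FAISS index that reports Postgres ids
index = faiss.IndexIDMap(faiss.IndexFlatIP(196))
index.add_with_ids(embs, pg_ids)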