Ask HN: Best Embedding Models?
14 points - today at 3:15 AM
Hey HN, which embedding models are people using? There has been so much development around foundational LLMs, but I haven't seen much news about embedding models.
Comments
Yogeshshirsath today at 1:37 PM
E5 (Microsoft)
emschwartz today at 12:44 PM
I’ve been using MixedBread, which is a pretty old model at this point. Recently, I tried comparing it to some newer models and was disappointed that the results weren’t dramatically and uniformly better.
You probably can’t go wrong if you pick a recent one that scores decently well on benchmarks and is at the right price point (or memory requirement) for whatever you’re trying to do.
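One way to run that kind of comparison on your own data is recall@k over a small set of labeled queries. This is only a minimal sketch under stated assumptions: the two "embedders" below are hypothetical toy stand-ins (hashed bag-of-words and letter counts), not real models, and `recall_at_k` is an illustrative helper name. In practice you would swap in whatever function returns a vector for each candidate model.

```python
import math
import zlib
from typing import Callable, Dict, List

Embedder = Callable[[str], List[float]]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recall_at_k(embed: Embedder, docs: Dict[str, str],
                queries: Dict[str, str], k: int) -> float:
    """Fraction of queries whose labeled doc id appears in the top k results.

    `queries` maps query text to the id of its one relevant document.
    """
    doc_vecs = {doc_id: embed(text) for doc_id, text in docs.items()}
    hits = 0
    for query, relevant_id in queries.items():
        q_vec = embed(query)
        ranked = sorted(doc_vecs,
                        key=lambda d: cosine(q_vec, doc_vecs[d]),
                        reverse=True)
        if relevant_id in ranked[:k]:
            hits += 1
    return hits / len(queries)


# Hypothetical stand-ins for real models, only so the sketch runs offline.
def toy_bow_embedder(text: str, dim: int = 64) -> List[float]:
    """Hashed bag-of-words: each word increments one of `dim` buckets."""
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[zlib.crc32(word.encode("utf-8")) % dim] += 1.0
    return vec


def toy_char_embedder(text: str) -> List[float]:
    """Counts of the letters a-z, ignoring everything else."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec


if __name__ == "__main__":
    docs = {
        "d1": "how to bake sourdough bread at home",
        "d2": "tuning postgres indexes for faster queries",
        "d3": "training a puppy to walk on a leash",
    }
    queries = {
        "sourdough bread recipe": "d1",
        "postgres slow queries": "d2",
        "puppy leash training": "d3",
    }
    for name, embed in [("bow", toy_bow_embedder), ("char", toy_char_embedder)]:
        print(name, recall_at_k(embed, docs, queries, k=1))
```

If two models land within noise of each other on a set like this, price and memory footprint are reasonable tie-breakers.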
rapatel0 today at 3:57 AM
I've liked Qwen and EmbeddingGemma for local search: Qwen because its 32K context window is enough to fit basically a whole page, and EmbeddingGemma because it's crazy efficient.
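The practical upside of a large context window is that most pages can be embedded whole instead of being chunked first. A rough sketch of that decision, assuming whitespace-separated words as a crude proxy for tokens (real tokenizers count differently, so treat the limit as approximate); `split_for_context` and its parameters are hypothetical names, not part of any library:

```python
from typing import List


def split_for_context(text: str, max_tokens: int = 32_000,
                      overlap: int = 0) -> List[str]:
    """Return [text] if it fits the budget, else overlapping word chunks.

    Token counts are approximated by whitespace-separated words.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in [0, max_tokens)")

    words = text.split()
    if len(words) <= max_tokens:
        return [text]  # whole page fits: embed it as a single unit

    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break  # this chunk already reached the end of the text
    return chunks
```

With a small-context model the same page would come back as several chunks, each needing its own vector and some way to merge results at query time.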
LogicCraft678 today at 11:18 AM
Feels like embeddings are underrated compared to the LLM hype, but they're doing great.
didgeoridoo today at 10:20 AM
I’m partial to jina.ai — they have open models for code and prose, all easily runnable locally.
PhilippGille today at 6:04 AM
Benchmarks only paint part of the picture, but it's still a decent place to start looking into recent models:
frederickabrah today at 11:11 AM
who knows a tool for rug check in crypto
halvorbuilds today at 11:46 AM
gemma4
jayshah5696 today at 5:09 AM
Embeddings are easy to fine-tune. Try ModernBERT.