Similarity & Distance
Measuring similarity
How do we know whether two words are "similar" in meaning? We measure the cosine similarity between their vectors. Think of it as measuring the angle between two arrows pointing in a high-dimensional space: only the direction of each arrow matters, not its length. The score ranges from -1 to 1, shown here as -100% to 100%.
- 100% = Same direction (identical or near-identical meaning)
- 50% = Somewhat related
- 0% = Unrelated (perpendicular)
- Negative = Opposing directions (contrasting meanings)
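To make the scale above concrete, here is a minimal sketch of the calculation: the dot product of the two vectors divided by the product of their lengths. The function name `cosine_similarity` and the toy 2D vectors are illustrative assumptions, not real word embeddings. A result of 1.0 corresponds to 100% on the scale above.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b, a value in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy 2D vectors, made up for illustration (not real embeddings).
same = cosine_similarity([1.0, 2.0], [2.0, 4.0])            # same direction
perpendicular = cosine_similarity([1.0, 0.0], [0.0, 1.0])   # right angle
opposite = cosine_similarity([1.0, 2.0], [-1.0, -2.0])      # opposing direction

print(f"same: {same:.0%}, perpendicular: {perpendicular:.0%}, opposite: {opposite:.0%}")
```

Note that `[1.0, 2.0]` and `[2.0, 4.0]` score 100% even though one is twice as long as the other, because the measure ignores vector length.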
Words in 2D space
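Word vectors typically live in 50 or more dimensions, but we can project them onto 2D to visualize clusters, for example with a dimensionality-reduction technique such as PCA. Words with similar meanings tend to appear close together, although a 2D projection preserves the original distances only approximately.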
Word Map (interactive): select two words to highlight them on the map and notice how similar words cluster together. Words are grouped into four categories: Royalty/Gender, Animals, Places, and Concepts.
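One common way to produce a 2D projection like this is principal component analysis (PCA). The sketch below assumes NumPy is available and uses made-up 5-dimensional vectors. The helper name `project_to_2d` is hypothetical, and the page does not say which projection method the map actually uses.

```python
import numpy as np

def project_to_2d(vectors):
    """Project row vectors onto their first two principal components (PCA)."""
    X = np.asarray(vectors, dtype=float)
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:2].T

# Hypothetical 5-dimensional toy vectors (not real embeddings).
words = ["king", "queen", "cat", "dog"]
vectors = [
    [0.9, 0.8, 0.1, 0.0, 0.1],
    [0.8, 0.9, 0.0, 0.1, 0.1],
    [0.1, 0.0, 0.9, 0.8, 0.2],
    [0.0, 0.1, 0.8, 0.9, 0.2],
]

points = project_to_2d(vectors)
for word, (x, y) in zip(words, points):
    print(f"{word:>5}: ({x:+.2f}, {y:+.2f})")
```

In this toy data the two royalty words land near each other and far from the two animal words, which is the clustering effect the map illustrates.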
Why this matters
Semantic similarity is the foundation of many AI capabilities:
- Search: Finding documents with similar meaning, not just exact keywords
- Recommendations: "If you like X, you might like Y"
- Understanding: Knowing that "automobile" and "car" mean the same thing
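As a rough illustration of how search and synonym matching build on similarity, the sketch below ranks a small vocabulary by cosine similarity to a query word. The 3D vectors and the helper `most_similar` are assumptions made up for this example. A real system would use learned embeddings and usually an approximate nearest-neighbor index instead of scoring every entry.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms

def most_similar(query, vocab, top_k=2):
    """Return the top_k (word, score) pairs most similar to `query`, excluding it."""
    scores = [
        (word, cosine_similarity(vocab[query], vec))
        for word, vec in vocab.items()
        if word != query
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_k]

# Hypothetical 3D vectors (not real embeddings).
vocab = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.85, 0.15, 0.05],
    "truck": [0.7, 0.3, 0.1],
    "banana": [0.0, 0.2, 0.9],
}

results = most_similar("car", vocab)
for word, score in results:
    print(f"{word}: {score:.3f}")
```

With these toy vectors, "automobile" ranks first and "truck" second for the query "car", while "banana" scores close to zero.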
Key Takeaways
- Cosine similarity measures how closely two vectors point in the same direction
- Similar meanings = similar vectors = high similarity score
- Comparing vectors this way enables semantic search, recommendations, and synonym matching