Understanding semantic similarity, word vectors, analogies, and what embedding dimensions represent.
Semantic Space — 2D Projection
Words with similar meanings cluster together. Hover over a word to see its connections. This is a pre-computed 2D projection of real embedding distances.
What is a word embedding?
An embedding maps a word (or token) to a point in a high-dimensional vector space, typically 768 to 4096 dimensions. Words with similar meanings end up close together because they appear in similar contexts during training. The 2D projection here (a t-SNE-like layout) approximately preserves local neighborhood structure, so nearby points were nearby in the original space, even though exact distances are distorted.
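As a minimal sketch of "similar words end up close together", here is a toy nearest-neighbor lookup in plain Python. The 3-dimensional vectors are hand-made for illustration, not real model output:

```python
import math

# Hypothetical 3-dimensional "embeddings" (real ones have hundreds of dims).
vocab = {
    "cat":    [0.8, 0.1, 0.6],
    "dog":    [0.7, 0.2, 0.5],
    "kitten": [0.8, 0.0, 0.7],
    "car":    [0.1, 0.9, 0.0],
    "truck":  [0.2, 0.8, 0.1],
}

def cosine(a, b):
    # Dot product divided by the product of vector magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    mag_a = math.sqrt(sum(x * x for x in a))
    mag_b = math.sqrt(sum(x * x for x in b))
    return dot / (mag_a * mag_b)

def neighbors(word):
    """Rank every other word by cosine similarity to `word`."""
    return sorted((w for w in vocab if w != word),
                  key=lambda w: cosine(vocab[word], vocab[w]),
                  reverse=True)

print(neighbors("cat"))  # animal words rank above vehicle words
```

With these toy vectors, "kitten" and "dog" come out ahead of "car" and "truck" for the query "cat", which is the clustering behavior the projection above visualizes.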
Cosine Similarity Calculator
Word A
Word B
Cosine similarity measures the cosine of the angle between two vectors: 1.0 means identical direction, 0 means perpendicular (unrelated), and -1 means opposite directions.
Unlike Euclidean distance, cosine similarity ignores vector magnitude — useful because word frequency affects vector length but not necessarily meaning.
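The two points above can be sketched in a few lines of plain Python. The vectors here are toy 4-dimensional examples, not real embeddings:

```python
import math

def cosine_similarity(a, b):
    """Dot product divided by the product of vector magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    mag_a = math.sqrt(sum(x * x for x in a))
    mag_b = math.sqrt(sum(x * x for x in b))
    return dot / (mag_a * mag_b)

# Toy 4-dimensional vectors, hand-made for illustration.
cat = [0.8, 0.1, 0.6, 0.2]
dog = [0.7, 0.2, 0.5, 0.3]
car = [0.1, 0.9, 0.0, 0.7]

print(cosine_similarity(cat, dog))  # close to 1: similar direction
print(cosine_similarity(cat, car))  # much lower: different direction

# Magnitude is ignored: scaling a vector does not change the similarity.
scaled = [x * 10 for x in cat]
print(cosine_similarity(cat, scaled))  # 1.0 up to floating-point rounding
```

The last line is the magnitude-invariance property: a frequent word may have a longer vector, but its direction, and therefore its cosine similarity to other words, is unchanged.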
Word Analogies — Vector Arithmetic
The famous "king - man + woman = queen" property of word embeddings. Pick words to compute: A - B + C = ?
Word A (e.g. king)
Word B (subtract)
Word C (add)
king − man + woman = queen
Try these classic examples: Paris − France + Italy ≈ Rome; walking − walked + swam ≈ swimming.
How vector arithmetic works:
"man" and "king" differ by a "royalty" direction in embedding space. Adding that same direction to "woman" points toward "queen". This geometry reflects real semantic relationships encoded during training on billions of text documents.
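This can be sketched directly: compute A − B + C, then return the vocabulary word whose vector points most nearly in that direction. The 3-dimensional vectors are hand-crafted so the analogy works; real embeddings encode this geometry across hundreds of dimensions:

```python
import math

# Hypothetical toy embeddings, hand-crafted for illustration.
vocab = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.2, 0.5, 0.3],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def analogy(a, b, c):
    """Answer 'a - b + c = ?' by nearest cosine match, excluding the inputs."""
    target = [va - vb + vc
              for va, vb, vc in zip(vocab[a], vocab[b], vocab[c])]
    candidates = [w for w in vocab if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vocab[w], target))

print(analogy("king", "man", "woman"))  # queen
```

Excluding the input words from the candidates matters: in real embedding spaces the raw nearest neighbor of A − B + C is often A itself.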
Embedding Dimensions
Real embeddings have hundreds or thousands of dimensions. Here we show a simplified 8-dimensional view. Pick a word to see its vector.
What do dimensions mean?
Individual dimensions don't have human-interpretable meanings — they're distributed representations. The model learns to encode abstract features (gender, animacy, tense, sentiment, domain, formality…) spread across all dimensions simultaneously.
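One way to see this: a semantic feature lives along a direction in the space, not in any single dimension. Reusing toy hand-made vectors (hypothetical, as above), we can project each word onto the "gender" direction woman − man and find that "queen" sits on the "woman" side while "king" sits on the "man" side, even though no one coordinate encodes gender by itself:

```python
import math

# Hypothetical toy embeddings, hand-crafted for illustration.
vocab = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
}

# A candidate "gender" direction: the difference woman - man.
direction = [w - m for w, m in zip(vocab["woman"], vocab["man"])]
norm = math.sqrt(sum(x * x for x in direction))
unit = [x / norm for x in direction]

def component(word):
    """Scalar projection of the word's vector onto the gender direction."""
    return sum(x * u for x, u in zip(vocab[word], unit))

for w in vocab:
    print(w, round(component(w), 3))
```

Note that the direction mixes several coordinates at once: the feature is spread across dimensions, which is what "distributed representation" means.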