Title: Geometries of Word Embeddings
Speaker: Professor Pramod Viswanath, University of Illinois at Urbana-Champaign
Venue: Packard 101
Time: 4:15 - 5:15 pm
Date: Thursday, May 4
(Refreshments served after the talk)
Abstract:
Real-valued word vectors have transformed NLP applications; popular examples
are word2vec and GloVe, recognized for their ability to capture linguistic
regularities via simple geometrical operations. In this talk, we demonstrate
further striking geometrical properties of word vectors. First, we show that
a very simple, yet counter-intuitive, postprocessing technique, which makes
the vectors "more isotropic", renders off-the-shelf vectors even stronger.
Second, we show that a sentence containing a target word is well represented
by a low-rank subspace; subspaces associated with a particular sense of the
target word tend to intersect in a line (a one-dimensional subspace). We
harness this Grassmannian geometry to disambiguate, in an unsupervised way,
the multiple senses of words, in particular the most promiscuously polysemous
words of all: prepositions. A surprising finding is that rare senses,
including idiomatic, sarcastic, and metaphorical usages, are efficiently
captured. Our algorithms are all unsupervised and rely on no linguistic
resources; we validate them by presenting new state-of-the-art results on a
variety of multilingual benchmark datasets.
References:
1. Geometry of Compositionality, AAAI '17, https://arxiv.org/abs/1611.09799
2. Geometry of Polysemy, ICLR '17, https://arxiv.org/abs/1610.07569
3. Representing Sentences as Low-rank Subspaces, ACL '17, https://arxiv.org/abs/1704.05358
4. Prepositions in Context, preprint, https://arxiv.org/abs/1702.01466
Bio:
Pramod Viswanath received the Ph.D. degree in electrical engineering and
computer science from the University of California at Berkeley in 2000. From
2000 to 2001, he was a member of the research staff at Flarion Technologies,
NJ. Since 2001, he has been on the faculty at the University of Illinois at
Urbana-Champaign in Electrical and Computer Engineering, where he is
currently a professor.
Pramod has worked extensively on information theory and its applications,
specifically as applied to wireless communication. His book on wireless
communication, coauthored with David Tse, is used in more than 60
institutions around the world. His interest in natural language processing is
recent, although he has long been interested in natural languages.