Brandes Vectors

contributor: Peter Leonard

This is an interactive visualization of the semantic space in all six volumes of Brandes' Main Currents in 19th Century Literature.

The browser can be accessed here.

The digital landscape is a result of several abstractions that attempt to map semantics to space:

  1. We created a "word embedding model" (Wikipedia) that represents the relationships between Brandes' words in a space of far fewer dimensions (200 in this case) than the vocabulary of the original text, which contains nearly 11,000 distinct words.
  2. Although this in itself is a massive reduction in complexity, we need to further map these 200 dimensions down to the two dimensions of a computer screen in order to display them. We accomplish this using t-SNE (Wikipedia), a dimensionality reduction algorithm that attempts to preserve local relationships as well as the global shape.
  3. Finally, we create an artificial digital landscape of the resulting semantic space using WebGL, a browser graphics API similar to those used for advanced computer games.
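
To make the first step more concrete, here is a minimal sketch of how relationships between words can be read off an embedding: words whose vectors point in similar directions are treated as related. The vectors below are invented three-dimensional toy values, not output from the Brandes model (which uses 200 dimensions), and the word "filosofi" and the helper names are hypothetical, chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(word, vectors, k=2):
    """Return the k words whose vectors are most similar to `word`."""
    target = vectors[word]
    scored = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scored[:k]]


# Invented toy vectors for illustration only (NOT from the Brandes model).
toy_vectors = {
    "hegel": [0.9, 0.1, 0.0],
    "filosofi": [0.8, 0.2, 0.1],
    "frankrig": [0.1, 0.9, 0.2],
    "jakobinerne": [0.2, 0.8, 0.3],
}

print(nearest("hegel", toy_vectors, k=1))
```

In a real model the same comparison would run over 200-dimensional vectors learned from the text; t-SNE then tries to place words that are close in this sense near one another on the two-dimensional map.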

In the resulting visualization, you can search for particular terms to locate them on the digital map. Try words such as:

  • hegel
  • frankrig
  • jakobinerne
  • reaktionær

... pressing the "Søg" [= Search] button to jump to the word's location in the digital map.
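
Conceptually, a search of this kind is a lookup from a word to its position on the map. The sketch below is an assumption about how such a lookup might work, not the visualization's actual code: the function name, the coordinates, and the choice to lowercase the query (suggested by the lowercase example terms above) are all hypothetical.

```python
def find_word(query, layout):
    """Return the (x, y) map position for a search term, or None if absent."""
    # Assumption: the vocabulary is stored in lowercase, so normalize the query.
    key = query.strip().lower()
    return layout.get(key)


# Hypothetical 2-D coordinates, standing in for the t-SNE output.
layout = {
    "hegel": (12.5, -3.0),
    "frankrig": (-40.2, 18.7),
}

print(find_word("Hegel", layout))
print(find_word("ukendt", layout))
```

A viewer could then move its camera to the returned position, or show a "not found" message when the result is None.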

Although the map appears three-dimensional, the height serves purely to create a sense of space. In the future, these mountains and valleys could represent a word's transformations over time, its rarity, or other aspects of its use.

Software:

  • WordVectors R wrapper for word2vec by Ben Schmidt (Assistant professor of history at Northeastern University and core faculty in the NuLab for Texts, Maps, and Networks): https://github.com/bmschmidt/wordVectors
  • Word-to-Viz visualization library by Doug Duhaime (Digital Humanities Developer, Yale DHLab). Code forthcoming: https://douglasduhaime.com/

Technical parameters:

  • Vectors (embedding dimensions): 200
  • Window: 30
  • Iterations: 30
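
The window parameter controls how many neighboring words on each side of a target word count as its context during training. The following is a simplified sketch of that idea, not the word2vec implementation itself: the function name and the sample tokens are hypothetical, and the original word2vec tool additionally shrinks the window at random for each target word, which is left out here.

```python
def context_pairs(tokens, window):
    """Yield (target, context) pairs for tokens within `window` positions."""
    for i, target in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own context
                yield target, tokens[j]


# Hypothetical token sequence, for illustration only.
tokens = ["a", "b", "c", "d"]

narrow = list(context_pairs(tokens, window=1))
wide = list(context_pairs(tokens, window=30))

print(len(narrow), len(wide))
```

With a window as wide as 30, each word is paired with many words from the surrounding passage, which tends to emphasize topical association over narrow syntactic similarity.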

Further reading on Word Vectors / Word Embedding: