What is the difference between Semantic Folding and other approaches?

Algorithm
  Semantic Folding:
  • sparse binary vector representation (see the sketch below)
  • topological feature arrangement enables generalization
  Other Word Embedding Models:
  • dense floating-point vector representation
  • independence of features can lead to false positives
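To make the contrast concrete, here is a minimal sketch of the two kinds of representation. The 128 x 128 map, the bit positions and the sparsity figure are illustrative assumptions for the example, not parameters of any particular implementation.

```python
# Illustrative only: a sparse binary "fingerprint" vs. a dense embedding.
# The 128 x 128 grid, ~2% sparsity and all values are assumptions for this sketch.

GRID = 128                      # semantic map of 128 x 128 = 16,384 bit positions
N_BITS = GRID * GRID

# Semantic Folding style: a word is the set of active bit positions on the map.
# Nearby positions stand for related contexts (topological layout), so related
# words share many positions and small shifts still overlap (generalization).
fingerprint_word = {137, 138, 265, 266, 4099, 4100, 9731, 12288}   # toy example

sparsity = len(fingerprint_word) / N_BITS
print(f"{len(fingerprint_word)} active bits out of {N_BITS} ({sparsity:.3%} set)")

# Conventional embedding style: a dense floating-point vector, every dimension set.
dense_word = [0.12, -0.73, 0.05, 0.41]   # real models use hundreds of dimensions
```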
Word ambiguity
  Semantic Folding:
  • all associated contexts are captured
  • terms can be computationally disambiguated (see the sketch below)
  • composing aggregated representations implicitly disambiguates
  Other Word Embedding Models:
  • only the main sense is represented
  • other meanings interfere as noise
  • no computational disambiguation possible
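As a toy illustration of how such disambiguation can work with sparse binary fingerprints: because a word fingerprint aggregates bits from all contexts the word appeared in, intersecting it with a context fingerprint keeps only the bits of the relevant sense. All fingerprints below are invented for the example.

```python
# Toy illustration of context-based disambiguation with sparse binary fingerprints.
# All bit positions below are invented and carry no real meaning.

# "apple" aggregates bits from all contexts it was seen in:
# fruit-related contexts and company-related contexts.
apple     = {1, 2, 3, 10, 11, 12}
# Context fingerprints built from the surrounding text of a given occurrence.
fruit_ctx = {2, 3, 4, 5}
tech_ctx  = {11, 12, 13, 14}

# Intersection keeps only the bits of the sense supported by the context.
print(apple & fruit_ctx)   # {2, 3}   -> fruit-related bits survive
print(apple & tech_ctx)    # {11, 12} -> company-related bits survive
```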
Compositionality
  Semantic Folding:
  • atomic word representations can be aggregated for any text size: sentences, paragraphs, documents, books, etc. (see the sketch below)
  • aggregated representations can be compared to each other
  Other Word Embedding Models:
  • only word vectors OR sentence vectors OR paragraph vectors possible
  • vectors are not compatible with one another
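A minimal sketch of the compositionality claim, assuming word, sentence and document fingerprints all live in the same binary space: aggregation is a simple set union and comparison is an overlap count. Real systems may re-sparsify the aggregate; the toy bit positions below are invented.

```python
# Illustrative aggregation and comparison of sparse binary fingerprints.
# Bit positions are toy values; a real semantic map has thousands of positions.

def aggregate(word_fingerprints):
    """Union of word fingerprints gives a text fingerprint in the same space."""
    text_fp = set()
    for fp in word_fingerprints:
        text_fp |= fp
    return text_fp

def overlap(fp_a, fp_b):
    """Similarity = number of shared active bits."""
    return len(fp_a & fp_b)

sentence_a = aggregate([{1, 2, 3}, {3, 4}, {10, 11}])   # word fingerprints of sentence A
sentence_b = aggregate([{2, 3}, {11, 12}])              # word fingerprints of sentence B

# Word-to-sentence, sentence-to-document, etc. all use the same comparison.
print(overlap(sentence_a, sentence_b))   # 3 shared bits: {2, 3, 11}
```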
Inspectability
  Semantic Folding:
  • semantically grounded features, easily debuggable
  • tuning by content experts
  Other Word Embedding Models:
  • “black-box effect”: debugging only by trial and error
  • tuning by machine learning experts
Training Data
  Semantic Folding:
  • small amounts of data (high semantic payload)
  • no training data needed for classification
  • no gold-standard data needed
  Other Word Embedding Models:
  • statistical encoding requires large amounts of data (low semantic payload)
  • every classifier needs individual training
  • every classifier needs its own gold-standard data
Language Independence
  Semantic Folding:
  • semantic spaces can be trained on any language
  • semantic spaces can be easily aligned by an unsupervised method, enabling cross-language compatible representations
  Other Word Embedding Models:
  • can be trained on any language, but the amount of training data required can become a practical limitation
  • alignment process is complex due to the large amounts of training data involved
Computational Efficiency
  Semantic Folding:
  • sparse binary vectors
  • small memory footprint
  • similarity computed with simple Boolean operators (see the sketch below)
  Other Word Embedding Models:
  • dense double/floating-point vectors
  • large memory footprint
  • computationally more expensive floating-point operations
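A back-of-the-envelope sketch of the efficiency argument: similarity between two fingerprints reduces to a bitwise AND plus a popcount, and the memory comparison uses assumed sizes (a 16,384-bit fingerprint at roughly 2% sparsity versus a 300-dimensional float64 embedding), chosen only for illustration.

```python
# Rough sketch: Boolean similarity on bit sets vs. memory of a dense float vector.
# Sizes below (16,384-bit fingerprint, ~2% sparsity, 300-dim float64) are assumptions.

N_BITS = 16_384
active_a = {7, 19, 250, 4_000, 9_999}          # toy active positions of fingerprint A
active_b = {19, 250, 5_000, 9_999, 12_000}     # toy active positions of fingerprint B

# Pack the active positions into plain Python integers used as bit sets.
bits_a = sum(1 << i for i in active_a)
bits_b = sum(1 << i for i in active_b)

# Similarity is a bitwise AND followed by a popcount: cheap Boolean operations.
shared = bin(bits_a & bits_b).count("1")
print(f"shared active bits: {shared}")          # 3

# Memory, roughly: ~2% of 16,384 bits is ~327 active positions; stored as 16-bit
# indices that is ~650 bytes, versus 300 float64 values = 2,400 bytes for a dense vector.
sparse_bytes = int(0.02 * N_BITS) * 2
dense_bytes = 300 * 8
print(sparse_bytes, dense_bytes)
```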
Precision
  Semantic Folding:
  • precision stays steady across use cases
  Other Word Embedding Models:
  • parameters need to be optimized for every use case