Corpora Comparison

Controls
Comparison of narrow(?) domain parallel corpora.

The commentary dataset was obtained by manually running Google Translate on ESPN's English commentary.

The following datasets come from crawling Google-translated renderings of these sections of Dainik Jagran:

  • Experts Column
  • Match Report
  • Bouncer Column
Check or uncheck controls to compare the datasets on the required metric. For example, to compare 1-gram vs 2-gram growth for Hindi in the commentary dataset, check only 1 and 2 in the ngrams section, hi in the language section, and commentary in the dataset section.
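
To make the n-gram growth metric concrete, here is a minimal Python sketch. It assumes that "growth" means the number of distinct n-grams seen as more tokens of a corpus are read; the page does not define the metric, so treat this as one plausible reading. The function name ngram_growth, the step parameter, and the toy sentence are hypothetical and are not taken from the datasets above.

```python
from typing import List, Tuple


def ngram_growth(tokens: List[str], n: int, step: int = 1) -> List[Tuple[int, int]]:
    """Return (tokens_seen, distinct_ngrams_seen) pairs, sampled every `step` tokens.

    Assumption: "growth" is the count of distinct n-grams as the corpus is read.
    """
    seen = set()
    curve = []
    for i in range(len(tokens)):
        # An n-gram ending at position i exists only once n tokens have been read.
        if i + 1 >= n:
            seen.add(tuple(tokens[i + 1 - n : i + 1]))
        # Record a point every `step` tokens, and always at the final token.
        if (i + 1) % step == 0 or i + 1 == len(tokens):
            curve.append((i + 1, len(seen)))
    return curve


# Toy whitespace-tokenized sentence (hypothetical data, not from the datasets).
tokens = "the ball is hit and the ball is caught".split()
unigram_curve = ngram_growth(tokens, n=1, step=3)
bigram_curve = ngram_growth(tokens, n=2, step=3)
print(unigram_curve)  # [(3, 3), (6, 5), (9, 6)]
print(bigram_curve)   # [(3, 2), (6, 5), (9, 6)]
```

Comparing the two curves corresponds to checking only 1 and 2 in the ngrams section. For Hindi text, whitespace splitting is only a rough stand-in for whatever tokenization the datasets actually use.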