Word frequency data

Corpus of Contemporary American English



You can now download, free of charge, a list of the top 5,000 words/lemmas from the 450-million-word Corpus of Contemporary American English, which is the only large and balanced corpus of American English. Although there are many word and frequency lists of English on the web, we believe that this list is the most accurate one available (see the comparison with other lists).

The free list contains the lemma and part of speech for each of the top 5,000 words in American English. You can also download other lists that give the top 20-30 collocates (nearby words) for each of these words, which provides useful information on word meaning and usage, and that show which words are most common in certain genres (e.g. spoken or academic). Highly accurate lists for the top 20,000 and the top 60,000 words in English, with their top collocates, are also available for download.
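
As a rough illustration of working with a lemma list like the one described above, here is a minimal Python sketch that groups lemmas by part of speech. The file layout is an assumption, not the documented format of the download: the tab-separated columns (rank, lemma, pos, freq), the part-of-speech codes, and the counts in the sample rows are all placeholders invented for the example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical layout: tab-separated columns rank, lemma, pos, freq.
# The rows, tag codes, and counts are placeholders, not real corpus data.
SAMPLE = (
    "rank\tlemma\tpos\tfreq\n"
    "1\tthe\ta\t300\n"
    "2\tbe\tv\t200\n"
    "3\tand\tc\t100\n"
)


def load_lemmas(text):
    """Parse the assumed tab-separated layout into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows = []
    for row in reader:
        rows.append(
            {
                "rank": int(row["rank"]),
                "lemma": row["lemma"],
                "pos": row["pos"],
                "freq": int(row["freq"]),
            }
        )
    return rows


def group_by_pos(rows):
    """Map each part-of-speech code to its lemmas, in rank order."""
    grouped = defaultdict(list)
    for row in sorted(rows, key=lambda r: r["rank"]):
        grouped[row["pos"]].append(row["lemma"])
    return dict(grouped)


rows = load_lemmas(SAMPLE)
by_pos = group_by_pos(rows)
```

If the actual download uses different column names or a different delimiter, only `load_lemmas` would need to change.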

If you want an eBook version of the 5,000-word list (with collocates, genre information, etc.), you can purchase it for about $20.



See the list

Download the list