Word frequency data

You can now freely download a list of the top 5,000 words/lemmas from the 450-million-word Corpus of Contemporary American English, which is the only large and balanced corpus of American English. Although there are many word and frequency lists of English on the web, we believe that this list is the most accurate one available (compare...).

The free list contains the lemma and part of speech for each of the top 5,000 words in American English. Other lists available for download contain the top 20-30 collocates (nearby words) for each of these words, which provide useful information on word meaning and usage, and show which words are most common in certain genres (e.g. spoken or academic). Highly accurate lists of the top 20,000 and the top 60,000 words in English, with their top collocates, are also available for download.
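For readers unfamiliar with the term, the idea of a collocate (a word that occurs near another word) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used to build the lists: the function name `collocates`, the window size, and the sample sentence are all hypothetical, and real collocate lists are built from a full corpus and are usually ranked by more than raw counts.

```python
from collections import Counter

def collocates(tokens, node, window=4):
    """Count the words found within `window` tokens of each occurrence of `node`.

    Hypothetical helper for illustration; the window size is an assumption.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        start = max(0, i - window)
        # Words to the left and to the right of this occurrence of the node word.
        neighbors = tokens[start:i] + tokens[i + 1:i + 1 + window]
        counts.update(neighbors)
    return counts

# Made-up sample text, standing in for a real corpus.
text = "strong coffee and strong tea but weak coffee"
tokens = text.split()
result = collocates(tokens, "coffee", window=2)
print(result.most_common())
```

With a real corpus, the most frequent neighbors of a word give a quick picture of how it is typically used.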

If you want an eBook version of the 5,000-word list (with collocates, genre information, etc.), you can purchase it for about $20 here.

Finally, if you teach children, you might be interested in two free lists created by Dick Brandt, which show the most frequent sounds in English, based on a cross-match between the 20,000-word list and the CMU Pronouncing Dictionary: all words, monosyllables.
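The kind of cross-match described above can be sketched roughly as follows. This is a minimal illustration, not necessarily the procedure used for these lists: the frequency counts are invented sample values, the function name `sound_frequencies` is hypothetical, and weighting each sound by the frequency of the words that contain it is an assumption. The pronunciations follow the ARPAbet style of the CMU Pronouncing Dictionary, where vowels carry a stress digit.

```python
from collections import Counter

# Hypothetical sample data standing in for the word frequency list.
# The counts are invented for illustration only.
WORD_FREQUENCIES = {"the": 100, "cat": 5, "dog": 4}

# Pronunciations in the ARPAbet style of the CMU Pronouncing Dictionary.
# A real run would load the full dictionary file instead.
PRONUNCIATIONS = {
    "the": ["DH", "AH0"],
    "cat": ["K", "AE1", "T"],
    "dog": ["D", "AO1", "G"],
}

def sound_frequencies(word_freqs, pronunciations):
    """Total up each sound, weighted by the frequency of the words containing it."""
    totals = Counter()
    for word, freq in word_freqs.items():
        phones = pronunciations.get(word)
        if phones is None:
            continue  # no dictionary entry for this word; skip it
        for phone in phones:
            sound = phone.rstrip("012")  # drop the stress digit on vowels
            totals[sound] += freq
    return totals

totals = sound_frequencies(WORD_FREQUENCIES, PRONUNCIATIONS)
for sound, count in totals.most_common(3):
    print(sound, count)
```

Restricting the input to one-syllable words before counting would give the monosyllable variant of the same tally.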

See the list

Download the list