Word frequency data

Note: this data is based on corpora that were created solely by Mark Davies, Professor of Linguistics at Brigham Young University. As the result of an agreement between BYU and Mark Davies, all transactions regarding payments and licenses for this data are made solely with Mark Davies, rather than with BYU.

The COCA 100,000 word list comes in two formats (both included in the purchase price): text and Excel. The 20,000-60,000 lemma lists can be purchased in several different formats:

 1.  Text files. Click on the appropriate price link in the blue sections (e.g. $90). With these versions, you can view, search, print, export, and re-use the data.
 2.  eBook. Order via the green PayPal form below.
Licensing: A = academic, C = commercial. Click on a heading for samples.

# words         Wordlist                Wordlist + genre frequency   eBook
5,000 lemmas    Free (A)   $90 (C)      $45 (A)    $90 (C)           $19.95
20,000 lemmas   $60 (A)    $120 (C)     $80 (A)    $160 (C)          $39.95
60,000 lemmas   $90 (A)    $180 (C)     $125 (A)   $250 (C)          -----
100,000 words   $125 (A)   $250 (C)     (price includes both text and Excel files, with 200,000 links to COCA queries)

See comparison of 60k and 100k lists

Not sure which size of list to purchase (5,000, 20,000, or 60,000)? Take a look at this list to see which words appear at the different levels. There's no need to buy a larger list if a smaller one is adequate for your needs.


Order eBook
Order via the secure PayPal site by filling out the short form below and clicking "Submit". Within 5-10 seconds of payment, you will receive an email with a link to download the data.


Please make sure that the email address you use with PayPal is correct, so that you can receive this email. If you don't find it in your "Inbox", please check your "Junk Email" folder. Also note that while you can view and search all of the data in the eBook, you cannot print or export from it.