Word frequency data

from the 14 billion word iWeb corpus

Note: this data is based on corpora that were created solely by Mark Davies, Professor of Linguistics at Brigham Young University. As the result of an agreement between BYU and Mark Davies, all transactions regarding payments and licenses for this data are made solely with Mark Davies, rather than with BYU.

Top 60,000 lemmas (+ word forms)
(See sample)
Academic *   $125 License agreement
Commercial   $250 License agreement

These are the steps to obtain the data:

1. Download and fill out the license agreement. By signing it, you agree not to give the data to anyone outside of your university or company (which also means that you cannot post it on the web). You just need to fill in your name and, if applicable, your company, and then send the agreement back to us as an attachment.

   * Note that you must use an academic email address (e.g. *.edu or *.ac.edu) for an academic license.

2. Once we receive the license agreement, we'll send you a request for payment from PayPal.

3. Make the payment with a credit card through PayPal. Note that you do not need a PayPal account to make the payment.

4. As soon as we receive confirmation of the payment, we'll send you the link to download the data.