AIMA Artificial Intelligence: A Modern Approach

AIMA-exercises is an open-source community of students, instructors, and developers. Anyone can add an exercise, suggest answers to existing questions, or simply help us improve the platform. We accept contributions on this GitHub repository.

Exercise 22.3

Zipf’s law of word distribution states the following: Take a large corpus of text, count the frequency of every word in the corpus, and then rank these frequencies in decreasing order. Let $f_{I}$ be the $I$th largest frequency in this list; that is, $f_{1}$ is the frequency of the most common word (usually “the”), $f_{2}$ is the frequency of the second most common word, and so on. Zipf’s law states that $f_{I}$ is approximately equal to $\alpha / I$ for some constant $\alpha$. The law tends to be highly accurate except for very small and very large values of $I$.
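
One way to check the law empirically is sketched below in Python. It counts word frequencies, ranks them in decreasing order, and fits the constant $\alpha$ by least squares. The function names `ranked_frequencies` and `estimate_alpha`, the simple regular-expression tokenizer, and the choice of a least-squares fit are all assumptions made for illustration; the exercise itself does not prescribe any of them.

```python
import re
from collections import Counter


def ranked_frequencies(text):
    """Return word counts sorted in decreasing order: [f_1, f_2, ...].

    Tokenization is an assumption: lowercase runs of letters and apostrophes.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return sorted(Counter(words).values(), reverse=True)


def estimate_alpha(freqs):
    """Least-squares estimate of alpha in the model f_I ~ alpha / I.

    Minimizing sum_I (f_I - alpha / I)^2 over alpha gives
    alpha = sum_I (f_I / I) / sum_I (1 / I^2).
    """
    if not freqs:
        raise ValueError("need at least one frequency")
    numerator = sum(f / rank for rank, f in enumerate(freqs, start=1))
    denominator = sum(1 / rank ** 2 for rank in range(1, len(freqs) + 1))
    return numerator / denominator


if __name__ == "__main__":
    sample = "the cat and the dog and the bird"
    freqs = ranked_frequencies(sample)
    alpha = estimate_alpha(freqs)
    # Compare each observed frequency with the Zipf prediction alpha / I.
    for rank, f in enumerate(freqs, start=1):
        print(rank, f, round(alpha / rank, 2))
```

Because the law is stated to be inaccurate for very small and very large $I$, a more careful check might fit $\alpha$ only over a middle range of ranks; the sketch uses every rank for simplicity, and the toy sentence is far too small to say anything about the law itself.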
