In fact, there is an easier way to achieve this kind of thing: using Google's own API, or 'application programming interface'. I wrote the following program to count the number of instances of the pattern "No*":
import com.google.soap.search.GoogleSearch;
import com.google.soap.search.GoogleSearchFault;
import com.google.soap.search.GoogleSearchResult;

public class No {
    public static void main(String[] args) {
        // The API key is passed as the first command-line argument.
        String clientKey = args[0];
        // Number of queries to run, one per additional "o".
        int maxNoLength = 20;
        GoogleSearch s = new GoogleSearch();
        s.setKey(clientKey);
        String searchKey = "no";
        try {
            for (int no = 0; no < maxNoLength; no++) {
                // Lengthen the query by one character, so the first query is "noo".
                searchKey += "o";
                s.setQueryString(searchKey);
                GoogleSearchResult r = s.doSearch();
                int count = r.getEstimatedTotalResultsCount();
                System.out.println(searchKey + " = " + count);
            }
        } catch (GoogleSearchFault f) {
            // Any API fault ends the run; just report it.
            System.out.println(f);
        }
    }
}
All it does is lengthen the query string by one character, query Google, and get the estimated count of the results, repeating this for each length. It prints the results as it gets them:
java -cp googleapi.jar:. No client-key
noo = 178000
nooo = 131000
noooo = 108000
nooooo = 59700
noooooo = 37100
nooooooo = 26200
noooooooo = 19600
nooooooooo = 15100
noooooooooo = 22700
nooooooooooo = 9690
noooooooooooo = 7600
nooooooooooooo = 7510
noooooooooooooo = 6700
nooooooooooooooo = 4260
noooooooooooooooo = 3790
nooooooooooooooooo = 3060
noooooooooooooooooo = 2810
nooooooooooooooooooo = 2570
noooooooooooooooooooo = 2180
nooooooooooooooooooooo = 1950
Here, client-key is the key given to registered users of the API (it's free, but there are sensible restrictions on the number of queries per day). Of course, you might well argue that there is no 'sensible' number of queries of this type...
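Given the daily limit, it may be worth building the list of query strings first and checking its size before sending anything. What follows is only a sketch: the class name NoQueries, the queries helper and the dailyLimit value are all made up for illustration and are not part of the Google API, so check the allowance that comes with your own key.

```java
import java.util.ArrayList;
import java.util.List;

class NoQueries {
    // Build "noo", "nooo", ... in the same order the program above sends them.
    static List<String> queries(int maxNoLength) {
        List<String> result = new ArrayList<>();
        StringBuilder key = new StringBuilder("no");
        for (int i = 0; i < maxNoLength; i++) {
            key.append('o');
            result.add(key.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical placeholder, not the real limit for any key.
        int dailyLimit = 1000;
        List<String> planned = queries(20);
        if (planned.size() > dailyLimit) {
            System.out.println("Too many queries planned: " + planned.size());
            return;
        }
        // Nothing is sent here; this only lists what would be queried.
        for (String q : planned) {
            System.out.println(q);
        }
    }
}
```

Keeping the string building separate from the search calls also means the list can be inspected without using up any of the day's queries.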
In case you were wondering, the reason for choosing "No" was the film Revenge of the Sith, specifically the part where Darth Vader shouts "NOooOOooOOooOO" to the sky. This meme was briefly propagated, and I wondered how frequent the different lengths were.
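To look at how the counts change with length, the printed output can be read back into numbers. The counts mostly fall as the string gets longer, but not always: in the run above, the count of 22700 is higher than the 15100 before it. Here is a minimal sketch that picks out such cases, assuming the output lines have the "query = count" form shown earlier; the class name NoCounts and its two helpers are my own invention for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class NoCounts {
    // Parse lines such as "nooo = 131000" into (number of o's -> count),
    // keeping the order in which the lines were given.
    static Map<Integer, Integer> parse(List<String> lines) {
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parts = line.split("=");
            if (parts.length != 2) {
                continue; // skip anything that is not "query = count"
            }
            String query = parts[0].trim();
            int os = query.length() - 1; // everything after the leading 'n'
            counts.put(os, Integer.parseInt(parts[1].trim()));
        }
        return counts;
    }

    // Return the lengths whose count is higher than the previous line's count.
    static List<Integer> rises(Map<Integer, Integer> counts) {
        List<Integer> result = new ArrayList<>();
        Integer previous = null;
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            if (previous != null && e.getValue() > previous) {
                result.add(e.getKey());
            }
            previous = e.getValue();
        }
        return result;
    }

    public static void main(String[] args) {
        // Four consecutive lines copied from the output above.
        List<String> sample = Arrays.asList(
            "noooooooo = 19600",
            "nooooooooo = 15100",
            "noooooooooo = 22700",
            "nooooooooooo = 9690");
        System.out.println("Lengths where the count went up: " + rises(parse(sample)));
    }
}
```

Bear in mind that the counts are Google's estimates, so a bump like this may say as much about the estimate as about the meme.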