Being the crotchety old man that I am, the time I spent this evening on my gym’s treadmill left me feeling cantankerous. I had been watching Jeopardy, and all of the categories seemed horrible, dagnabit. Back in my day we didn’t have questions about sitcoms! No, it was all Latin, and poetry, and similarly high-minded pursuits.
Then I got home and remembered I had a bunch of code left over from when we built this thing. See, there is a terrifying website called j-archive.com. It’s maintained by former players, and it comprehensively chronicles every game of Jeopardy.
It’s possible to scrape this site to reconstruct games, which is what I did for the Cordray infographic. With this as a starting point, figuring out the percentage of categories devoted to television versus weightier topics was a relative cinch. I was absolutely confident that I would find a line snaking smoothly upward. Here are the regular expressions I used:
import re

# Case-insensitive patterns matched against category titles
RE_TV = re.compile(r'(T\.?V\.?|TELEVISION|SITCOM)', re.I)
RE_BIBLE = re.compile(r'BIBL(ICAL|E)', re.I)
RE_HISTORY = re.compile(r'(PRESIDENT|HISTORY|HISTORICAL)', re.I)
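For the curious, here's roughly how patterns like these can be turned into per-season percentages. This is a minimal sketch, not the actual scraper code: the `season_shares` function, the `PATTERNS` dict, and the `sample` data are all made up for illustration, and it assumes the scraped categories have already been grouped into a dict mapping season number to a list of category titles.

```python
import re

RE_TV = re.compile(r'(T\.?V\.?|TELEVISION|SITCOM)', re.I)
RE_BIBLE = re.compile(r'BIBL(ICAL|E)', re.I)
RE_HISTORY = re.compile(r'(PRESIDENT|HISTORY|HISTORICAL)', re.I)

# Hypothetical labels for each pattern; not from the original code.
PATTERNS = {'tv': RE_TV, 'bible': RE_BIBLE, 'history': RE_HISTORY}


def season_shares(categories_by_season):
    """Return {season: {label: fraction of that season's categories matching}}.

    Assumes categories_by_season maps a season number to a list of
    category title strings. Seasons with no categories are skipped.
    """
    shares = {}
    for season, titles in categories_by_season.items():
        total = len(titles)
        if total == 0:
            continue  # avoid dividing by zero for empty seasons
        shares[season] = {
            label: sum(1 for title in titles if pattern.search(title)) / total
            for label, pattern in PATTERNS.items()
        }
    return shares


# Made-up sample data, just to show the shape of the input.
sample = {
    1: ['TV SITCOMS', 'THE BIBLE', 'U.S. PRESIDENTS', 'POTPOURRI'],
    2: ['WORLD HISTORY', 'BIBLICAL WOMEN', 'OPERA', 'LATIN'],
}

if __name__ == '__main__':
    for season, result in sorted(season_shares(sample).items()):
        print(season, result)
```

Dividing by the season's own category count is what keeps longer seasons from looking artificially TV-heavy (or Bible-heavy). Note that a category matching two patterns gets counted under both.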
And here’s the graph that resulted (normalized by total number of categories in a season):
Gotta say, I didn’t see this one coming. I guess the nerds are (mostly) all right after all. Alex Trebek’s still kind of a supercilious asshole, though.
Anyway, I’m open to other suggested analyses. Lay ’em on me.