Official Blog
Insights from Googlers into our products, technology, and the Google culture
Find out what’s in a word, or five, with the Google Books Ngram Viewer
December 16, 2010
Scholars interested in topics such as philosophy, religion, politics, art and language have employed qualitative approaches such as literary and critical analysis with great success. As more of the world’s literature becomes available online, it’s increasingly possible to apply quantitative methods to complement that research. So today Will Brockman and I are happy to announce a new visualization tool called the Google Books Ngram Viewer, available on Google Labs. We’re also making the datasets backing the Ngram Viewer, produced by Matthew Gray and intern Yuan K. Shen, freely downloadable so that scholars will be able to create replicable experiments in the style of traditional scientific discovery.
Comparing instances of [flute], [guitar], [drum] and [trumpet] (blue, red, yellow and green respectively) in English literature from 1750 to 2008
Since 2004, Google has digitized more than 15 million books worldwide. The datasets we’re making available today to further humanities research are based on a subset of that corpus, weighing in at 500 billion words from 5.2 million books in Chinese, English, French, German, Russian, and Spanish. The datasets contain phrases of up to five words with counts of how often they occurred in each year.
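As a rough illustration of what "counts of how often they occurred in each year" can look like in practice, the sketch below parses a few tab-separated records and totals the occurrences of one phrase per year. The column layout (ngram, year, match count, page count, volume count), the function name `yearly_counts`, and every sample value are assumptions made for illustration; check the documentation that ships with the datasets for the actual file format.

```python
from collections import defaultdict

# Hypothetical sample rows: the column order and all numbers are assumed,
# not taken from the real datasets.
SAMPLE = """\
flute\t1900\t120\t100\t40
flute\t1901\t135\t110\t42
guitar\t1900\t60\t55\t20
guitar\t1901\t75\t70\t25
"""


def yearly_counts(lines, phrase):
    """Return {year: match_count} for one phrase."""
    counts = defaultdict(int)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Only the first three assumed columns are needed here.
        ngram, year, match_count = line.split("\t")[:3]
        if ngram == phrase:
            counts[int(year)] += int(match_count)
    return dict(counts)


flute = yearly_counts(SAMPLE.splitlines(), "flute")
print(flute)
```

For a real download, the same function could be fed an open file object instead of the in-memory sample, since it only iterates over lines.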
These datasets were the basis of a research project led by Harvard University's Jean-Baptiste Michel and Erez Lieberman Aiden, published today in Science and coauthored by several Googlers. Their work offers several examples of how quantitative methods can provide insights into topics as diverse as the spread of innovations, the effects of youth and profession on fame, and trends in censorship.
The Ngram Viewer lets you graph and compare phrases from these datasets over time, showing how their usage has waxed and waned over the years. One of the advantages of having data online is that it lowers the barrier to serendipity: you can stumble across something in these 500 billion words and be the first person ever to make that discovery. Below I’ve listed a few interesting queries to pique your interest:
World War I, Great War
child care, nursery school, kindergarten
fax, phone, email
look before you leap, he who hesitates is lost
virus, bacteria
tofu, hot dog
burnt, burned
flute, guitar, trumpet, drum
Paris, London, New York, Boston, Rome
laptop, mainframe, microcomputer, minicomputer
fry, bake, grill, roast
George Washington, Thomas Jefferson, Abraham Lincoln
supercalifragilisticexpialidocious
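For anyone who wants to try comparisons like these against the downloadable data rather than in the viewer, one common approach is to divide a phrase's yearly count by the total number of words recorded for that year, so that growth in publishing does not swamp the trend. The sketch below does this for two phrases. The function name `relative_frequency`, the input dictionaries, and every number are invented for illustration, and nothing here is meant to describe how the Ngram Viewer itself is implemented.

```python
def relative_frequency(phrase_counts, total_counts):
    """Map each year to (phrase occurrences / total words that year)."""
    return {
        year: phrase_counts.get(year, 0) / total
        for year, total in sorted(total_counts.items())
        if total > 0  # avoid dividing by zero for empty years
    }


# Hypothetical data: all counts below are invented.
totals = {1900: 1_000_000, 1950: 2_000_000, 2000: 4_000_000}
burnt = {1900: 50, 1950: 60, 2000: 40}
burned = {1900: 30, 1950: 80, 2000: 200}

burnt_freq = relative_frequency(burnt, totals)
burned_freq = relative_frequency(burned, totals)

# Report which spelling is relatively more frequent in each year.
for year in totals:
    leader = "burnt" if burnt_freq[year] > burned_freq[year] else "burned"
    print(year, f"{burnt_freq[year]:.2e}", f"{burned_freq[year]:.2e}", leader)
```

Working with relative rather than raw counts makes years with very different publishing volumes comparable on the same axis.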
We know nothing can replace the balance of art and science that is the qualitative cornerstone of research in the humanities. But we hope the Google Books Ngram Viewer will spark some new hypotheses ripe for in-depth investigation, and invite casual exploration at the same time. We’ve started working with some researchers already via our Digital Humanities Research Awards, and look forward to additional collaboration with like-minded researchers in the future.
Posted by Jon Orwant, Engineering Manager, Google Books