Don’t let a sea of unstructured data hide good sake
What makes a good Japanese sake? Sake is sensitive to temperature. Its flavor
changes depending on whether it is served cold, at room temperature, warm, or
hot. In particular, heating sake to a preferred temperature in a precise amount
of time is no easy task to master. Oh, and knowing what type of sake to pair
with tempura versus sushi makes the rice brew taste even better.
it’s also behind Tetsuya’s search for Tokyo’s best sake bistro.
“The benefit of text mining technology is that it makes you aware of what you
have not yet noticed, and that’s how I found this bistro, where I became a
regular customer,” said Tetsuya, with a sad look on his face.
In late 2014, Tetsuya’s favorite bistro near Tokyo Station quietly
closed. The bistro, Yanagi, was run by a couple with a small counter and three
tables. Just the right size for the husband-and-wife team of Otosan (which
translates as “father/darling”) and Okasan (or “mother/honey”) to manage. And
they served the best Kanzake (warm sake) with a taste of Okasan’s home cooking.
The bistro gradually won fans like Tetsuya through word of mouth, as a precious
place that you want to introduce to close friends. So, Tetsuya was shocked to
receive the call from Otosan with the news that Yanagi was closing. He wasn’t
the only regular customer sad about the closure. During his last visit, Otosan
softly muttered that if they could have this many customers on a regular basis,
perhaps, they would not have had to make such a difficult decision.
The decision, though, came down to numbers. Customer review numbers. Yanagi had good reviews
for its quality sake and homemade cooking. Just not enough of them to rank on
popular restaurant review sites. It was a hidden treasure that unfortunately
With Otosan’s words in his ear, Tetsuya decided to reveal how he found Yanagi,
to shed light on quality-conscious bistros like it.
social sentiment of a good sake bistro
Tetsuya analyzed tweets containing bistro names to determine whether they hold
information that indicates a good bistro. He looked not only at tweets about
bistros but also at what kind of people tweet about them, working through
millions of tweets and tens of thousands of candidates to narrow down the
definition of a good bistro.
He gathered tweets that contained either “nihonshu (sake)” or “beer,” while
eliminating tweets with industry terms and expressions, such as “goraiten”
(a formal way to say “look forward to your visit”). He included tweets
containing the word “beer” because he wanted to gather broader information
about sake: people often start by toasting with beer before drinking sake, so
including these tweets might raise the chance of finding information on sake.
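As a rough illustration of this kind of keyword filter, here is a minimal Python sketch. It is not Tetsuya's actual system. The term lists, the function names `is_candidate` and `filter_tweets`, and the sample tweets are all assumptions made up for the example; the article only names “nihonshu,” “beer,” and “goraiten.”

```python
# Hypothetical keyword lists; only "nihonshu", "beer", and "goraiten"
# come from the article, the Japanese spellings are our assumption.
INCLUDE_TERMS = ("日本酒", "nihonshu", "ビール", "beer")
EXCLUDE_TERMS = ("ご来店", "goraiten")


def is_candidate(tweet: str) -> bool:
    """Keep a tweet that mentions sake or beer and has no industry wording."""
    text = tweet.lower()
    mentions_drink = any(term in text for term in INCLUDE_TERMS)
    looks_promotional = any(term in text for term in EXCLUDE_TERMS)
    return mentions_drink and not looks_promotional


def filter_tweets(tweets):
    """Return only the tweets that pass the keyword filter, in order."""
    return [t for t in tweets if is_candidate(t)]


# Made-up sample tweets for illustration only.
sample = [
    "Going to Yanagi for some nihonshu tonight",
    "Started with a beer, then moved on to warm sake",
    "We look forward to your visit (goraiten)! Beer half price",
    "Lovely weather in Tokyo today",
]
kept = filter_tweets(sample)
```

With this sample, the first two tweets are kept, the third is dropped as promotional wording, and the fourth is dropped because it mentions neither drink.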
He then cross-referenced reviews and public information, such as location,
size, ambiance and menu, against the definitions that emerged from his Twitter
analysis. To do this, he mined the text of 4 million tweets to identify hidden
bistros that may be good, and then used the reviews and public information to
further narrow down the candidates. For example, a tweet of “going to
[bistro name] for some sake” might not seem significant, but it’s a good lead
to connect with other findings.
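To make the idea of a “lead” concrete, the sketch below counts how many tweets mention each candidate bistro name. It is only an assumption about how such leads could be tallied, not the method from the paper. The function name `count_mentions`, the bistro “Hanabi,” and the sample tweets are hypothetical.

```python
from collections import Counter


def count_mentions(tweets, bistro_names):
    """Count, per bistro name, how many tweets contain that name."""
    counts = Counter()
    for tweet in tweets:
        text = tweet.lower()
        for name in bistro_names:
            # Naive substring match; a real system would need to handle
            # names that are also family names or common words.
            if name.lower() in text:
                counts[name] += 1
    return counts


# Made-up data: "Hanabi" is a hypothetical bistro name.
names = ["Yanagi", "Hanabi"]
tweets = [
    "going to Yanagi for some sake",
    "Yanagi has the best kanzake",
    "a quick beer at Hanabi",
    "nothing to do with bistros",
]
mentions = count_mentions(tweets, names)
```

Simple substring matching like this would over-count whenever a bistro shares its name with a person or place, so the counts are best read as leads to check against other information rather than as evidence on their own.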
Tetsuya matched bistros that had low rankings on review sites but favorable
tweets. In about 30 minutes, his system could deliver a potential bistro near a
specific location, like Tokyo Station.
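The matching step could be sketched along these lines. Everything here is an assumption for illustration: the article does not say how rankings or favorable tweets were scored, so this version uses a review count as a stand-in for ranking, a simple favorable-tweet ratio, and arbitrary thresholds. The function name `hidden_gems` and all figures in the sample data are hypothetical.

```python
def hidden_gems(bistros, max_reviews=20, min_favorable_ratio=0.7, min_tweets=3):
    """Pick bistros with few reviews but a high share of favorable tweets.

    Each bistro is a dict with the keys "name", "review_count",
    "favorable_tweets", and "total_tweets". Thresholds are arbitrary.
    """
    picks = []
    for b in bistros:
        if b["total_tweets"] < min_tweets:
            continue  # too few tweets to judge
        ratio = b["favorable_tweets"] / b["total_tweets"]
        if b["review_count"] <= max_reviews and ratio >= min_favorable_ratio:
            picks.append((b["name"], ratio))
    # Most favorably tweeted first.
    return sorted(picks, key=lambda p: p[1], reverse=True)


# Made-up figures; only the name "Yanagi" appears in the article.
candidates = [
    {"name": "Yanagi", "review_count": 8, "favorable_tweets": 9, "total_tweets": 10},
    {"name": "Big Chain", "review_count": 500, "favorable_tweets": 40, "total_tweets": 50},
    {"name": "Quiet Corner", "review_count": 5, "favorable_tweets": 1, "total_tweets": 4},
]
gems = hidden_gems(candidates)
```

In this toy data only “Yanagi” survives: “Big Chain” already has plenty of reviews, and “Quiet Corner” has few reviews but mostly unfavorable tweets.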
To determine whether his analytics technique worked, he decided to go with an
old-fashioned yet reliable way to confirm it. Whenever he identified a promising
bistro, he tried it out after work with colleagues who also love sake. They
quickly became members of the Japanese Sake club hosted by Tetsuya.
Based on his advanced analytics on how to discover a good sake bistro, and on
field trips to 15 excellent bistros from Tokyo to Kyoto (plus field work by the
Japanese Sake club), Tetsuya wrote “Mining a large amount of tweets for
discovering bistro serving good sake: an attempt for using micro blogs as
knowledge,” an academic paper (in Japanese) that he and his club members
presented at the 21st annual meeting of the Association for Natural Language
Processing of Japan last March.
Tetsuya continues working on solving ambiguity problems, not losing sight of
the fact that noisy data may hold hidden insights.
For example, bistros often bear the family name of their owners. This makes
identification challenging when using natural language processing because it’s
difficult to identify if the name indicates the bistro, some family or
He wants to further analyze the people tweeting about sake and sake bistros, using
IBM Watson Personality Insights.
He also wants to integrate image analysis to better identify bistros and
locations. And he is always trying to add some fun into his research, with his
Japanese Sake Club actively supporting him, particularly on field trips.
work: Tetsuya and some of the Japanese Sake club members giving a toast at a
newly discovered bistro after work (From left: Tetsuya,