Stitcher for Podcasts


Episode Info

... two records from a domain that you've never even seen before. Say you've never done entity resolution on restaurants from Singapore. The first two records you feed it, it's already really, really smart. And then as you feed it more data, it gets smarter and smarter. ...

So, there are two things that we've intertwined. One is common sense. One type of common sense is the names: Dick, Dickie, Richie, Rick, Ricardo are all part of the same name family. Why should it have to study millions and millions of records to learn that again? ...

Next to common sense, there's real-time learning. In real-time learning, we do a few things. You might have somebody named Bob who now goes by a nickname or alias, Andy. Eventually, you might come to learn that. So, you have to learn over time that Bob also has this nickname, and Bob lived at three addresses, and this is his credit card number, and now he's got four phone numbers. You want to learn those over time. ...

These systems we're creating, our entity resolution systems (which really resolve entities and graph them; call it an index of identities and how they're related), never have to be reloaded. They literally clean themselves up in the past. You can do maintenance on them while you're querying them, while you're loading new transactional data, while you're loading historical data. There's nothing else like it that can work at this scale. It's really hard to do.

Related resources:
Jeff Jonas on “Context Computing”
David Ferrucci on why “Language understanding remains one of AI’s grand challenges”
David Blei on “Topic models: Past, present, and future”
“Lessons learned building natural language processing systems in health care”
“Building a contacts graph from activity data”
“Customer record deduplication using Spark and Reifier”
...
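
The two ideas in the excerpt above, built-in name families and learning an entity's aliases and attributes as records arrive, can be illustrated with a toy sketch. This is not the system described in the episode: the `NAME_FAMILIES` table, the `Entity` and `Resolver` classes, and the matching rule (a shared phone number, or a shared name plus a shared address) are all assumptions made for illustration, and a real resolver would use far richer evidence.

```python
from dataclasses import dataclass, field

# Hypothetical "common sense" table: names known in advance to belong together.
NAME_FAMILIES = {
    "richard": {"richard", "dick", "dickie", "richie", "rick", "ricardo"},
}


def canonical_name(name):
    """Map a name to its family's canonical form, or just normalize it."""
    n = name.strip().lower()
    for canon, family in NAME_FAMILIES.items():
        if n in family:
            return canon
    return n


@dataclass(eq=False)  # compare entities by identity, not by field values
class Entity:
    names: set = field(default_factory=set)
    addresses: set = field(default_factory=set)
    phones: set = field(default_factory=set)


class Resolver:
    """Toy incremental resolver: each record is matched as it arrives."""

    def __init__(self):
        self.entities = []

    @staticmethod
    def _matches(entity, name, address, phone):
        # Assumed rule: same phone, or same name together with same address.
        same_phone = phone in entity.phones
        same_name_and_address = name in entity.names and address in entity.addresses
        return same_phone or same_name_and_address

    def add(self, name, address=None, phone=None):
        canon = canonical_name(name)
        matches = [e for e in self.entities if self._matches(e, canon, address, phone)]
        if matches:
            target = matches[0]
            # A record that links several existing entities merges them,
            # correcting earlier decisions without reloading any data.
            for other in matches[1:]:
                target.names |= other.names
                target.addresses |= other.addresses
                target.phones |= other.phones
                self.entities.remove(other)
        else:
            target = Entity()
            self.entities.append(target)
        # Accumulate what this record teaches us about the entity.
        target.names.add(canon)
        if address:
            target.addresses.add(address)
        if phone:
            target.phones.add(phone)
        return target
```

As a usage example, feeding `Resolver().add("Bob", address="9 Oak Ave", phone="555-0200")` followed by `add("Andy", phone="555-0200")` on the same resolver attaches the alias to the existing entity, because the phone number is shared. A later record that matches two separate entities at once causes them to be merged in place.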