Bringing data science to the speakers of every language
暂无分享,去创建一个
Speakers of more than 5,000 languages have access to internet and communication technologies. The majority of phones, tablets and computers now ship with language-enabled capabilities like speech-recognition and intelligent auto-correction, and people increasingly interact with data-intensive cloud-based language technologies like search-engines and spam-filters. For both personal and large-scale technologies, the service quality drops or disappears entirely outside of a handful of languages. Speakers of low-resource languages correlate with lower access to healthcare, education and higher vulnerability to disasters. Serving the broadest possible range of languages is crucial to ensuring equitable participation in the global information economy. I will present examples of how natural language processing and distributed human computing are improving the lives of speakers of all the world's languages, in areas including education, disaster-response, health and access to employment. When applying natural language processing to the full diversity of the world's communications, we need to go beyond simple keyword analysis and implement complex technologies that require human-in-the-loop processing to ensure usable accuracy. In recent work where more than a million human judgments were collected on unstructured text and imagery data around natural disasters, I will present observations that debunk recent over-optimistic claims about the utility of social media following disasters. On the positive side, I will share results that show how for-profit technologies are improving people's lives by providing sustainable economic growth opportunities when they support more languages, aligning business objectives with global diversity.