Just as electricity transformed the way industries functioned in the past century, artificial intelligence — the science of programming cognitive abilities into machines — has the power to substantially change society in the next 100 years. AI is being harnessed to enable such things as home robots, robo-taxis and mental health chatbots to make you feel better.
A startup is developing robots with AI that brings them closer to human-level intelligence. AI has already embedded itself in daily life, for example by powering the digital assistants Siri and Alexa. It lets consumers shop and search online more accurately and efficiently, among other tasks that people take for granted.
“AI is the new electricity,” said Andrew Ng, co-founder of Coursera and an adjunct Stanford professor who founded the Google Brain Deep Learning Project, in a keynote speech at the AI Frontiers conference that was held this past weekend in Silicon Valley. “About 100 years ago, electricity transformed every major industry. AI has advanced to the point where it has the power to transform” every major sector in coming years. And even though there’s a perception that AI is a fairly new development, it has actually been around for decades, he said. It is taking off now because of the ability to scale data and computation.
Ng said most of the value created through AI today has come from supervised learning, in which an input X is mapped to an output Y. There have been two major waves of progress. The first leverages deep learning to enable such things as predicting whether a consumer will click on an online ad after the algorithm gets some information about that person. The second came when the output no longer had to be a single number but could be something more complex, such as a transcript from speech recognition, a sentence in another language or audio. For example, in self-driving cars, the input of an image can lead to an output of the positions of other cars on the road.
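To make the “input X leads to output Y” idea concrete, here is a minimal sketch of supervised learning applied to the ad-click example. It is an illustration only, not a method described at the conference: it uses plain logistic regression rather than deep learning, and the two features (whether the person visited a product page, and scaled time on site) and the toy data are hypothetical.

```python
import math

def sigmoid(z: float) -> float:
    """Squash a real-valued score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit weights w and bias b by batch gradient descent on labeled pairs (X, Y)."""
    n_features = len(xs[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of the log loss with respect to the score
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / len(xs) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(xs)
    return w, b

def predict_click_probability(w, b, x):
    """Map a new input X to a predicted probability that Y = 1 (a click)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Hypothetical toy data: X = [visited_product_page, minutes_on_site / 10], Y = clicked
X = [[1, 0.9], [1, 0.7], [0, 0.2], [0, 0.1], [1, 0.8], [0, 0.3]]
Y = [1, 1, 0, 0, 1, 0]

w, b = train_logistic(X, Y)
likely = predict_click_probability(w, b, [1, 0.8])
unlikely = predict_click_probability(w, b, [0, 0.1])
```

The model never receives a rule for what makes someone click. It only sees labeled examples and adjusts its weights until its predictions match them, which is the sense in which the mapping from X to Y is learned.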
Indeed, deep learning (where a computer learns from datasets to perform functions, instead of just executing specific tasks it was programmed to do) was instrumental in achieving human parity in speech recognition, said Xuedong Huang, who led the Microsoft team behind the historic achievement in 2016, when its system posted a 5.9% error rate, the same as a human transcriptionist. “Thanks to deep learning, we were able to reach human parity after 20 years,” he said at the conference. The team has since lowered the error rate even more, to 5.1%.
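The article does not say how the error rate was measured. Assuming it refers to word error rate, a common metric for speech transcription, the figure is the number of word substitutions, deletions and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. A minimal sketch, with made-up sentences:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between two transcripts, divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion (word missing from hypothesis)
                curr[j - 1] + 1,                  # insertion (extra word in hypothesis)
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)

# Hypothetical example: the system drops one word out of six.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
```

Under this assumption, an error rate of 5.9% would mean roughly six word errors for every 100 words in the reference transcript.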
The Rise of Digital Assistants
Starting in 2010, the quality of speech recognition began to improve for the industry, eventually leading to the creation of Siri and Alexa. “Now, you almost take it for granted,” Ng said. That’s not all; speech is expected to replace touch-typing for input, said Ruhi Sarikaya, director of Amazon Alexa. The key to greater accuracy is to understand the context. For example, if a person asks Alexa what he should do for dinner, the digital assistant has to assess his intent. Is he asking Alexa to make a restaurant reservation, order food or find a recipe? If he asks Alexa to find ‘Hunger Games,’ does he want the music, video or audiobook?
And what’s next for the digital assistant is an even more advanced undertaking — to understand “meaning beyond words,” said Dilek Hakkani-Tur, research scientist at Google. For example, if the user uses the words “later today,” it could mean 7 p.m. to 9 p.m. for dinner or 3 p.m. to 5 p.m. for meetings. This next level up also calls for more complex and lively conversations, multi-domain tasks and interactions beyond domain boundaries, she said. Moreover, Hakkani-Tur said, digital assistants should be able to do things such as easily read and summarize emails.
After speech, ‘computer vision,’ or the ability of computers to recognize images and categorize them, was the next to leap, speakers said. With many people uploading images and video, it became cumbersome to add metadata to all content as a way to categorize it. Facebook built an AI platform called Lumos to understand and categorize videos at scale, said Manohar Paluri, a research lead at the company. Facebook uses Lumos to collect data on, for example, fireworks images and videos. The platform can also use people’s poses to identify a video, such as categorizing people lounging around on couches as hanging out.
What’s critical is to ascertain the primary semantic content of the uploaded video, added Rahul Sukthankar, head of video understanding at Google. To help the computer correctly identify what’s in the video (for example, whether professionals or amateurs are dancing), his team mines YouTube for similar content that AI can learn from, such as having a certain frame rate for non-professional content. Sukthankar added that a promising direction for future research is to do computer training using videos. So if a robot is shown a video of a person pouring cereal into a bowl from multiple angles, it should learn by watching.
At Alibaba, AI is used to boost sales. For example, shoppers on its Taobao e-commerce site can upload a picture of a product they would like to buy, like a trendy handbag sported by a stranger on the street, and the website will come up with handbags for sale that come closest to the photo. Alibaba also uses augmented reality/virtual reality to let people see and shop from stores like Costco. On its Youku video site, which is similar to YouTube, Alibaba is working on a way to insert virtual 3D objects into people’s uploaded videos, as a way to increase revenue. That’s because many video sites struggle with profitability. “YouTube still loses money,” said Xiaofeng Ren, a chief scientist at Alibaba.
Rosie and the Home Robot
But for all the advances in AI, it’s still no match for the human brain. Vicarious is a startup that aims to close the gap by developing human-level intelligence in robots. Co-founder Dileep George said that the components are there for smarter robots. “We have cheap motors, sensors, batteries, plastics and processors … why don’t we have Rosie?” He was referring to the multipurpose robot maid in the 1960s space-age cartoon The Jetsons. George said the current level of AI is like what he calls the “old brain,” similar to the cognitive ability of rats. The “new brain” is more developed, such as what’s seen in primates and whales.
George said the “old brain” AI gets confused when small inputs are changed. For example, a robot that can play a video game goes awry when the colors are made just 2% brighter. “AI today is not ready,” he said. Vicarious uses deep learning to get the robot closer to human cognitive ability. In the same test, a robot with Vicarious’s AI kept playing the game even though the brightness had changed. Another thing that confuses “old brain” AI is putting two objects together. People can see two things superimposed on each other, such as a coffee mug partly obscuring a vase in a photo, but robots mistake it for one unidentified object. Vicarious, which counts Facebook CEO Mark Zuckerberg as an investor, aims to solve such problems.
The intelligence inside Kuri, a robot companion and videographer meant for the home, is different. Kaijen Hsiao, chief technology officer of creator Mayfield Robotics, said there is a camera behind the robot’s left eye that gathers video in HD. Kuri has depth sensors to map the home and uses images to improve navigation. She also has pet and person detection features so she can smile or react when they are around. Kuri has place recognition as well, so she will remember she has been to a place before even if the lighting has changed, such as the kitchen during the day or night. Moment selection is another feature of the robot, which lets her recognize similar videos she records — such as dad playing with the baby in the living room — and eliminates redundant ones.
“Her job is to bring a spot of life to your home. She provides entertainment — she can play music, podcasts, audiobooks. You can check your home from anywhere,” Hsiao said. Kuri is the family’s videographer, going around the house recording so no one is left out. The robot will curate the videos and show the best ones. For this, Kuri uses vision and deep learning algorithms. “Her point is her personality … [as] an adorable companion,” Hsiao said. Kuri will hit the market in December at $799.
Business Response to AI
The U.S. and China lead the world in investments in AI, according to James Manyika, chairman and director of the McKinsey Global Institute. Last year, AI investment ranged from $15 billion to $23 billion in North America and from $8 billion to $12 billion in Asia (mainly China), while Europe lagged at $3 billion to $4 billion. Tech giants are the primary investors in AI, pouring in between $20 billion and $30 billion, with another $6 billion to $9 billion from others, such as venture capitalists and private equity firms.
Where did they put their money? Machine learning took 56% of the investments, with computer vision second at 28%. Natural language garnered 7%, autonomous vehicles were at 6% and virtual assistants made up the rest. But despite the level of investment, actual business adoption of AI remains limited, even among firms that know its capabilities, Manyika said. Around 40% of firms are thinking about it, 40% are experimenting with it and only 20% have actually adopted AI in a few areas.
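The figures above leave two numbers implicit: the share that went to virtual assistants and the combined investment range. A quick check of the arithmetic, assuming the category shares are rounded percentages of one total that sums to 100% and that the two investor groups do not overlap:

```python
# Category shares quoted in the article, in percent.
shares = {
    "machine learning": 56,
    "computer vision": 28,
    "natural language": 7,
    "autonomous vehicles": 6,
}

# "Virtual assistants made up the rest": whatever is left out of 100%.
virtual_assistants_share = 100 - sum(shares.values())

# Investment ranges quoted in the article, in billions of U.S. dollars (low, high).
tech_giants = (20, 30)
other_investors = (6, 9)

# Combined range, assuming the two groups are counted separately.
total_low = tech_giants[0] + other_investors[0]
total_high = tech_giants[1] + other_investors[1]

print(f"Virtual assistants: {virtual_assistants_share}%")
print(f"Total AI investment: ${total_low} billion to ${total_high} billion")
```

On these assumptions, virtual assistants account for about 3% of the investment, and the total comes to between $26 billion and $39 billion.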
The reason for such reluctance is that 41% of companies surveyed are not convinced they can see a return on their investment, 30% said the business case isn’t quite there and the rest said they don’t have the skills to handle AI. However, McKinsey believes that AI can more than double the impact of other analytics and has the potential to materially raise corporate performance.
There are companies that get it. Among the sectors leading in AI are telecom and tech companies, financial institutions and automakers. Manyika said these early adopters tend to be larger and digitally mature companies that incorporate AI into core activities, focus on growth and innovation over cost savings and enjoy the support of C-suite executives. The slowest adopters are companies in health care, travel, professional services, education and construction. However, as AI becomes widespread, it’s only a matter of time before firms get on board, experts said.