You’ve been training AI for free


– This is gonna sound like a scam but I found a way to make extremely small amounts of money online. So the one that caught my eye here is Pinterest. Which, we all know and love Pinterest. “Determine the topical relatedness between pieces of text.” 40 cents. Wait there’s one more, so this one’s “Find info from an email.” For three cents. They’re just gonna email me, and then I like find info from it? Which seems cool.

It turns out Amazon runs an online marketplace that farms out basic tasks computer programs have a hard time with. It’s called Mechanical Turk, named after a robot from the 1700s. But, people just call it MTurk.

So what I wanna do now,
is Leonard Monteiro will pay me three cents to write the prices shown in an image. I’m not totally sure why this is like, worth doing money, but okay, let’s do it. It looks like a parking meter? I hope that this is for a good cause somehow, that this app is helping people rather than just sending bills to people. So I feel like on the other end of this task there’s some sort of automated parking meter app.

Which is weird, but it turns out stuff like this happens all the time. Nearly every successful AI product has human beings behind it. You just don’t see them until you look at the big picture.

(upbeat music)

So what’s going on here? Is there actually an
app that claims to read parking meter fines, but it’s actually humans doing it? Or, am I helping train their AI and these meters are just hypothetical examples? To figure that out, I asked a resident AI expert, James Vincent.

– So, I did a little Googling here and yeah, sorry, Russell, this is not a parking meter at all. This is actually a little gadget you put in supermarkets and you scan barcodes to check the prices.

Now, the bigger question
is why does someone want you to write down all these prices? And I have two answers for you. In the first case, creating training data. Say you want to make a machine vision system that automatically does what you’re doing. How does it actually know where to look in the picture? How does it know what the barcode scanner looks like? To teach it that information, you need to feed it labeled data. You need to get a human to do that labeling, in this case, Russell. He labels the data, it goes into the system, and the system learns what these things look like.

– (laughing) Oh no, there’s so much more!

– That is how you train an AI system, but sometimes these systems
they don’t work, right? So, use case number two, what’s that? Well, that might be where the AI system actually can’t do what it says it can do. It might be that there’s too much glare in the picture and it can’t read the numbers on the screen very well. In those cases, you need to throw the data to someone who has the intelligence to work out what’s going on. That’s not a machine, that’s a human, like Russell. And they will label the data for the machine and return it back to the end user.

Sometimes companies are upfront about this, and sometimes they lie about it too. Sometimes they will say, “Yes! We’ve got a whizzy AI system that’s doing all this automatically.” And actually, they don’t. Turns out that that AI is a lot of low-paid workers on a system like Mechanical Turk, like Russell, providing this data in the background.

– If you’ve ever filled out
a CAPTCHA you’ve probably done some of that work yourself. In theory those tests are meant to verify that you’re human, but Google has started using them to collect data for other products too. Typing out this blurry word could help the character recognition algorithm in Google Books. These skewed numbers are probably helping confirm an address in Google Street View. The most recent CAPTCHAs ask you to identify all the squares of a picture that have a car in it, at the same time that Google’s Waymo branch is trying to train self-driving cars. Even a simple task like setting a timer with Google Assistant can require an army of contractors manually annotating the data, as a recent Guardian investigation showed.

Sometimes users do the
labeling themselves. Facebook has some of the best facial recognition data in the world, because they already have dozens of pictures of your face. You added them yourself. Multiply that across billions of users and it’s all the data you need to build a facial recognition system, which can then start automatically tagging your friends in the next set of pictures you upload. Suddenly, Facebook has one of the most advanced facial recognition systems in the world, and they didn’t have to pay a dime for it.

When researchers at Google were trying to build a depth sensing camera, they went even further. What they really needed were a bunch of videos where mobile cameras explored static space from different angles. But where would they find that?

(atmospheric music)

Google downloaded 2,000
mannequin challenge videos, fed them into an algorithm, and a new kind of depth sensing software was born. Think about it, every minute, 500 new hours of content are added to YouTube. If you’re training an AI that’s a lot of video to draw on. And there are no copyright restrictions on what you can use for training data. The same goes for websites, images, Wikipedia pages, it’s all just there for the taking.

This has been a huge driving force for the AI boom. These systems need lots of examples to recognize even the most basic patterns. That used to mean months of data entry, but now you can scrape everything you need from the internet in a matter of hours. And the people who made the mannequin challenge videos, they didn’t think they were encoding depth information. If the researchers hadn’t talked about their training system, it would feel like they’d done it all on their own.

– The remarkable thing
about AI systems is that even though they are built on a foundation of human intelligence, they regularly transcend that, and do something that surprises us or goes beyond what we thought was possible. One fantastic example of this is the AlphaGo program, which was designed by DeepMind, which was Google’s AI lab here in London. And in 2016 and 2017 it played and beat the human champions of the ancient board game Go. One particularly famous moment is now known simply as move 37. It was a move that was so unusual, so counter to human expectations, that the match’s commentators thought it was a mistake. But it wasn’t. It was a beautiful play, that completely undermined Lee’s match, and led to AlphaGo winning the game. And it was something that humans couldn’t teach. It was something that the machine had learnt by itself. Yes, it started from a foundation of human intelligence, but it went beyond that.

This, I think is where people get so excited by AI, we’re a long long way away from building computers that are as flexibly intelligent and sophisticated as humans, but we can still build algorithms and systems that exceed human intelligence, even in very specific domains.

– But that’s AI at its best. The flip side is when an
app needs a description of what’s in a photo, and the photo recognizing algorithm just doesn’t work. So you get a human being to fill it in, usually through a post on Mechanical Turk. That’s a very old trick, going all the way back to the machine that gave the site its name.

The original Mechanical Turk was this guy, a master chess-playing robot. Hundreds of years before there was anything we would think of as a computer. The Turk could beat most chess players, playing so well that people thought it was a technological marvel. But, really it was just a trick. There was a human being inside, hiding under the table and directing the moves from below. It was a human being, dressed up as a machine. A trick no one had thought of until then. And as Amazon can tell you, the trick still works.

Thanks for watching, I hope you liked the video. If you wanna know more about AI we did a whole video about what these changes look like at a social scale, whether AI’s destroying jobs, or gonna make everything free. So you can check that out here, or like and subscribe.
