Three cheers for Amazon's human eavesdroppers
(Bloomberg Opinion) -- Alexa, are you really a human?
The revelation that a large team of Amazon employees listens to conversations recorded by the company’s digital assistant has exposed the contrast between the hype of artificial intelligence and the reality of the armies of underpaid humans that make the technology work in real life. It is these battalions that are leading Silicon Valley’s massive privacy invasion.
AI is supposed to be good at pattern recognition and natural language processing. However, it’s all but impossible to train a neural network to recognize speech or faces with certainty. Algorithms that have to interact seamlessly with humans need to be constantly retrained to allow for changes in slang, population movements that bring new accents, cultural phenomena and fashion trends.
That doesn’t happen by magic; the algorithms won’t find out about the latest pop sensation or TV series all by themselves. During last year’s soccer World Cup in Russia, the authorities used a sophisticated facial recognition system to exclude known hooligans from stadiums. It worked – until the final game, when members of punk band Pussy Riot rushed onto the field, dressed in police uniforms. They weren’t in the database.
For artificial intelligence to work, it needs constant human input. But companies selling AI-based products don’t want to tell customers about the role played by what Wired’s Lily Hay Newman has called their “covert human workforces” for two reasons.
One is that using thousands of people to annotate data collected from customers doesn’t sound as magical as “deep learning,” “neural networks,” and “human-level image and speech recognition.” The other is that people are prepared to entrust their secrets to a disembodied algorithm, in the same way as King Midas’s barber whispered to the reeds about the king’s donkey ears. But if those secrets risked being heard by humans, especially those with access to information that might identify the customer, it would be a different matter.
In the Midas myth, the barber’s whispers were picked up and amplified by the echo – coincidentally, the name of the Amazon device used to summon Alexa. The employees who annotate Echo audio recordings, and who help train the virtual assistant to recognize that “Taylor Swift” doesn’t mean a rush order for a suit, don’t see customers’ full names and addresses but do, apparently, get access to account numbers and device serial numbers.
That isn’t a big distinction – especially when it comes to private conversations involving financial transactions or sensitive family matters. These, of course, get picked up by Alexa when the digital assistant is accidentally triggered.
There’s not much difference between this level of access and that enjoyed by employees at the Kiev office of Ring, the security camera firm owned by Amazon. The Intercept reported earlier this year that, unbeknownst to clients, employees tasked with annotating videos were watching camera feeds from both inside and outside people’s homes.
Tellingly, the wording of Amazon’s response to the Intercept’s story was identical to the one it provided to Bloomberg. The company said it has “zero tolerance for abuse of our systems.” This kind of boilerplate response does little to inspire trust.
Amazon isn’t, of course, the only company that does this kind of thing. In 2017, Expensify, which helps companies manage employees’ expense accounts, hired workers on Mechanical Turk, the Amazon-owned labor exchange, to analyze receipts. Last year, the Guardian wrote of the rise of what it called pseudo-AI, and identified a number of cases where tech companies hired low-paid humans to generate training data for artificial intelligence.
The line between training and imitation is thin: to create the necessary dataset, people are sometimes needed to replicate the work expected of the algorithm, and this can go on for a long time. I doubt Facebook and Google will ever get rid of the tens of thousands of contractors who scan posts for offensive, sensitive or criminal content, because their algorithms will never be good enough to prevent scandalous failures without human help.
In principle, there’s nothing wrong with this human participation in AI-based endeavors. In fact, it’s how things should work if we are to avoid a cataclysm in the labor market. As Daron Acemoglu from the Massachusetts Institute of Technology and Pascual Restrepo of Boston University wrote in a recent paper, “the effects of automation are counterbalanced by the creation of new tasks in which labor has a comparative advantage.” Those “covert human workforces” are doing tasks that would not have emerged without AI.
The problem lies elsewhere. Companies working on AI projects should be honest about human participation. Facebook and Google already are: their moderators and quality raters do work similar to that performed by Amazon’s Alexa-training team.
Of course, openness will demystify AI, and perhaps curb sales of intrusive products such as the Echo. But many people these days will sacrifice privacy for convenience, so there will still be money to be made from these devices.
Regulators have a useful role to play here. They should make sure companies really anonymize their AI-training datasets and make it impossible to link sensitive data to actual people, rather than leaving that protection contingent on a company’s goodwill or enforcement practices.
It’s also up to the authorities to examine the pay and conditions of workers in these new, data-oriented occupations. These people are often treated by the tech industry as non-essential, unqualified and easily replaceable – but are doing a job with an emotional and psychological toll that isn’t well understood.
The reason these workers talk to reporters despite their non-disclosure agreements is that they are underappreciated, operating in a gray backwater of the much-glorified tech industry. Their important role shouldn’t be a dirty secret.