Amazon and third-party services have been using smart speaker interaction data for ad targeting, in violation of privacy commitments, according to researchers at four US universities.
Academics at the University of Washington, University of California-Davis, University of California-Irvine, and Northeastern University claim “Amazon processes voice data to infer user interests and uses it to serve targeted ads on-platform (Echo devices) as well as off-platform (web).”
The researchers – Umar Iqbal, Pouneh Nikkhah Bahrami, Rahmadi Trimananda, Hao Cui, Alexander Gamero-Garrido, Daniel Dubois, David Choffnes, Athina Markopoulou, Franziska Roesner, Zubair Shafiq – describe their findings in a paper titled, “Your Echos are heard: Tracking, profiling, and ad-targeting in the Amazon smart speaker ecosystem.”
The ten academics say that smart speaker interaction – that’s you talking to the device – generates ad auction bids from advertisers that are as much as 30x higher than the bids would be without Amazon Echo speaker data. What’s more, they say, the way Amazon and Skills developers – makers of software integrated with Amazon’s Alexa voice assistant service – operate is often inconsistent with privacy policies.
That may not be a surprise to you; either way, the paper provides a technical dive into, and a thorough analysis of, how Amazon, Echo devices, Alexa, and adverts are all tied together.
To understand how Amazon and Skills developers handle audio data, the boffins created an auditing framework to evaluate how voice data gets collected, used, and shared. They did so because Amazon Echo smart speakers do not provide an interface to assess how data gets used and there’s no ready-made mechanism for understanding what happens to smart speaker data sent over the internet.
So the researchers created multiple fake personas with different smart-speaker usage profiles, and then simulated interactions to test statistical differences in amounts bid for audio and web advertisements. In this way, they claim they were able to infer the effect of smart speaker interactions involving the constructed personas.
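As a rough illustration of what testing for "statistical differences in amounts bid" can look like, here is a minimal sketch of a one-sided permutation test on mean bids. The article does not say which statistical test the researchers used, so the choice of test, the function name `permutation_test`, and every bid value below are assumptions made up for the example, not figures from the paper.

```python
import random
from statistics import fmean

def permutation_test(persona_bids, control_bids, n_permutations=10_000, seed=0):
    """One-sided permutation test: is the persona's mean bid higher than the control's?

    Hypothetical helper for illustration; not the method used in the paper.
    """
    rng = random.Random(seed)
    observed = fmean(persona_bids) - fmean(control_bids)
    pooled = list(persona_bids) + list(control_bids)
    n = len(persona_bids)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign bids to the two groups and recompute the difference.
        rng.shuffle(pooled)
        diff = fmean(pooled[:n]) - fmean(pooled[n:])
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (at_least_as_extreme + 1) / (n_permutations + 1)
    return observed, p_value

# Invented bid values, purely illustrative.
persona = [0.90, 1.20, 1.10, 1.50, 0.95, 1.30, 1.40, 1.05]  # persona that used interest-specific Skills
control = [0.10, 0.12, 0.08, 0.15, 0.11, 0.09, 0.13, 0.10]  # persona with no smart speaker interaction

observed_diff, p = permutation_test(persona, control)
print(f"mean difference: {observed_diff:.3f}, p-value: {p:.4f}")
```

A small p-value here would suggest that the gap between the two personas' bids is unlikely to be chance, which is the kind of inference the persona comparison is meant to support.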
Technically, the auditing framework involved setting up a custom Raspberry Pi router to record the network endpoints contacted by Amazon Echo and emulating an Amazon Echo by setting up the Alexa Voice Service SDK, in order to capture unencrypted network traffic.
This hardware and software setup was used to issue voice commands to an Amazon Echo in order to watch how the data was used for audio ads (via Amazon Music, Spotify, and Pandora), on the web for display ads (personas using a browser logged into an Amazon account and the Alexa web companion app), and on non-Echo devices.
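The endpoint-recording part of that setup can be pictured with a short sketch: given the hostnames a device was seen contacting, group them by domain and flag any that appear on an advertising/tracking list. Everything here is an assumption for illustration: the hostnames are placeholders under the reserved `.example` suffix, the blocklist is hypothetical, and the helper names are not taken from the researchers' framework.

```python
from collections import Counter

# Hypothetical list of advertising/tracking domains (placeholder names).
AD_TRACKING_DOMAINS = {"ads.example", "tracker.example"}

def base_domain(hostname):
    """Naive base domain: the last two labels of the hostname.

    A real audit would consult the Public Suffix List instead.
    """
    labels = hostname.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])

def summarize_endpoints(hostnames):
    """Count contacts per base domain and pick out the flagged ones."""
    counts = Counter(base_domain(h) for h in hostnames)
    flagged = {d: n for d, n in counts.items() if d in AD_TRACKING_DOMAINS}
    return counts, flagged

# Placeholder hostnames standing in for a router's log of contacted endpoints.
observed = [
    "api.voice.example",
    "cdn.voice.example",
    "pixel.ads.example",
    "sync.tracker.example",
    "pixel.ads.example",
]

counts, flagged = summarize_endpoints(observed)
print("contacts per domain:", dict(counts))
print("ad/tracking domains contacted:", flagged)
```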
The fake personas were configured to install and interact with Skills associated with their respective interests. These included: Connected Car, Dating, Fashion & Style, Pets & Animals, Religion & Spirituality, Smart Home, Wine & Beverages, Health & Fitness, and Navigation & Trip Planners.
The researchers also contend that over 70 percent of Skills fail to mention Alexa or Amazon, and that a mere 2.2 percent make their data collection practices clear in their privacy policies.
“Amazon’s inference of advertising interests from users’ voice is a clear violation of their policies and public statements,” the researchers state in their paper. “Amazon does not provide transparency in usage of data and thus cannot be reliably trusted to protect user privacy.”
The paper also observes that Amazon was granted a patent in 2018 titled “Voice-based determination of physical and emotional characteristics of users” that describes how the “current physical and/or emotional condition of the user may facilitate the ability to provide highly targeted audio content, such as audio advertisements promotions, to the user.”
Asked to comment, an Amazon spokesperson challenged some of the paper’s conclusions without citing specific inaccuracies. “Many of the conclusions in this research are based on inaccurate inferences or speculation by the authors, and do not accurately reflect how Alexa works,” the spokesperson told The Register in an emailed statement.
Umar Iqbal, postdoctoral researcher at the University of Washington and lead author of the paper, responded, “This is a broad statement without any concrete details.”
Amazon also insisted it does not sell data, which the paper does not allege. “We are not in the business of selling data and we do not share Alexa requests with advertising networks,” the US giant said.
Iqbal countered, with references to the study’s paper: “We find evidence of Alexa Skills directly communicating with advertising/tracking services (section 4.2). We also note that Amazon’s advertising partners sync their cookies with Amazon and bid higher than non-partner advertisers (section 5.5). A logical explanation for this behavior is that Amazon/Skills share/sell user interest data with their advertising partners. We do not claim that Amazon directly shares voice input/transcripts with advertising networks.”
That is to say, records of Alexa conversations are not handed over verbatim, though the nature of what is discussed is seemingly used to guide the kinds of adverts you’ll be targeted with.
Amazon’s statement continues, “Similar to what you’d experience if you made a purchase on Amazon.com or requested a song through Amazon Music, if you ask Alexa to order paper towels or to play a song on Amazon Music, the record of that purchase or song play may inform relevant ads shown on Amazon or other sites where Amazon places ads. Customers can opt out of interest-based ads from Amazon at anytime on our website.”
Iqbal noted: “This statement actually supports our findings.” ®
Updated to add
On Thursday, researcher Umar Iqbal contacted The Register to say that some clarifications had been made to the paper to avoid potential misinterpretation.
“Most notably, we did not state that Amazon shared raw voice recordings/transcripts with advertisers, but we did find evidence that Amazon processes voice recordings from skill interactions to infer user interests and uses those interests to target ads,” Iqbal said.