this post was submitted on 23 Dec 2023
144 points (90.0% liked)

[–] Fungah 9 points 9 months ago (2 children)

My own theory is that they tokenize key words and phrases with an AI so that they're not sending the actual audio data. Then it's stored in a form some AI can parse but isn't technically user data so they can skirt legislation around that.

A tokenized collection of key phrases, stored as text without delimiters, is going to be much, much smaller than audio or a transcript.
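To put rough numbers on that size difference, here is a back-of-the-envelope sketch. Every constant in it is an assumption picked for illustration (uncompressed 16 kHz 16-bit mono audio, 150 spoken words per minute, 10 matched keywords per minute, 2-byte token IDs), not a measurement of any real app.

```python
# Back-of-the-envelope size comparison for one minute of speech.
# All figures below are illustrative assumptions, not measurements.

SAMPLE_RATE_HZ = 16_000      # assumed: a common sample rate for speech audio
BYTES_PER_SAMPLE = 2         # assumed: 16-bit mono PCM, uncompressed
WORDS_PER_MINUTE = 150       # assumed: typical conversational pace
AVG_BYTES_PER_WORD = 6       # assumed: about 5 letters plus a space, ASCII
KEYWORDS_PER_MINUTE = 10     # assumed: words that match a tracked vocabulary
BYTES_PER_TOKEN = 2          # assumed: 16-bit ID, enough for 65,536 entries


def bytes_per_minute() -> dict[str, int]:
    """Return the estimated payload size in bytes for each representation."""
    return {
        "raw_audio": SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * 60,
        "transcript": WORDS_PER_MINUTE * AVG_BYTES_PER_WORD,
        "keyword_tokens": KEYWORDS_PER_MINUTE * BYTES_PER_TOKEN,
    }


if __name__ == "__main__":
    for name, size in bytes_per_minute().items():
        print(f"{name:>15}: {size:>9,} bytes")
```

Under these assumptions, a minute of raw audio is about 1.9 MB, a transcript is about 900 bytes, and the keyword tokens are about 20 bytes. Compressed audio would narrow the gap, but it would likely still be far larger than a handful of token IDs.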

[–] ben_dover 12 points 9 months ago

as someone who has played around with offline speech recognition before - there is a reason why ai assistants only use it for the wake word, and the rest is processed in the cloud: it sucks. it's quite unreliable, you'd have to pronounce things exactly as expected. so you need to "train" it for different accents and ways to pronounce something if you want to capture it properly, so the info they could siphon this way is imho limited to a couple thousand words. which is considerable already, and would allow for proper profiling, but couldn't capture your interest in something more specific like a mazda 323f.

but offline speech recognition also requires a fair amount of compute power. at least on our phones, it would inevitably drain the battery
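The limited-vocabulary point can be illustrated with a toy sketch. It works on text as a stand-in for audio, and the vocabulary and function name are hypothetical, so treat it as a picture of the idea rather than how any real recognizer works: anything outside the fixed word list is simply never captured.

```python
# Toy stand-in for fixed-vocabulary keyword spotting.
# A real on-device recognizer works on audio; this works on text for illustration only.

# Hypothetical tracked vocabulary; the comment above suggests a couple thousand words.
TRACKED_VOCABULARY = {"car", "mazda", "holiday", "mortgage", "pizza"}


def spot_keywords(utterance: str, vocabulary: set[str]) -> list[str]:
    """Return the words in the utterance that appear in the fixed vocabulary."""
    words = (w.strip(".,!?").lower() for w in utterance.split())
    return [w for w in words if w in vocabulary]


if __name__ == "__main__":
    print(spot_keywords("I'm thinking about buying a Mazda 323F.", TRACKED_VOCABULARY))
```

Here "mazda" is picked up because it is in the list, while the model name "323F" is dropped, which matches the point that general interests could be profiled but specifics would be lost.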

[–] [email protected] 2 points 9 months ago (3 children)

That certainly would make the data smuggling easier. What about battery though? I assume that requires inference and at least rudimentary processing.

How would a background process do this in real time on a mobile device without leaving traceable evidence like CPU time?

[–] steveman_ha 6 points 9 months ago (2 children)

What if it's not streaming? What if it's just cached for future access, e.g. the next time the user opens the app (when network traffic spikes anyway)?

[–] [email protected] 3 points 9 months ago

Or plugs in their phone at night, bypassing energy use concerns?

[–] [email protected] 3 points 9 months ago

That’s possible too, and in general I’d think a foreground application currently in use alleviates most of the technical restrictions mentioned (read: why we never install FB).

But again, we must assume some uncommon device privileges, and we still haven’t solved the problem of the background energy usage required to record and/or process a real-time feed.

[–] BigPotato 3 points 9 months ago (1 children)

Cox also sells home automation bundles that advertise "smart" features like voice recognition, and those devices are always plugged into the wall.

[–] [email protected] 2 points 9 months ago (1 children)

Can it be implemented on a PC? They're often turned on, and people speak around them too. CPU activity is much harder to trace when there are a lot of different processes running. Someone could blame their phone while it's actually a nearby PC that's listening.

[–] [email protected] 4 points 9 months ago

Yeah, outside mobile devices I imagine there’s a lot more leeway, technically speaking. I’d be far more inclined to suspect a smart TV or a home assistant appliance like Amazon Echo, for example. And certainly there are plenty of PCs out there that are 100% compromised.

But it’s the phone that people often think of as eavesdropping on their conversations. The idea is stickier perhaps because it’s a more personal violation. And I wouldn’t put it past data brokers by any means. They would if they could. I’ve just yet to hear a feasible explanation of how they can without being caught. Hence my doubt.