this post was submitted on 21 Oct 2023
Technology
Wonder where this data can be found?
I'd be interested to know if my wife's info was in it.
Anyone have a magnet link?
https://breachforums.is/Thread-23andMe-Great-Britain-Originated-4M-Genetic-Dataset
Interesting. I expect that no significant papers would actually cite this information directly or explicitly, but I could certainly see the possibility that some researchers will be able to get grants by suggesting that they "know" that they'll get good results if they study such-and-such details of a population thanks to this. One possible silver lining of a breach like this.
No, you will not be able to get a grant by stating you’re going to do a population study using stolen data.
When you do research involving people or data about them, you have to go through your institution’s Human Subjects Board (or equivalent). They’re looking for things like informed consent and that the study population in particular (as opposed to humanity in general) will see some benefit and will not be harmed. Your proposal then goes through similar reviews by the grants committee, and will probably have been looked at by your department.
Even to access survey data that’s already been collected and has been used in hundreds of studies already, you have to jump through those very necessary and important hoops.
I can’t even see commercial researchers being allowed to use stolen data if they want their work published and accepted by the scientific community. There’s not even a grey area there - it’s just straight up unethical.
There was a study done by Facebook about a decade ago where they pushed negative articles to some users and positive ones to others and then looked at the emotional content of later posts by those people, finding a small but statistically significant correlation. They were excoriated in the literature for not securing consent and for running an unethical study that, for instance, could have led to episodes of depression or self-harm in parts of vulnerable populations. I’m not sure if the authors received a penalty for their work, but it violated scientific ethics pretty severely.
You have to go through training, sometimes multiple times per year, if you’re permitted to work with human subjects data, whether you’re conducting the study or using existing work. I could see accessing a cache of stolen data being a career-ending offense.
Which is the exact opposite of what I was proposing. I'm saying that you can get a grant to collect your own data more easily if you know ahead of time that by collecting the data you're going to find something interesting.
This is analogous to the legal concept of "parallel construction", in which police can make use of evidence that would not be admissible in court to direct their investigation towards finding other evidence that proves the same thing the original inadmissible evidence would.
No, you literally cannot. I’ve done this for a living. This is beyond the pale in scientific ethics and would be absolutely fatal for a career.
This is not the FBI or the NYPD. There is no court. There is a panel of your peers who have been through exactly all of those questions, and who consider this entirely morally offensive.
And the thing is that it’s not even needed. If you’re in a position to work with this kind of data, there are legitimate, documentable sources of the data that will be made available to you.
And you literally can’t sneak stuff in with parallel construction because you have to meticulously cite everything that you’re basing your research on. I don’t know how to be more plain than saying I would see a student expelled for this faster than I would for plagiarism. And now that I’m working more on the commercial side, working with stolen data would get you fired. There is a zero tolerance policy.
We have access to this level of data and more. If we need it, we will write a check for it and jump through the hoops to get it, and it will have gone through review for ethical research by people whose entire careers are grounded in studying scientific ethics so that we don’t repeat the mistakes of the past.
I’m sorry if I’m being a bit enthusiastic about defending this point, but it’s something that the western scientific community has quite honestly fucked up for centuries and it involves something that makes almost all of us extremely concerned about companies like 23andMe even existing. It’s a thing that we’re still figuring out, and that’s even with legal and licensed access to that data. This is like talking to Richard Stallman about Palantir.
Thanks!