I know this was hashed out in the other threads a bit, but can someone please explain to me why folks are so up in arms about this, compared to, say, studies that scrape user data without consent (something the IRB allows all the time by saying that no human subjects are involved)? Is it simply because there is no visibility into that practice (i.e., no email sent)? Scraping user data from public profiles, aggregating it into a model, and publishing a paper or whatever -- storing and keeping their user data -- seems demonstrably more invasive to individuals than an email quoting a statute.
I agree that the deception was unnecessary, but that's it. It doesn't feel any wronger than that.
Especially because these researchers really were acting in "meta" good faith trying to probe the privacy ecosystem, I fear there may be a chilling effect. Consumers deserve privacy rights and privacy knowledge in the asymmetric surveillance economy we find ourselves in, IMO.
Ethical guidelines on research exist to prevent adverse impacts on participants. This study had adverse impacts: fear, stress, and the time and money spent consulting lawyers. It was therefore de facto an unethical research study. My speculation as to why the protocol slipped through the IRB cracks is that the language used in the study proposal (at least the part made public) dehumanized the protocol by referring to "websites" rather than the humans who would be responding to the inquiries.
The IRB ruled this was not a human subjects piece of research, but that is contradicted by the deception protocol. Deception was justified as necessary because people's behavior might change if they knew it was a research request. That acknowledgement made it implicit that human behavior, and potential changes to it due to the experiment, was a core factor in the study -- ergo, it had human subjects. Behavioral research on human subjects is required to go through a much more rigorous IRB oversight process precisely to anticipate and mitigate potential adverse reactions.
Some people are focusing on the deception, but that is, under some circumstances, allowed by research ethics. The more serious problem was the adverse impact, which, again, is the primary motivation for why we now have laws and regulation-mandated IRB processes to make sure it doesn't become an issue.
I wonder if the IRB ruled this way because of the assumption of algorithmic responses for requests like DMCA takedown notices. I can imagine that even for GDPR/CCPA requests, there is still no human involved for websites like Google, Facebook, YouTube, and other major sites that are primarily operated through automation. If there are no humans involved, then there are no humans to have an adverse impact on.
But as you said, the researchers must have suspected that responses would be made by humans, or else the email would have included the fact that it was a study.
> Ethical guidelines on research exist to prevent an adverse impact on participants.
Not relevant in this case, because (1) it's clearly not human research[1] and (2) all kinds of other clearly-non-human experiments still have adverse effects on humans tangentially involved, so that's not unique to human research either. When scientists were testing the first particle accelerators, they caused a lot of stress for some people who worried that they would destroy the world -- does that mean those tests were human experiments? (Clearly not.)
I may be wrong, but I'm going to guess that you don't work with IRBs all that often. I do. I asked a colleague for their thoughts on this, and they were unequivocal in believing this should have been flagged as involving human subjects if the approving IRB had all of the details of the protocol. Their guess is that the proposal's presentation -- perhaps very innocently -- did not fully convey the details that would have resulted in oversight of the research as a human subjects project. These are experts on the nuances of these laws. Of course, experts may disagree, so that reference is not definitive. But it is suggestive that dismissals from those here on HN who are armchairing this with no -- or minimal -- experience with this sort of research are not based on a full understanding of the IRB and research ethics ecosystem. One or two IRB applications for a research project will not convey the understanding needed to evaluate research protocols under the laws and regulations involved.
As for definitions of human subject, it seems like you are overlooking part of the regs you want to use to support your argument. Per the link in the comment you cite, it's human subjects research if the research obtains data "through intervention or interaction with the individual, and uses, studies, or analyzes the information." That was clearly part of the research in this case: it involved interaction with individuals. I'm not sure how you can overlook this part of the definition when making your determination.
You should also be aware that regulatory laws do not stand alone: the government provides explanatory statements of interpretation and policy guidelines. The granddaddy of these in this case, and a fundamental guiding document for anyone on an IRB, is the 1979 Belmont Report, which guided the development of modern regulated IRBs. It is pretty clear: participants should "undertake activities freely and with awareness of possible adverse consequence." It's understandable that plenty of folks here on HN are not familiar with the body of clarifications and case law that guides interpretation of the law's statutes, but researchers, and especially IRB members, are supposed to know this stuff inside and out.
Next: deception to avoid advance consent is allowed in only very limited circumstances, and it is a significant red flag that an IRB needs to bring more scrutiny to the research protocols. I don't know how you (or the IRB, if it was doing its job properly) can say that human subjects were not involved if the research protocol relied on the deception of human subjects to observe their behavior.
All of which is somewhat beside the point: this entire research ethics process, independent of specific statutes, exists to prevent adverse impacts on human subjects. This research had adverse impacts, and so de facto it involved humans in research that should have gone through the full IRB human subjects oversight process.
The post here on Hacker News mentioned the downsides for one recipient. That person was stressed out thinking that they were about to be sued. They considered retaining counsel, which could have cost them a few thousand dollars, in order to get ahead of the threat. It didn't come to that, so it's a "what if", but I could see myself trying to retain counsel too. Hopefully, a lawyer would have talked me down and advised me to wait it out. On the flip side, they may have offered to respond on my behalf (which would cost money).
I would not respond to such an email myself, ignoring it until I was able to defer to an attorney.
I publish a simple personal blog and I worry about the _worldwide_ legal implications of doing so. As one example, I have some old information about making model rocket fuel at home. At the time I had carefully reviewed U.S. law and knew how much I could legally make and have in my possession. Then I got questions from people in other countries and I got spooked. What if I break a law somewhere else?
I assume that I’m breaking other countries’ laws all the time, say, by criticizing the actions of their governments. I don’t worry about that. I’m much more worried about, say, CCPA compliance while living and working in California. (Not that I’m especially worried about it. My personal projects don’t meet any of the criteria which would make it apply to me.)
The problem for people outside the USA is that this country has repeatedly demonstrated the ability to enforce its laws abroad, for example in Europe.
I would not be worried about, say, a Sri Lankan privacy/blasphemy law, but a US court can take down my email, website, and important and less important accounts, starting with my HN, Gmail, and GitHub accounts.
Yeah, me too. I don’t collect stats on visitors anymore (using Google Analytics for example) because I now understand the privacy implications of doing so. I do use a simple impression counter but I capture no information (not IP, not browser, nothing). I definitely think about the CCPA and ADA laws, but I’m relatively sure they don’t apply to me. Still, I certainly think about them.
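For what it's worth, a counter like that can be genuinely tiny. Here's a minimal sketch (the file name and function are my own illustration, not anything from an actual site): it increments a single integer on disk and deliberately never touches the request, so there is no IP, user agent, or cookie anywhere to leak.

```python
# Minimal privacy-preserving impression counter (illustrative sketch).
# It stores only a single running total and reads nothing from the
# visitor's request: no IP, no user agent, no cookies, no timestamps.
from pathlib import Path

COUNTER_FILE = Path("impressions.txt")  # hypothetical storage location

def record_impression() -> int:
    """Increment the total impression count and return it."""
    count = int(COUNTER_FILE.read_text()) if COUNTER_FILE.exists() else 0
    count += 1
    COUNTER_FILE.write_text(str(count))
    return count
```

You'd call `record_impression()` from whatever handler serves the page; since the function takes no arguments, there's structurally no way for per-visitor data to end up in storage.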
I personally use a self-hosted analytics app so I can still get some useful feedback without sharing my visitors’ data. I get pretty graphs, and my visitors get to keep their privacy.
If the EU wants an extra-territorial legal framework, why can’t a US website owner do the same in their TOS? That’s a perfectly legal thing to do in the US.
Who knows? I can imagine that an innocent picture of uncovered legs may be illegal in some religious states, but do you have to worry about it? Is that even a thing?
(I’m aware of the chance that you may visit that country some day and find out that you’re a wanted criminal, but I’m not sure if that applies to non-felonies, world-legal-wise.)
Scraping data is not imposing work, worry, and cost on additional people.
The victims of scraping are not going to do any additional work unless the scraped data is used irresponsibly, but that is separate from the act of scraping.
This email required people to do work and caused worry due to the legal threat that the email tried to lead people to believe was applicable to them. They may have had cost if they called a lawyer and it definitely took their time.
Scraping -> no work forced upon victims. That email -> work forced on unwilling victims.
Is there something I’m missing? Other people, including that poster, aren’t reaching this same conclusion, but it seems very apparent to me.
Well, the argument of the GP is that "extra work" is not the only possible form of harm. Compare the harm of extra work and stress due to this email to the harm of having your privacy violated by large, publicly-scraped datasets that include your personal information. For example, once your Twitter post is collected in a "posts of Twitter users about X political event" dataset, it's now impossible for you to ever delete that post, which could be harmful for you in the future. It's unclear whether one type of harm is categorically worse than the other.
Part of it was that they did no (or poor) screening. They got their list of target sites from a research list of the most popular websites. I got a letter, and my little not-for-profit, not advertised, purely-for-fun website was around number 350,000 on that list. First, I sincerely doubt my site is even that popular. Second, if I got the mail, so did lots of people in a similar situation.
They weren’t spamming Fortune 500 companies. They were spamming a huge number of single-person sites that aren’t subject to the CCPA at all and who certainly don’t have legal departments to ask about it.
What is the difference between 100,000 individuals emailing 3-5 websites on that list, with their real identities, asking for things to be deleted (such that all 350k sites are covered), and the situation here, ignoring the deception for a moment (unless that is the only issue)?
Could this be a moment of cultural learning for everyone? That's kind of how I am looking at it, frankly, but I am open to being wrong. That is, perhaps small entities will learn, in one or two instances, to just ignore this kind of thing?
You seem extremely unconvinced that any harm was done to the people who were sent scrambling by this alarm. It's as though no matter how convincing the email was, no matter how much of the recipient's time was wasted, no matter how many thousands of dollars they spent on lawyers, you ascribe all blame to the recipient for not having realized they were being deceived — and ascribe no blame whatsoever to the email's author for being deceitful.
This whole discussion was had in the old thread, and there was one person who used the same rhetorical device of belaboring the same question over and over again. It was tiresome.
I should have been more clear, so let me correct that. I am convinced. I agree that harm was done, and I suffer from generalized anxiety disorder myself, so I empathize with the panic attacks that people experienced.
It is because I believe that harm was done, but also because I am a privacy nut myself, that I am trying to, for my own sake, characterize how I should approach sending emails like this in the future. The study may not go on, but individuals still will send these emails as long as CCPA/GDPR exist. (Just to add some color: it's my anxiety which is causing me to want to delete everything from the internet. If there's minimal info about me online, I can rest easy. It's why this is a throwaway that I will abandon shortly.)
Reading everyone's thoughts is what changed my mind. I now understand that I underestimated the emotional and legal effects CCPA/GDPR requests can have on small website operators, and I will be more judicious in the future (as this study should have been) in my pre-filtering and wording. Reactions like kstrauser's (elsewhere in thread) were initially surprising to me (perhaps because of the faceless nature of the internet), so I hope you take my about-face as genuine.
Where do you think this balance lies? I still believe consumers, in general, should have the right to ask those who hold their data about their processes; to be given a copy of it; and to have it deleted upon request. And further, in general, I think these interactions are the kinds of things that researchers might legitimately want to study. I found your other comments to be thoughtful, so I am curious what you think explicitly.
Based on reading https://news.ycombinator.com/item?id=29611139 the other day, my impression is that, for a small website operator, the email template used potentially threatening language in the line "I look forward to your reply without undue delay and at most within 45 days of this email, as required by Section 1798.130 of the California Civil Code."
There is some discussion that for large websites or government entities this kind of language may be necessary to communicate the sincerity of your request, but lone operators doing their best probably don't have any sort of legal team to ensure they follow the letter of the law. From my perspective, maybe it's best to approach a small website with a more casual tone -- that you just want your data gone -- and "make it serious" only if the request is ignored or the response is noncompliant.
What I hope to see is a popularization of business models where no personal data is kept, because that is less expensive in terms of compliance costs, more beneficial to the consumer, and hopefully more attractive to the consumer as well. We can see the dawn of a new age in other comments in this thread where people talk about not collecting any data on their blog visitors!
Right now it is difficult to build businesses under such models because most institutions, frameworks, and tools shunt you towards hoarding all data. Over time, I hope that better tools will emerge so that building better businesses becomes easier.
There are people elsethread bemoaning not only the unfortunate artificial costs created by this email experiment, but the compliance costs of privacy-protecting legislation in general. But businesses should be paying those compliance costs, because it's an iron law at this point that business-collected personal data will leak yet individuals bear the costs when the data leaks.
To my mind, this experiment went awry in the same way that privacy-abusing businesses go awry: the organization reaped a benefit while the externalized costs were borne by outside individuals.
However, I'm inclined to forgive the researchers, as I think they will learn from this and find ways to collect data which cause less alarm and imposition. Similarly, I would hope that individuals pursuing their rights under privacy legislation would start off gently but firmly, giving small entities time to adapt. But simultaneously, I have an appreciation for those with bulldog tenacity who go after recalcitrant businesses (e.g. the heroes who have gone after Equifax in small claims court).
> how I should approach sending emails like this in the future
Don't.
It's that simple.
> I look forward to your reply without undue delay and at most within 45 days of this email, as required by Section 1798.130 of the California Civil Code.
Thing is, I would cheerfully process a deletion request, even though I don’t have to because I don’t meet the criteria to be subject to the CCPA. For me, part of the deception was quoting a law and incorrectly saying it obligated me to reply to their information request by a certain deadline. The law says no such thing, and getting a letter from someone who quotes specific legal codes almost never ends with “…and then they went out for dinner, newly found lifelong friends.”
It may have felt like a deception, but there's plenty of bad legal takes on the Internet. For this to be a deception, the sender would have to know for certain that the statute doesn't apply in this case.
Could be they did, but then I missed that. Just as likely, they genuinely thought this correct.
The deception was hiding that this was a study, not a genuine request. Lying about or misrepresenting the goals of a study is deception research. There are strict guidelines for that... in the "soft" sciences (APA guidelines). CS is a bit behind and seems intent on reinventing the wheel :s.
First, this is an altogether improbable scenario (the odds of winning the lottery are good compared to this scenario ever happening).
Site traffic follows a power law. A site at 200k down the list is almost never going to get such attention. It is not someone's full time job. A uniform density of information requests is incredibly unlikely and places a very unfair burden on the smaller sites.
Second, the difference is pretty obvious: 100,000 individuals seeking a legal right implies a potential benefit to a large number of people. 1-5 people abusing the system implies a bad faith actor whose benefit is pretty minimal.
It is about the impact on the humans involved. Imagine a study where you put police lights on your car and drove behind people on the highway to see how they would respond.
The issue here was not primarily about deception. It seems mainly to be that (a) at least one recipient interpreted their mail as a legal threat, and (b) it was a mass-mailing. Spend a minute thinking through the implications if that were true, and you get a firestorm.
I suspect visibility plays a role in the comparison you're making; out of sight, out of mind and all that. But much more importantly, someone sending you what you think is a legal threat is a lot more salient.
Interesting. Ok, so let's say the deception wasn't the problem, suppose for the moment. Would the study have been more palatable if the researchers had more properly vetted the email list to ensure, say, >95% or perhaps even 100% were corporations that did fall under the law?
The requirements to be subject to the CCPA are any of: have a gross annual revenue of over $25MM; buy, receive, or sell the personal information of 50,000 or more California residents; or derive 50% or more of your annual revenue from selling California residents’ personal information. Yes, I believe that if they had emailed only sites for which one of those was true, I would have no issues with the study.
The requirements to comply with the GDPR are much, much stricter and have a much more outsized effect on small, non-commercial site operators. There are no exceptions to the GDPR for non-profits or non-corporate entities (except a limited carveout for "household processing" that, AIUI, has been interpreted very narrowly by the courts). I do not think the GDPR is strict enough in this instance, and I think it would have outsized harms on small and non-corporate operators to email them in this way if your only criterion is "could technically be subject to the GDPR in some possible world".
I operate a website that likely meets one of the requirements to be subject to the CCPA, and it received the emails from the research study. We have basically no revenue or staff. I didn't appreciate being lied to (about who was sending the message), being threatened (with legal enforcement), wasting my time (the study was scrapped), and being used for research without consent (the fact that this happens all the time doesn't excuse it). If they wanted to know our CCPA/GDPR policies, they could have simply asked. I also received emails from the study at two other domains I own, and one that mentioned a domain I don't even own, none of which probably matter for the CCPA -- all of which made me think that this was a scam or a legal trap rather than something to take seriously.
Deception is a necessary part, but not the key. The key is the potential for distressing a real human being. The problem is that we live in a legal society where everyone is at risk of life-altering legal consequences.
Oh, our society, especially America's, is overly litigious. I agree.
But, pushing back a bit (in good faith), do you think asking an entity for your data, or asking them to delete it, should really be considered unusual and panic provoking? I said in another comment the same thing, but do you think this could be a moment of cultural learning?
I recall seeing Ralph Nader speak at a fundraising event 20 years ago and asking the crowd "how many people have actually tried to sue someone?" and in a room of hundreds only a few hands went up.
And a year ago when I took my landlord to small claims it was insane how complex the process was and how many paperwork pitfalls are in the way to disqualify you. I remember sitting on the half-day zoom call and watching case after case get thrown out because plaintiffs "forgot to file proof of service" or whatever. I'm generally good with paperwork and still nearly missed out.
There may be some people in America who are overly litigious but for the general population the legal system is wholly inaccessible.
It doesn’t matter. This isn’t a case where an individual would be suing. This is the government regulation coming down on someone after being flagged by “a victim”.
In a perfect world, I do not think it should be stressful, but we don't live in that world. I think a stress response is reasonable, given the risk of legal consequences.
Perhaps it is a learning moment, but I think the lesson should be to consider the impact of these kinds of studies.
I'm sure it is a learning experience for bloggers as well, and some of them will learn that hosting a blog is not worth the legal risk and take theirs down.
The fact that everyone violates the law in some form, and that anyone with sufficient will and resources could ruin a life with legal proceedings, is why we have the concept of standing in American law. It acts as a filter so that only someone with skin in the game can bring suit. It is one protection against abuse, and it is why laws that give anyone standing, like the Texas abortion ban and forthcoming California gun legislation, are problematic.
You are translating "legal threat" into "asking for data". And your 'learning' comment makes me think this is a cause for you. That's fine, and I even applaud what I take to be the motivation behind it.
But,
- That does not make one into the other. Misinterpretation or no, the researcher (who was being deceptive, remember) is responsible for how the message was written. I don't know about you, but I don't usually end my polite requests with references to counterparty legal responsibility. When someone starts trying to sound law-talky, it is in no way paranoid or unreasonable to become concerned about what they might be up to.
The problem here is not that USians enjoy suing each other, or that people and businesses underutilize data protection laws. The problem is that an academic study was performed in a way that caused panic in this, our imperfect world (and object of study).
- I also find the idea that an academic study should (also? or primarily?) be an instrument of "cultural learning" deeply troublesome. I'd hope that IRBs would smack that sort of thing down.
Demanding a subject to actively participate in your study upon pain of vague and mostly incorrect legal threat is ethically wrong. Passive participation (like scraping) without consent is morally wrong, but since it doesn't cause undue distress to the subjects, it is not as big of a story.
The IRB in this case didn't consider this ethically suspect because "websites aren't people". And yet the study disproportionately targeted small websites where there is, in many cases, only one person involved.
Because the end of the email (wrongly in most cases) demanded a response by law and implied they were open to legal action, which caused a bunch of people to hire lawyers to check into their liability.
>The controller shall provide information on action taken on a request under Articles 15 to 22 to the data subject without undue delay and in any event within one month of receipt of the request[1]
The legal obligation may not have applied in this case, but it absolutely exists. If someone submits a request to you for their data, you are legally obligated to respond.
> I look forward to your reply without undue delay and at most within 45 days of this email, as required by Section 1798.130 of the California Civil Code.
First, the CCPA doesn't apply to my site. It's non-commercial, has many fewer users than required to invoke the CCPA, and zero revenue. No provisions of the CCPA require me to do anything.
Second, the questions were about how I'd handle a CCPA request, and weren't actually a request at all:
> 1. Would you process a CCPA data access request from me even though I am not a resident of California?
> 2. Do you process CCPA data access requests via email, a website, or telephone? If via a website, what is the URL I should go to?
> 3. What personal information do I have to submit for you to verify and process a CCPA data access request?
> 4. What information do you provide in response to a CCPA data access request?
The CCPA doesn't obligate anyone to explain their internal processes. It obligates covered entities to respond to the requests themselves, but not to random drive-by questions.
So basically, that sentence was completely wrong. The CCPA doesn't apply to me, and even if it did, the law doesn't say what the researchers claim it did.
Why isn't this the story? It doesn't even have to be about ethics which nobody can seem to agree on. Sounds like the researchers were simply wrong.
So then the problem actually is that they misinterpreted the law. If someone misinterpreting the law can cause such stress and waste such time, shouldn't society safeguard against this?
More like the researchers need to take classes on legal jurisdiction. They seemingly believe that both laws have jurisdiction over everyone in the world, including countries and states that don't have such laws, which confuses recipients, since the email contains legal language.
The researchers created this issue because they didn't understand (or try to understand) the laws, nor did they screen their statements. The liability is not on the law; the liability falls on the researchers, especially regarding the "human subject" question. Therefore, the researchers are likely in violation of their university's IRB rules. The legal language forces people (even those to whom the laws don't apply) to respond, which in turn violates IRB ethics because those people did not consent to this research. "Forcing" them to respond to research they never consented to will run afoul of the IRB.
You shouldn't lie to people to trick them into collecting data for you without at least considering the impact on those people.
That's nothing like web scraping. (Though IMHO web scrapers should also use an honest User Agent so if website owners have a problem or question or want to block it, they can)
It is wronger than the deception because the PI, Jonathan Mayer, is not just a run-of-the-mill academic focused on "publishing a paper or whatever." This is an activist with an ax that won't grind itself. Reviewing his work mentioned on Wikipedia, I'm impressed by and appreciate the contributions Mayer has made. Mayer can't be unaware of the problems with the approach.
I personally know Jonathan and hugely respect his work.
I could believe that because he is an actual lawyer it was harder for him to imagine the panic that recipients with no understanding of the law would experience. But I think it is more likely that the response was a bit of a fluke. Way stranger stuff has been done by security and privacy researchers with the go-ahead from their IRBs. This feels to me like a methodology that isn't universally agreed upon, but is not especially uncommon, that happened to trip a response from the internet. The conclusion is more that people should not take the existence of similar research as an indication that the broader community is okay with these methodologies.
I suspect Mayer's work is in part preparatory to lawfare intended to force websites to pay for lawyerly services. The letter is akin to a fire insurance company knocking on doors while carrying a torch.
"Of all tyrannies a tyranny sincerely exercised for the good of its victims may be the most oppressive."
He's got a PhD and a JD from Stanford and has chosen a faculty position and has done a nontrivial amount of unpaid work for various privacy rights organizations. He obviously isn't motivated by money.
Frankly, you are great at knocking a strawman down. Jonathan Mayer likely has some motivation for those efforts. I made no claim as to whether that motivation was remunerative or not.
Do you have an alternative hypothesis of a motivation other than preparation for a "public-interest" lawfare campaign?
Actual legitimate research to understand existing privacy legislation, which can be used by policymakers to iterate and ensure that legislation is effective without being wasteful.
He previously worked in a senator’s office so I do suspect he knows how the sausage is made. And yeah, the staff writing bills do look at this sort of material. It is just one part of a bigger picture but it isn’t just throwing research into a void.
The wording of the main driver of the experiment, their especially bad emails, leads website operators to think there is a problem where there is none. This, on top of the research being entirely devoid of consent between the human parties involved, makes it a _very_ bad study -- one that could well cost both the university and the research team money if some of the "subject" parties actually had to go get a lawyer to look at those shoddy emails.
In better studies, what is supposed to happen is: you propose taking part in the experiment, you get a signed agreement of some sort, and only then do you actually start experimenting. What happened here is more like some kind of YouTube prank than a useful information-gathering procedure.
Scraping public data doesn't result in compelling another person to work under a false premise. Sure, you could argue that scraping introduces load that may draw an operator's attention... but the comparison is a pretty big stretch.
How these things pass board review, I don't know... it seems pretty obvious to me that creating work for somebody who didn't volunteer for it is, at best, antisocial behavior.
In US legal code there is actually a definition of a human subject in https://www.hhs.gov/ohrp/regulations-and-policy/regulations/... (EDIT: to clarify this is a guideline for federal researchers and to my knowledge is not legally binding on private institutions, but seems to be used as a basis for private IRB policies):
"""
(e)(1) Human subject means a living individual about whom an investigator (whether professional or student) conducting research:
(i) Obtains information or biospecimens through intervention or interaction with the individual, and uses, studies, or analyzes the information or biospecimens; or
(ii) Obtains, uses, studies, analyzes, or generates identifiable private information or identifiable biospecimens.
(2) Intervention includes both physical procedures by which information or biospecimens are gathered (e.g., venipuncture) and manipulations of the subject or the subject’s environment that are performed for research purposes.
(3) Interaction includes communication or interpersonal contact between investigator and subject.
(4) Private information includes information about behavior that occurs in a context in which an individual can reasonably expect that no observation or recording is taking place, and information that has been provided for specific purposes by an individual and that the individual can reasonably expect will not be made public (e.g., a medical record).
"""
The argument is that scraping of public data, already recorded by data systems for general (e.g. not specifically medical) purposes, is neither intervention, interaction, nor private information.
On the other hand, IMO the researchers here clearly interacted with their subjects. While the email was sent to a privacy@ address, not only are emails different from HTTP GET in how likely they are to be read by humans, but this went a step further and implied legal action would be forthcoming unless a human replied to the message. That's interaction. That makes the recipient a human subject.
(IANAL and the above is not legal advice.)
EDIT 2: I've had the pleasure to meet one of the researchers here. They are a staunch defender of online privacy, and I believe the team sincerely wanted to measure how effectively businesses are adapting to the changing winds beyond their legal obligations. But I also think the team, and the Princeton and Radcliffe IRBs, should have done more to consider the impact on the people who operate these businesses themselves. I'm sad and disappointed that the systems in place didn't catch this.
Your question is essentially whataboutism. Both things can be wrong. We can care about this instance without diluting the conversation talking about something else that is also bad.
It's not intended to be whataboutism (sorry about that, I edited this in to clarify) -- I agree that the deception was wrong. But there seems to be something about this particular event that is riling people up, and that's what I am getting at. I am not trying to whatabout, to be super clear.
To clarify. I don't think people would be riled up about individuals sending out these emails. Individuals are required to be legal, not 'ethical'.
The people who are riled up believe that University studies should be performed ethically. They know that IRB's exist to prevent researchers from doing unethical, but legal, things. In this case, they feel the harm caused should have been prevented.
Scraping data silently doesn't cause stress/harm to the participants directly, as they are unaware of any potential threat.
It's not "human experimentation should be banned" its "human experimentation should be heavily scrutinized to prevent harm to participants as much as possible. And definitely never cause harm to unwilling / unwitting participants".
What bothers/riles me is that there doesn't seem to be a consistent ethical framework applying to these complex situations. Of course things should be ethical but ethics aren't defined as “whatever people on HN and Twitter feel like isn't slimy”.
I believe the real issue isn't the research ethics per se, but rather pent up frustration on the larger topic. I posted this in one of the original threads:
> We have also received consistent feedback encouraging us to promptly discard responses to study email. We agree, and we will delete all response data on December 31, 2021.
I wrote one of the blogs posts that got linked here on HN, and I have some strong feelings about that. None of them are joy, though. I think it’s good and appropriate that the study is deleting all the data; since it was collected by misleading methods, I don’t think it was valid. I’m not happy that a study covering an important subject, and led by researchers who had good motivations, went so far off the rails in the first place that it had to be axed.
Ignoring the ethical concerns, all the data they collected was completely worthless, because many of their subjects were contacting each other and responding with the knowledge that it was a mass email sent under a variety of presumably fraudulent names.
It definitely was valid. Probably the most valid data you're going to get if you want to test for this thing.
Your emotions are getting in the way of your logic.
I personally don't see what the problem is. People should be allowed to send whatever emails they like, it's up to you to reply to them. If they had sent emails out asking how everyone's day was going, would you get upset? What about how many employees they had?
Researchers are held to a higher ethical standard than random people because experimenting on people is morally dubious unless you follow strict guidelines.
I feel like there is a lot of hyperbole wrt the harm that was done by this study, but on the other hand I think it's clear researchers shouldn't have free rein to manipulate people as they see fit just because it's through email.
I'm not talking about moral standard when I talk about the first sentence. I'm talking about data. And this data is flawless.
They shouldn't be allowed to manipulate people, but that doesn't discredit the results.
The same way it would be very bad for us to put normal, average people in control of jetliner cockpits as a scientific test to see if they can fly them properly with guidance. But if that managed to happen, the data would still be valuable.
I wouldn't call sending one-off emails to people to be manipulation.
Given how widespread word of this thing got, how could researchers possibly distinguish responses to their email that were from people who were not aware it was research, versus responses from people who had become aware what was happening?
The same goes for people who didn't respond. Did they not respond because they heard about this being research, or did they not respond for other reasons?
This data is the opposite of flawless, it is poisoned, and any attempt to draw conclusions from the responses they got would be junk.
>Given how widespread word of this thing got, how could researchers possibly distinguish responses to their email that were from people who were not aware it was research, versus responses from people who had become aware what was happening?
By limiting results by time to anything received before everyone found out? It's pretty easy to set a time period for that. It has the benefit that people who find out will not want to participate or will complain.
>This data is the opposite of flawless, it is poisoned, and any attempt to draw conclusions from the responses they got would be junk.
Nope. Even just limiting it to 48 hours would provide great data.
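Hypothetically, if the responses were timestamped (which any mail pipeline would give you), the cutoff described above is trivial to apply. A minimal sketch; `within_window`, the sample data, and the 48-hour figure are illustrative, not from the study:

```python
from datetime import datetime, timedelta

def within_window(responses, sent_at, hours=48):
    """Keep only responses received within `hours` of the send time.

    `responses` is an iterable of (received_at, body) tuples; anything
    arriving after the cutoff is discarded as potentially contaminated
    by word of the study having spread.
    """
    cutoff = sent_at + timedelta(hours=hours)
    return [(t, body) for t, body in responses if t <= cutoff]

sent = datetime(2021, 5, 1, 9, 0)
replies = [
    (datetime(2021, 5, 1, 15, 0), "here is our process"),      # 6h later: kept
    (datetime(2021, 5, 4, 9, 0), "we heard this is a study"),  # 3 days later: dropped
]
kept = within_window(replies, sent)
```

Of course this only removes contamination among responders; it can't tell you why the non-responders stayed silent.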
I suspect it was far more than 1000 emails, because my small company got one. It scared us and wasted significant time (eg as we carefully read the relevant CA law). At the time we concluded that it was highly likely to be a phishing scam and archived the message with no response. In addition we decided not to respond because the person asking the question was not a paying customer. I definitely did feel threatened by the way the email was worded.
That, with my (conservative, I believe) estimate, would make US$6-9,000,000.
Let's make it US$10,000,000 that was "wasted".
Sending 200-300,000 such emails makes no sense whatsoever; AFAICT a study with a random enough sample of 1,000-10,000 (leaving aside the ones with 12, 18 or 33 participants) should give accurate enough results.
In the good ol' times (snail mail), sending 200-300,000 letters would probably have cost US$200-300,000, and I doubt the university (or its IRB/whatever commission) would have approved that kind of expense.
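The sample-size point above checks out with a back-of-the-envelope margin of error for an estimated proportion (worst case p = 0.5; the sample sizes are the ones from this thread, not the study's actual design):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% CI half-width for a proportion estimated from n samples."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (1_000, 10_000, 300_000):
    # n=1,000 is already ~±3 points; 300,000 buys less than ±0.2
    print(f"n={n:>7}: +/- {margin_of_error(n):.3%}")
```

Going from 1,000 to 300,000 emails shrinks the error bar from roughly ±3% to roughly ±0.2%, which is hard to justify against the cost imposed on recipients.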
Apparently this was the last line, as reported by the honeypot blogger: "I look forward to your reply without undue delay and at most within 45 days of this email, as required by Section 1798.130 of the California Civil Code."
To Whom It May Concern:
My name is … , and I am a resident of Paris, France. I have a few questions about your process for responding to General Data Protection Regulation (GDPR) data access requests:
Do you process GDPR data access requests via email, a website, or telephone? If via a website, what is the URL I should go to?
What personal information do I have to submit for you to verify and process a GDPR data access request?
What information do you provide in response to a GDPR data access request?
To be clear, I am not submitting a data access request at this time. My questions are about your process for when I do submit a request.
Thank you in advance for your answers to these questions. If there is a better contact for processing GDPR requests regarding zylstra.org, I kindly ask that you forward my request to them.
I look forward to your reply without undue delay and at most within one month of this email, as required by Article 12 of GDPR.
Sincerely,
That's it? I just don't see how this is so burdensome even if you don't have a data deletion process in place (i.e. probably aren't complying with CCPA/GDPR). It's basically just saying "how can I ask for my data to be deleted and prove which user I am". I'm prepared to answer these questions for my side projects so it seems like a business should be able to answer them.
> I look forward to your reply without undue delay and at most within one month of this email, as required by Article 12 of GDPR.
This is the threatening part, but it's also bogus. The wording of the GDPR does not require a business to answer such an email, unless the sender actually wants to submit a data access request. But previously, the sender denied the intent to do so:
> To be clear, I am not submitting a data access request at this time.
Thus, the email is perceived as spam at best and a threat at worst.
$40/hour? That seems super low, unless all the emails were processed by mid-level admins. If a web admin, engineer, etc processed it, you probably need to double that. If it went to counsel, the value could be tripled or more.
I'm shocked that your hypothesis assigns 0% to "Admin spent 30 seconds pasting a form letter, or a link to a page on the site, that describes their handling of user info and the process for deleting or requesting it."
Even if that material was pre-prepared, there's vanishingly few (probably zero) organizations for whom 30 seconds of one IT admin's time, acting alone, would be spent on this.
"Oh shit, we need to have at least a phone call with counsel on this before we reply at all!"
The mail, sent to a "random/generic" address (let's say info@nicesite.com, provided that the site is large enough to have a permanent site admin), would be read by a low-level support person, who would forward it to a manager, who would forward it to a higher-level manager, who, after some discussion, would forward it to the site admin.
The 30 seconds is totally unreal; let's make it 5 minutes, but those five minutes are spent after another 20 minutes of internal forwarding/talks before it gets to the site admin.
So my half hour at US$40 may become 20 minutes at US$40/hour plus 5 minutes at US$120/hour: 40/3 + 120/12 = 13.33 + 10 = US$23.33, not far from the 20 dollars attributed to 50% of cases.
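Spelling the blended estimate out (the minutes and hourly rates are my guesses from the scenario above, nothing measured):

```python
# (minutes spent, hourly rate in US$) for each step of the hypothetical escalation
steps = [
    (20, 40.0),   # internal forwarding/discussion at US$40/hour
    (5, 120.0),   # site admin actually answering at US$120/hour
]
cost_per_email = sum(minutes / 60 * rate for minutes, rate in steps)
# 20/60 * 40 + 5/60 * 120 = 13.33 + 10.00 = US$23.33 per email
```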
That's one way to look at it. Another is that people spent some time to understand a law which may or may not affect them, but if it does, they should probably already have known about it. "Should" in the sense that it would be good for them if they did, not in the sense that I think they were negligent, as honestly I think there are a bunch of laws like this that affect people and that most of us are unaware of.
I do not consider acceptable to be threatened about California law that does not apply to me.
I do not appreciate learning about any law by being threatened with it in fake spam email.
And sending threatening email to humans and having chutzpah to comment "our study does not constitute human subjects research" is just insulting.
I have received numerous spam emails from universities about "research", but never one that was blatantly lying, threatening me with an inapplicable law, and claiming in its legal documentation that I am not a human.
I sent a complaint to them, and will consider complaining further.
Does anybody have any idea why it "does not constitute human subjects research"?
Is threatening people online not counted because it is online? Or have they lied to review board?
Even if the California law doesn't apply, if you operate a website with EU citizens as users, you're subject to the GDPR (and unless your website is extremely small or you explicitly block them, you've probably got some users from the EU). The GDPR has similar provisions to the CCPA, and some people do exercise their GDPR rights by sending emails like the ones the researchers sent.
Which isn't to say that what the researchers did was acceptable -- just that it can still be a valuable educational experience for anyone unprepared to handle such a request.
> some people do exercise their GDPR rights by sending emails like the ones the researchers sent.
Legitimate mails are OK. Mass-sending spam with illegitimate threats is still not.
I am in large part irritated because it gives arguments to people who would want to get rid of such laws, makes it harder to handle legitimate requests, and spreads false info about such laws.
> it can still be a valuable educational experience for anyone unprepared to handle such a request.
And being robbed, or having your country invaded, can also be a valuable lesson, which does not make it in any way acceptable or welcome.
If they aren't an EU website, GDPR effectively doesn't apply. EU can word the law however they want but at least in the US without a treaty to enforce such a law, it lacks the force of law here. Europeans have an extremely hard time understanding this and I'm not quite sure why. I see this assertion again and again across the web.
I've seen that too. I'm in the US, and not subject to the GDPR. I like the GDPR and totally approve of its goals. As a Californian, I'm glad we have the CCPA which is similar to it. I say this, then, as someone who supports the GDPR and appreciates it: I'm still not subject to it because I'm not inside its jurisdiction.
Similarly, I'm certain I've broken laws in other jurisdictions, such as by criticizing fragile-egoed governments who make that illegal. Doesn't matter, they don't apply to me either.
This is a bit pedantic, but I'll make my point anyway: whether a law can apply to you is orthogonal to whether it can be enforced on you. The GDPR is very clear about its application, and it is explicitly extraterritorial [1]. Of course, it does have secondary provisions about company size and non-commercial activity (mainly recitals [13] and [18]) which limits its applicability, but from a legal definition point of view, "I don't live in the EU so the GDPR does not apply to me" is too simplistic.
Nobody in America is going to know about or expect to be bound to the laws of 100 different jurisdictions because in theory someone could visit from that country.
Kind of like visitors from Spain don't bring with them Spanish laws when they visit Nevada.
> I do not consider acceptable to be threatened about California law that does not apply to me.
I think that's a bit much. Someone asking how they would submit a request if they needed to, and specifically saying in the message "I am not submitting a request, just wondering how", isn't exactly threatening you. It's sort of like someone going door to door in a neighborhood asking people what they think of the new water conservation law that requires sprinklers to be run after a certain time of day (which my city has, and which recently went into effect). If I'm not in compliance, or don't even know if I'm in compliance, could that person have possibly seen me out of compliance, and that's why they're asking? Maybe. If I knew about the law and was actually in compliance, I would know it's not a problem. One thing is not in question though: if I'm subject to the law, it's my responsibility to know about it and be in compliance, legally. Someone asking me about it is only a problem if I'm failing to do that in some way.
If they ask me about a law from some other country or state? I can look that up and determine I'm not subject to it. There's plenty of information on it.
> Is threatening people online not counted because it is online?
Your entire comment, and all points therein, rely on the assertion that the email is threatening. You haven't shown this. Some people might read that email as threatening, but I'll note that the only people who would do so are those who don't actually know whether they are subject to those laws, ignored what's been going on, and were blindsided by the question.
This whole thing is blown up because people are upset at being called out on their disregard of the current state of the internet and the laws being passed to regulate it. That's not to say the study was carried out without problems (it wasn't), but the actual harm to people of the type described in this thread was of their own negligence. Whether you think these laws are good or not, it is your responsibility to know whether you are affected, or to have some assurance from others about whether you are or not (even if it's just a hosting platform telling you what it thinks your responsibilities are). You can ignore this responsibility if you like. People do that all the time with laws that affect them; I'm sure everyone does to some extent. Just don't act like you're a blameless victim when asked about them.
I don't think anyone is claiming that the "I am not submitting a request, just wondering how" part is threatening.
What they refer to is the final paragraph of the mail:
"I look forward to your reply without undue delay and at most within 45 days of this email, as required by Section 1798.130 of the California Civil Code."
I know people like to take it that way, but it's literally saying (whether true or not) "you are required to do this, so do this." I'm a bit more lenient of things that could be classified as implied threats when it boils down to "follow the law" and the threat is only relevant for those not following the law.
Yes, it is a threat, since it suggests that legal action will follow without compliance. It's not an explicit threat, but it communicates a threatening meaning. It is a coercive statement.
Now threats aren't necessarily a bad thing when justified. A threat is just, "if you do/don't do this I will/won't do that." But this particular threat was bad in several ways. First, it was directed at targets not actually bound by the relevant law. Second, even if it was directed correctly, many would probably view it as a frivolous use of that law.
> Now threats aren't necessarily a bad thing when justified. A threat is just, "if you do/don't do this I will/won't do that."
I agree it's a threat, and what you state here was actually going to be my response to that.
> First, it was directed at targets not actually bound by the relevant law.
Yes, that's the worst thing about this. At the same time, I think those people should be prepared to answer things like this. The world we live in means anyone can send them the same request at any time, for real reasons (even if that person might be incorrect in what they are requesting).
> Second, even if it was directed correctly, many would probably view it as a frivolous use of that law.
From what I read of the statute, it appears to be exactly what that section of the law is for. To my (layman's) eyes, this is part of what the "request to know" verbiage in the law is for.
(1) Right to Know About Personal Information Collected, Disclosed, or Sold.
b. Instructions for submitting a verifiable consumer request to know and links to an online request form or portal for making the request, if offered by the business.
That's not what that is at all. It's more equivalent to going up to someone and asking (privately, I might add) whether they have any domestic violence complaints against them, if there were a law requiring people to disclose that on being asked within a certain time frame, and noting they have the legally mandated period of time to reply.
Kinda an asshole thing to do, but any person subject to that law (or being asked, even if that's not a law they are subject to) should know how to deal with a request such as that, and if they don't, spend the time to learn how to deal with a request such as that. That might be "fuck off, that's a law from somewhere else" or it might be "I have no complaints"/"I have one complaint".
There's a difference between whether someone is being an asshole or has a right to ask something, and whether learning how to deal with that thing if you don't already is a waste of time and money.
> From what I read of the statute, it appears to be exactly what that section of the law is for.
Sometimes what is legally permitted and what is socially acceptable are different. Pretending to be a member of a small time social network and sending a formally-worded letter to the operator, on a topic you have no personal privacy interest in, is on the legal but not socially acceptable side of the line. It's a jerk move, as you yourself mentioned in a later comment.
In aggregate it's a jerk move. For any single individual it's the purpose of that statute, from what I can see. Asking, as an individual, for how to make requests like that isn't what I would consider a jerk move or frivolous use of the law.
It's for that reason I think people should be prepared to answer these questions if presented, and being presented with them and having to account is for them not a waste of time.
I think people are too caught up in that the people performing the study were being jerks in how they went about it when the actual email is perfectly formed as what any random person on the internet could legitimately send (at least with respect to what damage this caused).
> any random person on the internet could legitimately send
Any random person on the internet could harmlessly send a more gently worded email and then only escalate to legalese if they get an unsatisfactory response.
I'm honestly not sure what point you think I'm trying to make. Because that's not really relevant to what I was trying to express, and I'm kind of tired of trying to clarify my point only to feel like people are ignoring what I say. Either I'm not expressing it well, or people are failing to bother considering it. I'll let you keep whatever interpretation of my point you have, as it's no longer worth trying to correct.
The point you seem to be conveying is that there is nothing wrong with the communication that was sent out. The reason your posts come across that way to me is that you keep saying things like, "that's exactly what the law is for" or "asking, as an individual [... would be ok]." And my response to you is that perhaps those other scenarios would be ok, but we are talking about this scenario, where what was done wasn't ok. It doesn't matter that other scenarios would be ok, and by repeatedly asserting that they would you are giving an appearance of endorsement to what was actually done.
Hope this clarifies my view of the conversation to this point. Personally I am not very interested in talking about other hypothetical scenarios where the law might be employed. It's a little too abstract for me right now.
> The point you seem to be conveying is that there is nothing wrong with the communication that was sent out.
The root of this thread, which I responded to, was about time spent from emails and money "burned" dealing with them because the people had to figure out whether it applied to them and/or respond appropriately.
In that context, I don't believe this is time wasted, it's time people spent learning about something they should already have paid attention to. The "wasted" time is from people or departments responding that already knew their liability (or lack thereof) and had to write another email explaining or pointing towards their documentation, or send the form letter. That actually wasted time is likely far less than was posited.
Should these researchers have done this? No. Was it a complete waste of time for everyone who was contacted? I also think no, it wasn't. These were real laws, and what was requested was legally required of the people it applied to; and even for the people it didn't apply to, any random person on the internet could have sent a similar request (either correctly or incorrectly asserting their rights), and the recipients would have had to deal with it just the same. That's what I mean by "any random individual". It's not to say what the researchers did was okay, but just to note that if someone is considering all the time people spent dealing with the email and figuring out if it applied to them, I do not consider that entirely wasted time. These are real laws, and people that run sites should be aware of them.
I've repeatedly said that what the researchers did is not acceptable, that they acted like assholes, etc. What I've been trying to do is separate the initiating action from the outcome, and make a point about the outcome. Not for the purpose of defending the researchers, but because I think it's important that people understand the liability they expose themselves to just by running these sites; if they do, and they find that problematic, maybe we'll get enough visibility to change the laws in beneficial ways. At a minimum, they'll know how to protect themselves in the future if they get a real request that needs to be dealt with within a specific time frame because of the law.
In any case, thanks for taking the time to summarize what you thought my point was. Not everyone would be willing to put in the effort in order to attempt an actual understanding with the other party in a discussion. :)
People got these requests to their personal blogs. The complaints aren't that someone at Apple had to reply to a fake request, but that people who are literally just hosting tiny websites for the fun of it are getting these letters.
If a random teenager sets up a Wordpress site because it looks fun, I contend that they shouldn't have to wonder whether it's legal. Down that path lies insanity.
My point is that some of these people are subject to the law, and could get an honest to god actual legal request to do something, not just explain their procedures, just as easily. People should know whether they have responsibilities under the law or not.
With respect, it just doesn't matter whether you think the researchers were doing a service or not. What I mean is, the researchers are (depending on jurisdiction and funding source) bound to abide by certain standards when doing human subjects research, and informed consent for participation is one of those standards. Even if receiving the email was 100% beneficial to everybody, and had no risks at all, the participants would still need to have been told about those benefits before participating. They get to make the choice to participate or not. The IRB process exists to make sure those practices are followed in every case, to take the personal opinion of a researcher out of it. These standards were developed in response to researchers who did very harmful things to subjects without their consent, in many cases because they thought it was for the greater good.
> With respect, it just doesn't matter whether you think the researchers were doing a service or not.
I wasn't making a case that the study was fine and had no problems. I was making a comment on, broadly, "money wasted because of this". Whether the study was problematic or not (it seems like it was), everyone scared by this email was only scared because they'd stuck their head in the sand with regard to laws that have been enacted that put certain requirements on some people, and whether they are affected or not.
As I see it, there are a few possible general outcomes of the email:
One, you know what your requirements are, if any, and you respond appropriately.
Two, you don't know what your requirements are, so you look them up and respond or take further action at that time. For the majority of people that fall into this case, that's probably "do nothing".
Three, you don't know, go immediately to a lawyer, and burn a lot of time and money with that lawyer, for them to either tell you it doesn't affect you or to ask you WTF you're doing operating something like you are without knowing the simplest of things that could affect you.
In all those cases, you are left with either the same or more knowledge about your legal responsibilities online. In the cases where you waste resources on a lawyer (in some cases a lawyer would not be a waste, but possibly something you should have done previously), I think that's people overreacting to their own (possibly longstanding) negligence in understanding their own situation.
For what it's worth, whether the study was conducted in a way that was acceptable is irrelevant to this specific question. Any individual could email asking a similar question entirely legitimately.
I mean, that probably makes you an asshole if you do it, like the people that ran this study, but honestly, everyone should know their immigration status, right? If some random person emails you asking your immigration status, I think most people should know how to deal with that.
I don't think it would be acceptable to impersonate any sort of official in that exchange, but that wouldn't be analogous to this situation either.
> There's no impersonation involved; the analogous email would say that the sender would notify authorities about the recipient based on the answers.
No, the analogous email would say it would notify the authorities if they didn't answer in the legally required timeframe (which doesn't exist). Honestly, it's a fairly tortured example that doesn't fit well.
First, the person requesting is, in reality, the person making sure their own rights are being honored (whether erroneously or not) based on real laws, while your example is some random person asking others about information that is not really their business.
Second, this is based on people presenting something publicly. It's more analogous to going up to someone who has a shop on a public street and requesting their current health inspection rating, which (for restaurants) is required by law to be shown. A public website is public. You get something by being public, but that might also expose you to liability.
> It's unclear how this would be more or less random than the e-mail to websites.
Hopefully it's not unclear anymore.
> And you'd be just as quick to defend such an asshole, right?
I'd be just as quick to say that yes, that person is an asshole, but I don't think you can necessarily attribute all the lost time and money to looking into their request as wasted, unless it's the short amount of time it takes to tell them to go to hell.
If someone is unhappy because they wasted hours or money on an attorney because some random person asked them their immigration status, well that's probably something they should have worked out already, if it was that important, so the time isn't "wasted".
In other words, it's entirely possible for an asshole to accidentally cause you to do something beneficial for yourself that you should have done long ago. That doesn't make them less of an asshole, but I also wouldn't consider it their fault for the time you spent finally getting your shit together.
Notice how I'm not really defending someone being an asshole, just making a note about outcomes? Perhaps you should look at that and my past statement before continuing down a path of accusing me of "defending" someone.
To be clear, I'm not saying the study was conducted ethically -- I think that's a complex question (and one influenced quite a bit by the wording of the accusations, since "human subject research" carries historical connotations even when it's an accurate description) -- but attributing all lost time/money to a cost the study imposed on others might be taking too much of a leap.
From work experience, only a small fraction of website contact addresses actually reach the person in charge of the website.
My very rough estimate would put it more like:
40% of email addresses are no longer valid or have a mailbox that nobody reads.
30% reach the web design shop that built the website many years ago under a different brand. They blindly forward the email to their customer if they still have that contact information, which is many years old and likely a dead end.
20% have an auto-reply and do not get read.
1-2% have an algorithmic reply that links to a FAQ.
5% actually reach a human being. Those 5%, however, are still a good enough reason not to do this!
I don't think they even realize that is what happened. You can see the flawed thinking throughout the entire description of the experiment. They anthropomorphized websites, imagining them to have the abilities that the humans behind them have.
That's probably the big thing that people should learn from this, I think this is a pretty common misconception.
Some ethical questions are very subtle, it doesn't strike me as malicious. I feel like their logic is valid for larger websites where you have teams of people handling these questions, and even possibly for smaller website that are incorporated.
There is a lot of precedent in using the mystery shopper technique to assess companies that aren't considered Human Subjects Research even if the interface to that company is a single human being.
I would argue that this logic does not hold even for larger websites. Such websites might have a team responsible for such issues, but handling them is not the service these websites provide, nor their primary mode of operation; it is a way to mitigate risk. Such risks burden the websites and raise their operating costs. It's an abuse of service, somewhat similar to shoplifting.
Your analogy with mystery shoppers doesn't hold water either, since the researchers in those cases are just exercising the primary function of the businesses in question. So while for the bigger sites you might be correct that this is not human subject research, it is still an abuse of service, which is likewise unethical without consent.
They want to study how websites handle GDPR and CCPA, and that's probably what they submitted.
The IRB reasoned that "websites are not people", which is true, but failed to reason that "websites are operated by people", and therefore certain measures should be taken.
I think the IRB should investigate how they reached that conclusion and probably issue their own apology. Hard to say without seeing the actual application and not being familiar with Princeton IRB rules.
What is it about an apology that makes you seek them?
I prefer action, charitable interpretation, and progress. I found this update rather encouraging:
> Third, I will use the lessons learned from this experience to write and post a formal research ethics case study, explaining in detail what we did, why we did it, what we learned, and how researchers should approach similar studies in the future. I will teach that case study in coursework, and I will encourage academic colleagues to do the same. While I cannot turn back the clock on this study, I can help ensure that the next generation of technology policy researchers learns from it.
Instead of wasting time by making another person or entity go through the humiliation gauntlet, let them improve their surroundings.
Potentially. What I explicitly dislike is the mea culpa portion (and the subsequent apology grading, where people try to derive some intent). Rather, I like "responses" with a plan. Is that the same as an apology to you, even without explicitly saying "I'm sorry"?
I am an academic and I am against this type of study. My main objection is that it wastes the valuable time of the website operator for little benefit. It is immoral to waste people's time.
Many IRBs are unaware that these kinds of "public surveys" unduly burden respondents and cause them unnecessary stress.
The little benefit will be a series of graphs indicating how site operators respond, which could be interesting, but does not justify the burden.
My org received one of these emails. I was the engineer pinged on the support ticket.
This request is neither threatening nor burdensome. This is a pretty standard run-of-the-mill GDPR request. We get them all the time.
It took less than 60 seconds of my time to provide our support team with the information they needed to respond to the request. In fact, we already have a canned response to these requests - the person on the support team is a new hire and was unaware.
If your org has users/customers in the EU, you need to have a GDPR playbook. Your support team needs to be briefed on these requests and how they should respond.
I have a difficult time believing that any "controller" complaining about this is properly prepared to respond to GDPR access requests... which is kind of the whole point of the study, no?
Your message implies an organization with at least two engineers and at least two support people. Many of the site owners who were seriously bothered by this seem to have been one-person operations running non-commercial personal sites, and it had never occurred to them that they needed to look into what obligations, if any, they might have under laws like GDPR and CCPA.
Maybe the default index.html that gets created when you first set up a site should include a notice that if your site is going to be public facing you might be subject to laws like GDPR and CCPA and link to resources you can use to figure out if you are in fact subject to them.
Same for whatever blogging software is common on these sites. I'd guess that they usually include a sample entry so you can verify that your installation is working? If so, include privacy law information in the sample entry.
For professional organizations this is a non issue. For a small operator it can be both their first request, and their first request from someone who is just shooting off random requests to parties that they know have zero data on them, which is an abuse of the process. To add vaguely worded legal threats to that is way beyond where it should have gone. Anyway, the researcher seems to have realized this by now.
The study methodology apparently involved a sample of high-traffic websites from https://tranco-list.eu. I have a hard time believing that the operators had not had to deal with such requests before. I always add the 30-day statement in my GDPR requests, mostly to make sure the support people set a calendar reminder to reply before the date. The next step if no reply is received is to complain to the data privacy watchdog in the country of the website operator, or in your country if the website is operated outside the EU (though I always begin with an email follow-up). Nobody would go to court 30 days after a GDPR request without going through the government data protection agency first. And to be clear, only formal requests are entitled to a 30-day reply, and the email said that no formal request was being filed at the time [1].
But yes, that was clearly human research and the IRB should have grilled the PI about that.
You better believe it, because I got one of these emails about my personal site and I had never had to deal with it before.
I also received an email for a domain that I had absolutely nothing to do with. It seems their system picked up my email's domain (i.e. not my email itself!) because it appeared in the last link on that domain's homepage -- something a human would have spotted easily.
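Purely as an illustration (this is not the study's actual code, and the function and page below are made up), a naive contact-discovery heuristic of the kind described -- grab the last external link on a homepage and treat its domain as the site's contact domain -- might look like this, and it misattributes exactly as the commenter experienced:

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urlparse

class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)

def guess_contact_domain(homepage_html: str) -> Optional[str]:
    """Naively guess a 'contact' domain from the LAST external link on a page.
    This reproduces the misattribution: the last link on a personal homepage
    often points to an unrelated third party (a friend's blog, a badge, etc.)."""
    parser = LinkCollector()
    parser.feed(homepage_html)
    for href in reversed(parser.hrefs):
        netloc = urlparse(href).netloc
        if netloc:  # first link (from the end) with an external domain
            return netloc
    return None

homepage = """
<html><body>
  <a href="/about">About</a>
  <a href="https://example.org/post">A post</a>
  <a href="https://unrelated-friend.net">My friend's site</a>
</body></html>
"""
print(guess_contact_domain(homepage))  # unrelated-friend.net
```

The heuristic confidently returns a domain that has nothing to do with the site's operator, which is why a human review step matters before emailing anyone.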
As I've said elsewhere, I'm on that list and I'm nowhere near what I'd consider a "high-traffic website". The site in question is a zero-revenue personal project, and it is several orders of magnitude too small by any metric to be subject to the CCPA (which is the law the letter I got referred to).
They absolutely did not survey only large websites.
I've read more than one response from people saying they are operating their website all by themselves and they definitely did not seem to be high traffic.
It's your analogy that the research in question is similar to fire alarm drills, and that the researcher is analogous to the people who conduct the drill. I'm merely pointing out that one cannot randomly conduct fire alarm drills on properties that don't belong to them.
The researchers don't own the websites, and they certainly don't own the internet. What entitles them to conduct the "drill"?
I have a blog hosted on GH Pages generated with Jekyll. I got this email from the researcher:
> To Whom It May Concern:
>
> My name is Tom Harris, and I am a resident of Sacramento, California. I have a few questions about your process for responding to General Data Protection Regulation (GDPR) data access requests:
>
> Would you process a GDPR data access request from me even though I am not a resident of the European Union?
> Do you process GDPR data access requests via email, a website, or telephone? If via a website, what is the URL I should go to?
> What personal information do I have to submit for you to verify and process a GDPR data access request?
> What information do you provide in response to a GDPR data access request?
> To be clear, I am not submitting a data access request at this time. My questions are about your process for when I do submit a request.
>
> Thank you in advance for your answers to these questions. If there is a better contact for processing GDPR requests regarding yifan.lu, I kindly ask that you forward my request to them.
>
> I look forward to your reply without undue delay and at most within one month of this email, as required by Article 12 of GDPR.
>
> Sincerely,
>
> Tom Harris
I honestly thought it was one of those legal trolls who sent the same email to everyone hoping to find someone to sue but I responded anyways explaining how statically generated sites worked and that I’m willing to provide the information, being that the information is that I have none…
The last paragraph in particular made it sound like a veiled legal threat (or that they’re hinting that they’re willing to go down that road). I felt that I had to respond just to establish some record.
It was specifically crafted to sound like there would be legal consequences - this internet tough-guy goes into the same bucket as the deceptive 'Microsoft technicians' asking you to buy gift cards - not as scammy or nefarious, but in a similar vein nevertheless.
I would be very much interested in seeing the IRB submission/application that was submitted for this study. I wonder whether or not it was mischaracterized to the IRB, or written in such a way as to diminish the problematic aspects.
We submitted an application detailing our research methods to the Princeton University Institutional Review Board, which determined that our study does not constitute human subjects research. The focus of the study is understanding website policies and practices, and emails associated with the study do not solicit personally identifiable information.
This came up during the Linux security patch debacle as well. IRB guidelines are focused on a narrow set of harms based in historic abuses of medical research, and don't necessarily condemn the types of deception available here. As TFA points out, “secret shopper” methods are common in academic research of business practices.
This is exactly what I don't understand about the whole thing. People are arguing that this was unethical, but the researchers literally proposed the study to their review board, which said "go ahead, it's not a human subjects research study, does not need consent, and is not by that virtue unethical." Perhaps the review board was wrong; people make mistakes. But assuming the review board was correct in its analysis of the situation (and who are we really to challenge that, unless there's a glaring, negligence-tier mistake), I have yet to hear an argument that dissects the ethics of this case and clearly lays out what ethical quandary we have on our hands and where the line was crossed.
It really seems to me that people are conflating "annoying" with "unethical". Sure, spamming people is annoying. But how is it unethical? I had the same questions about the Linux kernel security patches issue. An annoying waste of a few maintainers' time, arguably yes, for some definition of waste. But unethical? How so, and can someone link me to literature detailing the ethical framework that disallows otherwise-legal activity in good-faith pursuit of knowledge because someone got annoyed in the process? I think that would be an interesting read.
The emails purported to come from individuals, but were (1) written in an aggressive, legalistic style, and (2) directed at individuals who were not subject to CCPA and not equipped to deal with regulatory demands of it.
This caused significant anxiety on the part of the individuals who received this email, since it implied they would be subject to legal action if they did not provide a sufficient reply. It caused them to take significant action -- e.g., to research the law (that they aren't subject to), to determine if/how they could comply with a CCPA data access request if they had to, to consider retaining legal representation, etc.
It came off as some kind of scam or mistake, but one that had to be taken seriously.
You could read some of the blog posts from people who received these emails to understand the effect it had on them. It might also help to read the email they received and imagine receiving the same for a personal blog or a one-man shop.
I think you bring up a good point: that some of the blame going to the researcher really should be directed at the review board itself. It’s their responsibility to catch cases like this. The fact that some people who were included in the study without their consent are upset and angry, means they failed this responsibility.
I think what you are missing though, is that just because something passed a review board, that does not make it ethical. Review boards, like everything else, will make mistakes.
The study is a case of deception research. Deception research is a type of research in which the researchers are lying to / hiding information from their subjects - the mails purported to come from individual citizens, and did not mention that this was an academic study.
Other fields (e.g., psychology) have long since recognised inherent problems with the ethical aspects of deception research (in a very tiny nutshell: you harm your subjects' agency). Therefore, guidelines and protocols have been established (e.g., by the APA).
Roughly, those boil down to:
- don't do deception research unless no alternative method exists AND the outcome will have significant value
- inform the participants as soon as possible about the deception.
In this case, both IRB and researchers failed to recognise this as deception research. That is in itself a serious issue.
Especially in the area of CS/programming, it's so easy for experts to fool the IRB because they can hide behind words like "website", "data", "policy", as if they are dealing exclusively with machines.
> who are we really to challenge that
We are not obligated to make the same ethical judgement as the IRB. We are all entitled to challenge that.
Making bogus legal threats is unethical, when being sued can realistically lead to completely altered life. Yes, it's an annoyance, after the subjects determined that the threat is bogus, but it could be legitimately distressing (even costly) when they first received the threat.
The researchers made no efforts to contain the negative impact of their email, either. The email contains no information about it being a bogus threat. The subjects weren't told in advance that they might be lied to. The subjects had given no consent to being scared.
It doesn't seem "good faith" to send mass threatening emails with deliberately misrepresented laws.
Remember that the Milgram experiment also involved no illegal actions, and were done in pursuit of knowledge.
People aren't upset about it being annoying. People are upset that it read as a threat and resulted in people spending money to hire a lawyer because they thought they were about to get dragged into court.
Probably all they had to do was be transparent about who they were and the reason for sending out the emails, and throw in a link to that page (i.e. https://privacystudy.cs.princeton.edu). Seems like a silly thing to overlook, but the impact on people does seem serious, and I guess they know better now...
This is why I lost interest in a career in academia and set myself up in industry. I saw one too many situations like this, where people assumed they'd be stopped by the institution if they took things too far, and they were not.
This is a non-apology apology. It's "I'm sorry you feel that way."
I don't think I'm reading too much into it either: "I am dismayed that the emails in our study came across as security risks or legal threats."
"explaining in detail what we did, why we did it, what we learned, and how researchers should approach similar studies in the future."
Nothing about how it impacts their unwilling subjects. Nothing about failing to indicate they were doing an academic study. Nothing about the falsity of their legal threats.
You are misrepresenting the statement. The full quote is
> I am dismayed that the emails in our study came across as security risks or legal threats. The intent of our study was to understand privacy practices, not to create a burden on website operators, email system operators, or privacy professionals. I sincerely apologize. I am the senior researcher, and the responsibility is mine.
He stated the negative impact they had on study subjects (including the interpretation as legal threats), accepted responsibility, and apologized without reservations. How can you possibly claim he wrote "nothing about the falsity of their legal threats"?
Researchers don't have to indicate they are doing an academic study. Ethical actions don't become unethical simply because they're part of research.
> Researchers don't have to indicate they are doing an academic study.
Actually, they do. Otherwise it's called deception research, which falls under very specific guidelines.
In general, study participants should be treated decently. Deception research robs them of fully informed consent to participate in the study and is therefore inherently on ethically shaky ground. In specific cases, the benefit of the study's outcomes may offset the harm to its participants. Even then, that harm must be minimised.
Sorry, you're right, that's not the right way to say it. What I meant was deception is wrong regardless of whether it's part of an experiment. If the emails had been sent out under different false pretenses, like criminals probing for weak points, it would have been just as wrong.
You've clipped the quote to remove the part where they are describing the opinion of the IRB, not making a claim themselves. That paragraph explains why there was no IRB involvement; it does not claim that humans weren't involved in the research.
You're casting their description of the review board's opinion as if they were asserting their own opinion. If you want to attribute an opinion to them using a quote, find them stating their opinion and quote that.
It was their opinion - unless you expect that they told the review board they were going to email people and the board still decided it was not human research?
They also put it into the FAQ in a way that expressed no disagreement - and they decided to put it there in the first place.
(disclaimer: non-practicing lawyer here, not yours or theirs, off-the-cuff very hot take)
What part of 'A consumer shall have the right to request...' in the CCPA isn't clear? There's nothing about "A fake user may request..." Looking forward to another ballot initiative to further clarify the law!
The secret shopper thing is a big red herring - at least the secret shopper actually buys something, and in real, non-academic life is usually hired by the company (or by a marketing company by extension).
Disappointing to see the lack of judgment from a researcher who's otherwise done great work, and the IRB failure to boot. Good to see some acknowledgement but it feels like "let's build a tool to do the work and hope for good data." Not sure where this could've led other than a name-and-shame conference paper.
That's not the researcher's website, it's the department's website with a blurb about this researcher (and others) on it. The researcher almost surely had no influence on the existence of Google Analytics on that page. The entity you need to talk to is the department.
Part of the problem here was the smug tone of his student initially going on Twitter to make the dubious claim that the response to this study had been "overwhelmingly positive", when it was already abundantly clear that it was anything but.
I fail to understand what the problem is. They sent an email asking about the GDPR procedure in a way that implies they are considering using their rights. Now there is outrage, to the extent that the researchers scrapped their study, which is probably everything anybody needs to know about the state of the GDPR in practice.
> in a way that implies they consider using their rights.
... No, in a way that implied that they were about to take legal action, including against people who never had any legal obligation in the first place.
> [the system] sends up to several emails that simulate real user inquiries about GDPR or CCPA processes. This research method is analogous to the audit and “secret shopper” methods that are common in academic research, enabling realistic evaluation of business practices.
That was the whole problem. An open "we're researching responses" would've been fine. A murky "we're someone who looks fake, talking about nebulous legal consequences" isn't going to be welcomed anywhere, is it?
This is a better response than I expected, and I wish them luck and success in communicating the lessons they've learned here more widely.
>That seems to be highly unscientific/prone to skewing the sample.
I'd argue the emails they did send are MORE prone to skewing the sample.
If I had no CCPA plan and got an email from someone introducing themselves as researchers, I'd tell them I haven't gotten around to it, and that I intend to comply but haven't had anything prompt me to put in effort regarding CCPA.
If I got the email they did send, that would be exactly the request that makes me go do the legal research, and my response would be as narrow as possible.
Yes, perhaps they should have hired real people to make real requests. That would have been a bulletproof way of studying this ethically, so your conclusion is moot.
For some reason I was completely oblivious to the whole "drama", but reading the older threads and example emails - they were simply questions regarding adherence to the GDPR/CCPA.
On the one hand, stating that this is research would probably be "nice", though in that case a lot of websites would probably comply in the nicest possible way, skewing the results.
Why does this bother me? Because a while back I was trying to get my personal data off of Yelp and was nicely greeted with the middle finger (they scraped my data from the government website, which is the only page that may publish it - no one is legally permitted to re-publish it, but you know... US-based Yelp doesn't care about that law)…
My name is XXXX XXXXXXX, and I am a resident of Roanoke, Virginia. I have a few questions about your process for responding to General Data Protection Regulation (GDPR) data access requests:
1. Would you process a GDPR data access request from me even though I am not a resident of the European Union?
2. Do you process GDPR data access requests via email, a website, or telephone? If via a website, what is the URL I should go to?
3. What personal information do I have to submit for you to verify and process a GDPR data access request?
4. What information do you provide in response to a GDPR data access request?
To be clear, I am not submitting a data access request at this time. My questions are about your process for when I do submit a request.
Thank you in advance for your answers to these questions. If there is a better contact for processing GDPR requests regarding allisonelearn.com, I kindly ask that you forward my request to them.
I look forward to your reply without undue delay and at most within one month of this email, as required by Article 12 of GDPR.
Hot take: the researcher did nothing wrong. Some random person could make the same legitimate requests. If you dislike what this person did then really you just dislike that portion of the law.
You cannot simultaneously believe anyone can request their data via these laws and then get mad that people do it, research or not.
You can believe that users should be able to request this information legitimately, but arbitrary third parties should not.
The idea is that the burden and stress of response is outweighed by benefit to the legitimate user. In this case there is no legitimate user.
This is similar to the concept of standing in the courts. Someone who is harmed can bring a suit for compensation or redress, but an uninvolved third party cannot.
The answers to the questions don’t depend on whether or not the user is legitimate or not. Not to mention cases where the user isn’t even sure of their account information or lost email, etc.
I agree the answer does not depend on the legitimacy, but that doesn't matter. The answer to where were you last Tuesday night does not depend on who asks it, but only some have the right to ask that question and demand an answer.
I guess therein lies the disagreement: is the request more like one case or the other? It seems that most people feel the intent of the law is (or should be) to allow users to request information, not any unrelated third party.
Indeed, but the experiment wasn’t about requesting information, it was about requesting their policy around handling user data.
Seems reasonable to me - for example you’re a prospective user and want to know how they handle requests, just in case you want to do it in the future after being a user.
Which is itself a request for information. I think a request for policy information is reasonable if they didn't make up false identities and claim to be users.
It only takes a quick scan of the comments in this thread to see that there were people who received this email while hosting a static github.io website with their personal blog. That's a public website. Do you honestly think that anyone running a personal blog has no business doing so unless they are knowledgeable about the details of European and California website privacy rules? What a brilliant way to stifle public speech.
Your answer will probably be: "personal blogs don't fall under these regulations, so it's a non-issue", but that's exactly the point: these researchers scared a bunch of people into spending time researching a law that doesn't even apply to them, while the chance that some random person from Europe would send a GDPR request to their blog is essentially zero, because even privacy crusaders are smart enough to know that it makes no sense to do this.
Even if the general principle were ethical (not that I agree), the Princeton researchers should have used a curated list of websites that could reasonably be expected to receive GDPR requests.
I love the casualness about somebody wasting hundreds of dollars consulting a lawyer for something that isn’t relevant to them.
As for the personal blog: it is relevant, because the email was sent to owners of personal blogs.
You claimed that the recipients of this email, such as personal blog owners, had no business running a website if they didn't know the details of a law that doesn't apply to them. That's stifling, plain and simple.
People to whom the law doesn't apply are not necessarily very familiar with the details of this, and thus are going to be cautious if presented with what appears to be a legal threat. For a pro, this is easy to reply to, for random hobbyists it's not.
“The law says you have a month to reply.”? A little aggressive, but OK.
“According to such-and-such code, section 45, part b, subsection 3, you have 87 hours from the time I sent this — that is, from 12:43:56 PM Eastern time on this date — to give your on-the-record response.”? They’ve got a lawyer, and this is going to be a pain in the ass.
These particular emails were somewhere between the second and third options.
Asserting your rights (which you even say is "a little aggressive, but OK") as per specific regulation is far from being a legal threat where I live. That's just asserting your rights. A legal threat would be far worse.
> They’ve got a lawyer
...but this would suggest to me that this is cultural, since this thought would never occur to me.
How is this not a threat in the EU or anywhere? Yes, it's made worse by the litigious nature of the US, but that's beside the point IMO. The sender is clearly implying that there will be consequences for not responding. Even if this is the law and the sender is within their rights, it's still a threat.
The entire thing is even worse because most of these websites were under no obligation to reply but didn't know that, since they weren't experts in the law.
In your view, what purpose does informing someone of a law related to their compliance serve?
Saying on the basis of which regulation you're asking for something just isn't considered a threat where I live, period. People who want to make threats actually make threats.
> In your view, what purpose does informing someone of a law related to their compliance serve?
Well, obviously, in this case, it was about the expected time period. If you have a reasonable assumption that your request is not common (for example, businesses may plausibly receive far fewer GDPR requests than product warranty requests), then communicating the expectation seems like a prudent thing to do, since the other party is less likely to be familiar with it.
I'll have to take you at your word as I don't have experience where you live. Here, friendly requests tend to be much less formal. As a further example, suppose my dog was in my back yard barking, and this annoys my neighbor. They approach me about it:
> "Hey, neighbor, your dog is bothering us. Could you take it inside?"
Typical response: "Oh, sorry! Sure. Come here, pooch!"
> "Hello neighbor. According to county code section 23, 'Nuisances', paragraph 3, 'Pets', your dog can't bark for more than one minute without violating the ordinance and being subject to a fine of not more than $85."
Typical response: "Get off my property, and if your kid ever throws a baseball at my house again, I'm going to launch it through your front window."
Normal-person requests are usually formulated like "hi, can you do this thing for me?" even if the person being asked is obligated to do it. Citing law is considered an aggressive escalation.
A communication between two entities who are not friends is not "friendly". This is clearly a formal request of a type that is even regulated by law. You're almost certainly not asking your neighbor about something like this; you're almost certainly asking someone you've never met in your life. Not sure why it needs to be "friendly" any more than asking a government bureau through some formalized process (like filling out a form) needs to be "friendly".
I've gotten requests from people asking me to delete their account, sent from the email address they used to register it, along the lines of:
"Hi, I've forgotten my password, but I don't really use my account anyway. Could you delete it for me?"
And of course I comply, because I want to be helpful. They asked nicely; I replied nicely. It's a pleasant and productive interaction for all involved. This is the social norm here.
But the example you outlined is not regulated by any law as a formal procedure. That's an ad-hoc request. Of course it could also be phrased as a GDPR erasure request, but I bet you'd expect that to be more formal and more specific. After all, that would be a (formally) legal request, not just something you may decide to do or not to do depending on how you slept last night.
My doorbell is designed to be pressed. But I do have a problem with someone who runs down the street pressing every doorbell because they want to gauge homeowners' response times.
Spamming and wrong intentions can make an otherwise legitimate action unethical.
If anything, it seems like this was an effective means to introduce a lot of people to possible liabilities they have under GDPR/CCPA (or why they are not applicable to them).
Fine, but I had no desire to be introduced to the intricacies of the CCPA that afternoon. I was off minding my own business and didn’t ask for an “Are You Compliant For Dummies” course to be dropped in my lap.
Yes, but whether or not it’s explicitly stated doesn’t really change the law.
Ultimately I don’t really get the big deal. It takes five minutes to reply to this, and if you don’t, no one is going to waste resources bringing you to court unless you’re some huge organization.
It’s not that they’re implying that it is illegal - it’s that it is.
I'm honestly baffled by the response, especially from the pro-privacy crowd on HN. This is simply the reality of GDPR. If you host and operate a website that serves EU visitors, you must comply with GDPR. Of course this is a burden on small operators, and it may come off as alarming the first time you receive a GDPR request; however, this is GDPR working as intended. It is intended to force operators to explicitly decide which user data they are going to collect (including how they will inform users and how they will correct, delete, and export that data).
I do agree that there might be ethical concerns on how this study was conducted, however, the email messages do not suggest pending legal action. They're pretty standard GDPR requests.
The emails were sent to websites that do not process personal information and are thus not subject to GDPR, so the recipients were in some cases confused about what their responsibilities would be. And though the emails did not suggest that legal action was pending, they do suggest a willingness to resort to legal action in a relatively short time frame. This caused anxiety for apparently many small-time, non-profit bloggers.
Is it unethical? I dunno. But it's nuanced, at least.
As soon as the client's IP address touches your server you are processing personal information. E.g., I have seen many webservers that save these in their access logs.
Again, this is the reality of GDPR. It is not okay to operate a website serving EU visitors without considering the GDPR implications. This is how GDPR is intended to work. Don't operate a website serving EU visitors if you don't have a plan for how to respond to these emails. I'm not trying to be harsh or to dissuade these small websites from operating; it is just the reality of GDPR.
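To make the access-log point above concrete: a common mitigation is to truncate client IPs before (or after) they hit the log. A minimal sketch in Python, where the `mask_ip` helper and the sample log line are invented for illustration (IPv6 handling omitted; whether last-octet masking counts as sufficient anonymization is a legal question, not settled here):

```python
import re

# Match an IPv4 address and capture everything but the last octet.
IPV4 = re.compile(r"\b(\d{1,3}\.\d{1,3}\.\d{1,3})\.\d{1,3}\b")

def mask_ip(log_line: str) -> str:
    """Zero out the last octet of any IPv4 address in an access-log line."""
    return IPV4.sub(r"\1.0", log_line)

line = '203.0.113.42 - - [10/Dec/2021:13:55:36 +0000] "GET / HTTP/1.1" 200 612'
print(mask_ip(line))
# -> 203.0.113.0 - - [10/Dec/2021:13:55:36 +0000] "GET / HTTP/1.1" 200 612
```

The same idea can be applied at the webserver level (e.g. via a custom log format) so full IPs are never written to disk in the first place.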
It's bizarre to me that anyone would respond to what amounts to a legal request delivered via email.
Unless the government has a subpoena that was verifiably delivered to me physically, an email will be either totally ignored, or I'll respond telling them to go pound sand.
Is that how you would respond to any GDPR-related request? GDPR legally requires you to respond to requests within a month. If someone making a request points that out to you, you may feel like you've been threatened. That doesn't change the law.
Yes, that is how I would respond to a request inquiring as to what my GDPR practices are.
The requester here isn't asking to access their information for GDPR reasons. They are asking what my private business operations are, which is not part of what I'm required to disclose, as far as my understanding goes.
Separate from the above, if I run a US based business, why would I care if the EU wanted to try and sue me for breaking a law that has no jurisdiction over me?
Yes, but the researchers aren't asking about their data. They are asking about internal business policies, which they are not granted under the law.
From the FAQ linked in the OP:
"""
Why does this study involve contacting websites?
Very few websites post details of their processes for handling GDPR and CCPA requests. Both the GDPR and the CCPA contemplate users and intermediaries reaching out with questions about data rights processes, and we are using that opportunity to understand current website policies and practices.
"""
From the sites I've seen discussing responsibilities, internal business processes for how GDPR requests are handled are not covered under GDPR.
The GDPR establishes data subject rights, which means that, with respect to their personal data, customers, employees, business partners, clients, contractors, students, suppliers, and so forth have the right to:
Be informed about their data: You must inform individuals about your use of their data.
Have access to their data: You must give individuals access to any of their data that you hold (for example, by using account access or in some manual manner).
Ask for data rectification: Individuals can ask you to correct inaccurate data.
Ask for data to be deleted: Also known as the 'right to erasure', this right allows an individual to request that any of their personal data a company has collected is deleted across all systems that use it or share it.
Request restricted processing: An individual can ask that you suppress or restrict their data. However, it is only applicable under certain circumstances.
Have data portability: An individual can ask for their data to be transferred to another company.
Object: An individual can object to their data being used for various uses including direct marketing.
Ask not to be subject to automated decision-making, including profiling: The GDPR has strict rules about using data to profile people and automate decisions based on that profiling.
I was referring purely to the "Why would anyone respond to an e-mail" part -- because there's no requirement for such a request to not be an e-mail for it to be valid.
I took the fact that the request was outside the scope of GDPR to mean that the researchers were trying to thinly veil their request as an official government request akin to a subpoena, which is why I originally stated I wouldn't accept electronic delivery of subpoenas.
Note: I also would freely ignore real GDPR requests because I'm US-based, and the EU has no jurisdiction over me.
Claiming that this study was "human research" is straight-up insane. There's as much "human research" going on there as there is when statisticians acquire data by going to a human who possesses it and asking for it (e.g. they want to study baseball, so they go to the Head Excel Overlord of the MLB and ask for their data) - that is, exactly none.
Analyzing data that was given to you by a human does not automatically make your research project "human research", full stop.
The research question here is "How do organizations fulfill their legal requirements due to the GDPR and CCPA?" and has nothing to do with individuals. The researchers didn't want any data about the humans, they didn't ask for it, they weren't studying reaction time, psychological response, brain activity, or anything else about any humans at all.
If the answer to the question "Could you replace a human in the experiment with an automated process and still perform a meaningful experiment with conceptually identical results?" is "yes", then there is no human research going on. Test: the Stanford Prison Experiment: the answer is "no" (because the students were actually the ones being studied), so it turns out to be human research. Test: studying the properties of a crystal in a lab where one of the machines is operated by a human grad student: the answer is "yes", so it's not human research. (Note also that this test only proves that human research isn't being done - if you replace the baseball players in the above example with robotic batters, the answer to the above question is "no", but clearly human experimentation isn't happening.)
Similarly, if each of the individuals who were contacted were replaced with a Perl script scanning for keywords that automatically replied with the organization's GDPR and CCPA policies, the results from the experiment would be. Exactly. The. Same.
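The Perl-script thought experiment above can be sketched out. Here is a minimal version in Python rather than Perl, with the keyword list and canned policy text entirely made up for illustration (no such script existed in the study; this only illustrates the argument that the responder's humanity is incidental):

```python
# Keywords a privacy-request message might contain (invented for the example).
KEYWORDS = ("gdpr", "ccpa", "data subject", "right to erasure", "do not sell")

# The organization's canned policy reply (also invented).
POLICY = (
    "We do not collect or sell personal information. "
    "Requests under the GDPR or CCPA are answered within 30 days."
)

def auto_reply(message):
    """Return the canned policy if the message mentions a privacy law,
    otherwise None (leave it for a human)."""
    lower = message.lower()
    if any(keyword in lower for keyword in KEYWORDS):
        return POLICY
    return None

print(auto_reply("Hi, what is your process for GDPR erasure requests?"))
```

If every recipient had run something like this, the researchers would have collected exactly the same dataset, which is the crux of the "no human subjects" argument.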
In fact, let's go and look up the definition of "human subjects research":
According to 45 CFR 46, a human subject is "a living individual about whom an investigator (whether professional or student) conducting research:
Obtains information or biospecimens through intervention or interaction with the individual, and uses, studies, or analyzes the information or biospecimens; or
Obtains, uses, studies, analyzes, or generates identifiable private information or identifiable biospecimens."
Let me repeat again: the fact that humans responded to the emails about the GDPR/CCPA compliance of the organization that owns the email address is completely irrelevant. The experimenters didn't care whether their request was serviced by a human or a computer, and in fact would have preferred computerized responses (instant response, more standardized formats, no possibility of human error).
Claiming that this is "human experimentation" is disingenuous to the point of malice.
> The research question here is "How do organizations fulfill their legal requirements due to the GDPR and CCPA?" and has nothing to do with individuals.
Where your position falls apart is that they sent emails to private individuals. Although I'm employed, I received the email to my personal address regarding a not-for-profit, zero-revenue hobby website I maintain. Another story here on HN was from them sending the letter to a blogger.
They didn't contact people in the "eventually it will end up on the desk of a human lawyer at a Fortune 500 company" sense, but in the "this came to my personal computer and was addressed to me individually" way.
Frankly, I don't care what their stated purpose was here: the reality is that they directly contacted a whole awful lot of people about their hobbies.
> Where your position falls apart is that they sent emails to private individuals.
That in no way counteracts any part of my argument. Actually, if anything, it reinforces it - the fact that the emails sent to private individuals were unintentional proves that there was no human research being done. It's quite clearly impossible to accidentally do research on humans.
It is possible to unintentionally harm people as a result of doing non-human research - but then again, you can unintentionally harm others through traffic accidents, business deals, entering the wrong information into a database, engaging in gossip, accidentally bungling a normal research experiment (high-energy physics?) and doing just about anything in life. It should be really obvious that the possibility (or reality) of human harm doesn't imply human research.
> Although I'm employed, I received the email to my personal address regarding a not-for-profit, zero-revenue hobby website I maintain. Another story here on HN was from them sending the letter to a blogger.
...and, really, there should have been no harm done. If you run a website, then you have a certain set of rights and responsibilities, and one of those responsibilities is to know exactly what data your site collects, and how to respond to a GDPR or CCPA request. That applies to you, to that other blogger, Xe, and every single person (and organization) that received these emails. And, if you weren't prepared for that, that's your fault, and consider this bungled research project a wake-up call.
It’s more complicated than that. The email I received was wrong about my legal obligations to respond. First, I received the email regarding a tiny personal site I operate for the fun of it, and I don’t meet the CCPA thresholds. Second, nothing in the CCPA says I have to reply to random information-gathering requests anyway. And yet the email gave me a deadline to respond and cited a specific law, claiming that I owed them a response.
I don't know whether the problem is the law or not. It's interesting to see that some people view GDPR as a security risk, but it's a great way for people to see what is being used to help an entity carry out its function.
I sent this to my politician and am still waiting for a response but I'm more than interested in what they use tech wise, and I think it covers everything!?!
GDPR Request:
Everything you have on me, please. Highlight what you or your 3rd parties consider to be for law-enforcement or scientific purposes and therefore cannot be deleted, and please detail any and all 3rd parties who may be required to handle my data to enable you to perform your function as an MP when dealing with me. Examples will include anti-spam and anti-virus software vendors, systems backup companies, cloud infrastructure providers, network infrastructure providers, computer equipment providers, external national or regional departments, private assistants or secretaries, mobile phone companies, and data analytics companies that have (in)directly identified me in order for you to get into office. This list of examples is not exhaustive.
After I have received & reviewed the data, I will inform you of what can be deleted.
In order to communicate via email with British politicians, you have to include your name and address.
I also sent GDPR requests to GCHQ, MI5, the various police constabularies where I have lived or passed through, and the Police National Database, because a police constabulary doesn't have to pass information to the central national database.
It's quite interesting knowing what they know about you, and I would urge all Europeans to do the same, i.e., GDPR your security services and police forces!
I don't have a mortgage, but in theory you could also use GDPR for this: if you have the deeds to your mortgaged property, you could DSAR your mortgage lender and then ask them to remove all your data from their systems, as it's not for scientific or law-enforcement purposes. You should also time it so that all the credit agencies like Equifax and Experian remove all trace of your mortgage at the same date and time. In theory you should end up with a mortgage-free property, but I can't try this as I don't have a mortgage.
Hacking isn't just restricted to computers; you can exploit the law as well. :-)
I'd argue the opposite - they are likely to be one, especially if they've run a successful campaign.
Candidates and parties will canvass the electorate to identify who is likely to vote for whom, so they can put resources into the right things and make sure likely supporters turn out to vote, etc.
That's not to mention that any (prior) correspondence with the politician (or their office) will almost certainly also contain personal data.
OK, that would make sense. That's more likely a party thing where I live (so you wouldn't contact a random MP but someone in an administrative position in the party), but I imagine it's not necessarily the same thing elsewhere.
They did not screw anyone over, their email specifically says that it is not a data request. I remember one person saying they had a panic attack reading the email. Like come on, let's be real.
I got the email and I nearly had a panic attack. The email wasn’t completely generic: it referred to my specific site. It also read an awful lot as if it was coming from someone who would be looking for the slightest mistake in my response so that they could sue me. Similar things happen[1]. And finally, it lied and said I was compelled to respond, quoting a law that said no such thing (and which wouldn’t apply to my zero-revenue personal project website anyway).
The stress wasn’t from the email. It’s that it gave every indication that I was being contacted by a legal troll, and that I might have to defend my hobby project in a courtroom. I couldn’t afford the costs of doing that, even if I ultimately won, and the idea of “well, there goes the college fund because of a stupid lawsuit on my hobby” was awful.
Thank you for sharing your experience; you're shedding more light on the situation than anyone else is right now. Out of curiosity, would you be willing to copy and paste the email contents here?
I was part of a human subject research study without my consent - https://news.ycombinator.com/item?id=29611139 - Dec 2021 (360 comments)
CCPA Scam – Human subject research study conducted by Princeton University - https://news.ycombinator.com/item?id=29599553 - Dec 2021 (331 comments)
Princeton-Radboud Study on Privacy Law Implementation - https://news.ycombinator.com/item?id=29599154 - Dec 2021 (10 comments)