I know this was hashed out in the other threads a bit, but can someone please explain to me why folks are so up in arms about this, compared to, say, studies that scrape user data without consent (something the IRB allows all the time by saying that no human subjects are involved)? Is it simply because there is no visibility into that practice (i.e., no email sent)? Scraping user data from public profiles, aggregating it into a model, and publishing a paper or whatever -- storing and keeping their user data -- seems demonstrably more invasive to individuals than an email quoting a statute.
I agree that the deception was unnecessary, but that's it. It doesn't feel any wronger than that.
Especially because these researchers really were acting in "meta" good faith trying to probe the privacy ecosystem, I fear there may be a chilling effect. Consumers deserve privacy rights and privacy knowledge in the asymmetric surveillance economy we find ourselves in, IMO.
Ethical guidelines on research exist to prevent adverse impacts on participants. This study had adverse impacts: fear, stress, and the time and money spent consulting lawyers. It was therefore de facto an unethical research study. My speculation as to why the protocol slipped through the IRB cracks is that the language used in the study proposal (at least the part made public) dehumanized the protocol by referring to "websites" rather than the humans who would be responding to the inquiries.
The IRB ruled this was not a human subjects piece of research, but that is contradicted by the deception protocol. Deception was justified as necessary because people's behavior might change if they knew it was a research request. That acknowledgement made it implicit that human behavior, and potential changes to it due to the experiment, was a core factor in the study -- ergo, it had human subjects. Behavioral research on human subjects is required to go through a much more rigorous IRB oversight process precisely to anticipate and mitigate potential adverse reactions.
Some people are focusing on the deception, but that is, under some circumstances, allowed by research ethics. The more serious problem was the adverse impact, which, again, is the primary motivation for why we now have laws and regulation-mandated IRB processes to make sure it doesn't become an issue.
I wonder if the IRB ruled this way because of the assumption of algorithmic responses for requests like DMCA takedown notices. I can imagine that even for GDPR/CCPA requests, there is still no human involved for websites like Google, Facebook, YouTube, and other major sites that are primarily operated through automation. If there are no humans involved, then there are no humans to have an adverse impact on.
But as you said, the researchers must have suspected that responses would be made by humans, or else the email would have included the fact that it was a study.
> Ethical guidelines on research exist to prevent an adverse impact on participants.
Not relevant in this case, because (1) it's clearly not human research[1] and (2) all kinds of other clearly-non-human experiments still have adverse effects on humans tangentially involved, so that's not unique to human research either. When scientists were testing the first particle accelerators, they caused a lot of stress for some people who worried that they would destroy the world -- does that mean those tests were human experiments? (Clearly not.)
I may be wrong, but I'm going to guess that you don't work with IRBs all that often. I do. I asked a colleague for their thoughts on this, and they were unequivocal in believing this should have been flagged as involving human subjects if the approving IRB had all of the details of the protocol. Their guess is that the proposal's presentation -- perhaps very innocently -- did not fully convey the details that would have resulted in oversight of the research as a human subjects project. These are experts on the nuances of these laws. Of course, experts may disagree, so that reference is not definitive. But it is suggestive that dismissals from those here on HN who are armchairing this with no -- or minimal -- experience with this sort of research are not based on a full understanding of the IRB and research ethics ecosystem. One or two IRB applications for a research project will not convey the understanding needed to evaluate research protocols under the laws and regulations involved.
As for definitions of human subject, it seems like you are overlooking part of the regs you want to use to support your argument. Per the link in the comment you cite, it's human subjects research if the research obtains data "through intervention or interaction with the individual, and uses, studies, or analyzes the information." That was clearly part of the research in this case: it involved interaction with individuals. I'm not sure how you can overlook this part of the definition when making your determination.
You should also be aware that regulatory laws do not stand alone: the government provides explanatory statements of interpretation and policy guidelines. The granddaddy of these in this case, and a fundamental guiding document for anyone on an IRB, is the 1979 Belmont Report, which guided the development of modern regulated IRBs. It is pretty clear: participants should "undertake activities freely and with awareness of possible adverse consequence." It's understandable that plenty of folks here on HN are not familiar with the body of clarifications and case law that guides interpretation of the law's statutes, but researchers, and especially IRB members, are supposed to know this stuff inside and out.
Next: deception to avoid advance consent is allowed in only very limited circumstances, and it is a significant red flag that an IRB needs to bring more scrutiny to the research protocols. I don't know how you (or the IRB, if it was doing its job properly) can say that human subjects were not involved if the research protocol relied on the deception of human subjects to observe their behavior.
All of which is somewhat beside the point: this entire research ethics process, independent of specific statutes, exists to prevent adverse impacts on human subjects. This research had adverse impacts, and so de facto it involved humans in research that should have gone through the full IRB human subjects oversight process.
The post here on Hacker News mentioned the downsides for one recipient. That person was stressed out thinking that they were about to be sued. They considered retaining counsel, which could have cost them a few thousand dollars, in order to get ahead of the threat. It didn't come to that, so it's a "what if", but I could see myself trying to retain counsel too. Hopefully, a lawyer would have talked me down and advised me to wait it out. On the flip side, they may have offered to respond on my behalf (which would cost money).
I would not respond to such an email myself, ignoring it until I was able to defer to an attorney.
I publish a simple personal blog and I worry about the _worldwide_ legal implications of doing so. As one example, I have some old information about making model rocket fuel at home. At the time I had carefully reviewed U.S. law and knew how much I could legally make and have in my possession. Then I got questions from people in other countries and I got spooked. What if I break a law somewhere else?
I assume that I’m breaking other countries’ laws all the time, say, by criticizing the actions of their governments. I don’t worry about that. I’m much more worried about, say, CCPA compliance while living and working in California. (Not that I’m especially worried about it. My personal projects don’t meet any of the criteria which would make it apply to me.)
The problem for people outside the USA is that this country has repeatedly demonstrated the ability to enforce its laws abroad, for example in Europe.
I would not be worried about, say, a Sri Lankan privacy/blasphemy law, but a US court can take down my email, website, and important and less important accounts, starting with my HN, Gmail, and GitHub accounts.
Yeah, me too. I don’t collect stats on visitors anymore (using Google Analytics for example) because I now understand the privacy implications of doing so. I do use a simple impression counter but I capture no information (not IP, not browser, nothing). I definitely think about the CCPA and ADA laws, but I’m relatively sure they don’t apply to me. Still, I certainly think about them.
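For what it's worth, a counter like that can be genuinely tiny. Here's a minimal sketch (the file name and function are my own illustration, not anything from an actual site): it increments a single integer on disk and deliberately never touches the request, so there is no IP, user agent, or cookie anywhere to leak.

```python
# Minimal privacy-preserving impression counter (illustrative sketch).
# It stores only a single running total and reads nothing from the
# visitor's request: no IP, no user agent, no cookies, no timestamps.
from pathlib import Path

COUNTER_FILE = Path("impressions.txt")  # hypothetical storage location

def record_impression() -> int:
    """Increment the total impression count and return it."""
    count = int(COUNTER_FILE.read_text()) if COUNTER_FILE.exists() else 0
    count += 1
    COUNTER_FILE.write_text(str(count))
    return count
```

You'd call `record_impression()` from whatever handler serves the page; since the function takes no arguments, there's structurally no way for per-visitor data to end up in storage.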
I personally use a self-hosted analytics app so I can still get some useful feedback without sharing my visitors’ data. I get pretty graphs, and my visitors get to keep their privacy.
If the EU wants an extra-territorial legal framework, why can’t a US website owner do the same in their TOS? That’s a perfectly legal thing to do in the US.
Who knows? I can imagine that an innocent picture of uncovered legs may be illegal in some religious states, but do you have to worry about it? Is that even a thing?
(I’m aware of the chance that you may visit that country some day and find out that you’re a wanted criminal, but I’m not sure if that applies to non-felonies, world-legal-wise.)
Scraping data is not imposing work, worry, and cost on additional people.
The victims of scraping are not going to do any additional work unless the scraped data is used irresponsibly, but that is separate from the act of scraping.
This email required people to do work and caused worry due to the legal threat that the email tried to lead people to believe was applicable to them. They may have had cost if they called a lawyer and it definitely took their time.
Scraping -> no work forced upon victims. That email -> work forced on unwilling victims.
Is there something I’m missing? Other people, including that poster, aren’t reaching this same conclusion, but it seems very apparent to me.
Well, the argument of the GP is that "extra work" is not the only possible form of harm. Compare the harm of extra work and stress due to this email to the harm of having your privacy violated by large, publicly-scraped datasets that include your personal information. For example, once your Twitter post is collected in a "posts of Twitter users about X political event" dataset, it's now impossible for you to ever delete that post, which could be harmful for you in the future. It's unclear whether one type of harm is categorically worse than the other.
Part of it was that they did no (or poor) screening. They got their list of target sites from a research list of the most popular websites. I got a letter, and my little not-for-profit, not advertised, purely-for-fun website was around number 350,000 on that list. First, I sincerely doubt my site is even that popular. Second, if I got the mail, so did lots of people in a similar situation.
They weren’t spamming Fortune 500 companies. They were spamming a huge number of single-person sites that aren’t subject to the CCPA at all and who certainly don’t have legal departments to ask about it.
What is the difference between 100,000 individuals emailing 3-5 websites on that list, with their real identities, asking for things to be deleted (such that all 350k sites are covered), and the situation here, ignoring the deception for a moment (unless that is the only issue)?
Could this be a moment of cultural learning for everyone? That's kind of how I am looking at it, frankly, but I am open to being wrong. That is, perhaps small entities will learn, in one or two instances, to just ignore this kind of thing?
You seem extremely unconvinced that any harm was done to the people who were sent scrambling by this alarm. It's as though no matter how convincing the email was, no matter how much of the recipient's time was wasted, no matter how many thousands of dollars they spent on lawyers, you ascribe all blame to the recipient for not having realized they were being deceived — and ascribe no blame whatsoever to the email's author for being deceitful.
This whole discussion was had in the old thread, and there was one person who used the same rhetorical device of belaboring the same question over and over again. It was tiresome.
I should have been more clear, so let me correct that. I am convinced. I agree that harm was done, and I suffer from generalized anxiety disorder myself, so I empathize with the panic attacks that people experienced.
It is because I believe that harm was done, but also because I am a privacy nut myself, that I am trying to, for my own sake, characterize how I should approach sending emails like this in the future. The study may not go on, but individuals still will send these emails as long as CCPA/GDPR exist. (Just to add some color: it's my anxiety which is causing me to want to delete everything from the internet. If there's minimal info about me online, I can rest easy. It's why this is a throwaway that I will abandon shortly.)
Reading everyone's thoughts is what changed my mind. I now understand that I underestimated the emotional and legal effects CCPA/GDPR requests can have on small website operators, and I will be more judicious in the future (as this study should have been) in my pre-filtering and wording. Reactions like kstrauser's (elsewhere in thread) were initially surprising to me (perhaps because of the faceless nature of the internet), so I hope you take my about-face as genuine.
Where do you think this balance lies? I still believe consumers, in general, should have the right to ask those who hold their data about their processes; to be given a copy of it; and to have it deleted upon request. And further, in general, I think these interactions are the kinds of things that researchers might legitimately want to study. I found your other comments to be thoughtful, so I am curious what you think explicitly.
Based on reading https://news.ycombinator.com/item?id=29611139 the other day, my impression is that, for a small website operator, the email template used potentially threatening language in the line "I look forward to your reply without undue delay and at most within 45 days of this email, as required by Section 1798.130 of the California Civil Code."
There is some discussion that for large websites or government entities this kind of language may be necessary to communicate the sincerity of your request, but lone operators doing their best probably don't have any sort of legal team to ensure they follow the letter of the law. From my perspective, maybe it's best to approach a small website with a more casual tone -- that you just want your data gone -- and "make it serious" only if the request is ignored or the response is noncompliant.
What I hope to see is a popularization of business models where no personal data is kept, because that is less expensive in terms of compliance costs, more beneficial to the consumer, and hopefully more attractive to the consumer as well. We can see the dawn of a new age in other comments in this thread where people talk about not collecting any data on their blog visitors!
Right now it is difficult to build businesses under such models because most institutions, frameworks, and tools shunt you towards hoarding all data. Over time, I hope that better tools will emerge so that building better businesses becomes easier.
There are people elsethread bemoaning not only the unfortunate artificial costs created by this email experiment, but the compliance costs of privacy-protecting legislation in general. But businesses should be paying those compliance costs, because it's an iron law at this point that business-collected personal data will leak yet individuals bear the costs when the data leaks.
To my mind, this experiment went awry in the same way that privacy-abusing businesses go awry: the organization reaped a benefit while the externalized costs were borne by outside individuals.
However, I'm inclined to forgive the researchers, as I think they will learn from this and find ways to collect data which cause less alarm and imposition. Similarly, I would hope that individuals pursuing their rights under privacy legislation would start off gently but firmly, giving small entities time to adapt. But simultaneously, I have an appreciation for those with bulldog tenacity who go after recalcitrant businesses (e.g. the heroes who have gone after Equifax in small claims court).
> how I should approach sending emails like this in the future
Don't.
It's that simple.
> I look forward to your reply without undue delay and at most within 45 days of this email, as required by Section 1798.130 of the California Civil Code.
Thing is, I would cheerfully process a deletion request, even though I don’t have to because I don’t meet the criteria to be subject to the CCPA. For me, part of the deception was quoting a law and incorrectly saying it obligated me to reply to their information request by a certain deadline. The law says no such thing, and getting a letter from someone who quotes specific legal codes almost never ends with “…and then they went out for dinner, newly found lifelong friends.”
It may have felt like a deception, but there's plenty of bad legal takes on the Internet. For this to be a deception, the sender would have to know for certain that the statute doesn't apply in this case.
Could be they did, but then I missed that. Just as likely, they genuinely thought this correct.
The deception was hiding that this was a study, not a genuine request. Lying about or misrepresenting the goals of a study is deception research. There are strict guidelines for that... in the "soft" sciences (APA guidelines). CS is a bit behind and seems intent on reinventing the wheel :s.
First, this is an altogether improbable scenario (the odds of winning the lottery are good compared to this scenario ever happening).
Site traffic follows a power law. A site at 200k down the list is almost never going to get such attention. It is not someone's full time job. A uniform density of information requests is incredibly unlikely and places a very unfair burden on the smaller sites.
Second, the difference is pretty obvious: 100,000 individuals seeking a legal right implies a potential benefit to a large number of people. 1-5 people abusing the system implies a bad faith actor whose benefit is pretty minimal.
It is about the impact on the humans involved. Imagine a study where you put police lights on your car and drove behind people on the highway to see how they would respond.
The issue here was not primarily about deception. It seems mainly to be that (a) at least one recipient interpreted their mail as a legal threat, and (b) it was a mass-mailing. Spend a minute thinking through the implications if that were true, and you get a firestorm.
I suspect visibility plays a role in the comparison you're making; out of sight, out of mind and all that. But much more importantly, someone sending you what you think is a legal threat is a lot more salient.
Interesting. Ok, so let's say the deception wasn't the problem, suppose for the moment. Would the study have been more palatable if the researchers had more properly vetted the email list to ensure, say, >95% or perhaps even 100% were corporations that did fall under the law?
The requirements to be subject to the CCPA are any of: have a gross annual revenue of over $25MM; buy, receive, or sell the personal information of 50,000 or more California residents; or derive 50% or more of your annual revenue from selling California residents’ personal information. Yes, I believe that if they had emailed only sites for which one of those was true, I would have no issues with the study.
The requirements to comply with the GDPR are much, much stricter and have a much more outsized effect on small, non-commercial site operators. There are no exceptions to the GDPR for non-profits or non-corporate entities (except a limited carveout for "household processing" that, AIUI, has been interpreted very narrowly by the courts). I do not think the GDPR is strict enough in this instance, and I think it would have outsized harms on small and non-corporate operators to email them in this way if your only criterion is "could technically be subject to the GDPR in some possible world".
I operate a website that likely meets one of the requirements to be subject to the CCPA, and it received the emails from the research study. We have basically no revenue or staff. I didn't appreciate being lied to (about who was sending the message), being threatened (with legal enforcement), wasting my time (the study was scrapped), and being used for research without consent (the fact that this happens all the time doesn't excuse it). If they wanted to know our CCPA/GDPR policies, they could have simply asked. I also received emails from the study at two other domains I own, and one that mentioned a domain I don't even own, none of which probably matter for the CCPA -- all of which made me think that this was a scam or a legal trap rather than something to take seriously.
Deception is a necessary part, but not the key. The key is the potential for distressing a real human being. The problem is that we live in a legal society where everyone is at risk of life-altering legal consequences.
Oh, our society, especially America's, is overly litigious. I agree.
But, pushing back a bit (in good faith), do you think asking an entity for your data, or asking them to delete it, should really be considered unusual and panic provoking? I said in another comment the same thing, but do you think this could be a moment of cultural learning?
I recall seeing Ralph Nader speak at a fundraising event 20 years ago and asking the crowd "how many people have actually tried to sue someone?" and in a room of hundreds only a few hands went up.
And a year ago when I took my landlord to small claims it was insane how complex the process was and how many paperwork pitfalls are in the way to disqualify you. I remember sitting on the half-day zoom call and watching case after case get thrown out because plaintiffs "forgot to file proof of service" or whatever. I'm generally good with paperwork and still nearly missed out.
There may be some people in America who are overly litigious but for the general population the legal system is wholly inaccessible.
It doesn’t matter. This isn’t a case where an individual would be suing. This is the government regulation coming down on someone after being flagged by “a victim”.
In a perfect world, I do not think it should be stressful, but we don't live in that world. I think a stress response is reasonable, given the risk of legal consequences.
Perhaps it is a learning moment, but I think the lesson should be to consider the impact of these kinds of studies.
I'm sure it is a learning experience for bloggers as well, and some of them will learn that hosting a blog is not worth the legal risk and take theirs down.
The fact that everyone violates the law in some form, and that anyone with sufficient will and resources could ruin a life with legal proceedings, is why we have the concept of standing in American law. It acts as a filter so that only someone with skin in the game can bring suit. It is one protection against abuse, and it is why laws that give anyone standing, like the Texas abortion ban and forthcoming California gun legislation, are problematic.
You are translating "legal threat" into "asking for data". And your 'learning' comment makes me think this is a cause for you. That's fine, and I even applaud what I take to be the motivation behind it.
But,
- That does not make one into the other. Misinterpretation or no, the researcher (who was being deceptive, remember) is responsible for how the message was written. I don't know about you, but I don't usually end my polite requests with references to counterparty legal responsibility. When someone starts trying to sound law-talky, it is in no way paranoid or unreasonable to become concerned about what they might be up to.
The problem here is not that USians enjoy suing each other, or that people and businesses underutilize data protection laws. The problem is that an academic study was performed in a way that caused panic in this, our imperfect world (and object of study).
- I also find the idea that an academic study should (also? or primarily?) be an instrument of "cultural learning" deeply troublesome. I'd hope that IRBs would smack that sort of thing down.
Demanding a subject to actively participate in your study upon pain of vague and mostly incorrect legal threat is ethically wrong. Passive participation (like scraping) without consent is morally wrong, but since it doesn't cause undue distress to the subjects, it is not as big of a story.
The IRB in this case didn't consider this ethically suspect because "websites aren't people". And yet the study disproportionately targeted small websites where there is, in many cases, only one person involved.
Because the end of the email (wrongly in most cases) demanded a response by law and implied they were open to legal action, which caused a bunch of people to hire lawyers to check into their liability.
>The controller shall provide information on action taken on a request under Articles 15 to 22 to the data subject without undue delay and in any event within one month of receipt of the request[1]
The legal obligation may not have applied in this case, but it absolutely exists. If someone submits a request to you for their data, you are legally obligated to respond.
> I look forward to your reply without undue delay and at most within 45 days of this email, as required by Section 1798.130 of the California Civil Code.
First, the CCPA doesn't apply to my site. It's non-commercial, has many fewer users than required to invoke the CCPA, and zero revenue. No provisions of the CCPA require me to do anything.
Second, the questions were about how I'd handle a CCPA request, and weren't actually a request at all:
> 1. Would you process a CCPA data access request from me even though I am not a resident of California?
> 2. Do you process CCPA data access requests via email, a website, or telephone? If via a website, what is the URL I should go to?
> 3. What personal information do I have to submit for you to verify and process a CCPA data access request?
> 4. What information do you provide in response to a CCPA data access request?
The CCPA doesn't obligate anyone to explain their internal processes. It obligates covered entities to respond to the requests themselves, but not to random drive-by questions.
So basically, that sentence was completely wrong. The CCPA doesn't apply to me, and even if it did, the law doesn't say what the researchers claim it did.
Why isn't this the story? It doesn't even have to be about ethics which nobody can seem to agree on. Sounds like the researchers were simply wrong.
So then the problem actually is that they misinterpreted the law. If someone misinterpreting the law can cause such stress and waste such time, shouldn't society safeguard against this?
More like the researchers need to take classes on legal jurisdiction. They seemingly believe that both laws have jurisdiction over everyone in the world, including countries and states that don't have such laws, which confuses recipients, since the email contains legal language.
The researchers created this issue because they didn't understand (or try to understand) the laws, nor did they screen their statements. The liability is not on the law; the liability falls on the researchers, especially regarding the "human subject" question. Therefore, the researchers are likely in violation of their university's IRB rules. The legal language forces people (even those to whom the laws don't apply) to respond, which in turn violates IRB ethics because those people did not consent to this research. "Forcing" them to respond to research they never consented to will run afoul of the IRB.
You shouldn't lie to people to trick them into collecting data for you without at least considering the impact on those people.
That's nothing like web scraping. (Though IMHO web scrapers should also use an honest User Agent so if website owners have a problem or question or want to block it, they can)
It is wronger than the deception because the PI, Jonathan Mayer, is not just a run-of-the-mill academic focused on "publishing a paper or whatever." This is an activist with an ax that won't grind itself. Reviewing his work mentioned on Wikipedia, I'm impressed by and appreciate the contributions Mayer has made. Mayer can't be unaware of the problems with the approach.
I personally know Jonathan and hugely respect his work.
I could believe that because he is an actual lawyer it was harder for him to imagine the panic that recipients with no understanding of the law would experience. But I think it is more likely that the response was a bit of a fluke. Way stranger stuff has been done by security and privacy researchers with the go-ahead from their IRBs. This feels to me like a methodology that isn't universally agreed upon, but is not especially uncommon, that happened to trip a response from the internet. The conclusion is more that people should not take the existence of similar research as an indication that the broader community is okay with these methodologies.
I suspect Mayer's work is in part preparatory to lawfare intended to force websites to pay for lawyerly services. The letter is akin to a fire insurance company knocking on doors while carrying a torch.
"Of all tyrannies a tyranny sincerely exercised for the good of its victims may be the most oppressive."
He's got a PhD and a JD from Stanford and has chosen a faculty position and has done a nontrivial amount of unpaid work for various privacy rights organizations. He obviously isn't motivated by money.
Frankly, you are great at knocking a strawman down. Jonathan Mayer likely has some motivation for those efforts. I made no claim as to whether that motivation was remunerative or not.
Do you have an alternative hypothesis of a motivation other than preparation for a "public-interest" lawfare campaign?
Actual legitimate research to understand existing privacy legislation, which can be used by policymakers to iterate and ensure that legislation is effective without being wasteful.
He previously worked in a senator’s office so I do suspect he knows how the sausage is made. And yeah, the staff writing bills do look at this sort of material. It is just one part of a bigger picture but it isn’t just throwing research into a void.
The wording of the main driver of the experiment, their especially bad emails, leads website operators to think there is a problem where there is none. This, on top of the research being entirely devoid of consent between the human parties involved, makes it a _very_ bad study -- one that could well cost both the university and the research team money if some of the "subject" parties actually had to go get a lawyer to look at those shoddy emails.
In better studies, what is supposed to happen is: you propose taking part in the experiment, you get a signed agreement of some sort, and only then do you actually start experimenting. What happened here is more like some kind of YouTube prank than a useful information-gathering procedure.
Scraping public data doesn't result in compelling another person to work under a false premise. Sure, you could argue that scraping introduces load that may draw an operator's attention... but the comparison is a pretty big stretch.
How these things pass board review, I don't know... it seems pretty obvious to me that creating work for somebody who didn't volunteer for it is, at best, antisocial behavior.
In US legal code there is actually a definition of a human subject in https://www.hhs.gov/ohrp/regulations-and-policy/regulations/... (EDIT: to clarify this is a guideline for federal researchers and to my knowledge is not legally binding on private institutions, but seems to be used as a basis for private IRB policies):
"""
(e)(1) Human subject means a living individual about whom an investigator (whether professional or student) conducting research:
(i) Obtains information or biospecimens through intervention or interaction with the individual, and uses, studies, or analyzes the information or biospecimens; or
(ii) Obtains, uses, studies, analyzes, or generates identifiable private information or identifiable biospecimens.
(2) Intervention includes both physical procedures by which information or biospecimens are gathered (e.g., venipuncture) and manipulations of the subject or the subject’s environment that are performed for research purposes.
(3) Interaction includes communication or interpersonal contact between investigator and subject.
(4) Private information includes information about behavior that occurs in a context in which an individual can reasonably expect that no observation or recording is taking place, and information that has been provided for specific purposes by an individual and that the individual can reasonably expect will not be made public (e.g., a medical record).
"""
The argument is that scraping of public data, already recorded by data systems for general (e.g. not specifically medical) purposes, is neither intervention, interaction, nor private information.
On the other hand, IMO the researchers here clearly interacted with their subjects. While the email was sent to a privacy@ address, not only are emails different from HTTP GET in how likely they are to be read by humans, but this went a step further and implied legal action would be forthcoming unless a human replied to the message. That's interaction. That makes the recipient a human subject.
(IANAL and the above is not legal advice.)
EDIT 2: I've had the pleasure to meet one of the researchers here. They are a staunch defender of online privacy, and I believe the team sincerely wanted to measure how effectively businesses are adapting to the changing winds beyond their legal obligations. But I also think the team, and the Princeton and Radcliffe IRBs, should have done more to consider the impact on the people who operate these businesses themselves. I'm sad and disappointed that the systems in place didn't catch this.
Your question is essentially whataboutism. Both things can be wrong. We can care about this instance without diluting the conversation talking about something else that is also bad.
It's not intended to be whataboutism (sorry about that, I edited this in to clarify) -- I agree that the deception was wrong. But there seems to be something about this particular event that is riling people up, and that's what I am getting at. I am not trying to whatabout, to be super clear.
To clarify. I don't think people would be riled up about individuals sending out these emails. Individuals are required to be legal, not 'ethical'.
The people who are riled up believe that University studies should be performed ethically. They know that IRB's exist to prevent researchers from doing unethical, but legal, things. In this case, they feel the harm caused should have been prevented.
Scraping data silently doesn't cause stress/harm to the participants directly, as they are unaware of any potential threat.
It's not "human experimentation should be banned" its "human experimentation should be heavily scrutinized to prevent harm to participants as much as possible. And definitely never cause harm to unwilling / unwitting participants".
What bothers/riles me is that there doesn't seem to be a consistent ethical framework applying to these complex situations. Of course things should be ethical but ethics aren't defined as “whatever people on HN and Twitter feel like isn't slimy”.
I believe the real issue isn't the research ethics per se, but rather pent up frustration on the larger topic. I posted this in one of the original threads:
> We have also received consistent feedback encouraging us to promptly discard responses to study email. We agree, and we will delete all response data on December 31, 2021.
I wrote one of the blogs posts that got linked here on HN, and I have some strong feelings about that. None of them are joy, though. I think it’s good and appropriate that the study is deleting all the data; since it was collected by misleading methods, I don’t think it was valid. I’m not happy that a study covering an important subject, and led by researchers who had good motivations, went so far off the rails in the first place that it had to be axed.
Ignoring the ethical concerns, all the data they collected was completely worthless, because many of their subjects were contacting each other and responding with the knowledge that it was a mass email sent under a variety of presumably fraudulent names.
It definitely was valid. Probably the most valid data you're going to get if you want to test for this thing.
Your emotions are getting in the way of your logic.
I personally don't see what the problem is. People should be allowed to send whatever emails they like, it's up to you to reply to them. If they had sent emails out asking how everyone's day was going, would you get upset? What about how many employees they had?
Researchers are held to a higher ethical standard than random people because experimenting on people is morally dubious unless you follow strict guidelines.
I feel like there is a lot of hyperbole wrt the harm that was done by this study, but on the other hand I think it's clear researchers shouldn't have free rein to manipulate people as they see fit just because it's through email.
I'm not talking about moral standard when I talk about the first sentence. I'm talking about data. And this data is flawless.
They shouldn't be allowed to manipulate people, but that doesn't discredit the results.
The same way it would be very bad for us to put normal, average people in control of jetliner cockpits as a scientific test to see if they can fly them properly with guidance. But if that managed to happen, the data would still be valuable.
I wouldn't call sending one-off emails to people to be manipulation.
Given how widespread word of this thing got, how could researchers possibly distinguish responses to their email that were from people who were not aware it was research, versus responses from people who had become aware what was happening?
The same goes for people who didn't respond. Did they not respond because they heard about this being research, or did they not respond for other reasons?
This data is the opposite of flawless, it is poisoned, and any attempt to draw conclusions from the responses they got would be junk.
>Given how widespread word of this thing got, how could researchers possibly distinguish responses to their email that were from people who were not aware it was research, versus responses from people who had become aware what was happening?
By limiting results by time to anything received before everyone found out? It's pretty easy to set a time period for that. It has the benefit that people who find out will not want to participate or will complain.
>This data is the opposite of flawless, it is poisoned, and any attempt to draw conclusions from the responses they got would be junk.
Nope. Even just limiting it to 48 hours would provide great data.
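Hypothetically, if the responses were timestamped (which any mail pipeline would give you), the cutoff described above is trivial to apply. A minimal sketch; `within_window`, the sample data, and the 48-hour figure are illustrative, not from the study:

```python
from datetime import datetime, timedelta

def within_window(responses, sent_at, hours=48):
    """Keep only responses received within `hours` of the send time.

    `responses` is an iterable of (received_at, body) tuples; anything
    arriving after the cutoff is discarded as potentially contaminated
    by word of the study having spread.
    """
    cutoff = sent_at + timedelta(hours=hours)
    return [(t, body) for t, body in responses if t <= cutoff]

sent = datetime(2021, 5, 1, 9, 0)
replies = [
    (datetime(2021, 5, 1, 15, 0), "here is our process"),      # 6h later: kept
    (datetime(2021, 5, 4, 9, 0), "we heard this is a study"),  # 3 days later: dropped
]
kept = within_window(replies, sent)
```

Of course this only removes contamination among responders; it can't tell you why the non-responders stayed silent.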
I suspect it was far more than 1000 emails, because my small company got one. It scared us and wasted significant time (eg as we carefully read the relevant CA law). At the time we concluded that it was highly likely to be a phishing scam and archived the message with no response. In addition we decided not to respond because the person asking the question was not a paying customer. I definitely did feel threatened by the way the email was worded.
That, with my (conservative, I believe) estimate, would make US$6-9,000,000.
Let's make it US$10,000,000 that was "wasted".
Sending 200-300,000 such emails makes no sense whatsoever; AFAICT a study with a random enough sample of 1,000-10,000 (leaving aside the ones with 12, 18 or 33 participants) should give accurate enough results.
In the good ol' times (snail mail), sending 200-300,000 letters would probably have cost US$200-300,000, and I doubt the university (or its IRB/whatever commission) would have approved that kind of expense.
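The sample-size point above checks out with a back-of-the-envelope margin of error for an estimated proportion (worst case p = 0.5; the sample sizes are the ones from this thread, not the study's actual design):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% CI half-width for a proportion estimated from n samples."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (1_000, 10_000, 300_000):
    # n=1,000 is already ~±3 points; 300,000 buys less than ±0.2
    print(f"n={n:>7}: +/- {margin_of_error(n):.3%}")
```

Going from 1,000 to 300,000 emails shrinks the error bar from roughly ±3% to roughly ±0.2%, which is hard to justify against the cost imposed on recipients.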
Apparently this was the last line, as reported by the honeypot blogger: "I look forward to your reply without undue delay and at most within 45 days of this email, as required by Section 1798.130 of the California Civil Code."
To Whom It May Concern:
My name is … , and I am a resident of Paris, France. I have a few questions about your process for responding to General Data Protection Regulation (GDPR) data access requests:
Do you process GDPR data access requests via email, a website, or telephone? If via a website, what is the URL I should go to?
What personal information do I have to submit for you to verify and process a GDPR data access request?
What information do you provide in response to a GDPR data access request?
To be clear, I am not submitting a data access request at this time. My questions are about your process for when I do submit a request.
Thank you in advance for your answers to these questions. If there is a better contact for processing GDPR requests regarding zylstra.org, I kindly ask that you forward my request to them.
I look forward to your reply without undue delay and at most within one month of this email, as required by Article 12 of GDPR.
Sincerely,
That's it? I just don't see how this is so burdensome even if you don't have a data deletion process in place (i.e. probably aren't complying with CCPA/GDPR). It's basically just saying "how can I ask for my data to be deleted and prove which user I am". I'm prepared to answer these questions for my side projects so it seems like a business should be able to answer them.
> I look forward to your reply without undue delay and at most within one month of this email, as required by Article 12 of GDPR.
This is the threatening part, but it's also bogus. The wording of the GDPR does not require a business to answer such an email, unless the sender actually wants to submit a data access request. But previously, the sender denied the intent to do so:
> To be clear, I am not submitting a data access request at this time.
Thus, the email is perceived as spam at best and a threat at worst.
$40/hour? That seems super low, unless all the emails were processed by mid-level admins. If a web admin, engineer, etc processed it, you probably need to double that. If it went to counsel, the value could be tripled or more.
I'm shocked that your hypothesis assigns 0% to "Admin spent 30 seconds pasting a form letter, or a link to a page on the site, that describes their handling of user info and the process for deleting or requesting it."
Even if that material was pre-prepared, there's vanishingly few (probably zero) organizations for whom 30 seconds of one IT admin's time, acting alone, would be spent on this.
"Oh shit, we need to have at least a phone call with counsel on this before we reply at all!"
The mail, sent to a "random/generic" address (let's say info@nicesite.com, provided that the site is large enough to have a permanent site admin), would be read by a low-level support person, who would forward it to a manager, who would forward it to a higher-level manager, who, after some discussion, would forward it to the site admin.
The 30 seconds is totally unreal; let's make it 5 minutes, but those five minutes are spent after another 20 minutes of internal forwarding/talks before it gets to the site admin.
So my half hour at US$40 may become 20 minutes at US$40/hour plus 5 minutes at US$120/hour: 40/3 + 120/12 = 13.33 + 10 = US$23.33, not far from the 20 dollars attributed to 50% of cases.
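Spelling the blended estimate out (the minutes and hourly rates are my guesses from the scenario above, nothing measured):

```python
# (minutes spent, hourly rate in US$) for each step of the hypothetical escalation
steps = [
    (20, 40.0),   # internal forwarding/discussion at US$40/hour
    (5, 120.0),   # site admin actually answering at US$120/hour
]
cost_per_email = sum(minutes / 60 * rate for minutes, rate in steps)
# 20/60 * 40 + 5/60 * 120 = 13.33 + 10.00 = US$23.33 per email
```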
That's one way to look at it. Another is that people spent some time to understand a law which may or may not affect them, but if it does, they should probably already have known about it. "Should" in the sense that it would be good for them if they did, not in the sense that I think they were negligent, as honestly I think there are a bunch of laws like this that affect people and that most of us are unaware of.
I do not consider acceptable to be threatened about California law that does not apply to me.
I do not appreciate learning about any law by being threatened with it in fake spam email.
And sending threatening email to humans and having chutzpah to comment "our study does not constitute human subjects research" is just insulting.
I have received numerous spam emails from universities about "research", but never one that was blatantly lying, threatening me with an inapplicable law, and claiming in its legal documentation that I am not a human.
I sent a complaint to them, and will consider complaining further.
Does anybody have any idea why it "does not constitute human subjects research"?
Is threatening people online not counted because it is online? Or have they lied to review board?
Even if the California law doesn't apply, if you operate a website with EU citizens as users, you're subject to the GDPR (and unless your website is extremely small or you explicitly block them, you've probably got some users from the EU). The GDPR has similar provisions to the CCPA, and some people do exercise their GDPR rights by sending emails like the ones the researchers sent.
Which isn't to say that what the researchers did was acceptable -- just that it can still be a valuable educational experience for anyone unprepared to handle such a request.
> some people do exercise their GDPR rights by sending emails like the ones the researchers sent.
Legitimate mails are OK. Mass-sending spam with illegitimate threats is still not.
I am in large part irritated because it gives arguments to people who would want to get rid of such laws, makes it harder to handle legitimate requests, and spreads false info about such laws.
> it can still be a valuable educational experience for anyone unprepared to handle such a request.
And being robbed, or having your country invaded, can also be a valuable lesson, which does not make it in any way acceptable or welcome.
If they aren't an EU website, GDPR effectively doesn't apply. EU can word the law however they want but at least in the US without a treaty to enforce such a law, it lacks the force of law here. Europeans have an extremely hard time understanding this and I'm not quite sure why. I see this assertion again and again across the web.
I've seen that too. I'm in the US, and not subject to the GDPR. I like the GDPR and totally approve of its goals. As a Californian, I'm glad we have the CCPA which is similar to it. I say this, then, as someone who supports the GDPR and appreciates it: I'm still not subject to it because I'm not inside its jurisdiction.
Similarly, I'm certain I've broken laws in other jurisdictions, such as by criticizing fragile-egoed governments who make that illegal. Doesn't matter, they don't apply to me either.
This is a bit pedantic, but I'll make my point anyway: whether a law can apply to you is orthogonal to whether it can be enforced on you. The GDPR is very clear about its application, and it is explicitly extraterritorial [1]. Of course, it does have secondary provisions about company size and non-commercial activity (mainly recitals [13] and [18]) which limits its applicability, but from a legal definition point of view, "I don't live in the EU so the GDPR does not apply to me" is too simplistic.
Nobody in America is going to know about or expect to be bound to the laws of 100 different jurisdictions because in theory someone could visit from that country.
Kind of like visitors from Spain don't bring with them Spanish laws when they visit Nevada.
> I do not consider acceptable to be threatened about California law that does not apply to me.
I think that's a bit much. Someone asking how they would submit a request if they needed to, and specifically saying in the message "I am not submitting a request, just wondering how", isn't exactly threatening you. It's sort of like someone going door to door in a neighborhood asking people what they think of the new water conservation law that requires sprinklers to be run after a certain time of day (which my city has, and which recently went into effect). If I'm not in compliance, or don't even know if I'm in compliance, could that person have possibly seen me out of compliance, and that's why they're asking? Maybe. If I knew about the law and was actually in compliance, I would know it's not a problem. One thing is not in question though: if I'm subject to the law, it's my responsibility to know about it and be in compliance, legally. Someone asking me about it is only a problem if I'm failing to do that in some way.
If they ask me about a law from some other country or state? I can look that up and determine I'm not subject to it. There's plenty of information on it.
> Is threatening people online not counted because it is online?
Your entire comment, and all points therein, rely on the assertion that the email is threatening. You haven't shown this. Some people might read that email as threatening, but I'll note that the only people who would do so are those who don't actually know whether they are subject to those laws, ignored what's been going on, and were blindsided by the question.
This whole thing is blown up because people are upset at being called out on their disregard of the current state of the internet and the laws being passed to regulate it. That's not to say the study was carried out without problems (it wasn't), but the actual harm to people of the type described in this thread was of their own negligence. Whether you think these laws are good or not, it is your responsibility to know whether you are affected, or to have some assurance from others about whether you are or not (even if it's just a hosting platform telling you what it thinks your responsibilities are). You can ignore this responsibility if you like. People do that all the time with laws that affect them; I'm sure everyone does to some extent. Just don't act like you're a blameless victim when asked about them.
I don't think anyone is claiming that the "I am not submitting a request, just wondering how" part is threatening.
What they refer to is the final paragraph of the mail:
"I look forward to your reply without undue delay and at most within 45 days of this email, as required by Section 1798.130 of the California Civil Code."
I know people like to take it that way, but it's literally saying (whether true or not) "you are required to do this, so do this." I'm a bit more lenient of things that could be classified as implied threats when it boils down to "follow the law" and the threat is only relevant for those not following the law.
Yes, it is a threat, since it suggests that legal action will follow without compliance. It's not an explicit threat, but it communicates a threatening meaning. It is a coercive statement.
Now threats aren't necessarily a bad thing when justified. A threat is just, "if you do/don't do this I will/won't do that." But this particular threat was bad in several ways. First, it was directed at targets not actually bound by the relevant law. Second, even if it was directed correctly, many would probably view it as a frivolous use of that law.
> Now threats aren't necessarily a bad thing when justified. A threat is just, "if you do/don't do this I will/won't do that."
I agree it's a threat, and what you state here was actually going to be my response to that.
> First, it was directed at targets not actually bound by the relevant law.
Yes, that's the worst thing about this. At the same time, I think those people should be prepared to answer things like this. The world we live in means anyone can send them the same request at any time, for real reasons (even if that person might be incorrect in what they are requesting).
> Second, even if it was directed correctly, many would probably view it as a frivolous use of that law.
From what I read of the statute, it appears to be exactly what that section of the law is for. To my (layman's) eyes, this is part of what the "request to know" verbiage in the law is for.
(1) Right to Know About Personal Information Collected, Disclosed, or Sold.
b. Instructions for submitting a verifiable consumer request to know and links to an online request form or portal for making the request, if offered by the business.
That's not what that is at all. It's more equivalent to going up to someone and asking (privately, I might add) whether they have any domestic violence complaints against them, if there were a law requiring people to disclose that on being asked within a certain time frame, and noting they have the legally mandated period of time to reply.
Kinda an asshole thing to do, but any person subject to that law (or being asked, even if that's not a law they are subject to) should know how to deal with a request such as that, and if they don't, spend the time to learn how to deal with a request such as that. That might be "fuck off, that's a law from somewhere else" or it might be "I have no complaints"/"I have one complaint".
There's a difference between whether someone is being an asshole or has a right to ask something, and whether learning how to deal with that thing if you don't already is a waste of time and money.
> From what I read of the statute, it appears to be exactly what that section of the law is for.
Sometimes what is legally permitted and what is socially acceptable are different. Pretending to be a member of a small time social network and sending a formally-worded letter to the operator, on a topic you have no personal privacy interest in, is on the legal but not socially acceptable side of the line. It's a jerk move, as you yourself mentioned in a later comment.
In aggregate it's a jerk move. For any single individual it's the purpose of that statute, from what I can see. Asking, as an individual, for how to make requests like that isn't what I would consider a jerk move or frivolous use of the law.
It's for that reason I think people should be prepared to answer these questions if presented, and being presented with them and having to account is for them not a waste of time.
I think people are too caught up in that the people performing the study were being jerks in how they went about it when the actual email is perfectly formed as what any random person on the internet could legitimately send (at least with respect to what damage this caused).
> any random person on the internet could legitimately send
Any random person on the internet could harmlessly send a more gently worded email and then only escalate to legalese if they get an unsatisfactory response.
I'm honestly not sure what point you think I'm trying to make. Because that's not really relevant to what I was trying to express, and I'm kind of tired of trying to clarify my point only to feel like people are ignoring what I say. Either I'm not expressing it well, or people are failing to bother considering it. I'll let you keep whatever interpretation of my point you have, as it's no longer worth trying to correct.
The point you seem to be conveying is that there is nothing wrong with the communication that was sent out. The reason your posts come across that way to me is that you keep saying things like, "that's exactly what the law is for" or "asking, as an individual [... would be ok]." And my response to you is that perhaps those other scenarios would be ok, but we are talking about this scenario, where what was done wasn't ok. It doesn't matter that other scenarios would be ok, and by repeatedly asserting that they would you are giving an appearance of endorsement to what was actually done.
Hope this clarifies my view of the conversation to this point. Personally I am not very interested in talking about other hypothetical scenarios where the law might be employed. It's a little too abstract for me right now.
> The point you seem to be conveying is that there is nothing wrong with the communication that was sent out.
The root of this thread, which I responded to, was about time spent from emails and money "burned" dealing with them because the people had to figure out whether it applied to them and/or respond appropriately.
In that context, I don't believe this is time wasted, it's time people spent learning about something they should already have paid attention to. The "wasted" time is from people or departments responding that already knew their liability (or lack thereof) and had to write another email explaining or pointing towards their documentation, or send the form letter. That actually wasted time is likely far less than was posited.
Should these researchers have done this? No. Was it a complete waste of time for everyone who was contacted? I also think no, it wasn't. These were real laws, and what was requested was legally required of the people it applied to; and even for the people it didn't apply to, any random person on the internet could have sent a similar request (either correctly or incorrectly asserting their rights), and the recipients would have had to deal with it just the same. That's what I mean by "any random individual". It's not to say what the researchers did was okay, but just to note that if someone is considering all the time people spent dealing with the email and figuring out if it applied to them, I do not consider that entirely wasted time. These are real laws, and people that run sites should be aware of them.
I've repeatedly said that what the researchers did is not acceptable, that they acted like assholes, etc. What I've been trying to do is separate the initiating action from the outcome, and make a point about the outcome. Not for the purpose of defending the researchers, but because I think it's important that people understand the liability they expose themselves to just by running these sites; if they do, and they find that problematic, maybe we'll get enough visibility to change the laws in beneficial ways. At a minimum, they'll know how to protect themselves in the future if they get a real request that needs to be dealt with within a specific time frame because of the law.
In any case, thanks for taking the time to summarize what you thought my point was. Not everyone would be willing to put in the effort in order to attempt an actual understanding with the other party in a discussion. :)
People got these requests to their personal blogs. The complaints aren't that someone at Apple had to reply to a fake request, but that people who are literally just hosting tiny websites for the fun of it are getting these letters.
If a random teenager sets up a Wordpress site because it looks fun, I contend that they shouldn't have to wonder whether it's legal. Down that path lies insanity.
My point is that some of these people are subject to the law, and could get an honest to god actual legal request to do something, not just explain their procedures, just as easily. People should know whether they have responsibilities under the law or not.
With respect, it just doesn't matter whether you think the researchers were doing a service or not. What I mean is, the researchers are (depending on jurisdiction and funding source) bound to abide by certain standards when doing human subjects research, and informed consent for participation is one of those standards. Even if receiving the email was 100% beneficial to everybody, and had no risks at all, the participants would still need to have been told about those benefits before participating. They get to make the choice to participate or not. The IRB process exists to make sure those practices are followed in every case, to take the personal opinion of a researcher out of it. These standards were developed in response to researchers who did very harmful things to subjects without their consent, in many cases because they thought it was for the greater good.
> With respect, it just doesn't matter whether you think the researchers were doing a service or not.
I wasn't making a case that the study was fine and had no problems. I was making a comment on, broadly, "money wasted because of this". Whether the study was problematic or not (it seems like it was), everyone scared by this email was only scared because they'd stuck their head in the sand with regard to laws that have been enacted that put certain requirements on some people, and whether they are affected or not.
As I see it, there are a few possible general outcomes of the email:
One, you know what your requirements are, if any, and you respond appropriately.
Two, you don't know what your requirements are, so you look them up and respond or take further action at that time. For the majority of people that fall into this case, that's probably "do nothing".
Three, you don't know, go immediately to a lawyer, and burn a lot of time and money with that lawyer, for them to either tell you it doesn't affect you or to ask you WTF you're doing operating something like you are without knowing the simplest of things that could affect you.
In all those cases, you are left with either the same or more knowledge about your legal responsibilities online. In the cases where you waste resources on a lawyer (in some cases a lawyer would not be a waste, but possibly something you should have done previously), I think that's people overreacting to their own (possibly longstanding) negligence in understanding their own situation.
For what it's worth, whether the study was conducted in a way that was acceptable is irrelevant to this specific question. Any individual could email asking a similar question entirely legitimately.
I mean, that probably makes you an asshole if you do it, like the people that ran this study, but honestly, everyone should know their immigration status, right? If some random person emails you asking your immigration status, I think most people should know how to deal with that.
I don't think it would be acceptable to impersonate any sort of official in that exchange, but that wouldn't be analogous to this situation either.
> There's no impersonation involved; the analogous email would say that the sender would notify authorities about the recipient based on the answers.
No, the analogous email would say it would notify the authorities if they didn't answer in the legally required timeframe (which doesn't exist). Honestly, it's a fairly tortured example that doesn't fit well.
First, the person requesting is, in reality, the person making sure their own rights are being honored (whether erroneously or not) based on real laws, while your example is some random person asking others about information that is not really their business.
Second, this is based on people presenting something publicly. It's more analogous to going up to someone who has a shop on a public street and requesting their current health inspection rating, which (for restaurants) is required by law to be shown. A public website is public. You get something by being public, but that might also expose you to liability.
> It's unclear how this would be more or less random than the e-mail to websites.
Hopefully it's not unclear anymore.
> And you'd be just as quick to defend such an asshole, right?
I'd be just as quick to say that yes, that person is an asshole, but I don't think you can necessarily attribute all the lost time and money to looking into their request as wasted, unless it's the short amount of time it takes to tell them to go to hell.
If someone is unhappy because they wasted hours or money on an attorney because some random person asked them their immigration status, well that's probably something they should have worked out already, if it was that important, so the time isn't "wasted".
In other words, it's entirely possible for an asshole to accidentally cause you to do something beneficial for yourself that you should have done long ago. That doesn't make them less of an asshole, but I also wouldn't consider it their fault for the time you spent finally getting your shit together.
Notice how I'm not really defending someone being an asshole, just making a note about outcomes? Perhaps you should look at that and my past statement before continuing down a path of accusing me of "defending" someone.
To be clear, I'm not saying the study was conducted ethically -- I think that's a complex question (and one influenced quite a bit by the wording of the accusations, since "human subject research" carries historical connotations even when it's an accurate description) -- but attributing all lost time/money to a cost the study imposed on others might be taking too much of a leap.
From work experience, only a small fraction of website contact addresses actually reach the person in charge of the website.
My very rough estimate would put it more like:
40% of email addresses are no longer valid or have a mailbox that nobody reads.
30% reach the web design shop that built the website many years ago under a different brand. They blindly forward the email to their customer if they still have that contact information, which is many years old and likely a dead end.
20% have an auto-reply and do not get read.
1-2% have an algorithmic reply that links to a FAQ.
5% actually reach a human being. Those 5%, however, are still a good enough reason not to do this!
I don't think they even realize that is what happened. You can see the flawed thinking throughout the entire description of the experiment. They anthropomorphized websites, imagining them to have the abilities that the humans behind them have.
That's probably the big thing that people should learn from this, I think this is a pretty common misconception.
Some ethical questions are very subtle, it doesn't strike me as malicious. I feel like their logic is valid for larger websites where you have teams of people handling these questions, and even possibly for smaller website that are incorporated.
There is a lot of precedent in using the mystery shopper technique to assess companies that aren't considered Human Subjects Research even if the interface to that company is a single human being.
I would argue that this logic does not hold even for larger websites. Such websites might have a team responsible for such issues, but handling them is not the service these websites provide, nor their primary mode of operation; it is a way to mitigate risk. Such risks burden the websites and raise their operating costs. It's an abuse of service, somewhat similar to shoplifting.
Your analogy with mystery shoppers doesn't hold water either, since the researchers in those cases are just exercising the primary function of the businesses in question. So while for the bigger sites you might be correct that this is not human subject research, it is still an abuse of service, which is likewise unethical without consent.
They want to study how websites handle GDPR and CCPA, and that's probably what they submitted.
The IRB reasoned that "websites are not people", which is true, but failed to reason that "websites are operated by people", and therefore certain measures should be taken.
I think the IRB should investigate how they reached that conclusion and probably issue their own apology. Hard to say without seeing the actual application and not being familiar with Princeton IRB rules.
What is it about an apology that makes you seek them?
I prefer action, charitable interpretation, and progress. I found this update rather encouraging:
> Third, I will use the lessons learned from this experience to write and post a formal research ethics case study, explaining in detail what we did, why we did it, what we learned, and how researchers should approach similar studies in the future. I will teach that case study in coursework, and I will encourage academic colleagues to do the same. While I cannot turn back the clock on this study, I can help ensure that the next generation of technology policy researchers learns from it.
Instead of wasting time by making another person or entity go through the humiliation gauntlet, let them improve their surroundings.
Potentially. What I explicitly dislike is the mea culpa portion (and the subsequent apology grading, where people try to derive some intent). Rather, I like "responses" with a plan. Is that the same as an apology to you, even without explicitly saying "I'm sorry"?
I am an academic and I am against this type of study. My main objection is that it wastes the valuable time of the website operator for little benefit. It is immoral to waste people's time.
Many IRBs are unaware that these kinds of "public surveys" unduly burden respondents and cause them unnecessary stress.
The little benefit will be a series of graphs indicating how site operators respond, which could be interesting, but does not justify the burden.
My org received one of these emails. I was the engineer pinged on the support ticket.
This request is neither threatening nor burdensome. This is a pretty standard run-of-the-mill GDPR request. We get them all the time.
It took less than 60 seconds of my time to provide our support team with the information they needed to respond to the request. In fact, we already have a canned response to these requests - the person on the support team is a new hire and was unaware.
If your org has users/customers in the EU, you need to have a GDPR playbook. Your support team needs to be briefed on these requests and how they should respond.
I have a difficult time believing that any "controller" complaining about this is properly prepared to respond to GDPR access requests... which is kind of the whole point of the study, no?
Your message implies an organization with at least two engineers and at least two support people. Many of the site owners who were seriously bothered by this seem to have been one-person operations running non-commercial personal sites, and it had never occurred to them that they needed to look into what obligations, if any, they might have under laws like GDPR and CCPA.
Maybe the default index.html that gets created when you first set up a site should include a notice that if your site is going to be public facing you might be subject to laws like GDPR and CCPA and link to resources you can use to figure out if you are in fact subject to them.
Same for whatever blogging software is common on these sites. I'd guess that they usually include a sample entry so you can verify that your installation is working? If so, include privacy law information in the sample entry.
For professional organizations this is a non issue. For a small operator it can be both their first request, and their first request from someone who is just shooting off random requests to parties that they know have zero data on them, which is an abuse of the process. To add vaguely worded legal threats to that is way beyond where it should have gone. Anyway, the researcher seems to have realized this by now.
The study methodology apparently involved a sample of high-traffic websites from https://tranco-list.eu. I have a hard time believing that the operators had not had to deal with such requests before. I always add the 30-day statement in my GDPR requests, mostly to make sure the support people set a calendar reminder to reply before the date. The next step if no reply is received is to complain to the data privacy watchdog in the country of the website operator, or in your country if the website is operated outside the EU (though I always begin with an email follow-up). Nobody would go to court 30 days after a GDPR request without going through the government data protection agency first. And to be clear, only formal requests are entitled to a 30-day reply, and the email said that no formal request was being filed at the time [1].
But yes, that was clearly human research and the IRB should have grilled the PI about that.
You better believe it, because I got one of these emails about my personal site and I had never had to deal with it before.
I also received an email for a domain that I had absolutely nothing to do with. It seems their system picked up my email's domain (i.e. not my email itself!) because it appeared in the last link on that domain's homepage -- something a human would have spotted easily.
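Purely as an illustration (this is not the study's actual code, and the function and page below are made up), a naive contact-discovery heuristic of the kind described -- grab the last external link on a homepage and treat its domain as the site's contact domain -- might look like this, and it misattributes exactly as the commenter experienced:

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urlparse

class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)

def guess_contact_domain(homepage_html: str) -> Optional[str]:
    """Naively guess a 'contact' domain from the LAST external link on a page.
    This reproduces the misattribution: the last link on a personal homepage
    often points to an unrelated third party (a friend's blog, a badge, etc.)."""
    parser = LinkCollector()
    parser.feed(homepage_html)
    for href in reversed(parser.hrefs):
        netloc = urlparse(href).netloc
        if netloc:  # first link (from the end) with an external domain
            return netloc
    return None

homepage = """
<html><body>
  <a href="/about">About</a>
  <a href="https://example.org/post">A post</a>
  <a href="https://unrelated-friend.net">My friend's site</a>
</body></html>
"""
print(guess_contact_domain(homepage))  # unrelated-friend.net
```

The heuristic confidently returns a domain that has nothing to do with the site's operator, which is why a human review step matters before emailing anyone.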
As I've said elsewhere, I'm on that list and I'm nowhere near what I'd consider a "high-traffic website". The site in question is a zero-revenue personal project, and it is several orders of magnitude too small by any metric to be subject to the CCPA (which is the law the letter I got referred to).
They absolutely did not survey only large websites.
I've read more than one response from people saying they are operating their website all by themselves and they definitely did not seem to be high traffic.
It's your analogy that the research in question is similar to fire alarm drills, and that the researcher is analogous to the people who conduct the drill. I'm merely pointing out that one cannot randomly conduct fire alarm drills on properties that don't belong to them.
The researchers don't own the websites, and they certainly don't own the internet. What entitles them to conduct the "drill"?
I have a blog hosted on GH Pages generated with Jekyll. I got this email from the researcher:
> To Whom It May Concern:
>
> My name is Tom Harris, and I am a resident of Sacramento, California. I have a few questions about your process for responding to General Data Protection Regulation (GDPR) data access requests:
>
> Would you process a GDPR data access request from me even though I am not a resident of the European Union?
> Do you process GDPR data access requests via email, a website, or telephone? If via a website, what is the URL I should go to?
> What personal information do I have to submit for you to verify and process a GDPR data access request?
> What information do you provide in response to a GDPR data access request?
> To be clear, I am not submitting a data access request at this time. My questions are about your process for when I do submit a request.
>
> Thank you in advance for your answers to these questions. If there is a better contact for processing GDPR requests regarding yifan.lu, I kindly ask that you forward my request to them.
>
> I look forward to your reply without undue delay and at most within one month of this email, as required by Article 12 of GDPR.
>
> Sincerely,
>
> Tom Harris
I honestly thought it was one of those legal trolls who sent the same email to everyone hoping to find someone to sue but I responded anyways explaining how statically generated sites worked and that I’m willing to provide the information, being that the information is that I have none…
The last paragraph in particular made it sound like a veiled legal threat (or that they’re hinting that they’re willing to go down that road). I felt that I had to respond just to establish some record.
It was specifically crafted to sound like there would be legal consequences - this internet tough-guy goes into the same bucket as the deceptive 'Microsoft technicians' asking you to buy gift cards - not as scammy or nefarious, but in a similar vein nevertheless.
I would be very much interested in seeing the IRB submission/application that was submitted for this study. I wonder whether or not it was mischaracterized to the IRB, or written in such a way as to diminish the problematic aspects.
We submitted an application detailing our research methods to the Princeton University Institutional Review Board, which determined that our study does not constitute human subjects research. The focus of the study is understanding website policies and practices, and emails associated with the study do not solicit personally identifiable information.
This came up during the Linux security patch debacle as well. IRB guidelines are focused on a narrow set of harms based in historic abuses of medical research, and don't necessarily condemn the types of deception available here. As TFA points out, “secret shopper” methods are common in academic research of business practices.
This is exactly what I don't understand about the whole thing. People are arguing that this was unethical, but the researchers literally proposed the study to their review board, which said "go ahead, it's not a human subjects research study, does not need consent, and is not by that virtue unethical." Perhaps the review board was wrong; people make mistakes. But assuming the review board was correct in its analysis of the situation (and who are we really to challenge that, unless there's a glaring, negligence-tier mistake), I have yet to hear an argument that dissects the ethics of this case and clearly lays out what ethical quandary we have on our hands and where the line was crossed.
It really seems to me that people are conflating "annoying" with "unethical". Sure, spamming people is annoying. But how is it unethical? I had the same questions about the Linux kernel security patches issue. An annoying waste of a few maintainers' time, arguably yes, for some definition of waste. But unethical? How so, and can someone link me to literature detailing the ethical framework that disallows otherwise-legal activity in good-faith pursuit of knowledge because someone got annoyed in the process? I think that would be an interesting read.
The emails purported to come from individuals, but were (1) written in an aggressive, legalistic style, and (2) directed at individuals who were not subject to CCPA and not equipped to deal with regulatory demands of it.
This caused significant anxiety on the part of the individuals who received this email, since it implied they would be subject to legal action if they did not provide a sufficient reply. It caused them to take significant action -- e.g., to research the law (that they aren't subject to), to determine if/how they could comply with a CCPA data access request if they had to, to consider retaining legal representation, etc.
It came off as some kind of scam or mistake, but one that had to be taken seriously.
You could read some of the blog posts from people who received these emails to understand the effect it had on them. It might also help to read the email they received and imagine receiving the same for a personal blog or a one-man shop.
I think you bring up a good point: that some of the blame going to the researcher really should be directed at the review board itself. It’s their responsibility to catch cases like this. The fact that some people who were included in the study without their consent are upset and angry, means they failed this responsibility.
I think what you are missing though, is that just because something passed a review board, that does not make it ethical. Review boards, like everything else, will make mistakes.
The study is a case of deception research. Deception research is a type of research in which the researchers are lying to / hiding information from their subjects - the mails purported to come from individual citizens, and did not mention that this was an academic study.
Other fields (e.g., psychology) have long since recognised inherent problems with the ethical aspects of deception research (in a very tiny nutshell: you harm your subjects' agency). Therefore, guidelines and protocols have been established (e.g., by the APA).
Roughly, those boil down to:
- don't do deception research unless no alternative method exists AND the outcome will have significant value
- inform the participants as soon as possible about the deception.
In this case, both IRB and researchers failed to recognise this as deception research. That is in itself a serious issue.
Especially in the area of CS/programming, it's so easy for experts to fool the IRB because they can hide behind words like "website", "data", "policy", as if they are dealing exclusively with machines.
> who are we really to challenge that
We are not obligated to make the same ethical judgement as the IRB. We are all entitled to challenge that.
Making bogus legal threats is unethical, when being sued can realistically lead to completely altered life. Yes, it's an annoyance, after the subjects determined that the threat is bogus, but it could be legitimately distressing (even costly) when they first received the threat.
The researchers made no efforts to contain the negative impact of their email, either. The email contains no information about it being a bogus threat. The subjects weren't told in advance that they might be lied to. The subjects had given no consent to being scared.
It doesn't seem "good faith" to send mass threatening emails with deliberately misrepresented laws.
Remember that the Milgram experiment also involved no illegal actions, and were done in pursuit of knowledge.
People aren't upset about it being annoying. People are upset that it read as a threat and resulted in people spending money to hire a lawyer because they thought they were about to get dragged into court.
Probably all they had to do was be transparent about who they were and the reason for sending out the emails, and throw in a link to that page (i.e. https://privacystudy.cs.princeton.edu). Seems like a silly thing to overlook, but the impact on people does seem serious, and I guess they know better now...
This is why I lost interest in a career in academia and set myself up in industry. I saw one too many situations like this, where people assumed they'd be stopped by the institution if they took things too far, and they were not.
This is a non-apology apology. It's "I'm sorry you feel that way."
I don't think I'm reading too much into it either: "I am dismayed that the emails in our study came across as security risks or legal threats."
"explaining in detail what we did, why we did it, what we learned, and how researchers should approach similar studies in the future."
Nothing about how it impacts their unwilling subjects. Nothing about failing to indicate they were doing an academic study. Nothing about the falsity of their legal threats.
You are misrepresenting the statement. The full quote is
> I am dismayed that the emails in our study came across as security risks or legal threats. The intent of our study was to understand privacy practices, not to create a burden on website operators, email system operators, or privacy professionals. I sincerely apologize. I am the senior researcher, and the responsibility is mine.
He stated the negative impact they had on study subjects (including the interpretation as legal threats), accepted responsibility, and apologized without reservations. How can you possibly claim he wrote "nothing about the falsity of their legal threats"?
Researchers don't have to indicate they are doing an academic study. Ethical actions don't become unethical simply because they're part of research.
> Researchers don't have to indicate they are doing an academic study.
Actually, they do. Otherwise it's called deception research, which falls under very specific guidelines.
In general, study participants should be treated decently. Deception research robs them of fully informed consent to participate in the study and is therefore inherently on ethically shaky ground. In specific cases, the benefit of the study's outcomes may offset the harm to its participants. Even then, that harm must be minimised.
Sorry, you're right, that's not the right way to say it. What I meant was deception is wrong regardless of whether it's part of an experiment. If the emails had been sent out under different false pretenses, like criminals probing for weak points, it would have been just as wrong.
You've clipped the quote to remove the part where they are describing the opinion of the IRB, not making a claim themselves. That paragraph explains why there was no IRB involvement; it does not claim that humans weren't involved in the research.
You're casting their description of the review board's opinion as if they were asserting their own opinion. If you want to attribute an opinion to them using a quote, find them stating their opinion and quote that.
It was their opinion - unless you expect that they told the review board they were going to email people and the board still decided it was not human research?
They also put it into the FAQ in a way that expressed no disagreement - and they decided to put it there in the first place.
(disclaimer: non-practicing lawyer here, not yours or theirs, off-the-cuff very hot take)
What part of 'A consumer shall have the right to request...' in the CCPA isn't clear? There's nothing about "A fake user may request..." Looking forward to another ballot initiative to further clarify the law!
The secret shopper thing is a big red herring - at least the secret shopper actually buys something, and in real, non-academic life is usually hired by the company (or by a marketing company by extension).
Disappointing to see the lack of judgment from a researcher who's otherwise done great work, and the IRB failure to boot. Good to see some acknowledgement but it feels like "let's build a tool to do the work and hope for good data." Not sure where this could've led other than a name-and-shame conference paper.
That's not the researcher's website, it's the department's website with a blurb about this researcher (and others) on it. The researcher almost surely had no influence on the existence of Google Analytics on that page. The entity you need to talk to is the department.
Part of the problem here was the smug tone of his student initially going on Twitter to make the dubious claim that the response to this study had been "overwhelmingly positive", when it was already abundantly clear that it was anything but.
I fail to understand what the problem is. They sent an email asking about the GDPR procedure in a way that implies they are considering using their rights. Now there is outrage, to the extent that the researchers scrapped their study, which is probably everything anybody needs to know about the state of the GDPR in practice.
> in a way that implies they consider using their rights.
... No, in a way that implied that they were about to take legal action, including against people who never had any legal obligation in the first place.
> [the system] sends up to several emails that simulate real user inquiries about GDPR or CCPA processes. This research method is analogous to the audit and “secret shopper” methods that are common in academic research, enabling realistic evaluation of business practices.
That was the whole problem. An open "we're researching responses" would've been fine. A murky "we're someone who looks fake, talking about nebulous legal consequences" isn't going to be welcomed anywhere, is it?
This is a better response than I expected, and I wish them luck and success in communicating the lessons they've learned here more widely.
>That seems to be highly unscientific/prone to skewing the sample.
I'd argue the emails they did send are MORE prone to skewing the sample.
If I had no CCPA plan and got an email from someone introducing themselves as researchers, I'd tell them I haven't gotten around to it, and that I intend to comply but haven't had anything prompt me to put in effort regarding CCPA.
If I got the email they did send, that would be exactly the request that makes me go do the legal research, and my response would be as narrow as possible.
Yes, perhaps they should have hired real people to make real requests. That would have been a bulletproof way of studying this ethically, so your conclusion is moot.
For some reason I was completely oblivious to the whole "drama", but reading the older threads and example emails - they were simply questions regarding adherence to the GDPR/CCPA.
On the one hand, stating that this is research would probably be "nice", though in that case a lot of websites would probably comply in the nicest possible way, skewing the results.
Why does this bother me? Because a while back I was trying to get my personal data off of Yelp and was nicely greeted with the middle finger (they scraped my data from the government website, which is the only page that may publish it - no one is legally permitted to re-publish it, but you know... US-based Yelp doesn't care about that law)…
My name is XXXX XXXXXXX, and I am a resident of Roanoke, Virginia. I have a few questions about your process for responding to General Data Protection Regulation (GDPR) data access requests:
1. Would you process a GDPR data access request from me even though I am not a resident of the European Union?
2. Do you process GDPR data access requests via email, a website, or telephone? If via a website, what is the URL I should go to?
3. What personal information do I have to submit for you to verify and process a GDPR data access request?
4. What information do you provide in response to a GDPR data access request?
To be clear, I am not submitting a data access request at this time. My questions are about your process for when I do submit a request.
Thank you in advance for your answers to these questions. If there is a better contact for processing GDPR requests regarding allisonelearn.com, I kindly ask that you forward my request to them.
I look forward to your reply without undue delay and at most within one month of this email, as required by Article 12 of GDPR.
Hot take: the researcher did nothing wrong. Some random person could make the same legitimate requests. If you dislike what this person did then really you just dislike that portion of the law.
You cannot simultaneously believe anyone can request their data via these laws and then get mad that people do it, research or not.
You can believe that users should be able to request this information legitimately, but arbitrary third parties should not.
The idea is that the burden and stress of response is outweighed by benefit to the legitimate user. In this case there is no legitimate user.
This is similar to the concept of standing in the courts. Someone who is harmed can bring a suit for compensation or redress, but an uninvolved third party cannot.
The answers to the questions don’t depend on whether or not the user is legitimate or not. Not to mention cases where the user isn’t even sure of their account information or lost email, etc.
I agree the answer does not depend on the legitimacy, but that doesn't matter. The answer to where were you last Tuesday night does not depend on who asks it, but only some have the right to ask that question and demand an answer.
I guess therein lies the disagreement: is the request more like one case or the other? It seems that most people feel the intent of the law is (or should be) to allow users to request information, not any unrelated third party.
Indeed, but the experiment wasn’t about requesting information, it was about requesting their policy around handling user data.
Seems reasonable to me - for example you’re a prospective user and want to know how they handle requests, just in case you want to do it in the future after being a user.
Which is itself a request for information. I think a request for policy information is reasonable if they didn't make up false identities and claim to be users.
It only takes a quick scan of the comments in this thread to see that there were people who received this email while hosting a static github.io website with their personal blog. That's a public website. Do you honestly think that anyone running a personal blog has no business doing so unless they are knowledgeable about the details of European and California website privacy rules? What a brilliant way to stifle public speech.
Your answer will probably be: "personal blogs don't fall under these regulations, so it's a non-issue", but that's exactly the point: these researchers scared a bunch of people into spending time researching a law that doesn't even apply to them, while the chance that some random person from Europe would send a GDPR request to their blog is essentially zero, because even privacy crusaders are smart enough to know that it makes no sense to do this.
Even if the general principle were ethical (not that I agree), the Princeton researchers should have used a curated list of websites that could reasonably be expected to receive GDPR requests.
I love the casualness about somebody wasting hundreds of dollars consulting a lawyer for something that isn’t relevant to them.
As for the personal blog: it is relevant, because the email was sent to owners of personal blogs.
You claimed that the recipients of this email, such as personal blog owners, had no business running a website if they didn't know the details of a law that doesn't apply to them. That's stifling, plain and simple.
People to whom the law doesn't apply are not necessarily very familiar with the details of this, and thus are going to be cautious if presented with what appears to be a legal threat. For a pro, this is easy to reply to, for random hobbyists it's not.
“The law says you have a month to reply.”? A little aggressive, but OK.
“According to such-and-such code, section 45, part b, subsection 3, you have 87 hours from the time I sent this — that is, from 12:43:56 PM Eastern time on this date — to give your on-the-record response.”? They’ve got a lawyer, and this is going to be a pain in the ass.
These particular emails were somewhere between the second and third options.
Asserting your rights (which you even say is "a little aggressive, but OK") as per specific regulation is far from being a legal threat where I live. That's just asserting your rights. A legal threat would be far worse.
> They’ve got a lawyer
...but this would suggest to me that this is cultural, since this thought would never occur to me.
How is this not a threat in the EU or anywhere? Yes, it's made worse by the litigious nature of the US, but that's beside the point IMO. The sender is clearly implying that there will be consequences for not responding. Even if this is the law and the sender is within their rights, it's still a threat.
The entire thing is even worse because most of these websites were under no obligation to reply but didn't know that, since they weren't experts in the law.
In your view, what purpose does informing someone of a law related to their compliance serve?
Saying on the basis of which regulation you're asking for something just isn't considered a threat where I live, period. People who want to make threats actually make threats.
> In your view, what purpose does informing someone of a law related to their compliance serve?
Well, obviously, in this case, it was about the expected time period. If you have a reasonable assumption that your request is not common (for example, businesses may plausibly receive far fewer GDPR requests than product warranty requests), then communicating the expectation seems like a prudent thing to do, since the other party is less likely to be familiar with it.
I'll have to take you at your word as I don't have experience where you live. Here, friendly requests tend to be much less formal. As a further example, suppose my dog was in my back yard barking, and this annoys my neighbor. They approach me about it:
> "Hey, neighbor, your dog is bothering us. Could you take it inside?"
Typical response: "Oh, sorry! Sure. Come here, pooch!"
> "Hello neighbor. According to county code section 23, 'Nuisances', paragraph 3, 'Pets', your dog can't bark for more than one minute without violating the ordinance and being subject to a fine of not more than $85."
Typical response: "Get off my property, and if your kid ever throws a baseball at my house again, I'm going to launch it through your front window."
Normal-person requests are usually formulated like "hi, can you do this thing for me?" even if the person being asked is obligated to do it. Citing law is considered an aggressive escalation.
A communication between two entities who are not friends is not "friendly". This is clearly a formal request of a type that is even regulated by law. You're almost certainly not asking your neighbor about something like this; you're almost certainly asking someone you've never met in your life. Not sure why it needs to be "friendly" any more than asking a government bureau through some formalized process (like filling out a form) needs to be "friendly".
I've gotten requests from people asking me to delete their account, sent from the email address they used to register it, along the lines of:
"Hi, I've forgotten my password, but I don't really use my account anyway. Could you delete it for me?"
And of course I comply, because I want to be helpful. They asked nicely; I replied nicely. It's a pleasant and productive interaction for all involved. This is the social norm here.
But the example you outlined is not regulated by any law as a formal procedure. That's an ad-hoc request. Of course it could also be phrased as a GDPR erasure request, but I bet you'd expect that to be more formal and more specific. After all, that would be a (formally) legal request, not just something you may decide to do or not to do depending on how you slept last night.
My doorbell is designed to be pressed. But I do have a problem with someone who runs down the street pressing every doorbell because they want to gauge homeowners' response times.
Spamming and wrong intentions can make an otherwise legitimate action unethical.
If anything, it seems like this was an effective means to introduce a lot of people to possible liabilities they have under GDPR/CCPA (or why they are not applicable to them).
Fine, but I had no desire to be introduced to the intricacies of the CCPA that afternoon. I was off minding my own business and didn’t ask for an “Are You Compliant For Dummies” course to be dropped in my lap.
Yes, but whether or not it’s explicitly stated doesn’t really change the law.
Ultimately I don’t really get the big deal. It takes five minutes to reply to this, and if you don’t, no one is going to waste resources bringing you to court unless you’re some huge organization.
It’s not that they’re implying that it is illegal - it’s that it is.
I'm honestly baffled by the response, especially from the pro-privacy crowd on HN. This is simply the reality of GDPR. If you host and operate a website that serves EU visitors, you must comply with GDPR. Of course this is a burden on small operators, and it may come off as alarming the first time you receive a GDPR request; however, this is GDPR working as intended. It is intended to force operators to explicitly decide which user data they are going to collect (including how they will inform users and how they will correct, delete, and export that data).
I do agree that there might be ethical concerns on how this study was conducted, however, the email messages do not suggest pending legal action. They're pretty standard GDPR requests.
The emails were sent to websites that do not process personal information and are thus not subject to GDPR, so the recipients were in some cases confused about what their responsibilities would be. And though the emails did not suggest that legal action was pending, they do suggest a willingness to resort to legal action in a relatively short time frame. This caused anxiety for apparently many small-time, non-profit bloggers.
Is it unethical? I dunno. But it's nuanced, at least.
As soon as the client's IP address touches your server you are processing personal information. E.g., I have seen many webservers that save these in their access logs.
Again, this is the reality of GDPR. It is not okay to operate a website serving EU visitors without considering the GDPR implications. This is how GDPR is intended to work. Don't operate a website serving EU visitors if you don't have a plan for how to respond to these emails. I'm not trying to be harsh or to dissuade these small websites from operating; it is just the reality of GDPR.
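To make the access-log point above concrete: a common mitigation is to truncate client IPs before (or after) they hit the log. A minimal sketch in Python, where the `mask_ip` helper and the sample log line are invented for illustration (IPv6 handling omitted; whether last-octet masking counts as sufficient anonymization is a legal question, not settled here):

```python
import re

# Match an IPv4 address and capture everything but the last octet.
IPV4 = re.compile(r"\b(\d{1,3}\.\d{1,3}\.\d{1,3})\.\d{1,3}\b")

def mask_ip(log_line: str) -> str:
    """Zero out the last octet of any IPv4 address in an access-log line."""
    return IPV4.sub(r"\1.0", log_line)

line = '203.0.113.42 - - [10/Dec/2021:13:55:36 +0000] "GET / HTTP/1.1" 200 612'
print(mask_ip(line))
# -> 203.0.113.0 - - [10/Dec/2021:13:55:36 +0000] "GET / HTTP/1.1" 200 612
```

The same idea can be applied at the webserver level (e.g. via a custom log format) so full IPs are never written to disk in the first place.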
It's bizarre to me that anyone would respond to what amounts to a legal request delivered via email.
Unless the government has a subpoena that was verifiably delivered to me physically, an email will be either totally ignored, or I'll respond telling them to go pound sand.
Is that how you would respond to any GDPR-related request? GDPR legally requires you to respond to requests within a month. If someone making a request points that out to you, you may feel like you've been threatened. That doesn't change the law.
Yes, that is how I would respond to a request inquiring as to what my GDPR practices are.
The requester here isn't asking to access their information for GDPR reasons. They are asking what my private business operations are, which is not part of what I'm required to disclose, as far as my understanding goes.
Separate from the above, if I run a US based business, why would I care if the EU wanted to try and sue me for breaking a law that has no jurisdiction over me?
Yes, but the researchers aren't asking about their data. They are asking about internal business policies, which they are not granted under the law.
From the FAQ linked in the OP:
"""
Why does this study involve contacting websites?
Very few websites post details of their processes for handling GDPR and CCPA requests. Both the GDPR and the CCPA contemplate users and intermediaries reaching out with questions about data rights processes, and we are using that opportunity to understand current website policies and practices.
"""
From the sites I've seen discussing responsibilities, internal business processes for how GDPR requests are handled are not covered under GDPR.
The GDPR establishes data subject rights, which means that, with respect to their personal data, customers, employees, business partners, clients, contractors, students, suppliers, and so forth have the right to:
Be informed about their data: You must inform individuals about your use of their data.
Have access to their data: You must give individuals access to any of their data that you hold (for example, by using account access or in some manual manner).
Ask for data rectification: Individuals can ask you to correct inaccurate data.
Ask for data to be deleted: Also known as the 'right to erasure', this right allows an individual to request that any of their personal data a company has collected is deleted across all systems that use it or share it.
Request restricted processing: An individual can ask that you suppress or restrict their data. However, it is only applicable under certain circumstances.
Have data portability: An individual can ask for their data to be transferred to another company.
Object: An individual can object to their data being used for various uses including direct marketing.
Ask not to be subject to automated decision-making, including profiling: The GDPR has strict rules about using data to profile people and automate decisions based on that profiling.
I was referring purely to the "Why would anyone respond to an e-mail" part -- because there's no requirement for such a request to not be an e-mail for it to be valid.
I took the fact that the request was outside the scope of GDPR to mean that the researchers were trying to thinly veil their request as an official government request akin to a subpoena, which is why I originally stated I wouldn't accept electronic delivery of subpoenas.
Note: I also would freely ignore real GDPR requests because I'm US-based, and the EU has no jurisdiction over me.
Claiming that this study was "human research" is straight-up insane. There's as much "human research" going on there as there is when statisticians acquire data by going to a human who possesses it and asking for it (e.g. they want to study baseball, so they go to the Head Excel Overlord of the MLB and ask for their data) - that is, exactly none.
Analyzing data that was given to you by a human does not automatically make your research project "human research", full stop.
The research question here is "How do organizations fulfill their legal requirements due to the GDPR and CCPA?" and has nothing to do with individuals. The researchers didn't want any data about the humans, they didn't ask for it, they weren't studying reaction time, psychological response, brain activity, or anything else about any humans at all.
If the answer to the question "Could you replace a human in the experiment with an automated process and still perform a meaningful experiment with conceptually identical results?" is "yes", then there is no human research going on. Test: the Stanford Prison Experiment: the answer is "no" (because the students were actually the ones being studied), so it turns out to be human research. Test: studying the properties of a crystal in a lab where one of the machines is operated by a human grad student: the answer is "yes", so it's not human research. (Note also that this test only proves that human research isn't being done - if you replace the baseball players in the above example with robotic batters, the answer to the above question is "no", but clearly human experimentation isn't happening.)
Similarly, if each of the individuals who were contacted were replaced with a Perl script scanning for keywords that automatically replied with the organization's GDPR and CCPA policies, the results from the experiment would be. Exactly. The. Same.
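The Perl-script thought experiment above can be sketched out. Here is a minimal version in Python rather than Perl, with the keyword list and canned policy text entirely made up for illustration (no such script existed in the study; this only illustrates the argument that the responder's humanity is incidental):

```python
# Keywords a privacy-request message might contain (invented for the example).
KEYWORDS = ("gdpr", "ccpa", "data subject", "right to erasure", "do not sell")

# The organization's canned policy reply (also invented).
POLICY = (
    "We do not collect or sell personal information. "
    "Requests under the GDPR or CCPA are answered within 30 days."
)

def auto_reply(message):
    """Return the canned policy if the message mentions a privacy law,
    otherwise None (leave it for a human)."""
    lower = message.lower()
    if any(keyword in lower for keyword in KEYWORDS):
        return POLICY
    return None

print(auto_reply("Hi, what is your process for GDPR erasure requests?"))
```

If every recipient had run something like this, the researchers would have collected exactly the same dataset, which is the crux of the "no human subjects" argument.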
In fact, let's go and look up the definition of "human subjects research":
According to 45 CFR 46, a human subject is "a living individual about whom an investigator (whether professional or student) conducting research:
Obtains information or biospecimens through intervention or interaction with the individual, and uses, studies, or analyzes the information or biospecimens; or
Obtains, uses, studies, analyzes, or generates identifiable private information or identifiable biospecimens."
Let me repeat again: the fact that humans responded to the emails about the GDPR/CCPA compliance of the organization that owns the email address is completely irrelevant. The experimenters didn't care whether their request was serviced by a human or a computer, and in fact would have preferred computerized responses (instant response, more standardized formats, no possibility of human error).
Claiming that this is "human experimentation" is disingenuous to the point of malice.
> The research question here is "How do organizations fulfill their legal requirements due to the GDPR and CCPA?" and has nothing to do with individuals.
Where your position falls apart is that they sent emails to private individuals. Although I'm employed, I received the email to my personal address regarding a not-for-profit, zero-revenue hobby website I maintain. Another story here on HN was from them sending the letter to a blogger.
They didn't contact people in the "eventually it will end up on the desk of a human lawyer at a Fortune 500 company" sense, but in the "this came to my personal computer and was addressed to me individually" way.
Frankly, I don't care what their stated purpose was here: the reality is that they directly contacted a whole awful lot of people about their hobbies.
> Where your position falls apart is that they sent emails to private individuals.
That in no way counteracts any part of my argument. Actually, if anything, it reinforces it - the fact that the emails sent to private individuals were unintentional proves that there was no human research being done. It's quite clearly impossible to accidentally do research on humans.
It is possible to unintentionally harm people as a result of doing non-human research - but then again, you can unintentionally harm others through traffic accidents, business deals, entering the wrong information into a database, engaging in gossip, accidentally bungling a normal research experiment (high-energy physics?) and doing just about anything in life. It should be really obvious that the possibility (or reality) of human harm doesn't imply human research.
> Although I'm employed, I received the email to my personal address regarding a not-for-profit, zero-revenue hobby website I maintain. Another story here on HN was from them sending the letter to a blogger.
...and, really, there should have been no harm done. If you run a website, then you have a certain set of rights and responsibilities, and one of those responsibilities is to know exactly what data your site collects, and how to respond to a GDPR or CCPA request. That applies to you, to that other blogger, Xe, and every single person (and organization) that received these emails. And, if you weren't prepared for that, that's your fault, and consider this bungled research project a wake-up call.
It’s more complicated than that. The email I received was wrong about my legal obligations to respond. First, I received the email regarding a tiny personal site I operate for the fun of it, and I don’t meet the CCPA thresholds. Second, nothing in the CCPA says I have to reply to random information-gathering requests anyway. And yet the email gave me a deadline to respond and cited a specific law, claiming that I owed them a response.
I don't know whether the problem is the law or not. It's interesting to see that some people view GDPR as a security risk, but it's a great way for people to see what is being used to help an entity carry out its function.
I sent this to my politician and am still waiting for a response but I'm more than interested in what they use tech wise, and I think it covers everything!?!
GDPR Request:
Everything you have on me, please. Highlight what you or your 3rd parties consider to be for law-enforcement or scientific purposes and therefore cannot be deleted, and please detail any and all 3rd parties who may be required to handle my data to enable you to perform your function as an MP when dealing with me. Examples will include anti-spam and anti-virus software vendors, systems backup companies, cloud infrastructure providers, network infrastructure providers, computer equipment providers, external national or regional departments, private assistants or secretaries, mobile phone companies, and data analytics companies that have (in)directly identified me in order for you to get into office. This list of examples is not exhaustive.
After I have received & reviewed the data, I will inform you of what can be deleted.
In order to communicate via email with British politicians, you have to include your name and address.
I also sent GDPR requests to GCHQ, MI5, the various police constabularies where I have lived or passed through, and the Police National Database, because a police constabulary doesn't have to pass information to the central national database.
It's quite interesting knowing what they know about you, and I would urge all Europeans to do the same, i.e., GDPR your security services and police forces!
I don't have a mortgage, but in theory you could also use GDPR for this: if you have the deeds to your mortgaged property, you could DSAR your mortgage lender and then ask them to remove all your data from their systems, as it's not for scientific or law-enforcement purposes. You should also time it so that all the credit agencies like Equifax and Experian remove all trace of your mortgage at the same date and time. In theory you should end up with a mortgage-free property, but I can't try this as I don't have a mortgage.
Hacking isn't just restricted to computers; you can exploit the law as well. :-)
I'd argue the opposite - they are likely to be one, especially if they've run a successful campaign.
Candidates and parties will canvass the electorate to identify who is likely to vote for whom, so they can put resources into the right things and make sure likely supporters turn out to vote, etc.
That's not to mention that any (prior) correspondence with the politician (or their office) will almost certainly also contain personal data.
OK, that would make sense. That's more likely a party thing where I live (so you wouldn't contact a random MP but someone in an administrative position in the party), but I imagine it's not necessarily the same thing elsewhere.
They did not screw anyone over, their email specifically says that it is not a data request. I remember one person saying they had a panic attack reading the email. Like come on, let's be real.
I got the email and I nearly had a panic attack. The email wasn’t completely generic: it referred to my specific site. It also read an awful lot as if it was coming from someone who would be looking for the slightest mistake in my response so that they could sue me. Similar things happen[1]. And finally, it lied and said I was compelled to respond, quoting a law that said no such thing (and which wouldn’t apply to my zero-revenue personal project website anyway).
The stress wasn’t from the email. It’s that it gave every indication that I was being contacted by a legal troll, and that I might have to defend my hobby project in a courtroom. I couldn’t afford the costs of doing that, even if I ultimately won, and the idea of “well, there goes the college fund because of a stupid lawsuit on my hobby” was awful.
Thank you for sharing your experience; you're shedding more light on the situation than anyone else is right now. Out of curiosity, would you be willing to copy and paste the email contents here?
I was part of a human subject research study without my consent - https://news.ycombinator.com/item?id=29611139 - Dec 2021 (360 comments)
CCPA Scam – Human subject research study conducted by Princeton University - https://news.ycombinator.com/item?id=29599553 - Dec 2021 (331 comments)
Princeton-Radboud Study on Privacy Law Implementation - https://news.ycombinator.com/item?id=29599154 - Dec 2021 (10 comments)