Copyright is not a blacklist but an allowlist of things reserved for the holder. Everything else is fair game. LLM ingestion comes under fair use, so no worries. If someone can get their hands on a work, nothing in law stops it from being ingested for training.
We can debate whether this law is moral. Like the GP, I too agree that public data in -> public domain out is what's right for society. Copyright as an artificial concept has gone on for long enough.
I don't think so. It is nowhere near "limited use": the entirety of the source code is ingested to train the model. In other words, it meets the bar of the "heart of the work" being used. There are other factors as well, such as not harming the owner's ability to profit from the original work.
This hasn't gone to the Supreme Court yet. And that is just the USA; courts in the rest of the world will also have to take a call. It is not as simple as you make it out to be. Developers are spread across the world, with the majority living outside the USA. Jurisdiction matters in these things.
Copyright's ambit has been pretty much defined and run by the US for over a century.
You're holding out for grace from the wrong venue. The right avenue would be lobbying for new laws to regulate LLM training and use, not trying to find shelter in an archaic and increasingly irrelevant bit of legalese.
I don't disagree. However, your assertion that copyright was initially defined by the US (which is not a fact: it was England that came up with it, and it was adopted by the Commonwealth, of which the US was a part until its independence) does not mean the jurisdiction is the US. Even if the US Supreme Court rules one way or the other, it doesn't settle much, as the rest of the world has its own definitions and legalese that need to be scrutinized and modernized.
Alsup absolutely did not vindicate Anthropic as "fair use".
> Instead, it was a fair use because all Anthropic did was replace the print copies it had purchased for its central library with more convenient space-saving and searchable digital copies for its central library — without adding new copies, creating new works, or redistributing existing copies. [0]
It was only fair use because they already had a license to the works at hand.
These are crappy arguments. The author is seeking to re-litigate "piracy of IP is bad" and "AI is bad".
If those are your axioms, then you will find the old world is already in the rear-view mirror, and people who hold them want to pull every other project back to stay with them in that world.
AI is here. Free software succeeded: make as much as you want. This technology is a force multiplier.
You can debate its morality, but most people just want to do their work.
I get (incorrectly) accused of writing undisclosed sponsored content pretty often, so I'm actually hoping that the visible sponsor banner will help people resist that temptation, because they can see that the sponsorship is disclosed, not hidden.
That's actually a cleaner editorial standard than most publications follow. The major risk in tech journalism isn't disclosed sponsorships — it's the undisclosed access journalism where coverage tone shifts to maintain relationships. Visible banners beat invisible influence every time.
Honestly, after his ~23 years of writing online I think he's fairly earned the title of independent researcher. He added those sponsorships three days ago; perhaps wait to sound the alarm until he actually writes about a sponsor.
I can't offer an example of code, but considering researchers were able to cause models to reproduce literary works verbatim, it seems unlikely that a git repository would be materially different.
These arguments absolutely infuriate me. Your code is not that unique. Lots of people write the same snippet every day and have no idea that somebody else just wrote the same thing.
It's such a crock that you can somehow claim you're the only person who can write that snippet and now everyone else owes you something. No. No they don't. Get over it.
Writing a book is different. Lifting pages or chapters is different, because it's much harder for two people to write the exact same thing. Code is code; it follows a formula, and everyone uses that formula.
Assuming that even works from a researcher's perspective, it's working backward from a specific goal. There are zero actual instances (and I've been looking) where verbatim code has been spat out.
It's a convenient criticism of LLMs, but a wrong one. We need to do better.
> There are zero actual instances (and I've been looking) where verbatim code has been spat out.
That’s not true. I’ve seen it happen and remember reports where it was obvious it happened (and trivial to verify) because the LLM reproduced the comments with source information.
Either way, plagiarism doesn’t require one to copy 100% verbatim (otherwise every plagiarist would easily be off the hook). It still counts as plagiarism if you move a space or rename a variable.
You should take your findings to the large media organizations including NYT who've been trying to prove this for years now. Your discovery is probably going to win them their case.
I don't know of code examples, but this tracks for me. Any time I have an agent write something "obvious" but crazy hard -- say, a new compiler for a new language? Golden. I ask it to write a fairly simple stack-invariant version of an old algorithm using a novel representation (topology) and a novel construction (free module)... zip. It's 200 LOC, and after 20+ attempts, I've given up.
It happens often enough that the company I work for has set up a presubmit that checks all AI-generated and AI-assisted code for plagiarism (which they call "recitation"). I know they're checking the code for similarity to anything on GitHub, but they could also be checking against the model's training corpus.
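I don't know how that presubmit is actually implemented, but a common way to detect this kind of overlap is token shingling: normalize the candidate code into tokens, take every k-token window, and measure how many windows also appear in a reference corpus. A minimal sketch, assuming a toy in-memory corpus (the real check, its threshold, and its normalization are unknown to me):

```python
def shingles(code: str, k: int = 8) -> set:
    """Split code into whitespace tokens and return all k-token windows."""
    tokens = code.split()
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}

def recitation_score(candidate: str, corpus_snippets: list, k: int = 8) -> float:
    """Fraction of the candidate's shingles that appear anywhere in the corpus.

    1.0 means every k-token window of the candidate exists verbatim in the
    corpus; 0.0 means no overlap at the chosen shingle size.
    """
    cand = shingles(candidate, k)
    if not cand:
        return 0.0
    corpus = set()
    for snippet in corpus_snippets:
        corpus |= shingles(snippet, k)
    return len(cand & corpus) / len(cand)

# A presubmit might block a change whose score exceeds some threshold,
# e.g. recitation_score(new_code, corpus) > 0.5 -> require human review.
```

Renaming a variable or reflowing whitespace defeats naive shingling, which is why production systems usually normalize identifiers and literals before hashing; this sketch skips that step for brevity.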
Have you worked with a professional architect? Costs add up fast, and you get maybe 1-2 iterations.
I'd love to work and vibecode the house to my full liking, assuming the agent harness takes care of all the nonfunctional things (stable design, zoning, etc.). Same for a car: if I could customize it, I would.
(I definitely don't like the ramifications of it on the economy/jobs, but the above are pure consumer wins, no doubt)
> I'd love to work and vibecode the house to my full liking,
Instead of dead code, it'll leave you with a few extra secret rooms that have no doors or windows :)
The reason you wouldn't want this is cost. The cost of building a house is only marginally affected by designing it with an AI agent; most of it is materials: bricks, etc.
I'm starting to get a new sense of which people LLMs are useful for. I'm sure they're life-changing for those with intelligence below that of a child, so I'm glad for you that you have this tool available now.