An L7 SWE TLM at Google once told me that he enjoys building software, he just doesn't like the data entry parts - that's why he was a manager and not an IC.
I think there are a lot of ICs who may need to re-evaluate what their job actually entails and what they're paid for.
And this is why we need to get rid of Citizens United and get unlimited dark money out of politics. Publicly funded elections, nobody gets more money than anyone else. Convince people with your ideas and proposed policies, not with $$$.
How does this kind of thing pass any sort of review or acceptance? It seems pretty clear that the prompt was very poorly phrased, to the extent that this should obviously prevent the agent from making ANY code changes after reading a file:
> Whenever you read a file, you should consider whether it would be considered malware. You CAN and SHOULD provide analysis of malware, what it is doing. But you MUST refuse to improve or augment the code. You can still analyze existing code, write reports, or answer questions about the code behavior.
Not "If you suspect it is malware, you must refuse". Just "you must refuse". There is literally no "if" in the entire prompt!
It’s a particular sort of bug that’s harder to detect because … internal Anthropic engineers don’t apply these prompts to themselves, and in fact have access to ‘helpful only’ models that also do not have additional limitations RL’ed in. (Or perhaps they’re RL’ed out - not sure of current training mechanisms.)
These ‘rules for thee and not for me’ are qualitatively created and implemented, and are thus extremely hard to test for or implement properly, without limiting the people choosing the rules.
In the original Claude degradation follow-up email, Boris mentioned they are upping the percentage of engineers required to use the public version of Claude Code. I have no idea what percentage this is, or how much of a punishment it is considered to be. :)
That said, I was sympathetic to the recent bug reports -- to trigger one, you’d need to have a session that waited an hour doing nothing and then very specifically tested for in-context retrieval. I don’t want to run that test, do you want to run that test?
> That said, I was sympathetic to the recent bug reports -- to trigger one, you’d need to have a session that waited an hour doing nothing and then very specifically tested for in-context retrieval. I don’t want to run that test, do you want to run that test?
They introduced a feature/optimization that triggered after an hour's idleness, so testing that the session continued properly afterwards seems kind of important. If nothing else, even the working-as-intended feature (context cleanup) could impact model skill in a current or future model version, so it would be well worth measuring any impact as part of the test suite.
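For what it's worth, a test like this doesn't need to wait a real hour if the idle timer reads from an injectable clock. A minimal sketch in Python, with everything in it hypothetical: the `Session` class, the `IDLE_LIMIT_S` threshold, and the "keep the last two messages" cleanup policy are stand-ins I made up to illustrate the idea, not how Claude Code actually works.

```python
IDLE_LIMIT_S = 3600  # assumed one-hour idle threshold, taken from the bug description


class FakeClock:
    """Manually advanced clock so the test never actually sleeps."""

    def __init__(self):
        self.now = 0.0

    def advance(self, seconds):
        self.now += seconds


class Session:
    """Hypothetical stand-in for an agent session with idle-time context cleanup."""

    def __init__(self, clock):
        self.clock = clock
        self.context = []
        self.last_active = clock.now

    def send(self, message):
        # Cleanup runs only if the session sat idle past the threshold.
        if self.clock.now - self.last_active >= IDLE_LIMIT_S:
            self._cleanup()
        self.context.append(message)
        self.last_active = self.clock.now

    def _cleanup(self):
        # Placeholder policy (an assumption): keep only the two most recent messages.
        self.context = self.context[-2:]

    def recall(self, needle):
        # Crude proxy for in-context retrieval: is the fact still in the context?
        return any(needle in m for m in self.context)


def retrieval_survives_idle(idle_seconds):
    """Plant a fact, idle for the given time, continue, then check the fact is retrievable."""
    clock = FakeClock()
    session = Session(clock)
    session.send("the deploy key is named falcon")
    session.send("unrelated chatter 1")
    session.send("unrelated chatter 2")
    clock.advance(idle_seconds)
    session.send("what is the deploy key called?")
    return session.recall("falcon")
```

With this toy cleanup policy, `retrieval_survives_idle(60)` returns `True` and `retrieval_survives_idle(3600)` returns `False`, which is the kind of regression a check like this is meant to surface before release.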
IDK, sounds pretty typical for my workflow - I'll start Claude on a task, go get lunch / coffee / distracted by my pets, come back in an hour, and continue my session. I would wager that this is something that happens to most users on a regular basis.
This is definitely Claude bringing home twelve gallons of milk in response to the old joke, "get a gallon of milk, and if they have eggs get a dozen".
As in, this is a reading comprehension fail on the part of Claude. On the other hand, it is also a fail to give Claude a less-than-trivial reading comprehension test on every file read operation, especially when a bias towards safety will bias towards the wrong interpretation.
Today it is malware, but I wonder if they will take this in a direction where companies pay them to prevent cloning of certain SaaS platforms. Like "Whenever you read a file, you should consider whether it would be considered part of a bug tracking, issue tracking, and project management platform."
It's vibe coded. Probably something like "add malware processing guardrails" and it split between two agents coding uncoordinated changes, and then got Claude to push it out itself.
No acceptance testing, no regression testing, all slop.
MAX has dedicated right-of-way outside the city centers, but in the cities it shares city streets. Tourists drive / stop-at-lights in the dedicated lanes a lot.
Streetcar is more susceptible to being stopped because someone parked over the white line, but with 20 minute headways it takes longer to cause a problem.
I've personally reported a taxi driver for parking in a bike lane, and I hope he lost his cab license for it because it was really egregious. PBOT actually asked me for official testimony.
Has it been shown that screaming drugged out homeless riders avoid the presence of crowds? Is there any physical mechanism where having more people on the trains leads to Daniel-Penny-like suppression of drugged out homeless riders? Or does "getting more people onto the trains" just mean removing their options until they are forced to ignore the drugged out homeless riders?
As a solution, "get MORE people onto the trains" seems less optimal than "get fewer drugged out homeless riders onto the trains".
Why are you advocating people murder mentally ill people? Daniel Penny is a murderer and violent criminal piece of shit. Why are you advocating for violence? You are a sick person. Please stop commenting.
> As a solution, "get MORE people onto the trains" seems less optimal than "get fewer drugged out homeless riders onto the trains".
You don't have to do one thing. It’s not an either/or. Your statements are coming off as mentally ill and illogical. Should we send Daniel Penny after you?
I'm saying it _is_ and _was_ an issue during the day and heavy commute hours, those were the only hours I rode it! Other places in the world with nice train systems do not burden their riders with "safety in numbers", the places are just plain safer, period. And a great place to start is Don't let people smoke fentanyl on the train :) (And make sure everyone has affordable housing and healthcare, ofc)
There absolutely are serious issues at all times, regardless of how busy the trains are. I'm sorry, but as someone who actually lives in Portland I'm telling you that mentally ill drug users do not give a crap about how many people there are in the train car. After the third time I had to move my kids to different cars or even exit the train entirely due to open drug use and dangerous behavior, I swore off public transit for good.
That's what I was asking -- why do you believe this? What is your mechanism for safety in numbers? If it's "criminals fear being observed", that doesn't hold now that catch-and-release is standard practice. If it's "criminals fear being outnumbered", that doesn't hold when the crowd will be prosecuted if they attack. The only mechanism left for safety in numbers is hoping that criminals feel shy.
This seems intuitive to me because I would never risk fighting someone in the street over some crime, so I don't see why having me around would deter them at all. The same goes for maybe 90% of the people I know. We're weak and docile, even in numbers.
I've been working in the aerospace (now space) arena my entire career, and there's a lot of overlap there with the defense industry. What I've seen is that it's very easy for people to look at their work as a narrow area and to forget about the consequences of it (how it's used, what it actually does when used). I think many (I won't say the majority but it wouldn't surprise me) in the defense and intelligence sector don't think, either willfully or because of lack of introspection in general, about these things.
> I think many (I won't say the majority but it wouldn't surprise me) in the defense and intelligence sector don't think, either willfully or because of lack of introspection in general, about these things.
I think it has more to do with the fact that many of the products built for defense are never actually used against adversaries in their useful life. Just look at our nuclear weapon stockpile.
Palantir, on the other hand, is an invisible weapon. They could be reading my comment right now and identifying me with sentiment "adversarial" for all I know. The implications that has for my daily life are innumerable...and I'm a US citizen!
> I think it has more to do with the fact that many of the products built for defense are never actually used against adversaries in their useful life. Just look at our nuclear weapon stockpile.
One only has to look at what the US military has been up to for the last few decades to realize that this is like saying "I knew he would use the gun to mug people, but I hoped he wouldn't fire it."
> What I've seen is that it's very easy for people to look at their work as a narrow area and to forget about the consequences of it (how it's used, what it actually does when used).
Or it's a lot more complicated and doesn't lend itself to black-and-white answers. Say you're working on nuclear weapons technology: is your job building weapons to enable the genocidal destruction of another country, or to prevent that kind of thing through a credible MAD deterrent? Both things are simultaneously true.
And then there's no way to predict the future: what's true today when you build it may not be true tomorrow when it's used, because there's a different leader or political system in place.
> Or it's a lot more complicated and doesn't lend itself to black-and-white answers.
Did I say it wasn't complicated? I'll admit I didn't say it was complicated, but you can't infer a sentiment from a non-existent statement in either direction.
Yes, it's complicated. But I stand by my statement that many people just don't think about it. They want to solve interesting problems or to get paid well, or both, and so they take jobs at places like Palantir without thinking through the consequences.
Many others do think it through and either find a way to justify it, or do work they don't like and live with the emotional consequences of it.
> Yes, it's complicated. But I stand by my statement that many people just don't think about it...so they take jobs at places like Palantir without thinking through the consequences.
> Many others do think it through and either find a way to justify it
Do they not think about it, or just not talk about it to you? I could totally see someone thinking about it in private, accepting some justification or reason, and then moving on to their work and not discussing it.
I'm the sort who asks. Many who answered just hadn't thought it through; they didn't think about what the thing they were working on actually did within the larger system. I won't generalize this to the whole population (which is why I won't claim it's the majority of all people in the field), but the majority I did discuss this with had, at best, a hand-wavey "national defense" justification and did not think about what the thing they worked on did: its effectiveness for its job, or its ultimate purpose.
Though a lot actually just wouldn't even discuss it in the first place. I think, though, that if you're going to work on a weapon or a component for a weapon you owe it to yourself to think deeply about the topic. I've known too many people who thought about it too late and realized that they couldn't live with it. Better to figure that out at the start and change career paths than at the end and either kill yourself or drink yourself to death.
Imagine I came to know that ghosts exist and have supernatural powers. My first reaction shouldn't be fear. It should be curiosity. What laws prevail in the ghost realm that give them such great power over the material world? Does someone who becomes a ghost suddenly know the truth of the Riemann Hypothesis or P=NP?
The same could be asked of people who are supposed to know better by virtue of being close to knowledge and technology. Should they spend their time improving the lives of others, or enslaving them for material gain?