It's very unfortunate that all of these AI models are so impressive, yet they're all heavily filtered and bogged down by these massive AI corporations. All the filtering that they do heavily impacts the performance of the language models. A 100K context would also be incredible for roleplay but infeasible because of the heavy filtering.
Claude may reject answers, but unlike with OpenAI's GPT, you can put words into the assistant's mouth and essentially bypass safety checks.
In fact, Anthropic explicitly discusses putting words into the assistant's mouth to shape its responses and better align them with the desired output.
Eventually you will get your account banned. Besides, the filtering they do decreases the quality of the results compared to an uncensored model, even if you can "jailbreak" it.
If we were going to get banned for this, we'd have been banned long ago. We literally process "questionable" content as a service, and this use case was explicitly approved. (We run heuristic and ML-assisted background checks on unstructured OSINT data.)