Getting ML to reliably do something specific, like flagging an image as inappropriate, is either (a) already well understood in the field or (b) impossible to do reliably and not getting easier, which is exactly why these content moderation companies exist. Otherwise we'd already be doing it automatically. None of the recent advances are in a direction that brings us closer to being able to do this.
I think you're probably in the majority, though, who misunderstand what GPT et al. are doing and think of them as a general advance in ML, as opposed to just a different kind of demo that works most of the time.
Are you sure about that? We haven't seen what multimodal GPT-4 can do in the wild, and it can take the full conversation history into context in addition to the images. If it can easily understand visual jokes, are you sure it can't detect CSAM?
We can also reliably GENERATE inappropriate content now: simply adding an 'nsfw' tag to a Stable Diffusion prompt flips a normal image into an inappropriate one. It doesn't sound very difficult to reverse that process.
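To make "reversing it" concrete, here's a minimal sketch of that idea using CLIP (the text-image model that Stable Diffusion's text conditioning builds on) as a zero-shot classifier. The model name is real, but the label prompts and threshold are illustrative assumptions, not a vetted moderation policy:

```python
# Minimal sketch: zero-shot "nsfw vs. safe" scoring with CLIP.
# Assumption: label prompts and the 0.9 threshold are invented for
# illustration; a real moderation pipeline would tune and validate both.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def nsfw_score(image: Image.Image) -> float:
    """Return the probability CLIP assigns to the 'nsfw' label."""
    inputs = processor(
        text=["a safe, work-appropriate photo", "nsfw, explicit content"],
        images=image,
        return_tensors="pt",
        padding=True,
    )
    logits = model(**inputs).logits_per_image  # shape: (1, 2)
    return logits.softmax(dim=-1)[0, 1].item()

image = Image.open("upload.jpg")  # hypothetical uploaded file
if nsfw_score(image) > 0.9:
    print("flag for human review")
```

The same trick works with any label pair, which is why people describe it as "reversing" the generative tagging rather than training a dedicated classifier from scratch.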
Also, for these services, you don't need it to be perfect. If flagging accuracy merely goes up significantly, that already means far fewer human workers are needed for review.
The AI ecosystem as a whole is also booming massively in hardware, datasets, talent, and software infrastructure, which makes development in traditional ML faster too.
As someone who is very optimistic about GPT, I'd say you are utterly misunderstanding what content moderation teams deal with if you think GPT can accurately assess the majority of their cases.
Accuracy will not go up significantly when humans are already struggling. And I'm not talking about the mental toll of the job; I'm talking about being unable to determine whether a case actually violates the TOS or not.
This is going to get even more difficult, for both humans and AI, as content is generated by AI at alarmingly fast rates. So even with AI used to combat AI, there will still be a gigantic tidal wave of questionable shit to go through.
The options aren't 100% human vs. 100% AI, though?
I'm no expert, so maybe you can clarify why high accuracy/reliability is that important for an initial AI analysis? I would expect the vast majority of reports to be straightforward matters.
There's a human poster involved, who can trigger a review if they disagree with the decision, and that review can be attended to by a much smaller pool of humans.
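A minimal sketch of that split, assuming a hypothetical triage policy where the model auto-resolves only high-confidence cases and everything ambiguous, plus any appeal, lands in the much smaller human queue. Every name and threshold here is invented for illustration:

```python
# Hypothetical triage sketch: the classifier handles clear-cut cases,
# humans handle the ambiguous middle and any appeals. Thresholds and
# field names are made up for illustration.
from dataclasses import dataclass

@dataclass
class Report:
    content_id: str
    model_score: float       # classifier's P(violates TOS), from any model
    user_appealed: bool = False

def route(report: Report) -> str:
    if report.user_appealed:
        return "human_review"      # appeals always reach a person
    if report.model_score >= 0.98:
        return "auto_remove"       # clear violation
    if report.model_score <= 0.02:
        return "auto_dismiss"      # clearly fine
    return "human_review"          # ambiguous middle goes to the queue

print(route(Report("c1", 0.995)))        # auto_remove
print(route(Report("c2", 0.40)))         # human_review
print(route(Report("c3", 0.01, True)))   # human_review (appeal)
```

The point of the design is that the human pool only scales with the volume of ambiguous cases and appeals, not with total content volume.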