Yes, chess has been dealing with AI for decades at this point, and it's amusing/frustrating that so many other communities are deciding to re-discover everything from scratch, rather than just learn from the chess experience.
If CTF is a player-vs-player event, then AI should just be banned outright, otherwise it will devolve into AI-vs-AI, which is just not an interesting competition format, as we learned in chess. Compared to FIDE top events (which ban AI), only a tiny niche audience actually watches the Top Chess Engine Championship (AI-centered). It turns out what we care about is not whether chess can be solved by any means available, but what the limits of the human mind are in learning chess.
Pretty much all chess coaches/educators also warn against relying heavily on AI during learning; engines only give you an illusion of understanding.
It's hard in general, but for instruct/chat models in particular, which already assume a turn-based approach, could they not use a special token that switches control from LLM output to user input? The LLM architecture could be made so it's literally impossible for the model to even produce this token. In the example above, the LLM could then recognize this is not a legitimate user input, as it lacks the token. I'm probably overlooking something obvious.
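The idea above can be sketched in a few lines. This is a minimal illustration, not any particular model's implementation: assume a hypothetical reserved token id (`CONTROL_TOKEN_ID` here is made up) that the tokenizer never produces from user text, and mask its logit to negative infinity before sampling, so the model's output distribution assigns it probability zero. Only the serving layer can ever insert it when handing the turn back to the real user.

```python
import math

# Hypothetical reserved <|user|> control token. The id is an assumption
# for illustration; the key property is that the tokenizer never maps
# user-supplied text to it, and the sampler masks it out below.
CONTROL_TOKEN_ID = 50000

def mask_control_tokens(logits, reserved_ids=frozenset({CONTROL_TOKEN_ID})):
    """Return a copy of the logits with reserved control tokens forced
    to -inf, making their sampling probability exactly 0."""
    return [-math.inf if i in reserved_ids else v
            for i, v in enumerate(logits)]

# Toy vocabulary: even if the raw logit for the control token were the
# highest of all, after masking it can never be chosen.
logits = [0.0] * 50001
logits[CONTROL_TOKEN_ID] = 99.0
masked = mask_control_tokens(logits)
best = max(range(len(masked)), key=lambda i: masked[i])
assert best != CONTROL_TOKEN_ID
```

With this in place, any "user turn" marker appearing in model output must have been generated as ordinary text, not as the reserved token, which is exactly the distinction the comment proposes exploiting.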
Yes, and as you'd expect, this is how LLMs work today, in general, for control codes. But different LLMs use different control codes for different purposes, such as separating the system prompt from the user prompt.
But even if you tag inputs however you like, you can't force an LLM to never treat input of type A as input of type B; all you can do is weight against it! LLMs have no rules, only weights. Pre- and post-filters can try to help, but they can't directly control the LLM's text generation; they can only analyze and modify inputs/outputs using their own heuristics.
TLDR: The user got his filesystem corrupted on a forced reboot; native btrfs tools made the failure worse; the user asked Claude to autonomously debug and fix the problem; after multiple days of debugging, Claude wrote a set of custom low-level C scripts to recover 99.9% of the data; the user was impressed and asked Claude to submit an issue describing the whole thing.
Not sure how I feel about this. If Take Take Take becomes very large, and most current Lichess users eventually switch to using that platform's mobile apps and website (say, because it has a few nicer features, and it's perceived as open source anyways), couldn't TTT suddenly decide to cut ties and trap the community in yet another walled garden? By that time, there may not be a meaningful Lichess community left for users to switch back to.
The author of Claude Code himself mentioned this in a recent interview. If I recall correctly, he said that the best programmers he knows have an understanding of the "layer below the layer", which I think is a good way of putting it. You're a better C programmer if you understand assembly, and you're a better "vibe coder" if you can actually understand the LLM-generated code.
This part is unclear; what exactly did you change? Are you saying that the LP relaxation has value 271.666, but, when you enforce integrality, Gurobi can actually find and prove optimality of a solution with value 218?
Were you really just solving LPs up to this point in the article? How can these intermediate LPs be so slow to solve (6+ years) and yet Gurobi is able to solve the integer-restricted problem?
I've always been solving the integer problem, of course. But throughout the article I improve the model formulation again and again through insights, which makes the LP relaxation tighter. Initially it gave 305.0 as an upper bound, but after tightening the model (adding constraints that cut off that 305 solution and others) it gives 271.666...
- which leads to insanely faster search. It's like brute-forcing through all passwords of length 20 and a wizard telling you that you're wrong when you reach character 7 instead of 13.
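The tightening described above can be shown on a toy model (this is my own two-variable example, not the article's model). Maximize x + y subject to 2x + 2y <= 3 with x, y binary: the LP relaxation allows the fractional point (1, 1/2) and reports a bound of 1.5, but since 2(x + y) <= 3 forces x + y <= 1 for any integer point, the cut x + y <= 1 is valid; adding it drops the relaxation bound to the true integer optimum of 1. The tiny exact LP solver below just enumerates vertices, which works for 2-D problems:

```python
from fractions import Fraction as F
from itertools import combinations

def lp_max(constraints, obj):
    """Maximize obj . x over {x : a . x <= b for each (a, b)} by
    enumerating vertices (intersections of pairs of constraint
    boundaries). Exact for 2-D LPs; uses Fractions to avoid rounding."""
    best = None
    for (a1, b1), (a2, b2) in combinations(constraints, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if det == 0:
            continue  # parallel boundaries: no unique intersection
        x = F(b1 * a2[1] - b2 * a1[1], det)  # Cramer's rule
        y = F(a1[0] * b2 - a2[0] * b1, det)
        if all(a[0] * x + a[1] * y <= b for a, b in constraints):
            val = obj[0] * x + obj[1] * y
            if best is None or val > best:
                best = val
    return best

# Toy model: maximize x + y  s.t.  2x + 2y <= 3,  0 <= x, y <= 1.
base = [((-1, 0), 0), ((1, 0), 1), ((0, -1), 0), ((0, 1), 1), ((2, 2), 3)]
obj = (1, 1)

print(lp_max(base, obj))                  # relaxation bound: 3/2
# The cut x + y <= 1 removes no *integer* feasible point, only
# fractional ones, so the relaxation gets tighter:
print(lp_max(base + [((1, 1), 1)], obj))  # tightened bound: 1
```

Same mechanism as in the article, just in miniature: the integer optimum never changes, but the relaxation's upper bound moves from 1.5 down to 1, so the solver can prune far earlier.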
In addition to the value of the best integer solution found so far, Gurobi also provides a bound on the value of the best possible solution, computed using the linear relaxation of the problem, cutting planes and other techniques. So, assuming there are no bugs in the solver, this is truly the optimal solution.
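The interplay between those two numbers can be sketched with a toy branch-and-bound on a 0/1 knapsack (my own illustration of the principle, not Gurobi's actual algorithm, which layers cutting planes, heuristics, and presolve on top): the solver tracks the best integer solution found (the incumbent) and an upper bound from the LP relaxation of each subtree; subtrees whose bound can't beat the incumbent are skipped, and when the search is exhausted, the incumbent is provably optimal.

```python
values, weights, cap = [60, 100, 120], [10, 20, 30], 50

def relaxation_bound(fixed):
    """Upper bound for a subtree: fractional (LP-relaxed) knapsack over
    the not-yet-fixed items, given the 0/1 decisions already made."""
    bound, room = 0.0, cap
    for i, take in fixed.items():
        if take:
            bound += values[i]
            room -= weights[i]
    if room < 0:
        return float("-inf")  # this branch is already infeasible
    free = [i for i in range(len(values)) if i not in fixed]
    for i in sorted(free, key=lambda i: values[i] / weights[i], reverse=True):
        frac = min(1.0, room / weights[i])  # relaxation may take a fraction
        bound += frac * values[i]
        room -= frac * weights[i]
        if room <= 0:
            break
    return bound

def branch_and_bound():
    incumbent = 0  # value of the best integer solution found so far
    stack = [{}]   # partial assignments: item index -> 0/1
    while stack:
        fixed = stack.pop()
        if len(fixed) == len(values):  # leaf: a complete 0/1 solution
            if sum(weights[i] for i, t in fixed.items() if t) <= cap:
                incumbent = max(incumbent,
                                sum(values[i] for i, t in fixed.items() if t))
            continue
        # Prune: if the relaxation bound says nothing in this subtree can
        # beat the incumbent, the whole subtree is skipped unexplored.
        if relaxation_bound(fixed) <= incumbent:
            continue
        i = len(fixed)  # branch on the next item: fix it to 0 or to 1
        stack.append({**fixed, i: 0})
        stack.append({**fixed, i: 1})
    return incumbent

print(branch_and_bound())  # 220: optimal, and *proven* so by the pruning
```

"Solved to optimality" means exactly that the gap between these two numbers closed to zero, which is why (modulo solver bugs) no better solution can exist.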
Unless I missed something, though, the highest bound the author reported for the relaxation was 271 2/3 moves, which is obviously significantly higher than 218...
I think that was an intermediate model. The author updated it, then Gurobi solved the new model to optimality (i.e., the bound became equal to the value of the best solution found).
> With this improved model, I tried again and after ~23 000 seconds, Gurobi solved it to optimality!
> Gurobi solved the new model to optimality (i.e., the bound became equal to the value of the best solution found).
Ah, I was not aware that that's what this language indicated. Thanks for helping me understand more!
I've used Gurobi (and other solvers) in the past, but always in situations where we just needed to find a solution that was way better than what we were going to find with, say, a heuristic... I've never needed to find a provably optimal solution...