I have found that git.exe outperforms any other codebase representation with GPT5.x once you figure out how to not mangle the arguments. Commands like grep and log can replace a lot of other tools if you can use them reliably.
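For illustration, here is a minimal sketch of one way to keep arguments from getting mangled: pass them to git as a list and never through a shell string. The names `ALLOWED`, `build_git_argv` and `run_git` are hypothetical, and the read-only allowlist is an assumption, not something described above.

```python
import subprocess
from typing import List

# Hypothetical allowlist: only read-only subcommands are exposed to the agent.
ALLOWED = {"grep", "log", "show", "diff", "blame"}


def build_git_argv(subcommand: str, args: List[str], repo: str = ".") -> List[str]:
    """Build an argv list; each argument stays one element, so no shell quoting is involved."""
    if subcommand not in ALLOWED:
        raise ValueError(f"subcommand not allowed: {subcommand}")
    # -C selects the repository directory, --no-pager keeps output non-interactive.
    return ["git", "-C", repo, "--no-pager", subcommand, *args]


def run_git(subcommand: str, args: List[str], repo: str = ".", timeout: int = 30) -> str:
    """Run git without a shell and return text that can be handed back to the model."""
    argv = build_git_argv(subcommand, args, repo)
    result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    if result.returncode != 0:
        # Note: git grep exits with 1 when nothing matches; that is reported as-is here.
        return f"exit {result.returncode}\n{result.stderr}"
    return result.stdout
```

Something like `run_git("grep", ["-n", "apply_patch"])` or `run_git("log", ["--oneline", "-n", "20"])` would then cover a lot of code search and history lookup.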
I saw a major uplift in performance after I combined tools like apply_patch with check_compilation & run_unit_tests. I still call the tool "apply_patch", but it now returns additional information about the build & tests if the patch succeeds. The agent went from ~80% success rate to what seems to be deterministic (so far). I don't bother to describe the compilation and unit testing processes in my prompts anymore. All I need to do is return the results of these things after something triggers them to run as a dependency.
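A rough sketch of that combined tool, assuming the patch, build and test steps are supplied as callables. `StepResult` and `apply_patch_tool` are hypothetical names, and the real harness described above may look quite different.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class StepResult:
    ok: bool
    output: str


def apply_patch_tool(
    patch: str,
    apply: Callable[[str], StepResult],
    build: Callable[[], StepResult],
    test: Callable[[], StepResult],
) -> Dict[str, dict]:
    """Apply a patch, then run build and tests as dependent steps.

    The agent only ever calls this one tool; build and test results come
    back as part of the same response whenever the earlier step succeeded.
    """
    applied = apply(patch)
    report = {"patch": {"ok": applied.ok, "output": applied.output}}
    if not applied.ok:
        return report  # a failed patch is not built

    built = build()
    report["build"] = {"ok": built.ok, "output": built.output}
    if not built.ok:
        return report  # a broken build is not tested

    tested = test()
    report["tests"] = {"ok": tested.ok, "output": tested.output}
    return report
```

Because the follow-up steps run as a dependency of the patch, the prompt does not need to explain how to compile or test anything; the model just sees the results.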
I feel like I'm falling out of whatever is popular these days. I've been using prepaid tokens and custom harnesses for a long time now. It just seems to work. I can ignore most of the news. Copilot & friends are currently dead to me for the problems I've expressly targeted. For some codebases it's not even in the same room of performance anymore, despite using the exact same GPT5.4 base model.
I like this - I think you're not too far off of what's popular these days though. I think similar functionality can be achieved by using the "hook" functionality in claude code / codex.
I have 3 accounts in teams. Two are in client tenants and one is my personal. I haven't had any issues because I can't work on more than one customer at a time. I still get notifications from the other clients when I'm not actively staring at things. I could leave teams off all day and still be fine. The business requirements don't change that quickly in my part of the world. If you are doing 1099 for multiple customers, you should not let them dance you around with instant messages in real time. That will wear you out very quickly.
Because it then restarts the whole app, so you have to wait again for ages so it does whatever it is that it does.
And since you can actually have a chat window open from a different tenant, there’s no reason why they should have you do this ridiculous dance.
If Teams wasn’t so god-awful slow, with an interface that jumps around all the time, people would probably complain much less about having to select something in a dropdown.
I had the choice of using discord or teams yesterday to review something and we both picked teams.
This is just like the hate for paid databases, operating systems and big clouds. Easy targets that seem politically convenient to attack on statistical grounds ("I think most people here might agree with me"). It's ultimately childish behavior. Adults explore nuance and find compromise between competing ideas. I find myself constantly defending the proverbial empire around here because of the intense tribalism. If we were focused more on the customer and doing a good job, half of this nonsense would disappear overnight.
Microsoft makes some of the best software on earth. Teams is certainly not an example of that (yet), but it's also not the worst thing they've done. Not even close.
> Small size: Runs in a web browser or on a laptop – 1.5B parameters total and 50M active parameters.
This looks to be a big deal for my use case.
I've already got rule-based PII redaction in place, but there are still some leaky or ambiguous edge cases that crop up from time to time. Putting another layer like this in the way (before we send things to the big model) is likely to dramatically improve my ability to sell these tools to management. The block diagram does look much more reassuring once we introduce this step.
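As a sketch of that layering, assuming two illustrative regex rules and a pluggable second pass: the `second_pass` callable stands in for the small model, whose real API is not shown here, and the rules are examples only, not a complete PII rule set.

```python
import re
from typing import Callable, Optional

# Illustrative rules only; a real deployment would have many more.
RULES = [
    ("EMAIL", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
]


def redact_with_rules(text: str) -> str:
    """First layer: deterministic, rule-based replacement."""
    for label, pattern in RULES:
        text = pattern.sub(f"[{label}]", text)
    return text


def redact(text: str, second_pass: Optional[Callable[[str], str]] = None) -> str:
    """Run the rules, then an optional second layer, before text leaves for the big model."""
    text = redact_with_rules(text)
    if second_pass is not None:
        # Hypothetical hook: a small local model that catches what the rules miss.
        text = second_pass(text)
    return text
```

The rules stay first because they are cheap and predictable; the second layer only has to deal with whatever leaks through.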
I exclusively use prepaid OAI tokens when doing copilot work in visual studio. It's really easy to set up a "custom" model. The consistency is hard to beat and I can use the latest model on day one. I also get to see how the magic happens in my provider logs. Every token accounted for.
Mine is a ~10-person bank consultancy without time or energy to deal with elite neck beard problems. Windows server, mssql and .NET are a great combo.
I wish we could separate the paid/oss aspects from the technical ones because Microsoft absolutely runs circles around every other stack when it comes to serious business software solutions, especially in resource constrained teams. I agree that oss and free software is conceptually ideal, but I also see why you might want to try different models.
Much of the Microsoft hate seems to come back to this notion that paid, COTS software is inherently evil or bad. Also, windows 11 is genuinely bad, but at least it boots up without weird issues that take an entire afternoon to resolve. I've never had a Linux experience that didn't kick me in the balls in some way. Not even the Steam Deck was smooth.
I happily throw my wallet at Microsoft if they solve my problem. Adobe, IBM, Oracle, The Empire, etc. Doesn't matter anymore. If it provides value to me and my clients, I'm going to use it or advocate for it. Spending money on good tools is not a bad thing. This world is about to get way more competitive than many of us would like for it to be. This level of petty tooling tribalism is going to become absolutely lethal.
I have not used Windows servers for a very long time, but when I still worked at a company that used them, the problem was not that we had to pay for them.
The problem was that the cost was not fixed and predictable, because every now and then we wanted to extend our activities, and that was conditioned by buying extra Microsoft licenses, for additional users, additional CPU cores or sockets, additional services, and so on.
This was extremely annoying in comparison with using a FreeBSD or Linux server, where the operating costs were the same regardless of how we decided to use it.
I agree that in a less dynamic environment, where the requirements for the server are stable and unlikely to ever be changed, using a Windows server may be OK.
However in any organization where this is not true, I believe that using any Windows server is a loser strategy, due to the financial friction that it causes against any improvements in the IT environment.
I feel like this is a very common attitude amongst people who actually have delivered software as a day job for a few years. The raging sports-fan-esque Linux vs Windows fanboy battles are mostly fought by unemployed kids who still have time to customize their desktops.
It's a bit of a trope but PC gamers do seem more serious in general. I observe some pretty stark differences between console and PC players in Battlefield 6. It tells you in the scoreboard what platform each player is on, so it's really easy to start seeing the patterns.
The number one thing I've noticed is that the PC players will almost always try to wait for passengers to hop in vehicles they grab. They also tend to be orders of magnitude more capable with said vehicles in terms of coordinating with engineers for repairs and focusing on objectives.
I do think the control scheme is too constraining for Xbox and PlayStation users. There's just too much going on. Different modalities don't map so well. I wouldn't be able to think straight if I had to use my Xbox controller to do a CQB shotgun battle and then transition in and out of vehicular combat roles rapidly. It feels to me like trying to run a WoW raid with 4 hotkeys. The auto-aim handicap sounds like a nice bonus but it's just not worth the frustration everywhere else.
I mostly agree, but I don't think it's the controls as much as age and target audience. If you spend as much time with a controller, you can get as good as with a keyboard, at least when it comes to game awareness and sense. Consoles have historically been cheaper up front and more "casual", although this has changed somewhat. That's also why most simulation games are played almost exclusively on PC. What you saw in Battlefield is even more apparent in sim shooters like Arma, which added cross-play not long ago.