Hacker News | gronky_'s comments

We started using it recently at my work. The code changes walkthrough is nice.


I think the same can be said about AI-assisted writing…

I like the ideas presented in the post but it’s too long and highly repetitive.

AI will happily expand a few information-dense bullet points into a lengthy essay. But the real work of a strong writer is distilling complex ideas into a few words.


I think these tests are useful as regression tests - unit tests can be really helpful when making changes down the line, tipping you off that you missed something. Also much easier to refactor when there’s good test coverage.

From the PR: unit tests: what are they good for?

Answer: Personal opinion - writing unit tests is not fun. It becomes even less appealing as your codebase grows and maintaining tests becomes a time-consuming chore.

However, the benefits of comprehensive unit tests are real:

Reliability: They create a more reliable codebase where developers can make changes confidently

Speed: Teams can move quickly without fear of breaking existing functionality

Safe Refactoring: Code improvements and restructuring become significantly safer when backed by thorough tests

Living Documentation: Tests serve as clear documentation of your code's behavior:

They show exactly what happens for each input

They present changes in a human-readable format: "for this input → expect this output"

They run quickly and are easy to execute

This immediate feedback loop is beneficial during development
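The "living documentation" point can be made concrete with a small sketch (the `slugify` helper and its cases are purely illustrative, not from the PR):

```python
# Hypothetical example: a table-driven test reads as documentation of
# behavior - "for this input -> expect this output" - without needing
# to read the implementation.
import unittest

def slugify(title):
    """Turn a title into a URL slug (illustrative helper)."""
    return "-".join(title.lower().split())

class TestSlugify(unittest.TestCase):
    def test_examples(self):
        cases = [
            ("Hello World", "hello-world"),          # basic case
            ("  Extra   Spaces ", "extra-spaces"),   # whitespace collapsed
            ("Already-Slugged", "already-slugged"),  # only lowercased
        ]
        for given, expected in cases:
            self.assertEqual(slugify(given), expected)

if __name__ == "__main__":
    unittest.main()
```

Each row of the table doubles as a human-readable spec, and the whole suite runs in milliseconds.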


> I think these tests are useful as regression tests - unit tests can be really helpful when making changes down the line, tipping you off that you missed something. Also much easier to refactor when there’s good test coverage.

The problem is that this assumes the tests, or the method itself, were written correctly in the first place. If the behavior of the method is wrong and the tests are validating that wrong behavior, then you pay an extra tax: first to fix the behavior of the method, then to fix the tests.

That's why automatically generating unit tests is, in my opinion, planting a bomb in your codebase. The only exception is stuff like basic parameter testing, but even that can be questionable at times (is null a valid input at any point, for example) unless you know the intent of the code, and AI can't really grasp the intent.
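A minimal sketch of the failure mode being described (the function and its bug are hypothetical):

```python
# Hypothetical: an off-by-one bug that a test generated from the
# current behavior would happily enshrine, because the generator
# infers "correctness" from the code rather than from the intent.
def days_in_trial(start_day, end_day):
    # Bug: the trial should include both endpoints (end - start + 1),
    # but the author forgot the +1.
    return end_day - start_day

# An auto-generated test locks the bug in as the expected behavior:
def test_days_in_trial():
    assert days_in_trial(1, 15) == 14  # "passes", but 15 days was intended

test_days_in_trial()
```

Now fixing the bug also means fixing the test, which will fail for the right reason but read as a regression.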


I tried generating the same test with all 5 models in Qodo Gen.

o1 is very slow - like, you can go get a coffee while it generates a single test (if it doesn’t time out in the middle).

o1-mini, though, worked really well. It generated a good test and wasn’t noticeably slower than the other models.

My feeling is that o1-mini will end up being more useful for coding than o1, except maybe for some specific instances where you need very deep analysis.


How well did it work for generating tests? I was looking for an AI test generation tool yesterday and came across this, but it wasn't clear how good it is.

(before I get a bunch of comments about not letting AI write tests, this is for a hobby side project that I have a few hours a week to work on. I'm looking into AI test generation because the alternative is no tests)


From my understanding faster-whisper optimizes the inference without changing the model itself. Here they seem to be changing the model architecture but not applying other optimizations.

50% on its own doesn’t make this the current best choice for production. But I imagine this could become the new base model that all of the inference optimizations are applied to.

Wonder if it’s plug and play or if faster-whisper and others would need to reimplement from scratch?


Is this even faster? https://github.com/Vaibhavs10/insanely-fast-whisper

If so, is the quality still acceptable?



I think you may have misunderstood the figures.

Based on my understanding, only 1:20 passed the automated acceptance criteria (build, run, pass, increase coverage). Of those that made it through to the human review, “over 50% of the diffs submitted were accepted by developers” according to the paper


FTA:

> In highly controlled cases, the ratio of generated tests to those that pass all of the steps is 1:4, and in real-world scenarios, Meta’s authors report a 1:20 ratio.

> Following the automated process, Meta had a human reviewer accept or reject tests. The authors reported an average acceptance ratio of 1:2, with a 73% acceptance rate in their best reported cases.

1:20 of real-world generated tests reach human review, or 5%. Of those, on average 1:2 are approved, or about 50%. 50% of 5% is 2.5%, or 1 in 40. Where do you see the error?
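The funnel arithmetic can be sanity-checked directly from the two reported ratios:

```python
# Sanity-check the funnel from the paper's reported real-world ratios.
reach_review = 1 / 20          # generated tests passing all automated steps: 5%
approved_of_reviewed = 1 / 2   # average human acceptance among reviewed: 50%

overall = reach_review * approved_of_reviewed
print(overall)       # 0.025 -> 2.5% of generated tests end up accepted
print(1 / overall)   # 40.0  -> i.e. 1 in 40
```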

edit: Okay, I think I see it, specifically in considering engineer time vs the ~2.5% overall hit rate vs. the ~50% rate for tests reaching human review and thus requiring effort. Fair callout, thanks!


Yes, that’s definitely the main reason. It’s called “burying the lede”.

Saving $6M is the key information that makes this story interesting. It’s buried all the way at the bottom of the first blog post and is completely missing from the second, which focuses specifically on the migration.


TaaS : title as a service


People have done this, e.g. https://www.reddit.com/r/GrowthHacking/comments/k20g42/ai_to...

However that appears to be defunct now


I'm usually guilty of this. The hands-on person involved in a highly technical project gets so excited and bogged down in the details that they end up not being the most compelling storyteller about it.


Don't blame yourself. Not everyone is here for the money, many of us are here for the tech.


HN with tags and filters would be great


Who are a few of your best Twitter follows for AI?


I think Twitter lists are a good starting point. There are plenty of ML/AI lists around. I start from there and whittle down to a good signal-to-noise ratio, meaning I avoid people who post overly frequent, fluffy hot takes. People who retweet good stuff are also worth a follow. Ultimately, what I want to get out of Twitter is tools, papers, and good blog links.


For specific research-oriented follows, I would suggest starting with Ilya Sutskever and seeing who he follows or retweets. And for practical stuff, start with Simon Willison.

