Lead engineer says something is not workable? PM overrides, saying Claude Code could do it. Problems are found months later at launch, and now the engineers are on the hook.
New junior onboardee declares that their new vision is the best and gets management on board because it's trendy -> broken app.
It’s made collaboration nearly unbearable as you are beholden to the person with the lowest standards.
I hate how correct you are.
Working at a company with only two engineers and a few sales and marketing people, I get a constant stream of "hey, I made that feature with Claude, when can we ship it for the customer? I showed them and they really like it", only to look at the code and find out that it doesn't adhere to any of our standards and isn't good quality either. But if you say that, the answer is "yeah, but everyone is AI-shipping now and we can't be the ones not doing it, or we'll lose customers..." Sure, but now we're losing maintainability and understanding of our codebase, and making ourselves dependent on LLM providers who get more expensive every week.
Currently, it is difficult to live update the model’s parameters in response to new information. This difficulty applies at both an infrastructural level and an optimization level.
We simply don't know how to reliably incorporate new information without losing old capabilities. Humans handle this through extensive evaluation, heuristics, and experience.
What we do know is that models can adapt to their context, and extending the context window is an infrastructure and capex problem first. A billion useful tokens would obviate the need for any out of band memory structures.
I definitely see why effort is being put into this. But it seems inherently limiting. It's like having someone sit down in a library each day with a notebook containing all their prior work, none of which they can actually remember. At the end of the day, they write out their notes, then go home and get their memory wiped for the next day. Making that notebook longer is an obvious way to improve the system, but it seems like it's going to bump into fundamental limits.
The latent space an LLM traverses when it chooses each token spans tens or even hundreds of gigabytes per word. It's not really useful to look at LLMs from the perspective of the prediction head, which is a very small part of the model.
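A rough back-of-envelope supports the scale claimed above, assuming the figure refers to the weights a dense forward pass reads to produce one token. The parameter counts below are illustrative assumptions, not figures from the thread.

```python
def weights_touched_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """GB of weights a dense forward pass reads to choose one token (fp16)."""
    return params_billions * 1e9 * bytes_per_param / 1e9

# A dense 70B-parameter model in fp16 reads ~140 GB of weights per token;
# a dense 405B model reads ~810 GB.
print(weights_touched_gb(70))   # 140.0
print(weights_touched_gb(405))  # 810.0
```

Mixture-of-experts models read only a fraction of their weights per token, so the number would be correspondingly smaller for those architectures.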
Agreed there is significant information in the latent space, but what is missing is a fully resolved "thought" based on that information plus current context plus validation against an internal working model of the world.
Except that a latent space does not change in response to new information, something that thoughts famously do. If you read a book that captures the author's thoughts, disagree, and write an eloquent argument to the author, you might change the author's mind. But you will not change the "book's thoughts" on the subject.
Latent spaces are maps of thoughts other people have had, not the thoughts themselves.
This gets a bit tricky. Over very long task contexts (1M tokens) or with prompt compression (tens of millions of tokens), the model can alter its priors based on updated evidence. This form of knowledge-based learning is not necessarily robust, but it demonstrably does occur.
Having just been on the job market, my experience was that career pivots are much harder now. I initially intended to transition to a neighboring field after an education break, but none of the companies in that field would speak to me. At most I got a recruiter call with one, which ended in rejection. To complete the transition I had been planning on taking a massive pay cut.
When I focused on areas I had some more credible experience in, I got significantly better engagement and eventually found a very narrow niche where I had substantial success.
I think we're partially adjusting to a world where employers expect a very narrow experience match to their role. Employers are also paying a premium within that narrow match.
I spend 400-500 dollars per day during active development at this point. However with more aggressive task breakdowns I can spend ~5k per day.
These spend rates are partly due to operating on a larger codebase, which means more time searching and understanding the code, tests, and test output, and partly due to going all-in on agentic coding.
It can feel painfully slow to go back to coding by hand when for a dollar you can build the same functionality in a minute. Now do this with multiple sessions and you can see where the cost goes.
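A quick sketch of how parallel sessions compound that "dollar a minute" figure into the daily spend mentioned above. The session count and active hours are illustrative assumptions, not billing data.

```python
sessions = 10          # concurrent agent sessions (assumed)
cost_per_minute = 1.0  # dollars per session-minute, per the "for a dollar" figure
hours_active = 8       # hours of active development (assumed)

# Total daily spend across all sessions
daily_spend = sessions * cost_per_minute * hours_active * 60
print(f"${daily_spend:,.0f}/day")  # $4,800/day, in the ballpark of the ~5k figure
```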
The problem with HN is that everyone here thinks like an engineer, not like a business owner.
$10k a month on tokens is just not that much when you're already making $2M per engineer. If their productivity has increased even 10% then the spend was well worth it.
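The arithmetic behind that claim, using only the figures from the comment, works out to a clear net gain per engineer:

```python
revenue_per_engineer = 2_000_000  # $/year, per the comment
token_spend = 10_000 * 12         # $/year at $10k per month
productivity_gain = 0.10          # the 10% figure from the comment

# Value added vs. cost of tokens, per engineer per year
added_value = revenue_per_engineer * productivity_gain  # $200,000
net = added_value - token_spend
print(net)  # 80000.0 -> net $80k per engineer per year
```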
Case in point, Meta made 33% more revenue this earnings report. Now you can nitpick and ask for attribution down to the dollar, but macro trends speak for themselves.
Go look up a multi-year chart of their revenue and find the inflection point where AI made it go up faster (there isn't one). In fact, revenue growth used to be higher pre-2023.
They were also a lot smaller pre-2023; 33% growth for a company of their size is simply insane. It is entirely likely that the 33% simply wouldn't have happened without AI.
I had Claude dig through the orders to pick out substantive examples where the courts ultimately allowed the order to go through. USAID's dismantlement is the top instance with an affirmative legal ruling. However, unilateral grant freezes, tariffs, and other issues are still progressing through the courts.
While I'd certainly like to return to the world where congress handled the democratic duties of law making, budget, and war declarations. I must acknowledge that we no longer appear to have separation of powers.
This sounds like an issue where the hyperscalers are acknowledging that the new foundation model firms may in fact be worth more than they are. Anthropic looks increasingly likely to exceed AWS revenue next year, and OpenAI will likely do the same with Azure.
Three years ago a foundation model seemed like a feature of a hyperscaler; now hyperscalers look like part of the supply chain.
I think both got taken by surprise. Last year the talk was that AI was a bubble, demand was soft, pilot projects were failing, etc. Model providers still believed, but thought they had a long ramp-up period to build out their own datacenters. Then in late autumn/winter, something happened: model capability reached a threshold and demand exploded, then just kept exploding. Model firms are scrambling to find any compute capacity they can, which means striking whatever deals they can with hyperscalers. So the question is whether model providers can get enough compute without having to effectively sell themselves to the hyperscalers.
Meta has no real product moats at this point. Yes, many of us still use Instagram and Facebook on occasion, but I'm not giving it any data beyond "what short-form slop content will I accept".
The days of meta having network effects to defend its position are long gone, and I suspect we'll see the products die when an AI-first UX comes to mobile.
Are they really? Perhaps it is just my social circle, but I've observed most of the group traffic moving to SMS and WhatsApp. While WhatsApp is a Meta product, the economics are extremely different.
There are really just a handful of people from my extended social circles still actively posting to FB/Instagram and that traffic is mostly drowned out by slop content.
Out of curiosity, why do you not refill tokens in this case? When I'm actively working on a project I'm prone to spending a few hundred dollars per day, or a few thousand during the initial buildout of a new module, etc.