Gabriel_h's comments

Meticulous AI | Engineers (SWE, Platform, Forward Deployed), Founding AE | SF, London | On-site | Full-time

Meticulous is building the world's first fully autonomous and exhaustive testing platform for frontend codebases. Our system replays real user sessions against every pull request to catch regressions before they ship. Customers include Dropbox, Notion, Wiz, Wealthsimple, and LaunchDarkly. We're a small, high-talent team (ex-Google, GitHub, Meta, Palantir) with exceptional product-market fit and rapidly growing revenue.

Engineering

Forward Deployed Engineer (Senior) — SF / London — On-site
$200k–$400k base / £150k–£300k base + significant equity
Customer-facing + deep systems work; own deployments end-to-end

Platform Engineer (First hire) — London — On-site
£150k–£300k base + significant equity
Infra, reliability, scaling replay of thousands of sessions per PR

Sales

Founding Account Executive — San Francisco — On-site
OTE $280k–$400k + equity, uncapped commission
Inbound-heavy, mid-market to enterprise (ACVs $100k–$1m+)

Tech highlights: TypeScript, browser internals, distributed systems, cloud infra, deterministic execution

See all jobs and apply: https://jobs.ashbyhq.com/meticulous


Interesting, AI needs much better guardrails and monitoring!


You may find this interesting too: https://meticulous.ai/blog/let-users-write-tests-for-you/, regarding e2e testing and automatically generating mocks for UI tests.

Maintaining mocks in e2e tests often becomes infeasible once you have a meaningful number of them.


Hey HN, I'm Gabriel, founder of Meticulous (YC S21). Our mission is to make the world's code safe, performant and reliable.

We're starting with a tool that catches JavaScript regressions in web applications with zero effort from developers.

How it works: Add a single line of JavaScript to your site, and we record thousands of real user sessions. We then replay these sessions against new code to automatically catch bugs before they hit production. You can watch a 60-second demo at https://www.meticulous.ai.
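The record-then-stub idea behind this can be sketched in a few lines. To be clear, this is my own toy illustration, not Meticulous's implementation, and the names (NetworkRecorder, etc.) are hypothetical:

```typescript
// Illustrative sketch only: capture responses keyed by the request,
// then serve the recording back at replay time instead of hitting a
// real backend.

type RequestKey = string;

class NetworkRecorder {
  private recordings = new Map<RequestKey, string>();

  // Record mode: perform the real call and remember its response body.
  async record(key: RequestKey, doFetch: () => Promise<string>): Promise<string> {
    const body = await doFetch();
    this.recordings.set(key, body);
    return body;
  }

  // Replay mode: answer purely from the recording; no backend needed.
  replay(key: RequestKey): string {
    const body = this.recordings.get(key);
    if (body === undefined) throw new Error(`no recording for ${key}`);
    return body;
  }
}
```

The interesting part in practice is choosing the key (method + URL + body?) and deciding when two requests are "the same", which is where most of the engineering lives.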

Catching JavaScript regressions is just the start. We are a London-based YC company. Our team previously worked at Dropbox, Opendoor and Google. We raised $4m and are backed by some of the best founders and technical leaders in Silicon Valley, including Guillermo Rauch (founder of Vercel, author of Next.js), Jason Warner (CTO of GitHub), Scott Belsky (CPO of Adobe), Calvin French-Owen (founder of Segment), Jared Friedman (YC partner and former CTO of Scribd) and a bunch of other incredible folks.

You can read more at https://www.ycombinator.com/companies/meticulous/jobs/AkHpFa...

If interested, please feel free to reach out to me at gabe at meticulous dot ai with a few lines on what interests you about Meticulous, and mention HN in the subject line.


It really depends on your use-case here.

If you’re able to spin up an environment via docker-compose and run Playwright against it, then that works well for that use-case.

However, if you’re testing a flow that relies on some initial state, it can be tricky to seed that state, or to seed it in a way that is representative.


It does not include SSE or WebSockets.

With regards to updating existing records, unfortunately we don't currently have good tooling & support for this, so you may need to record new sets of sessions as your application changes. I would suggest starting off with testing a few core flows.


Thanks for clarifying. The value proposition of automatic request/response recording brings something genuinely new to the area, but without WS support, not all projects can benefit fully.

This reminds me of the test suite for one of my projects where E2E tests only covered relatively simple scenarios because of subpar WS mocking support, leaving more important, complex interactions to manual (can't run in CI) or fully integrated (expensive to author and run often) testing. The situation changed only after we wrote a custom WS mocking layer over the HTTP mocking the framework provides, yielding a dramatic increase in coverage. Out of dozens of developers I interviewed, only a few solved this issue to some degree. Clearly, mainstream testing frameworks provide insufficient support for the use case.
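A minimal version of such a WS mocking layer might look like the sketch below. The class and method names are hypothetical, not from any real framework or from the layer described above:

```typescript
// Hypothetical sketch of a WebSocket mocking layer for tests: the fake
// socket records what the app sends and lets the test script
// server-initiated pushes, so no real server is needed.

type MessageListener = (data: string) => void;

class MockWebSocket {
  public sent: string[] = [];            // everything the app sent
  private listeners: MessageListener[] = [];

  send(data: string): void {
    this.sent.push(data);                // tests can assert on this
  }

  onMessage(listener: MessageListener): void {
    this.listeners.push(listener);
  }

  // Test-only hook: simulate a push from the "server".
  serverPush(data: string): void {
    for (const listener of this.listeners) listener(data);
  }
}
```

The crucial test-side affordance is `serverPush`: unlike HTTP mocks, the test has to be able to drive unsolicited messages, not just answer requests.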


I agree that WS replay could be very powerful, but I'm not sure it's straightforward. Once you get out of the realm of request/response and you are dealing with subscriptions, connection multiplexing, or frankly any other sort of pushed data that is triggered by whatever is on the other end, knowing how and when to play that back against new sessions is very hard. It seems very application specific on face value.

But I agree that it would enable replay for a lot of very interesting, complex projects. I've worked on some FX trading UIs that have been challenging to test without standing up a lot of backend services.


One key difference is that with Playwright you have to replay against some environment.

Meticulous captures network traffic at record-time and stubs in responses at replay-time, which removes the need for a backend environment to replay against. We also apply a little fuzziness in our replay, such as in how we pick and choose selectors (e.g. imagine CSS selectors with hashes in them; the 'same' selector will look very different between two builds).
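One flavour of that fuzziness can be sketched as a selector normaliser. This is my own illustration of the idea, not Meticulous's actual matching logic:

```typescript
// Illustrative only: strip build-specific hash suffixes (e.g. from
// CSS modules, ".button_a1b2c3") so the "same" selector compares
// equal across two different builds.

function normalizeSelector(selector: string): string {
  // Drop a trailing _hash or -hash of 5+ hex chars after a class name.
  return selector.replace(/([._-][a-zA-Z]+)[_-][a-f0-9]{5,}/g, "$1");
}
```

A real implementation would need to be far more careful (hash alphabets vary by bundler, and legitimate class names can look hash-like), which is part of why this is hard to make robust.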

We have a long way to go in making this robust though.

Is there anything you wish was easier when writing tests with Playwright?


> Meticulous captures network traffic at record-time and stubs in responses at replay-time, which removes the need for a backend environment to replay against.

That's neat.

We've only just started using Playwright, but we were surprised at how easy it was to get going (then again, we're all developers). We primarily use it to test large feature flows that are hard to mock in unit tests. For example, one test logs you in, uploads a file, waits for the result, clicks the download button and makes sure the downloaded file is what we expected. We mainly want to ensure that we don't accidentally delete or hide the login widget or the download button while working on something tangentially related.

In the example outlined above, we don't mind spinning up the backend locally, as this also allows the test to verify that the response is correct.

However, I see how being able to only test front-end code quickly and easily without the need for a backend can be helpful in many applications. Congrats on the launch, and good luck!


We are taking every scrap of luck we can get our hands on - thank you!

> Testim/QA Capture

Self-improving tests are a really interesting area. Timing is definitely an intricate issue. You probably have to layer a bunch of different and novel techniques on top of each other to get something with a good signal-to-noise ratio. We're still working on building those out :)

Oooh, thank you for the rec. I'll make sure to ping Oren after the launch. The space is enormous and my understanding is that the rate of growth for testing tooling will exceed the rate of growth for software, which leaves QA and testing companies in a good position.


Ouch, that does sound like a painful cycle. It happens, and no one tests as much as they'd like to.

> Portability of the data.

I hear your concern here. The replay data and session data are also saved to disk, so you can store them somewhere. Of course, that still leaves the risk around the record & replay tech. I think open sourcing it would solve the portability issue, and it's something we're actively talking about but haven't reached a conclusion on yet. Anecdotes and examples like this are incredibly helpful in making that decision.

Thank you for the feedback!


I worked at a couple of dev tool startups that required you to invest some effort to adopt and would require effort to move away from (e.g. if the company went under) - open sourcing ended up being basically a necessity to sell the product (enterprise sales during seed and A). YMMV, but I would definitely encourage open sourcing enough to make potential adopters feel comfortable that they won't have to do a big, urgent migration and/or lose a bunch of engineering investment if you go under or decide to pivot.


Thank you for the wishes here! That is very kind of you.

> Open sourcing

OpenReplay is awesome, but we ended up building heavily on top of rrweb (https://github.com/rrweb-io/rrweb). Did you know they have their own documentary about the project? I only noticed that today.


Both rrweb and OpenReplay are very solid projects. I've spent the last year or so building a session recording tool, and I've scrutinised their code quite heavily. Conceptually, session recording is quite simple, but there are so many edge cases, security models (CSP, feature policies) and performance issues to overcome.

Performance is probably the area I've spent the most time thinking about: if you want to measure performance regressions in a page, instrumenting it with a session recorder is definitely a way to skew the results (for example, checking the scroll position of elements during snapshotting will trigger a reflow).
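The usual mitigation is to batch layout reads so the whole snapshot triggers at most one reflow. Here is a framework-free sketch of that idea; the names are mine, not from any recorder's source:

```typescript
// Illustrative sketch of read batching to avoid layout thrashing:
// queue all layout reads, then flush them together in one pass (in a
// browser, inside a single requestAnimationFrame), so the engine
// performs at most one layout for the whole batch instead of one per
// interleaved read/write.

type Read<T> = () => T;

class ReadBatcher {
  public flushes = 0;
  private reads: Array<() => void> = [];

  schedule<T>(read: Read<T>, onResult: (value: T) => void): void {
    this.reads.push(() => onResult(read()));
  }

  flush(): void {
    this.flushes++;
    const pending = this.reads;
    this.reads = [];
    for (const run of pending) run();
  }
}
```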

