> When you receive a review with a hundred comments, it’s very hard to engage with that review on anything other than a trivial level.
Wow, that seems crazy. I can only hope I never have to work with somebody who thinks it is productive to leave that many comments on a change -- I genuinely cannot imagine any change that could ever require that.
I've had some PRs that would have required a ton of comments. There are usually two ways I handle this:
If the PR went in a completely different direction and missed the goal by a lot, I take ownership of it (with a brief explanation) and re-implement it. I then use the new PR for a pairing session, where I walk through both PRs with the original author for learning purposes.
If it’s mostly smaller issues, I schedule a half-hour pairing session with the author and review everything together, after preparing a list of issues.
Doing it any other way puts too much burden on the author to guess what the reviewer wants, and it slows down velocity significantly.
In those situations, the more productive option to course-correct is to talk to the change author in a meeting/chat instead of just volleying off a tsunami of comments about various minutiae in the change IMO.
A design philosophy called "Progressive Disclosure" tries to tackle this problem, where a tool is supposed to present a minimal set of functionality initially to allow a user to be productive without being an expert and progressively "reveal" more complex features as the user becomes more familiar with the tool and attempts to do more complex things.
I've heard the programming language Swift followed this philosophy during development, though I've never written any Swift code to know how well it worked out.
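As a rough sketch of the idea applied to an API (entirely hypothetical names, nothing to do with Swift itself): the minimal call works with no expertise, and the advanced settings exist, but only once you go looking for them.

```python
# Hypothetical API illustrating progressive disclosure: a newcomer can be
# productive with just fetch(url); the keyword-only knobs are there for
# experts but never clutter the simple case.
def fetch(url, *, retries=3, timeout=10.0, verify_tls=True):
    # Stub body for illustration; a real client would perform the request.
    return {"url": url, "retries": retries, "timeout": timeout, "verify_tls": verify_tls}

simple = fetch("https://example.com")                         # beginner surface
tuned = fetch("https://example.com", retries=0, timeout=1.0)  # revealed on demand
```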
For most self help / productivity books, I agree pretty strongly with this take. A lot of those types of books (Atomic Habits, Getting Things Done, Measure What Matters as some examples) often feel like they could have been reduced to a blog post, or a series of blog posts at most. They were good books, but I did feel like there was some padding to slog through in each chapter.
Usually, the point is communicated fairly quickly a few paragraphs into the chapter and then belaboured for pages with one or two anecdotes that don’t lend much more weight to the strength/validity of the point for me.
> While not traditional TDD or likely not a new concept, I’ve done something I’ve dubbed TLD (Test Led Development). Rather than writing a whole smattering of tests and then coding until they all pass, TLD focuses on ping-ponging between test and code.
TDD commonly gets mischaracterized as a two-step process of writing a suite of tests upfront, then writing some implementation to make them all pass. It's actually much closer to what is described as TLD in the post. You write a failing test, do the bare minimum to make it pass, refactor if appropriate and start the cycle again. It's a development process that will (in theory) produce a high quality implementation and suite of tests at the end.
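For concreteness, one turn of that cycle might look like this (a toy example of mine, not from the post):

```python
# Red: write one small failing test for behavior that doesn't exist yet.
def test_slug():
    assert slug("Hello World") == "hello-world"

# Running test_slug() at this point fails (slug isn't defined): that's red.

# Green: the minimum code to make the test pass.
def slug(text):
    return text.lower().replace(" ", "-")

test_slug()  # now passes: green

# Refactor: clean up with the test as a safety net, then write the next
# failing test (say, for punctuation) and go around again.
```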
I was going to say the same thing. His characterization of TDD as "writing a whole smattering of tests and then coding until they all pass" suggests he's gotten some bad information about TDD. Sadly common, though.
His TLD is pretty much TDD, except he doesn't mention refactoring after passing the tests. Even the idea of leaving a failing test at the end of the day as a kind of "todo" for the next day: I'm pretty sure Beck mentioned that in his TDD book.
EDIT: Found it finally. From Beck's Test-Driven Development by Example (page unknown, ebook copy, chapter 27):
> How do you leave a programming session when you're programming alone? Leave the last test broken.
> Richard Gabriel taught me the trick of finishing a writing session in midsentence. When you sit back down, you look at the half-sentence and you have to figure out what you were thinking when you wrote it. Once you have the thought thread back, you finish the sentence and continue. Without the urge to finish the sentence, you can spend many minutes first sniffing around for what to work on next, then trying to remember your mental state, then finally getting back to typing.
> I tried the analogous technique for my solo projects, and I really like the effect. Finish a solo session by writing a test case and running it to be sure it doesn't pass. When you come back to the code, you then have an obvious place to start. You have an obvious, concrete bookmark to help you remember what you were thinking; and making that test work should be quick work, so you'll quickly get your feet back on that victory road.
> I thought it would bother me to have a test broken overnight. It doesn't, I think because I know that the program isn't finished. A broken test doesn't make the program any less finished, it just makes the status of the program manifest. The ability to pick up a thread of development quickly after weeks of hiatus is worth that little twinge of walking away from a red bar.
Beck had this process (test, code, refactor) in place in his original eXtreme Programming book, which helped spawn the whole “agile” movement. It’s kind of cool to see how the process was refined over time.
It’s also sad to see how twisted people’s ideas of both agile and TDD have become, usually because they never read the source material.
Having been involved in that world from pretty early on, I must admit I'm pretty appalled at the odd interpretations I get both on HN and in real life.
I'm not sure where to put the blame, but "Agile" does get a particularly bad rap these days. Mostly I'd blame a corporate watering down of Agile concepts and money-hungry consultants who didn't know much; I also have a personal dislike of how most of SCRUM tends to be implemented.
TDD in particular has fallen off the map in a way that I find very surprising. Generational amnesia I suppose.
I mean, we should get used to this kind of stuff and just accept that Agile means "corporate agile", and call classical agile exactly that: classical agile. Just like how OOP today means "like C++" and not whatever Smalltalk is.
It's very common. When I was working at a USAF base they decided to adopt Lean. What they actually implemented was almost the exact opposite of Lean in every way, almost comically so.
A key element of Lean is empowering the workers to improve the processes. Let them come up with ideas that improve things and run experiments (guided by management perhaps, but not directed by). But the way USAF did it, the managers would watch a process being done, identify "wasted" movement, and then rewrite the process/procedures to eliminate that wasted movement. It was clearly just Scientific Management but being called Lean because they "leaned out" the processes. Naturally, the actual workers did not like coming in every other week and having to learn their job all over again. After a while they still held "Lean Events" but by then it was for show rather than to actually effect change in how things were done.
Once the scrum "masters" start becoming unspoken management, you know the process has gone off the rails and is no longer agile.
Sometimes it makes sense; some corps don't actually have any need to churn out new features, and at that point you just have kanban, or even waterfall. It'd be nice to stop pretending, though.
Agile is a lot like Communism: there's a lot of people who swear it works great but everywhere it seems to have been tried it hasn't gone well... And the apologists just insist "those places didn't understand it and do it right".
I worked at a big company which did a full agile transformation, including full-time agile coaches. I imagine everyone hired afterwards hated agile for the very real swirl the process caused, but those of us who’d been there before knew it was FAR better than that chaos.
I also worked at a funded start up that implemented agile from the beginning, worked pretty great.
I don't understand this recent pushback against agile development.
Do you really want to go back to the waterfall and V model style of development? Those are basically guaranteed failures in a fast moving industry like software development. If anything your comparison should be applied to the waterfall/V model because it is essentially central planning.
If you are doing things like building MVPs, iterated/incremental development with frequent changes and deployments to production then you are doing agile development.
Maybe it is not hyper formalized like Scrum but it is agile nevertheless.
There's a bias there: I think there are a lot of small shops where agile works great, and you mostly hear about the big shops that use it, where it will never work great, because you probably don't care that much about SmallTechCo.
Versus communism where people (e.g. Benjamin Tucker) were predicting its primary failure mode decades before it ever was instantiated at any scale.
Small shops have a benefit of freedom of choice that large corps often don't give to workers. Sometimes it's accidental, sometimes deliberate, but large corporations have a tendency to want a single defined process. Variation (or as they'd call it "deviation") from that process is bad (it's not, but from the corporate management perspective it is). It has to be justified, you have to prove the thing you want to do will work before you can even try it to demonstrate that it does or does not work.
This hinders freedom to experiment with new techniques and methods within development teams, but it doesn't stop it. A "trick" is to provide all the artifacts that they want as if you followed their process to the letter, but still do things the way you want so long as it gets the job done. The problem with that is that you have no evidence you did things differently than the defined process and so they'll continue to believe the defined process is perfectly fine, if not excellent. Then some exec will decide to write a book about it and become a consultant selling the (broken) defined process (I assume this is how SAFe came to be, an ironically very rigid "agile" process).
I don't know why you'd care about a brand new test being broken overnight. It's not like you're submitting the change yet anyway. It's just leaving a breadcrumb.
I think it stems from the good practice of pushing only working code. And it has a point: you have finished something. Sometimes if you leave something “hanging” you keep thinking about it for a while, especially when falling asleep.
It's a good idea to push code before signing off: what if you get sick and someone needs to pick it up, or your computer dies? A small broken test can likely be fixed faster than reinventing the whole thing.
Yes, you are right about the technical aspect, but my comment was about psychology. Writing a test and going to sleep is a bit like asking your friend a question and waiting until morning for the answer: it might get annoying.
Have you tried it? It's actually quite refreshing since you don't need to remember what you were doing in the back of your mind. The test is there for whenever you get back to it.
However, you leave out the following section, titled "Clean Check-in", in which Beck argues the exact opposite when working on a team. Clearly that was written in the days before git, when trusting code to sit on your PC overnight without checking it in was risky (I, too, would not trust Windows 98 with code that long).
But this is the general problem with the book and TDD. It's outdated and overrated. The book is not well written, even by year-2000 standards, and I'm quite surprised people are still referencing a 20-year-old book that is almost entirely composed of trivial examples using Java classes and objects. It is my least favorite tech book on my shelf.
> leaving a failing test at the end of the day
As far as this point goes, you pretty much have to. The TDD methodology, per the book, is to get to green as fast as possible. If you are testing for a function to have inputs of 5 and 2 and expect an output of 10 then the book literally tells you to do:
function multiply(a, b) {
    return 5 * 2; // hard-coded just to make the one test pass
}
What's going to happen is you write that code, get distracted or need to quit for the day, and you come back and your test is green. You forget that you wrote some total shit like the above and move on to the next Jira ticket.
As a methodology or system it's just bad. Imagine instead of writing the implementation to pass the tests that you instead are writing tests for some AI to "implement". You set the inputs and the expected outputs and the AI goes to work and does the implementation. In AI this would be called overfitting. Yes, the tests pass. But only for the cases you wrote. There is no guarantee your code works for the general case. Now replace AI with you and the same thing will happen. If all you care about is green tests you're liable to stop thinking of what the implementation should be doing.
Sure, you won't do that. But I promise you people, in general, do. One code base I worked on had minimum coverage required. More than half of the tests were total garbage. They either tested nothing useful, or were false positives that could never fail (because no one knows to check for failure, ever!). This is the opposite scenario, but the same outcome. It's the difference between the letter of the law and the spirit of the law. Give the people a rule to follow and they will mindlessly follow it.
Agile and TDD are both so nuanced, and leave so much up for interpretation, that it's no surprise we continually see debate about "true TDD" or "true Agile".
There are useful ideas in TDD. But this would be better packaged and sold as "here are some ways to build software under X scenario." These are techniques applicable to a time and place and not a paradigm.
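For what it's worth, the book's own answer to that hard-coded fake is "triangulation": add a second test with different inputs so the fake can no longer pass, forcing the general implementation. Roughly (my toy rendering, not the book's Java):

```python
# The hard-coded "fake it" step: passes multiply(5, 2) == 10 and nothing else.
def multiply(a, b):
    return 10

assert multiply(5, 2) == 10   # green, but overfitted to one example

# Triangulate: a second example makes the fake go red...
# assert multiply(3, 4) == 12  # would fail against the fake above

# ...so the fake gets generalized:
def multiply(a, b):
    return a * b

assert multiply(5, 2) == 10
assert multiply(3, 4) == 12   # green for the general case
```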
Yes, I was actually considering writing another comment about the next section and how that's (somewhat) changed thanks to git and better distributed version control systems these days.
Regarding that code snippet:
If you forgot that you hadn't finished it, and then check it in, then that's on you. Good news, hopefully you and your office aren't morons and you aren't relying just on the tests from the TDD bits, because TDD itself doesn't directly address creating integration and end to end tests. So your integration tests will catch that. And if not, it'll make it to production and your customers will, rightly, call you a moron. And you'll be embarrassed, write a test to catch the error, fix it, and hopefully not fuck up like that again.
TDD doesn't aim or claim to cover all the testing needed to verify and validate a system. It's one part of the whole (if you use it at all). In fact, it doesn't address validation at all so that's something you have to cover another way entirely.
> If all you care about is green tests you're liable to stop thinking of what the implementation should be doing.
If all you care about are green tests, I'd say you aren't just liable to stop thinking but that you have stopped thinking. You have become, in my more polite way of saying it these days, a fool. My advice: don't be a fool. You have a brain, use it.
There is an awful lot of discussion out there about TDD that does not mention the Red-Green-Refactor concept at all. I think that really changes the impression of TDD.
It certainly did for me. A former teammate did a lunchtime session on TDD, which introduced me to the concept. I’m self-taught, and my foray into testing was very much a matter of trying to bolt tests on after the fact. So this concept of red-green-refactor was wild to me. A little intimidating at first, and I didn’t adopt it right away.
But when I did, it not only made testing better, it made my code better too. Not only because it’s more testable, but because it makes me think about the interface first, and the implementation truly as a black box as much as possible.
First time I've ever heard of that tool, it looks incredible. I wish Xcode had some of that functionality. It's close in some ways, all the building blocks are there.
What do you mean? The Red-Green step is "test fails, test passes". You don't see the benefit of the test passing? Or you don't see the benefit of having known that the code previously didn't pass?
If you skip straight to green you don't know, for certain, that the test actually tested what you expected. This isn't even a TDD thing. When you're working on an existing, deployed, system and a user finds a problem, you generate a new test (well, sensible orgs and people will). That test will fail, because you haven't addressed the issue yet. That is, it's "red". Then you make it pass by fixing the system, it becomes "green". That's it. If you fix the system and then write the test, do you know that the test actually recreated the original failure? Or is it merely exercising the new or altered code?
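A bug fix in that style, in miniature (toy code of mine): running the new test against the unfixed code first proves it actually reproduces the report.

```python
# Code already in production, with the reported bug: empty input crashes.
def average(xs):
    return sum(xs) / len(xs)

# New test written straight from the bug report:
def test_average_of_nothing_is_zero():
    assert average([]) == 0.0

# Run it before fixing anything: it must go red (here, ZeroDivisionError),
# which confirms the test recreates the original failure.
try:
    test_average_of_nothing_is_zero()
except ZeroDivisionError:
    pass  # red, as expected

# Now fix the system; the very same test turns green.
def average(xs):
    return sum(xs) / len(xs) if xs else 0.0

test_average_of_nothing_is_zero()
```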
Absolutely, and most prominently when writing new code; the red-green transition is essential there.
You should not write any production code except to make a red test green.
Think of the tests as the specification of your system.
If all tests are green, your system meets the specification. Thus there is no need to write production code.
So in order to make the system do something new, you need to first change the specification. So you add a test. When you add this test, it will almost certainly fail. After all, you haven't written the code to implement the feature.
Then you make the test green, and now the system once again matches the (now updated) specification. Commit/Refactor/Commit.
Having the test that is red also validates your tests. If your tests are always green, how do you know that you're actually testing something?
In fact, it sometimes happens that you write a test that you think should be red, because you haven't implemented the feature yet, but then it starts off green. Meaning you inadvertently already built the feature. This can be very confusing... :-)
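A concrete shape this takes (a contrived example of mine): a test with a weak assertion that stays green no matter what, which is exactly what watching it go red first is meant to catch.

```python
def double_all(items):
    return [x for x in items]  # bug: forgot to actually double anything

def test_double_all():
    result = double_all([1, 2, 3])
    # Weak assertion: any non-empty list is truthy, so this passes even
    # though the doubling is missing. Deliberately making this test fail
    # first (red) would have exposed how little it checks.
    assert result

test_double_all()  # green, despite the bug
```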
There is a saying that coding is debugging a blank piece of paper. ("Piece of paper" tells you how old this saying is.)
"New code" just means "I want a program to do a thing, and it doesn't do it yet. That's a bug." The difference between "bug" and "new feature" is more a matter of perspective than actual development effort.
If you write the code first and then the test, how do you know your test works? You've only ever run it against working code.
Like if there was a virulent disease for which there was a 100% cure, but you can't get the cure unless you test positive for the disease. I give you a test and say "Yep, test says you are healthy". Ok. What if the test always says people are healthy? "Have you ever tested an unhealthy person and the test detected that they were unhealthy?" "Oh, no, we've done this test a thousand times and it always says people are healthy!"
You write the test first, because your code does not yet have the feature that you are testing. Your current code is a perfect test for the test. Anyone who has done TDD for even a short amount of time has written a test that should have failed but instead passed. Sometimes it was just a simple error in the test; you fix the test so it can detect what you are looking for (i.e. the test now fails). Other times a fundamental misconception was discovered that blows everyone's mind.
I'm replying to this one, but really, it applies to each of the replies to tester756. I want to say you all are insane, but I'm wondering if it's just that I do a different type of programming, or that the language I'm using doesn't lend itself well to TDD. Example, I have a function, in C [1], that runs another program and returns true or false. The prototype is:
tag is informational; argv is an array where argv[0] is the program name, the rest are arguments. Okay, how would you write a test first for this? You literally can't, because the function has to exist to even link it to a test program. Please tell me, because otherwise, this has to be the most insane way to write software I've come across.
[1] LEGACY CODE!
[2] I go more into testing this function here: <https://boston.conman.org/2022/12/21.1>. The comments I've received about that have been interesting, including "that's not a unit test." WTF?
What's the problem here? You would write the test (if you're doing TDD, which you wouldn't always do anyways because you recognize it's a tool and not all tools are appropriate for all situations). The test would fail because it doesn't compile, so you make it compile. Not compiling counts as a failed test, even fools get that.
Unless of course you're only working on trivial programs (based on your write up, not the case) or you're an absolute genius, you must have at some point or another encountered a failed compilation and used that as feedback to change the code. This is no different.
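The same thing in a language without a link step, for comparison (the names here are made up, since the C prototype isn't shown in the thread): the very first "red" is the test failing because the function doesn't exist at all.

```python
# First "red": the test references a function nobody has written yet.
try:
    spawn_and_wait  # the NameError raised here *is* the failing test
    first_red = False
except NameError:
    first_red = True

# "Make it compile": a stub moves the failure from existence to behavior.
def spawn_and_wait(tag, argv):
    raise NotImplementedError

# The test is still red, but now for a reason you can implement against.
```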
Then I'm not even a fool, because I didn't get that "not compiling counts as a failed test." It feels like the goal posts get shifted all the time.
Yes, I've gotten failed compilations, and every time it's because of a typo (wrong or missing character) that is trivial to fix, no test needed (unless you count the compiler as a "testing facility"). That is different from compiles that had warnings, which are a different thing in my mind (I still fix the majority of them [1]).
But I'm still interested in how you would write test cases for that function.
[1] Except for warnings like "ISO C forbids assignment between function pointer and `void *'", which is true for C, but POSIX allows that, and I'm on a POSIX system.
I just want to respond to that sentence about "goal posts get shifted". In his book Test-Driven Development by Example (2003), Kent Beck says in the preface (page x): "Red-Write a little test that doesn't work, and perhaps doesn't even compile at first"
There is no goal post moving.
More likely, as we transmit information, we don't do it correctly, and knowledge/data gets lost. I find it quite enlightening to always go back to the source.
I think you've mistaken me for a unit testing (your restricted definition) and TDD zealot. I wouldn't aim for 100% code coverage using the restricted definition of unit testing you've provided, or any other testing mechanism. The lines you're missing coverage for, in particular, are ones where syscalls fail. Those are hard to force in general, and I'm not going to be bothered to mock the whole OS and standard library. I do see a way to cause `open` to fail, though, you can change the permissions to /dev/null, but that doesn't get you your desired version of a unit test that doesn't touch the file system.
At some point, I, and probably most people, operate under the assumption that we don't need to test (ourselves) that syscalls will do what they say they will do. Until they actually fail to act correctly and then I'd investigate it, and write tests targeting it to try and reliably replicate the failure for others to address since I'm not a Linux kernel developer.
My post is a reaction to my previous job, where management cared more about tests than the actual product, and I was tasked with proving a negative. I've been asking for a definition of "unit testing" and I always get different responses, so I've taken to asking using my own projects.
It may seem cynical, but I assumed that anyone into "testing" (TDD, unit testing, what have you) wouldn't bother with testing that function, or would do only limited testing of that function (as I wrote). You aren't the first to answer with "no way am I testing that function to that level," but nowhere have I gotten an answer to "well then, what level?"
This may seem like a tangent to TDD, but in every case I try to see how I could apply, in this case, TDD to code I write, and it never seems like a good match. What I'm doing isn't a unit test (so what's a unit? Isn't a single function in a single file enough of a unit?). I'm not doing TDD because I have to write the code first (otherwise the testing code fails to compile, so there's no artifact to test).
People are dogmatic about this stuff, but there's no discussion of the unspoken assumptions surrounding it. Basically, the whole agile, extreme, test-driven design movement seems to have fallen out of the enterprise arena, where new development is rare, updating code bases that are 10, 20, 40 years old is the norm, and management treats engineers like assembly-line workers, each one easily replaceable because of ISO 9000 documentation. And "agile consultants" are making bank telling management what they want to hear, engineers be damned because they don't pay the bills (they're a cost center anyway).
I had written more, but my cat decided pounding on the keyboard was a good idea and got lucky with F5.
Anyways, you never asked me "well then, what level?" and I thought I did answer it, but here's an answer anyway (to your unasked question): I'd test it to the point that made sense. I wouldn't follow some poorly considered hard and fast rule (morons do that; we're humans with brains and a capacity to exercise judgement in complex situations). A hard and fast 70% code coverage rule is stupid, as is 100%; even a strict 1% rule is stupid (though for other reasons, like that it's trivially achieved with useless tests for almost every program). If I'm writing code and 90% of it is handling error codes from syscalls, then you'll likely end up getting 10% code coverage from tests (of various sorts, not just unit) out of me. I'm not going to mock all those syscalls to force my code to execute those paths, and I'm not going to work out some random incantation that somehow causes fork to fail for this one program or process but also doesn't hose my entire test rig. Especially not when the error handling is "exit with this error code or that error code". If it were more complex (cleanly disconnects from the database, closes out some files) then I'd find a way to exercise it, but not by mocking the whole OS. That's just a waste of time.
To reiterate my take: We have brains, we have the opportunity to use them. Use the appropriate techniques to the situation, and don't waste time doing things like mocking every Linux syscall just because your manager is a moron. Educate them, explain why it would be a waste of time and money and demonstrate other ways to get the desired results they want (in a situation like your example, inspecting the code since it's so short should be fine).
> The test would fail because it doesn't compile, so you make it compile.
But in this situation the test will fail regardless of what you wrote in the test code. So the supposed usefulness of the failure, showing that you are actually testing what you mean to be testing, is nonexistent, and the exercise of making it fail before making it pass is pointless.
Then don't do it that way? I don't get why people are hung up on this. As I've said in other comments, you have a brain. Use it. Exercise judgement. Think.
If this is actually the one thing that trips you up on TDD, then don't do this one thing and try the rest. This is the easiest part of TDD to skip past without losing anything.
I like that write-up in [2]. I have not really been exposed to C in a very long time, and it has been quite informative.
I also like that Set of Unit Testing Rules. That is basically correct; external systems are a no-no in unit testing.
Usually you deal with mocks through indirection, default arguments, and other things like that, so you can exclusively test the logic of the function. That is more difficult in C, from what I've seen in your write-up, than in other languages. But if you care about not having that in your code for performance reasons, then more likely than not you will not be able to unit test. And that is fine. You have an integration test (because you are using outside systems). You can still do integration tests first, as long as they help you capture the logic and flow. The issue is that they tend to be far more involved, and far more brittle (as they depend on those outside systems).
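The indirection trick translates to something like this (hypothetical names, and Python rather than C, where a function-pointer parameter would play the same role): the default argument is the seam that lets a unit test swap out the external system.

```python
import subprocess

# Production callers use the real subprocess.call by default; a unit test
# injects a fake "runner" and never actually spawns a process.
def run_ok(argv, runner=subprocess.call):
    return runner(argv) == 0

# Unit tests: pure logic, no external system touched.
assert run_ok(["whatever"], runner=lambda argv: 0) is True
assert run_ok(["whatever"], runner=lambda argv: 1) is False
```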
It does make sense with new code, yes. Whether for all code or not, or whether unit tests or integration or end to end tests though is up to your judgement. No one but you, who knows your system and your capabilities and knowledge, can decide for you. Only fools expect a technique to replace thinking. I assume you're not a fool.
For new code, the reason it makes sense is that your system is bugged. It does not do what it's intended to do yet because you haven't written the code to do it (or altered the existing code to add the new capability). So the test detects the difference between the current state of the system and the desired state, and then you implement the code and now the test detects that you have achieved your desired state (at least as far as the test can detect, you could still have other issues).
> It's a development process that will (in theory) produce a high quality implementation and suite of tests at the end.
In fact the theory is that this approach will produce a high quality design as an emergent property. This is an extraordinary theory that requires an extraordinary proof - one I haven't seen so far.
It's fairly straightforward: TDD is red/green/refactor, which means you're working in very small steps (about a minute or two each) and thinking about design during two out of three of those steps.
During "red", you're thinking about the design of your public interface.
During "green," you're focusing on implementation.
During "refactor," you're thinking about how to improve the quality of your implementation and how to improve the overall design, and making those changes.
If you believe that spending a lot of time thinking about and improving your design will produce a high-quality design, then TDD will produce a high-quality design. QED.
If you don't accept the axiom, then it's a longer discussion, but that's the proof, and my experience is that it does in fact work.
(If you're looking for a rigorous study and proof, you won't find it, because there are no rigorous studies that formally prove what creates high-quality design. Partially because there is no formal definition of "high-quality design" in the first place.)
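The refactor step in miniature (a toy example of mine): behavior is pinned by the tests, so the implementation can be reshaped freely, and the green bar proves nothing broke. That feedback loop is the mechanism behind the design claim.

```python
def test_fizzbuzz():
    assert fizzbuzz(3) == "Fizz"
    assert fizzbuzz(5) == "Buzz"
    assert fizzbuzz(15) == "FizzBuzz"
    assert fizzbuzz(7) == "7"

# Green after a few red/green turns, but repetitive:
def fizzbuzz(n):
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

test_fizzbuzz()

# Refactor: same behavior, different shape; rerunning the tests is the check.
def fizzbuzz(n):
    out = ("Fizz" if n % 3 == 0 else "") + ("Buzz" if n % 5 == 0 else "")
    return out or str(n)

test_fizzbuzz()
```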
TDD is obviously a hill climbing strategy as far as the end result is concerned and the outcome has the associated baggage.
"Refactor" is named this way to emphasize that changes you are making are closely related to the tests you already have and the tests you are about to introduce.
That does not leave enough space to justify a QED. If you choose to design beyond that, the process stops being TDD, at least as described by Kent Beck.
TDD has several big issues that lead to bad design:
1. It assumes your spec is good and rigid. In reality, most specs are shitty and fluid. And your first understanding of spec is wrong.
2. It assumes your first implementation of the spec is good enough to justify automated testing.
3. It leads to high test coverage which inhibits refactoring (despite zealots telling you otherwise).
4. Almost always it leads to obsession with testing, which leads to a ton of unnecessary complexity (e.g. dependency injection for the sake of testing, weird practices like "don't mock what you don't own", etc)
I always thought it's great that it forces you to think about public interface, but I came to believe that thinking cannot be forced with a ritual.
> 3. It leads to high test coverage which inhibits refactoring (despite zealots telling you otherwise).
The major flaw with TDD is that if you get your test wrong, you get the wrong code. The intent is to inhibit refactoring, because the assumption is that the tests are correct, so any refactoring must be done within the constraints imposed by those tests.
OFC the tests and supporting design are usually just as flawed as the code. This is why I say the first step of TDD is wrong: write your feature first, not your test. If you don't have a feature, and can't logically reconcile it with your other features, then it's not worth even writing the test in the first place.
Tests are just a supporting tool once you believe you have the feature written: they protect the other features you have written (at least as well as those are tested), and they validate that you aren't wildly breaking the system's expectations. A large number of test failures is a measurement that something is wrong, but it doesn't tell you whether the feature itself is wrong or your design is wrong, just that one of the two is true.
That's how it helps you refactor: with a "good" design you can add new features and few tests will break. As more tests are added and total test failures approach zero over time, you gain some confidence that the system is good. You never gain certainty, just the knowledge that your constrained refactor probably didn't break anything.
Right, and in the absence of a well-known stable spec, then it seems to me that TDD (and related) is just a variant on the very old and very discredited "Waterfall Model" [0]
Hardly. If it were a variant of Waterfall you wouldn't have testing mixed in with the development of new code. Waterfall has explicit barriers between the development and test phases, which is why those systems usually turn out to be clusterfucks (unless they're small, or you have, usually by luck, a correct specification). In real-world Waterfall projects (which I have suffered through; the worst was a multi-billion dollar disaster for taxpayers and a multi-billion dollar success for the contractors we took it over from), testing happens after development, which means you have no useful feedback while developing that you are building the wrong thing or building it incorrectly.
I had in mind that Waterfall separates architecture design from detailed design from implementation, and in the above case, the spec being precise and finished certainly means that the top level design (or architecture or whatever) is 100% finished before the TDD development begins.
But TDD does not separate architecture design from detailed design from development from testing like Waterfall. It integrates them, or at least enables integrating them. It is literally not Waterfall, which is an idealized (and thus its primary flaw) process predicated on that strong separation that TDD deliberately breaks. TDD is predicated, instead, on the idea that we don't know everything up front. Otherwise we wouldn't need or want to write tests in the middle of development.
As to the downvote, I guess you thought it was me, it was not. But I don't plan to upvote you either.
For the moment I will assume that I have a misconception that led me astray, and that I should correct in the future once I double check the thinking and the definitions etc., and I'll upvote you now for leading me in a better direction.
> For the moment I will assume that I have a misconception that led me astray
If I understood your posts right, it's the same misconception a few others have had in here:
In TDD, it's not write all tests, then red->green->refactor. It's write one test, red->green->refactor, then write one more test, red->green->refactor, repeat until done.
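A minimal sketch of that one-test-at-a-time loop in Python (the `slugify` function and both tests are invented for illustration; in practice you would edit a single definition in place rather than redefine it):

```python
# Iteration 1 (red): write exactly one test before any implementation exists.
def test_lowercases():
    assert slugify("Hello") == "hello"

# Iteration 1 (green): the least code that makes it pass.
def slugify(text):
    return text.lower()

# Iteration 2 (red): one more test; it fails against the version above.
def test_replaces_spaces():
    assert slugify("Hello World") == "hello-world"

# Iteration 2 (green): extend just enough, then refactor if anything smells.
def slugify(text):  # redefined here only to show the before/after in one file
    return text.lower().replace(" ", "-")
```

The point of the small steps is that each new red test tells you exactly which behavior you are adding next, instead of a pile of up-front tests all failing at once.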
If it was a variant of the waterfall model then it would just put the testing phase up front and once you developed all your unit tests you start making them pass.
Also, as with almost everything else in life, it's important not to be so dogmatic about the "rules". Everything is negotiable - I'd never recommend only ever following the strict red-green-refactor workflow.
I recommend people learn the “proper” technique, even if they don’t apply it in every context. As they say, _you need to be in the mould to break the mould_.
TDD properly done is a 4 step iterative process. 1) Write test. 2) Run test and see that it fails. If it doesn't fail figure out why and fix it so it does. Either the test is wrong, or what you're testing is trivial. 3) Write code so test passes. 4) Refactor code so it's good. Next go back to step 1.
Step 4 is the trickiest and most important part. Refactoring transformations must be such that they do not invalidate the results of step 3. But if you don't execute step 4 properly you'll end up with crap code.
1. 90% of the time, what happens during step 4 is the realization that your tests are crap. So you have to throw them away and start from scratch. Tests are not helping with refactoring; they are inhibiting it.
2. Step 4 is indeed the most important part, and yet TDD priests and scriptures don't cover it at all. TDD is actually distracting you from what's important, because it focuses on steps 1,2,3. Eventually you'll become disciplined enough to not get distracted, and you'll think that TDD works. But the reality is that you never needed TDD in the first place.
I’ve used TDD as a kind of proxy for predicate transformer semantics. Test first is a way of sneaking specification first into a team’s workflow. But let’s be clear, overspecification is bad specification! And that’s why as you say a lot of times the tests get in the way: they are overly specific and thus inhibit desirable refactoring. In my experience it’s better to err on the side of underspecifying. Most everyone agrees with me on that, since the total lack of specification you generally see is a species of extreme underspecification. Don’t think I’m being snarky either, even that level of underspecification can be appropriate in the exploratory phase, although I think it’s generally worth the effort to start from a specification that’s at least slightly more restrictive than the always true predicate.
Yeah, that's how I think about testing in general. Tests should be a reflection of your specs. But if your specs are bad (most of them are), you should wait for them to improve. Not carve them in stone.
Sadly, this was my impression of TDD for years. I always thought that writing every single test case up front would never make any sense. It was only when I read the original definition of TDD from Kent Beck that it clicked. I use it quite often now and it makes development a lot easier.
It always amazes me when someone has the audacity and sheer lack of curiosity to decide that something they've read about means something else, and then "improves" it, announcing "I've done something I've dubbed...", back into what the real thing was in the first place.
TDD has a wikipedia page [1] FFS, which is the #1 hit entering TDD into google, and which very clearly lays out that TDD is a test/code cycle (red, green, refactor anyone?). What the author claims is TDD is called TFD (Test First Development).
I'm not sure if I go even further or not, but I'm perfectly happy to write code first and then write tests. And I will sometimes not write tests until I start debugging code and start thinking to myself "is it a buggy bit of this function I haven't tested yet?" and so I'll write the test to prove it isn't that bit.
With code I'm getting paid for I'll write more tests up front and won't skip that step, but for code I'm playing with then as long as I'm just enjoying myself writing code I'll write zero tests for awhile and just code. Then as I hit the debugging/refactoring step I'll do the backfilling as a form of debugging. Often I'll write the code that fixes the bug, then write the test, then quickly and temporarily revert just the code to ensure that the tests fail and then proceed. That gets you the same safety check as doing it the test-driven way to validate your test actually tested the right thing.
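That revert-to-red workflow can be sketched like this (the `parse_iso_date` function and its month-13 bug are hypothetical):

```python
# Regression-test workflow:
#   1. Fix the bug in the code.
#   2. Write a test exercising the formerly-buggy input.
#   3. Temporarily revert just the fix, confirm the test fails (red),
#      then restore the fix.

def parse_iso_date(s):
    # Post-fix version: the pre-fix code accepted month 13 without complaint.
    # (Toy parser: no real calendar validation beyond the month range.)
    year, month, day = map(int, s.split("-"))
    if not 1 <= month <= 12:
        raise ValueError(f"bad month: {month}")
    return (year, month, day)

def test_rejects_month_13():
    # Reverting the fix makes this fail, proving the test targets the bug.
    try:
        parse_iso_date("2024-13-01")
        assert False, "expected ValueError"
    except ValueError:
        pass
```

The test-first and test-after routes end in the same place; what matters is that the test has been observed failing against the buggy code at least once.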
I really tend to hate "thou shalt start by writing tests" as some kind of immutable golden rule. It does always feel good to me when I just naturally wind up doing it, but forcing myself to do it every single time just isn't any fun at all.
At the same time tests are absolutely essential when it comes to refactoring and debugging. When you get too far out ahead of yourself with code then you start needing to shore up the foundations and use tests to eliminate bugs in the code that you've already written. Some code though is obviously correct enough that in personal hobby projects I won't ever code tests for them (unless I do hit the point where I start to doubt their correctness due to some funny bug at which point the situation has changed so I add some tests to prove it one way or the other).
The whole point of this, though, is that the tests are always serving me; they aren't in the driver's seat quite the way that all the TDD 101 blog posts like to ram down your throat, which I suspect is what turns people off from that approach so much.
The end result is also that you'll tend to wind up with the tests that are actually useful, covering the code that is particularly hairy or essential and the edge conditions that you really need to make sure to get correct, and you wind up having a test suite which is composed mostly of useful tests instead of all the largely useless ones that infect codebases.
I'll also happily omit tests on lower level functions that are well tested at the level above them, because I don't need to test the same thing at 18 different levels (again, for professional use I'm more likely to include tests at every level if they're fairly mechanical to produce). I also have a flexible definition of what the system under test is, which often encompasses more than just the immediate object that I'm testing and I don't bother wasting mental effort thinking about how to mock the whole world.
I don't know what kind of TXX that is. I still wind up with tests, they're legitimately essential to have, I just don't get there via some prescriptive route. I wind up with good code coverage, but it doesn't necessarily look as comprehensive as rotely banging out lots of unit tests. I typically wind up with tests that I know are useful because they were produced by hitting actual bugs, or where I had real questions about the behavior of the code and needed to assert some invariants and prove the code worked.
An important element of TDD is that it wasn't supposed to be prescriptive. It's a tool/technique to use. Sensible people know when to use it and when not to. Even Beck (who many here and elsewhere assume is dogmatic about TDD) has said on many occasions he doesn't use it 100% of the time and wouldn't use it 100% of the time.
That's not dogma, that's the technique. If you aren't writing the test first then you aren't doing TDD, and that's fine. There's nothing wrong with not doing TDD at all if that's what you choose.
The dogma most people see or claim to see is that TDD is meant to be used everywhere (or nearly). Which some fools, yes, believe. But they're just that, fools. People who use their brains (aka, non-fools) know them to be fools and do what works for them and the circumstances because they spend some time thinking about things instead of parroting a dogma (or an anti-dogma).
> The dogma most people see or claim to see is that TDD is meant to be used everywhere
And yet TDD preachers never draw the boundaries of TDD applicability. They are always extremely vague: "sometimes I see that TDD doesn't work for the problem that I'm solving and I don't use it". Well, how do you see it? What types of problems is it bad for?
If TDD works, they take credit. If it doesn't work, "it's just a tool" or "you used it wrong".
It is literally impossible to prove that TDD doesn't work. Which makes it a religion.
I mean, how can he? You're the one that knows your system so you have to be the judge on whether TDD is appropriate or not for it, or for parts of it. He says as much in his talks on TDD and in his book. You have a brain, use it, don't expect him to think for you on systems he has no knowledge of.
Pretty much the same for me: I'm only able to write any sort of decent tests if there's some initial version of the code already there. Typically that means a prototype-ish only-works-on-the-happy-path version I wrote in an exploratory way, then that first test can be for the happy path. After that it's pretty easy to copy the test, tweak parameters, and figure out where the first version breaks TDD-style.
> I really tend to hate "thou shalt start by writing tests" as some kind of immutable golden rule
The only study I've seen on this shows only tests were correlated with more correct programs, but doing tests first or last showed no significant difference.
What I find so interesting is that we all effectively do TDD, just without commitment to it - imagine writing some code and not at least testing it out with several test cases.
The absolute bare minimum should be for you to write those informal tests down and commit them to the repo.
That's some golden knowledge I've gotten over time. It just requires the overhead of setting up the test runner.
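With a runner like pytest, that overhead is close to zero: a file of plain functions with bare `assert`s is discoverable as-is. A sketch, with an invented `median` helper standing in for whatever you were poking at in the REPL:

```python
# tests/test_scratch.py (hypothetical filename) - the throwaway checks you'd
# normally run once in a REPL, written down and committed instead.
# With pytest there is no runner boilerplate: running `pytest` discovers any
# test_*.py file and executes every function named test_*.

def median(xs):
    # Toy function under informal test, purely illustrative.
    s = sorted(xs)
    n = len(s)
    return s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2

def test_odd_length():
    assert median([3, 1, 2]) == 2

def test_even_length_averages_middle_pair():
    assert median([1, 2, 3, 4]) == 2.5
```

Committing these costs a minute and turns today's manual spot checks into tomorrow's regression suite.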
But the orthodoxy seems to be: write the tests first, see them fail, then write code - not write a little bit of code, write some test cases, repeat. How "test first, then code" works in a compiled language like C++ or Rust is beyond me (unless I'm taking this to too literal a conclusion).
> How "test first, then code" works in a compiled language like C++ or Rust be beyond me
It's not really any different than in dynamic/interpreted/weakly-typed languages. "Writing the test" for a function sometimes just includes writing a method/function with the appropriate type signature that does nothing (maybe returns a dummy value).
Forcing you to view the code you're implementing from the viewpoint of someone calling that code from the very beginning is one of the advantages/goals of TDD. If you find that it's difficult to set up the objects/data you need to write a test, eg, your code has a bunch of implicit dependencies on other components being in a particular state or it takes a ton of arguments that all have to be constructed, that's usually a strong indicator that you should rethink the design. You're getting early feedback on your API design before you waste time implementing it.
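A sketch of that red step in a typed style (names and behavior are invented; Python type hints stand in here for what a C++ or Rust compiler would enforce):

```python
from typing import Optional, Tuple

# Red: a stub with the right signature, so the test file compiles/type-checks.
def parse_semver(s: str) -> Optional[Tuple[int, int, int]]:
    return None  # dummy value; the positive test below fails (red)

def test_parses_basic_version():
    assert parse_semver("1.4.2") == (1, 4, 2)

def test_rejects_garbage():
    assert parse_semver("not a version") is None

# Green: swap the dummy body for a minimal real implementation.
def parse_semver(s: str) -> Optional[Tuple[int, int, int]]:  # redefined
    parts = s.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        return None
    major, minor, patch = (int(p) for p in parts)
    return (major, minor, patch)
```

Writing the stub first is also where the API-design feedback shows up: if the signature is awkward to call from the test, you find out before implementing anything.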
Combine this with usage testing. At least for libraries, which I seem to end up writing a lot of, I like writing usage tests that don't necessarily test units but functionality. Doing this early on helps flesh out the friction in usage, and it helps with testing since such tests often have broad coverage. Each one also now sits as a potential usage example for the documentation.
Yep, agreed. I tend to just code a single giant integration test where I map out the happy path to start with, and then when it works code tests for known edge cases at specific parts.
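Sketched in Python with a made-up `Cart` class: one broad happy-path test first, then narrow tests for specific known edges:

```python
class Cart:
    """Toy shopping cart, invented for illustration."""

    def __init__(self):
        self.items = {}

    def add(self, sku, qty=1):
        if qty <= 0:
            raise ValueError("qty must be positive")
        self.items[sku] = self.items.get(sku, 0) + qty

    def total(self, prices):
        return sum(prices[sku] * qty for sku, qty in self.items.items())

def test_happy_path():
    # One big end-to-end walk through the main flow.
    cart = Cart()
    cart.add("apple", 2)
    cart.add("pear")
    assert cart.total({"apple": 3, "pear": 5}) == 11

def test_edge_rejects_nonpositive_qty():
    # Edge-case tests added afterwards, one per known hazard.
    cart = Cart()
    try:
        cart.add("apple", 0)
        assert False, "expected ValueError"
    except ValueError:
        pass
```

The happy-path test doubles as living documentation of the intended flow, and each edge-case test records a specific hazard you actually hit or anticipated.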
I don’t either. But somehow TDD is considered the sanctioned way to work with unit tests, and that it is almost pointless to write tests unless you’re doing TDD. I am relieved to read that I am not the only one.
Great article, fully agree with all the points.