Unit testing is over-hyped

- Over-hyped is my polite way of saying it; put bluntly, it’s largely useless in the context it is used in 2026. This is purely my observation from 10 years of writing code. I am hopeful that this idea gets refined over the years, and I am open to debates and discussions. When I started out, testing was always the missing piece for me. There is a plethora of information about testing but there are no fixed dos and don’ts, mainly because testing depends heavily on taste, team culture and the individual codebase. There is no right and wrong. That’s the exact stand I am taking with unit tests as well. Yes, there are places where they have helped me, but is it worth the effort and time? Writing unit tests is now a “fixed must-do” step in any vibe-coded application, so I have been hearing about unit tests a lot since the explosion of AI coding. But unit tests have never been very useful for me.
- Let’s first forget about LLMs and agentic coding and see why I don’t enjoy unit testing.
- The idea of testing individual functions is problematic for me. I see code slowly taking on more unwanted modularity just so that it can be unit tested. Most of the time, writing 1 function instead of 10 small functions is cleaner and cognitively easier to read. But that often runs counter to unit testing, so the design gets more complex to accommodate the tests. A lot of code cannot be unit tested as it stands, so we are forced to change its structure.
package main

import (
	"fmt"
	"testing"
)

// DoWork starts a goroutine and returns immediately.
func DoWork() {
	go func() {
		result := 42 * 2
		fmt.Println(result)
	}()
}

func TestDoWork(t *testing.T) {
	DoWork()
	// Now what?
	// The goroutine runs independently
	// You have no handle on result
	// You can't assert anything
	// The function already returned
}
To get any handle on the result you have to change the API. Sometimes you’re changing the API purely for testability. That’s the modularity creep problem: the test is quietly shaping the design.
func DoWork() chan int {
	ch := make(chan int, 1)
	go func() {
		ch <- 42 * 2
	}()
	return ch
}
- A main property of unit testing is purity. At least that’s what all of us chase. A general guideline is that unit tests need to be pure. Pure means the test checks one thing in isolation; is deterministic (you get the same output if you run it 100 times); has no side effects, i.e. it doesn’t write to a database, send emails, mutate global state, or leave anything behind, so after the test runs the world looks exactly the same as before; and is self-contained, meaning it doesn’t depend on the order other tests run in, doesn’t need setup from another test, and doesn’t affect other tests. If our unit tests are pure then they are fast, reliable (a failing pure test means your code is broken, not that the DB was down or the API rate-limited you), accurate, and parallelism-safe because order doesn’t matter (this is another huge topic for another day).
- But are we making sure our unit tests are pure? It’s not very intuitive, and unit tests don’t deliver their benefits if they are not written correctly. Orgs try to get there, and the best solution the “software engineering experts” found was writing mocks. Mocks are small snippets of code which, as the name suggests, mock the functionality of external dependencies so that your unit tests are self-contained and pure. More often than not these mocks are more complex than the original code. The amount of thought that goes into creating them doesn’t make sense to me, because most of the time the mocks aren’t very accurate and they take a lot of time and thinking. This is a really funny tweet and I completely agree with it. You get really good at writing interfaces and that’s it. One good use case of writing unit tests is to understand how a piece of code works, but these scary mocks don’t help you much there.
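Here is a rough sketch of what a hand-written mock tends to look like in Go. Everything in it (UserStore, Welcome, mockStore) is hypothetical and only there to show the shape:

```go
package main

import (
	"errors"
	"testing"
)

// UserStore is a hypothetical dependency; in real code it would hit a database.
type UserStore interface {
	GetName(id int) (string, error)
}

// Welcome is the code under test: a thin wrapper around the store.
func Welcome(store UserStore, id int) (string, error) {
	name, err := store.GetName(id)
	if err != nil {
		return "", err
	}
	return "welcome, " + name, nil
}

// mockStore is a hand-written mock that answers from an in-memory map.
type mockStore struct {
	names map[int]string
}

func (m mockStore) GetName(id int) (string, error) {
	name, ok := m.names[id]
	if !ok {
		return "", errors.New("user not found")
	}
	return name, nil
}

func TestWelcome(t *testing.T) {
	store := mockStore{names: map[int]string{1: "asha"}}
	got, err := Welcome(store, 1)
	if err != nil {
		t.Fatal(err)
	}
	if got != "welcome, asha" {
		t.Errorf("expected %q, got %q", "welcome, asha", got)
	}
}
```

The interface and the mock together are already longer than the function being tested, and the test only proves that Welcome works against my idea of how the store behaves, not against the real one.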

- The majority of the time these individual units don’t make sense, and don’t perform in the expected way when they are standalone.
- Reduce the friction of maintaining tests, because nobody likes doing it. With rapid changes you have to keep fixing broken tests, and over time you get so annoyed that you start to rig the system just to hit the coverage threshold. One change in code and you have to go fix 100 different things. Yes, your agent does that now, but in the end you are going to verify it, right? Which is still an uphill task. Picking up a testing pattern like the adapter is important: your test calls the adapter and the adapter calls the actual function. When the function changes, you don’t have to fix all the tests because the adapter signature stays the same; it’s a one-time fix in the adapter code.
func Add(a, b int) int {
	return a + b
}

func TestAdd(t *testing.T) {
	result := Add(2, 3)
	if result != 5 {
		t.Errorf("expected 5, got %d", result)
	}
}
You refactor Add to take a struct. Same behaviour, different signature:
type AddInput struct {
	A, B int
}

func Add(input AddInput) int {
	return input.A + input.B
}
Now TestAdd breaks even though the logic didn’t change. With high coverage you get hundreds of these. One pattern that helps is the adapter pattern: your test calls a stable adapter, and only the adapter needs updating when the internal signature changes:
// stable signature the tests always call
func AddAdapter(inputs []int) int {
	return Add(AddInput{A: inputs[0], B: inputs[1]})
}

func TestAdd(t *testing.T) {
	result := AddAdapter([]int{2, 3})
	if result != 5 {
		t.Errorf("expected 5, got %d", result)
	}
}
Refactor Add however you want; only AddAdapter needs updating, not every test. But now you’re maintaining extra scaffolding purely to keep tests from breaking.
- Code coverage is such a bad measure. Tbh even I got so frustrated once that I rigged it, because they wanted 100% code coverage. Yes, if written properly, tests might help you catch bugs. But when you push for such high coverage, your tests become even more tightly coupled to the style in which the code is written: not just function parameters, but changes to the structure inside a function that leave behaviour unchanged will break your tests. Or, most of the time, you end up writing unwanted test code just for the sake of coverage.
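Here is a sketch of how easy it is to rig coverage. Discount is a hypothetical function with a deliberate bug (the intent is 10% off for orders of 100 or more, but it uses > instead of >=). The test executes every statement, so a statement-coverage run such as go test -cover should report 100% for this function, yet it asserts nothing:

```go
package main

import "testing"

// Discount is a hypothetical function with a deliberate bug: orders of
// 100 or more should get 10% off, but the comparison uses > instead of >=.
func Discount(total int) int {
	if total > 100 {
		return total - total/10
	}
	return total
}

// TestDiscountCoverage runs both branches, so every statement is covered,
// but it checks no results and therefore can never fail.
func TestDiscountCoverage(t *testing.T) {
	Discount(50)
	Discount(200)
}
```

Discount(100) still returns 100 instead of 90, and the fully covered test suite stays green.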
- In the end we care about functionality and behaviour. Especially when you are building a web app, you don’t actually need to write unit tests (maybe for a few critical logic pieces, parsers, regular expressions, etc.); your integration test suite (Selenium, Playwright) is good enough, because your unit tests won’t cover user behaviour anyway. If you treat unit testing as a goal in and of itself, you will quickly find that, despite putting in a lot of effort, most tests will not be able to provide you with the confidence you need, simply because they’re testing the wrong thing.
- Then why do people prefer unit testing? I used to write unit tests for non-interdependent, complex logic to test scenarios exhaustively. Sometimes people prefer TDD (test-driven development) and unit tests are the easiest way of expression. Also, writing integration tests needs a wide understanding of the project. You need to know the use case, data flow, how functionality is interconnected which goes beyond myopic unit testing that just focuses on 1 function.
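This is the kind of place where unit tests did help me: self-contained logic where you can list the scenarios in a table. A minimal sketch, with ParsePort as a made-up example function:

```go
package main

import (
	"fmt"
	"strconv"
	"testing"
)

// ParsePort is a hypothetical piece of self-contained logic:
// it accepts a decimal string in the range 1..65535.
func ParsePort(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("not a number: %q", s)
	}
	if n < 1 || n > 65535 {
		return 0, fmt.Errorf("out of range: %d", n)
	}
	return n, nil
}

// TestParsePort walks a table of inputs, including the boundaries.
func TestParsePort(t *testing.T) {
	cases := []struct {
		in      string
		want    int
		wantErr bool
	}{
		{"80", 80, false},
		{"65535", 65535, false},
		{"0", 0, true},
		{"65536", 0, true},
		{"-1", 0, true},
		{"http", 0, true},
		{"", 0, true},
	}
	for _, c := range cases {
		got, err := ParsePort(c.in)
		if (err != nil) != c.wantErr || got != c.want {
			t.Errorf("ParsePort(%q) = %d, %v; want %d, wantErr=%v",
				c.in, got, err, c.want, c.wantErr)
		}
	}
}
```

No mocks, no adapters, no restructuring of the code. When a unit looks like this, I have no complaints about unit testing it.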
- I also feel that there are various other testing methodologies but they are underexplored because writing a unit test requires less thinking (unless it involves complex mocking) compared to roping in 10 different things as part of integration testing. Testing is a must, but there are a lot of other techniques out there which are worth exploring.
- I think we are all on the same page with this: nobody likes writing unit tests. Historically it has largely been forced, but there are good reasons why it is forced.
- Let’s forget about technicalities for a minute. Pre-AC (agentic coding), what was the first task for anyone new to a codebase? I wrote unit tests when I joined, and I used to assign unit tests to any junior joining the team. Writing unit tests is the best way to get acclimatized to a codebase. So I would give a somewhat arbitrary coverage target (85%, 90%, etc.) so that they would take the time to go through the code, at least enough to hit the target. But post-AC I think it is easier to navigate a new codebase. What I observe is that 20-30% of my Windsurf spend these days goes into navigating the codebase. You can argue that getting used to the codebase will reduce the spend, but the bane or boon of AC is that new code keeps piling up and changes happen rapidly. So I feel writing unit tests just to understand the codebase is slowly becoming counter-productive, and LLMs can help you navigate the same codebase better.
- The reason unit tests blew out of proportion is that vibe coders ask an agent to generate some code and then write some unit tests for the code it just wrote. There is no use beyond this. People don’t care how the tests are written because they are not going to run nightly builds that run the tests every day. I am 100% okay with this. Just don’t commit the vibe-coded test code and carry the baggage. You are unsure anyway which dependencies it uses and how it’s written, and in a rapidly changing vibe-coded application these unit tests would end up broken anyway, because there is always a “why waste tokens fixing them?” Unit tests are now a way to make sure the expected feature is built, but they don’t test the functionality as a whole. Then why are you writing unit tests as part of agent-written code? Surprisingly, vibe coders don’t have any answers.
- My goal with this blog is to say: don’t cling to just unit testing. I am leaving this blog open-ended because I am also exploring and getting better at understanding what’s best for me. Testing is essential, but you have to explore what other techniques are out there that give you enough confidence to sleep well without worries.
