His second example to "modularize" a branch condition is not functionally equivalent in _most_ in-use programming languages:
valid_user = loggedIn() && hasRole(ROLE_ADMIN)
valid_data = data != null && validate(data)
if (valid_user && valid_data) …
Is not equivalent to:
if (loggedIn() && hasRole(ROLE_ADMIN) &&
data != null && validate(data)) …
His version will always execute `validate(…)` if `data` is not null regardless of whether the user is logged in or has the appropriate role. Not knowing the cost of `validate(…)`, this could be an expensive operation that could be avoided with short-circuiting. It also seems somewhat silly (and I know it's just a contrived example), that a validation function would not also perform the `null` check and leave that up to the caller.
If validate() doesn't have a side effect then the short-circuiting doesn't matter. The micro-optimization of skipping the validation for performance reasons is premature optimization. If the performance optimization is necessary it should be stated more explicitly in the code then just being hidden being a && short circuit
I've always thought this kind of short-circuiting as an implementation detail of the runtime that should not be relied on, a hack from old school C that refuses to go away. I cringe when I see code that relies on it. It is not semantically obvious when a developer intends to use the short-circuiting trick vs when it's just inadvertently there. Of course part of the reason it lives on is because of our awful popular languages that require null checks everywhere but don't provide any syntactic help for it.
Allowing the extra constraint "doesn't have a side effect" is more dangerous than clarity in this case because it increases what a dev needs to know to modify the code, and allows for a bug to easily be introduced if the validate code is modified to have side-effects.
And though a test might be added to check for this, "Was this function run" is easier to determine than "Does this function have side-effects".
That's why all your functions should not have a side effect if at all possible. And if they do it should be stated in the function's name. Maybe something like validateAndLogIt().
And then when stuff is rewritten s.t. once you get there there's always a user logged in and non-null data? Presumably you do away with the temporary variables and just write:
has_role(ROLE_ADMIN) && validate(data)
If validate(data) needed to be called regardless of has_role, you now have a problem.
For whatever reason, I'd prefer comments to this version. Note: I actually agree with the OP about pulling the logic into named conditionals whenever possible, but in the case you do want the short-circuiting behavior I would not bother with the variables at that point.
if (loggedIn() && hasRole(ROLE_ADMIN)) {
// User has permission to do this
if (data != null && validate(data)) {
// Submitted data is valid
...
}
}
That's shorter, introduces terms in reading order (readers do not have to wonder what if (loggedIn() && hasRole(ROLE_ADMIN)) means before encountering userHasPermission. Yes, you can write the comment before the if statement, but then, it tends to become longer: "check whether the user has permission to do this") and less likely to become inconsistent when code evolves.
I see such comments as symptoms of time pressure or a somewhat sloppy, but caring, programmer.
That's an interesting trick, but it does create a whole bunch of throwaway booleans. Still, I can see that if the situation gets really complex breaking it down like this can be a benefit (but not so much in the current example where it actually increases cognitive load because you will have to read more lines to figure out what is really going on rather than what the boolean name indicates, for instance, userHasPermission does not fully cover the load and if validate would return false on being sent 'null' for that data element then you could get rid of the flag altogether).
I think part of the reason for doing it the way they do in the article is to reduce nesting. I saw the problem with missing the short circuit as well though which is why I like the suggestion of using
if (isValidUser() && isValidData(data))
And checking for data being null in the isValidData method. There is a little overhead in the call and return if data is null, but the clarity provided seems like a win in these kinds of cases.
And why not just make it clear that valid_data depends on valid_user?
valid_user = loggedIn() && hasRole(ROLE_ADMIN)
valid_data = valid_user && data != null && validate(data)
if (valid_user && valid_data) …
I think this makes it clear that valid_data has a dependency on valid_user that was being hidden in the previous version, which we're now making clear. It will look a bit weird, but I think that's a positive because it draws attention to the fact that the short-circuiting is required and that double-look coupled with a short comment will make everything much more readable. The previous version does not convey as much meaning, in my opinion.
if (loggedIn() && hasRole(ROLE_ADMIN) &&
data != null && validate(data)) …
If you want to be really concise you could also remove the precondition that loggedIn returns true before calling hasRole and have it simply return false if loggedIn isn't true.
But it seemed to me that the important bit was two dependent conditionals, not this exact example.