Better than Deming.
You cannot measure without influencing. This starting point presumes that the act of measurement is an intrusive act, even at a quantum level. Deming celebrates this; Heisenberg understands it to be a bit of a messy problem.
I agree with Heisenberg.
In our business, measurement and metrics are periodically hot buzzwords. You cannot measure what you cannot manage. Oh, wait. Strike that. Reverse it.
A common metric used in the software development business (especially OPD) is KLOCs: units of one thousand lines of code, usually measured as KLOCs per person-day or some such temporal normalization.
If I tell an engineering team that they are going to be measured on KLOCs per day, and that their perceived success is going to be based on their output against this measure, then, if the team is worth its salt, they will start to raise their KLOC per day output. There are a few ways they could do this:
- They could work longer hours, or remove distractions and work more diligently.
- They could write shite code. Lots and lots of shite code.
There are probably other ways they could increase their output, but let’s presume the ways boil down to the two bullets above.
With a KLOC metric like this, harmlessly implemented and purely well-intentioned, you’ve got a 50% chance of driving your project off a cliff and seriously degrading the integrity, manageability, and comprehensibility of your code, because you’ve built an incentive structure that discourages abstraction and rewards copy-paste linearity. If human nature takes over and engineers start padding their KLOC count, you’ll very quickly get a system that is impossible to maintain.
I don’t have a ready proposal for a good alternative to KLOCs, but certainly there must be something that encourages the behavior you want without inducing behavior you abhor.
What would you like your engineers to do? Write shite code? Certainly not. Write elegant code? Maybe. Let’s go with that.
What if you measured something that got at how “good” the code was, not how much of it there was? Pick, for instance, code review comments. You could put a process in place to have all code inspected (a good and proven idea, for a number of reasons) and then count the number and severity of the review comments. Obviously, review comments indicate a need for revision, so big numbers here are bad. Start counting these events, and tell your engineers that their count should be low, and that it should get lower through time, and suddenly you’ve created an incentive for behavior you believe is intrinsically good… This is harder to do, but it doesn’t give you the same risk of crippling your project because of the metric you chose.
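To make the review-comment idea concrete, here is a minimal sketch of how the counting might work. The severity labels and their weights are my own assumptions for illustration, as are the function names; any real inspection process would define its own scale.

```python
# Hypothetical severity weights: the only requirement is that
# worse findings count for more than cosmetic ones.
SEVERITY_WEIGHTS = {"minor": 1, "major": 3, "critical": 10}


def review_score(comment_severities):
    """Return (comment count, severity-weighted total) for one review period."""
    count = len(comment_severities)
    weighted = sum(SEVERITY_WEIGHTS[s] for s in comment_severities)
    return count, weighted


def is_improving(scores):
    """True if the score never rises from one period to the next."""
    return all(later <= earlier for earlier, later in zip(scores, scores[1:]))


# Severity of each review comment logged in three consecutive periods.
periods = [
    ["major", "minor", "minor", "critical"],
    ["major", "minor"],
    ["minor"],
]
weighted_by_period = [review_score(p)[1] for p in periods]
print(weighted_by_period)                # [15, 4, 1]
print(is_improving(weighted_by_period))  # True
```

Weighting by severity matters here: a raw count alone would invite reviewers to stop logging nits, which is its own flavor of measurement-induced bad behavior.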
The reason I agree with Heisenberg, instead of Deming, is that I see it as much more common, pandemic almost in our business, that metric programs create the wrong behavior.
Ask your team to get predictable, and they will do the same thing over and over again, without the variance necessary to adjust for “conditions on the ground”. Quality will suffer. (Ask me how I know?)
Ask your team for schedule adherence, and they’ll throw quality away, and you’ll ship shite code. (Again, ask me how I know?)
Measurement influences, so it’s critical to think about what behavior you want before you start measuring anything.
I’ve been working through a “balanced scorecard” approach for a big team of OPD contractors I manage. One interesting metric we’re working on is, broadly put, attrition.
This one is real interesting. The job market in India, where this team is located, is very hot. So take it as a given that there will be high attrition. We also hire low on the food chain – young engineers with only 2 to 4 years’ experience. This population job hops, no matter where they are on the planet. So, again, take it as a given that there will be “industry average” attrition.
So, what do you measure here? Do you count attrition, and hold the team accountable to keep attrition levels below the industry average? Sure. That’s where I started. But it doesn’t really get you the behavior you want.
Let’s talk about what a “good” behavior would be here…
Start a few steps back -- why is attrition viewed as bad?
Because you lose the investment you’ve made in ramping someone up.
Would attrition be bad if the ramp time was zero?
Only a little.
So, what’s bad isn’t attrition itself, but the lowered productivity associated with staff churn, right? Right.
If you accept that assertion, then what’s important is not attrition, but resiliency. The ability to absorb staff loss without a decrease in productivity.
If that is true, measuring attrition will result in a sub-optimal behavior. If your management team focuses on “keeping people”, you’ll probably lower your attrition, maybe even below the industry standard. But what of it? When you do lose someone, can the organization absorb the hit? My assertion is that 1 staff turnover with no resiliency planning is probably worse than 10 staff turnovers in a highly resilient organization. I’d also assert that “presaged” attrition is easier to manage than “surprise” attrition. Lastly, I’d offer that attrition of “freshers” is way less damaging than attrition of key senior staff.
So, what you want is for your team to keep key staff around, and to have a way to manage knowledge-capture and process definition. You have to pay attention to new staff induction, so that in the face of inevitable attrition, you still keep your team cranking out piles and steaming piles of the shite code you’ve made inevitable with your idiotic KLOC metric.
So, a good metric program here would be something like:
- Key Staff – Managed Attrition
- Key Staff – Surprise Attrition
- Fresher – Managed Attrition
- Fresher – Surprise Attrition
- Staff Recruiting Latency (how long to fill an open req)
- Induction Efficiency (how long to “perceived” contribution for a new staffer)
Also, it’s obvious, but attrition is when people quit, not when they’re fired. Firing people is good for teams, when it’s done fairly. Counting that as attrition means you create an incentive program that encourages managers to keep bad hires on the team forever.
Maybe you create a modification of the above instrument panel, and add something about quits or fires in the first 90 days, which could get at whether the management team is hiring the right people in the first place.
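Here is a rough sketch of what that instrument panel could look like as a calculation, including the 90-day modification. The record fields, the function name, and the choice to average the two latency figures are all assumptions on my part, not a description of the scorecard we actually run.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean


@dataclass
class Departure:
    key_staff: bool    # True for key senior staff, False for freshers
    presaged: bool     # True if the exit was anticipated and planned for
    resigned: bool     # True for a quit, False for a firing
    tenure_days: int   # days between hire date and exit date


def attrition_panel(departures, days_to_fill, days_to_contribution):
    """Build the panel from exit records and two lists of day counts."""
    panel = Counter()
    for d in departures:
        # Early exits of either kind hint at hiring quality.
        if d.tenure_days <= 90:
            panel["exits_in_first_90_days"] += 1
        # Firings are not attrition, so they stop here.
        if not d.resigned:
            continue
        who = "key_staff" if d.key_staff else "fresher"
        how = "managed" if d.presaged else "surprise"
        panel[f"{who}_{how}_attrition"] += 1
    result = dict(panel)
    # Averages are one possible summary; None means no data yet.
    result["recruiting_latency_days"] = mean(days_to_fill) if days_to_fill else None
    result["induction_efficiency_days"] = (
        mean(days_to_contribution) if days_to_contribution else None
    )
    return result


departures = [
    Departure(key_staff=True, presaged=True, resigned=True, tenure_days=900),
    Departure(key_staff=False, presaged=False, resigned=True, tenure_days=60),
    Departure(key_staff=False, presaged=False, resigned=False, tenure_days=45),
    Departure(key_staff=False, presaged=True, resigned=True, tenure_days=400),
]
panel = attrition_panel(departures, days_to_fill=[30, 50],
                        days_to_contribution=[60, 90, 120])
print(panel)
```

Note that the fired fresher shows up in the 90-day count but in none of the attrition buckets, which is the whole point: managers aren't penalized for correcting a bad hire.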
Anyway, my point, expressed succinctly, is that you can’t measure people without influencing their behavior (presuming they know you’re measuring them and that they give a shit). If you keep this in mind, you can steer them in a direction you presume to be subjectively “good”. If you forget this, you run a big risk that your measurement will induce behavior antithetical to your intent… Because, as Deming said, if you measure something, it will generally “improve”. (For very loose definitions of “improve”.)