Friday, April 30, 2021

Measuring Productivity
by Ron Lichty

 


Never, ever, ever, let story points or velocity be used as a performance metric.

In fact, don’t share points and velocity outside the team.

They’re not only not useful outside the team, they can be counterproductive.

Because points (and velocity) are team-unique, they are useless for comparing teams. One team’s points and velocity have no validity to any other team. None. Zero. Points are a measure of pace, not a measure of performance. They’re team-useful, not of use to anyone outside the team. Not for any purpose I can think of.

Nor are they useful for externally measuring a team’s productivity, as team velocity will vary naturally based on factors outside the team’s control.

Story points derived from rapid, effective relative sizing, combined with velocity, can be very useful to teams themselves, and to delivering predictability to teams’ stakeholders. Points and velocity enable teams to be predictable: they offer the ability to walk down the team’s project backlog and draw a watermark - a predictor of where the team will likely be - three to four months from now.

Then we can do some agile product planning: if we draw a line there, a watermark, we can ask ourselves and our stakeholders, do we have the right stuff above the line?

Predictability is not a principle in agile development. Just a result. We get better predictability from agile development - from relative sizing plus velocity - than anything else I've ever seen used in software. Of course, if we’re truly agile, we’re likely to insert stories and change story order before we get to the watermark. Each of those is a conversation we can have about priorities and the effect on the watermark of swapping stories in and out.

Story points are also really useful to product managers and product owners (and managers!) to understand the ease or difficulty of features and stories. They give product owners particular insight into backlog ordering, enabling them to enable the team, at every sprint planning, to always be delivering the most customer delight the team can provide.

But back to my warning: Points and velocity are not a performance measure! Attempting to use them to measure performance is not only useless, but terribly, terribly counterproductive. Knowing points and velocity are being watched will cause smart people to game them. (Note: all software teams are full of smart people!) Gamed points are useless not only as a metric but worse, gaming points makes them useless for helping teams be predictable!

I quickly copied down a client's rule of thumb a few years ago: What gets measured gets manipulated. It’s the most succinct summary of the biggest problem with metrics - and certainly of using points and velocity as a metric - that I've ever heard or read. (I'm a collector of useful Rules of Thumb. There are 300 of them in our book, Managing the Unmanageable, and we’ve continued to collect them online.)

Based on my client’s rule of thumb, let me re-state the previous observation: Use points as a metric, and the number of points delivered in every sprint will go up. Productivity won't, but points will. Any team of smart people who are aware that management thinks points matter will game them. Measure points, and the inevitable gaming will make points useless as a metric. With the added injury - a major one! - that gamed points are then made useless for the team to leverage internally to be predictable. Gamed, they’ve become meaningless.

As Sally Elatta, cofounder of AgilityHealth, says, "If you ever use metrics to punish or reward, you’ll never see the truth again."

Ethan Bernstein’s Harvard Business School study, The Transparency Paradox field experiment, showed dramatic hits to quality and performance just from workers being aware of being watched. Workers who were encouraged to experiment to improve their process but whose process was constantly monitored for productivity not only saw teams game the system but showed significant performance and quality degradation over giving the teams autonomy to just show results. Todd Lankford kindly translated Bernstein’s factory-floor study to software development in his post, How Transparency Can Kill Productivity in Agile.

Add to all that the cost of morale due to flawed management conclusions based on points measures. "Your team isn't delivering as many points as that other team." Or "Work harder so your velocity goes up!"

Using points and velocity to measure productivity is as counter-productive as measuring lines of code or butt-hours-in-seats.

Productive teams are happy teams. Measuring team happiness - and team health - is a much better metric to gauge productivity than points and velocity.

And ultimately, what we need to care about is customer happiness: not inputs and outputs but outcomes. Are we delighting customers? Are we delivering the most value with the highest quality? Are we delivering the right things, and delivering them right?

As Sanja Bajovic pointed out when I first fostered this message as an online discussion, “One of the issues may be that measuring story points is so easy. All tools support it. Measuring customers’ happiness is more complex.”

Sanja’s point is one of the core ones that Jerry Muller cites in his book, The Tyranny of Metrics. To paraphrase Muller (only slightly), the appeal of metrics is based in good part on the notion that development teams will be unresponsive if they are opaque, and more effective if they are subject to external monitoring. That’s not a useful notion. Muller quotes Andrew Natsios in defining an increasingly problematic, increasingly common human condition that Natsios labeled “Obsessive Measurement Disorder: an intellectual dysfunction rooted in the notion that counting everything… will produce better policy choices and improved management.” Muller devoted his book to debunking that belief.

In the same online discussion about points and velocity, Jeremy Pulcifer added color to my own arguments when he observed that “points are useful in helping order the backlog, the value-proposition. Leaking that metric is a very bad practice.”

Points and velocity, unwatched outside the team, give the team and its product owner the ability to, when management asks what you're planning, walk them over to your card wall and say, Things will likely change - that's the point - but with our velocity today as a measure, we're likely to be here 3 months from now. Do you agree, knowing what we know today, that this is the right order and that we have the right stuff above the line?... That's a useful conversation.

I should, perhaps, note that I find predictability does require stable teams and truly relative sizing to be able to leverage velocity to set predictable watermarks. Given stable teams and truly relative sizing as pre-requisites, I repeatedly see teams deliver to the watermark, plus or minus 20% (with the caveat, of course, that if/when the backlog changes, they’ve adjusted the watermark to match). In software development, that’s a remarkable level of predictability.

Product owners have responsibility to keep stakeholders clued in to what to expect. Walking them up to the card wall and walking them through the ordered backlog of upcoming and future stories can be useful. Sharing velocity charts with stakeholders, on the other hand, is pretty unuseful: velocity is not meaningful outside the team; what stakeholders really want to know is what value they can expect and when.

So what metrics are worth focusing on?

I do find some usefulness to measuring, end of each sprint, the number of stories a team finishes vs. the number they committed to, or the number of story points a team finishes vs. the number of story points they committed to. With the caveat that what’s being measured is not productivity but the team’s ability to plan. Teams good at planning regularly finish somewhere between 85 percent and 110 percent of their points - regularly complete their plan 80-plus percent of the time.

Everybody, just everybody, knows whether teams are, end-of-sprint, delivering what they said they would at the sprint’s beginning. When teams regularly deliver the stories they promised, when they honestly say at the beginning, this is the high-value customer stuff we believe we can deliver, and then 80+ percent of the time demo that stuff visibly end-of-sprint, everyone relaxes and lets them keep delivering value without (or with less) interference.

Teams find it terribly counterproductive when outside voices pressure teams with messages like, "c'mon, you can do more.”

As an engineering manager, I watch commitment-vs-completion primarily both to make sure team members are not under some false idea that they should pressure themselves to increase velocity - and to make sure someone else isn't doing that sort of pressuring to them. In the absence of either of those two, it's to coach them to be more effective at planning.

Effective sprint planning is core to building trust with stakeholders. Only if the team demonstrates predictability in its sprint planning and delivery can the team be convincing to stakeholders with regard to months-out watermarks drawn in the backlog.

Again, outside forces can undercut teams. We’ve probably all experienced otherwise well-meaning managers and project managers who push their teams to plan for more points than they’ve been delivering. When you see this happening, you may want to suggest what I do: Adding paper to a printer doesn’t make it print faster.

Velocity is a measure of pace. If you think your team is capable of a higher pace, then invite the team to retrospect on what might make them more effective and happier; remove the impediment that’s standing in their way; bring in a trainer or coach to tune weak practices to be more effective; or facilitate your team’s engagement as a team (Google’s Aristotle Study calls out “psychological safety” as the differentiator, and how to watch for it; Em Campbell-Pretty calls out culture-first agile in her book Tribal Unity).

Here are other metrics I consider:

1) Outcomes. I want to see visible progress - product increments - being demo'd end-of-every-sprint - progress delivering some increment(s) of the product functionality that customers value most.

2) Happiness. I seek to find measures for both customer happiness and team happiness.

3) Tripartite metrics. I’m attracted to measures advocated by one of scrum's creators, Jeff Sutherland, who suggests measuring cycle time, escaped defects, and team happiness. Important: In my opinion, neither of the first two is useful without the other two.

4) Team engagement. Progress in team ownership and team engagement (and progress in identifying effective practices and adopting and learning them) is critical. Fundamentally, software development is a team sport; we said this in our book Managing the Unmanageable eight years ago, but it continues to hit home for me that software development is a team sport. While measures of team happiness may be representative of ownership and engagement, in my opinion evaluating progress toward team ownership and team engagement relies mostly on judgment from experienced leaders: managers, scrum masters and coaches. It's the bane of our analytical engineering brains that we must rely on experienced judgment over metrics we can analytically measure in evaluating software development. But I’ve seen nothing better.

5) Psychological safety. Google’s study told us we can observe it: when everyone at the table feels like they have the opportunity to speak up, we see “equality in distribution of conversational turn-taking”: no one dominates, no one is silent.

6) Finally, as I said above, I have suggested to any number of product people and teams (with caveats, mind you) that they consider measuring number of stories delivered.

As I (and so many others) have noted, when human beings know they are being measured for performance, they’ll game whatever metrics you’re measuring (even if it’s subconsciously - we innately know what's good for us!). So before we use a metric, we need to really think deeply about how it might be manipulated.

Regarding number of stories delivered… that means we must ask how people might game measuring stories-delivered. One obvious way would be to split a story into multiple, smaller stories: same work, more stories. But good news! Smaller stories are better! While splitting stories can be hard, there's pretty universal agreement that smaller stories (or if we're not doing agile, more granular requirements or smaller tasks) are better for a variety of reasons, from clarity to develop-ability to debug-ability to faster validation that we're on the right track.

Gaming story throughput by making stories smaller not only benefits a product team’s members but also benefits the software development itself. It's one of the very few metrics for which human-manipulation has such a positive side effect. (I've heard stories told of teams that proclaimed they could not split stories further, only to very creatively find useful new ways to split stories after management started measuring numbers of stories.)

But I would add a caution: this positive side effect is not the only side effect. Another way to put more stories into production is to spend less time on their quality and on testing them.

Regardless of your metrics, be very careful. Side effects will likely be rampant.

There has been some really good writing on metrics - articles and posts that explore both possibilities and concerns. Here are a few I review from time to time:
• from Pivotal Labs: don't measure velocity but volatility, cycle time, rejection rate
• genius overview of metrics by Ron Jeffries
• genius overview of metrics by Sean McHugh
• wonderful study of team productivity through the lens of DevOps metrics: Accelerate, by Nicole Forsgren, Jez Humble and Gene Kim - a must-read, in my opinion

2 Comments:

At 7:03 AM, May 06, 2021, Anonymous Anonymous said...

Wonderful blog on productivity measurement. it is very helpful to continuously monitor it because it helps to make a decision to eliminate less productive work and continue to work on productive steps. You'll need some metrics to keep in the record to measure it, https://www.connectall.com/blog/the-ultimate-guide-to-lean-metrics-in-value-stream-management-2020-version/ this guide will help you in your process.

 
At 9:57 AM, December 11, 2023, Blogger Flowace said...

This article emphasizes that story points and velocity are valuable tools for team predictability and backlog planning but should not be used as external performance metrics with Productivity Monitoring Software

 

Post a Comment

<< Home