So what? Even though engineers realize that, as long as you're talking abstract ...

voxl · on Dec 14, 2017

You're missing my point.

My point is that no story points need to exist. If you want to average the amount of work done over a period of time then just count bugs and stories completed. The Central Limit theorem applies equally well to stories over the long term, so just collect data. Automatic.

We add story points, presumably because we don't want to wait to gather enough data on stories, but the neutral position is to not use them because they incur a cost (meetings, training to get consistency right, etc). So why have them? I have not seen a convincing argument.

lostcolony · on Dec 14, 2017

Sure, we could treat all stories as being of a single size and rely on the central limit theorem, but that takes far, far longer to converge. On the order of years, I suspect, not weeks or months, which is what you get out of pointing. "We should have this done sometime between now and 2022" is not a useful metric for a PO.

We add story points because given consistent incentives, estimates tend to be consistent. Maybe consistently under or consistently over, and obviously there's a bit of wiggle room from estimate to estimate, but they tend to converge quite quickly, and to within a week or two's uncertainty of what we'll have done at any given point within the next six months, and within 6 weeks within the next year. Quite a far cry from not using them.

Meetings? Some, sure; you'd have them anyway just defining what it is you're doing, the additional burden to assign points takes up maybe half an hour per person per week or two. Training to get consistency right? There's no training involved. The hardest thing is to get people accustomed to picking a number, relative to the others. But that's not that hard either; we basically just took the first sprint's stories, organized them into a line (much like his rectangles) going from least to most complex, then discussed where to draw three vertical lines, separating them into four distinct sizes, 3s, 5s, 8s, and 13s. We then made sure the largest didn't feel too large compared with the other 13s (or else it might actually be a 21, just compared to what we had agreed a 13 was, and so we had to split it up), and from there we always had 'reference stories' to decide whether it felt more like a 5 or an 8, say. And while we sometimes differed, we could always hash out why we differed and come to an agreement on exactly how large the story was. Again, per the OP, so long as you are consistent with how you address those discrepancies, your overall estimates will be consistent, and give you predictability.

One half hour per week or two is hardly a huge cost to pay when it gives the business the ability to accurately predict when we'll have something delivered.

I don't care if the argument is convincing. I've seen it work. You're free to do whatever you want; I know what I've seen work, and what I've seen not work. I've yet to find something that works so well.

voxl · on Dec 14, 2017

Your claims are outrageous with no supporting evidence.

If you just want to throw out anecdotes I'll give you my own. I collected data on all the pointed stories at one previous job for three years and found a negative correlation between story point and time from a story being started to being completed.

Sure, you might claim that it all averages out in the end, except that our velocities were wildly in flux for that entire span as well. But we were just "doing it wrong" right?

lostcolony · on Dec 15, 2017

Outrageous? Seriously? Hyperbole much?

I doubt that you did, if I understand you. Because it sounds like you're saying that overall 3 pointed stories took the most amount of time, and 13 (or whatever your max) took the least.

But let's say it did. I'd look to see why your estimates fluctuated all over the place. Or whether stories were being closed when they were actually finished (i.e., being accurately reported). Did you have deadlines? Did you have delivery pressures? Because that right there is a good reason; as you near a deadline you start padding estimates more. Did you keep having things come up that broke the sprint? Etc. All manner of things can cause estimates to be wrong. But not negatively correlated, -especially- with velocities constantly in flux (and I mean seriously in flux; you take an average because it can and will vary, especially if there's unexpected stuff, like someone getting sick, that you didn't account for when planning); that to me, yes, definitely sounds like you were doing something very, very wrong.

apenwarr · on Dec 14, 2017

If you look at the first Central Limit Theorem slide in the original article, it compares story precision vs precision of unestimated bugs. The short answer is that if your stories are big, not estimating them causes more error (weeks or months) than most people are willing to accept. Not so with small tasks (bugs) which average out, as you suggest.

However, it’s hard to make long-range estimates using only many tiny stories/bugs, because you don’t want to break the job down with such granularity for months or years into the future - plans will change by then, and all that design work will have been wasted. That’s what makes big stories useful; you can estimate months of work in a few minutes. But because they’re so big, you can’t treat them all as equal sized.

voxl · on Dec 14, 2017

Except you can't, as I've said, engineers are not good at estimating. They get significantly worse when you start estimating months out instead of days out.

The problem is not that they're engineers, no one is good at estimating work they've never done before. This is a well known problem in pretty much every single software shop I've ever been in. Teams never deliver what was planned on time, only functional teams cut features for releases. This is not a "win" for estimation.

naasking · on Dec 15, 2017

> Except you can't, as I've said, engineers are not good at estimating. They get significantly worse when you start estimating months out instead of days out.

Because you keep focusing on time estimates instead of point estimates. People have intrinsic biases related to time and their productivity. Like how most people implicitly assume they're above average in intelligence, looks, etc.