I should start off by saying that I’m a big fan of Scrum “but”. Scrum imposes strict limits in a number of areas and absolutely requires that a number of things be done, and by not doing them you are setting yourself up for failure. However, there are problems that Scrum does not address and that are not talked about much: organizational challenges, internal company politics and culture in general. After all, Scrum is a process, so it can’t take those items into account. It says, “here is the prescribed way of doing work – do it this way and everything will work, don’t do it this way and you will not succeed.” But Scrum also makes a number of assumptions that do not always hold. In this post I will walk you through some of the problems we faced and how we overcame them.
In a strict world of Scrum you have a Product Backlog. When you begin a sprint you move the items you are committing to into the Sprint Backlog, and then you pull your work from there. It is almost like a Kanban system, but not quite: the Work In Progress (WIP) limits are set at a higher level, Scrum in general tracks work at a higher level than Kanban does, and there are several other differences.
The major problem we continuously run into is the complexity of the network and the hardware involved in the work that we do. Adding to the complexity, other teams (which we have no influence over) take longer to do work than we do, and they are governed by a strict set of Service Level Agreements (SLAs). You will often find this arrangement in large companies where some of the core work, such as server maintenance or data center maintenance, is subcontracted.
Scrum, combined with the team size, gave us the perfect ability to hide our problems, which was not a good thing. A team would run into a problem in, say, the development environment, set things in motion to solve it (open a ticket with the server team) and then move on to the QA environment. The same problem would occur in QA, and the team would move on to the Pre-Prod environment – all the while the original problem was still unresolved. But because the team kept pulling items from their queue to work on, they were “busy” all of the time and reporting good progress – but not actually completing anything.
I should note that some of the user stories were straight technical stories – software being installed for instance to get various environments up and running.
The problem here was twofold. The first part is the way the user stories were constructed: installing software and moving code from Dev to QA to Pre-Prod were all part of a single user story – they were just different parts of the requirement, and Scrum doesn’t track work down to the task level. The second part is that the “in progress” bucket can be huge – it is up to the practitioner to break the story down to a manageable level. So, lo and behold, we were building up a huge amount of technical debt that turned out to be fatal for this iteration. What happened versus what should have happened?
First, a team of 15+ people is too big for a 15-minute standup each morning. 15 people x approximately 3 minutes each = 45 minutes. That’s a bit outside of 15 minutes, and our team is actually larger than 15 people. So what did we do?
We eliminated the 15 minute standup meetings in favor of a once weekly team meeting.
There were other forces involved in this decision – at any large company there are going to be a lot of meetings and the team was suffering from “meeting fatigue”. As the Scrum Master I didn’t want to add to it and I couldn’t figure out how to keep the meetings to 15 minutes regardless. This, of course, led to the second problem.
In a larger team meeting, people are concerned that they will be looked on negatively if they keep reporting problems. So they don’t – or they under-report the magnitude of the problem.
Our iterations are one month long. It took me two months of looking back at the data to realize that problems were being under-reported. Once I figured it out, I committed mistake #3.
Private conversations regarding the problems people are experiencing do not allow the team to focus their energies on the problem at hand – it becomes an individual firefighting experience.
We have a lot of very talented people on our team. By having one-on-one meetings with individuals who were having problems I was getting right back into the problem that mistake #1 was designed to solve – the number of meetings skyrocketed as I had to deal with a whole bunch of problems one on one. In addition, I wasn’t able to bring the knowledge of the entire team to bear on a problem – I had to schedule more meetings to do investigation instead of just having everyone pitch in at once.
To be honest, some problems – specifically those that rely on external forces – can only be solved by escalation to management. And this led to Mistake #4.
I didn’t recognize the negative impact of the group’s culture: team members were not rewarded for bringing problems to light. They felt management didn’t care – that management just wanted the problem solved but didn’t want to know about it.
I am not management – I never will be. But because of my role, team members felt that reporting problems to me was the same as reporting them to management, which they were highly reluctant to do. This led to Mistake #5.
The team members had enough work in their queue that they could simply move on to something else. Because they did not update their status correctly, I had no visibility into this.
What they essentially did was try to handle the problem on their own and report that everything was okay and on track when in fact that was far from the truth. They kept hoping that things would come back on track, which they never did because no effective intervention ever took place. This effectively let us hide waste. There was one more problem that came out of all of this.
We didn’t recognize quickly enough that the handoffs between team members were difficult and not working well at all.
In Scrum, one person is generally responsible for a user story from beginning to end – there is a concept of ownership. We tried to do that – as the Scrum Master I had the ownership. Why not a Product Owner, you might ask? Good question – the team was reporting to five product owners at the same time, with approximately 8 different statements of work per month. We are a core solutions team, so we provide many different solutions for different customers on the same infrastructure. Individual team members did not take ownership because a single piece of work required many coordination steps across the specialized skillsets involved (we recognize the need to grow the skillsets but that can’t be done overnight). Essentially there was no coordination or collaboration in the handoffs of work, or even in the scheduling of when work would be done.
Those are the big six mistakes that I feel, in retrospect, I made as a Scrum Master with this particular team. So, what did we do about it and how did it work? I can’t tell you how it worked yet, because this is a work in progress and this is the first month we are doing this, but I’ll post back with a follow-up.
Countermeasures we put in place
Solution #1 – Implement a Scrum of Scrums
The immediate step was to appoint two additional Scrum Masters and “break” the team up into three autonomous teams (that is, each team is self-contained – the mix of resources is such that each team has someone with the specialized skills needed to work a particular technology or type of solution).
This was done so that the Scrum Masters could more effectively deal with problems, since each was overseeing only a small number of team members. And “oversee” is too strong a word – the Scrum Masters did perform some work as well. This freed up my time to do some more work, since I was now the Scrum Master for only 6 people (including myself). This immediately led to another quick solution…
Solution #2 – Reinstate the daily stand-ups
The smaller teams allow each Scrum Master to hold daily standups that actually last only 15 minutes.
The standups are four days a week plus one team meeting that lasts one hour. The Scrum Masters also have a once a week coordination standup meeting. We ask the same three questions but reserve the right to extend the meeting to coordinate resources to meet special demands.
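The arithmetic behind the team split is simple enough to sketch. The snippet below is only an illustration: the function names are made up for this post, and the 3 minutes per person is the rough figure quoted earlier, not a measured value.

```python
def standup_minutes(people: int, minutes_per_person: float = 3.0) -> float:
    """Estimated standup length if everyone speaks for minutes_per_person."""
    return people * minutes_per_person


def max_people(timebox_minutes: float = 15.0, minutes_per_person: float = 3.0) -> int:
    """Largest team that fits inside the timebox at the given speaking time."""
    return int(timebox_minutes // minutes_per_person)


def minutes_each(people: int, timebox_minutes: float = 15.0) -> float:
    """Speaking time each person gets if the timebox is split evenly."""
    return timebox_minutes / people


print(standup_minutes(15))  # 45.0 -> the original 15-person team
print(standup_minutes(6))   # 18.0 -> a 6-person team at 3 minutes each
print(max_people())         # 5    -> team size that fits 15 minutes at 3 minutes each
print(minutes_each(6))      # 2.5  -> what a 6-person team must average to fit
```

By this estimate a team of six still has to keep updates to about two and a half minutes each, which is one reason to reserve longer coordination discussions for after the three questions.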
Solution #3 – A “true” Kanban approach
Individuals are assigned only one requirement at a time – the other requirements are held in an iteration backlog and are assigned to the generic resource of “Product Manager”. We do not discuss these items with the team members at all during our iteration planning meeting.
This had a whole bunch of ramifications. The first is that we would not commit to a full iteration’s worth of work – we created a monthly “roadmap” which said, “these are our goals for the month which we will try to reach”. The upside to this approach is that our iteration planning meetings went from 7 hours to 2 hours. Nice. The downside is that our team is currently unable to make a commitment and stick to it. We had not been doing that for the last year, so this is really no change from the status quo, but it leads to another benefit:
When a problem is encountered it is surfaced and dealt with in a 24 hour period. What do I mean by this? Well, team members are only allowed to work on one item at a time (a WIP limit of 1, as it were). When something goes wrong and they cannot fix it themselves, we (the Scrum Masters) either find out about it in the next standup meeting, or the team member contacts the Scrum Master as soon as the problem is recognized. Either way it can be elevated to management ASAP. Work halts until the problem is solved. We then capture the resolution to these issues and turn around and drive them back into the process.
Note that this is a work in progress and we’re still working out how to best drive the problem resolution back into the process so we avoid the problem in the future.
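To make the WIP limit of 1 concrete, here is a minimal sketch of the rule as a small in-memory board. It is not how TFS models work items; the `Board` class and its method names are hypothetical and exist only to show the mechanics: a person holding an item cannot pull another, and a blocked item stays visible until it is resolved.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List, Optional, Tuple


@dataclass
class Board:
    """Iteration backlog plus at most one in-progress item per person (hypothetical model)."""
    backlog: Deque[str] = field(default_factory=deque)
    in_progress: Dict[str, str] = field(default_factory=dict)  # person -> item
    blocked: Dict[str, str] = field(default_factory=dict)      # person -> reason
    done: List[str] = field(default_factory=list)

    def pull(self, person: str) -> Optional[str]:
        """Hand the next backlog item to person, unless they already hold one."""
        if person in self.in_progress:
            raise RuntimeError(f"{person} already has work in progress (WIP limit is 1)")
        if not self.backlog:
            return None
        item = self.backlog.popleft()
        self.in_progress[person] = item
        return item

    def block(self, person: str, reason: str) -> None:
        """Record a blocker. The person keeps the item, so work halts for them."""
        if person not in self.in_progress:
            raise KeyError(f"{person} has nothing in progress")
        self.blocked[person] = reason

    def unblock(self, person: str) -> None:
        """Clear a blocker once the problem has been resolved."""
        self.blocked.pop(person, None)

    def finish(self, person: str) -> str:
        """Complete the person's current item; blocked items cannot be finished."""
        if person in self.blocked:
            raise RuntimeError(f"{person} is blocked: {self.blocked[person]}")
        item = self.in_progress.pop(person)
        self.done.append(item)
        return item

    def escalations(self) -> List[Tuple[str, str, str]]:
        """(person, item, reason) for everything a Scrum Master should raise."""
        return [(p, self.in_progress[p], r) for p, r in self.blocked.items()]
```

In use, a team member who hits a server problem calls `block(...)`, the item shows up in `escalations()` at the next standup, and any attempt to quietly `pull(...)` something else fails, which is the behavior the old queue allowed.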
In addition, this alleviates some of the management role that I was playing, because if team members do not escalate problems, they have to explain to their management why they spent a week working on one tiny item estimated at 8 hours. It also allows Scrum Masters, through empirical observation, to detect problems earlier.
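The “8 hour item that has been open for a week” check is the kind of empirical observation that could be automated. The sketch below assumes each work item carries an estimate and a start date; the `WorkItem` fields, the 8-hour working day and the tolerance factor of 2 are all assumptions for illustration, not something taken from our TFS setup.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class WorkItem:
    title: str
    owner: str
    estimate_hours: float
    started: date


def working_days_between(start: date, end: date) -> int:
    """Count weekdays from start (inclusive) to end (exclusive)."""
    days = 0
    current = start
    while current < end:
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            days += 1
        current += timedelta(days=1)
    return days


def overdue_items(items: List[WorkItem], today: date,
                  hours_per_day: float = 8.0,
                  tolerance: float = 2.0) -> List[WorkItem]:
    """Items whose elapsed working time exceeds tolerance x their estimate."""
    flagged = []
    for item in items:
        elapsed_hours = working_days_between(item.started, today) * hours_per_day
        if elapsed_hours > item.estimate_hours * tolerance:
            flagged.append(item)
    return flagged
```

With these assumptions, an 8-hour item started on a Monday is flagged by the following Monday (40 elapsed working hours against a 16-hour threshold), while a 40-hour item started the same day is not.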
Did I mention that our entire team is virtual and spread out all over the US?
Solution #4 – Reward problems discovered
The last item is to celebrate when a problem is found – the earlier the better.
We are using TFS for our work and created a special area for internally created issues. We encourage team members to report issues with the process as they find them. We review these issues each iteration and take one to improve upon as part of our work structure. It remains to be seen how effective this will be but it’s a start.
I debated whether or not to share this experience and in the end decided that there must be other teams running into these problems. I’ve been doing this for 5 years now (stricter focus on process and methodology that is) and I knew all about these problems and yet I let the politics of the organization suck me right back into things that I have coached other teams for years on how to avoid. Talk about feeling dumb.
And I had tried other things along the way but kept running into the same problems because I could not trust the team to do things in the way they needed to be done. While I still have that problem, I was also part of the problem because I was not explicit enough in explaining to the team why we needed things done.
Hopefully, this post will help other people who might be in the same situation.