It’s time to Fail Fast

Following up on our first podcast episode:

I wanted to elaborate on the subject of application testing a little further, and in particular to explain the process of “Fail Fast” testing in more depth.

So, to summarize the podcast: I argued that Windows 10, and the migration projects that will take you there, could be handled much more efficiently in terms of time and money. Previously (in migrations from Windows XP to Windows 7, for example) we have always had the mindset that none of the applications will work after the migration. With this mindset, we have to test every single application, which takes a lot of time. Therefore, I always try to push the concept of “Fail Fast” in the projects I’m involved in.

Fail Fast has the opposite mindset to the traditional way of application testing. Instead of looking at all applications and assuming that nothing will work, you should assume that everything will work, and therefore that the applications don’t require testing. This is, in my opinion, also the way to handle applications in a world with Windows Servicing. To give you more insight, I’ve inserted a Visio drawing of the process (in this case a roll-out of something; it could be Windows, Office or something else), but it’s easy enough to adapt it to any case you may have.


User Group (n)

A User Group should consist of a number of users that share the same tools and responsibilities in the organization. This is something that needs to be inventoried and created prior to the start of the roll out. Within each user group, a smaller group should be selected. This smaller group should cover as many of the tools and responsibilities of the whole user group as possible. If possible, its members should have some degree of technical knowledge, but it’s not mandatory.

As an example, organization A consists of 1000 users. These 1000 users are divided into five user groups:
– User Group 1: Accounting (200 users)
– User Group 2: Management (100 users)
– User Group 3: Sales (400 users)
– User Group 4: Production (250 users)
– User Group 5: Design (50 users)

Each group is then evaluated to find the smallest number of users that covers the largest number of tools and responsibilities. Usually this is around 10 %, but it could be either more or less.
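One way to approach this evaluation is a greedy selection: keep picking the user who covers the most tools that nobody selected so far covers. The sketch below is only an illustration under stated assumptions. The function name `pick_pilot_users`, the user names and the tool inventory are all hypothetical, and a greedy pick gives a small group, not necessarily the smallest possible one.

```python
from typing import Dict, List, Set


def pick_pilot_users(tools_by_user: Dict[str, Set[str]]) -> List[str]:
    """Greedily pick users until every tool in the user group is covered."""
    uncovered: Set[str] = set().union(*tools_by_user.values())
    pilot: List[str] = []
    while uncovered:
        # Pick the user covering the most still-uncovered tools.
        # Iterating in sorted order breaks ties by name (deterministic).
        best = max(
            sorted(tools_by_user),
            key=lambda user: len(tools_by_user[user] & uncovered),
        )
        pilot.append(best)
        uncovered -= tools_by_user[best]
    return pilot


# Hypothetical inventory for a few users in the Accounting user group.
accounting = {
    "anna": {"ERP", "Excel", "Invoice scanner"},
    "bert": {"ERP", "Excel"},
    "cecilia": {"Excel", "Payroll"},
    "david": {"ERP"},
}
pilot_users = pick_pilot_users(accounting)
print(pilot_users)  # ['anna', 'cecilia']
```

In practice the inventory would also include responsibilities, and the result should be reviewed by someone who knows the user group, since tool coverage alone says nothing about technical knowledge or availability.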

The roll out to each user group always starts with this smaller group and when they have evaluated the solution the roll out continues to the rest of the users in the user group. Note: This does not mean that every tool and responsibility has been evaluated and tested. Even after the smaller group has given their OK it’s still vital to have the Task Force ready in the event of technical or process-related problems.
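To get a feel for the numbers, the sketch below splits each user group in the example organization into a first wave (the smaller group) and a second wave (everyone else). It assumes a flat 10 % share for every group, which is a simplification of the "around 10 %" rule of thumb; the names `plan_waves` and `Waves` are made up for this illustration.

```python
import math
from typing import Dict, NamedTuple


class Waves(NamedTuple):
    pilot: int      # users in the smaller group that goes first
    remainder: int  # users who follow once the smaller group has given its OK


def plan_waves(group_sizes: Dict[str, int], pilot_percent: int = 10) -> Dict[str, Waves]:
    """Split every user group into a pilot wave and a remainder wave."""
    plan = {}
    for name, size in group_sizes.items():
        # Round up and keep at least one pilot user in a non-empty group.
        pilot = min(size, max(1, math.ceil(size * pilot_percent / 100)))
        plan[name] = Waves(pilot, size - pilot)
    return plan


organization_a = {
    "Accounting": 200,
    "Management": 100,
    "Sales": 400,
    "Production": 250,
    "Design": 50,
}
plan = plan_waves(organization_a)
for name, waves in plan.items():
    print(f"{name}: {waves.pilot} first, then {waves.remainder}")
print(sum(w.pilot for w in plan.values()))  # 100 of the 1000 users go first
```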

Task Force

The Task Force is a small group of, preferably, dedicated resources that should try to solve any problems that arise during the roll out to the user groups. This group (or these groups) should cover the following skill sets:
– Problem management (my employer uses the Kepner-Tregoe methodology, but any problem management/troubleshooting methodology could be used).
– Technical knowledge (of the implemented solution)
– User group knowledge (of the user group in particular and of the organization in general)

Each skill set could be covered by one or several persons, and one person could combine more than one skill set. It is recommended to have at least two persons, but preferably more, in each Task Force.

This group should ensure that each problem that arises is taken care of, solved, communicated and documented. It’s important that this group is dedicated to the roll out – or at least (not recommended) has this as their top priority. The group should also have the mandate to involve other competences at short notice if required. The PM of the roll out should not be a part of the Task Force, but should handle the official communication to the end-users and is responsible for the acknowledgement of the solution. It’s the PM that activates and de-activates the Task Force.

The Task Force should, as far as possible, only work with one problem at a time, but usually more than one problem is assigned to each Task Force. Two Task Forces should not share resources.
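The staffing rules in this section (all three skill sets covered, at least two persons per Task Force, no resources shared between Task Forces) are simple enough to check mechanically before the roll out starts. The following is a minimal sketch of such a check; the `TaskForce` class, the skill labels and the member names are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Set

# Short labels for the three skill sets listed above (labels are illustrative).
REQUIRED_SKILLS = {"problem management", "technical", "user group"}


@dataclass
class TaskForce:
    name: str
    members: Dict[str, Set[str]]  # member name -> skill sets that member brings


def check_task_forces(task_forces: List[TaskForce]) -> List[str]:
    """Return a list of rule violations (empty if everything checks out)."""
    issues: List[str] = []
    seen: Dict[str, str] = {}  # member -> first Task Force they appear in
    for tf in task_forces:
        if len(tf.members) < 2:
            issues.append(f"{tf.name}: fewer than two members")
        covered = set().union(*tf.members.values())
        missing = REQUIRED_SKILLS - covered
        if missing:
            issues.append(f"{tf.name}: missing skills {sorted(missing)}")
        for member in tf.members:
            if member in seen:
                issues.append(f"{member}: shared between {seen[member]} and {tf.name}")
            else:
                seen[member] = tf.name
    return issues


alpha = TaskForce("Alpha", {
    "Eva": {"problem management", "user group"},
    "Omar": {"technical"},
})
beta = TaskForce("Beta", {"Omar": {"technical", "problem management"}})

issues = check_task_forces([alpha, beta])
for issue in issues:
    print(issue)
```

Here Alpha passes, while Beta is flagged three times: it has only one member, lacks user group knowledge, and borrows Omar from Alpha.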

Information and documentation

As stated in the process map, there are several parts of the process where information should be either transferred from the end-users to the project or from the project to the end-users. Each part will be explained later on in this document.

Prior to the roll out it’s important to establish where and how end-users should communicate with the project and vice versa. Usually the end-user-to-project communication is handled using a case-management solution, but platforms like Microsoft Yammer could also be used.

For the project-to-end-user communication, some kind of FAQ should be established. It should be easily accessible and searchable. It is a benefit if end-users can comment on the information in the FAQ. Initially, the communication could be handled using e-mail or another kind of direct messaging to the affected end-users.

It’s vital for a smooth process that all solutions created by Task Forces, the project or end-users are documented, to share the knowledge gathered from each user group.
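As a rough illustration of what "searchable" and "documented" could mean in practice, here is a small in-memory FAQ sketch. It is not tied to any particular product. The `Faq` and `FaqEntry` classes and the sample entry are invented for the example, and a real project would more likely use its case-management tool or intranet for this.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FaqEntry:
    problem: str
    solution: str
    user_group: str
    comments: List[str] = field(default_factory=list)  # end-user feedback


class Faq:
    def __init__(self) -> None:
        self.entries: List[FaqEntry] = []

    def add(self, entry: FaqEntry) -> None:
        self.entries.append(entry)

    def search(self, query: str) -> List[FaqEntry]:
        """Return entries whose problem or solution contains every query word."""
        words = query.lower().split()
        return [
            entry
            for entry in self.entries
            if all(word in f"{entry.problem} {entry.solution}".lower() for word in words)
        ]


faq = Faq()
faq.add(FaqEntry(
    problem="Invoice scanner not detected after upgrade",
    solution="Reinstall the scanner driver from the software portal",
    user_group="Accounting",
))
known = faq.search("scanner upgrade")   # matches the entry above
unknown = faq.search("payroll export")  # no match, so this would be a new problem
```

A lookup like this is also what makes it possible to tell a new problem from one that has already been solved for an earlier user group.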

Process Map Explanation

Number | Text | Explanation
1 | Roll-Out | The start of the roll-out project.
2 | Information to user group N | Inform the first user group that their roll-out will start in X-time, depending on the size, complexity and possible communication to the group. It’s advisable to receive an acknowledgement from a manager of the user group, if one exists.
3 | Distribution to user group N | Distribute the new solution to the smaller part of the user group.
4 | (Decision) All OK? | Problems are reported during a decided time frame (which differs depending on the user group’s usage of the solution and the number of tools and users).
5.1 | No | If problems are reported.
6.1 | Report to PM | The end-users use the communicated channel to contact the project. It’s vital that the end-users understand that it’s their responsibility to report problems. Also inform other communication channels (service desk etc.) that end-users may contact them and how they should handle that kind of communication.
7.1 | Inform user group N and pause roll-out | When a problem is reported and verified (i.e. it does not already exist in the FAQ), the project should inform the other users in the current roll-out group and pause the roll-out. The roll-out could continue to users not affected by this particular problem, but this may give the Task Force (or Task Forces) more to handle.
8.1 | Activate Task Force | The PM activates a Task Force and ensures that all members of it are dedicated to solving the problem.
9.1 | (Sub Process) Task Force | See the Task Force section above.
10.1 | (Decision) Solution | The Task Force evaluates the problem and solves it, involving other parts of the project or external competences as needed. The problem (and solution) is thereafter categorized into one of three categories, depending on the solution.
11.1.1 | User Error | The user has reported a problem that is due to insufficient user training. The solution should be incorporated in the training as soon as possible, if necessary for the entire user group and/or users in other user groups.
11.1.2 | Work Around | The user has reported a problem that is impossible, or requires significant work, to solve in a technical manner. The solution could require additional steps and/or a change in the process for the end-user.
11.1.3 | Technical Solution | The user has reported a problem that is solved in a technical manner that the user may or may not notice. The solution does not require any change in the work process, but may require changes in the end-user training.
12.1 | Report to PM | When the solution has been tested and implemented, the Task Force should report to the PM.
13.1 | Contact end-user and verify functionality | The PM contacts the affected end-user(s) and verifies that the functionality is as expected. If not, the PM returns the problem to the Task Force. If everything works as expected, the PM de-activates the Task Force for the particular issue.
14.1 | Document solution and inform user group N | The PM documents the solution in the appropriate ways (FAQ, other documentation, etc.) and informs the user group that the issue has been solved.
4 | (Decision) All OK? |
5.2 | Yes | If no problems are reported.
6.2 | Confirm to PM | The PM should confirm functionality with the manager of the user group before moving to step 7.2.
7.2 | Proceed roll-out to user group N | The roll out proceeds to the rest of the user group. During the entire roll out the process can be activated again if new problems arise; see 8.2.
8.2 | (Decision) New problem |
9.2.1 | No |
9.2.2 | Yes |
10.2 | Finalize roll-out to user group N and restart process with group N+1 | Finalize the group and restart the process with the next user group. It’s possible to have more than one process running simultaneously, but it requires more resources.
2 | Information to User group N+1 |
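For readers who think better in code than in process maps, the walk-through below traces the steps for a single user group. It is a simplified sketch rather than a faithful model of the map. It assumes that all problems are known up front, that every solution passes the PM’s verification in step 13.1 on the first try, and that no new problems appear in step 8.2. The names `Category`, `Problem` and `roll_out_group` are made up for this illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Category(Enum):
    USER_ERROR = "User Error"                  # 11.1.1
    WORK_AROUND = "Work Around"                # 11.1.2
    TECHNICAL_SOLUTION = "Technical Solution"  # 11.1.3


@dataclass
class Problem:
    description: str
    category: Category  # decided by the Task Force in step 10.1


def roll_out_group(group: str, problems: List[Problem]) -> List[str]:
    """Return the ordered steps taken for one user group."""
    log = [f"2 inform {group}", f"3 distribute to pilot users in {group}"]
    for problem in problems:  # step 4 "All OK?" answered with 5.1 "No"
        log.append(f"6.1 report to PM: {problem.description}")
        log.append("7.1 inform group and pause roll-out")
        log.append("8.1 activate Task Force")
        log.append(f"10.1 solution categorized as {problem.category.value}")
        log.append("12.1 Task Force reports to PM")
        log.append("13.1 PM verifies with end-user, de-activates Task Force")
        log.append("14.1 document solution and inform group")
    # Step 4 "All OK?" answered with 5.2 "Yes"
    log.append("6.2 PM confirms with group manager")
    log.append(f"7.2 proceed roll-out to rest of {group}")
    log.append(f"10.2 finalize {group}")
    return log


steps = roll_out_group(
    "Accounting",
    [Problem("Macro fails in Excel", Category.WORK_AROUND)],
)
for step in steps:
    print(step)
```

Calling `roll_out_group` once per user group, in order, corresponds to the restart with group N+1 in step 10.2.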


Q: Does this process still require inventory of tools and solutions?
A: Yes, the inventory of tools and solutions is still required and essential for a successful roll out.
Q: Does this process eliminate the need for additional testing?
A: No, it’s still advisable to do additional testing of tools and solutions that are common to the entire organization or that provide basic functionality for the day-to-day work. These tools and/or applications should be tested in a traditional manner to avoid unnecessary disturbance in the basic functionality.
Q: Does this process require fewer resources, less time and less money than traditional testing?
A: This should usually reduce the amount of time, resources and money spent on testing. It does, however, put higher demands on the implementation project. Another valuable benefit is that it’s possible to proceed with the roll out of the new tool/solution earlier than with the traditional testing process.
Q: Won’t this process affect the end-users in a negative way and potentially prevent the users from doing their day-to-day work?
A: This could affect a small group of end-users in a negative way, but usually the end-users will get assistance and a solution quicker thanks to the Task Force. It’s important to inform the users of their responsibilities, but also of the assistance available. In most cases, it will affect fewer users in this way than traditional testing. It will also give the users new solutions and added productivity earlier, and save the organization money.

To summarize

I want to make clear that this approach isn’t always advisable and that it may create challenges. But I’m convinced that this is the approach for the future and that most organizations need to adopt this way of thinking in the long run. There’s a bit of DevOps to it, and I strongly believe that we in the IT-pro business could learn a lot from software developers in this matter.

What do you think? Do you have any thoughts on this or have you tried it in your organization? Please let me know!

