Saturday, December 11, 2010

Using Standardized Work Types To Enable a Measurable, Continually Improving System of Work



Readers of this blog can attest that I am a big fan of many of the innovations and new thinking that have come out of the Lean for Systems and Software community. One of the more interesting innovations I have been injecting into a variety of my client engagements is the notion of tagging all work according to a finite set of possible work types. Once this categorization has been done, work can be tracked, measured, and managed according to the unique features of each type. This approach provides a healthy balance between treating all software delivery work as completely unique, requiring entirely different processes and skills, and treating all software delivery work as the same. My clients frequently get tripped up around when to standardize and when not to, how much common process is a good thing, and how much gets in the way. I have found that one of the most effective ways of dealing with this issue is to create one or more work types that match the context of a particular software delivery environment; variations in approach can then be used to support the unique context of these different categories of work carried out.
Defining work according to a finite set of standardized work types allows management to measure, manage, and improve highly variable work
Capturing meaningful metrics is challenging; work across contexts isn't always comparable
The first big benefit of tagging work according to a finite set of work categories is that it supports the scalability of a measurable system of work. Organizations frequently fall into the trap of trying to capture and compare metrics across a variety of "work" contexts. These work contexts include:
  • unique solution platform/solution types (business intelligence, SOA, Web applications, ERP, etc.)
  • Risk/Cost of Delay (long-term investment, emergencies, incremental value, etc.)
  • size/complexity
  • team capability/geography/culture
  • value (new business feature versus change request/defect)
  • etc.
The problem is that work across contexts is not really comparable; comparing the estimation accuracy, throughput, or cycle time of business intelligence work to Web application work doesn't give me any meaningful data. The challenge is that software delivery is inherently highly variable; a multitude of factors make work different and obscure meaningful analysis. The obvious approach is to measure each work item according to each unique combination of "work categorization" factors. Unfortunately, while this approach will result in accurate measurements, it doesn't scale. It ends up creating a complex multidimensional dashboard that is very hard to read and very hard to maintain.
[Image: metrics_work_variability]
Create standard work types based on the most common combinations of work attributes
Another approach, one that I am recommending and implementing with a number of my clients, is to spend some time analyzing both current and future demand with an eye toward categorizing this demand into a reasonable and manageable list of work types. Not every type of work needs to make it into a work type; my recommendation is to start with the most prevalent work, categorize it into one or more of these types, and then start "tagging" any new work according to one of these types. Work performance measurements can then be associated with each of these types.
Each work type is an expression of a particular combination of size, risk, solution/platform, client, and any other attribute that would cause this particular work to require a unique process flow, or make it difficult to measure in comparison to other work types. What we are trying to do is apply just enough structure to our demand to make measurements meaningful, while at the same time making sure we do not go overboard and end up with something that is not maintainable in terms of robust analysis and metrics gathering.
[Image: metrics_work_Types_examples]
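To make this concrete, here is a minimal sketch, in Python (which I will use for all the sketches in this post), of what a small work type catalog and the tagging step might look like. The type names and attributes below are purely illustrative, not taken from any particular client:

    from dataclasses import dataclass

    # A work type is a named combination of the attributes that make work
    # comparable: platform/solution, size, risk, and so on.
    @dataclass(frozen=True)
    class WorkType:
        name: str
        solution: str   # e.g. "web application", "business intelligence"
        size: str       # e.g. "small", "large"
        risk: str       # e.g. "incremental value", "emergency"

    WORK_TYPES = {
        "web-enhancement": WorkType("web-enhancement", "web application", "small", "incremental value"),
        "bi-report": WorkType("bi-report", "business intelligence", "small", "incremental value"),
        "production-fix": WorkType("production-fix", "web application", "small", "emergency"),
    }

    # Every new piece of demand gets tagged with exactly one work type, so its
    # measurements are only ever compared within that type.
    incoming = [("Add export-to-CSV to the order screen", "web-enhancement"),
                ("Nightly sales margin report", "bi-report")]
    for title, type_name in incoming:
        assert type_name in WORK_TYPES, f"unknown work type: {type_name}"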
Once work types have been defined, track each unit of work as it progresses through the software delivery lifecycle and track aggregate performance against these types
Now that we have our work types, we can track the start and end dates of each work "ticket" (or package, or feature, or however you define individual units of work) whenever it changes to a new process state (e.g. requirements > build). The date a request for a particular unit of work is made, and the date it arrives at the customer, should also be tracked. I also recommend tracking whenever the work makes its way into the input queue of a downstream process. Finally, I recommend tracking whenever an individual unit of work is blocked because of organizational impediments.
[Image: metrics_value_stream]
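One way to capture these dates is to record one every time a ticket enters a new state. A sketch, with state and field names that are again just assumptions for illustration:

    from dataclasses import dataclass, field
    from datetime import date
    from typing import Dict, Optional

    @dataclass
    class Ticket:
        title: str
        work_type: str
        value_class: str                     # "feature" (new value) or "fix"
        requested_on: date                   # when the customer asked for it
        delivered_on: Optional[date] = None  # when it arrived at the customer
        state_entered: Dict[str, date] = field(default_factory=dict)
        blocked_days: int = 0                # time lost to organizational impediments

        def move_to(self, state: str, on: date) -> None:
            self.state_entered[state] = on

    t = Ticket("Add export-to-CSV", "web-enhancement", "feature", date(2010, 11, 1))
    t.move_to("requirements", date(2010, 11, 8))
    t.move_to("build-queue", date(2010, 11, 15))  # input queue of a downstream process
    t.move_to("build", date(2010, 11, 17))
    t.delivered_on = date(2010, 12, 1)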
An effective approach to tracking work daily is through the use of a physical kanban board as well as an electronic tool
Of course kanban is not required for a measurable system of work, but it is an excellent way to enable everyday team members to both visualize their work and track work as it crosses different process "states". While the purpose of this post is not to explain kanban, in a nutshell kanban enables a pull system, where work is pulled into a particular process state only if there is capacity to handle it. Kanban allows knowledge workers to balance capacity with demand, make process policies explicit, and promote incremental improvement through the collection of state change metrics, not to mention that it makes bottlenecks explicit.
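A minimal sketch of that pull rule, with made-up states and WIP limits:

    # Work may enter a state only when that state is under its WIP limit;
    # otherwise it waits upstream, which is what makes bottlenecks visible.
    WIP_LIMITS = {"requirements": 3, "build": 4, "test": 3}
    board = {state: [] for state in WIP_LIMITS}

    def pull(ticket_id: str, from_state: str, to_state: str) -> bool:
        if len(board[to_state]) >= WIP_LIMITS[to_state]:
            return False                      # no capacity: the ticket waits upstream
        if from_state:
            board[from_state].remove(ticket_id)
        board[to_state].append(ticket_id)
        return True

    board["requirements"].append("TICKET-1")
    pull("TICKET-1", "requirements", "build")  # succeeds while "build" is under its limit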
[Image: metrics_kanban]

Tracking the state of each work unit and aggregating along each work type provides a rich and informative suite of metrics that can inform on both project and enterprise delivery health across a variety of software delivery contexts
The key metrics here that can inform the evaluation of a healthy system of work are lead time, cycle time, and capacity load (more commonly known as work in progress, or WIP). Lead time lets us know how long a customer waits for a request to be completed; driving this number down leads to customer satisfaction and better business agility. Cycle time is a good indicator of how efficient delivery is. Finally, capacity load lets us know whether our workers are able to start a piece of work and complete it before starting any new work. This last metric is an indicator of organizational maturity, collaboration, and effectiveness. Teams that are able to work with less inventory are fundamentally more effective than teams that are always working with larger inventories. Capacity load is also a leading indicator of cycle time.
Other important metrics include value load, the ratio of work entering the system that creates new value (e.g. new features) versus work that fixes existing value (e.g. defects or change requests). Other metrics are briefly described below.
[Image: metric_overview]
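Building on the Ticket sketch above, aggregating these numbers per work type could look something like the following; field and state names remain illustrative:

    from statistics import mean

    def lead_time_days(t):
        # request date to delivery date: what the customer experiences
        return (t.delivered_on - t.requested_on).days

    def cycle_time_days(t, first_state="requirements"):
        # first active state to delivery: how efficient delivery itself is
        return (t.delivered_on - t.state_entered[first_state]).days

    def summarize(tickets, work_type):
        mine = [t for t in tickets if t.work_type == work_type]
        done = [t for t in mine if t.delivered_on]
        features = sum(1 for t in done if t.value_class == "feature")
        return {
            "avg lead time (days)": mean(lead_time_days(t) for t in done),
            "avg cycle time (days)": mean(cycle_time_days(t) for t in done),
            "capacity load (WIP)": len(mine) - len(done),
            "value load": features / len(done),  # new value vs. fixes
        }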

Tracking aggregate performance against differentiated work types is both effective and manageable
Once we have measured the performance of individual work units and aggregated them according to particular work types, we can do some demand/supply analysis to baseline the aggregate performance of work assigned to these types. As an example, we could determine that an enhancement request for a new feature on an existing application takes 30 days or less the majority of the time. We can use these numbers to set up what is known as a service delivery promise, where we promise our customer that we can complete a work unit of a particular work type within a particular period of time, as long as total work stays under an agreed-upon capacity load. An example of this service delivery promise could be to complete an enhancement within 30 days 90% of the time, as long as there are no more than 30 enhancement requests passing through the system. This service delivery promise looks a lot like a service level agreement for an infrastructure-style service; we are applying the same concept to delivery work. What makes this approach feasible is that all work units for a particular work type no longer contain a high degree of variability in terms of delivery effort. This means that new work types will be identified fairly frequently during the initial stages of following this approach. New work types will also be appropriate when the business ventures into entirely new product lines, which may require new solutions and new approaches.
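Deriving such a promise from history can be as simple as taking a percentile of completed lead times; the numbers below are hypothetical:

    def service_delivery_promise(lead_times_days, percentile=0.90):
        # the number of days within which `percentile` of past tickets finished
        ordered = sorted(lead_times_days)
        index = max(0, int(round(percentile * len(ordered))) - 1)
        return ordered[index]

    history = [12, 18, 21, 22, 25, 26, 27, 28, 29, 41]  # hypothetical lead times
    print(f"90% of enhancements completed within "
          f"{service_delivery_promise(history)} days")  # -> 29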
Another approach is to set service delivery targets right away, before the system of work using a new approach has had a chance to run for a while. In this case historical analysis can help in figuring out what these targets should be. Often this historical data does not exist; in that instance I recommend working with the actual workers to come up with baseline estimates for each of these types. Once performance targets have been identified, we have a baseline for what normal delivery performance could be as well as a baseline for further improvement. Creating good delivery performance targets is always a bit of a chicken-and-egg proposition. My experience is that initial service-level targets, as well as work type categorizations, really need to be fleshed out through real work. While analysis is helpful, appropriate work package sizes, categorization types, and matching delivery performance targets will change drastically through the practical application of delivering real software.
[Image: matrix_Factory]
Tools like Statistical Process Control charts can be used to help identify common and special cause variations and create opportunities for process improvement
Now that we are tracking things like cycle time and lead time for individual units of work across particular work types, we can use Statistical Process Control (SPC) charts to identify issues in the process. SPC charts easily identify outliers in a measurement (cycle time, lead time, etc.); we can then perform root cause analysis on these outliers to determine whether the problem was "common cause" or "special cause". Common cause variations result from a problem in our system of work: inappropriate process, bad software, too much governance, not enough governance, and the like. Special cause variations are not inherent in the process, should occur rarely, and seem unpredictable: someone becoming sick, getting hit by a truck, or an earthquake are all examples. Examining these outliers for either type of variation allows us to make the changes necessary to ensure that the particular issue that caused the variation in performance does not occur again. In this way we can further improve our service delivery promises as our average completion time continually improves.

[Image: metrics_spc]
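A rough sketch of this outlier detection, using the common mean plus or minus three sigma control limits computed from a stable baseline period (production SPC tools estimate sigma more carefully, for example from moving ranges):

    from statistics import mean, stdev

    def control_limits(baseline):
        center = mean(baseline)
        sigma = stdev(baseline)
        return center - 3 * sigma, center + 3 * sigma

    def outliers(baseline, new_samples):
        low, high = control_limits(baseline)
        return [x for x in new_samples if x < low or x > high]

    baseline_cycle_times = [14, 15, 13, 16, 14, 15, 13, 15, 14, 16]  # stable history
    print(outliers(baseline_cycle_times, [15, 44, 13]))  # -> [44], a root cause candidate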
Using Cumulative Flow, the Application Delivery group can determine how well they are limiting WIP and improving lead time
Cumulative flow diagrams help illustrate the amount of work in progress at each stage in the system. If the work is flowing smoothly, the bands should be smooth and their heights should be stable. Lead time can be derived by scanning the diagram horizontally, and it is evident that as WIP increases, lead time increases. This tool is useful for determining how well the Application Delivery group is doing at maintaining its WIP limits.
[Image: metrics_cumulative_flow]
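The data behind a cumulative flow diagram is just a cumulative count, per day, of the tickets that have reached each state; a self-contained sketch with made-up dates:

    from datetime import date

    # Each ticket reduced to the dates it entered each state (see the tracking
    # sketch above); None means it has not reached that state yet.
    STATES = ["requested", "build", "done"]
    tickets = [
        {"requested": date(2010, 11, 1),  "build": date(2010, 11, 8),  "done": date(2010, 11, 20)},
        {"requested": date(2010, 11, 3),  "build": date(2010, 11, 15), "done": None},
        {"requested": date(2010, 11, 10), "build": None,               "done": None},
    ]

    def cfd_row(day):
        """Cumulative count of tickets that entered each state by `day`."""
        return {s: sum(1 for t in tickets if t[s] and t[s] <= day) for s in STATES}

    # The vertical gap between adjacent series is the WIP in that stage.
    print(cfd_row(date(2010, 11, 16)))  # -> {'requested': 3, 'build': 2, 'done': 0}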

Using this approach to create a measurable system of work is incredibly powerful. Comments from my clients have been that they have never before been on a project with such a rich set of measurements, letting them know exactly how the work was progressing. Below is a sample dashboard that I have used in the past.
[Image: metrics_project_dashboard]