Granularity – The spectrum from Big Rocks to Dust


One of the hallmarks of modern software development is the breaking down of large efforts into smaller increments, with more frequent delivery. To a large degree, the days of 3 to 5 year release cycles are long gone, and delivery intervals are now measured in weeks or even shorter increments.

Teams that have adopted iterative approaches (e.g. Agile, Scrum) are used to focusing on things at the Sprint (or Iteration) level. Time boxing efforts to a 2 or 3 week window inherently puts an upper limit on the granularity of items that can be reliably achieved within the window.

Things start to get more diverse as one looks at the items within the sprint, and the impact on various elements. Some teams focus on only one item (User Story/PBI) per sprint; this has the limitation of a binary outcome: the team either succeeds completely or fails outright. When the work is broken down into, say, 10 distinct items, partial success can be measured, and the team can focus on delivering the most important items with maximum reliability.

For the purposes of this post, we will examine a workflow based on a few presumptions:

  • A Story/PBI is broken down into specific Tasks
  • Tasks are accomplished by one or more linked commits to a source repository
  • Source code can be traced back to the task which drove its implementation
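
One simple way to satisfy the traceability presumption is a commit-message convention that carries the task ID. The following sketch uses a hypothetical task ID (`TASK-101`) and file name; any issue-tracker convention would work the same way:

```shell
set -e
repo=$(mktemp -d); cd "$repo"; git init -q
git config user.email dev@example.com
git config user.name Dev

# Convention (illustrative): prefix each commit message with the task ID
# so every source change can be traced back to the task that drove it.
echo 'class Parser {}' > Parser.cs
git add Parser.cs && git commit -qm "TASK-101: Add Parser class"

echo '// parse method skeleton' >> Parser.cs
git commit -qam "TASK-101: Add Parse method skeleton"

# Trace: list every commit linked to the task.
git log --oneline --grep='TASK-101'
```

The same query works in reverse via `git log -- Parser.cs`, recovering the tasks that shaped a given file.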

The least granular approach possible is to define the coding effort for the Story/PBI as a single task, with all of the necessary code delivered in a single commit. This pattern is common in many environments, but it has significant limitations.

If one moves to highly granular commits [still being done under one task for right now], then activities such as code reviews become much easier (the effort of a code review grows rapidly, arguably exponentially, with the number of lines of code). Performing a compare between two logically sequential commits provides a clear “how to” for accomplishing some small goal, which facilitates understanding and improves re-use of the knowledge.
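
The “compare between sequential commits” idea can be sketched with plain git (file and method names here are illustrative):

```shell
set -e
repo=$(mktemp -d); cd "$repo"; git init -q
git config user.email dev@example.com
git config user.name Dev

# Commit 1: the empty class.
printf 'class Widget {\n}\n' > Widget.cs
git add Widget.cs && git commit -qm "Add empty Widget class"

# Commit 2: one small, focused change - adding a single method.
printf 'class Widget {\n    public int Size() { return 0; }\n}\n' > Widget.cs
git commit -qam "Add Widget.Size method"

# The diff between the two logically sequential commits reads as a
# small, self-contained "how to add a method" recipe.
git diff HEAD~1 HEAD
```

A reviewer (or a future teammate) reading that diff sees exactly one idea, with no unrelated noise.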

Let’s examine a hypothetical sequence at the very beginning of a new project. Although this will be using C# and Visual Studio specifics, the concept is universal.

#1) Create an Empty Solution
#2) Add a Project to Solution using a Project Template
#3) Add a Class (just the raw class) to the Project
#4) Add a Method to the Class along with the First Test(s)
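
The four steps above can be sketched as four separate commits. The file names are stand-ins (a real C# workflow would create these through Visual Studio or the `dotnet` CLI); the point is the one-step-per-commit rhythm:

```shell
set -e
repo=$(mktemp -d); cd "$repo"; git init -q
git config user.email dev@example.com
git config user.name Dev

# 1) Create an empty solution
touch Sample.sln
git add Sample.sln && git commit -qm "Create empty solution"

# 2) Add a project using a project template
mkdir Sample && touch Sample/Sample.csproj
git add Sample && git commit -qm "Add Sample project from template"

# 3) Add a raw class to the project
touch Sample/Widget.cs
git add Sample/Widget.cs && git commit -qm "Add Widget class"

# 4) Add a method along with the first test(s)
touch Sample/WidgetTests.cs
git add . && git commit -qm "Add first Widget method with first tests"

git log --oneline    # four small, reviewable, replayable steps
```

Replaying `git show` on each commit in order yields the step-by-step recipe described below.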

If this level of granularity is used in conjunction with automated build/test [ideally prior to the changes being permanently recorded in the repository], then the result is a step by step recipe for the work involved. If there is any misstep along the way, it will be caught immediately, mitigating the risk that additional work will be done based on a flawed foundation.
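
One common way to run build/test before a change is permanently recorded is a client-side git pre-commit hook. This is a minimal sketch: `run-tests.sh` is a stand-in for the real build/test step (e.g. `dotnet build` followed by `dotnet test`):

```shell
set -e
repo=$(mktemp -d); cd "$repo"; git init -q
git config user.email dev@example.com
git config user.name Dev

# Stand-in "test suite" that currently passes.
printf '#!/bin/sh\nexit 0\n' > run-tests.sh
chmod +x run-tests.sh

# Pre-commit hook: refuse the commit unless the build/test step passes.
printf '#!/bin/sh\nexec ./run-tests.sh\n' > .git/hooks/pre-commit
chmod +x .git/hooks/pre-commit

echo 'class Widget {}' > Widget.cs
git add Widget.cs run-tests.sh
git commit -qm "Guarded commit"     # hook passes; commit is recorded

# Now the "tests" fail: the flawed change never reaches the repository.
printf '#!/bin/sh\nexit 1\n' > run-tests.sh
echo 'broken' >> Widget.cs
git add Widget.cs run-tests.sh
git commit -qm "Flawed change" || echo "commit rejected by failing tests"
```

Server-side gating (a gated check-in or pre-merge CI build) achieves the same goal for the shared repository.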

When multiple people are working independently, there will always be a need to merge their efforts. The shorter the isolation time, the fewer conflicts will arise. These conflicts can be lexical, which will hopefully be caught by the merge tool; or they can be logical/semantic, where one must rely on testing to catch the problem.

A classic example (at least for C# developers) is multiple developers making changes to the project file. For people not familiar with C#, the project file is an XML file where the ordering is not deterministic (as is typical of XML). In environments where the source control system allows for exclusive locks, it is a fairly common practice to use a lock in these cases which inhibits workflow in order to reduce merge issues. If one adopts high granularity, then the addition of a class (or other file type) to a project takes only a few seconds; committing the updated project, along with the new file(s) immediately thereafter can virtually eliminate the potential for conflict without the need for locking. This becomes even more important if a team is using a source control system that does not support any type of locking.
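
The project-file conflict can be reproduced in miniature. Here a tiny stand-in for a `.csproj` `<ItemGroup>` is edited on two branches that each add a class entry at the same spot (file and branch names are illustrative):

```shell
set -e
repo=$(mktemp -d); cd "$repo"; git init -q
git config user.email dev@example.com
git config user.name Dev
main=$(git symbolic-ref --short HEAD)

# Stand-in for a C# project file: one <Compile> entry per class.
printf '<ItemGroup>\n  <Compile Include="A.cs" />\n</ItemGroup>\n' > App.csproj
git add App.csproj && git commit -qm "Initial project"

# Developer 1, working in isolation, adds a class entry.
git checkout -qb dev1
printf '<ItemGroup>\n  <Compile Include="A.cs" />\n  <Compile Include="B.cs" />\n</ItemGroup>\n' > App.csproj
git commit -qam "dev1: add B.cs entry"

# Developer 2, also isolated, adds a different entry at the same spot.
git checkout -q "$main"
git checkout -qb dev2
printf '<ItemGroup>\n  <Compile Include="A.cs" />\n  <Compile Include="C.cs" />\n</ItemGroup>\n' > App.csproj
git commit -qam "dev2: add C.cs entry"

# First merge is clean; the second conflicts because both branches
# edited the same region of the project file while isolated.
git checkout -q "$main"
git merge -q dev1
git merge dev2 || echo "CONFLICT: both edits touch the same lines"
```

Had either developer committed and shared the one-line change within seconds of making it, the other would have pulled it before editing, and no conflict would exist.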

A second example is the adoption of TDD [Test Driven Development] and the Red-Green-Refactor paradigm. When isolation periods are long, the Refactor step is greatly inhibited, because even simple refactorings (renames, signature changes) are likely to propagate across the code base and collide with others' work. As a result, less refactoring is done, which impedes the entire cycle: writing the minimal amount of bare-bones code tends to leave that [intentionally] poor code in place.

There are many more examples that can be given. If there is interest, then these may appear as subsequent posts here.

For many of the teams I have been coaching and mentoring over the past few years, this has resulted in a mean developer commit (to the shared repository) interval of about 20-30 minutes, with an upper limit of less than 60 minutes. Clearly this means that the tooling to commit changes, build [including test], and get the most current information must be extremely efficient. The most common transformation occurs on the Build machines themselves, morphing from often underpowered boxes or VMs into high-performance rigs.

As is usually the case, different environments have different goals, constraints and abilities. The “best practice” is what has been empirically determined to be appropriate for a given team at a given time. Achieving this means trying different approaches and doing an appropriate evaluation of the outcomes.
