Branching Strategies

Branching, as described in the previous section, is a means to keep the develop-test-release cycle running smoothly. The strategy depends upon a programming team finishing a defined goal or task. The task is something that can be done within a short time limit - usually a week or two, but sometimes as long as a month. The task should also be atomic and testable. If you define too large a unit as a task, then development may take too long, testing will sit idle, and the integration of the work into the release will be delayed. This could affect the development of other parts of the project which depend upon this work.

Normally, branching would seem to be task based, but naming a branch after a task is rarely done. Instead, branching is handled in several different ways.


Single Branch Development

In smaller shops, the programming team may consist of one or two people who do most of the testing themselves. In such shops, the programmers can coordinate so closely that they all work on the same defined task. After the task is complete, the programmers might do the testing themselves. In such a tightly run shop, it isn't unusual for all programmers to work off of the same branch.

In the following case, all work for the release was done on a single development branch. When the task is complete, the programmers, as a group, do almost all of the testing and correct the bugs found in the program.

In the following diagram, testing is complete, and the program is merged onto the main branch and labeled for outside testing for customer acceptance.

At this point, an external test team will check for remaining bugs. Meanwhile, the development branch is renamed (in this case, to /main/dev_jan_31), and a new branch called /main/dev is created:

While work is going on with the new development branch, a test team is testing the release INT_REL_1.0 for problems. If the work on /main/dev is complete, and no further problems are found, then the new /main/dev is renamed (in this case, /main/dev_feb_15) and merged onto /main. Finally, the new /main/LATEST is labeled as a release:

However, if an error was discovered in /main/LATEST, the error can be corrected on the old /main/dev branch (in this case, /main/dev_jan_31). Once the bug has been corrected and retested, it is merged onto /main again, and possibly merged into the new development work too:

In some sites, if the corrections are minor, the change is made on /main itself. Remember that the programmer thoroughly tested the release before it was merged onto /main in the first place:

As you can see in this strategy, /main is used for releases, and most work takes place on the single side branch. Because there is only a single branch involved, merging is fast and easy with no contention. However, only a single person can work on a given file at a time because there is only a single branch available. Also, all development work on that branch must cease at the same time when that branch is renamed.

In larger shops, this might not be possible, or even desirable. Imagine the effort it would take to get seven or eight programmers to synchronize their development cycles. Imagine how much work must be done to test the changes. And seven or eight programmers is considered a small ClearCase site. Imagine 40 or 50 programmers attempting to do the same thing.
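
The rename-and-recreate step in this cycle might be scripted roughly as follows. This is only a sketch: the helper names ct and rotate_dev_branch are hypothetical, the script assumes the branch type is called dev as in the example, and it assumes your release provides the cleartool rename command with the brtype: selector syntax (check the reference pages for your version). By default it only prints the commands it would run.

```shell
#!/bin/sh
# Dry run by default: ct prints each cleartool command line.
# Set RUN=1 in the environment to execute the commands instead.
ct() {
  if [ "${RUN:-0}" = 1 ]; then
    cleartool "$@"
  else
    echo "cleartool $*"
  fi
}

# rotate_dev_branch SUFFIX   (hypothetical helper)
# Freeze the current "dev" branch type under a dated name, then create a
# fresh "dev" branch type for the next development cycle.
rotate_dev_branch() {
  suffix=$1
  ct rename brtype:dev "dev_${suffix}" &&
  ct mkbrtype -nc dev
}

rotate_dev_branch jan_31
```

Because the rename applies to the branch type, every element's dev branch in that VOB is renamed at once, which is why all work on the branch has to stop at the same moment.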


Task Branching

Another way to branch is branching by task where each task is given its own branch, and this is one of the most popular ways to use branching. However, very few ClearCase sites actually name their branches after the task. Instead, most name the branches after the user ID of the various programmers. There is a reason for this. Most programmers only work on a single task at a time. It is not that the programmer is only assigned a single task, but that it is easier to work on a single task in order to complete it before continuing. Tasks are also usually assigned to a single programmer, so using a branching scheme where each programmer has their own private branch actually works very well.

Task branching works well in a large shop because programmers tend to finish their tasks at different times, which means that changes arrive at a steady rate. Testing is now done on a smaller set of changes, and on a more regular basis. As each task passes testing, it can be incorporated into the build.

In the following example, there are two source files (foo.h and bar.c) and two programmers (first and second). Each programmer will use their own branch (called first and second). Each of these programmers is branching off of the last tested internal release called INT_REL_1.3. This is true even though there may be a more recent version of the program.

In this diagram, the first programmer starts making a change to both foo.h and bar.c. Notice that the first programmer is branching off of INT_REL_1.3, which is not the latest version of foo.h. This is not that uncommon. It is important for programmers to work off of a stable base even if it is not necessarily the latest version. Of course, working off of a stable version must be balanced against making sure that the programming changes are relevant to the code, so a programmer shouldn't fall too far behind in the project:

In the next diagram, the second programmer makes their change in foo.h. Notice that the second programmer is also branching off of the same version of foo.h that the first programmer used.
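
One common way to get this behavior is through each programmer's config spec. The sketch below is an assumption about how such a site might set things up, not something taken from the example itself: task_cspec is a hypothetical helper that prints a config spec selecting the programmer's own branch first and otherwise starting that branch from the labeled baseline. The final rule, which covers elements that do not carry the label, is also an assumption.

```shell
#!/bin/sh
# task_cspec BRANCH BASE_LABEL   (hypothetical helper)
# Print a config spec for private-branch development:
#   1. checked-out versions win,
#   2. then the latest version on the programmer's own branch,
#   3. otherwise the labeled baseline, branching from it on checkout,
#   4. elements without the label fall back to /main/LATEST.
task_cspec() {
  branch=$1
  base=$2
  cat <<EOF
element * CHECKEDOUT
element * .../${branch}/LATEST
element * ${base} -mkbranch ${branch}
element * /main/LATEST -mkbranch ${branch}
EOF
}

task_cspec first INT_REL_1.3
```

The output could be saved to a file and loaded into a view with cleartool setcs. Calling task_cspec second INT_REL_1.3 gives the second programmer the same baseline on a different branch, which is why both programmers branch off of the same version of foo.h.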

Because the changes the second programmer made were simpler, the second programmer actually finishes first. The second programmer does what is called a Unit Test of the changes. This makes sure that the program still compiles with the changes in place, and that the changes work without causing obvious errors in the program.

Once the Unit testing is done, the changes can be merged onto /main. In this situation, the second programmer is merging the changes back into /main where other changes have taken place, so the programmer will probably do some more testing to make sure that the changes made on /main/second still work.

The release could be given to the test team for more thorough testing. If an error is detected, the error might be fixed right on /main, or another /second branch is created, and the problem is fixed there.

The first programmer has just finished their changes. Since there have been many changes in foo.h@@/main since the first programmer branched off of /main, the programmer finds that merging from foo.h@@/main/first/LATEST to foo.h@@/main/LATEST causes too many problems. Instead, the programmer merges backwards from foo.h@@/main/LATEST to foo.h@@/main/first/LATEST to incorporate the new changes into the side branch. This is called Backmerging:

After the backmerge, the first programmer discovers that a few more changes are needed in the program. The backmerge allowed the programmer to test the merge on their own branch instead of having the problems discovered on /main. Now that the changes pass the Unit Test, these changes can again be merged onto /main:

The branch is renamed, and the first programmer can create a new branch called first and continue their work.
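
The backmerge and the final merge onto /main could be driven with cleartool findmerge, roughly as sketched below. The helper names ct, backmerge, and deliver are hypothetical, and the script assumes each command is run from a view whose config spec selects the target branch. By default it only prints the commands.

```shell
#!/bin/sh
# Dry run by default: ct prints each cleartool command line.
# Set RUN=1 in the environment to execute the commands instead.
ct() {
  if [ "${RUN:-0}" = 1 ]; then
    cleartool "$@"
  else
    echo "cleartool $*"
  fi
}

# backmerge FILE...   (hypothetical helper)
# Run from a view that selects the side branch: merge the newer /main
# versions into it so conflicts are resolved and tested there.
backmerge() {
  ct findmerge "$@" -fversion /main/LATEST -merge
}

# deliver BRANCH FILE...   (hypothetical helper)
# Run from a view that selects /main/LATEST: merge the unit-tested
# side-branch versions onto /main.
deliver() {
  branch=$1
  shift
  ct findmerge "$@" -fversion "/main/${branch}/LATEST" -merge
}

backmerge foo.h bar.c
deliver first foo.h bar.c
```

findmerge leaves the merged files checked out, so the programmer can rerun the Unit Test before checking anything in.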

In the above example, branch /main is reserved for unit tested work. Attributes can be used to show the status of the testing done on the version on the /main branch. In the single development branch example, almost all the testing was done on /main/dev. This meant that /main was reserved specifically for release. In a situation with many branches, it is almost impossible for the testers to test on the side branches because there are too many of them, and merging becomes more of an issue.


MR Branching

The idea of branching on task can be taken to its logical conclusion by branching on each Modification Request (MR) that is made. This is done at sites that use defect-tracking software such as ClearDDTS. The idea is that each Bug Number, Change Request, Modification Request, Defect Number, or whatever it is called becomes a separate branch for development.

This has certain advantages. First, changes don't have to be incorporated into the release immediately. For example, an MR has been created to handle the communication between the client program and the server program. This MR has been completed in the client program, but not yet in the server program. Since there may be changes in the communication protocol between the client and server program, the MR is simply not incorporated in the next client release until the server changes are closer to being finalized. Second, there is no need to rename branches, since branches are named after MR numbers that constantly change.

However, creating so many branches can create a vast amount of overhead. In the example below, the program has several MRs attached to it:

In order to make the diagram easier to understand, the versions under the side branches are not shown.

Imagine trying to incorporate all four of these MRs into the next release. Four different merges will have to be done, and there is a good chance of merge contention. In this case, backmerging onto the various branches becomes a necessity. Also imagine that a programmer makes a change on a branch that fixes two MRs. For example, the change in /main/mr004 fixes both MR004 and MR005. This makes the two MRs dependent upon each other. It is now impossible to make a release with MR004 without including MR005, and vice versa. The site must have some way to track this problem. In many sites, MR005 is simply merged into MR004 and any MR005 branches are changed to MR004 branches.

Of course, that strategy wouldn't work in this example if MR004 fixed a problem that was also in MR002. Whatever happens, there must be a way of tracking which MRs have become dependent upon each other. Many times, sites that use MR branching will also have a test branch:

In the above diagram, all four MRs are merged onto a Unit test branch for Release 1.0 which is branched off of /main/3. Any sort of problems with the merging of these four MRs can now be fixed on /main/rel_1.0_utest before being placed upon /main. This makes sure that software that is placed upon /main actually works.
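
As for tracking which MRs have become dependent upon each other, one possibility is to record the dependency as an attribute on the MR branch types. The sketch below is an assumption, not a practice described above: the attribute name SHIPS_WITH and the helpers ct and link_mrs are hypothetical, and it presumes branch types mr004 and mr005 both exist. By default it only prints the commands.

```shell
#!/bin/sh
# Dry run by default: ct prints each cleartool command line.
# Set RUN=1 in the environment to execute the commands instead.
ct() {
  if [ "${RUN:-0}" = 1 ]; then
    cleartool "$@"
  else
    echo "cleartool $*"
  fi
}

# link_mrs MR_A MR_B   (hypothetical helper)
# Record on each MR's branch type that it must be released together with
# the other MR. String attribute values must reach cleartool with their
# double quotes intact, hence the escaped quotes.
link_mrs() {
  ct mkattr -replace SHIPS_WITH "\"$2\"" "brtype:$1" &&
  ct mkattr -replace SHIPS_WITH "\"$1\"" "brtype:$2"
}

# SHIPS_WITH is a made-up attribute type name for this sketch.
ct mkattype -nc SHIPS_WITH
link_mrs mr004 mr005
```

An object holds only one value per attribute type, so an MR that depends on several others would need a list packed into the string or a different mechanism altogether.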


Other Reasons for Branching

There are many other reasons for development branching besides allowing programmers to continue their work while their software is being tested. Anytime parallel development is needed, branching can help.


Maintaining Multiple Releases

In the following example, two versions of the software are being maintained, Release 4.x and Release 3.x. Release 4.x is being maintained on branch /main while Release 3.x is being maintained on branch rel_3.x:

This is the typical use of branching in most SCM packages, and branches in ClearCase can also be used for this purpose. In the following diagram, the labels have been removed in order to show more clearly what is going on. There is a programmer who uses the branch barton to maintain the code. Notice that there are two branches called barton: one is /main/barton, which is used to support Release 4.x changes, while the other is /main/rel_3.x/barton, which is used to support changes to Release 3.x. Under ClearCase, these are both instances of a single branch type called barton. A branch of a given type does not have to be created directly off of /main:

Notice the merge done from /main/rel_3.x/barton/1 to /main/barton/2. Apparently there was a bug in Release 3.x that is also in Release 4.x. Notice that merging doesn't always have to take place from a daughter branch to a parent branch, but can occur between any two branches. In fact, it is also possible to merge several branches at once as shown in the Branch by MR example.


Change in Interface

When a program is in the early stages of development, a change is sometimes made to one of the functional interfaces. For example, another parameter needs to be added, or a better way of handling the function is found. In most SCM systems, such a change can take massive efforts to coordinate. In ClearCase, the change can be placed upon a branch.

In the following example, function foo() has a change in the way it is called. This change can be placed upon a side branch until everyone in the group is ready for the new change. Meanwhile, development can still take place upon the main release branch while waiting for the whole development team to change over:


Marking Branches

In the above examples, you can see how ClearCase uses branches for various jobs. A branch can be used as it is in other SCM packages - to allow for multiple versions of software. A branch can be used to keep the /main branch reserved for completed software. A branch can be used to help coordinate functional interface changes among a whole development team. And branches can be used for private development by the programmers.

Since branches can now be used for various purposes, it would be nice to be able to mark the purpose of each branch. ClearCase 3.0 allows you to attach attributes to branch types, and not just to the branches themselves. This could be used when building the view to make sure that the right type of branches are being used.

How to Mark Your Branch
$ cleartool mkbrtype -nc barton
Make a Branch Type called "barton"
$ cleartool mkattype BRANCH_TYPE
Make an attribute called BRANCH_TYPE
$ cleartool mkattr \
> BRANCH_TYPE \"DEVELOPER\" brtype:barton
The attribute BRANCH_TYPE is attached to a branch type called barton. The attribute has a value of "DEVELOPER"