Friday, July 17, 2009

Using Open-Source Tools to Manage Software Quality

At PyCon 2009, Aaron Maxwell gave a presentation about the use of BuildBot to support an automated software QA infrastructure. Listening to his talk (online) made me think more carefully about the reasons I am not using BuildBot, which I took a look at several years ago.  After working with a custom automated build tool for a few years, I have recently begun using Hudson to automate software quality processes for a variety of open source software packages.  Hudson automates the following QA activities for these packages:
  • portability tests - building packages with different compilers, language versions and compute platforms
  • continuous integration - rapid builds and software tests to provide developers continuous feedback
  • integration tests - builds that test the integration of different software tools
  • archiving QA statistics - test histories, code coverage statistics, build times, etc.
  • managing third-party builds - building third-party libraries that my codes depend on
Although I am reasonably happy with Hudson, I must admit that I did not immediately decide that it was perfect for my needs the first time I looked at it.  However, as the scope of my QA needs has grown, it has become critical to have a flexible, extensible strategy for automating software QA activities. The following high-level issues have proven to be major considerations when assessing the viability of tools like BuildBot and Hudson:
  • GUI/web interface
    GUI and web interfaces are key to ensuring that developers regularly use the QA data that is being generated. Interactive interrogation of QA current data facilitates effective use of this data, and GUI interfaces are very important when developers do not all have access to the same computing platforms. These interfaces can also convey valuable QA in a concise manner, such as graphical representations of QA history (e.g test failure/successes of time).

  • Extensibility
    Any automation framework is going to need to be customized to adapt to local operating constraints. Thus, the extensibility of the automation framework is a key issue for end-users. A particularly effective strategy for supporting this flexibility is with plugin components, which are supported in Hudson.

  • Loosely Coupled QA Tools
    Hudson uses a standards-based approach for integrating QA information. QA activities can be initiated in a very generic manner, using shell commands whose scope is not restricted. If the QA information is provided in a standard format, then Hudson can integrate it into its QA summaries. For example, Hudson recognizes testing results that are documented with the xUnit XML format, and code coverage results that are documented with the Cobertura XML format. This strategy supports a loose coupling between the QA processes and the use of Hudson, which allows for the application of a heterogeneous set of QA tools, including tools for different test harnesses and programming languages!

  • Compute Resource Management
    Coordinating of a large number of QA activities requires scalable strategies for managing computing resources. Frameworks like Hudson provide basic resource management strategies, including dynamic scheduling of continuous integration builds on a build farm. More generally, scalable automated testing tools need to support strategies like fractional factorial test designs, which test many build options (configuration, platform, compiler, etc) with a small number of builds. Also, management of daemon clients also becomes an issue for large build farms (e.g. notification of exceptional events like running out of disk space).

  • Ease of Use
    It is worth restating that ease-of-use is a major factor in practice. Developers will not use QA frameworks unless they add value to the development process. Further, it can be difficult to convince an organization to support the maintenance of automated QA frameworks on a large build farm.
As a final note, the Acro Developer Resources page summarizes the QA tools that the Acro project is using with Hudson to support software development. It is noteworthy that this effort includes QA processes for both C/C++ software and Python software. On another project, we have also used Hudson to summarize tests of Matlab code.

P.S. I want to thank John Siirola for brainstorming about this blog. John has done most of the work setting up the Hudson server that we are using for Acro and related open source software development.


  1. If you're using Java (or scala or groovy), take a look at maven. I combine these technologies:

    - Maven for the build and the whole POM principle. A pom.xml file states all important project meta-data, such as dependencies, build things.
    - Hudson as the build server. It takes 1 second to hudsonize a maven project: just feed it the pom.xml.
    - Several maven plugins for QA, including:
    -- Dependency report: So if I use org.drools.drools-solver:drools-solver-core:5.0.1 I also get org.codehaus.xstream:stream:1.x in my classpath?
    -- findbugs: So I should use StringBuilder instead of StringBuffer in that peice of code?
    -- pmd and/or checkstyle: So that private field in that class is never read?
    -- surefire reports: So those tests failed?

    Maven has become the defacto standard in Java land (hibernate, spring, ...). It's great: finally I can checkout, build and automatically configure in my IDE (Eclipse, IntelliJ, Netbeans) any open source project and start patching it in under 3 minutes.
    For example. Building drools is as simple as:

    $ svn checkout drools
    $ cd drools
    $ mvn -DskipTests clean install

  2. Any pointers to what else you use for hudson + Python builds?

    Like what you use to change version numbers, upload builds etc..