Tuesday, October 28, 2008

Generating tests in Python unittest

There are many applications where you want to apply a piece of code to a variety of data sets and verify that you get the correct output. In this context, what you want is a test generator, which dynamically creates tests based on the data sets that are available for testing.

Unfortunately, this does not appear to be a feature of unittest. The closest thing I have seen is the support for test generators in the nose package, which extends unittest to provide test discovery mechanisms. However, that test generation feature is somewhat limited: it only applies to test functions that are Python generators, and not to similar methods of unittest.TestCase classes.

The following example shows how to directly insert new test methods into a unittest.TestCase class:


#
# A simple example for generating tests in the Python unittest framework
#

import glob
import unittest

#
# Defining the class that will contain the new tests
#
class TestCases(unittest.TestCase): pass

#
# A generic function that performs a test on a particular file
#
def perform_test(self,file):
    if len(file) > 20:
        self.fail("Failing in file \""+file+"\" because its name is too long.")

#
# Insert test methods into the TestCases class
#
for file in glob.glob("*")+glob.glob("*/*"):
    tmp = file.replace("/","_")
    tmp = tmp.replace("\\","_")
    tmp = tmp.replace(".","_")
    setattr(TestCases, "test_"+tmp, lambda self,x=file: perform_test(self,x))

#
# Apply these unittests
#
if __name__ == "__main__":
    unittest.main()

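In this example, the files in the current directory and in its immediate subdirectories are used to create new test methods. For each file, a new test method is added to the TestCases class, which does a silly check to see if the file name is too long. The default argument x=file binds the current file name to each lambda when it is created; without it, every generated test would see only the last value of file.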
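The same idea can be sketched without touching the file system. This is a minimal sketch, not the code from the post: the names GeneratedCases, check_name_length, make_test, SAMPLE_NAMES and run_generated_tests are hypothetical, and the sample names are invented data standing in for the glob results. A small factory function is used in place of the lambda with a default argument; both give each generated test its own copy of the name.

```python
import unittest


class GeneratedCases(unittest.TestCase):
    """Empty container; test methods are attached below."""


def check_name_length(testcase, name, limit=20):
    # Same "silly" check as in the post, applied to a plain string.
    if len(name) > limit:
        testcase.fail('"%s" is longer than %d characters' % (name, limit))


def make_test(name):
    # The factory gives each test its own binding of `name`, which
    # serves the same purpose as the `x=file` default argument.
    def test(self):
        check_name_length(self, name)
    return test


# Hypothetical sample data standing in for the glob results.
SAMPLE_NAMES = ["README", "src/main.py", "docs/index.html"]

for sample in SAMPLE_NAMES:
    # Turn the name into something usable as a method name.
    suffix = sample.replace("/", "_").replace(".", "_")
    setattr(GeneratedCases, "test_" + suffix, make_test(sample))


def run_generated_tests():
    # Run the generated tests programmatically and return the result object.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(GeneratedCases)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    result = run_generated_tests()
    print(result.testsRun, "tests run;", len(result.failures), "failed")
```

Running the script reports one test per entry in SAMPLE_NAMES. Adding a name longer than 20 characters to the list should produce a failure for that entry only, since each generated method carries its own name.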

Friday, October 3, 2008

Constraint Programming

Nick Berger pointed me to the Global Constraint Catalog, a collection of constraints that can be used for constraint programming formulations. This looks like a nice reference!

Wednesday, October 1, 2008

New Journal: Mathematical Programming Computation

I have recently joined the editorial board of the new journal Mathematical Programming Computation, which publishes original research articles that are at the intersection of math programming and computing. This journal reflects the growing role of computation in operations research, where real-world applications often require the application of complex software packages to analyze mathematical models.

This journal will include articles that report on innovative software, comparative tests, modeling environments, libraries of data, and/or applications. A main feature of the journal is the inclusion of accompanying software and data with submitted manuscripts. The journal's review process includes the evaluation and testing of the accompanying software. Where possible, the review will aim for verification of reported computational results.

Topics covered in Mathematical Programming Computation include linear programming, convex optimization, nonlinear optimization, stochastic optimization, robust optimization, integer programming, combinatorial optimization, global optimization, network algorithms, and modeling languages.

Wednesday, September 24, 2008

Online Video Tutorials

By necessity, I have become quite adept at digging through webspace with search engines like Google to figure out "how to do X". But occasionally, it is difficult to get a sense of whether something is easy from written instructions. For example, I recently tried to install PyQt, a Python interface to the popular Qt application interface library, and here's the error that I got when trying to use nmake to build the SIP library (which PyQt uses):
C:\Python25\sip-4.7.6\sip-4.7.6\siplib>nmake

Microsoft (R) Program Maintenance Utility Version 8.00.50727.762
Copyright (C) Microsoft Corporation. All rights reserved.

cl -c -nologo -Zm200 -O2 -MD -W0 -DUNICODE -DWIN32 -DQT_LARGEFILE_SUPPORT -I. -IC:\Python25\include -Fo @C:\DOCUME~1\wehart\LOCALS~1\Temp\nm271.tmp
NMAKE : fatal error U1077: '"C:\Program Files\Microsoft Visual Studio 8\VC\bin\cl.EXE"' : return code '0xc0000135'
Stop.

I had more than a little difficulty figuring out what the return code 0xc0000135 meant, especially since this turned out to be an nmake return error and not an error generated by cl. Go figure.

In many cases, it is much easier to look over someone's shoulder and have them show you the ropes. That is why I have been so impressed with the ShowMeDo tutorial site. This site provides video tutorials where someone walks you through common installation, setup and configuration examples.

I think that this site is particularly useful for an "average" computer user, who is not familiar with manual software installation and configuration. I work with a variety of such users, and I have found that these tutorials provide a convenient reference that is quite accessible. If they cannot look over my shoulder, then at least they can view a video that provides a similar experience!

The site also includes video tutorials for how to create video tutorials! I have not tried this myself, but I hope this type of tutorial catches on.

Monday, September 22, 2008

Why Python?

In the past year, I have increasingly been using Python to develop a variety of OR-related scientific software. In particular, the Coopr library has been a major focus of this software development.

Recently, I have written a paper that will appear in the proceedings of the INFORMS Computing Society Conference 2009:
W. Hart, Python Optimization Modeling Objects (Pyomo), Proc. INFORMS Computing Society Conference, 2009, (to appear).
In this paper, I describe Pyomo, an open-source tool for modeling optimization applications in Python. A key goal of Pyomo is to provide an open-source math programming modeling capability. Although open-source optimization solvers are widely available in packages like COIN-OR, surprisingly few open-source tools have been developed to model optimization applications.

Pyomo has been developed in Python because it is a well-used modern programming language that provides a robust foundation for developing and applying scientific software. In this paper, I noted that Python meets some basic criteria that relate to how Pyomo will be used and managed:
  • Open Source License: Python is freely available, and its liberal open source license lets you modify and distribute a Python-based application with few restrictions.
  • Features: Python has a rich set of datatypes, support for object oriented programming, namespaces, exceptions, and dynamic loading.
  • Support and Stability: Python is highly stable, and it is well supported through newsgroups and special interest groups.
  • Documentation: Users can learn about Python from extensive online documentation, and a number of excellent books that are commonly available.
  • Standard Library: Python includes a large number of useful modules.
  • Extendability and Customization: Python has a simple model for loading Python code developed by a user. Additionally, compiled code packages that optimize computational kernels can be easily used. Python includes support for shared libraries and dynamic loading, so new capabilities can be dynamically integrated into Python applications.
  • Portability: Python is available on a wide range of compute platforms, so portability is typically not a limitation for Python-based applications.
Another factor, not to be overlooked, is the increasing acceptance of Python in the scientific community. Large Python projects like SciPy and SAGE strongly leverage a diverse set of Python packages.

Several other popular programming languages were also considered for Pyomo. However, in most cases Python appears to have distinct advantages:
  • .Net: The .Net languages are not portable to Linux platforms, and thus they were not suitable for Pyomo.
  • Ruby: At the moment, Python and Ruby appear to be the two most widely recommended scripting languages that are portable to Linux platforms, and comparisons suggest that their core functionality is similar. Our preference for Python is largely based on the fact that it has a nice syntax that does not require users to type weird symbols (e.g. $, %, @). Thus, we expect that it will be a more natural language for expressing math programming models.
  • Java: Java has a lot of the same strengths as Python, and it is arguably as good a choice for Pyomo. However, two aspects of Python recommended it for Pyomo instead of Java. First, Python has a powerful interactive interpreter that allows realtime code development and encourages experimentation with Python software. Thus, users can work interactively with Pyomo models to become familiar with these objects and to diagnose bugs. Second, it is widely acknowledged that Python's dynamic typing and compact, concise syntax make software development quick and easy. Although some very interesting optimization modeling tools have been developed in languages like C++ and Java, there is anecdotal evidence that users will not be as productive in these languages as they will be when using tools developed in languages like Python (see Python vs Java).
  • C++: Models formulated with the FlopC++ package are similar to models developed with Pyomo. They are specified in a declarative style using classes to represent model components (e.g. sets, variables and constraints). However, C++ requires explicit compilation to execute code, and it does not support an interactive interpreter. Thus, we believe that Python will provide a more flexible language for users.

Sunday, September 21, 2008

INFORMS ICS Meeting

If you are interested in the intersection of operations research and computing, then the INFORMS ICS Meeting will be of interest to you! I am organizing a session on open-source software for operations research. Contact me if you are interested in giving a presentation!

Monday, September 8, 2008

Why open-source software?

Much of my work involves the development of open-source software. Recently, I have been challenged to justify this in several different projects.

I recently stumbled across Dave Wheeler's paper, which provides a nice quantitative analysis of the advantages of open-source software.

Saturday, September 6, 2008

Testing ScribeFire

I'm going to try using ScribeFire to generate these posts. It seems to be highly recommended. Also, I can work with it offline, which is a definite plus for me!

Wednesday, August 27, 2008

Starting to blog...?

OK, this is really my second blogging experience. I'm currently the 'blogger in residence' for the INFORMS Computing Society. Though I'm still trying to figure out what that means, I blogged the last INFORMS Annual Meeting.

So, why this blog? I do a lot of software development at Sandia National Laboratories, mostly focused on scientific computing and optimization. The more I work with open-source projects, the more I realize that I should archive some of my 'lessons learned'. Although I like to publish papers, a blog is a better way to informally share information!