Wednesday, February 1, 2012

Science Fiction Comes to Life

One of my favorite authors is Vernor Vinge, who explores how the evolution of technology impacts future societies.  Vernor is a former professor of computer science, so it is perhaps no surprise that his tech focus appeals to me.  The novel The Peace War imagines a post-apocalyptic society that is recovering from a world war that was prompted, in part, by a new force field generating device (called the Bobbler).

There are all kinds of futuristic technologies described in this book, which is typical for Vinge's literature.  In one scene, the protagonist is about to be ambushed by a group of bad guys. He hands a gun to his teenage companion, who takes out the bad guys while hardly aiming the gun.  The trick is that the gun has computer-guided bullets, which the protagonist directed using his laptop.

This sounded like standard sci-fi tech fantasy until I heard about the self-guided bullet that has been developed by Sandia National Laboratories.  Mashable has a fun video describing the work (see Yahoo News).  Although I work at Sandia, the team working on this didn't invite me to the test range.  Perhaps I'll get to help test out the Bobbler.  ;)

Friday, January 20, 2012

Testing Open Source Software

Software testing is widely recognized as a best practice for software development. Software tests define expected functionality, and they can focus developer efforts by providing an objective assessment of the state of a software project. Additionally, software testing data can provide evidence that a software package can be reliably used. For example, when evaluating whether to try out open source software, I routinely look for software testing data to confirm which platforms the software will run on, the versions of associated software that are used, and test coverage statistics that indicate how much of the code is tested.

Unfortunately, most open source software projects do not publish software test data.  I suspect that this indicates that only a small fraction of OSS projects have robust test suites.  However, this also reflects another aspect of the OSS community:  hosting facilities for open source software do not support web-based testing facilities, like Jenkins, that can be used by developers to remotely launch jobs on test machines with a variety of different configurations. This is not totally unexpected, since testing can be computationally intensive.

Recently, I learned about CloudBees, which provides cloud services for building, running and managing Java applications. Happily, CloudBees makes its Dev@cloud service freely available to open source projects! This includes the Jenkins testing service, which provides a limited number of CPU hours each month for testing an OSS project.  For example, the CxxTest project now hosts tests on a CloudBees Jenkins server.  Cool!

Sunday, January 8, 2012

A Different Model for Writing Blog Posts


This is a blog that I have been meaning to write for some time.  I occasionally take a look at the download statistics for this blog, and recently I was prompted to do this by other bloggers who were reporting their end-of-year statistics (e.g. see Laura McLay’s review of the Punk Rock OR blog).

Unlike Laura, I do not have impressive download statistics to report about the many blogs I have written in 2011; frankly, I did not create many posts. However, an interesting pattern has emerged regarding this blog’s readership:  there are a few key blog posts that are frequently downloaded.  For example, my most frequently downloaded blog post is a survey of Python plugin software, which I wrote in 2009.  I suspect that other bloggers have seen the same thing; they have a few posts that are very popular because people do web searches on that topic.  However, it is worth stepping back and thinking about the implications of this when writing a blog.

When I first started blogging, I imagined that readers would view my blog the way that I view Laura’s blog.  They would use an RSS reader to collect and view blog updates.  These would be read shortly after they were published, and afterwards they might be used as a reference.  This led me to create blogs that referenced each other as part of a larger conversation on a topic.  For example, after blogging about Python Plugin Frameworks, I had several follow-up blogs, including a brief description of PyUtilib Plugins that I had developed.

However, I have realized that my blogs are more likely to be found through internet searches focused on a topic.  Consequently, the Python Plugin Frameworks post gets frequently read while the PyUtilib Plugins post rarely gets read.  Readers are finding my blog posts after searching for “python plugins”. The narrower topic covered by the PyUtilib Plugins post is not frequently referenced on the internet, and consequently it is not strongly associated with the more general topic of Python plugins; for example, I did not see it in the first three pages of a google search for “python plugin”.

This suggests a different model for writing blog posts that has already begun to affect my blogging.  Since blog posts are individual artifacts that may have enduring value to readers, updating a blog post with new content makes more sense than creating a new post that continues the previous discussion.  For example, I’ve updated the Python Plugin Frameworks post to include references to PyUtilib’s plugins.  This may confuse readers of RSS feeds, and I do not know whether RSS readers will automatically refresh their feed to capture updates like this. I would assume not.  However, this is clearly a strategy that will enhance the long-term impact of a blog post on a specific topic.

Saturday, January 7, 2012

The Pyomo Book is Coming Soon

The Python Optimization Modeling Objects (Pyomo) package is an open source tool for modeling optimization applications in Python. Pyomo can be used to define symbolic problems, create concrete problem instances, and solve these instances with standard solvers. Pyomo provides a capability that is commonly associated with algebraic modeling languages such as AMPL, AIMMS, and GAMS, but Pyomo's modeling objects are embedded within a full-featured high-level programming language with a rich set of supporting libraries. Pyomo leverages the capabilities of the Coopr software library, which integrates Python packages for defining optimizers, modeling optimization applications, and managing computational experiments.

Of course, there is very little online documentation describing Pyomo.  However, the first book on Pyomo is set to be published in February!
Pyomo - Optimization Modeling in Python. William E. Hart, Carl Laird, Jean-Paul Watson and David L. Woodruff. Springer, 2012.
Here are some links if you want to learn more:
Enjoy!

A Pythonic C++ Parser

If you google for "python C++ parser", you will find a variety of internet discussions related to parsing C++ in Python.  C++ cannot be parsed by an LALR parser, and it is well known that parsing C++ is a nontrivial task.  Thus, these discussions generally fall into one of several categories:
  1. It is too hard to parse C++ in Python, so use a package like GCC_XML that does this for you.  If you really need to do something in Python, write a wrapper to GCC_XML.
  2. It is too hard to perform a complete parse of C++ in Python, but we can use an LALR parser to collect gross structural information from C++ files.  The CppHeaderParser is an example of this type of package, which uses the ply parser to collect information about classes in header files.

In the recent release of CxxTest, I included an LALR C++ parser that is similar to CppHeaderParser. CxxTest is a unit testing framework for C++ that is similar in spirit to JUnit, CppUnit, and xUnit. CxxTest is easy to use because it does not require precompiling a CxxTest testing library, it employs no advanced features of C++ (e.g. RTTI) and it supports a very flexible form of test discovery.

CxxTest performs test discovery by searching C++ header files for CxxTest test classes. The default process for test discovery is a simple process that analyzes each line in a header file sequentially, looking for a sequence of lines that represent class definitions and test method definitions.
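
To make the line-by-line idea concrete, here is a minimal sketch in Python.  This is not the code that CxxTest uses; the regular expressions, the discover_tests function and the example header are my own illustrative assumptions.  It only relies on the CxxTest convention that test classes derive from CxxTest::TestSuite and that test methods have names beginning with "test".

```python
import re

# Assumed, simplified patterns (hypothetical; the real CxxTest scanner
# handles many more cases than these two expressions).
CLASS_RE = re.compile(
    r"^\s*class\s+(\w+)\s*:\s*public\s+(?:CxxTest\s*::\s*)?TestSuite\b")
TEST_RE = re.compile(
    r"^\s*(?:virtual\s+)?void\s+(test\w*)\s*\(\s*(?:void)?\s*\)")


def discover_tests(header_text):
    """Return {suite name: [test method names]} from a line-by-line scan."""
    suites = {}
    current = None
    for line in header_text.splitlines():
        match = CLASS_RE.match(line)
        if match:
            # A new test class starts; later test methods belong to it.
            current = match.group(1)
            suites[current] = []
            continue
        match = TEST_RE.match(line)
        if match and current is not None:
            suites[current].append(match.group(1))
    return suites


EXAMPLE_HEADER = """\
#include <cxxtest/TestSuite.h>

class MyTestSuite : public CxxTest::TestSuite
{
public:
    void testAddition(void)
    {
        TS_ASSERT_EQUALS(1 + 1, 2);
    }
    void test_multiplication()
    {
        TS_ASSERT_EQUALS(2 * 2, 4);
    }
    void helper() {}
};
"""

print(discover_tests(EXAMPLE_HEADER))
```

A scan like this only sees classes that name TestSuite directly as a base class on the same line as the class keyword, which is the limitation that motivates a real parser.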

I added a new test discovery mechanism in CxxTest 4.0 that is based on a parser for the Flexible Object Generator (FOG) language, which is a superset of C++. The grammar for the FOG language was adapted to parse C++ header files to identify class definitions and class inheritance relationships, class and namespace nesting of declarations, and class methods. This allows CxxTest to identify test classes that are defined with complex inheritance relationships.
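
Once inheritance relationships are known, finding test classes amounts to asking which classes reach CxxTest::TestSuite through their base classes.  The sketch below illustrates that step only.  It uses a crude regular expression in place of the FOG parser, and the names parse_bases, derives_from and the example header are hypothetical, not part of CxxTest.

```python
import re

# Assumed, simplified pattern: "class Name : bases" on a single line.
# The FOG-based parser does real parsing; this is only a stand-in.
DECL_RE = re.compile(r"^\s*class\s+(\w+)\s*:\s*(.+?)\s*\{?\s*$")
SPECIFIERS = ("public", "protected", "private", "virtual")


def parse_bases(header_text):
    """Map each class name to the list of base class names it declares."""
    bases = {}
    for line in header_text.splitlines():
        match = DECL_RE.match(line)
        if not match:
            continue
        names = []
        for part in match.group(2).split(","):
            # Drop access specifiers and 'virtual'; keep the base name.
            tokens = [t for t in part.split() if t not in SPECIFIERS]
            if tokens:
                names.append(tokens[-1])
        bases[match.group(1)] = names
    return bases


def derives_from(name, bases, root="CxxTest::TestSuite"):
    """Return True if `name` reaches `root` by following base classes."""
    stack, seen = [name], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        for base in bases.get(current, []):
            if base == root:
                return True
            stack.append(base)
    return False


EXAMPLE_HEADER = """\
class CommonFixture : public CxxTest::TestSuite {
};
class DatabaseTests : public CommonFixture, public Loggable {
};
class Widget : public Printable {
};
"""

bases = parse_bases(EXAMPLE_HEADER)
suites = sorted(name for name in bases if derives_from(name, bases))
print(suites)
```

Here DatabaseTests never mentions TestSuite directly, so a line-by-line scan would miss it, while following the base classes identifies it as a test class.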

As I noted earlier, the CxxTest FOG parser is similar to the parser in CppHeaderParser.  Based on my limited knowledge of CppHeaderParser, here are some points of contrast between these two capabilities:

  1. The FOG parser is embedded in CxxTest, while the CppHeaderParser is a stand-alone package.  Although I implemented the FOG parser as a separate component in CxxTest, I did not have specific design requirements that led me to make this a separate package.  (Interested parties should give me a buzz...)
  2. The FOG parser is specifically focused on the features required by CxxTest, and thus it does not parse out much of the information that CppHeaderParser provides (return values, argument types, etc).
  3. The FOG parser was specifically designed to capture class inheritance relationships.  It is not clear to me that the CppHeaderParser does this.
  4. The FOG parser is based on a superset of C++.  Thus, it can robustly parse C++ method and function definitions.  The examples provided by CppHeaderParser suggest that it can parse function and method declarations, but not headers that include their definitions.  (Of course, the FOG parser ignores these definitions, but that's the point.  The parser can do that.)
  5. The FOG parser has been tested on a large set of C and C++ test files that are used to test the ELSA compiler.  This is a much more extensive test suite than is used to develop CppHeaderParser.

The point of this comparison is that the FOG parser may be of interest for other C++ parsing applications.  It has not been developed for general use, but it could easily be adapted to provide a more general capability.

Wednesday, October 5, 2011

OptimJ is now free, but Ateji is closed

Ateji, the creator of OptimJ, is now closed. The OptimJ product is a Java extension that supports a simplified syntax for specifying optimization models.  It is sad to see this product abandoned like this.  I do not know of any other optimization modeling tool that directly supports Java.

Erwin Kalvelagen noted the demise of Ateji, and commented that "I believe that some of the more complex data issues in practical modeling are often better (i.e. more efficiently) dealt with in a specialized language than in a traditional programming language." Pyomo is similar to OptimJ, in that it supports optimization modeling in Python. However, a premise that has guided Pyomo development is that users will want to perform modeling in a full-featured programming language.

Clearly, traditional programming languages are more verbose and complex than a domain-specific language. However, Python is arguably much simpler to use than Java. In fact, I have gotten feedback from Pyomo users that suggests that they were not aware that they were developing models in Python;  they simply thought that it was another modeling language!

Of course, I am a programmer who likes to do optimization.  My selfish reason for creating Pyomo is that I find domain-specific languages quite constraining.  In fact, I had thought that the use of Python for modeling and optimization would be attractive to optimization researchers.  Python is a great language for prototyping a complex idea, usually without losing too much performance.  I and other Pyomo developers have implemented complex optimizers in Python that directly interface with Pyomo models, and in most cases the runtime is dominated by LP subproblems.  Thus, there has been little motivation to develop custom, highly-optimized codes in languages like C++ based on these solvers.

Friday, September 23, 2011

Using AsciiDoc for Mathematical Publications

Technical writing is an integral part of my research in computer science and operations research. I have a long history using LaTeX, which is very well suited for writing technical articles that contain mathematical equations as well as code snippets. Although LaTeX can readily generate postscript and PDF output files, I have been unimpressed with tools that generate HTML from LaTeX source. Thus, I was intrigued by AsciiDoc, which promises to generate PDF, HTML and eBook formats. AsciiDoc is used to provide online documentation for software projects, and authors can publish books through O'Reilly using this tool. Thus, this is a well-developed document generation tool.

I have successfully prototyped a draft book, Getting Started with Coopr, and you can browse the subversion repository for this document here. Note that the Makefile specifies build targets for PDF, HTML and eBook files.

The advantage of AsciiDoc is that you can use a simple markup language to generate complex documents in a variety of formats. Since this is a generic document-generation process, it is reasonable to expect that there will be limited control of document formatting. (If you want a lot of control, you should just use LaTeX!) However, there are several major limitations to the document generation and format control that limit what you can do with AsciiDoc:

  • Portable Mathematical Equations:  There is only limited support for generating eBook documents that contain mathematical equations.  I noticed that the ePUB standard was just updated this month to support MathML, so it is not clear that e-readers can handle MathML right now.  Additionally, AsciiDoc does not support the generation of the MathML XML from a high-level description (e.g. LaTeX math equations).  Thus, a user cannot easily prepare a document that generates both PDF (using LaTeX under the hood) and ePUB (using MathML under the hood).  I guess we will have to wait a few more years to see robust publishing of mathematics for eBooks.

  • Formatting Mathematics:  For whatever reason, the default formatting of mathematical environments in HTML is not centered or indented (as it is in LaTeX).  Thus, it is much more difficult to read HTML documents containing mathematics.  I tried resolving this using an AsciiDoc filter, without luck.  I wound up rewriting the LatexMath macros to enforce this different formatting in HTML.  Unfortunately, these revised macros do not precisely match the syntax used by AsciiDoc.  {sigh}

  • Document Authors:  The AsciiDoc markup language does not provide a convenient way to create a document with multiple authors.  No, I am not kidding.  There is a docbook configuration file that you can provide, which only works if the document generation process goes through docbook;  in my example, that works for ePUB and PDF files.  Thus, there does not appear to be a single, portable way for specifying multiple authors.

  • Citations: It is noteworthy that none of the examples of online books referenced in the online AsciiDoc documentation contain citations or a bibliography.  The default format for bibliographies in AsciiDoc PDF files is as a numbered chapter or section, which differs from the normal convention in LaTeX (which I much prefer).  Thus, my AsciiDoc book uses the colophon section, which is not numbered.  However, that means that it does not show up in the table of contents.  {sigh}

    Another issue with citations is that the examples provided by AsciiDoc do not correctly generate hyperlinks in the PDF file.  Basically, the bibliography section type provided by AsciiDoc does not work well with the dblatex tool used to generate the PDF.  My solution was to not use the bibliography section type!

    Finally, the examples provided by AsciiDoc include citations in a list environment, which means that the PDF output contains a numbered list followed by a bracket citation reference.  Again, my solution was to avoid using the list;  the bibliography is simply a sequence of paragraphs, each of which is a citation with its associated anchor.

Despite these issues, I am planning to continue developing the Coopr documentation with AsciiDoc.  The lack of support for mathematical equations is a problem for ePUB documents, but most readers will be using this document to refer to the examples in Python.  However, I would not consider using AsciiDoc for developing more complex documents, like a book intended for publication. There is too much customization that would be needed to get past the current limitations.


Friday, August 5, 2011

Coopr Download Fun

I stumbled across the following site, which provides download statistics for PyPI Python optimization packages:

   http://taichino.appspot.com/pypi_ranking/keyword/optimization

It was fun to see the download statistics for the different packages. I'm quite surprised that coopr.pysos has the highest number of downloads, since IMHO this package doesn't have much interesting functionality... Hmmm...

Another surprise for me is that so many optimization Python packages are _not_ on this list!  I am maintaining the following list of links for Python optimization packages:

   https://software.sandia.gov/trac/coopr/wiki/Documentation/RelatedProjects

Perhaps these packages were not tagged with the 'optimization' label ... or perhaps they were not downloaded enough to make the list?

Saturday, July 23, 2011

Python Optimization Packages

I have been maintaining a list of Python optimization packages for a while now on the Coopr Trac pages:  see https://software.sandia.gov/trac/coopr/wiki/Documentation/RelatedProjects.  Today, I noticed that if you google for "python optimization packages", this page does not show up right away.  Perhaps Trac pages are indexed differently?  I'm not sure.

Anyway, I thought I'd add this reference here to help others find this list that I'm maintaining...

Coopr 3.0.4362 Release

We are pleased to announce the release of Coopr 3.0 (3.0.4362). Coopr is a collection of Python software packages that supports a diverse set of optimization capabilities for formulating and analyzing optimization models.

The following are highlights of this release:

- Solvers
* More sophisticated logic for solver factory to find ASL and OS solvers
* Various solver interface improvements
* New Solver results object for efficient representation of variable values
* New support for asynchronous progressive hedging

- Modeling
* Changes in rule semantics to limit rule return values
* Changes in the expected order of rule arguments
* Constant sums or products can now be used as constraint bounds
* Added full support for the ConstraintList modeling component.

- Usability enhancements
* More explicit output from runph and runef commands
* Added support in runef to write the extensive form in NL format
* Added controls for garbage collection in PH

- Other
* Efficiency improvements in generation of NL and LP files.
* Significant efficiency improvements in parsing of Pyomo Data Files.
* More robust MS Windows installer (does not use virtual python environment)

Note that this is a major release of Coopr that changes the expected formulation of Coopr models. See the Coopr blog for further details about deprecated functionality, which will be disabled in future releases.

See https://software.sandia.gov/trac/coopr/wiki/GettingStarted for instructions for getting started with Coopr. Installers are available for MS Windows and Unix operating systems to simplify the installation of Coopr packages along with the third-party Python packages that they depend on. These installers can also automatically install extension packages from Coin Bazaar.

Enjoy!

- Coopr Developer Team
- coopr-developers@googlecode.com
- https://software.sandia.gov/trac/coopr/wiki/Documentation/Developers


-----------
About Coopr
-----------

Coopr is a collection of Python software packages that supports a diverse set of optimization capabilities for formulating and analyzing optimization models.

A key driver for Coopr development is Pyomo, an open source tool for modeling optimization applications in Python. Pyomo can be used to define symbolic problems, create concrete problem instances, and solve these instances with standard solvers. Thus, Pyomo provides a capability that is commonly associated with algebraic modeling languages like AMPL and GAMS.

Coopr has also proven an effective framework for developing high-level optimization and analysis tools. For example, the PySP package provides generic solvers for stochastic programming. PySP leverages the fact that Pyomo's modeling objects are embedded within a full-featured high-level programming language, which allows for transparent parallelization of subproblems using Python parallel communication libraries.

Coopr development is hosted by Sandia National Laboratories and COIN-OR:

* https://projects.coin-or.org/Coopr
* https://software.sandia.gov/coopr

See http://groups.google.com/group/coopr-forum/ for online discussions of Coopr.