EPANet 3 is due sometime next year and it will probably be the last version supported by the US EPA. Last year, a proposal was made at the 2009 EWRI conference in Kansas City to establish an Open Source Project for EPANet. About two weeks ago, at the WDSA 2010 conference held in Tucson, AZ, Lew Rossman and Kobus van Zyl raised the issue again. During the conference I talked with a few people and got a few opinions, all of them positive I must say. There are a number of issues that need to be resolved before this kind of project is initiated, and in this post I will try to present my views.
I’m all for open source, as I have been involved for a few years now with an open source project called WordPress. WordPress is the software that runs this blog and it is used by millions of people all around the world. I have been a part of the Israeli WordPress community, contributing some patches and helping with localization. Earlier this month we held the third WordPress user conference in Israel, called WordCamp. It is all done by volunteers for the community.
What is Open Source?
Many people think of open source, or free software, as a matter of price: software that is given away for free. Actually it has nothing to do with price but with liberty. To understand the concept, one should think of “free” as in “free speech”, not as in “free beer”. Usually it is said that users should have four essential freedoms (source: The free software definition):
- The freedom to run the program, for any purpose (freedom 0).
- The freedom to study how the program works, and change it to make it do what you wish (freedom 1). Access to the source code is a precondition for this.
- The freedom to redistribute copies so you can help your neighbor (freedom 2).
- The freedom to distribute copies of your modified versions to others (freedom 3). By doing this you can give the whole community a chance to benefit from your changes. Access to the source code is a precondition for this.
An open source project is usually managed online over the Internet and is started by a small group of people that expands as time goes by. The community is self-organizing, with almost no official organization. In most cases the majority of the community are users and not contributors. The contributors divide into core developers, beta-testers, documenters and many people who report bugs and make suggestions. In most cases the small core team has the final word regarding what will go into the program and what will not. When part of the community thinks differently from the core team, they may create a “fork” and form a new project (remember that it’s open source and anyone has the right to do so). However, this is rare, as the threat of “forkability” is one of the things that holds the community together (Producing Open Source Software by Karl Fogel):
The indispensable ingredient that binds developers together on a free software project, and makes them willing to compromise when necessary, is the code’s forkability: the ability of anyone to take a copy of the source code and use it to start a competing project, known as a fork. The paradoxical thing is that the possibility of forks is usually a much greater force in free software projects than actual forks, which are very rare. Because a fork is bad for everyone, the more serious the threat of a fork becomes, the more willing people are to compromise to avoid it.
In many cases there is one person who holds the “keys” for the project. Since the project is managed online it “lives” on a web server. Currently EPANet is hosted on the EPA’s servers and they control it. When the project moves to a new “home” it will be hosted on a different server which is controlled by someone else. At any given time there is only one entity that controls a server. But, no worries. Remember forkability?
A few technicalities
A project is hosted on a web site that should include at least two elements: a version control tool to manage the source code and a bug tracking tool. The WordPress project uses Trac and SVN, which are themselves open source projects. These are self-hosted tools, but there are many hosted services, like SourceForge and Google Code, which ease the project’s management burden. To get an idea of what these systems offer, look at the WordPress versions page, the software timeline, the bug tickets and a small change to the code. In addition to these two tools the community may use a mailing list, support forums, chat rooms, IRC and even, god forbid, face-to-face meetings.
Using a self-hosted solution allows more control over everything. However, with authority comes responsibility. For the self-hosted option the project will need someone to manage the hosting server, keep all software updated, make backups and more. This is a responsibility and it also comes with a cost. I think that the EPANet project should start with a hosted solution like Google Code, and if/when it grows the community may consider moving to the self-hosted option.
Scope of the software
EPANet is not very different from other projects. Lew revealed that in the first 8 months of 2010 EPANet was downloaded about 20,000 times! What we can learn from this number is that the vast majority of EPANet users are people who use EPANet’s graphical user interface (GUI), not software developers who use the programmer’s toolkit or work directly with the source code. EPANet is somewhat unique in that it is built from two sub-programs: the hydraulic/quality engine and the GUI. While the first component matters mainly to the research community, the latter is what most of the program’s users work with. I think that EPANet, as an open source project, must include both components (engine and GUI).
EPANet was developed by the US EPA. As such, it is public domain software that may be freely copied and distributed. In other words there is absolutely no ownership. As strange as it may sound, anyone may download the program and sell it as it is. Yes, that’s right, sell it. However, open source is not about money, it’s about liberty (as explained above).
Each open source project usually has a license and there are plenty of them. A few licenses are popular and widely used or have strong communities: Apache, BSD, GPL, LGPL, MIT and a few more. There is no need to go into each of them and explain the differences, as in our case it is enough to consider the GPL and MIT licenses. The MIT license is one of the simplest. It essentially states that anyone is entitled to do whatever they want with the software, as long as the license notice is kept, and that the authors can’t be held liable in any case. The GPL license is much more complicated, but the main difference is that when one redistributes the program, or any derivative of the program, he must do so under the same license. For example, let us consider the case of someone who adds a new feature to EPANet, like a new valve type, compiles the code and sells it to a third party. Under both licenses he has the right to do so. However, with GPL, since the new software is a derivative of the original, it is also licensed under GPL. And according to the license the third party must be granted the right to change the code and redistribute it, so he must have access to the source code. I guess that by now you understand the main difference: it would be difficult to sell proprietary software based on a GPL-licensed EPANet.
If the community would like to allow proprietary software based on EPANet then the MIT license is the way to go. While I was in Tucson I had a chat with a few people on this subject. Some, mainly from the academic research community, said that they would not feel right contributing to the project if commercial companies were able to make money from their work and not give back to the community. Unfortunately I didn’t have the time to discuss this subject with the corporate people, but I will try to get their response. I’m not a lawyer, but there are legal and practical ways around this. I think that the project should start with the MIT license and see what understanding may be achieved later on with the big companies.
The case of CWSNet
A few days before the WDSA 2010 conference Dragan Savic announced via the EPANet users list the availability of a new open-source, object oriented library for water distribution system modelling named CWSNet. Zoran Kapelan presented the library at the conference, and it seems that it performs very similarly to EPANet but is still somewhat slower with large networks. The main advantage of CWSNet is that it is object oriented, which makes the code easier to understand. An example given by the CWSNet authors to show the need for an object oriented library has to do with matrix representation:
Another problem in the design of EPANET is that the items that are used in various parts of the hydraulic computation (such as the matrix representation) do not have a specific interface. For example, the matrix that represents the linear systems of equations is accessed directly by the various methods, i.e. these methods know how the matrix is represented in the memory. Therefore, if a programmer needs to change the internal representation of the matrix, he or she needs to change all the methods that interact with it.
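To make the quoted point concrete, here is a minimal sketch in Python of what a "specific interface" for the matrix could look like. It is only an illustration: the names `SystemMatrix`, `DenseMatrix`, `SparseMatrix` and `assemble` are hypothetical and are not taken from EPANet or CWSNet, and the assembly step is a simplified stand-in for the real hydraulic computation.

```python
from abc import ABC, abstractmethod


class SystemMatrix(ABC):
    """Hypothetical interface: the only way other code touches the matrix."""

    @abstractmethod
    def add(self, row, col, value):
        """Add value to the coefficient at (row, col)."""

    @abstractmethod
    def get(self, row, col):
        """Return the coefficient at (row, col)."""


class DenseMatrix(SystemMatrix):
    """Stores every coefficient in a list of lists."""

    def __init__(self, size):
        self._rows = [[0.0] * size for _ in range(size)]

    def add(self, row, col, value):
        self._rows[row][col] += value

    def get(self, row, col):
        return self._rows[row][col]


class SparseMatrix(SystemMatrix):
    """Stores only the coefficients that were set, in a dictionary."""

    def __init__(self):
        self._cells = {}

    def add(self, row, col, value):
        key = (row, col)
        self._cells[key] = self._cells.get(key, 0.0) + value

    def get(self, row, col):
        return self._cells.get((row, col), 0.0)


def assemble(matrix, links):
    """Add each link's coefficient to the matrix through the interface only.

    links is a list of (node_a, node_b, coefficient) tuples (illustrative data).
    """
    for a, b, coeff in links:
        matrix.add(a, a, coeff)
        matrix.add(b, b, coeff)
        matrix.add(a, b, -coeff)
        matrix.add(b, a, -coeff)


# The same assembly code works with either storage scheme.
links = [(0, 1, 2.0), (1, 2, 0.5)]
dense, sparse = DenseMatrix(3), SparseMatrix()
assemble(dense, links)
assemble(sparse, links)
```

Because `assemble` never looks at how the coefficients are stored, switching from the dense to the sparse representation does not require touching it. That is the kind of change the CWSNet authors say is hard when every method knows the memory layout.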
I think that in the long run it would be better if EPANet were object oriented, so that the code would be much simpler to understand and software components would be easy to replace. This structure has an additional benefit license-wise, as it would be much easier to include GPLed components in the MIT licensed project. Don’t get me wrong, I think Lew’s code is brilliant. I still use code snippets from the Visual Basic version of EPANet 1.0.
The introduction of CWSNet, a few days before the conference, seemed to spark a debate among the participants. At one moment it looked like it was the Europeans against the Americans. I may have got it wrong, but as I understand it, Lew wants to stay with the current structure for EPANet and not move to an object oriented structure. I think that for version 3.0 this is the way to go, as CWSNet is still young, hasn’t been tested and used by the community, and currently includes neither the water quality engine nor the GUI. A lot of work is needed to bring it to where EPANet currently stands. Once version 3.0 is released and the community takes over, a process should be set up to merge the work done in CWSNet into the project.
How to proceed?
It may take up to one year for EPANet 3.0 to be released and I have no doubt that the open source project should officially start from that point. However, things may be done in the meantime. It is up to the US EPA and Lew Rossman to decide what will be included in version 3.0, and I hope this decision will be made soon. Another decision is what the GUI development environment will be: Java or C# (my impression was that it’s going to be Java).
Once these decisions are made I hope Lew will be able to release an alpha version as soon as possible. It does not really matter how raw this version is or how many bugs it has. This alpha version is not to be used by the general public but should be used to set up the open source development environment (Google Code would make a good option). Once the environment is set up, people from all over the world could start to learn how to work with it: how to make suggestions, report bugs, suggest code patches and more. If Lew could also release a beta version it would be even better. After the final release of version 3.0 the new code would be uploaded and the community would take it from there.
- The EPANet project should start with a hosted solution like Google Code.
- The project must include the core engine and the GUI.
- At first the MIT license should be selected.
- Jim Uber suggested that a new name be selected for the project. This is an option, but the community must make sure that the fiasco of the epanet.com domain name is not repeated.
- Object oriented structure should be introduced in steps using the work done with CWSNet once the open source project gets going.
- An alpha version of EPANet 3.0 is essential to set up the open source environment.