Re-defining Open Source
7/10/2007 9:29 PM
It seems that I am seeing more and more discussion around what constitutes Open Source software and Open Source projects. Not only do you have the Free Software group, who follow the Richard Stallman philosophy, and the Open Source group, who fall more into the Eric Raymond camp, but lately you also have a number of people for whom purity is defined not just by how closely your code adheres to the Open Source or Free Software definitions but also by how closely your project follows the Bazaar development methodology. For people in this camp, Open Source did not truly exist before Linus started Linux.
In his post on Defining Open Source, Jeff Atwood lays out the requirements for a project to participate in his program to support .Net Open Source projects. In defining Open Source, Jeff tries to distinguish projects that truly support the tenets of Open Source from those that merely pay lip service to it. Jay Wren, in his follow-up post, Re: Defining Open Source, tries to make the point that to be Open Source means that every part of your development process needs to be totally transparent and open to the public.
In a nutshell, Jeff's and Jay's arguments go something like this: If you do not allow unfettered access to the source code and nightly builds then you are somehow not truly an open source project. To be really pure you must accept code from everybody and their brother and give the whole world access to all of your project's communication channels.
I say this is a bunch of rubbish. Neither the Free Software nor Open Source definitions make any distinction with regard to the actual development methodology. They are both clearly focused on the software and the license under which it is distributed. They are basically a backlash against the restrictions which are imposed by many software licenses and are an effort to remove restrictions which would prevent a user from gaining the most benefit from your software. I invite you to look through both the GNU and OSI sites for yourselves. You may run across several statements mentioning the benefits of a massively distributed development model, but nowhere that I can find is that development model tied to the concept of free or open source software. So let's look at what Jeff and Jay are saying here and why I think it is wrong to set such requirements to be considered Open Source.
The FOSS movement is a philosophy that tries to do away with many of the bad licensing practices that have been around since the early days of computers and that were exacerbated by the rise of the personal computer. These licenses included more and more onerous terms, which slowly limited consumers' ability to use the software. Often users would not understand the full import of a license term until they were embroiled in a mess like the one in which Jamie Cansdale currently finds himself. To counter these practices Richard Stallman created the GPL and started the Free Software Foundation. His goals were simple: to remove the barriers to use from the software, and to give users back the freedoms which he feels were unfairly restricted under copyright laws.
Some people found that the term Free Software, and some of the principles that Richard was espousing, were too radical to be accepted by business. Since many projects are designed for business use, they felt it was important that business not be turned off by the stigma associated with "free" software. Thus was formed the Open Source movement and the Open Source Initiative. Neither of these movements was trying to define the ultimate development methodology or to prescribe the method by which software is created. I believe that there are good reasons not to follow this path.
The underlying concept of copyrights, which is at the heart of the FOSS movement, has been around since before the founding of the United States. It is codified in our Constitution and in our laws and in international laws and treaties. With the advent of computers and computer software, copyrights were extended to cover software. The concept of the ownership of ideas and words, and the right to profit from that ownership, is a timeless principle that has held for many years, and is one that is not likely to substantively change anytime soon. The principles upon which the FOSS movement was founded are equally timeless in that they seek to overturn or limit the application of copyrights to software.
Contrast this with development methodologies like those described in Eric Raymond's book "The Cathedral and the Bazaar", which is a collection of essays describing what he views as the ultimate open source development methodology. In his book, Eric lays out the case for the use of a widely distributed and very open development methodology as a way to improve the software development process. He attempts to equate open source software with "the process of systematically harnessing open development and decentralized peer review to lower costs and improve software quality". The problem with this is that it is not a timeless principle. Development methodologies change and evolve, in many cases rather dramatically, over time. In just the last 10 years you have seen the demise of traditional waterfall methodologies and the rise of more agile approaches. These changes occur as we better understand how to develop software ever more rapidly to better meet the needs of the consumers of that software. While Eric lays out the case for the "Bazaar" being a more efficient development model than the "Cathedral" approach, he admits that it is an evolving concept which has changed and will continue to change. Just because those methodologies work well in today's programming environment and with current development tools does not mean that they will always be the best approach for writing software, or that they are the best approach for every project and every project team.
The second point here is that Eric describes a methodology that is neutral with regard to exactly what it looks like to harness "open development" and what it means to have "decentralized peer review". How much of each of these does a project need to be considered "open source" under Eric's definition? In his essay "How many eyeballs tame complexity", which is included in his book, he explains that one of the key benefits of this model is derived from the ability of testers to see the code and therefore provide much more meaningful bug reports. Now instead of describing erroneous behavior and the steps which caused it, testers can actually find the exact lines of code which cause the failure and provide this information to the developer for incorporation back into the code. The only requirement to achieve this benefit is for the testers to have access to the source code. It does not require daily access or nightly builds to receive the benefit.
Eric goes on to describe the problem with giving unfettered access or having communication channels that are too wide open. The resulting chaos from such an approach actually increases the number of bugs introduced into the project. This is essentially what is described as Brooks's law or, put more succinctly, "too many cooks can spoil the stew". To combat this effect, open source projects have adopted a core plus halo organization where you have a very small group of developers who work on the heart of the system and where most of the critical communication occurs, and then you have various halo groups who work on separable parallel sub-tasks. Again, this approach does not require complete and open paths of communication. In fact it argues against this. Instead, what is advocated is that there are open communication paths to the various halo groups, who in turn have open communication paths with the core group. This keeps the chaos in check but still allows essential bug reports and code changes to make their way back to the core without overburdening the team.
In the end, the very sources used to justify the "wide-open" definition of open source neither advocate nor require the kind of structure or openness that people claim is necessary to meet their new definitions of Open Source. Even with that, I think there are still some additional points which argue against the re-definition of open source.
We currently code in an environment in which companies are increasingly using the courts or the threat of legal action to try and crush competition. Companies are amassing large stockpiles of software patents to use as both offensive and defensive weapons against their competitors. Companies like SCO are launching legal attacks, not against the developers or organizations behind open source software, but rather against the customers who use the software. Even Microsoft has launched a FUD campaign against Open Source to get reciprocal licensing deals from various distributors of open source software. These attacks do not come against the core project or the primary organization behind the project, but rather are focused on the project's customers. The reason for this is simple: money. In general open source projects and their organizations have none. You could try to sue the project out of existence, but because of the distributed nature of Open Source projects, and the freedoms granted under the OS licenses, that would likely be a very costly and losing proposition. However the customers and the distributors of open source software have a ton of money. If you attack, or create uncertainty in the minds of the customers, then you can limit the spread of the Open Source software.
Since I am not a lawyer, I will not try to go into the legal aspects of this beyond saying that these attacks rely on the use of copyright and patent law to claim that the Open Source software is violating the other company's intellectual property. This gets us back to the discussion of the wide-open model of development which some are now advocating as the true definition of Open Source. Any model which allows for a virtually unlimited flow of code into the core project necessarily brings with it the risk that the code may be covered by other copyrights or patents. As the number of contributors to a project increases, or as the size and complexity of contributions increase, the risk to the project and its customers also increases. Where did the code really come from? Did someone google it? Did someone copy it from CodeProject? Maybe it was grabbed from some GPL'd project on SourceForge or, in the worst case, it was copied from some project they have at work. The Bazaar model of development says that it is all good... let it all in. It is more efficient and it will make the software more bug-free.
These risks are real, and they are not simple to completely guard against. However, I think that in this day and age anyone who does not practice "safe open source" is asking for the source code equivalent of AIDS. What do I mean by "safe open source"? By that I mean that a project needs to put some boundaries in place that attempt to minimize the likelihood of infected code (that is, code that is covered under an incompatible copyright or patent, or code which the contributor does not own) entering the project. These protections should also help minimize the damage that can be done in the event that infected code is incorporated.
Just like with "safe sex", "safe open source" promotes the practice of identifying risky behavior and then seeks to either avoid that behavior or to minimize the risks that come with engaging in it. In open source, this means that a project should identify what level of contributions it feels is safe to accept from strangers. How many lines of code is it before someone crosses into that gray area? Does that code use some sort of trick or hack that might be considered a trade secret or patented idea? These are questions your project needs to figure out before blindly accepting a patch or bug fix.
At some point the code will reach a size and complexity at which it will likely be covered by copyright or patent law. On the DotNetNuke project, we have discussed this issue with our attorney Mark Radcliffe, who advised us that we needed to have contributor agreements in place which ensured that we had the rights to use and modify the software as needed and which provided some protection against claims of copyright infringement. Taking affirmative steps to ensure you are making a good faith effort to comply with copyright laws will go a long way towards mitigating any damage that may occur in the case where you find that some code has mistakenly been included in the project. This level of contribution is not something that happens lightly. We like to know who is contributing the software and to have worked with them in some capacity for a period of time. This essentially requires a certain level of trust on our part that backs up the legal agreements we put in place. In some cases we can look at other indicators of trust, such as membership in community organizations like AspInsiders or the Microsoft MVP program, or even personal referrals from people on the team.
Here is the final problem with the re-definitions of Open Source: they ignore the fact that every project, whether open source or proprietary, must find a development methodology and a management approach that works for both the project and the community. Attempting to dictate some software practice for every project, whether it works for the project or not, is counter to the whole point of Eric's essays, which was to reduce cost and improve reliability. I believe that a project needs to find out what works for it. For some projects it will be OO and Agile methodologies in a Bazaar model. Other projects will work better in a more closed model that uses less iterative methodologies. Even a single project will go through different stages where one development approach works better than another. To arbitrarily define one development approach as being better or more pure than another is counter-productive. Just as users should be free to use the code how they wish, developers should be free to develop code using a style that suits them and their project. Projects should not be forced into a development style that doesn't work for them just because that style is de rigueur.
So how does all this apply to DotNetNuke? I think that we clearly fall under the definition of both Free Software and Open Source software. I also believe that we attempt to run the project consistent with open source ideals. We have changed our development style a couple of times over the life of the project to match our experience as to what worked for us and our community and what didn't. Our development model is not written in stone. If we find that a more open approach is warranted, and this openness can be balanced against our desire not to overcomplicate our communication channels or introduce additional legal risks, then we are certainly willing to make the changes. You have seen some changes in the project recently: You can now view our release tracker and see where each project is in the process of getting out a new release. You can vote and comment on your favorite feature requests. You can even enter bugs and suggested bug fixes in our bug tracker. As always, you can still discuss what you like and don't like about DotNetNuke in the DotNetNuke forums. As we formalize our release management processes we will also likely open up patch submission.
The DotNetNuke development practices allow users to see and modify the source code and to provide meaningful bug reports based on their understanding of the code. Our practices also allow us to leverage a large development community without overwhelming our communication channels. In the end, I think that the current DotNetNuke practices provide the major benefits of the Open Source philosophy while also recognizing the current legal environment and the practices that actually work for our management team.
3 comment(s) so far...
By Haacked on
7/10/2007 2:51 PM
re: Re-defining Open Source
I totally agree that DNN falls under any reasonable definition of Open Source.
However, I don't think Jeff meant to "redefine" Open Source. Rather, I felt he was setting a higher bar for the purposes of his contribution.
Perhaps he would have been better off saying that he was defining what it means to be an "Active" open source project, along with a list of criteria by which he can, in an automated fashion, determine that. Obviously DNN is an active open source project, and just by looking at it, I can tell that. But for the multitude of project submissions he will receive, there are no clear, unambiguous criteria for determining that. So he had to choose something.
Besides, do you guys really need his money? ;)
By Joe Brinkman on
7/10/2007 4:17 PM
re: Re-defining Open Source
@Phil - The implication from Jeff's post is that there are some criteria which distinguish a project which is not just paying lip service to the Open Source philosophy. The follow up post by Jay makes it very clear that I was not the only one who believed that was the import of Jeff's post. But regardless of Jeff's intention, it is a sentiment that has been expressed by many people in the OS community. There is a subtle hostility to OS projects which don't fall into a certain mold: if they run on .Net, if they have an active commercial eco-system, or even if they don't have a public repository. This hostility manifests itself in efforts to re-define OS. There are also many individuals in the .Net world who think that OS is about no-cost software or about giving everyone access to change the code as they see fit. I believe that OS projects have a responsibility to educate their communities about what the ideals are behind the Open Source movement and to show how their project fits in with the overall OS philosophies. I also think it is important for projects to communicate why they make some of the decisions they do with regard to how the project is managed.
As for the money - we have lots of plans we have yet to implement this year, and every little bit helps us do more for the community like hosting the OpenForce conferences.
By Haacked on
7/13/2007 10:03 PM
re: Re-defining Open Source
> The implication from Jeff's post is that there are some
> criteria which distinguish a project which is not just paying
> lip service
I think the implication is that there are certain criteria which are *easily verifiable* and *provide evidence* that a project is not paying lip service to Open Source.
I don't think, as Shaun seems to imply in the comments, that this was a direct "singling out" of specific .NET OS projects (by which I assume he means DotNetNuke).
In fact, if you read carefully, it seems these criteria were meant to single out projects like NDoc, which might only have one developer and never released the code for NDoc 2. However, by an unfortunate quirk of fate, these very same criteria would seem to rule out DNN. It's the law of unintended consequences at work.
As you and I know, DNN is one of the most successful Open Source projects on the .NET platform and meets every definition of what Open Source is. It's obvious.
And DNN was added to the spreadsheet. I can't speak for Jeff, but I assume he'll make an exception to his criteria for DNN because nobody would argue that DNN doesn't qualify as an OSS project.