IFOSSLR Tech Watch Submission

From SPDX Wiki
Revision as of 05:33, 19 November 2010 by K.stewart (Talk | contribs)

Jump to: navigation, search

Software Package Data Exchange(SPDX) Specification


Kate Stewart, Phil Odence, Esteban Rockett


(a) Kate Stewart works as the Ubuntu Release Manager at Canonical, Inc. After reviewing way too many standard projects for license and copyright info in her prior job at Freescale Semiconductor, Inc. she found a group of folk equally fustrated, and set out collaborating with them on defining a specification for sharing package licensing and copyright facts between projects and organizations.

(a) Phil Odence is Vice President of Business Development for Black Duck Software, makers of enterprise application development tools that address management, compliance and security challenges associated with open source. In that role, he is responsible for expanding Black Duck’s reach, image and product breadth by developing partnerships in the multi-source development, legal and open source ecosystem.


(a) Esteban Rockett is a Senior Counsel at Motorola, Inc.   He is responsible for Software, Applications & Digital Content Licensing.

Abstract

The goal of the Software Package Data Exchange (SPDX(TM)) specification is to enable companies and organizations to share license and component information (metadata) for software package and related content with the aim of facilitating license and other policy compliance. The specification is being developed by collaboration between technical, business and legal professionals to create a standard that addressess the needs of these various partcipants in the software supply chain.

Background

Companies at all points in the software supply chain are becoming conscious of the need to treat open source just like any other third party code. They need to know and document the components in the products and software they are consuming and distributing for a variety of reasons, not the least of which is to make sure they understand their legal obligations. Thus the need for a common approach to sharing information about software packages and their related content has never been greater. Breaking down information silos is still a work in progress. The Software Package Data Exchange working group was formed originally as FOSSbazaar sponsored effort, and is now a part of the Linux Foundation's Open Compliance Program[3]. The working group's goal is to figure out effective ways to share copyright and license information about a software packages and common licenses in that package down to the file level.

Why is a standardized specification needed?

Innovation happens very rapidly in the open source ecosystem, often by developers by building on top of the work of other developers. To do this, source code files, that have been created as part of one project under a specific license may be copied and reused in another project, which may be under a different license. This mixing and matching of licenses, creates problems for those companies reusing and redistributing software packages that have this combined software, when it comes time to figure out what they need to do to comply with the licenses that govern it. By creating a standard way of summarizing the licensing and copyright information to the file level, and have a way to double check that the summary they're reading actually matches the code, it makes the task of figuring out what license are in effect much easier. This permits creation of a software "bill of materials" that can be passed with the actual software, throughout the supply chain, saving considerable analysis effort at every step. Simply saying your company is doing the right thing is not enough: savvy users want proof to limit the risk of non-compliance with licenses. Suppliers themselves should welcome a single standard format for disclosing open source rather than having to respond to each manufacturer’s request using a unique format.

What does the SPDX specification consist of?

Although most software licenses convey intent, the SPDX(TM) effort is focused on coming up with a way to summarize the discoverable facts. By describing the solution to the problem as a ‘defined format of file to accompany any software package,’ the effort eases the exchange of license information between companies by looking at three areas: facts that deal with identification, facts that provide overview information, and facts that provide file-specific information about the software package.

Facts that deal with a software package’s identification (metadata) included in the specification are:

  • Which version of the SPDX specification is in use
  • Unique identifier (based on a cryptographic hash algorithm) representing a unique identifier that correlates with this specific software package
  • How the information was generated ( who, when, tools used, etc.)
  • Has an independent audit been done ( signoff/reviewed by ) process.

 

Facts that provide overview information about a software package’s content include:

  • Formal Name
  • Package File Name
  • Download Location
  • Declared License(s)
  • Detected License(s)
  • Copyrights and Dates

 

Facts that are specific to a software package’s file-specific properties:

  • File Name (including subdirectory)
  • File Type (source or binary)
  • Detected license(s) governing file (from file)
  • Copyright owners and dates (if listed)

 

Because of the license orientation of the specification, the Working Group is also committed to providing standardized license references. The spec includes:

  • License names
  • Unique identifiers for common open source licenses
  • Mechanisms for handling non-standard licenses.

 

The SPDX(TM) Working Group does not attempt to apply legal judgment, but rather provides a summary of the facts for other tools and professionals to make their judgements from.

How far along is the development and what are the next steps?

The Version 1.0 beta form of the specification is available for download at [1], but its just a starting point. It's had some road testing, but hasn't been driven by the public, so the group's focus will shift to driving practical applications and incorporating the inevitable feedback before we release the official version 1.0. The group is assembling a list of key projects for which to create SPDX(TM) reports, and to get those done by any method possible. Initially we expect members of the group to roll up their sleves on this live testing, but we're also working hard with other tool vendors, proprietary and open sources (like FOSSology and Ninka) to create other options for generating these reports. We also anticipate the need for tools (e.g. syntax checkers and reading and displaying tools) to enable this development, as well as training materials for educating others on using the standard.

Interested in Learning More and Helping out?

If you want to join our volunteer effort, and help make it better, there is information on how to participate available on our web site [1],[2]. Sub-groups with their own mail lists have recently formed around technical, business, and legal issues, and depending where your interests are, all are open and welcome new members to collaborate on the specific topic areas.

Conclusion

Getting the SPDX(TM) specification adopted across the ecosystem will be a challenge. We need participation and support from key Linux distros and package maintainers, legal experts, tool developers (commercial and open source) and package consuming organizations as well. With major players in all those categories already on board, and with the support from FOSSbazaar and the Linux Foundation[3], the pieces are finally coming together to let us achieve our goal of a useful specification.

References

[1] http://www.spdx.org/
[2] http://www.spdx.org/wiki/spdx/participation-guidelines

[3] http://www.linuxfoundation.org/programs/legal/compliance