Technical Team/Best Practices
This is a placeholder for working on the Best Practices document.
- 1 Introduction
- 2 Interpreting the Specification
- 3 Tools
- 4 Contributing to SPDX
- 5 Producing
- 5.1 Package Name (4.1)
- 5.2 Package Supplier (4.4); Package Originator (4.5); Source Information (4.10)
- 5.3 Package Download Location (4.6)
- 5.4 Concluded License (4.11); Declared License (4.13)
- 5.5 Extracted Text (5.2)
- 5.6 File Name (6.1)
- 5.7 Author vs. Creator (3.1) vs. Reviewer (7.1)
- 5.8 Use of Parentheses in License Tag Fields
- 6 Consuming
- 7 Notes from LinuxCon 2013 17 Sept 2013
- 8 Examples
Interpreting the Specification
Clarify and help with what is in the spec. Structure sections around the spec?
Best practices around using the SPDX tools
Contributing to SPDX
how to provide feedback, get involved, etc
Jack: This should be moved to the website.
SPDX Version: 1.2
PURPOSE: The SPDX specification is meant to stand on its own and to make clear how a field is to be populated. Still, there are times when more clarification is required. This tech note provides clarification with regard to certain fields about which questions have arisen. Some of these clarifications may be rolled into future versions of the specification.
Package Name (4.1)
The package name should be exclusive of version number. Field 4.2 Package Version is intended for version number and the package name should not redundantly specify this information.
Correct:
- Package Name: glibc
- Package Version: 2.11.1
Incorrect:
- Package Name: glibc 2.11.1
- Package Version: 2.11.1
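The rule above can be checked mechanically. The following is a minimal sketch, not part of any SPDX tool: the helper name `name_repeats_version` is hypothetical, and splitting the name on whitespace, hyphens, and underscores is an assumption about how a version is usually appended to a name.

```python
import re


def name_repeats_version(package_name: str, package_version: str) -> bool:
    """Return True if the version also appears as a token of the package name.

    Hypothetical helper for illustration only. It assumes a version is
    attached to a name with whitespace, a hyphen, or an underscore.
    """
    version = package_version.strip()
    if not version:
        # Nothing to compare against.
        return False
    tokens = re.split(r"[\s_\-]+", package_name.strip())
    return version in tokens


if __name__ == "__main__":
    print(name_repeats_version("glibc", "2.11.1"))
    print(name_repeats_version("glibc 2.11.1", "2.11.1"))
```

A producer could run a check like this before writing the Package Name field and move any flagged version into Package Version instead.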
Package Supplier (4.4); Package Originator (4.5); Source Information (4.10)
The first two fields are intended to indicate where the package came from and what entity created it. In many cases these will be one and the same, but it is possible that the supplier may have gotten the package from another source.
Wind River supplies the Linux kernel.
- Package Supplier: Wind River
- Package Originator: linux.org
Source Information is a freeform field which, like many such fields in SPDX, is there to allow the document creator to provide information they feel would be useful or important, but which may not fit neatly into the specification.
Package Download Location (4.6)
The intent of this field is to indicate the URL of the location from which the package is actually obtained. Generally this should be the originating site of the package, but in cases where the package was obtained from a mirror site, the URL of the mirror should be used. The format for the URL should follow RFC conventions, specifically RFC3986 or any newer one that may eventually augment or obsolete it.
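As a rough illustration of the format requirement, the sketch below uses Python's standard `urllib.parse.urlsplit` to check that a download location has both a scheme and a host. The function name `looks_like_download_url` and the example URL are hypothetical, and this is only a loose syntactic check, not full RFC3986 validation.

```python
from urllib.parse import urlsplit


def looks_like_download_url(location: str) -> bool:
    """Loose check that a location has a URI scheme and a host part.

    Hypothetical helper: it does not verify that the URL is reachable
    and does not enforce every rule of RFC3986.
    """
    parts = urlsplit(location.strip())
    return bool(parts.scheme) and bool(parts.netloc)


if __name__ == "__main__":
    # Illustrative mirror URL, not a real download site.
    print(looks_like_download_url("http://mirror.example.org/glibc/glibc-2.11.1.tar.gz"))
    print(looks_like_download_url("glibc-2.11.1.tar.gz"))
```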
Concluded License (4.11); Declared License (4.13)
In cases where there is a contradiction between the Declared License and some other license present, the concluded license should represent that contradiction, and best practice would be to explain further in the 4.14 Comments on License field.
LEGAL TEAM SHOULD REVIEW THIS
A GPL 2 package that contains files licensed under Apache 2.0.
- Declared License: GPL-2.0
- Concluded License: GPL-2.0 and Apache-2.0
- Comments on License: Several Apache-licensed files (A, B, and C) are included in the package, causing an incompatibility with the licensing of the package.
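One way a producing tool might apply this practice is to flag packages whose concluded license differs from the declared one while the comments field is empty. This is a minimal sketch under that assumption; `needs_license_comment` is a hypothetical name, and the comparison is a plain string comparison rather than any legal judgment.

```python
def needs_license_comment(declared: str, concluded: str, comment: str = "") -> bool:
    """Return True when the licenses differ and no explanation was given.

    Hypothetical helper: it compares the two field values as text only.
    """
    differs = declared.strip() != concluded.strip()
    return differs and not comment.strip()


if __name__ == "__main__":
    print(needs_license_comment("GPL-2.0", "GPL-2.0 and Apache-2.0"))
    print(needs_license_comment("GPL-2.0", "GPL-2.0 and Apache-2.0",
                                "Files A, B, and C are Apache licensed."))
```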
Extracted Text (5.2)
To clarify, or reinforce: the text included here should be the exact text that is included with the package, and no more. Some early SPDX tools included the full text of the relevant license even though the full text was not supplied in the actual package. The example in the specification is one where the full text is included, but if the text in the package is incomplete, so should be the text in the Extracted Text field.
A copying file in the top level of the directory says, “This software is licensed under the Beer-ware License.”
Correct:
- Extracted Text: This software is licensed under the Beer-ware License
Incorrect:
- “THE BEER-WARE LICENSE” (Revision 42): <phk@FreeBSD.ORG> wrote this file. As long as you retain this notice you can do whatever you want with this stuff. If we meet some day, and you think this stuff is worth it, you can buy me a beer in return. Poul-Henning Kamp
File Name (6.1)
To clarify, the format for the file name should follow URI RFC conventions, specifically RFC3986 or any newer one that may eventually augment or obsolete it. Specifically, it uses the relative path reference format of a URI, and is defined as being relative to the root of the package from which the file came.
In RFC3986, section 4.2, a relative path reference must not start with a slash character ("/"). Relative references also do not need to start with a "./" (dot slash), although there is one format for which the preceding "./" is necessary. In any case, RFC3986 is clear about how to handle dot-segments, and in the case of "./", it is simply removed.
Acceptable:
- FileName: ./package/foo.c
- FileName: package/foo.c
Not acceptable:
- FileName: /package/foo.c
- FileName: //package/foo.c
Note about RFC3986:
- This document obsoletes [RFC2396], which merged "Uniform Resource Locators" [RFC1738] and "Relative Uniform Resource Locators" [RFC1808] in order to define a single, generic syntax for all URIs.
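The file name rules above can be sketched in a few lines. Both function names are hypothetical. The colon case reflects RFC3986 section 4.2, which requires a leading "./" when the first path segment contains a colon; that is taken here to be the "one format" mentioned above, which is an assumption about the author's intent.

```python
def is_relative_file_name(file_name: str) -> bool:
    """Return True if the name is non-empty and does not start with a slash."""
    return bool(file_name) and not file_name.startswith("/")


def strip_dot_slash(file_name: str) -> str:
    """Remove leading "./" segments, keeping one where RFC3986 requires it.

    Hypothetical helper. A first segment containing a colon would be
    mistaken for a URI scheme, so the "./" prefix is kept in that case.
    """
    stripped = file_name
    while stripped.startswith("./"):
        stripped = stripped[2:]
    first_segment = stripped.split("/", 1)[0]
    if ":" in first_segment:
        return "./" + stripped
    return stripped


if __name__ == "__main__":
    for name in ("./package/foo.c", "package/foo.c", "/package/foo.c", "//package/foo.c"):
        print(name, is_relative_file_name(name))
```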
Author vs. Creator (3.1) vs. Reviewer (7.1)
Author: The term "author" is used occasionally in the text of the specification and generally refers to the creator(s) of the package. There is no explicit field for the author information other than copyright information and URL, which may or may not reflect the actual original author. In section 2.2.1, there is also a reference to an "SPDX Author"; this actually refers to the SPDX Creator.
Creator: The SPDX Creator is defined in section 3.1 and is used to identify who (or what, in the case of a tool) created the SPDX file.
Reviewer: The SPDX Reviewer is defined in section 7.1 and is used to identify who reviewed the content of the SPDX file.
To put things into context, let's consider the following flow:
- An author creates a package and assigns a license to it, hopefully to each individual file, and also follows all of the best practices and obligations for the chosen or found licenses in the package.
- An SPDX creator analyzes the content of the package, extracts the pertinent information and assembles the SPDX file, whether manually or using a tool.
- An SPDX reviewer inspects the work of the SPDX creator, whether manually or using a tool. Consider the reviewer to be anything from another set of eyes, to a recognized reliable entity with some kind of sign-off authority.
The SPDX creator is a mandatory parameter: every SPDX file must have a creator. However, not every SPDX file is reviewed.
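To illustrate the mandatory creator and optional reviewer, here is a minimal sketch that collects `Creator:` and `Reviewer:` lines from tag-value text. The function name and the sample document (including the person and tool names) are hypothetical, and the line-by-line parsing is a simplification that ignores multi-line values.

```python
def creators_and_reviewers(tag_value_text: str) -> tuple[list[str], list[str]]:
    """Collect the values of Creator and Reviewer tags, in document order.

    Hypothetical helper: it treats each line as one "Tag: value" pair
    and skips anything that does not match.
    """
    creators: list[str] = []
    reviewers: list[str] = []
    for line in tag_value_text.splitlines():
        tag, sep, value = line.partition(":")
        if not sep:
            continue  # not a tag-value line
        if tag.strip() == "Creator":
            creators.append(value.strip())
        elif tag.strip() == "Reviewer":
            reviewers.append(value.strip())
    return creators, reviewers


# Illustrative document fragment; the names are made up.
SAMPLE = """SPDXVersion: SPDX-1.2
Creator: Person: Jane Doe
Creator: Tool: ExampleScanner-1.0
"""

if __name__ == "__main__":
    found_creators, found_reviewers = creators_and_reviewers(SAMPLE)
    print("has creator:", bool(found_creators))
    print("was reviewed:", bool(found_reviewers))
```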
Use of Parentheses in License Tag Fields
In the RDF object model, licenses can be defined to be nested disjunctive or conjunctive sets in a very flexible manner. However, when using the Tag value format, it is not clear how, if at all, one could use the parentheses to define more complex licensing scenarios. It is not the intention to restrict either one of the formats, hence we view these additional examples as acceptable:
- LicenseConcluded: LGPL-2.0
- LicenseConcluded: (LGPL-2.0 or LicenseRef-2)
- LicenseConcluded: (LGPL-2.0 and (LicenseRef-2 or LicenseRef-3))
- LicenseConcluded: ((LicenseRef-2 or (LicenseRef-3 and LicenseRef-4)) and LGPL-2.0)
- LicenseConcluded: (LGPL-2.0)
- LicenseConcluded: LGPL-2.0 or LicenseRef-2
- LicenseConcluded: (LGPL-2.0 and LicenseRef-2 or LicenseRef-3)
Conceptually, much like in the RDF format, there are no limits to the depth of nested license sets, although practically, more than 2 levels are improbable.
This applies to all license fields.
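The nesting described above can be made concrete with a small recursive-descent parser that turns a license field value into nested sets. This is a sketch, not the SPDX tools' implementation: the function name and the tuple output shape are hypothetical, only lowercase "and" and "or" are recognized, and for an unparenthesized mix of the two it assumes "and" binds tighter than "or", a precedence this document does not define.

```python
import re


def parse_license_expression(text: str):
    """Parse a license expression into nested ("and"/"or", [parts]) tuples.

    Hypothetical helper. A single license is returned as a plain string.
    Assumption: "and" binds tighter than "or" when no parentheses are given.
    """
    tokens = re.findall(r"[()]|[^\s()]+", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_atom():
        tok = take()
        if tok == "(":
            inner = parse_or()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return inner
        if tok in (")", "and", "or"):
            raise ValueError(f"unexpected token {tok!r}")
        return tok

    def parse_and():
        parts = [parse_atom()]
        while peek() == "and":
            take()
            parts.append(parse_atom())
        return parts[0] if len(parts) == 1 else ("and", parts)

    def parse_or():
        parts = [parse_and()]
        while peek() == "or":
            take()
            parts.append(parse_and())
        return parts[0] if len(parts) == 1 else ("or", parts)

    result = parse_or()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return result


if __name__ == "__main__":
    print(parse_license_expression("(LGPL-2.0 and (LicenseRef-2 or LicenseRef-3))"))
```

Under these assumptions, redundant outer parentheses such as `(LGPL-2.0)` parse to the same result as the bare identifier.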
Best practices around the process of doing it. Examples of how this is done.
Notes from LinuxCon 2013 17 Sept 2013
What should be in a best practices, how does it relate to the spec?
- particular questions (sort of like a FAQ)
- Could start with things that are not well defined but end up in the specification
- I need a field for X, it's not there, what field could I use?
- best practices around the specification and best practices around contributing to SPDX. Maybe two documents?
- Snapshot best practices document at intervals and post on site. Use wiki for active discussions, new proposals, etc.
- Should we have a getting started guide?
- best practices for meta tagging like U-Boot did. Maybe link it in here, but it should be a separate page. Could possibly include other information for developers supporting SPDX and producing SPDX-friendly code. Look at things like U-Boot, Mozilla, etc.
This section contains examples of working with SPDX documents.
Jack: In doing this I was looking for open source projects to use, but I'm now thinking we should "create" our own projects. We could right-size them for the examples, and we could take the code from other OS projects and store the examples and the SPDX docs in our Git. These docs could then become a form of validation examples, so people generating SPDX docs could compare against them?
Jack: I'm also wondering if the examples below should be "singular" in nature. That is, they illustrate one concept?
Converting 1.2 to 2.0
Jack: Show an SPDX 1.2 document and then a version of it using 2.0, and explain how one would convert. We can use the SPDX tools as the example. Not sure this belongs here. Could be its own thing? May fit better here if there are things you can do when converting that help.
Simple 2.0 Document
Jack: Just a simple use case. Small number of files, say fewer than 5. Easy to get your head around. Bill: see https://github.com/spdx/tools/blob/develop/TestFiles/SPDXTagExample-v2.0.spdx
Using the SPDX License List
- a site which provides a pick list of SPDX licenses, and based on choice stores the declared license for an OSS project
- an API for a site that returns the declared license of a project (e.g. github repo) in terms of SPDX license identifier
- a LICENSES.TXT file in a codebase that states the licensing for the code referencing SPDX license identifiers
- a comment section in a source file that states the licensing for the file referencing SPDX license identifiers
- Package Managers / Distros that state licensing in files specific to that package manager
- a package manager (e.g. Rubygems) that encourages use of SPDX identifiers by validating .gemspec files?
- a distro that encourages use of SPDX identifiers in their spec files (e.g. Debian DEP-5 format, while not an SPDX doc, can/may use SPDX license identifiers)?
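For the source-file comment case in the list above, a consumer might extract the stated license from a tag line. The sketch below assumes the `SPDX-License-Identifier:` tag convention adopted by U-Boot; this page does not prescribe a comment format, so the tag, the function name `find_license_tag`, and the sample source are all assumptions for illustration.

```python
import re

# Assumed tag format (U-Boot style); not defined by this page.
_TAG = re.compile(r"SPDX-License-Identifier:\s*(?P<expr>[^\r\n]*)")


def find_license_tag(source_text: str):
    """Return the license expression from the first tag line, or None.

    Hypothetical helper: it reads only the first tag in the text and
    drops a trailing C-style comment terminator if one is present.
    """
    match = _TAG.search(source_text)
    if match is None:
        return None
    expr = match.group("expr").strip()
    if expr.endswith("*/"):
        expr = expr[:-2].strip()
    return expr or None


# Made-up source file used only for this example.
SAMPLE_SOURCE = """/*
 * SPDX-License-Identifier: GPL-2.0+
 */
int main(void) { return 0; }
"""

if __name__ == "__main__":
    print(find_license_tag(SAMPLE_SOURCE))
```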
Multiple SPDX Documents in one
Jack: Use this for a test case of multiple documents related to one another and multiple documents in one.
External SPDX Documents
Jack: Binary referencing external source spdx documents. We could use SPDX tools here?
Below are some pairs of an SPDX document and the external SPDX document it references, illustrating a few scenarios:
- Binary Jar file
- java source files it was built from
- original java source files (on apache.org)
- repackaged java source files (from a distro)
- original java source files
- patched (modified) java source files (having added a security fix)
Annotations (replaces Reviewer Comments)
Jack: Simple example with reviewer comments added after the fact.
Jack: Example showing use of all file types, especially when there are "choices".
“SOURCE” | “BINARY” | “ARCHIVE” | “APPLICATION” | “AUDIO” | “IMAGE” | “TEXT” | “VIDEO” | “DOCUMENTATION” | “SPDX” | “OTHER”
2.0 Spec allows for multiple filetypes
- a .jar file containing compiled .class files could be described as BINARY and ARCHIVE
- a .java file could be both SOURCE and TEXT
- a binary PDF file could be both BINARY and DOCUMENTATION
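Since a file may carry several types, a tool can validate each value against the list above. This is a minimal sketch; the names `ALLOWED_FILE_TYPES` and `invalid_file_types` are hypothetical, and the allowed values are copied from the list on this page.

```python
# Allowed values, taken from the file type list above.
ALLOWED_FILE_TYPES = frozenset({
    "SOURCE", "BINARY", "ARCHIVE", "APPLICATION", "AUDIO", "IMAGE",
    "TEXT", "VIDEO", "DOCUMENTATION", "SPDX", "OTHER",
})


def invalid_file_types(file_types):
    """Return the sorted list of values that are not allowed file types.

    Hypothetical helper: an empty result means every value is valid.
    """
    return sorted(set(file_types) - ALLOWED_FILE_TYPES)


if __name__ == "__main__":
    # A .jar file described with two types, as in the example above.
    print(invalid_file_types(["BINARY", "ARCHIVE"]))
    # "SCRIPT" is not in the list, so it is reported.
    print(invalid_file_types(["SOURCE", "TEXT", "SCRIPT"]))
```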