THE SPDX WIKI IS NO LONGER ACTIVE. ALL CONTENT HAS BEEN MOVED TO https://github.com/spdx

Legal Team/only-operator-proposal

From SPDX Wiki
Revision as of 04:28, 7 September 2017 by Jlovejoy (Talk | contribs)


Introduction to Issue

Historical Background

Originally, the SPDX License List listed variations of licenses as separate "line items". Notably, various versions of L/GPL had two listed items: one for a specific version only, and one indicating a version or later. This was indicated in the list via the inclusion of the words "only" and "or later" in the full name, and via the short identifier: GPL-2.0 and GPL-2.0+ respectively. (Examples throughout this page use GPL-2.0, but this issue applies to the entire family of GNU licenses, i.e., GPL, LGPL, FDL, AGPL.)

For example, the pre-version-2.0 SPDX License List looked like this:

  GNU General Public License v2.0 only	  GPL-2.0	
  GNU General Public License v2.0 or later	  GPL-2.0+

When the license expression syntax was introduced in version 2.0 of SPDX, licenses with the "or later" indication were removed as separate listed licenses ("deprecated"), as the "or later" option could now be expressed with the + operator. This also enabled the use of the + operator with other licenses, as applicable.

The ability to create license expressions using the + operator along with the with operator also solved the issue of under-representation of license-exception combinations.
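
As a rough illustration of how the + and with operators combine, the sketch below splits a simple expression into its parts. The function name `split_simple_expression` is hypothetical, the sketch assumes the operator is written as uppercase `WITH` surrounded by single spaces, and it does not handle AND, OR, or parentheses; it is not the grammar from the SPDX specification.

```python
def split_simple_expression(expr):
    """Split '<license>[+] [WITH <exception>]' into its parts.

    Hypothetical helper for illustration only: handles a single license
    identifier, an optional trailing '+', and an optional WITH clause.
    """
    license_part, _, exception = expr.strip().partition(" WITH ")
    or_later = license_part.endswith("+")
    if or_later:
        license_part = license_part[:-1]  # drop the trailing '+'
    return {
        "license": license_part,
        "or_later": or_later,
        "exception": exception or None,
    }


parts = split_simple_expression("GPL-2.0+ WITH Classpath-exception-2.0")
print(parts["license"], parts["or_later"], parts["exception"])
```

Under this reading, a license-exception combination no longer needs its own line item on the list: the license, the "or later" flag, and the exception are carried as separate parts of one expression.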

However, this also created a new issue: the standard header, which is usually where the "or later" or "only" option is indicated in practice, could no longer be differentiated or captured directly. It also left the full license name containing "only", with no real way to change that to "or later" in the full license name when the + operator is used.

The original argument for the + operator was that it could be used with other licenses (not just the GNU family). However, as far as anyone recalls, we did not at that time conduct a full analysis of the other licenses with "or later" language or how they work in practice.

Other Licenses with "or later" Clauses

Not all licenses that provide for later versions treat their application in the same way or as explicitly as the GNU family of licenses does. A list of the licenses with text related to later versions, the relevant text, and an analysis of the licenses that have "or later version" type language can be found on this page: https://wiki.spdx.org/view/Legal_Team/later-version-clauses

Notably, most licenses that reference the possibility of later versions can be read to say you can take and redistribute the work under the license you found it with or any other later version with no explicit option of 'this version only.'

The CDDL family is slightly different in that the CDDL is “or later” by default and “only” with an explicit notice prohibiting later versions. There's currently no only operator for the CDDL, although some code (e.g. parts of FreeBSD's ZFS implementation) does declare that prohibition while other code (e.g. some illumos userspace tools) does not (and is therefore presumably CDDL-1.0+).

Issue

In practice, practitioners are not always explicit about whether a license applies under the "only" or the "or later" option. Also, because the "only" or "or later" distinction is often found outside the actual license text (e.g., in a license notice or header in the source files), machines (and humans) need a way to identify the license file itself without determining (yet) if the package is "only" or "or later".

The SPDX specification provides a way to distinguish between the license information detected in files and the concluded file license, as well as similar fields at the package level, when creating an SPDX document. But before getting to that point, tools must be able to identify precisely what is found in the files without having to draw a conclusion. Also, as discussed in the 2017-08-08 meeting, some ecosystems restrict themselves to license expressions (e.g. npm allows SPDX 2.0 license expressions), and some tools attempt to detect package licenses by looking only at the license files and not at license-grant blurbs (e.g. Licensee, which is used by the GitHub license API).

Examples / Challenges

The scenarios below map out various examples and how one would identify the License Information in File using SPDX short identifiers and bearing in mind machine reading for this task. Some of these scenarios have been discussed on the various calls on this topic as to how one would identify the license short identifier for each file currently (and under the proposal, see below). In any case, these scenarios need to have clear indication as to appropriate interpretation:

  1. you find: 1 text file with the license text of GPL-2.0; and 4 source files with the standard header for GPL-2.0, which include the language "any later version"
    1. 1 text file with license text of GPL-2.0 = GPL-2.0
    2. 4 source files with standard header for GPL-2.0, which include the language "any later version" = GPL-2.0+
    3. Concluded package license = GPL-2.0+
    4. Note: in the current reality, GPL-2.0 for the license text file would mean "only", yet one cannot determine from that file alone if the copyright holder intends to use the version only or the or later option. Also note, that a machine can positively identify the "or later" option via the use of recommended standard license header.
  2. you find: 1 file with the license text of GPL-2.0; and 4 source files with the standard header for GPL-2.0, which omits the language "any later version"
    1. 1 text file with license text of GPL-2.0 = GPL-2.0
    2. 4 source files with standard header for GPL-2.0, which omits the language "any later version" = GPL-2.0
    3. Concluded package license = GPL-2.0
    4. Note: Same comment as #1 re: license text file. Also note, that a machine can positively identify the "only" option via the use of recommended standard license header which omits the "any later version" text.
  3. you find: no license file; 4 source files with the statement, “this is licensed under GPL"
    1. 4 source files = GPL-1.0+
    2. Concluded package = GPL-1.0+
    3. Note: The determination of GPL-1.0+ would most likely need to be made by a human. But given the language of the GPL stating, "If the Program does not specify a version number of the license, you may choose any version ever published by the Free Software Foundation." and no version indicated in any manner whatsoever, we can feel reasonably confident that any version can be applied and the short identifier GPL-1.0+ covers any version.
  4. you find: 1 license file with GPL-2.0 license text; 4 source files with no license information whatsoever
    1. 1 text file with license text of GPL-2.0 = GPL-2.0
    2. 4 source files with no license information whatsoever = NONE (as per SPDX specification)
    3. Question: Is including a copy of a particular version of the license "the Program specif[ying] a version number of the license which applies to it"?

This approach will also fail to distinguish between MPL-2.0 and MPL-2.0-no-copyleft-exception, since that distinction is based on the blurbs. And there is currently no way to express “I just found these licenses and am not sure what the package license is” using only license expressions.
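
The per-file identifications in scenarios 1 to 3 above can be sketched as a small classifier. This is a deliberately naive illustration, not a real license scanner: the function name `classify_notice` is hypothetical, it only knows about GPL version 2 wording, it relies on plain substring and regular-expression matching, and it uses the current (pre-proposal) meaning of the identifiers. Anything it cannot place is reported as NOASSERTION rather than guessed.

```python
import re


def classify_notice(text):
    """Map a source-file license notice to an SPDX short identifier.

    Hypothetical helper: simple text matching only, GPL v2 wording only.
    """
    normalized = " ".join(text.lower().split())
    if "gpl" not in normalized and "general public license" not in normalized:
        return "NONE"  # no license information found in the file
    # "version 2" or "version 2.0", but not e.g. "version 2.1"
    names_v2 = re.search(r"version 2(\.0)?\b(?!\.\d)", normalized) is not None
    names_any_version = re.search(r"version \d", normalized) is not None
    if names_v2 and "any later version" in normalized:
        return "GPL-2.0+"  # scenario 1: header includes "any later version"
    if names_v2:
        return "GPL-2.0"  # scenario 2: header omits "any later version"
    if not names_any_version:
        return "GPL-1.0+"  # scenario 3: no version stated at all
    return "NOASSERTION"  # some other version: outside this sketch


print(classify_notice("this is licensed under GPL"))  # GPL-1.0+
```

Scenario 4 is intentionally left out: a source file with no license information yields NONE here, and whether the accompanying license file alone settles the package license is the open question noted above.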

Goals

  • Provide way to get to "GPL-2.0-only" (or the like) identifier to enable better clarity for that option/scenario
  • Ensure we accommodate the reality of other licenses with language re: later versions such that all options can be represented appropriately
  • Ensure that there is some amount of consistency with what the short identifiers mean / Don't completely break the meaning for current users

Proposed Solution

Based on the joint legal/tech call on 17 August 2017, we coalesced around a solution similar to "alternative option" from 8 Aug joint call:

For licenses that may include “only” or “or later” (or both) clauses in the license text, having an explicit + and an explicit only to use in SPDX identifiers reduces the chance that an unfamiliar user misinterprets the identifier, and provides a way to be explicit and accurate that was arguably not available before. This leaves the ability to use the "plain" license identifier (without an operator) to indicate the existence of the license text itself or other scenarios where it is not clear whether the "or later" or "only" options are articulated by the copyright holder.

  • Create an "only" operator, defined to indicate that 'only the stated version of the license is intended for use'.
    • add new "only" operator to Appendix IV: SPDX License Expressions of spec and explanatory language
  • The + operator would remain with the same definition.
  • In the spec, clarify that the license ID used by itself would indicate that the license text itself should be used to determine if later versions of the license could be used.
    • This avoids SPDX having to make an interpretation as to what the licenses mean/intend.
    • Allows for use of GPL-2.0 when you find just the license file with the text of GPL-2.0 and still provides option for Concluded License to be GPL-1.0+ if no other info is found
  • Remove "only" from full name of GPL family.
  • This solution allows the license text to speak for itself and uses the 'only' and '+' operators to be explicit in intent. It also avoids conflicting operators by retaining the "plain" license identifier.
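
To make the three cases concrete, the sketch below shows one way a tool might read an identifier under this proposal. It is an assumption-laden illustration: the function name `interpret` and the mode labels are hypothetical, and the "-only" suffix spelling is assumed from the "GPL-2.0-only" example in the Goals section; the proposal itself does not fix the operator syntax.

```python
def interpret(identifier):
    """Return (base_identifier, mode) for a single license identifier.

    Hypothetical modes:
      "or-later"              explicit '+' operator
      "only"                  explicit 'only' operator (assumed '-only' suffix)
      "defer-to-license-text" plain identifier, no operator
    """
    if identifier.endswith("+"):
        return identifier[:-1], "or-later"
    if identifier.endswith("-only"):
        return identifier[: -len("-only")], "only"
    return identifier, "defer-to-license-text"


for example in ("GPL-2.0+", "GPL-2.0-only", "GPL-2.0"):
    print(example, "->", interpret(example))
```

The plain identifier deliberately maps to its own mode rather than to "only" or "or-later", which mirrors the point that SPDX would not be interpreting what the license means when no operator is given.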

Potential issues:

  • Backwards compatibility: changing GPL-2.0 from meaning 'v2-only' to meaning 'whatever the license says' creates some inconsistency, as the default position of the license text with no other info is "or later" or "any version", not "only". However, it was pointed out on the call that people conscientious enough to be using GPL-2.0 correctly will likely be conscientious enough to add the "-only" operator as needed once it's available. All seemed to agree that most use of GPL-2.0 is probably not thought through as to "or later" or "only", and thus in these situations there is no change in meaning.
  • Does not make clear indication of "ambiguous" situation to push people towards using 'only' and '+' operators to be explicit (preferred). However, we don't have a clear indication of ambiguity now either (is ambiguity ever clear?).
  • L/GPL language is clear that it is one or the other, so this could create more confusion and not reflect the intention of the license. However, all agreed that when a machine scans code with no license info other than a file with the license text of GPL-2.0, the situation is not entirely clear.