THE SPDX WIKI IS NO LONGER ACTIVE. ALL CONTENT HAS BEEN MOVED TO https://github.com/spdx

Technical Team/SPDX Specification Versions

For anyone wanting to add comments/questions/etc. directly in the document, so they get tracked without having to do a lot of version reference, please put your comments on a new line and use the following syntax:

(yyyymmdd initials comments)  

for example:

(20100407 KS does this make sense?)

Version: DRAFT 20100407

(20100310 JM Can we add a date or version to the document? We should probably add a revision table when it goes live but in my view not necessary to have right now) (20100407 KS done)

Signed off:

(Approved for use by active participants in this specification effort, as indicated by name and email id)

1. Rationale

1.1. Charter

Create a set of data exchange standards to enable companies and organizations to share license and component information (metadata) for software packages and related content with the aim of facilitating license and other policy compliance.

1.2. Why is a common format for data exchange needed?

Companies and organizations (collectively “Organizations”) widely use and reuse open source and other software packages. Compliance with the associated licenses requires a set of due diligence activities that each Organization performs independently: a manual and/or automated scan of the software, identification of the associated licenses, and manual verification. Software development teams across the globe use the same open source packages, but they have not yet set up a way to collaborate on license discovery, so many groups duplicate the same work. This working group seeks to create a data exchange format so that information about software packages and related content may be collected and shared in a common format, with the goal of saving time and improving data accuracy.

1.3. What does this specification cover?

1.3.1. Identification Information: Metadata to associate analysis results with a specific package. This includes a unique identifier to permit correlation of a specific instance of this data with a specific package.

1.3.2. Overview Information: Facts that are common properties for the entire package.

1.3.3. File Specific Information: Facts that are specific to each file included in the package (copyrights, licenses).

1.3.4. Common Licenses: A standardized way of referring to the common licenses likely to be encountered.

1.3.5. ?

1.4. What is not covered?

1.4.1. Information that cannot be derived from a visual inspection of the package to be analyzed.

1.4.2. How the data stored in this file format is used. Once we agree on what should be specified, discussions on how it can be used, who will generate it, how it will be published, audited, etc., will happen outside the scope of this document.

1.4.3. ?

1.5. Format Requirements:

1.5.1. Needs to be in a syntax that humans can read and write.

1.5.2. Needs to be a syntax that tools can read and write.

1.5.3. Needs to be suitable to be checked for syntactic correctness independent of how it was generated (human or tool).

1.5.4. ? Character set to be used to support international naming. (follow Debian precedent?)

1.5.5. ? Actual specification of fields – below is illustrative rather than agreed on.

1.5.6. ? Discussion: XML vs. simple text to represent fields. The extent to which the format must be human-understandable without a tool still needs to be discussed.

2. Identification Information

2.1. One instance per package

2.2. Fields:

2.2.1. Version Number for the instance of the SPDX specification.

2.2.1.1. Purpose: the version of the SPDX specification to use when parsing the rest of the file. This permits future changes to the specification while retaining backwards compatibility.

2.2.1.2. Format: Version: N.N

2.2.1.3. Example: 1.0
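As a minimal sketch of how a tool might read this field, the following assumes the line has exactly the "Version: N.N" shape given above and that each N is one or more decimal digits (the draft does not say). The name parse_version is hypothetical and not part of the specification.

```python
import re

# Assumption: each "N" is one or more decimal digits; the draft only shows "N.N".
VERSION_LINE = re.compile(r"^Version:\s*(\d+)\.(\d+)$")


def parse_version(line: str) -> tuple[int, int]:
    """Return (major, minor) from a 'Version: N.N' line, or raise ValueError."""
    match = VERSION_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a valid Version line: {line!r}")
    return int(match.group(1)), int(match.group(2))
```

A reader could compare the returned tuple against the versions it supports before parsing the remaining fields.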

2.2.2. Unique Identifier

2.2.2.1. Purpose: An independently reproducible, agreed-upon mechanism is needed that uniquely identifies the specific package this data describes. It must make it possible to determine whether any file in the original package has been changed. Options under consideration: SHA256, ?

2.2.2.2. Format: UniqueID: ?

2.2.2.3. Example: ?
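Since SHA256 is listed only as an option under consideration, the following is an illustrative sketch rather than the agreed mechanism. It assumes the package is a single archive file (such as foo.tar) and hashes its raw bytes; the helper name package_sha256 is hypothetical.

```python
import hashlib


def package_sha256(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of the file at `path`.

    Assumption: the "package" is one archive file, hashed byte for byte.
    The file is read in chunks so large archives need not fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Anyone holding the same archive can recompute the digest independently, and a change to any byte of the archive yields a different value. Whether the identifier should instead be derived from the individual files inside the package is left open by the draft.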

2.2.3. Generation Method

2.2.3.1. Purpose: identify how this information was generated. If manual – who, if tool – identifier and version.

2.2.3.2. Format: Manual: “person name” | Tool: “tool id - version”

2.2.3.3. Examples: ?

2.2.4. Creation Time Stamp

2.2.4.1. Purpose: Identify when the analysis was done.

2.2.4.2. Format: Created: YYYYMMDD-HH:MM:SS

2.2.4.3. Example: Created: 20100129-18:30:22
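The timestamp format above maps directly onto a strptime/strftime pattern. The sketch below assumes the "Created:" prefix is part of the line and that no time zone is carried, since the draft does not mention one; parse_created and format_created are hypothetical names.

```python
from datetime import datetime

# YYYYMMDD-HH:MM:SS expressed as a strptime/strftime pattern.
CREATED_FORMAT = "%Y%m%d-%H:%M:%S"
CREATED_PREFIX = "Created:"


def parse_created(line: str) -> datetime:
    """Parse a 'Created: YYYYMMDD-HH:MM:SS' line into a naive datetime."""
    line = line.strip()
    if not line.startswith(CREATED_PREFIX):
        raise ValueError(f"missing {CREATED_PREFIX!r} prefix: {line!r}")
    return datetime.strptime(line[len(CREATED_PREFIX):].strip(), CREATED_FORMAT)


def format_created(moment: datetime) -> str:
    """Render a datetime back into the 'Created:' line format."""
    return f"{CREATED_PREFIX} {moment.strftime(CREATED_FORMAT)}"
```

Because strptime rejects out-of-range values such as month 13, this also serves as a basic syntactic check on the field.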

2.2.5. Independent Review/Audit

2.2.5.1. Purpose: reviewers of tool result, or other reviewer of original – equivalent to “signed off” or “reviewed by”.

2.2.5.2. Format: Reviewed by: “person name”

2.2.5.3. Example: ?

2.2.6. ??

3. Common Overview Information

3.1. One instance per package

3.2. Fields:

3.2.1. Formal Name

3.2.1.1. Purpose: Full name given by originator with version information.  ? Permit international extended characters in character string or restrict ?

3.2.1.2. Format: ?

3.2.1.3. Example: ?

3.2.2. Specific Package Identifier

3.2.2.1. Purpose: Machine name of package.

3.2.2.2. Format: identifier.suffix

3.2.2.3. Examples: foo.tar, foo.rpm, ?

3.2.3. Official Source Location

3.2.3.1. Purpose: identify where the original version of this package resides (at time of analysis).

3.2.3.2. Format: download URL

3.2.3.3. Example: ?

3.2.4. Declared License for Package

3.2.4.1. Purpose: use a standard way of referring to the license and its version. See Section 5.0 for standardized license short forms. If more than one license is in effect, list the license the package defaults to and indicate that an alternate license is present.

3.2.4.2. Format: identifier | other

3.2.4.3. Example: ? (something like GPL2.0)

3.2.5. License(s) Present

3.2.5.1. Purpose: list of all licenses found in files in package by scanning

3.2.5.2. Format: identifier

3.2.5.3. Example: Y

3.2.6. –removed-

3.2.7. Formal Copyright Holder of Package

3.2.7.1. Purpose: identify the author and licensor of package. ? Permit international extended characters in character string or restrict ?

3.2.7.2. Format: ?

3.2.7.3. Example: ?

3.2.8. Formal Copyright Date of Package

3.2.8.1. Purpose: Identify the date this package was created. Individual files inside package may have different copyright dates.

3.2.8.2. Format: YYYY

3.2.8.3. Example: 2010

3.2.9. ???

4. File Specific Information

4.1. One instance for every file in package

4.2. Fields:

4.2.1. File Name

4.2.1.1. Purpose: identify the path to the file that corresponds to this summary information.

4.2.1.2. Format: [directory/]filename.suffix

4.2.1.3. Example: bar/foo.c

4.2.2. File Type

4.2.2.1. Purpose: Identify common types of files where there may be different treatment of copyright and license information: source, binary, machine generated, ??

4.2.2.2. Format: ?

4.2.2.3. Example: ?

4.2.3. License(s)

4.2.3.1. Purpose: License governing file if known. This will either be explicit in file, or be expected to default to package license. Use a standard way of referring to license and its version. See Section 5.0 for standardized license references. If more than one in effect, list all licenses.

4.2.3.2. Format: ? [identifier,]* [identifier | “string”]

4.2.3.3. Example: GPL2.0,BSD,“xyz license type”
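A sketch of how the comma-separated license list might be split, given that the format is still marked as undecided. It assumes commas inside a quoted string do not separate entries and that straight and curly double quotes are interchangeable; parse_license_list is a hypothetical name.

```python
# Assumption: straight and curly double quotes are treated alike.
QUOTES = '"\u201c\u201d'


def parse_license_list(value: str) -> list[str]:
    """Split 'id,id,"free text"' into entries, honoring quoted strings."""
    items: list[str] = []
    current: list[str] = []
    in_quotes = False
    for char in value:
        if char in QUOTES:
            in_quotes = not in_quotes  # quote marks delimit but are not kept
        elif char == "," and not in_quotes:
            items.append("".join(current).strip())
            current = []
        else:
            current.append(char)
    if in_quotes:
        raise ValueError("unterminated quoted license name")
    items.append("".join(current).strip())
    if any(not item for item in items):
        raise ValueError("empty license entry")
    return items
```

The result does not record whether an entry was a standard identifier or a quoted free-text name; a real tool would likely need to keep that distinction so identifiers can be checked against the table in Section 5.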

4.2.4. Copyright(s)

4.2.4.1. Purpose: identify the copyright holders and associated dates of their copyright that are in this specific file if known. Note: Copyright holder identifier may have developer names, companies, email addresses, so we’ll probably need a generic string mechanism (including international characters). Since there may be multiple per file, need a way of having separators between them.

4.2.4.2. Format: [ “copyright holder”:“date(s)” ]*

4.2.4.3. Example: “Linus Torvalds”:“1996-2010”
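The repeated holder/date pairs can be pulled out with a regular expression. This sketch assumes entries are separated by whitespace (the draft notes a separator is still needed) and accepts straight or curly double quotes; parse_copyrights is a hypothetical name, and the date string is returned unparsed.

```python
import re

# Assumption: straight and curly double quotes are treated alike.
_QUOTE = '["\u201c\u201d]'
_NOT_QUOTE = '[^"\u201c\u201d]'
ENTRY = re.compile(rf"{_QUOTE}({_NOT_QUOTE}+){_QUOTE}\s*:\s*{_QUOTE}({_NOT_QUOTE}+){_QUOTE}")


def parse_copyrights(value: str) -> list[tuple[str, str]]:
    """Return (holder, dates) pairs found in a Copyright(s) field value."""
    return [
        (match.group(1).strip(), match.group(2).strip())
        for match in ENTRY.finditer(value)
    ]
```

Holder names are kept as arbitrary Unicode strings, which matches the note above about developer names, companies, and email addresses. Text that does not fit the pair pattern is silently skipped here; a stricter checker would reject it.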

4.2.5. ?

5. Standard License Identifiers

5.1. Rationale for choosing which licenses receive standardized identifiers: focus on standardizing the most commonly used licenses rather than all of them. Align with any other standardization efforts already underway that meet the need.

5.2. Table of standard licenses and their identifiers

Identifier | Full name | Official Source Text
GPL2.0 | GNU General Public License (GPL) Ver. 2, June 1991 | http://www.gnu.org/copyleft/gpl.html
GPL3.0 | ...

6. Definitions

6.1. Package: ...

6.2. Date range: [YYYY,]*[YYYY-]YYYY. A syntax for multiple ranges is still needed.
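Read literally, the date range syntax allows any number of single years followed by one final year or one final range. The sketch below implements exactly that reading, so it rejects multiple ranges, which the draft notes are not yet covered; parse_date_range is a hypothetical name.

```python
import re

# Literal reading of [YYYY,]*[YYYY-]YYYY: only the last element may be a range.
DATE_RANGE = re.compile(r"^(?:\d{4},)*(?:\d{4}-)?\d{4}$")


def parse_date_range(value: str) -> list[tuple[int, int]]:
    """Return (start_year, end_year) spans; a single year yields (y, y)."""
    if not DATE_RANGE.match(value):
        raise ValueError(f"not a valid date range: {value!r}")
    spans = []
    for part in value.split(","):
        start, _, end = part.partition("-")
        spans.append((int(start), int(end or start)))
    return spans
```

For example, parse_date_range("1991,1996-2010") gives [(1991, 1991), (1996, 2010)], while "1991-1993,1996" raises ValueError under this grammar.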