THE SPDX WIKI IS NO LONGER ACTIVE. ALL CONTENT HAS BEEN MOVED TO https://github.com/spdx

Technical Team/Proposals/Rough proposal for provenance, hierarchy and aggregation, and supply chain friendliness in SPDX 2.0

From SPDX Wiki

Revision as of 18:47, 13 February 2012

There is a desire for SPDX to be capable of expressing:

  1. Hierarchy (package A contains packages B, C, etc.)
  2. Authentication (we can know precisely who said what, and when, about a package)
  3. How software flows through a supply chain (upstream to packager, through several intermediate vendors to consumer)

A rough example of this thought is shown in the diagram below, showing how the coreutils package might be represented:

 <img src="http://spdx.org/system/files/spdxdoodle_0.jpg" alt="" width="586" height="372" />

 The simple story behind this diagram is this:

  1. The upstream maintainer of coreutils provides an SPDX file which:
    1. Provides information for the copyrighted entity that is the package as a whole
    2. Provides embedded information for the copyrighted entity that is each file in the package (same format, just embedded and clearly lower in the hierarchy)
    3. Provides a coreutils.spdx.sig file with the signature for the coreutils.spdx file (so we can authenticate it)
  2. This coreutils.spdx file is in the coreutils.tar.gz for the upstream
  3. The rpm (or deb) packager creates a coreutils.spdx (distinct from the one for the upstream) in the rpm file which:
    1. Provides information for the copyrighted entity that is the rpm (or deb) package as a whole
    2. Provides embedded information for the copyrighted entity that is each file (such as patch files) contained in the rpm (or deb) package
    3. For the coreutils.tar.gz file (also contained in the rpm or deb package), provides its SPDX information by *referencing* the coreutils.spdx in the coreutils.tar.gz file.
    4. Optionally provides an Annotation section to 'annotate' some of the information provided by the coreutils upstream.

 

Diagram of a concrete (but very, very rough) proposal for the structure. Note that the notes in the diagram that say 'Concrete' or 'Referential' just indicate an 'or' in the document structure:

 

<img src="http://spdx.org/system/files/spdxdoodle2_1.jpg" alt="" />

Description of diagram

  • Top: Simple top level place to start
  • SPDXFile: File containing SPDX data
  • SPDXElement: The containing element for SPDX data for a given copyrightable work contained in the SPDXFile.  It's SPDXElements all the way down.
  • Specifier: Not really a node, more of a grouping of nodes to indicate the fields which specify the 'thing' the SPDX Element is about
  • LicenseData: Not really a node, more of a grouping of nodes to indicate the fields which specify what we know about the 'thing' this SPDX Element is about
  • SPDXElements: Zero or more additional 'contained' SPDX Elements referring to contained things (like files, contained tarballs, etc.).
  • Annotations: Zero or more annotations indicating additional information about the contained SPDXElements (to handle the case where a contained SPDX Element represents a reference to another SPDX file that is signed and thus can't be changed directly). Note: we need more thought here.
  • Creator (Annotation): Equivalent to SPDX 1.0 Creation Information Creator
  • Date (Annotation): Equivalent to SPDX 1.0 Creation Information Created
  • Comment (Annotation): Equivalent to SPDX 1.0 Creation Information Creator Comment
  • AssertNewLicense (Annotation): Reference to new License Data you wish to assert to override existing SPDX License Data.  Generally used in situations where we have existing License Data from a more primary source but have reason to believe otherwise.
  • Name: Equivalent to SPDX 1.0 Formal Name
  • Version: Equivalent to SPDX 1.0 Package Version Information
  • Supplier: Equivalent to SPDX 1.0 Package Supplier
  • Summary: Equivalent to SPDX 1.0 Package Summary Description
  • Description: Equivalent to SPDX 1.0 Package Detailed Description
  • URI (in SPDXElement->Specifier): URI of the copyrightable thing being referenced, may point to a file, an archive, a package, etc.
  • URICheckSum (in SPDXElement->Specifier): Checksum for the thing URI points to
  • CopyrightText: Equivalent to SPDX 1.0 CopyrightText
  • LicenseText: Full text of license if LicenseShortForm isn't available
  • LicenseShortForm: License short form in lieu of license text, if available
  • SPDXFileURI: If the SPDX Element does not contain its own concrete license data but references an external SPDX File, the URI of that SPDXFile
  • SPDXFileSigURI: If the SPDX Element references an external SPDXFile, the URI of the sig file for that SPDX file
  • ACL: I hate the name ACL, but basically it's a way of specifying that you are including or excluding some of the copyrightable bits that are covered by the referenced SPDX File.
  • Exclude (in ACL): Used to specify the parts of the stuff referenced by the external SPDX file that you are not bringing in.  For example, if I am using all of a package except foo.c and bar.c.
  • ExcludeAll (in ACL): Used to indicate that *none* of the referenced copyrightable items from the SPDX file are used except those explicitly included.
  • Include (in ACL): Used after an ExcludeAll to indicate we are only using the specifically included files, say if we are just using foobar.c.
  • SPDXFileSig: Separate file containing the signature for the octets of the SPDXFile
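
The ACL semantics described above (Exclude, ExcludeAll, Include) can be sketched roughly as follows. This is only an illustration of one possible reading: the ACL class, its field names, and the files_in_use helper are hypothetical and not part of the proposal.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ACL:
    """Hypothetical model of the ACL node; the field names are assumptions."""
    exclude_all: bool = False                       # ExcludeAll
    exclude: Set[str] = field(default_factory=set)  # Exclude entries
    include: Set[str] = field(default_factory=set)  # Include entries


def files_in_use(referenced_files: List[str], acl: ACL) -> List[str]:
    """Return the referenced files the ACL leaves in use, preserving order."""
    if acl.exclude_all:
        # ExcludeAll: nothing is used except what is explicitly included.
        return [f for f in referenced_files if f in acl.include]
    # Otherwise everything is used except what is explicitly excluded.
    return [f for f in referenced_files if f not in acl.exclude]


referenced = ["foo.c", "bar.c", "foobar.c"]
# Using all of a package, but not foo.c or bar.c.
all_but_two = files_in_use(referenced, ACL(exclude={"foo.c", "bar.c"}))
# Using only foobar.c.
only_one = files_in_use(referenced, ACL(exclude_all=True, include={"foobar.c"}))
```

Under this reading, Include entries only have an effect after an ExcludeAll, which matches the description of Include above.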

 

This can also be visualized with a UMLish diagram:

 <img src="http://spdx.org/system/files/spdxdoodle_1.jpg" alt="" />
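
As a rough sketch, the same structure could be modeled with Python dataclasses. The class and attribute names follow the node names described above, but several things here are assumptions made for illustration: flattening the Specifier and LicenseData groupings into plain attributes, treating the presence of spdx_file_uri as the 'Referential' case, and the file name example.patch.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class LicenseData:
    copyright_text: Optional[str] = None
    license_short_form: Optional[str] = None
    license_text: Optional[str] = None  # full text when no short form exists


@dataclass
class Annotation:
    creator: str
    date: str
    comment: str = ""
    assert_new_license: Optional[LicenseData] = None


@dataclass
class SPDXElement:
    # Specifier fields: what 'thing' this element is about.
    uri: str
    uri_checksum: Optional[str] = None
    name: Optional[str] = None
    version: Optional[str] = None
    supplier: Optional[str] = None
    # Concrete case: license data is embedded in this element.
    license_data: Optional[LicenseData] = None
    # Referential case: license data lives in an external, signed SPDX file.
    spdx_file_uri: Optional[str] = None
    spdx_file_sig_uri: Optional[str] = None
    # It's SPDXElements all the way down.
    elements: List["SPDXElement"] = field(default_factory=list)
    annotations: List[Annotation] = field(default_factory=list)

    def is_referential(self) -> bool:
        return self.spdx_file_uri is not None

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "SPDXElement"]]:
        """Yield (depth, element) for this element and everything it contains."""
        yield depth, self
        for child in self.elements:
            yield from child.walk(depth + 1)


# The rpm packager's view of coreutils: one embedded patch file plus a
# reference to the upstream SPDX file inside coreutils.tar.gz.
upstream_ref = SPDXElement(
    uri="coreutils.tar.gz",
    spdx_file_uri="coreutils.tar.gz!coreutils.spdx",
    spdx_file_sig_uri="coreutils.tar.gz!coreutils.spdx.sig",
)
patch = SPDXElement(uri="example.patch", license_data=LicenseData())
rpm = SPDXElement(
    uri="coreutils.rpm",
    name="coreutils",
    license_data=LicenseData(),
    elements=[patch, upstream_ref],
)
outline = [(depth, element.uri) for depth, element in rpm.walk()]
```

Walking the tree this way gives the nesting that replaces the SPDX 1.0 notion of a flat package-plus-files document.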

Mapping SPDX 1.0 Fields to Proposal

TBD

| SPDX 1.0 Section | SPDX 1.0 Field | SPDX 2.0 Proposal Element | SPDX 2.0 Proposal Attribute | Notes |
| --- | --- | --- | --- | --- |
| SPDX Document Information | | SPDX | | Incorporated into SPDX Element |
| Document Information | Version | SPDX | Version | |
| Document Information | Data License | | | TBD |
| Creation Information | Creator | Annotation | Creator | |
| Creation Information | Created | Annotation | Date | |
| Creation Information | Comment | Annotation | Comment | |
| Package Information | Formal Name | Concrete Specifier | Name | |
| Package Information | Package Version Information | Concrete Specifier | Version | |
| Package Information | Package File Name | | | If the SPDX data is outside the package, this can be specified with a contained SPDX Element with a Referential Specifier. If the SPDX information is inside the package, this field is undesirable. |
| Package Information | Package Supplier | Concrete Specifier | Supplier | |
| Package Information | Package Originator | | | Note: As the SPDX 2.0 proposal correctly handles the notion of 'things' being repackaged along the way via nesting, this field is no longer necessary. The coreutils.tar.gz upstream is the supplier for coreutils.tar.gz. Someone like Fedora could be the supplier for coreutils.rpm, which would refer to the SPDX data from coreutils.tar.gz. Full provenance removes the need for this field. |
| Package Information | Package Download Location | | | TBD |
| Package Information | Package Verification Code | | | SPDX Sig File + Referential Specifier URIChecksum |
| Package Information | Package Checksum | | | TBD |
| Package Information | Source Information | | | Handle as an Annotation |
| Package Information | Concluded License | License Data | | Note: Concluding a license different from what is declared upstream of you is handled via Annotations |
| Package Information | All Licenses Information From Files | | | Handled by SPDX Element nesting; not even desirable in a world where an upstream consumer may choose to pick some but not all of the contained parts |
| Package Information | Declared License | License Data | | |
| Package Information | Comments on License | | | Handle as an Annotation |
| Package Information | Copyright Text | | | TBD |
| Package Information | Package Summary Description | Concrete Specifier | Summary | |
| Package Information | Package Detailed Description | Concrete Specifier | Description | |
| Other License Information Detected | Identifier Assigned | | | TBE |
| Other License Information Detected | Extracted Text | License Data | | TBE |
| File Information | File Name | Referential Specifier | URI | |
| File Information | File Type | | | TBD |
| File Information | File Checksum | Referential Specifier | URIChecksum | |
| File Information | Concluded License | License Data | | |
| File Information | License Information in File | License Data | | |
| File Information | Comment on License | | | Handled by Annotations |
| File Information | Copyright Text | | | TBD |
| File Information | Artifact of Project Name | | | TBD |
| File Information | Artifact of Project Homepage | | | TBD |
| File Information | Artifact of Project URI | | | TBD |
| Review Information | Reviewer | Annotation | Creator | Handled by SPDX nesting and Annotations. |
| Review Information | Review Date | Annotation | Date | Handled by SPDX nesting and Annotations. |
| Review Information | Comments | Annotation | Comment | Handled by SPDX nesting and Annotations. |

 

 

Stealing Java's archive URI syntax:

In the Java world, a URI syntax of the form <archivename>!<filename> is commonly used to indicate a particular file within an archive, for example foo.tar.gz!bar.c.  I suggest we use this (as there are no other widespread alternatives).
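
A minimal sketch of splitting such a URI into its archive and member parts is shown below. The function name split_archive_uri is hypothetical, and splitting on the first '!' (so that nested archives can be peeled one level at a time) is an assumption rather than something the proposal specifies.

```python
from typing import Optional, Tuple


def split_archive_uri(uri: str) -> Tuple[str, Optional[str]]:
    """Split '<archivename>!<filename>' into (archive, member).

    Returns (uri, None) when the URI does not point inside an archive.
    """
    archive, separator, member = uri.partition("!")
    if not separator:
        return uri, None
    return archive, member


simple = split_archive_uri("foo.tar.gz!bar.c")
plain = split_archive_uri("bar.c")
# Nested archives keep the remaining path intact for a second split.
nested = split_archive_uri("coreutils.rpm!coreutils.tar.gz!coreutils.spdx")
```

Calling the function again on the member part of a nested URI yields the next level down.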

Archiving multiple SPDX files:

It may be desirable to archive multiple SPDX files together so that we have full resolvability for a given package.  In that case, they should be rolled into a specific archive format (say foo.spdx.zip) containing the various SPDX files, their sig files, and an index file.  The index file should map URIs (as specified in the SPDX files) to filenames in the archive (so we can resolve them).

 

Example:

file://coreutils.tar.gz!coreutils.spdx: file://upstream/coreutils.spdx

file://coreutils.tar.gz!coreutils.spdx.sig: file://upstream/coreutils.spdx.sig
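
Reading an index like the example above could look roughly like this. The line format is an assumption (one entry per line, the URI first, an optional colon after it, then the location in the archive, with no whitespace inside either part), and parse_index is a hypothetical helper. Conflicting entries for the same URI are rejected rather than resolved.

```python
from typing import Dict, Iterable


def parse_index(lines: Iterable[str]) -> Dict[str, str]:
    """Map each URI named in an SPDX file to its location in the rollup archive."""
    index: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines between entries
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"expected '<uri> <location>', got: {line!r}")
        uri, location = parts
        uri = uri.rstrip(":")  # tolerate an optional colon after the URI
        if uri in index and index[uri] != location:
            raise ValueError(f"conflicting entries for {uri!r}")
        index[uri] = location
    return index


index = parse_index([
    "file://coreutils.tar.gz!coreutils.spdx: file://upstream/coreutils.spdx",
    "",
    "file://coreutils.tar.gz!coreutils.spdx.sig file://upstream/coreutils.spdx.sig",
])
```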

Note: We also have to solve the problem of distinguishing cases where two SPDX files contain the same URI (say, relative to their archives) but need to be resolved to separate files in the rollup, for example if both upstream and the rpm (or deb) packager used file://coreutils.spdx.