THE SPDX WIKI IS NO LONGER ACTIVE. ALL CONTENT HAS BEEN MOVED TO https://github.com/spdx

Difference between revisions of "GSOC/GSOC ProjectIdeas"

(Add Golang RDF Saver project)
 
(78 intermediate revisions by 7 users not shown)
Line 1: Line 1:
 
<br />
 
<br />
  
<span style="font-size:150%">'''Welcome to the 2018 SPDX Google Summer of Code Project Page'''</span>
+
<span style="font-size:150%">'''Welcome to the 2022 SPDX Google Summer of Code Project Page'''</span>
  
 
See the [https://rtgdk.github.io/spdx-gsoc-proposal.html proposal template] if you are interested in submitting a Google Summer of Code proposal.
 
See the [https://rtgdk.github.io/spdx-gsoc-proposal.html proposal template] if you are interested in submitting a Google Summer of Code proposal.
Line 20: Line 20:
  
 
* [https://spdx.org/using-spdx License List and Short Identifiers]
 
* [https://spdx.org/using-spdx License List and Short Identifiers]
* [https://spdx.org/using-spdx SPDX Specification for generating SPDX Doucments in either RDF or Tag/Value format]
+
* [https://spdx.org/using-spdx SPDX Specification for generating SPDX Documents in multiple formats]
 
* [https://spdx.org/tools A set of basic tools for working with SPDX Documents]
 
* [https://spdx.org/tools A set of basic tools for working with SPDX Documents]
 
* [https://spdx.org/using-spdx License Identifiers in source]
 
* [https://spdx.org/using-spdx License Identifiers in source]
Line 32: Line 32:
 
= Getting Involved =
 
= Getting Involved =
  
Beyond working wth your mentor(s) we highly encourage students who select one of these projects to get involved with the SPDX community via our technical working group. Interaction with the technical team is primarily done via its mailing list (see resources). There is however a weekly call you could join as well. All of the daily work for the Tech team is done on this wiki.
+
Beyond working with your mentor(s) we highly encourage students who select one of these projects to get involved with the SPDX community via our technical working group. Interaction with the technical team is primarily done via its mailing list and on gitter (see resources). There is however a weekly call you could join as well. .
 
+
  
 
== Resources ==
 
== Resources ==
Line 42: Line 41:
 
* [https://lists.spdx.org/mailman/listinfo/spdx-tech SPDX tech mailing list]
 
* [https://lists.spdx.org/mailman/listinfo/spdx-tech SPDX tech mailing list]
  
 +
= Ideas for 2022 Projects =
  
=SPDX Workgroup Tooling Projects=
+
== SBOM Conformance Checker ==
These projects are aimed at contributing to the SPDX tools to help reduce the effort to create SPDX and increase the accuracy of the SPDX documents.
+
The goal of this project is to create a simple tool that
 
+
checks whether an SBOM (in SPDX format)
==Add a License XML Editor==
+
conforms to NTIA's minimum elements guidance.
The SPDX license list (see https://spdx.org/licenses/) is maintained by the SPDX legal team. The source for the license list is maintained in the SPDX license-list-XML github repository (https://github.com/spdx/license-list-XML). Making changes to the license requires manually editing an XML file which can be challenging for contributors not familiar with XML.
+
=== Description ===
 
+
The SPDX Specification defines a number of fields (elements) that may appear in an SBOM (Software Bill of Materials).
An online program could be created which would take as input a license XML file uploaded from the client machine to the server.  The editor would allow editing of all of the XML fields as well as the optional and alternate text properties.  The tool would allow a person to then make changes to the XML formatted license (changes to both the XML fields as well as text) by utilizing a simple UI instead of needing to understand how XML works.
+
Not all of them are mandatory, however, so SBOMs in SPDX format can vary greatly.
 
+
The following usage scenarios could be supported:
+
 
+
* Import an existing license from the SPDX license-list-XML github repository, edit the license XML and create a pull request with the edit changes
+
* Upload a license XML file, edit it and download the updated license XML file
+
* Copy and past a license XML file into a text box on the web page, edit the page, copy/paste the updated license XML file
+
 
+
This project could be implemented in Python in the existing online tools.
+
====Skills Needed====
+
* Development skills in the Python language
+
* Knowledge of Git
+
* Understanding of XML
+
====Available Mentors====
+
[mailto:gary@sourceauditor.com Gary O'Neall] [mailto:rohit.lodhartg@gmail.com Rohit Lodha]
+
 
+
==Add New License Submital Feature to Online Tools==
+
The SPDX license list (see https://spdx.org/licenses/) is maintained by the SPDX legal team.  The source for the license license list is maintained in the SPDX license-list-XML github repository (https://github.com/spdx/license-list-XML).  Currently, new licenses are submitted through emails to the SPDX legal team and the license XML files are manually generated.
+
 
+
An online program could be created which would take all the relevant data (e.g. license name, URL to original link, license text), notify the legal team of the submission and create a pull request with the proper license XML format.  Additional, optional useful features could include comparing the license text to existing licenses to avoid duplication, allow demarking optional text and tracking the approval status.
+
  
This project could be implemented in Python in the existing online tools.
+
While researching the attributes that have to be present in an SBOM,
====Skills Needed====
+
NTIA came up with a guidance about the minimum elements that must appear therein:
* Development skills in the Python language
+
https://www.ntia.doc.gov/files/ntia/publications/sbom_minimum_elements_report.pdf
* Knowledge of Git
+
* Understanding of XML
+
====Available Mentors====
+
[mailto:gary@sourceauditor.com Gary O'Neall] [mailto:rohit.lodhartg@gmail.com Rohit Lodha]
+
  
==Update Parser Libraries to SPDX 2.1 for GO==
+
It would, therefore, be useful to have a tool that can determine whether an SBOM stored in SPDX format
Update one of the SPDX GO libraries to the SPDX 2.1 specification.  The SPDX 2.1 specification is a major upgrade from SPDX 1.2 supporting relationships between SPDX documents and SPDX elements.
+
fulfills all such minimum obligations.
====Skills Needed====
+
* Development skills in the GO language
+
* Experience with parser development
+
* Understanding of RDF and XML
+
====Background Information====
+
SPDX currently provides libraries supporting the reading and writing of SPDX document.  Currently, only Java libraries support the new SPDX 2.1 specification.  The Python libraries and the GO libraries support version 1.2 of the spec.  The libraries must support both RDF/XML import/export as well as tag/value import/export.  The [[https://github.com/spdx/tools-go SPDX git repository]] SPDX Tools project contains the source code for the libraries.
+
  
====Available Mentors====
+
The tool should make use of the already existing libraries for reading SPDX documents
[mailto:gary@sourceauditor.com Gary O'Neall]
+
=== Technologies ===
 +
Python
 +
=== Duration ===
 +
This will be a short (175 hours) project.
 +
It might be extended to a long (350 hours) project if integration
 +
with the existing SPDX handling tools (e.g., the Validation tool)
 +
is also implemented.
 +
=== Mentors ===
 +
Dick Brooks, Kate Stewart
  
 +
== Private license management system ==
 +
A web-based system for managing license texts; similar to the SPDX License List but oriented towards other private collections of licenses.
 +
=== Description ===
 +
The goal of the project would be to create a simple web application
 +
for people to upload license texts
 +
and automatically create a license repository.
 +
The initial rough "functional specifications" describe it as
 +
mainly an input form, where the information is entered.
 +
There will be some automatic processing (e.g., canonicalization, duplicate avoidance, etc.),
 +
a review/approval (and naming) step,
 +
and then publishing in a specified format.
  
==Update Python SPDX library to SPDX 2.1==
+
It should be noted that the specification is not yet finalized
Update one of the SPDX Python libraries to the SPDX 2.1 specification.  The SPDX 2.1 specification is a major upgrade from SPDX 1.2 supporting relationships between SPDX documents and SPDX elements.
+
regarding naming namespaces, way to publish licenses, etc.
====Skills Needed====
+
If the SPDX project has already advanced in these definitions,
* Development skills in the Python language
+
this project will obviously implement the decisions taken.
 +
=== Technologies ===
 +
Python (any framework) for the back-end; JavaScript (any framework) for the minimal front-end.
 +
=== Duration ===
 +
This can be either a short (175 hours) project, implementing only the basic functionality;
 +
or a long (350 hours) one, implementing more functionality and automation.
 +
=== Mentors ===
 +
Alexios Zavras; more TBD
  
====Background Information====
+
== SBOM combiner ==
SPDX currently provides libraries supporting the reading and writing of SPDX document. Currently, only Java libraries support the new SPDX 2.1 specification.  The Python library support version 1.2 of the spec.  The library must support primarily the tag/value import/export and also the RDF/XML import/export.  The [[https://github.com/spdx/tools-python SPDX git repository]] SPDX Tools project contains the source code for this library.
+
The project will result in a simple command-line tool that will be able to “combine” information from a number of SBOMs into a comprehensive SBOM that includes all the information of the provided ones.
 +
An actual use case would be the generation of an SBOM for an actual software delivery that is comprised by a number of components, each one of which has its own correct SBOM.
 +
=== Description ===
 +
The primary purpose of this tool would be
 +
to stitch together smaller component-level SPDX documents
 +
and amalgamate them into one top-level SPDX document
 +
representing a "sum of parts" piece of software.
 +
As an initial pass for implementation, the component-level SBOMs would have to be provided by the caller
 +
until the tool was advanced enough to fetch SPDX Documents referenced by ExternalDocumenRef reliably.  
 +
=== Technologies ===
 +
Python (preferably); or Go.
 +
=== Duration ===
 +
This will be a short (175 hours) project.
 +
=== Mentors ===
 +
Rose Judge; others TBD
  
====Available Mentors====
+
== Update of Java SPDX libraries to handle latest spec ==
[mailto:pombredanne@nexb.com Philippe Ombredanne] [mailto:rohit.lodhartg@gmail.com Rohit Lodha]
+
=== Description ===
 +
The SPDX Project maintains a library, written in Java, for working with SPDX data.
 +
The development of the library does not always follow the development of the specification immediately.
 +
Since the specification has evolved
 +
and a newer version is expected to be published
 +
right before the timeframe of the project,
 +
it would be useful to have the standard Java libraries
 +
capable of handling the latest spec.
  
==Add support for SPDX license expression to Python library==
+
The project will involve obviously understanding deeply
Update the [[https://github.com/spdx/tools-python]|SPDX Python library]] to fully support license expression.
+
the existing libraries
====Skills Needed====
+
and extending them to handle the latest additions
* Development skills in the Python language
+
of the specification (to the point of the published version).
 +
=== Technologies ===
 +
Java; see https://github.com/spdx/Spdx-Java-Library
 +
=== Duration ===
 +
This will be a short (175 hours) project.
 +
=== Mentors ===
 +
TBD
  
====Background Information====
+
== Update of Go SPDX libraries to handle latest spec ==
See https://github.com/spdx/tools-python/issues/10 and https://github.com/nexB/license-expression/
+
=== Description ===
====Available Mentors====
+
The SPDX Project maintains a library, written in Go, for working with SPDX data.
[mailto:pombredanne@nexb.com Philippe Ombredanne] [mailto:rohit.lodhartg@gmail.com Rohit Lodha]
+
The development of the library does not always follow the development of the specification immediately.
 +
Since the specification has evolved
 +
and a newer version is expected to be published
 +
right before the timeframe of the project,
 +
it would be useful to have the standard Go libraries
 +
capable of handling the latest spec.
  
== Port SPDX license expression library to Ruby, JavaScript and Java==
+
The project will involve obviously understanding deeply
The [[https://github.com/nexB/license-expression/]|licens_expressionlibrary]] provides comprehensive support license expression using a boolean engine for Python.
+
the existing libraries
The goal of this project is to port and/or package this library for JavaScript, Ruby and Java, considering either code conversion tools, alternative Python implementations (e.g. Jython) or calling Python from another language to bring the same features to these other languages.
+
and extending them to handle the latest additions
====Skills Needed====
+
of the specification (to the point of the published version).
* Development skills in Python, Java, Ruby, JavaScript.
+
=== Technologies ===
 +
Go; see https://github.com/spdx/tools-golang
 +
=== Duration ===
 +
This will be a short (175 hours) project.
 +
=== Mentors ===
 +
TBD
  
====Background Information====
+
== SPDX Golang RDF Saver ==
See https://github.com/spdx/tools-python/issues/10 and https://github.com/nexB/license-expression/
+
=== Description ===
====Available Mentors====
+
SPDX already has a Golang library to save RDF triples into a file/string
[mailto:pombredanne@nexb.com Philippe Ombredanne]
+
using the gordf project: https://github.com/spdx/gordf
  
 +
The aim of this GSoC project would be to write an adapter in the
 +
SPDX Golang Tools (the tools-golang repository at https://github.com/spdx/tools-golang) that
 +
would take an SPDX Document struct (see https://github.com/spdx/tools-golang/blob/main/spdx/document.go) as
 +
an input, and serialize it and its child elements into RDF triples to be consumed by the
 +
aforementioned gordf rdf-writer.
 +
=== Technologies ===
 +
Golang; RDF
 +
=== Duration ===
 +
This will be a short (175 hours) project. If the project requires less than 175 hours, remaining time can be spent on
 +
additional improvements to the Golang tools.
 +
=== Mentors ===
 +
Rishabh Bhatnagar; Steve Winslow as secondary / backup
  
==Build Tool SPDX File Generators==
 
Support a continuous integration (CI) generation of SPDX files by creating a plugins or extensions to build tools.  These plugins or extensions  will generate valid SPDX documents based on the build file metadata and source files.
 
 
 
====Skills Needed====
 
* Experience developing parser/scanners
 
* Experience with the specific build tools
 
====Background Information====
 
Many build environments include license information in their metadata but do not produce sufficient information for good license compliance.  By adding SPDX generation to these build environments, high quality licensing information can be captured in a way which is easily used by downstream users of the code.  Following is a partial list of popular build environments/package managers which do not have an SPDX generation capability:
 
* MSBuild
 
* PIP
 
* NPM (Note: NPM does include SPDX compliance license information and tools)
 
* DEB
 
The Yocto build environment currently has some SPDX file generation capabilities, but there is a need for some additional work to integrate some of the existing tools into a more complete integrated toolset.  The [https://github.com/goneall/spdx-maven-plugin SPDX Maven Plugin] is an example of an existing build tool SPDX generator.
 
====Available Mentors====
 
[mailto:gary@sourceauditor.com Gary O'Neall]
 
[mailto:pombredanne@nexb.com Philippe Ombredanne]
 
  
=SPDX Specification Projects=
+
== Update of Python SPDX libraries to handle latest spec ==
The following projects contribute directly to the creation or validation of the SPDX 2.1 specification.
+
=== Description ===
 +
The SPDX Project maintains a library, written in Python, for working with SPDX data.
 +
The development of the library does not always follow the development of the specification immediately.
 +
Since the specification has evolved
 +
and a newer version is expected to be published
 +
right before the timeframe of the project,
 +
it would be useful to have the standard Python libraries
 +
capable of handling the latest spec.
  
== SPDX Specification in MarkDown ==
+
The project will involve obviously understanding deeply
Migrate the specification from Google docs to GitHub+MarkDown based toolchain capable of generating HTML, PDF and EPUB
+
the existing libraries
====Skills Needed====
+
and extending them to handle the latest additions
* Understanding of documentation tooling
+
of the specification (to the point of the published version).
* Web-development skills to style HTML version
+
=== Technologies ===
====Background Information====
+
Python; see https://github.com/spdx/tools-python
The [https://spdx.org/specifications 2.1 SPDX specification] PDF and HTML version have several issues.
+
=== Duration ===
1. Navigation through both document is difficult as a index is missing
+
This will be a short (175 hours) project.
2. Switching to GitHub+MarkDown will remove friction for contributors to comment/amend the specification. Common workflow within the OSS community
+
=== Mentors ===
====Available Mentors====
+
TBD
[mailto:kstewart@linuxfoundation.org Kate Stewart]
+
[mailto:thomas.steenbergen@here.com Thomas Steenbergen]
+
  
  
== SPDX Specification Wiki Examples of Package Managers ==
+
== More to come... ==
SPDX specification describes on a high level how to describe package, files and snippets but lack examples how to capture the use of package managers
+
Mentors:  please fill out the following template for any projects you wish to propose.  
====Skills Needed====
+
* Understanding of package managers
+
====Background Information====
+
To encourage adoption of SPDX it should be clear how to encode the use of common programming language package managers within SPDX. The aim of this project is to create example per build tool/package manager so that not only as example to the community but also form the input for SPDX tech team discussions and future tooling development
+
  
Initial package managers:
+
=== Project Name ===
* Bower
+
add overview of project here
* CocoaPods
+
====Skills Needed====
* Gradle
+
what skills should the student have to do the coding exercises
* gem
+
====Duration===
* gitmodules
+
whether this is a short or a long project
* Maven
+
====Background Information====
* npm
+
context for the project and references to be studied
* PyPi
+
====Available Mentors====
* sbt
+
list individuals who are willing to mentor and provide information about the project proposal.
* NuGet
+
  
====Available Mentors====
+
= Historical info =
[mailto:thomas.steenbergen@here.com Thomas Steenbergen]
+
[mailto:stewart@linux.com Kate Stewart]
+
  
== SPDX Specification Views for legal counsels and developers ==
+
[[GSOC/PastProjectIdeas]]
The proposal is to see if it possible to deduct large SPDX documents into a small subset SPDX document providing a specific reduced "views" on larger data.
+
====Skills Needed====
+
* Understanding of compliance needs of legal counsels and developers so we can remove friction to adopt SPDX
+
====Background Information====
+
SPDX documents commonly contain 100s, if not 1000s of entries making it hard for a human to make manual corrections or draw conclusions. No scanner can provide 100% complete data human corrections are usual needed. The aim from this proposal is twofold:
+
1. Enable developers with a "code view" of tool-generated SPDX document close to the code they work on to enable them to make corrections to the SPDX data. For instance amend SPDX package tag values or model package dependencies not detected by used scanner.
+
2. Provide legal counsels with a "package and limited file view" to enable legal conclusions
+
====Available Mentors====
+
[mailto:thomas.steenbergen@here.com Thomas Steenbergen]
+
[mailto:ybronshteyn@blackducksoftware.com Yev Bronshteyn]
+

Latest revision as of 17:36, 31 March 2022


Welcome to the 2022 SPDX Google Summer of Code Project Page

See the proposal template if you are interested in submitting a Google Summer of Code proposal.

Should you have questions please do not hesitate to contact one of the mentors directly.



What is SPDX ?

First and foremost we are a community dedicated to solving the issues and problems around Open Source licensing and compliance. The SPDX work group (part of the Linux Foundation) consists of individuals, community members, and representatives from companies, foundations and organizations who use or are considering using the SPDX standard. The work group operates much like a meritocratic, consensus-based community project; that is, anyone with an interest in the project can join the community, contribute to the specification, and participate in the decision-making process. We come from many different backgrounds including open source developers, lawyers, consultants and business professionals, many of whom have been involved with license compliance and identification for years.

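As part of this effort we have developed a set of collateral that can be used:

* License List and Short Identifiers (https://spdx.org/using-spdx)
* SPDX Specification for generating SPDX Documents in multiple formats (https://spdx.org/using-spdx)
* A set of basic tools for working with SPDX Documents (https://spdx.org/tools)
* License Identifiers in source (https://spdx.org/using-spdx)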

Why choose an SPDX Project?

Contributing to one of the SPDX projects below will provide a valuable contribution to developers and/or users of open source software. We believe you will find the projects both technically challenging and rewarding. In essence, we believe you will be able to look back one day and say, "I was part of that effort."


Getting Involved

Beyond working with your mentor(s), we highly encourage students who select one of these projects to get involved with the SPDX community via our technical working group. Interaction with the technical team happens primarily via its mailing list and on Gitter (see Resources). There is also a weekly call you could join.

Resources

Ideas for 2022 Projects

SBOM Conformance Checker

The goal of this project is to create a simple tool that checks whether an SBOM (in SPDX format) conforms to NTIA's minimum elements guidance.

Description

The SPDX Specification defines a number of fields (elements) that may appear in an SBOM (Software Bill of Materials). Not all of them are mandatory, however, so SBOMs in SPDX format can vary greatly.

While researching the attributes that have to be present in an SBOM, NTIA published guidance on the minimum elements that must appear in one: https://www.ntia.doc.gov/files/ntia/publications/sbom_minimum_elements_report.pdf

It would, therefore, be useful to have a tool that can determine whether an SBOM stored in SPDX format fulfills all such minimum obligations.

The tool should make use of the already existing libraries for reading SPDX documents.
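As a rough illustration of the kind of check described above, here is a minimal sketch in Python. It assumes the SBOM has already been loaded into a plain dictionary shaped like the SPDX JSON serialization; a real tool would obtain the data through the existing SPDX libraries instead. The mapping from the NTIA data fields (Supplier Name, Component Name, Version of the Component, Other Unique Identifiers, Dependency Relationship, Author of SBOM Data, Timestamp) to SPDX keys is an assumption made for this example, and `check_minimum_elements` is a hypothetical name.

```python
from typing import Dict, List

# Assumed mapping from NTIA per-component data fields to SPDX JSON keys.
PACKAGE_FIELDS = {
    "Supplier Name": "supplier",
    "Component Name": "name",
    "Version of the Component": "versionInfo",
    "Other Unique Identifiers": "SPDXID",
}


def check_minimum_elements(doc: Dict) -> List[str]:
    """Return a list of problems; an empty list means no gaps were found."""
    problems = []

    # Document-level fields: author, timestamp, and at least one relationship.
    creation = doc.get("creationInfo", {})
    if not creation.get("creators"):
        problems.append("document: missing Author of SBOM Data")
    if not creation.get("created"):
        problems.append("document: missing Timestamp")
    if not doc.get("relationships"):
        problems.append("document: missing Dependency Relationship")

    # Per-package fields.
    for index, package in enumerate(doc.get("packages", [])):
        label = package.get("name") or f"package #{index}"
        for element, key in PACKAGE_FIELDS.items():
            if not package.get(key):
                problems.append(f"{label}: missing {element}")
    return problems
```

Returning a list of findings rather than a single boolean lets the caller report every gap at once, which is probably more useful to someone repairing an SBOM.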

Technologies

Python

Duration

This will be a short (175 hours) project. It might be extended to a long (350 hours) project if integration with the existing SPDX handling tools (e.g., the Validation tool) is also implemented.

Mentors

Dick Brooks, Kate Stewart

Private license management system

A web-based system for managing license texts; similar to the SPDX License List but oriented towards other private collections of licenses.

Description

The goal of the project would be to create a simple web application for people to upload license texts and automatically create a license repository. The initial rough "functional specifications" describe it as mainly an input form, where the information is entered. There will be some automatic processing (e.g., canonicalization, duplicate avoidance, etc.), a review/approval (and naming) step, and then publishing in a specified format.

Note that the specification is not yet finalized regarding the naming of namespaces, the way licenses are published, etc. If the SPDX project has already settled these definitions, this project will implement the decisions taken.
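The automatic processing step (canonicalization and duplicate avoidance) could be sketched as follows in Python. This is a deliberately crude illustration: it only lower-cases the text and collapses whitespace before hashing, which is far simpler than any real license matching rules. The names `canonicalize`, `fingerprint`, and `LicenseStore` are hypothetical and not part of any SPDX tool.

```python
import hashlib
import re
from typing import Dict, Optional


def canonicalize(text: str) -> str:
    """Lower-case the text and collapse every run of whitespace to one space."""
    return re.sub(r"\s+", " ", text).strip().lower()


def fingerprint(text: str) -> str:
    """Hash the canonical form so equivalent texts share one key."""
    return hashlib.sha256(canonicalize(text).encode("utf-8")).hexdigest()


class LicenseStore:
    """In-memory stand-in for the license repository (assumption)."""

    def __init__(self) -> None:
        self._by_fingerprint: Dict[str, str] = {}

    def submit(self, name: str, text: str) -> Optional[str]:
        """Store the license; return the name of an existing duplicate, if any."""
        key = fingerprint(text)
        existing = self._by_fingerprint.get(key)
        if existing is not None:
            return existing
        self._by_fingerprint[key] = name
        return None
```

A submission that returns a name would be routed to a reviewer as a likely duplicate rather than published as a new entry.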

Technologies

Python (any framework) for the back-end; JavaScript (any framework) for the minimal front-end.

Duration

This can be either a short (175 hours) project, implementing only the basic functionality; or a long (350 hours) one, implementing more functionality and automation.

Mentors

Alexios Zavras; more TBD

SBOM combiner

The project will result in a simple command-line tool that will be able to “combine” information from a number of SBOMs into a comprehensive SBOM that includes all the information of the provided ones. A typical use case would be the generation of an SBOM for a software delivery that is composed of a number of components, each of which has its own correct SBOM.

Description

The primary purpose of this tool would be to stitch together smaller component-level SPDX documents and amalgamate them into one top-level SPDX document representing a "sum of parts" piece of software. As an initial pass for implementation, the component-level SBOMs would have to be provided by the caller until the tool is advanced enough to reliably fetch SPDX Documents referenced by ExternalDocumentRef.
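The initial pass, where the caller supplies all component-level SBOMs, might look roughly like the Python sketch below. It assumes each SBOM is a dictionary shaped like the SPDX JSON serialization and that every package carries an `SPDXID`. Because identifiers are only unique within one document, the sketch appends a per-document suffix to avoid collisions. It does not carry over relationships between packages or resolve external document references. The function name `combine` is hypothetical.

```python
from typing import Dict, List, Sequence


def combine(name: str, documents: Sequence[Dict]) -> Dict:
    """Merge the packages of several SPDX-like dicts into one top-level dict."""
    packages: List[Dict] = []
    relationships: List[Dict] = []

    for index, doc in enumerate(documents):
        suffix = f"-doc{index}"  # keeps identifiers unique across inputs
        for package in doc.get("packages", []):
            merged = dict(package)  # shallow copy so the input is untouched
            merged["SPDXID"] = package["SPDXID"] + suffix
            packages.append(merged)
            # The combined document describes every component package.
            relationships.append({
                "spdxElementId": "SPDXRef-DOCUMENT",
                "relationshipType": "DESCRIBES",
                "relatedSpdxElement": merged["SPDXID"],
            })

    return {
        "SPDXID": "SPDXRef-DOCUMENT",
        "name": name,
        "packages": packages,
        "relationships": relationships,
    }
```

A fuller implementation would also rewrite the identifiers inside each input's own relationships so that dependency information survives the merge.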

Technologies

Python (preferably); or Go.

Duration

This will be a short (175 hours) project.

Mentors

Rose Judge; others TBD

Update of Java SPDX libraries to handle latest spec

Description

The SPDX Project maintains a library, written in Java, for working with SPDX data. The development of the library does not always follow the development of the specification immediately. Since the specification has evolved and a newer version is expected to be published right before the timeframe of the project, it would be useful to have the standard Java libraries capable of handling the latest spec.

The project will involve gaining a deep understanding of the existing libraries and extending them to handle the latest additions to the specification (up to the published version).

Technologies

Java; see https://github.com/spdx/Spdx-Java-Library

Duration

This will be a short (175 hours) project.

Mentors

TBD

Update of Go SPDX libraries to handle latest spec

Description

The SPDX Project maintains a library, written in Go, for working with SPDX data. The development of the library does not always follow the development of the specification immediately. Since the specification has evolved and a newer version is expected to be published right before the timeframe of the project, it would be useful to have the standard Go libraries capable of handling the latest spec.

The project will involve gaining a deep understanding of the existing libraries and extending them to handle the latest additions to the specification (up to the published version).

Technologies

Go; see https://github.com/spdx/tools-golang

Duration

This will be a short (175 hours) project.

Mentors

TBD

SPDX Golang RDF Saver

Description

SPDX already has a Golang library to save RDF triples into a file/string using the gordf project: https://github.com/spdx/gordf

The aim of this GSoC project would be to write an adapter in the SPDX Golang Tools (the tools-golang repository at https://github.com/spdx/tools-golang) that would take an SPDX Document struct (see https://github.com/spdx/tools-golang/blob/main/spdx/document.go) as an input, and serialize it and its child elements into RDF triples to be consumed by the aforementioned gordf rdf-writer.
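The shape of such an adapter can be illustrated independently of the Go code. The sketch below is written in Python for brevity and does not use the gordf or tools-golang APIs. The `Document` and `Package` classes are hypothetical stand-ins for the real structs, and the predicate names are recalled from the SPDX 2.x RDF vocabulary, so they should be checked against the specification before being relied upon.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

SPDX = "http://spdx.org/rdf/terms#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class Package:
    spdx_id: str
    name: str


@dataclass
class Document:
    namespace: str
    name: str
    spec_version: str
    packages: List[Package] = field(default_factory=list)


def to_triples(doc: Document) -> List[Triple]:
    """Flatten a document and its child packages into RDF-style triples."""
    subject = f"{doc.namespace}#SPDXRef-DOCUMENT"
    triples: List[Triple] = [
        (subject, RDF_TYPE, SPDX + "SpdxDocument"),
        (subject, SPDX + "name", doc.name),
        (subject, SPDX + "specVersion", doc.spec_version),
    ]
    for package in doc.packages:
        node = f"{doc.namespace}#{package.spdx_id}"
        # Link the document to the package, then describe the package itself.
        triples.append((subject, SPDX + "describesPackage", node))
        triples.append((node, RDF_TYPE, SPDX + "Package"))
        triples.append((node, SPDX + "name", package.name))
    return triples
```

The resulting list of triples is the kind of intermediate form that a writer component could then serialize to a file or string.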

Technologies

Golang; RDF

Duration

This will be a short (175 hours) project. If the project requires fewer than 175 hours, the remaining time can be spent on additional improvements to the Golang tools.

Mentors

Rishabh Bhatnagar; Steve Winslow as secondary / backup


Update of Python SPDX libraries to handle latest spec

Description

The SPDX Project maintains a library, written in Python, for working with SPDX data. The development of the library does not always follow the development of the specification immediately. Since the specification has evolved and a newer version is expected to be published right before the timeframe of the project, it would be useful to have the standard Python libraries capable of handling the latest spec.

The project will involve gaining a deep understanding of the existing libraries and extending them to handle the latest additions to the specification (up to the published version).

Technologies

Python; see https://github.com/spdx/tools-python

Duration

This will be a short (175 hours) project.

Mentors

TBD


More to come...

Mentors: please fill out the following template for any projects you wish to propose.

=== Project Name ===
add overview of project here
====Skills Needed====
what skills should the student have to do the coding exercises
====Duration====
whether this is a short or a long project
====Background Information====
context for the project and references to be studied
====Available Mentors====
list individuals who are willing to mentor and provide information about the project proposal.

Historical info

GSOC/PastProjectIdeas