THE SPDX WIKI IS NO LONGER ACTIVE. ALL CONTENT HAS BEEN MOVED TO https://github.com/spdx

Latest revision as of 17:36, 31 March 2022


Welcome to the 2022 SPDX Google Summer of Code Project Page

See the proposal template if you are interested in submitting a Google Summer of Code proposal.

Should you have questions please do not hesitate to contact one of the mentors directly.



What is SPDX?

First and foremost, we are a community dedicated to solving the issues and problems around Open Source licensing and compliance. The SPDX work group (part of the Linux Foundation) consists of individuals, community members, and representatives from companies, foundations and organizations who use or are considering using the SPDX standard. The work group operates much like a meritocratic, consensus-based community project; that is, anyone with an interest in the project can join the community, contribute to the specification, and participate in the decision-making process. We come from many different backgrounds including open source developers, lawyers, consultants and business professionals, many of whom have been involved with license compliance and identification for years.

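As part of this effort we have developed a set of collateral that can be used:

* License List and Short Identifiers (https://spdx.org/using-spdx)
* SPDX Specification for generating SPDX Documents in multiple formats (https://spdx.org/using-spdx)
* A set of basic tools for working with SPDX Documents (https://spdx.org/tools)
* License Identifiers in source (https://spdx.org/using-spdx)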

Why choose an SPDX Project?

Contributing to one of the SPDX projects below will provide a valuable contribution to developers and/or users of open source software. We believe you will find the projects both technically challenging and rewarding. In essence, we believe you will be able to look back one day and say, "I was part of that effort."


Getting Involved

Beyond working with your mentor(s), we highly encourage students who select one of these projects to get involved with the SPDX community via our technical working group. Interaction with the technical team is primarily done via its mailing list and on Gitter (see Resources). There is also a weekly call you could join.

Resources

* SPDX tech mailing list (https://lists.spdx.org/mailman/listinfo/spdx-tech)

Ideas for 2022 Projects

SBOM Conformance Checker

The goal of this project is to create a simple tool that checks whether an SBOM (in SPDX format) conforms to NTIA's minimum elements guidance.

Description

The SPDX Specification defines a number of fields (elements) that may appear in an SBOM (Software Bill of Materials). Not all of them are mandatory, however, so SBOMs in SPDX format can vary greatly.

While researching the attributes that have to be present in an SBOM, the NTIA published guidance on the minimum elements that must appear in one: https://www.ntia.doc.gov/files/ntia/publications/sbom_minimum_elements_report.pdf

It would, therefore, be useful to have a tool that can determine whether an SBOM stored in SPDX format fulfills all such minimum obligations.

The tool should make use of the already existing libraries for reading SPDX documents.
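As a rough illustration of the kind of check such a tool might perform, the sketch below inspects an SPDX 2.x document that has already been loaded from its JSON form into a Python dictionary. The NTIA report lists seven minimum data fields (supplier name, component name, version, other unique identifiers, dependency relationship, author of SBOM data, timestamp); the mapping of those fields onto SPDX JSON keys used here is an assumption made for the example, and the function name `check_minimum_elements` is hypothetical. A real implementation would read documents through the existing SPDX libraries rather than raw dictionaries.

```python
from __future__ import annotations

# Assumed mapping of NTIA per-component fields to SPDX 2.x JSON package keys.
PACKAGE_FIELDS = {
    "Supplier Name": "supplier",
    "Component Name": "name",
    "Version of the Component": "versionInfo",
    "Other Unique Identifiers": "SPDXID",
}


def check_minimum_elements(doc: dict) -> list[str]:
    """Return human-readable problems; an empty list means no gaps were found."""
    problems = []

    # Document-level fields: author of the SBOM data and timestamp.
    creation = doc.get("creationInfo", {})
    if not creation.get("creators"):
        problems.append("document: missing Author of SBOM Data (creationInfo.creators)")
    if not creation.get("created"):
        problems.append("document: missing Timestamp (creationInfo.created)")

    # At least one relationship must be present to express dependencies.
    if not doc.get("relationships"):
        problems.append("document: missing Dependency Relationship (relationships)")

    # Per-package fields; NOASSERTION is treated as "not provided".
    for index, package in enumerate(doc.get("packages", [])):
        label = package.get("name") or f"package #{index}"
        for element, key in PACKAGE_FIELDS.items():
            value = package.get(key)
            if not value or value == "NOASSERTION":
                problems.append(f"{label}: missing {element} ({key})")

    return problems
```

Calling `check_minimum_elements(json.load(open(path)))` on a JSON-encoded SPDX file would return one message per missing element, which a command-line wrapper could print and turn into a non-zero exit status.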

Technologies

Python

Duration

This will be a short (175 hours) project. It might be extended to a long (350 hours) project if integration with the existing SPDX handling tools (e.g., the Validation tool) is also implemented.

Mentors

Dick Brooks, Kate Stewart

Private license management system

A web-based system for managing license texts; similar to the SPDX License List but oriented towards other private collections of licenses.

Description

The goal of the project would be to create a simple web application for people to upload license texts and automatically create a license repository. The initial rough "functional specifications" describe it as mainly an input form, where the information is entered. There will be some automatic processing (e.g., canonicalization, duplicate avoidance, etc.), a review/approval (and naming) step, and then publishing in a specified format.
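The "canonicalization" and "duplicate avoidance" steps could, for instance, be prototyped as follows. This is only a minimal sketch: the normalization shown (lowercasing and collapsing whitespace) is far simpler than the SPDX license matching guidelines, and the `LicenseStore` class and its `submit` method are hypothetical names standing in for whatever repository back-end the project ends up using.

```python
import hashlib
import re


def canonicalize(text: str) -> str:
    """Lowercase the text and collapse every run of whitespace to one space."""
    return re.sub(r"\s+", " ", text).strip().lower()


def fingerprint(text: str) -> str:
    """Return the SHA-256 hex digest of the canonical form of a license text."""
    return hashlib.sha256(canonicalize(text).encode("utf-8")).hexdigest()


class LicenseStore:
    """In-memory stand-in for the license repository (hypothetical)."""

    def __init__(self) -> None:
        # Maps a text fingerprint to the ID of the first license stored with it.
        self._by_fingerprint = {}

    def submit(self, license_id: str, text: str) -> str:
        """Store the text unless an equivalent one exists; return the ID in use."""
        return self._by_fingerprint.setdefault(fingerprint(text), license_id)
```

If `submit` returns an ID other than the one proposed, the uploaded text duplicates an existing entry, and the review step could present the existing license to the submitter instead of creating a new one.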

Note that the specification is not yet finalized regarding the naming of namespaces, the way licenses are published, etc. If the SPDX project has already settled these definitions by the time the project starts, this project will implement the decisions taken.

Technologies

Python (any framework) for the back-end; JavaScript (any framework) for the minimal front-end.

Duration

This can be either a short (175 hours) project, implementing only the basic functionality; or a long (350 hours) one, implementing more functionality and automation.

Mentors

Alexios Zavras; more TBD

SBOM combiner

The project will result in a simple command-line tool that can “combine” information from a number of SBOMs into a comprehensive SBOM that includes all the information of the provided ones. A typical use case would be the generation of an SBOM for a software delivery that is composed of a number of components, each of which has its own correct SBOM.

Description

The primary purpose of this tool would be to stitch together smaller component-level SPDX documents and amalgamate them into one top-level SPDX document representing a "sum of parts" piece of software. As an initial pass for implementation, the component-level SBOMs would have to be provided by the caller until the tool is advanced enough to reliably fetch SPDX Documents referenced by ExternalDocumentRef.

Technologies

Python (preferably); or Go.

Duration

This will be a short (175 hours) project.

Mentors

Rose Judge; others TBD

Update of Java SPDX libraries to handle latest spec

Description

The SPDX Project maintains a library, written in Java, for working with SPDX data. The development of the library does not always follow the development of the specification immediately. Since the specification has evolved and a newer version is expected to be published right before the timeframe of the project, it would be useful to have the standard Java libraries capable of handling the latest spec.

The project will involve developing a deep understanding of the existing libraries and extending them to handle the latest additions to the specification (up to the published version).

Technologies

Java; see https://github.com/spdx/Spdx-Java-Library

Duration

This will be a short (175 hours) project.

Mentors

TBD

Update of Go SPDX libraries to handle latest spec

Description

The SPDX Project maintains a library, written in Go, for working with SPDX data. The development of the library does not always follow the development of the specification immediately. Since the specification has evolved and a newer version is expected to be published right before the timeframe of the project, it would be useful to have the standard Go libraries capable of handling the latest spec.

The project will involve developing a deep understanding of the existing libraries and extending them to handle the latest additions to the specification (up to the published version).

Technologies

Go; see https://github.com/spdx/tools-golang

Duration

This will be a short (175 hours) project.

Mentors

TBD

SPDX Golang RDF Saver

Description

SPDX already has a Golang library to save RDF triples into a file/string using the gordf project: https://github.com/spdx/gordf

The aim of this GSoC project would be to write an adapter in the SPDX Golang Tools (the tools-golang repository at https://github.com/spdx/tools-golang) that would take an SPDX Document struct (see https://github.com/spdx/tools-golang/blob/main/spdx/document.go) as an input, and serialize it and its child elements into RDF triples to be consumed by the aforementioned gordf rdf-writer.
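To make the idea of flattening a document and its child elements into triples concrete, here is a small language-neutral sketch, written in Python for brevity even though the project itself targets Go. It does not use gordf or tools-golang; the dictionary input, the function name `document_to_triples`, and the small selection of properties are assumptions made for the example, and literals are not distinguished from IRIs as a real RDF writer would require.

```python
from typing import Iterator, Tuple

SPDX = "http://spdx.org/rdf/terms#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# A triple is (subject, predicate, object), all plain strings in this sketch.
Triple = Tuple[str, str, str]


def document_to_triples(doc: dict) -> Iterator[Triple]:
    """Yield triples for the document node, then for each package it contains."""
    doc_node = doc["documentNamespace"] + "#SPDXRef-DOCUMENT"
    yield (doc_node, RDF_TYPE, SPDX + "SpdxDocument")
    yield (doc_node, SPDX + "specVersion", doc["spdxVersion"])
    yield (doc_node, SPDX + "name", doc["name"])

    # Child elements get their own subject node within the same namespace.
    for package in doc.get("packages", []):
        pkg_node = doc["documentNamespace"] + "#" + package["SPDXID"]
        yield (pkg_node, RDF_TYPE, SPDX + "Package")
        yield (pkg_node, SPDX + "name", package["name"])
        if "versionInfo" in package:
            yield (pkg_node, SPDX + "versionInfo", package["versionInfo"])
```

The adapter proposed here would perform the same kind of traversal over the Go Document struct and hand the resulting triples to the gordf writer for output.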

Technologies

Golang; RDF

Duration

This will be a short (175 hours) project. If the project requires fewer than 175 hours, the remaining time can be spent on additional improvements to the Golang tools.

Mentors

Rishabh Bhatnagar; Steve Winslow as secondary / backup


Update of Python SPDX libraries to handle latest spec

Description

The SPDX Project maintains a library, written in Python, for working with SPDX data. The development of the library does not always follow the development of the specification immediately. Since the specification has evolved and a newer version is expected to be published right before the timeframe of the project, it would be useful to have the standard Python libraries capable of handling the latest spec.

The project will involve developing a deep understanding of the existing libraries and extending them to handle the latest additions to the specification (up to the published version).

Technologies

Python; see https://github.com/spdx/tools-python

Duration

This will be a short (175 hours) project.

Mentors

TBD


More to come...

Mentors: please fill out the following template for any projects you wish to propose.

=== Project Name ===
add overview of project here
====Skills Needed====
what skills should the student have to do the coding exercises
====Duration====
whether this is a short or a long project
====Background Information====
context for the project and references to be studied
====Available Mentors====
list individuals who are willing to mentor and provide information about the project proposal.

Historical info

GSOC/PastProjectIdeas