Technical Team/Minutes/2016-02-09

From SPDX Wiki

February 9, 2016

Attendees

  • Gary O'Neall
  • Kate Stewart
  • Jilayne Lovejoy
  • Kris Reeves
  • Bill Schineller
  • Yev Bronshteyn
  • Scott Sterling
  • Kirsten Newcomer
  • Michael Herzog

License Templatization

This was a joint call with Legal to discuss the input format for the license list and to review Kris' "master" XML files, found at https://github.com/myndzi/license-list/tree/xml-test/src

  • Should we use the standard SPDX names (from spec/RDF) for elements (e.g. licenseId instead of identifier)?
    • Advantage in consistency
    • Disadvantage, there is not a 1:1 mapping
    • Since this is an input format and not an output format, standardization may not be as required
    • Proposal: do not require the standard IDs
  • Need to align with the matching guidelines; Kris identified questions about this in his email. Will need to align the process with the matching guidelines and revise the matching guidelines as well (help from the legal team here). We will hit upon things for which a bright-line rule is difficult.
  • Optional sections of the markup - what are the semantics?
    • Proposed: optional sections have meaning ONLY in the context of matching licenses. They specify sections of text that NEED NOT be matched. When *presenting* license contents, optional sections SHOULD be included (e.g. display on the SPDX website)
  • Can we change the Github display so that it wraps?
    • Would require changes to "hand-wrap" license content in HTML
      • From Kris after the meeting: I see significant benefits to this from a review and editing point of view, but it might require more markup to properly output/format the license data, akin to HTML's <p> tag. This might be worth having anyway for formatting reasons, since relying on line breaks in XML is not a good practice. These tags can be added automatically, but may reduce legibility somewhat. They will allow more control over outputting a well-formatted text copy of a license while being able to edit the license in a more user-friendly appearance, perhaps.
  • How much legal review is needed?
    • Most likely will need to review all of them
  • Improvements in matching code, e.g.: limits on regex matches; a limit on the number of characters for replaceable text (no limit in current markup); what is meant by optional and by the subset of optional text, which includes title, copyright, and extraneous text at the bottom as per the guidelines (the purpose of the title tag is to identify a part that is optional, not to identify the "full name" as it appears on the spreadsheet; if we wanted that, it would need a separate, additional tag)
    • Not included in this pass; would like to improve this in the next pass
  • Process for maintaining the license list
    • Review the current proposed XML on Github; can create one pull request per license
    • Can use Kris' tool to make it easier to review and edit; it provides a color-coded view of the different tags. Legal team members will need to be able to use it
    • Can test the XML against the current spreadsheet/output
  • Discussion of a web-based tool on spdx.org/license to compare licenses / find licenses
    • Can be written in JavaScript, eliminating the need for a servlet engine to run this on the website
      • Kris expressed interest in writing such a tool once the new license formats are ready
  • Process Definition next steps
    • Legal team can pick up front end process
    • Would like to look into the tool Kris wrote to help with the review process
    • First pass of the license list
      • Another pass at converting the existing license list to XML in light of the determination of "optional" sections.
        • Kris will take a pass at this once the optional sections issues have been closed
      • Legal team will do a first pass on the results
      • Proposed process using pull requests
        • Individual pull requests for each license
        • Two people can sign off on each PR simply by posting a comment, similar to the NPM process:
          • Post a comment stating when you plan to tackle a specific item (raise your hand)
          • Post a comment stating when you’ve approved an item (“LGTM” – looks good to me)
          • Further tasking can be done in the Github interface, including a sort of checkbox list of “things that need to be done for this release” – investigate, perhaps?
    • Develop the code to build the source XML files into the publishable XML files for use by tools
    • Update the publishing tools to use the new XML file formats
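
The matching improvement mentioned above (a limit on the number of characters that replaceable text may match) can be sketched roughly as follows. This is only an illustration: the `<<var>>` placeholder, the function name `template_to_regex`, and the default limit of 40 characters are assumptions made for the example, not the SPDX markup or the current matching code.

```python
import re

PLACEHOLDER = "<<var>>"  # hypothetical marker for replaceable text


def template_to_regex(template: str, max_chars: int = 40) -> re.Pattern:
    """Compile a template into a regex with bounded replaceable text."""
    tokens = []
    for word in template.split():
        if word == PLACEHOLDER:
            # Bounded, non-greedy wildcard instead of an unbounded ".+"
            tokens.append(".{1,%d}?" % max_chars)
        else:
            tokens.append(re.escape(word))
    # Any run of whitespace in the candidate text separates the tokens
    return re.compile(r"\s+".join(tokens))


pattern = template_to_regex("Copyright (c) <<var>> All rights reserved.")

short_ok = pattern.fullmatch(
    "Copyright (c) 2016 Example Corp All rights reserved."
) is not None
too_long = pattern.fullmatch(
    "Copyright (c) " + "x" * 200 + " All rights reserved."
) is not None
```

With the bound in place, a short copyright holder matches, while 200 characters of arbitrary text in the replaceable position do not.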

Summary of Action Items

  • Decide on / confirm the semantic meaning of optional sections of the markup.
    • Proposed: optional sections have meaning ONLY in the context of matching licenses. They specify sections of text that NEED NOT be matched. When *presenting* license contents, optional sections SHOULD be included (e.g. display on the SPDX website)
  • Decide on whether to use the standard SPDX tags in the master (source) files
    • Proposal: Don't require the same tags
  • Decide on a process for producing a complete first pass of the license conversion
    • Take another pass at converting the licenses to XML in light of the determination about “optional” sections.
      • Kris will provide a second pass once the optional sections action item has been closed
    • Proposed process: Individual pull requests for each license. Two people can sign off on each PR simply by posting a comment.
  • Develop the code to build the source XML files into the publishable XML files for use by tools (essentially, add tags for all the synonym stuff)
  • Update the publishing tools to use the source XML files to publish files for the website
    • Gary will update the tools once the new XML file formats are available
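
The proposed semantics for optional sections (they matter only when matching, and are always included when presenting) can be sketched as below. The element names `<license>` and `<optional>`, the sample text, and both function names are assumptions for illustration; the actual source format was still under discussion. The sketch handles only one level of nesting.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical source file content, not an actual SPDX license
SOURCE = (
    "<license><optional>The Example License</optional> "
    "Permission is granted to use this software.</license>"
)


def _words_regex(text):
    """Escape each word and allow any whitespace run between words."""
    return r"\s+".join(re.escape(w) for w in (text or "").split())


def presentation_text(root):
    """Text for display: optional sections ARE included."""
    return " ".join("".join(root.itertext()).split())


def matching_regex(root):
    """Regex for matching: optional sections NEED NOT be present."""
    pieces = []
    if (root.text or "").split():
        pieces.append(_words_regex(root.text))
    for child in root:
        inner = _words_regex("".join(child.itertext()))
        if child.tag == "optional":
            pieces.append("(?:%s)?" % inner)
        else:
            pieces.append(inner)
        if (child.tail or "").split():
            pieces.append(_words_regex(child.tail))
    # Whitespace between pieces is optional so a skipped section leaves no gap
    return re.compile(r"\s*".join(pieces))


root = ET.fromstring(SOURCE)
shown = presentation_text(root)
regex = matching_regex(root)
matches_without_title = regex.fullmatch(
    "Permission is granted to use this software."
) is not None
matches_with_title = regex.fullmatch(shown) is not None
```

Here the same source element yields the full text for display, and a pattern that accepts the license both with and without its title.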

Items Requiring Further Discussion/Investigation

  • Decide on what state we need to be in to be eligible for a release of the new templates
    • Include “better” (not “.+”) match expressions for the matching sections?
    • Fill out markup for licenses that don’t have it?
    • Find/include missing license header instructions? (Possibly: licenses with “optional” sections that don’t have a “header” section)
    • Is it worth hard-wrapping license contents so they are more accessible through the Github UI?
    • Should we normalize the license contents where possible? (Remove strange low-ASCII characters, remove UTF-8 when possible [should always be possible], use a consistent newline separator)
      • Proposal from Kris to normalize the contents
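
The normalization question above could look something like the following sketch. The replacement table and the function name `normalize_license_text` are assumptions for illustration; which characters to replace or drop would still need to be decided, and any non-ASCII character not in the table is left untouched here.

```python
import unicodedata

# Hypothetical mapping of common non-ASCII characters to ASCII equivalents
REPLACEMENTS = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "-",   # en and em dashes
    "\u00a0": " ",                  # non-breaking space
}


def normalize_license_text(text: str) -> str:
    # Use a single newline convention
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Replace known non-ASCII characters with ASCII equivalents
    text = "".join(REPLACEMENTS.get(ch, ch) for ch in text)
    # Drop control/format characters other than newline and tab
    return "".join(
        ch for ch in text
        if ch in "\n\t" or not unicodedata.category(ch).startswith("C")
    )


sample = "\u201cAS IS\u201d\r\nline\x0c two\u00a0end"
normalized = normalize_license_text(sample)
```

In this example the curly quotes become straight quotes, the Windows line ending becomes a single newline, and the stray form feed is removed.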