-
Notifications
You must be signed in to change notification settings - Fork 324
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(cyclonedx): Sanitize copyrights for the CycloneDX XML report #9303
fix(cyclonedx): Sanitize copyrights for the CycloneDX XML report #9303
Conversation
Some characters in copyrights cannot be outputted to XML. Therefore, sanitize the copyrights content for XML. Signed-off-by: Nicolas Nobelis <nicolas.nobelis@bosch.com>
@@ -90,6 +91,7 @@ asciidoctorj-pdf = { module = "org.asciidoctor:asciidoctorj-pdf", version.ref = | |||
awsS3 = { module = "software.amazon.awssdk:s3", version.ref = "s3" } | |||
clikt = { module = "com.github.ajalt.clikt:clikt", version.ref = "clikt" } | |||
commonsCompress = { module = "org.apache.commons:commons-compress", version.ref = "commonsCompress" } | |||
commonsText = { module = "org.apache.commons:commons-text", version.ref = "commonsText" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not super happy about introducing a dependency on commons-text
just for escaping XML... can we find a more lightweight solution?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, we once had #2878, so maybe that idea could be recycled in an extended form?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fell free to suggest one!
I didn't find anything by googling. The Cyclone DX lib delegates to Jackson for the XML generation that delegates to StAX, but I found nothing there too 😞
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I remember that somewhere we had a discussion about wrapping Copyright texts in <![CDATA[...]]>
, but I currently cannot find it anymore. Would that also solve the issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somewhat related, though targeting the description: #2814
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I remember that somewhere we had a discussion about wrapping Copyright texts in
<![CDATA[...]]>
, but I currently cannot find it anymore. Would that also solve the issue?
Assuming that this would help, I'm wondering why our Copyright's aren't already wrapped in <![CDATA[...]]>
. Also this test from upstream shows that the text is nested, which does not seem to be the case for our reports. So maybe that's the root cause: For some reason we're not using the <text>
tag, and that's also why data does not get wrapped in <![CDATA[...]]>
. Could you investigate into that direction, @nnobelis?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the test you linked, the copyright is under an evidence
element:
<component type="application">
<evidence>
<copyright>
And Component.evidence.copyright
has a special class in the model, with this text
element:
https://github.com/CycloneDX/cyclonedx-core-java/blob/6cfef328eb0c6c73f83dff13f9b2fe1934c271b9/src/main/java/org/cyclonedx/model/Copyright.java#L30
If you use Component.copyright
instead, this is just a string:
https://github.com/CycloneDX/cyclonedx-core-java/blob/6cfef328eb0c6c73f83dff13f9b2fe1934c271b9/src/main/java/org/cyclonedx/model/Component.java#L174
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should open an upstream ticket.
I think this is not nice that we escape it ourselves: with the code in the PR, we are escaping the BOM for XML even if we generate only some JSON.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, it seem odd that Copyrights are wrapped in CDATA
if part of <evidence>
, but not otherwise.
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #9303 +/- ##
============================================
+ Coverage 66.29% 67.28% +0.99%
Complexity 1201 1201
============================================
Files 239 239
Lines 8446 8446
Branches 905 905
============================================
+ Hits 5599 5683 +84
+ Misses 2478 2394 -84
Partials 369 369
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
@sschuberth Funnily enough, I found out you had an |
Yes, that's what I wrote over here an hour ago 😉 |
Closed in favor of an upstream issue: CycloneDX/cyclonedx-core-java#538. |
I was hoping for a fix, but it's in fact an issue 😄 |
Some characters in copyrights cannot be outputted to XML. Therefore, sanitize the copyrights content for XML.
This fixes the following exception: