Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8281384: Random chars on paste from Windows clipboard #1724

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

Schmidor
Copy link
Contributor

@Schmidor Schmidor commented Feb 25, 2025

Windows programs may reuse a clipboard buffer that is larger than the new content. In this case de NUL terminator is not at the end of the buffer, but within it.
The current implementation copys the whole buffer into a text field, including the NUL terminator and the remaining chars.

The JIRA ticket contains a JNA based sample program, which prefills the buffer for demonstrating this issue.
If this should be added as a unit test, I'm open for advice how to do that.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (2 reviews required, with at least 1 Reviewer, 1 Author)

Issue

  • JDK-8281384: Random chars on paste from Windows clipboard (Bug - P3)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jfx.git pull/1724/head:pull/1724
$ git checkout pull/1724

Update a local copy of the PR:
$ git checkout pull/1724
$ git pull https://git.openjdk.org/jfx.git pull/1724/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 1724

View PR using the GUI difftool:
$ git pr show -t 1724

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jfx/pull/1724.diff

Using Webrev

Link to Webrev Comment

Sorry, something went wrong.

@bridgekeeper
Copy link

bridgekeeper bot commented Feb 25, 2025

👋 Welcome back oschmidtmer! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Feb 25, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk openjdk bot changed the title JDK-8281384: Random chars on paste from Windows clipboard 8281384: Random chars on paste from Windows clipboard Feb 25, 2025
@openjdk openjdk bot added the rfr Ready for review label Feb 25, 2025
@mlbridge
Copy link

mlbridge bot commented Feb 25, 2025

@kevinrushforth kevinrushforth self-requested a review February 25, 2025 13:47
@kevinrushforth
Copy link
Member

Reviewers: @kevinrushforth @lukostyra

/reviewers 2

@openjdk
Copy link

openjdk bot commented Feb 25, 2025

@kevinrushforth
The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 1 Reviewer, 1 Author).

Copy link
Member

@kevinrushforth kevinrushforth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a couple comments. As for an automated test, maybe there is something we could do with FFM, but it might be a bit of a project to figure out how to incorporate it into our test framework.

@kevinrushforth
Copy link
Member

@Schmidor I can't reproduce the original bug on Windows 11. Back when I did reproduce it, I was running Windows 10, so Microsoft might have fixed something on their end. I'll try to find a Windows 10 system to verify the fix on, but it looks like a good fix nonetheless.

Copy link
Contributor

@lukostyra lukostyra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far this change looks good, I could reproduce it with attached ClipboardTest program and HelloTextArea on my Windows 11 machine. After the fix it's no longer an issue and the contents are as expected. Tests also did not show any regressions. I do, however, have one question.

I kind of wish we got correct information about how big Clipboard contents are from Windows API, and I'm kind of surprised it only gives us buffer size information and not actual content information. Are you sure we shouldn't fetch this information from Windows API somewhere instead of manually searching for a null terminator?

@kevinrushforth
Copy link
Member

That's a good question. If Windows does provide the information, then it would be best to use it. If not, maybe it would be better to move this check in the native code anyway, so that the Java code could just use the returned byte array?

@kevinrushforth
Copy link
Member

I can't reproduce the original bug on Windows 11.

Actually, I didn't run the test correctly earlier. I can reproduce it locally, so I'll be able to verify the fix (once we resolve the outstanding questions).

@hjohn
Copy link
Collaborator

hjohn commented Feb 26, 2025

How about fixing this in

?

It already includes special handling for CF_DROP, and you could add special handling for text/unicode:

    if (CF_TEXT == cf || CF_UNICODETEXT == cf) {
           ...
    }

I also checked, it seems that Windows does not provide a way to determine the content length. For (unicode) text it indeed relyings on nul terminators.

Main advantage is that it allocates the Java byte array in this function, and it could allocate it of the correct size immediately.

@kevinrushforth
Copy link
Member

That's a good idea, and is along the lines of what I was thinking when I mentioned moving this check into the native code.

@@ -403,6 +403,13 @@ HRESULT PopMemory(
//as well as corrupted format
cdata = 0;
}
} else if(CF_TEXT == cf || CF_UNICODETEXT == cf){
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of doing that, consider always requesting CF_UNICODETEXT from the clipboard. Windows will then convert both CF_TEXT and CF_OEMTEXT to unicode (see Synthesized Clipboard Formats).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the existing code was already pretty optimistic about 2 byte chars, is it possible that is already handled somewhere else?
I'm not sure whether this is done explicitly somewhere or if CF_UNICODETEXT is just tried first.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be, but in any case, the way it's impemented right now doesn't seem to be right. We should only assume 2-byte characters if what's being pulled from the clipboard is actually CF_UNICODETEXT.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a little bit of digging and we will sometimes request CF_TEXT and CF_OEMTEXT (see create_mime_stuff() in this file). Technically we could change the pairings there and always request text data in CF_UNICODETEXT, but I would do some more testing and check if that doesn't cause regressions in some surprise spot.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @mstr2 said, I think the data is always converted to CF_UNICODETEXT. If it wasn't there should have been more visible problems, as the current code always assumes wide chars.

So I just need to know how much we should change within this ticket: Just removing CF_TEXT from here, or also removing flavors from create_mime_stuff().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the point is to explicitly request CF_UNICODETEXT and, if Clipboard has CF_TEXT or CF_OEMTEXT inside, it will automatically convert it to Unicode on output. This way we ensure we always get 2-byte Unicode text independently of Clipboard contents. So I would say to change the flavors in create_mime_stuff() as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense to me, too. I was originally thinking that removing the CF_TEXT == cf || in this if statement would be sufficient, but it's certainly safer, and seems more correct, to also ensure that we request CF_UNICODETEXT, given that here and in the calling Java code, we assume 2-byte Unicode.

@kevinrushforth kevinrushforth self-requested a review March 3, 2025 14:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rfr Ready for review
Development

Successfully merging this pull request may close these issues.

None yet

5 participants