GH-38915: [Java] Upgrade Arrow Java project to JPMS Java Platform Module System #38876
This is roughly comparable to where #13072 got to, minus the actual module-info.java files.
@davisusanibar @lidavidm @danepitkin Next up will be adding the Maven plugin, then adding the module-info files from #13072, then modularizing other modules such as algorithm, adapters, ...
This suggests that we need to install JDK 9+ on some of the build agents:
This failure is happening for somewhat complicated reasons:
Before the refactoring, this test used the Netty module's DefaultAllocationManagerFactory, since TestBaseAllocator lived in the Netty module. That factory correctly returns a non-zero address for a zero-size allocation.
After the refactoring, TestBaseAllocator lives in memory-core and uses the dummy DefaultAllocationManagerFactory from memory-core's tests. That factory, however, returns zero for a zero-size allocation. This might itself be a bug unrelated to the refactoring: the factory calls MemoryUtil.UNSAFE.allocateMemory(), which calls sun.misc.Unsafe.allocateMemory(), which is supposed to return a non-zero value.
I'm going to mark this test as ignored for now. Should these allocation manager factories return a non-zero address when the user asks for an empty allocation? I would assume so, since C allocators have this property. @davisusanibar, did you see anything like this?
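One way to make the behaviour asked about here explicit is to guard the raw allocation call so that an empty request still yields a non-zero address. The sketch below is only an illustration and not Arrow's implementation: `ZeroSizeAllocationSketch`, `allocateNonZero`, and the `LongUnaryOperator` standing in for the raw allocator are hypothetical names, and the fake allocator in `main` merely mimics a raw allocator that returns 0 for a zero-size request.

```java
import java.util.function.LongUnaryOperator;

/** Hypothetical sketch: guarantee a non-zero address for zero-size requests. */
final class ZeroSizeAllocationSketch {

  private ZeroSizeAllocationSketch() {}

  /**
   * Calls the raw allocator with at least one byte so that an empty
   * request still receives a real, non-zero address.
   *
   * @param rawAllocate stand-in for a raw allocator (size in bytes to address)
   * @param size requested size in bytes; must not be negative
   * @return a non-zero address
   */
  static long allocateNonZero(LongUnaryOperator rawAllocate, long size) {
    if (size < 0) {
      throw new IllegalArgumentException("size must be >= 0, got " + size);
    }
    // Round an empty request up to one byte before delegating.
    long effectiveSize = Math.max(size, 1L);
    long address = rawAllocate.applyAsLong(effectiveSize);
    if (address == 0L) {
      throw new OutOfMemoryError("raw allocator returned a zero address");
    }
    return address;
  }

  public static void main(String[] args) {
    // Fake allocator (assumption for the demo): returns 0 for empty requests.
    LongUnaryOperator fake = requested -> requested == 0 ? 0L : 0x1000L;
    System.out.println("raw(0)     = " + fake.applyAsLong(0L));
    System.out.println("guarded(0) = " + allocateNonZero(fake, 0L));
  }
}
```

The trade-off in this variant is one wasted byte per empty allocation. A shared sentinel address for all empty allocations would avoid that cost, but the release path would then need to recognise the sentinel.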
7344241 to c67a59b
I'm not sure what's happening in this CI failure after the Netty refactoring.
This didn't happen in my last test run, and all I did was change the module-info compiler plugin to inherit its version, so this is probably a threading issue. Maybe extracting this test from the other TestBaseAllocator tests is causing them to run in parallel when they weren't before.
Thanks for pushing this feature, James. I am following the changes. Does this problem persist?
Yes, this still happens. I have marked the test as ignored to get past it.
I'm looking for some ideas on how to fix a Maven issue. I've rigged the general build configuration in the parent POM to continue to use JDK 8 and exclude module-info.java files. I've also shut off using the module path for dependencies.
In the memory-netty build, I need a profile (that runs on JDK 9) that compiles using the module path for dependencies (using the flag added in maven-compiler-plugin 3.11.0). It needs to run an extra command to patch code from another JAR into some of the packages that the Netty modules export. I'd also like to compile module-info.java for this.
I likely need two executions for the Netty build: one that compiles everything with JDK 8 except module-info.java, and one that uses JDK 9, compiles module-info.java, and does the patch command. I don't want to compile anything else with JDK 9, since I need to generate JDK 8 bytecode for everything else (or I can compile everything with JDK 9, then recompile everything except module-info.java with JDK 8).
The problem I'm seeing is that setting the flag in the profile's configuration isn't taking effect. This happens both when it's an execution-level configuration and when it's a plugin-level configuration. It does take effect when set in the parent POM.
The test errors in memory-core have this cause: I've addressed this in Maven by adding --add-opens java.base/java.nio=org.apache.arrow.memory.core, but perhaps I haven't covered a profile that CI is using. If anyone has any ideas on what else might need to be edited, that would be really helpful.
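To find out whether a given test profile actually received the `--add-opens` flag, a small run-time check can help. The sketch below uses the standard `java.lang.Module.isOpen(String, Module)` method (JDK 9+); the class name `AddOpensCheck` and the helper `isOpenTo` are hypothetical and not part of Arrow.

```java
/** Hypothetical diagnostic: report whether a package is open to a given class's module. */
final class AddOpensCheck {

  private AddOpensCheck() {}

  /**
   * Returns true if package {@code pkg} of the module containing {@code owner}
   * is open to the module containing {@code accessor}.
   */
  static boolean isOpenTo(Class<?> owner, String pkg, Class<?> accessor) {
    Module source = owner.getModule();
    Module target = accessor.getModule();
    return source.isOpen(pkg, target);
  }

  public static void main(String[] args) {
    boolean open = isOpenTo(java.nio.ByteBuffer.class, "java.nio", AddOpensCheck.class);
    Module self = AddOpensCheck.class.getModule();
    // Classes loaded from the class path live in an unnamed module.
    String name = self.isNamed() ? self.getName() : "ALL-UNNAMED";
    System.out.println("java.base/java.nio open to " + name + ": " + open);
  }
}
```

Running this with the same JVM arguments as the failing profile shows whether the flag reached the forked test JVM. Note that when the calling code is on the class path rather than the module path, the target of the flag has to be `ALL-UNNAMED` instead of a module name.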
This PR has a lot of changes. Perhaps we handle other modules (such as Flight, algorithm, dataset) in a separate PR so that this one can close.
These changes require a lot of patching/add-opens calls to work (see the changes). This is very unfriendly for the user. The patch-module call is particularly difficult because we, and the user, do not necessarily know where their JARs are.
I'm not going to be able to review this for a while. FWIW, I would be fine deferring other modules for another PR, and I would also be fine saying that arrow-memory-netty is just not supported when using modules because of the patching it does. |
Given how new this is, I would even be OK saying that the only supported memory implementation is the Java 21+ one to begin with and we can see if there is even demand to use memory-unsafe or memory-netty. Finally, if there's need to reorganize which package files live in, maybe we could split some of that into another PR. |
46ee8fb to 90a8ee3
This test fails because we no longer use Netty's allocation manager factory since the test has moved into memory-core. To be fixed.
There are issues looking up gRPC in the flight-integration-tests tests if flight-core gets shaded with a newer version up to at least 3.5.1.
Note that some configuration from the arrow root POM is copied to the maven plugins module because it cannot be inherited without creating a cyclic dependency.
Reports a used dependency as unused only on Windows 2022 CI runs
It is not correctly getting skipped.
Newer versions of the checkstyle plugin do support module-info.java files but require rewriting the checkstyle rules file.
Change the project to not use the module path for most compilation since most compilation will target JDK8.
Not sure if this is necessary if we just use the plugin to compile module-info.java
…ty and memory-core
…es to memory-core
Needed for the module to get added to the module-path when testing
Move tests outside of io.netty.buffer package because that causes a conflict with Netty's modules. Change functions marked as protected that are used in tests to be public to facilitate this refactoring.
2.16.1 in particular had faulty module-info files.
Do not add tests in org.apache.arrow.util because that is an exported package for arrow-memory-core and causes module conflicts
Having an implicit dependency from arrow-memory-core does not put immutables on the module-path when running tests and causes module issues.
Export public package for arrow-vector module. Allow jackson to use arrow-vector pojo classes reflectively for serialization.
Log what message was reported if the message didn't contain the correct information.
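The arrow-vector commit above (exporting the public package and opening the pojo classes to Jackson) can be modelled at run time with the standard `java.lang.module.ModuleDescriptor` builder. This is only a sketch of the shape of such a declaration: the module name `org.apache.arrow.vector`, the package `org.apache.arrow.vector.types.pojo`, and the Jackson module name `com.fasterxml.jackson.databind` are assumptions made for illustration and are not taken from the PR's module-info.java.

```java
import java.lang.module.ModuleDescriptor;
import java.util.Set;

/** Hypothetical model of an "exports" plus a qualified "opens" declaration. */
final class VectorModuleSketch {

  private VectorModuleSketch() {}

  static ModuleDescriptor describe() {
    // All names below are assumptions for illustration only.
    return ModuleDescriptor.newModule("org.apache.arrow.vector")
        // Public API: readable by any module at compile time and run time.
        .exports("org.apache.arrow.vector")
        // Reflective access only, and only for the named target module.
        .opens("org.apache.arrow.vector.types.pojo",
            Set.of("com.fasterxml.jackson.databind"))
        .build();
  }

  public static void main(String[] args) {
    ModuleDescriptor descriptor = describe();
    descriptor.exports().forEach(e -> System.out.println("exports " + e.source()));
    descriptor.opens().forEach(o ->
        System.out.println("opens " + o.source() + " to " + o.targets()));
  }
}
```

A qualified `opens ... to` keeps deep reflection limited to the serialization library instead of opening the package to every module, which is why it is usually preferred over an unqualified `opens`.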
This PR has been split into several smaller PRs and issues.
Rationale for this change
This will allow for static linking, better support for newer JDKs, and integration with tools that require modules.
What changes are included in this PR?
Are these changes tested?
Yes, unit tests now run with modules when using JDK9+.
Are there any user-facing changes?
Yes. The Netty module has been moved around significantly for users that depend on that code. Since modules are now named, the add-opens calls needed to run Arrow are significantly more complicated.