Issue #3266 discusses a potential optimization by re-using XPathContext objects (which are relatively expensive to initialize).
This PR makes the following changes:
- the `XPathContext` object ...
  - can accept `nil` as the value arg to `#register_ns` and `#register_variable` to _deregister_ the namespace or variable
  - tracks namespaces and variables registered through those two methods
  - has a new `#reset` method that deregisters all namespaces and variables registered
  - has a new `#node=` method to set the current node being searched
- all the `Searchable` methods ...
  - use a thread-local XPathContext object that is only available in a block yielded by `#get_xpath_context`
  - and that context object is `#reset` when the block returns
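
The reuse pattern described in the list above can be sketched in plain Ruby. This is a minimal illustration, not the code in this PR: `SketchContext` and `with_sketch_context` are hypothetical stand-ins for `XPathContext` and `#get_xpath_context`, and the thread-variable key is made up. It assumes the cached context is taken out of the thread-local slot while in use, so a nested call would build its own.

```ruby
# Hypothetical stand-in for an expensive-to-initialize context (NOT Nokogiri's class).
class SketchContext
  @build_count = 0
  class << self
    attr_accessor :build_count
  end

  attr_accessor :node
  attr_reader :namespaces

  def initialize
    self.class.build_count += 1 # stands in for the expensive initialization
    @namespaces = {}            # tracks what was registered through #register_ns
  end

  # Passing nil as the value deregisters the prefix.
  def register_ns(prefix, uri)
    if uri.nil?
      @namespaces.delete(prefix)
    else
      @namespaces[prefix] = uri
    end
  end

  # Deregister everything that was registered since the last reset.
  def reset
    @namespaces.keys.each { |prefix| register_ns(prefix, nil) }
  end
end

# Hypothetical helper: yields a per-thread context and resets it afterwards.
def with_sketch_context(node)
  key = :sketch_xpath_context
  context = Thread.current.thread_variable_get(key)
  Thread.current.thread_variable_set(key, nil) # unavailable outside the block
  context ||= SketchContext.new
  context.node = node
  begin
    yield context
  ensure
    context.reset
    Thread.current.thread_variable_set(key, context)
  end
end

first = with_sketch_context(:node_a) do |ctx|
  ctx.register_ns("x", "http://example.com/ns")
  ctx
end
second = with_sketch_context(:node_b) { |ctx| ctx }
```

In this sketch the second call gets the same object back with no namespaces left over from the first, and the stand-in constructor runs only once per thread.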
Out of an abundance of caution, there is an escape hatch that I will leave undocumented: setting the env var `NOKOGIRI_DEOPTIMIZE_XPATH` restores the un-optimized behavior.
Here's a benchmark, where "small" is a 6kb file and "large" is a 70kb file:
```
$ NOKOGIRI_DEOPTIMIZE_XPATH=t ruby --yjit ./issues/3266-xpath-benchmark.rb
ruby 3.3.6 (2024-11-05 revision 75015d4c1f) +YJIT [x86_64-linux]
Warming up --------------------------------------
large: normal 3.790k i/100ms
Calculating -------------------------------------
large: normal 37.556k (± 1.7%) i/s (26.63 μs/i) - 189.500k in 5.047390s
ruby 3.3.6 (2024-11-05 revision 75015d4c1f) +YJIT [x86_64-linux]
Warming up --------------------------------------
small: normal 11.726k i/100ms
Calculating -------------------------------------
small: normal 113.719k (± 2.5%) i/s (8.79 μs/i) - 574.574k in 5.055648s
$ ruby --yjit ./issues/3266-xpath-benchmark.rb
ruby 3.3.6 (2024-11-05 revision 75015d4c1f) +YJIT [x86_64-linux]
Warming up --------------------------------------
large: optimized 4.609k i/100ms
Calculating -------------------------------------
large: optimized 48.107k (± 1.6%) i/s (20.79 μs/i) - 244.277k in 5.079041s
Comparison:
large: optimized: 48107.3 i/s
large: normal: 37555.7 i/s - 1.28x slower
ruby 3.3.6 (2024-11-05 revision 75015d4c1f) +YJIT [x86_64-linux]
Warming up --------------------------------------
small: optimized 32.014k i/100ms
Calculating -------------------------------------
small: optimized 319.760k (± 0.6%) i/s (3.13 μs/i) - 1.601M in 5.006140s
Comparison:
small: optimized: 319759.6 i/s
small: normal: 113719.0 i/s - 2.81x slower
```
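
As a quick sanity check on the ratios, the arithmetic behind the "Comparison" lines can be reproduced from the reported iterations per second. The numbers below are copied from the output above; the variable names are only illustrative.

```ruby
# Iterations per second, copied from the benchmark output above.
results = {
  "large" => { normal: 37_555.7, optimized: 48_107.3 },
  "small" => { normal: 113_719.0, optimized: 319_759.6 },
}

# Speedup of the optimized run over the normal run, rounded to two places.
speedups = results.transform_values { |r| (r[:optimized] / r[:normal]).round(2) }

# Microseconds per iteration, the reciprocal of iterations per second.
micros = results.transform_values do |r|
  r.transform_values { |ips| (1_000_000 / ips).round(2) }
end

speedups.each { |name, ratio| puts "#{name}: #{ratio}x faster" }
```

The computed speedups agree with the 1.28x and 2.81x figures reported by the benchmark, and the per-iteration times agree with the `μs/i` column.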
I originally implemented a much simpler approach that cleared all registered variables and functions. However, that also deregistered the standard XPath 1.0 functions, and calling `xmlXPathRegisterAllFunctions` to re-register them took us back to the original performance profile.
The win here is to avoid re-calling `xmlXPathRegisterAllFunctions`, and for that we track registered variables and namespaces, and de-register them after the query eval completes.
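
To make the trade-off concrete, here is a toy model of the two reset strategies. Everything in it is hypothetical (`RegistrySketch`, the short built-in list, the counter); it only assumes that restoring the built-in functions is the expensive step, as described above, and it does not call libxml2.

```ruby
# Toy model of the two reset strategies (hypothetical; not libxml2 or Nokogiri API).
class RegistrySketch
  BUILTINS = %w[count position last string concat].freeze

  attr_reader :builtin_registrations

  def initialize
    @functions = {}
    @variables = {}
    @tracked = []               # variable names registered by the caller
    @builtin_registrations = 0  # how many times the expensive step ran
    register_builtins
  end

  # Stands in for the expensive re-registration of the standard functions.
  def register_builtins
    BUILTINS.each { |name| @functions[name] = :builtin }
    @builtin_registrations += 1
  end

  # Passing nil deregisters the variable.
  def register_variable(name, value)
    if value.nil?
      @variables.delete(name)
      @tracked.delete(name)
    else
      @variables[name] = value
      @tracked << name unless @tracked.include?(name)
    end
  end

  # Simple approach: wipe everything, then pay to restore the built-ins.
  def reset_by_clearing
    @functions.clear
    @variables.clear
    @tracked.clear
    register_builtins
  end

  # Tracked approach: remove only what the caller registered.
  def reset_by_tracking
    @tracked.dup.each { |name| register_variable(name, nil) }
  end

  def function?(name)
    @functions.key?(name)
  end

  def variable?(name)
    @variables.key?(name)
  end
end

clearing = RegistrySketch.new
tracking = RegistrySketch.new
3.times do
  clearing.register_variable("limit", 10)
  clearing.reset_by_clearing
  tracking.register_variable("limit", 10)
  tracking.reset_by_tracking
end
```

Both registries end up in the same state after each query, but in this model the clearing strategy repeats the built-in registration on every reset while the tracking strategy performs it only at construction.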
CHANGELOG.md (+5)

@@ -19,6 +19,11 @@ This release ships separate precompiled GNU and Musl gems for all linux platform
 This release drops precompiled native platform gems for `x86-linux` and `x86-mingw32`. **These platforms are still supported.** Users on these platforms must install the "ruby platform" gem which requires a compiler toolchain. See [Installing the `ruby` platform gem](https://nokogiri.org/tutorials/installing_nokogiri.html#installing-the-ruby-platform-gem) in the installation docs. (#3369, #3081)
 
 
+### Improved
+
+* CSS and XPath queries are faster now that `Node#xpath`, `Node#css`, and related functions are re-using the underlying xpath context object (which is expensive to initialize). We benchmarked a 2.8x improvement for a 6kb file, and a more modest 1.3x improvement for a 70kb file. (#3378) @flavorjones