docs/src/tutorials/snoop_inference.md (+10 −3)
@@ -124,9 +124,9 @@ The second number is the *inclusive* time, which is the exclusive time plus the
 Therefore, the inclusive time is always at least as large as the exclusive time.
 
 The `ROOT` node is a bit different: its exclusive time measures the time spent on all operations *except* inference.
-In this case, we see that the entire call took approximately 10ms, of which 9.3ms was spent on activities besides inference.
+In this case, we see that the entire call took approximately 3.3ms, of which 2.7ms was spent on activities besides inference.
 Almost all of that was code-generation, but it also includes the time needed to run the code.
-Just 0.76ms was needed to run type-inference on this entire series of calls.
+Just 0.55ms was needed to run type-inference on this entire series of calls.
 As you will quickly discover, inference takes much more time on more complicated code.
 
 We can also display this tree as a flame graph, using the [ProfileView.jl](https://github.com/timholy/ProfileView.jl) package:
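The workflow up to this point can be sketched as follows. This is a minimal sketch, assuming SnoopCompileCore, SnoopCompile, and AbstractTrees are installed; `sum(rand(10))` is a stand-in workload, not the tutorial's example:

```julia
using SnoopCompileCore
# Capture inference timing for a workload (best run in a fresh session,
# so that nothing is already compiled):
tinf = @snoop_inference begin
    sum(rand(10))          # stand-in workload; substitute your own
end

using SnoopCompile, AbstractTrees
print_tree(tinf; maxdepth = 2)   # text view of the inference tree

# As described above, inclusive time is at least the exclusive time:
inclusive(tinf), exclusive(tinf)

# For the graphical flame graph (opens an interactive window):
# using ProfileView
# ProfileView.view(flamegraph(tinf))
```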
@@ -155,10 +155,17 @@ Users are encouraged to read the ProfileView documentation to understand how to
 - ctrl-click can be used to zoom in
 - empty horizontal spaces correspond to activities other than type-inference
 - any boxes colored red (there are none in this particular example, but you'll see some later) correspond to *naively non-precompilable* `MethodInstance`s, in which the method is owned by one module but the types are from another unrelated module. Such `MethodInstance`s are omitted from the precompile cache file unless they've been "marked" by `PrecompileTools.@compile_workload` or an explicit `precompile` directive.
-- any boxes colored orange-yellow (there is one in this demo) correspond to methods inferred for specific constants (constant propagation)
+- any boxes colored orange-yellow (there is one in this demo) correspond to methods inferred for specific constants (constant propagation).
 
 You can explore this flamegraph and compare it to the output from `print_tree`.
 
+!!! note
+    Orange-yellow boxes that appear at the base of a flame are worth special attention, and may represent something that you thought you had precompiled. For example, suppose your workload "exercises" `myfun(args...; warn=true)`, so you might think you have `myfun` covered for the corresponding argument *types*. But constant-propagation (as indicated by the orange-yellow coloration) results in (re)compilation for specific *values*: if Julia has decided that `myfun` merits constant-propagation, a call `myfun(args...; warn=false)` might need to be compiled separately.
+
+    When you want to prevent constant-propagation from hurting your TTFX, you have two options:
+    - precompile for all relevant argument *values* as well as types. The most common argument types to trigger Julia's constprop heuristics are numbers (`Bool`/`Int`/etc.) and `Symbol`.
+    - disable constant-propagation for this method by adding `Base.@constprop :none` in front of your definition of `myfun`. Constant-propagation can be a big performance boost when it changes how performance-sensitive code is optimized for specific input values, but when this doesn't apply you can safely disable it.
+
 Finally, [`flatten`](@ref), on its own or together with [`accumulate_by_source`](@ref), allows you to get a sense for the cost of individual `MethodInstance`s or `Method`s.
 
 The tools here allow you to get an overview of where inference is spending its time.
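The second option from the note on constant propagation can be illustrated with a small, self-contained method. The name `myfun` and the `warn` keyword follow the note's hypothetical example; this is a sketch, not code from the tutorial:

```julia
# Without the annotation, Julia's heuristics may constant-propagate the
# Bool keyword and compile myfun separately for warn=true and warn=false.
# Base.@constprop :none asks the compiler not to specialize on the
# constant *values* of the arguments:
Base.@constprop :none function myfun(x; warn::Bool = true)
    warn && @info "processing $x"
    return 2x
end

myfun(3; warn = false)   # → 6
```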
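The `flatten`/`accumulate_by_source` workflow mentioned above might look like this (again a sketch with a stand-in workload, assuming SnoopCompileCore and SnoopCompile are installed):

```julia
using SnoopCompileCore
tinf = @snoop_inference sum(rand(10))   # stand-in workload

using SnoopCompile
flat = flatten(tinf)                # one entry per MethodInstance, sorted by cost
agg  = accumulate_by_source(flat)   # aggregate the inference cost per Method
last(agg)                           # the most expensive Method comes last
```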