Commit 96c83db

Rewrite the docs (#389)
This is a major rewrite of the documentation for the PrecompileTools era. It tries to more cleanly separate tutorials from explanations, and it contains both introductory and advanced tutorials.
1 parent 34b419c commit 96c83db

32 files changed

+1170
-2607
lines changed

.github/workflows/Documenter.yml

+5
Original file line numberDiff line numberDiff line change
@@ -26,6 +26,11 @@ jobs:
2626
version: '1'
2727
- run: julia --project -e 'using Pkg; Pkg.develop([PackageSpec(path=joinpath(pwd(), "SnoopCompileCore"))])'
2828
- uses: julia-actions/julia-buildpkg@latest
29+
# To access the developer tools from within a package's environment, they should be in the default environment
30+
- run: julia -e 'using Pkg; Pkg.develop([PackageSpec(path=joinpath(pwd(), "SnoopCompileCore")), PackageSpec(path=joinpath(pwd()))]); Pkg.instantiate()'
31+
# Additional packages we'll need
32+
- run: julia -e 'using Pkg; Pkg.add(["AbstractTrees", "Cthulhu"])' # pyplot would be nice but it often errors
33+
# Documenter wants them to be in the local environment
2934
- run: julia --project=docs/ -e 'using Pkg; Pkg.develop([PackageSpec(path=joinpath(pwd(), "SnoopCompileCore")), PackageSpec(path=joinpath(pwd()))]); Pkg.instantiate()'
3035
- uses: julia-actions/julia-docdeploy@releases/v1
3136
env:

SnoopCompileCore/src/snoop_inference.jl

+24-20
@@ -98,36 +98,40 @@ function _snoop_inference(cmd::Expr)
9898
end
9999

100100
"""
101-
tinf = @snoop_inference commands
101+
tinf = @snoop_inference commands;
102102
103-
Produce a profile of julia's type inference, recording the amount of time spent inferring
104-
every `MethodInstance` processed while executing `commands`. Each fresh entrance to
105-
type inference (whether executed directly in `commands` or because a call was made
106-
by runtime-dispatch) also collects a backtrace so the caller can be identified.
103+
Produce a profile of julia's type inference, recording the amount of time spent
104+
inferring every `MethodInstance` processed while executing `commands`. Each
105+
fresh entrance to type inference (whether executed directly in `commands` or
106+
because a call was made by runtime-dispatch) also collects a backtrace so the
107+
caller can be identified.
107108
108-
`tinf` is a tree, each node containing data on a particular inference "frame" (the method,
109-
argument-type specializations, parameters, and even any constant-propagated values).
110-
Each reports the [`exclusive`](@ref)/[`inclusive`](@ref) times, where the exclusive
111-
time corresponds to the time spent inferring this frame in and of itself, whereas
112-
the inclusive time includes the time needed to infer all the callees of this frame.
109+
`tinf` is a tree, each node containing data on a particular inference "frame"
110+
(the method, argument-type specializations, parameters, and even any
111+
constant-propagated values). Each reports the
112+
[`exclusive`](@ref)/[`inclusive`](@ref) times, where the exclusive time
113+
corresponds to the time spent inferring this frame in and of itself, whereas the
114+
inclusive time includes the time needed to infer all the callees of this frame.
113115
114116
The top-level node in this profile tree is `ROOT`. Uniquely, its exclusive time
115-
corresponds to the time spent _not_ in julia's type inference (codegen, llvm_opt, runtime, etc).
117+
corresponds to the time spent _not_ in julia's type inference (codegen,
118+
llvm_opt, runtime, etc).
116119
117-
There are many different ways of inspecting and using the data stored in `tinf`.
118-
The simplest is to load the `AbstracTrees` package and display the tree with
119-
`AbstractTrees.print_tree(tinf)`.
120-
See also: `flamegraph`, `flatten`, `inference_triggers`, `SnoopCompile.parcel`,
121-
`runtime_inferencetime`.
120+
Working with `tinf` effectively requires loading `SnoopCompile`.
121+
122+
!!! warning
123+
Note the semicolon `;` at the end of the `@snoop_inference` macro call.
124+
Because `SnoopCompileCore` is not permitted to invalidate any code, it cannot define
125+
the `Base.show` methods that pretty-print `tinf`. Defer inspection of `tinf`
126+
until `SnoopCompile` has been loaded.
122127
123128
# Example
124-
```jldoctest; setup=:(using SnoopCompile), filter=r"([0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?/[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?|\\d direct)"
129+
130+
```jldoctest; setup=:(using SnoopCompileCore), filter=r"([0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?/[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?|\\d direct)"
125131
julia> tinf = @snoop_inference begin
126132
sort(rand(100)) # Evaluate some code and profile julia's type inference
127-
end
128-
InferenceTimingNode: 0.110018224/0.131464476 on Core.Compiler.Timings.ROOT() with 2 direct children
133+
end;
129134
```
130-
131135
"""
132136
macro snoop_inference(cmd)
133137
return _snoop_inference(cmd)
+8-6
@@ -1,18 +1,18 @@
11
export @snoop_invalidations
22

33
"""
4-
list = @snoop_invalidations expr
4+
invs = @snoop_invalidations expr
55
66
Capture method cache invalidations triggered by evaluating `expr`.
7-
`list` is a sequence of invalidated `Core.MethodInstance`s together with "explanations," consisting
7+
`invs` is a sequence of invalidated `Core.MethodInstance`s together with "explanations," consisting
88
of integers (encoding depth) and strings (documenting the source of an invalidation).
99
10-
Unless you are working at a low level, you essentially always want to pass `list`
10+
Unless you are working at a low level, you essentially always want to pass `invs`
1111
directly to [`SnoopCompile.invalidation_trees`](@ref).
1212
1313
# Extended help
1414
15-
`list` is in a format where the "reason" comes after the items.
15+
`invs` is in a format where the "reason" comes after the items.
1616
Method deletion results in the sequence
1717
1818
[zero or more (mi, "invalidate_mt_cache") pairs..., zero or more (depth1 tree, loctag) pairs..., method, loctag] with loctag = "jl_method_table_disable"
@@ -22,14 +22,16 @@ where `mi` means a `MethodInstance`. `depth1` means a sequence starting at `dept
2222
Method insertion results in the sequence
2323
2424
[zero or more (depth0 tree, sig) pairs..., same info as with delete_method except loctag = "jl_method_table_insert"]
25+
26+
The authoritative reference is Julia's own `src/gf.c` file.
2527
"""
2628
macro snoop_invalidations(expr)
2729
quote
28-
local list = ccall(:jl_debug_method_invalidation, Any, (Cint,), 1)
30+
local invs = ccall(:jl_debug_method_invalidation, Any, (Cint,), 1)
2931
Expr(:tryfinally,
3032
$(esc(expr)),
3133
ccall(:jl_debug_method_invalidation, Any, (Cint,), 0)
3234
)
33-
list
35+
invs
3436
end
3537
end

SnoopCompileCore/src/snoop_llvm.jl

+4-5
@@ -3,11 +3,10 @@ export @snoop_llvm
33
using Serialization
44

55
"""
6-
```
7-
@snoop_llvm "func_names.csv" "llvm_timings.yaml" begin
8-
# Commands to execute, in a new process
9-
end
10-
```
6+
@snoop_llvm "func_names.csv" "llvm_timings.yaml" begin
7+
# Commands to execute, in a new process
8+
end
9+
1110
causes the julia compiler to log timing information for LLVM optimization during the
1211
provided commands to the files "func_names.csv" and "llvm_timings.yaml". These files can
1312
be used for the input to `SnoopCompile.read_snoop_llvm("func_names.csv", "llvm_timings.yaml")`.

docs/Project.toml

+4
@@ -1,15 +1,19 @@
11
[deps]
22
AbstractTrees = "1520ce14-60c1-5f80-bbc7-55ef81b5835c"
3+
Cthulhu = "f68482b8-f384-11e8-15f7-abe071a5a75f"
34
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
45
JET = "c3a54625-cd67-489e-a8e7-0a5a0ff4e31b"
56
MethodAnalysis = "85b6ec6f-f7df-4429-9514-a64bcd9ee824"
67
PyPlot = "d330b81b-6aea-500a-939a-2ce795aea3ee"
78
SnoopCompile = "aa65fe97-06da-5843-b5b1-d5d13cad87d2"
9+
SnoopCompileCore = "e2b509da-e806-4183-be48-004708413034"
810

911
[compat]
1012
AbstractTrees = "0.4"
13+
Cthulhu = "2"
1114
Documenter = "1"
1215
JET = "0.9"
1316
MethodAnalysis = "0.4"
1417
PyPlot = "2"
1518
SnoopCompile = "3"
19+
SnoopCompileCore = "3"

docs/make.jl

+10-6
@@ -1,19 +1,23 @@
11
using Documenter
2+
using SnoopCompileCore
23
using SnoopCompile
34
import PyPlot # so that the visualizations.jl file is loaded
45

56
makedocs(
67
sitename = "SnoopCompile",
78
format = Documenter.HTML(
8-
prettyurls = get(ENV, "CI", nothing) == "true"
9+
prettyurls = true,
910
),
10-
modules = [SnoopCompile.SnoopCompileCore, SnoopCompile],
11-
linkcheck = true,
11+
modules = [SnoopCompileCore, SnoopCompile],
12+
linkcheck = true, # the link check is slow, set to false if you're building frequently
1213
# doctest = :fix,
14+
warnonly=true, # delete when https://github.com/JuliaDocs/Documenter.jl/issues/2541 is fixed
1315
pages = ["index.md",
14-
"tutorial.md",
15-
"Modern tools" => ["snoop_invalidations.md", "snoop_inference.md", "pgdsgui.md", "snoop_inference_analysis.md", "snoop_inference_parcel.md", "jet.md"],
16-
"reference.md"],
16+
"Basic tutorials" => ["tutorials/invalidations.md", "tutorials/snoop_inference.md", "tutorials/snoop_llvm.md", "tutorials/pgdsgui.md", "tutorials/jet.md"],
17+
"Advanced tutorials" => ["tutorials/snoop_inference_analysis.md", "tutorials/snoop_inference_parcel.md"],
18+
"Explanations" => ["explanations/tools.md", "explanations/gotchas.md", "explanations/fixing_inference.md"],
19+
"reference.md",
20+
]
1721
)
1822

1923
deploydocs(

docs/src/explanations/basic.md

+40
@@ -0,0 +1,40 @@
1+
# Understanding SnoopCompile and Julia's compilation pipeline
2+
3+
Julia uses
4+
[Just-in-time (JIT) compilation](https://en.wikipedia.org/wiki/Just-in-time_compilation) to
5+
generate the code that runs on your CPU.
6+
Broadly speaking, there are two major compilation steps: *inference* and *code generation*.
7+
Inference is the process of determining the type of each object, which in turn
8+
determines which specific methods get called; once type inference is complete,
9+
code generation performs optimizations and ultimately generates the assembly
10+
language (native code) used on CPUs.
11+
Some aspects of this process are documented [here](https://docs.julialang.org/en/v1/devdocs/eval/).
12+
13+
Using code that has never been compiled requires that it first be JIT-compiled, and this contributes to the latency of using the package.
14+
In some circumstances, you can cache (store) the results of compilation to files to
15+
reduce the latency when your package is used. These files are the `*.ji` and
16+
`*.so` files that live in the `compiled` directory of your Julia depot, usually
17+
located at `~/.julia/compiled`. However, if these files become large, loading
18+
them can be another source of latency. Julia needs time both to load and
19+
validate the cached compiled code. Minimizing the latency of using a package
20+
involves focusing on caching the compilation of code that is both commonly used
21+
and takes time to compile.
22+
23+
Caching code for later use is called *precompilation*. Julia has had some forms of precompilation almost since the very first packages. However, it was [Julia
24+
1.9](https://julialang.org/blog/2023/04/julia-1.9-highlights/#caching_of_native_code) that first supported "complete" precompilation, including the ability to store native code in shared-library cache files.
25+
26+
SnoopCompile is designed to help you analyze the costs of JIT-compilation, identify
27+
key bottlenecks that contribute to latency, and set up `precompile` directives to see whether
28+
they produce measurable benefits.
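
As a rough illustration of what a `precompile` directive looks like, here is a minimal sketch using only Base's `precompile` function. The function `sumsq` is hypothetical, standing in for a package method whose compilation you might want to cache; it does not come from SnoopCompile.

```julia
# Hypothetical function standing in for a package method worth caching
sumsq(v::AbstractVector) = sum(abs2, v)

# Request compilation of the Vector{Float64} specialization ahead of the first call.
# `precompile` returns `true` when the request succeeds.
ok = precompile(sumsq, (Vector{Float64},))

# Time a first call; compare against a fresh session without the directive
# to judge whether the directive is worth keeping.
t_first = @elapsed sumsq(rand(100))
```

Placed at the top level of a package module, such a directive runs while the package is being precompiled, so the compiled specialization can be written to the cache file.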
29+
30+
## Package precompilation
31+
32+
When a package is precompiled, here's what happens under the hood:
33+
34+
- Julia loads all of the package's dependencies (the ones in the `[deps]` section of the `Project.toml` file), typically from precompile cache files
35+
- Julia evaluates the source code (text files) that define the package module(s). Evaluating `function foo(args...) ... end` creates a new method `foo`. Note that:
36+
+ the source code might also contain statements that create "data" (e.g., `const`s). In some cases this can lead to some subtle precompilation ["gotchas"](@ref running-during-pc)
37+
+ the source code might also contain a precompile workload, which forces compilation and tracking of package methods.
38+
- Julia iterates over the module contents and writes the *result* to disk. Note that the module contents might include compiled code, and if so it is written along with everything else to the cache file.
39+
40+
When Julia loads your package, it just loads the "snapshot" stored in the cache file: it does not re-evaluate the source-text files that defined your package! It is appropriate to think of the source files of your package as "build scripts" that create your module; once the "build scripts" are executed, it's the module itself that gets cached, and the job of the build scripts is done.
+165
@@ -0,0 +1,165 @@
1+
# Techniques for fixing inference problems
2+
3+
Here we assume you've dug into your code with a tool like Cthulhu, and want to know how to fix some of the problems that you discover. Below is a collection of specific cases and some tricks for handling them.
4+
5+
Note that there is also a [tutorial on fixing inference](@ref inferrability) that delves into advanced topics.
6+
7+
## Adding type annotations
8+
9+
### Using concrete types
10+
11+
Defining variables like `list = []` can be convenient, but it creates a `list` of type `Vector{Any}`. This prevents inference from knowing the type of items extracted from `list`. Using `list = String[]` for a container of strings, etc., is an excellent fix. When in doubt, check the type with `isconcretetype`: a common mistake is to think that `list_of_lists = Array{Int}[]` gives you a vector-of-vectors, but
12+
13+
```jldoctest
14+
julia> isconcretetype(Array{Int})
15+
false
16+
```
17+
18+
reminds you that `Array` requires a second parameter indicating the dimensionality of the array. (Or use `list_of_lists = Vector{Int}[]` instead, as `Vector{Int} === Array{Int, 1}`.)
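
A short sketch of the same check applied to a few containers. The variable names and the helper `firstitem` are hypothetical and serve only to illustrate the point; `Base.return_types` is used here just to display what inference concludes.

```julia
list_any     = []             # Vector{Any}: inference knows nothing about the items
list_str     = String[]       # Vector{String}: items are known to be String
lol_abstract = Array{Int}[]   # element type Array{Int} lacks the dimensionality parameter
lol_concrete = Vector{Int}[]  # element type Array{Int,1} is fully specified

# Check whether each container's element type is concrete
concrete_flags = map(isconcretetype ∘ eltype,
                     (list_any, list_str, lol_abstract, lol_concrete))

# Hypothetical helper: extract one item, then ask inference what type it will have
firstitem(list) = list[1]
rt_any = Base.return_types(firstitem, (Vector{Any},))
rt_str = Base.return_types(firstitem, (Vector{String},))
```

Only the containers whose element type passes `isconcretetype` let inference know the type of an extracted item.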
19+
20+
Many valuable tips can be found among [Julia's performance tips](https://docs.julialang.org/en/v1/manual/performance-tips/), and readers are encouraged to consult that page.
21+
22+
### Working with non-concrete types
23+
24+
In cases where invalidations occur, but you can't use concrete types (there are indeed many valid uses of `Vector{Any}`),
25+
you can often prevent the invalidation using some additional knowledge.
26+
One common example is extracting information from an [`IOContext`](https://docs.julialang.org/en/v1/manual/networking-and-streams/#IO-Output-Contextual-Properties-1) structure, which is roughly defined as
27+
28+
```julia
29+
struct IOContext{IO_t <: IO} <: AbstractPipe
30+
io::IO_t
31+
dict::ImmutableDict{Symbol, Any}
32+
end
33+
```
34+
35+
There are good reasons that `dict` uses a value-type of `Any`, but that makes it impossible for the compiler to infer the type of any object looked up in an `IOContext`.
36+
Fortunately, you can help!
37+
For example, the documentation specifies that the `:color` setting should be a `Bool`, and since it appears in documentation it's something we can safely enforce.
38+
Changing
39+
40+
```julia
41+
iscolor = get(io, :color, false)
42+
```
43+
44+
to
45+
46+
```julia
47+
iscolor = get(io, :color, false)::Bool # assert that the rhs is Bool-valued
48+
```
49+
50+
will throw an error if it isn't a `Bool`, and this allows the compiler to take advantage of the type being known in subsequent operations.
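
The effect of the assertion can be checked with a short sketch. The helper names `iscolor_plain` and `iscolor_asserted` are hypothetical, introduced only for illustration, and `Base.return_types` is used merely to show what inference concludes.

```julia
# Hypothetical helpers: the same lookup without and with the type-assertion
iscolor_plain(io) = get(io, :color, false)
iscolor_asserted(io) = get(io, :color, false)::Bool

io = IOContext(IOBuffer(), :color => true)

# Both helpers return the stored value when it really is a Bool
plain_value = iscolor_plain(io)
asserted_value = iscolor_asserted(io)

# Ask inference for the return type of each helper on this IOContext type
rt_plain = Base.return_types(iscolor_plain, (typeof(io),))
rt_asserted = Base.return_types(iscolor_asserted, (typeof(io),))  # narrowed by the assertion

# A non-Bool value violates the documented contract, so the asserted version throws
bad = IOContext(IOBuffer(), :color => "yes")
threw_typeerror = try
    iscolor_asserted(bad)
    false
catch err
    err isa TypeError
end
```

The trade-off is explicit: the asserted version gives up tolerance for undocumented values in exchange for a known return type in every caller.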
51+
52+
If the return type is one of a small number of possibilities (generally three or fewer), you can annotate the return type with `Union{...}`. This is generally advantageous only when the intersection of what inference already knows about the types of a variable and the types in the `Union` results in a concrete type.
53+
54+
As a more detailed example, suppose you're writing code that parses Julia's `Expr` type:
55+
56+
```julia
57+
julia> ex = :(Array{Float32,3})
58+
:(Array{Float32, 3})
59+
60+
julia> dump(ex)
61+
Expr
62+
head: Symbol curly
63+
args: Array{Any}((3,))
64+
1: Symbol Array
65+
2: Symbol Float32
66+
3: Int64 3
67+
```
68+
69+
`ex.args` is a `Vector{Any}`.
70+
However, for a `:curly` expression only certain types will be found among the arguments; you could write key portions of your code as
71+
72+
```julia
73+
a = ex.args[2]
74+
if a isa Symbol
75+
# inside this block, Julia knows `a` is a Symbol, and so methods called on `a` will be resistant to invalidation
76+
foo(a)
77+
elseif a isa Expr && length((a::Expr).args) > 2
78+
a::Expr # sometimes you have to help inference by adding a type-assert
79+
x = bar(a) # `bar` is now resistant to invalidation
80+
elseif a isa Integer
81+
# even though you've not made this fully-inferrable, you've at least reduced the scope for invalidations
82+
# by limiting the subset of `foobar` methods that might be called
83+
y = foobar(a)
84+
end
85+
```
86+
87+
Other tricks include replacing broadcasting on `v::Vector{Any}` with `Base.mapany(f, v)`--`mapany` avoids trying to narrow the type of `f(v[i])` and just assumes it will be `Any`, thereby avoiding invalidations of many `convert` methods.
88+
89+
Adding type-assertions and fixing inference problems are the most common approaches for fixing invalidations.
90+
You can discover these manually, but using Cthulhu is highly recommended.
91+
92+
## Inferrable field access for abstract types
93+
94+
When invalidations happen for methods that manipulate fields of abstract types, often there is a simple solution: create an "interface" for the abstract type specifying that certain fields must have certain types.
95+
Here's an example:
96+
97+
```julia
98+
abstract type AbstractDisplay end
99+
100+
struct Monitor <: AbstractDisplay
101+
height::Int
102+
width::Int
103+
maker::String
104+
end
105+
106+
struct Phone <: AbstractDisplay
107+
height::Int
108+
width::Int
109+
maker::Symbol
110+
end
111+
112+
function Base.show(@nospecialize(d::AbstractDisplay), x)
113+
str = string(x)
114+
w = d.width
115+
if length(str) > w # do we have to truncate to fit the display width?
116+
...
117+
```
118+
119+
In this `show` method, we've deliberately chosen to prevent specialization on the specific type of `AbstractDisplay` (to reduce the total number of times we have to compile this method).
120+
As a consequence, Julia's inference may not realize that `d.width` returns an `Int`.
121+
122+
Fortunately, you can help by defining an interface for generic `AbstractDisplay` objects:
123+
124+
```julia
125+
function Base.getproperty(d::AbstractDisplay, name::Symbol)
126+
if name === :height
127+
return getfield(d, :height)::Int
128+
elseif name === :width
129+
return getfield(d, :width)::Int
130+
elseif name === :maker
131+
return getfield(d, :maker)::Union{String,Symbol}
132+
end
133+
return getfield(d, name)
134+
end
135+
```
136+
137+
Julia's [constant propagation](https://en.wikipedia.org/wiki/Constant_folding) will ensure that most accesses of those fields will be determined at compile-time, so this simple change robustly fixes many inference problems.
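
One way to check that such an interface has the intended effect is to ask inference directly. The sketch below is self-contained: the wrapper module `DisplayDemo` and the helper `width_of` are hypothetical names used only for illustration (the module also avoids a clash with the `AbstractDisplay` exported by Base), and the result assumes a Julia version that constant-propagates the field name through `getproperty`.

```julia
module DisplayDemo  # hypothetical wrapper module, only to keep the sketch self-contained

abstract type AbstractDisplay end

struct Monitor <: AbstractDisplay
    height::Int
    width::Int
    maker::String
end

# The interface: every AbstractDisplay promises these field types
function Base.getproperty(d::AbstractDisplay, name::Symbol)
    if name === :height
        return getfield(d, :height)::Int
    elseif name === :width
        return getfield(d, :width)::Int
    elseif name === :maker
        return getfield(d, :maker)::Union{String,Symbol}
    end
    return getfield(d, name)
end

# Hypothetical caller that, like the `show` method, is not specialized on the concrete type
width_of(@nospecialize(d::AbstractDisplay)) = d.width

end # module

# What does inference conclude when only the abstract type is known?
rt = Base.return_types(DisplayDemo.width_of, (DisplayDemo.AbstractDisplay,))

# Ordinary field access still works through the interface
w = DisplayDemo.width_of(DisplayDemo.Monitor(1080, 1920, "Acme"))
```

If `rt` reports `Int` rather than `Any`, callers of `d.width` no longer depend on the concrete display type.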
138+
139+
## Fixing `Core.Box`
140+
141+
[Julia issue 15276](https://github.com/JuliaLang/julia/issues/15276) is one of the more surprising forms of inference failure; it is the most common cause of a `Core.Box` annotation.
142+
If other variables depend on the `Box`ed variable, then a single `Core.Box` can lead to widespread inference problems.
143+
For this reason, these are also among the first inference problems you should tackle.
144+
145+
Read [this explanation of why this happens and what you can do to fix it](https://docs.julialang.org/en/v1/manual/performance-tips/#man-performance-captured).
146+
If you are directed to find `Core.Box` inference triggers via [`suggest`](@ref), you may need to explore around the call site a bit--
147+
the inference trigger may be in the closure itself, but the fix needs to go in the method that creates the closure.
148+
149+
Use of `ascend` is highly recommended for fixing `Core.Box` inference failures.
150+
151+
## Handling edge cases
152+
153+
You can sometimes get invalidations from failing to handle "formal" possibilities.
154+
For example, operations with regular expressions might return a `Union{Nothing, RegexMatch}`.
155+
Code that fails to account for the possibility that `nothing` might be returned can infer poorly.
156+
For example, a comprehension
157+
158+
```julia
159+
ms = [m.match for m in match.((rex,), my_strings)]
160+
```
161+
might be replaced with
162+
```julia
163+
ms = [m.match for m in match.((rex,), my_strings) if m !== nothing]
164+
```
165+
and return a better-typed result.
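
A minimal sketch of the difference, using made-up inputs (`rex` and `my_strings` below are illustrative values, not taken from any package):

```julia
rex = r"\d+"
my_strings = ["abc123", "no digits here", "x7"]

# Without the filter, the non-matching string yields `nothing`,
# and accessing `.match` on it throws.
unfiltered_failed = try
    [m.match for m in match.((rex,), my_strings)]
    false
catch
    true
end

# With the filter, only real matches reach `m.match`
ms = [m.match for m in match.((rex,), my_strings) if m !== nothing]
```

Besides improving inference, the filter makes the behavior for non-matching strings explicit rather than an accident of the input.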
