Commit 9e8ed10

Merge pull request #97 from galmasi/python_eventlog_parsing
Python eventlog parsing for measured boot attestation
2 parents 3f75dab + dd59fc8 commit 9e8ed10

1 file changed

+259
-0
lines changed

98_pure_python_eventlog_parsing.md

+259
@@ -0,0 +1,259 @@
1+
# Python Eventlog Parsing for Measured Boot Attestation
2+
3+
<!-- toc -->
- [Release Signoff Checklist](#release-signoff-checklist)
- [Summary](#summary)
- [Motivation](#motivation)
  - [Goals](#goals)
  - [Non-Goals](#non-goals)
- [Proposal](#proposal)
  - [User Stories (optional)](#user-stories-optional)
    - [Story 1](#story-1)
    - [Story 2](#story-2)
  - [Notes/Constraints/Caveats (optional)](#notesconstraintscaveats-optional)
  - [Risks and Mitigations](#risks-and-mitigations)
- [Design Details](#design-details)
  - [Test Plan](#test-plan)
  - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
- [Drawbacks](#drawbacks)
- [Alternatives](#alternatives)
- [Infrastructure Needed (optional)](#infrastructure-needed-optional)
<!-- /toc -->
22+
23+
## Release Signoff Checklist
24+
25+
- [ ] Enhancement issue in release milestone, which links to pull request in [keylime/enhancements]
26+
- [ ] Core members have approved the issue with the label `implementable`
27+
- [ ] Design details are appropriately documented
28+
- [ ] Test plan is in place
29+
- [ ] User-facing documentation has been created in [keylime/keylime-docs]
30+
31+
## Summary
32+
33+
The proposal is to replace the dependency on `tpm2-tools` in the
Keylime measured boot attestation code with native Python code.
36+
37+
Most of the change in Keylime itself is to the nature of the parser's
invocation: from `os.system` to native Python. Additional complexity
reduction can be achieved by removing the Python-based post-processing
of the event log and merging that code into the parser. Current
features of post-processing include correct handling of UUIDs (badly
parsed in earlier versions of `tpm2-tools`) and `libefivar`
enhancements to parse device path entries.
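
The UUID issue stems from the mixed-endian layout of EFI GUIDs: the first three fields are stored little-endian, while the last eight bytes are stored in order. As an illustration only (a minimal sketch, not the parser's actual code; the helper name `parse_efi_guid` is hypothetical), Python's standard `uuid` module can decode this layout through its `bytes_le` argument:

```python
import uuid


def parse_efi_guid(raw: bytes) -> str:
    """Decode a 16-byte EFI GUID as stored in a binary event log."""
    if len(raw) != 16:
        raise ValueError("an EFI GUID is exactly 16 bytes")
    # bytes_le: first three fields little-endian, last eight bytes as-is.
    return str(uuid.UUID(bytes_le=raw))


# In-memory bytes of the EFI global variable GUID.
RAW_EFI_GLOBAL_VARIABLE = bytes.fromhex("61dfe48bca93d211aa0d00e098032b8c")

print(parse_efi_guid(RAW_EFI_GLOBAL_VARIABLE))
# Reading the same bytes in network order scrambles the first three fields:
print(str(uuid.UUID(bytes=RAW_EFI_GLOBAL_VARIABLE)))
```

The second `print` shows the kind of output a reader gets when the buffer is wrongly treated as being in network byte order.
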
44+
45+
We propose to keep the event log parser itself as a separate project
within the Keylime code base, and to package it independently. This
was proposed by the project maintainers as a way to keep disruptions
in Keylime to a minimum; once the Python-based event log parser is
packaged, it can be listed as a dependency of Keylime itself. To this
end, the Python event log parser will have its own CI, unit tests,
code checks, and packaging mechanisms.
52+
53+
## Motivation
54+
55+
The chief motivation for this proposal is to keep measured boot
56+
attestation safe from bit rot. We have, in the last two years of
57+
operating Keylime (2021 to 2023), experienced at least two incidents
58+
in which upgrading `tpm2-tools` to a new version caused measured boot
59+
attestation to malfunction, because the output format (nominally YAML)
60+
of the tool was changed.
61+
62+
Generally speaking, the quality of `tpm2-tools` has been improving.
Even so, some of the changes have been disruptive for us. Examples we
have encountered:
* trailing zeroes handled inconsistently across versions (excluded in some, included in others);
* multiline output handled incorrectly in `tpm2-tools` v. 5.5;
* certain input parsed to strings of incorrect length, resulting in UTF-16 garbage;
* UUIDs interpreted incorrectly as being in network order in v. 5.11.
68+
69+
These are generally easy to correct for, but require us to actively
70+
follow changes in the tooling, and each new version of `tpm2-tools`
71+
requires us to adjust.
72+
73+
Therefore we propose to replace the approx. 1000 lines of C code in
the `tpm2-tools` package with a somewhat shorter body (700 lines at
present) of native Python code, maintained by the Keylime project. The
code savings come from three sources: the OO features of Python allow
for simpler, more natural code organization; Python's `struct` module
is very effective at parsing binary structures into native Python
objects; and Python has built-in JSON support.
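
To illustrate the `struct` and JSON points, here is a minimal sketch (not taken from the actual parser) that decodes one legacy SHA-1 `TCG_PCR_EVENT` record and prints it as JSON. The function name `parse_legacy_event` and the JSON key names are assumptions made for this example; the field layout follows the TCG legacy event format.

```python
import json
import struct

# Legacy (SHA-1) TCG_PCR_EVENT header, little-endian and unpadded:
# pcrIndex (u32), eventType (u32), digest (20 bytes), eventSize (u32).
LEGACY_HEADER = struct.Struct("<II20sI")


def parse_legacy_event(buf: bytes, offset: int = 0):
    """Return (record, next_offset) for the event starting at offset."""
    pcr_index, event_type, digest, event_size = LEGACY_HEADER.unpack_from(buf, offset)
    start = offset + LEGACY_HEADER.size
    data = buf[start:start + event_size]
    if len(data) != event_size:
        raise ValueError("event data runs past the end of the buffer")
    record = {
        "PCRIndex": pcr_index,
        "EventType": event_type,
        "Digest": digest.hex(),
        "EventSize": event_size,
        "Event": data.hex(),
    }
    return record, start + event_size


# Hand-built sample record: PCR 0, EV_SEPARATOR (0x4), four zero bytes of data.
sample = LEGACY_HEADER.pack(0, 0x4, bytes(20), 4) + bytes(4)
record, next_offset = parse_legacy_event(sample)
print(json.dumps(record, indent=4))
```

A header that is cut short makes `unpack_from` raise `struct.error`, and truncated event data raises `ValueError`, so a malformed buffer aborts parsing rather than yielding a partial record.
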
80+
81+
### Goals
82+
83+
* Improve the stability of measured boot attestation over time.
84+
* Improve the quality of the measured boot event log parser.
85+
* Reduce Keylime's dependency on foreign tooling.
86+
87+
### Non-Goals
88+
89+
An explicit non-goal of _this_ proposal is to also enhance the
90+
`tpm2-tools` package by enumerating the issues found during the
91+
rewrite. That (entirely reasonable) work item should be handled
92+
somewhere else.
93+
94+
We do not (in this proposal) aim to change anything else about
measured boot attestation in Keylime. The Python event log parser is
designed to produce output compatible with that of `tpm2-tools`.
98+
99+
### Notes/Constraints/Caveats (optional)
100+
101+
The most obvious caveat to address is that this project is basically
102+
duplicating code that already exists in `tpm2-tools`, and therefore
103+
could be seen as a duplication of effort, unwise both because of the
104+
wasted labor and the implied long tail of required maintenance.
105+
106+
The developers' argument is that (a) the duplication of effort is not
that great, since the work took 2 weeks of part-time effort, not
counting organizational aspects such as writing this document, and
(b) the long tail of maintenance is not really that long, because the
specifications we are following change very slowly.
111+
112+
In any case we think that the implicit advantages of python-based
113+
development (e.g. native JSON support) and the much shorter
114+
development loop (i.e. the event log parser's developers and the end
115+
users of Keylime overlap) mitigate the disadvantage of code
116+
duplication.
117+
118+
### Risks and Mitigations
119+
120+
The largest risk is that the Python-based event log parser stops being
maintained, develops bit rot and contributes to a decline in the
quality of the Keylime code base. Mitigation lies in the decision to
continue to use JSON as the exchange format between the parser and the
Keylime measured boot policy agent, which would make a move *back* to
`tpm2-tools` possible at least in principle.
126+
127+
There are no serious security aspects regarding the event log
parser. To the extent that the measured boot event log is
self-checking (i.e. event digests can be recalculated for certain
types of events), the event log parser already performs those checks.
NB: we are not aware of the `tpm2-tools` based event log parser doing
any self-checking. And of course the Python-based event log parser is
capable of generating PCR reference values in the same way as
`tpm2-tools`.
135+
136+
Corrupt or maliciously crafted binary event logs can disrupt the
parser (insufficient self-checks have been implemented to date). To
the best of our knowledge, such logs merely cause parsing to abort
with a Python exception.
140+
141+
## Design Details
142+
143+
The parser is implemented as a single Python file, `eventlog.py`. It
includes an `EventLog` class, which is a `list` of `Event` objects.
145+
146+
An `EventLog` object can be instantiated with a binary buffer
147+
containing the binary event log, and becomes a list of the individual
148+
events in the log.
149+
150+
`Event` is actually a class hierarchy organized by the TCG list of
event types. Every `Event` has an attached list of digests, can be
natively serialized to JSON, and has a self-verification method (which
is a no-op for certain event types).
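
The shape of such a hierarchy can be sketched as follows. This is an illustration under stated assumptions, not the contents of `eventlog.py`: the class `HashedDataEvent`, the `validate` method name and the constructor arguments are hypothetical, and the `EventLog` here is built from ready-made objects rather than from a binary buffer.

```python
import hashlib
import json


class Event:
    """Base event: keeps its digests; self-verification is a no-op."""

    def __init__(self, pcr_index, event_type, digests, data):
        self.pcr_index = pcr_index
        self.event_type = event_type
        self.digests = digests  # maps algorithm name to digest bytes
        self.data = data

    def validate(self) -> bool:
        return True  # nothing to recompute for a generic event

    def toJson(self) -> dict:
        return {
            "PCRIndex": self.pcr_index,
            "EventType": self.event_type,
            "Digests": {alg: d.hex() for alg, d in self.digests.items()},
            "Event": self.data.hex(),
        }


class HashedDataEvent(Event):
    """Event whose digests are the hash of its own event data."""

    def validate(self) -> bool:
        return all(hashlib.new(alg, self.data).digest() == d
                   for alg, d in self.digests.items())


class EventLog(list):
    """A list of Event objects with whole-log operations."""

    def validate(self) -> bool:
        return all(event.validate() for event in self)


data = bytes(4)
good = HashedDataEvent(7, 0x4, {"sha256": hashlib.sha256(data).digest()}, data)
bad = HashedDataEvent(7, 0x4, {"sha256": bytes(32)}, data)
log = EventLog([good, bad])
print(log.validate())
print(json.dumps(log, default=lambda o: o.toJson(), indent=4))
```

Because `EventLog` subclasses `list`, `json.dumps` serializes the container natively and only calls the `default` hook for the individual events.
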
154+
155+
Currently only three operations are implemented on the `EventLog`
class itself: (a) serialization to JSON, (b) self-verification and
(c) generation of a list of PCR values for verification against a TPM
quote.
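
Operation (c) amounts to replaying the TPM extend operation over the digests in the log. The sketch below shows the idea; it is not the parser's own code. The function name `replay_pcrs` and the tuple-based input are hypothetical, and the sketch assumes every PCR starts out as all zeroes, which ignores cases such as a nonzero startup locality.

```python
import hashlib

EV_NO_ACTION = 0x00000003  # informational events, never extended into a PCR


def replay_pcrs(events, algorithm="sha256"):
    """Recompute PCR values from (pcr_index, event_type, digest) tuples."""
    digest_size = hashlib.new(algorithm).digest_size
    pcrs = {}
    for pcr_index, event_type, digest in events:
        if event_type == EV_NO_ACTION:
            continue
        # Assumption: every PCR starts out as all zeroes.
        current = pcrs.get(pcr_index, bytes(digest_size))
        # TPM extend: new value = H(old value || event digest).
        pcrs[pcr_index] = hashlib.new(algorithm, current + digest).digest()
    return pcrs


events = [
    (0, 0x00000008, hashlib.sha256(b"first").digest()),
    (0, EV_NO_ACTION, bytes(32)),
    (0, 0x00000004, hashlib.sha256(bytes(4)).digest()),
]
for index, value in replay_pcrs(events).items():
    print(index, value.hex())
```

The resulting values can then be compared with the PCR values reported in a TPM quote.
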
158+
159+
Typical use case for the EventLog class:
160+
161+
```
import argparse
import json
import eventlog

# Assumption: the log path arrives as a --file option.
parser = argparse.ArgumentParser()
parser.add_argument("--file")
args = parser.parse_args()
assert args.file, "file argument is required"

with open(args.file, "rb") as fp:
    buffer = fp.read()
evlog = eventlog.EventLog(buffer, len(buffer))
print(json.dumps(evlog, default=lambda o: o.toJson(), indent=4))
```
170+
171+
The current implementation is somewhat lacking in bounds checks on the
input buffer (overruns would only happen if the binary event log is
corrupted or crafted maliciously). Exception handling is also somewhat
lacking.
174+
175+
176+
### Test Plan
177+
178+
The event log parser has its own test suite, based on 10 different
event logs collected from real machines. For each log, the parser's
output is compared against that of `tpm2_eventlog` from `tpm2-tools`
version 5.5.
181+
182+
Attached is example output from the CI system:
183+
184+
```
+------------------------------+-------+-------+-------+----
| log file                     | #Evts | #Fail | Pct.  | msg.
+------------------------------+-------+-------+-------+----
|bootlog-5.0.0-rhel-20210423T13|     34|      0|  0.00%|
|thyme-eventlog.bin            |     39|      0|  0.00%|
|p511.bin                      |     58|      0|  0.00%|
|swtpm_measurements.bin        |    104|      0|  0.00%|
|puiterwijk.bin                |    101|      0|  0.00%|
|201123-eventlog.bin           |     60|      0|  0.00%|
|ideapad1.bin                  |     39|      0|  0.00%|
|inspur.bin                    |    101|      0|  0.00%|
|rk049-s26.bin                 |     47|      0|  0.00%|
|intel_svr.bin                 |    119|      0|  0.00%|
|css-flex14vm4-bootlog.bin     |    106|      1|  0.94%|
+------------------------------+-------+-------+-------+
| Totals:                      |    808|      1|  0.12%|
+------------------------------+-------+-------+-------+
```
203+
204+
* TODO: integration testing
* TODO: antagonistic testing (broken event logs)
* TODO: testing of PCR generation
* TODO: self-check testing
* TODO: coverage testing
209+
210+
### Upgrade / Downgrade Strategy
211+
212+
TODO
213+
214+
### Dependency requirements
215+
216+
We are aware of no major dependencies beyond Python itself. There is
an optional dependency on `libefivar`, which would be used by means of
a `ctypes.CDLL` call in Python.
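
One way to keep such a dependency optional is to probe for the shared library at run time and fall back when it is missing. The following is a minimal sketch of that pattern, not the parser's implementation; the helper name `load_optional_library` is hypothetical, and no `libefivar` functions are called here.

```python
import ctypes
import ctypes.util


def load_optional_library(name: str):
    """Return a ctypes.CDLL handle for the library, or None if unavailable."""
    path = ctypes.util.find_library(name)
    if path is None:
        return None
    try:
        return ctypes.CDLL(path)
    except OSError:
        return None


efivar = load_optional_library("efivar")
if efivar is None:
    print("libefivar not found; device paths stay as raw hex")
else:
    print("libefivar loaded; device paths can be pretty-printed")
```

With this approach the parser still works on systems without `libefivar`, at the cost of less readable device path entries.
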
219+
220+
Of course, for Keylime itself to use this code, it would have to be
221+
packaged first.
222+
223+
## Drawbacks
224+
225+
"Why should this enhancement _not_ be implemented?" -- the most
226+
serious argument against this code is the duplication of effort
227+
(discussed in detail earlier).
228+
229+
## Unresolved problems at the time of writing this proposal
230+
231+
* TRANSFER: transferring the current Python-native event log parser into the
  Keylime project. The current implementation
  (https://github.com/galmasi/python3-uefi-eventlog) has its own CI
  tooling based on GitHub Actions, which includes pylint, Python type
  checking and extensive functional testing against `tpm2_eventlog` by
  comparing JSON outputs for a collection of 10 different event logs.
237+
238+
* TESTING: The developer[s] would absolutely welcome contributions of
  other binary event logs for even more extensive functional
  testing, including maliciously crafted binary event logs designed to
  trip up the parser.
242+
243+
* PACKAGING: There is no packaging code implemented in the project at
  this point. The developer[s] would welcome help with both the PyPI
  packaging code and Red Hat/Canonical packages.
246+
247+
* PEER REVIEW: The code was written mostly by a single person. A peer
248+
review of the code would be welcome.
249+
250+
* DOCUMENTATION: A lot of the event log parser is based on TCG
251+
documentation. Better documentation (attributing parts of the code to
252+
specific TCG and EFI documentation, by chapter and verse) would be
253+
welcome.
254+
255+
* INTEGRATION: a PR in Keylime to replace the `tpm2-tools` call with
  the native call. Goal 1 would be a prerequisite, since the
  implication would be that the Python event log parser is a package
  dependency of Keylime.
259+
