Commit 7536493

state presented at PHDays VI, 17 May 2016, with slides
1 parent 2024c1d commit 7536493

126 files changed, +19030 -763 lines changed

README.md

+25 -37
@@ -2,43 +2,18 @@

 john-devkit is an advanced code generator for [John the Ripper](https://github.com/magnumripper/JohnTheRipper). It aims to separate crypto primitives (sha-512, md5, crypt, pbkdf2 and so on), optimizations (interleave, loop unrolling, round reversing, bitslice and so on) and output for different computing devices (cpu, sse, gpu, fpga).

-john-devkit uses its own domain specific language (dsl) on top of Python to describe crypto algorithms in an abstract way. Also there is a kind of instruction language for intermediate representation (referred to as "bytecode"). So there are two levels of code generation: dsl -> bytecode -> platform's language (for instance C for cpu).
+john-devkit uses its own domain specific language (dsl) on top of Python to describe crypto algorithms in an abstract way. Also there is a kind of instruction language for intermediate representation (referred to as "bytecode"). So there are two levels of code generation: dsl -> intermediate representation -> platform's language (for instance C for cpu).

-"bytecode" is the wrong word for the intermediate language/representation and will be fixed soon.
+"bytecode" is the wrong word for the intermediate language/representation and will be fixed in code someday.

 ## Current State

 The current implementation is a Proof of Concept.

-There is no documentation for dsl and bytecode. It is not obvious how many times everything will be changed drastically.
+There is no documentation for dsl and intermediate representation. It is not obvious how many times everything will be changed drastically.

 There is a draft of a hash parsing library by Alexander Cherepanov. It is not finished (and works by pure luck). It is included in john-devkit but it will be removed when the library becomes a persistent part of John the Ripper.

-7 "raw" formats were implemented: raw-sha256, raw-sha224, raw-sha512, raw-sha384, raw-md4, raw-md5, raw-sha1. The SHA-2 family got a speedup (~12% for SHA-256, ~5% for SHA-512). md5, md4 and sha1 got a noticeable slowdown (it needs investigation). (Such a speedup for raw-sha256 may be caused by changes in the test vector. Accurate benchmarks are needed.)
-
-raw-sha256 lost cisco hashes (base64 encoded) so benchmarks differ from John the Ripper not only due to optimizations.
-
-The current main priority is to try more complex formats like sha256crypt.
-
-john-devkit has now:
-* formats: raw-sha256, raw-sha224, raw-sha512, raw-sha384, raw-md4, raw-md5, raw-sha1
-* optimizations: vectorization (sse only), early reject, interleave, loop unrolling, additional batching in crypt_all() method
-
-broken:
-* reverse (there is only initial implementation, it has problems with vectorization)
-* bitslice (there is no support in template file; no long vectors, only regular ints for vectors)
-* output to standalone program (it will not be fixed soon)
-
-Nearest plans:
-* output to avx and avx2 for better benchmarking
-* fix bitslice and play with it
-* investigate md5/md4 slowdown (on sse)
-* fix reverse
-* sha2crypt scheme
-* output formats for john on gpu
-* incorporate on-gpu mask-mode
-* try kernel splitting
-
 ## Usage

 To run john-devkit, you need
@@ -54,13 +29,21 @@ Then the following will work:

 After the first run (for each format), you need to cd into JohnTheRipper/src/, rerun the ./configure script (otherwise you'll get "Unknown ciphertext format name requested") and rerun the command.

+Calling `format_abstract_all.py` can give various formats. It is tricky though because it does not work with `run_format_raw.sh` and may require changes to enable or disable certain formats.
+
 ## Files layout

 `algo_*.py` are abstract definitions of crypto primitives written in DSL.

 `bytecode_main.py` is the main library to transform intermediate representation.

-`format_john_*.py` are config files to output certain formats for john.
+`bytecode_*.py` are other files with filters for intermediate representation.
+
+`format_abstract_all.py` is a big mess with code to generate and test ~100 formats for john. TODO: split the file.
+
+`format_john_*.py` are other config files to output certain formats for john.
+
+`format_*.py` are config files to produce other output files (1 hash algo per file usually).

 `lang_main.py` and `lang_spec.py` contain support code for DSL.

@@ -74,18 +57,19 @@ After the first run (for each format), you need to cd into JohnTheRipper/src/ ,

 `run_format_raw.sh` is a script to call `format_john_*.py` quickly for "raw" formats, call disassembler and count instructions/size of crypt_all() method.

-`slides_2015-05-26_phdays_v.src.org` is a textual source of slides for PHDays V. See below.
+`slides_*.src.org` are textual source files of slides for talks. See below.

 `t_raw.c` is a C template for "raw" formats. It contains padding and byteswap for endianness fix. It is derived from code of John the Ripper. See below about its license.

-`util_ui.py` is a miscellaneous library.
+`t_*` are other templates.

-## Slides for PHDays V
+`util_*.py` are miscellaneous library files.
+
+## Slides for PHDays V, 2015

 Benchmarks demonstrated at PHDays V are quite inaccurate: raw-sha256 and raw-sha224 are more likely to be only 12% over John the Ripper's speed.

-Source file of slides for talk about john-devkit at PHDays V is in
-`slides_2015-05-26_phdays_v.src.org`
+Source file of slides for talk about john-devkit at PHDays V is in `slides_2015-05-26_phdays_v.src.org`.

 The slides contain a code example from Keccak (from [JohnTheRipper/src/KeccakF-1600-unrolling.macros](https://github.com/magnumripper/JohnTheRipper/blob/bleeding-jumbo/src/KeccakF-1600-unrolling.macros)) and a code example from NetBSD's libcrypt (from [here](https://github.com/rumpkernel/netbsd-userspace-src/blob/3280867f12bbd346f39d5a4efb41fcf9b087bf33/lib/libcrypt/hmac_sha1.c)). Everything else is under the following license:

@@ -97,9 +81,15 @@ Code examples are between `>>>>` and `<<<<` . `#` in text is quoted with `\`: so

 TODO: a link to .pdf file, a script to compile slides.

+## Slides for PHDays VI, 2016
+
+Source file of slides for talk about john-devkit at PHDays VI is in `slides_2016-05-17_phdays_vi.src.org`. The code example is in `simple_example.py`.
+
+TODO: a link to .pdf file.
+
 ## Usage without John the Ripper

-The idea to write a hash algo in Python and get optimized C code is very attractive. But there are limitations with john-devkit: john-devkit is not suitable for regular applications. It is a really bad idea to use john-devkit in most cases.
+The idea to write a hash algo in Python and get optimized C code is very attractive. But there are limitations with john-devkit: john-devkit is not suitable for regular applications. It is a really bad idea to use john-devkit in most cases. Please use standard/good libraries for hashing; for instance Argon2 (see [Password Hashing Competitions](https://password-hashing.net/)), [yescrypt](http://www.openwall.com/yescrypt/) or [phpass for php](http://www.openwall.com/phpass/).

 It is possible to separate john-devkit and John the Ripper (use custom C template, fix output in `output_c.py` for some instructions that depend on `pseudo_intrinsics.h` and/or `johnswap.h` or pull the headers into your application if the licenses permit).

@@ -109,8 +99,6 @@ Also it should be easier to implement and optimize 1 hash algo manually than usi

 There is a totally "no-no" thing for regular applications: john-devkit does not care about security, there are no defensive tricks, for instance john-devkit does not prevent information disclosure through timing, so produced code is weak against a range of attacks.

-Please use standard/good libraries for hashing. For instance phpass for php.
-
 ## License

 Currently, files generated by john-devkit are subject to the original license of John the Ripper. See below.
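
A note on the two levels of code generation that the README describes (dsl -> intermediate representation -> platform's language): the idea can be illustrated with a toy sketch in plain Python. Everything below is an assumption made for illustration; `Sym`, `to_c` and the tuple layout of the instructions are hypothetical names and are not john-devkit's API or its actual intermediate representation.

```python
# Toy illustration only: none of these names come from john-devkit.


class Sym:
    """A symbolic value: operators record instructions instead of computing."""

    counter = 0
    code = []  # the "intermediate representation": a flat list of tuples

    def __init__(self, name=None):
        if name is None:
            Sym.counter += 1
            name = "t%d" % Sym.counter
        self.name = name

    def _emit(self, op, other):
        # Record (operation, destination, left operand, right operand).
        result = Sym()
        Sym.code.append((op, result.name, self.name, other.name))
        return result

    def __xor__(self, other):
        return self._emit("xor", other)

    def __and__(self, other):
        return self._emit("and", other)


def to_c(code):
    """Second level: turn each recorded instruction into one C statement."""
    symbols = {"xor": "^", "and": "&"}
    return ["uint32_t %s = %s %s %s;" % (dst, a, symbols[op], b)
            for op, dst, a, b in code]


# First level: running ordinary Python code builds the instruction list.
x, y, z = Sym("x"), Sym("y"), Sym("z")
result = (x & y) ^ z

c_lines = to_c(Sym.code)
print("\n".join(c_lines))
```

The point of the split is that optimizations can be written as filters over the instruction list, independent of both the algorithm description and the output language.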

algo_aes_decrypt.py

+28
@@ -0,0 +1,28 @@
+# AES decryption, abstract
+
+# Copyright © 2016 Aleksey Cherepanov <[email protected]>
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted.
+
+Var.setup('be', 4)
+
+msg = input_salt()
+key = input_key()
+
+assume_length(msg, 16, 16)
+assume_length(key, 16, 16)
+
+include('aes')
+
+r = AES(key).decrypt(msg)
+
+# We've got 16 ints of 1 byte each. Let's pack them.
+
+# print_many(*r)
+
+o = []
+for i in range(0, 16, 4):
+    o.append((r[i + 0] << 24) ^ (r[i + 1] << 16) ^ (r[i + 2] << 8) ^ r[i + 3])
+
+s = bytes_join_nums(*o)
+output_bytes(s)

algo_aes_encrypt.py

+28
@@ -0,0 +1,28 @@
+# AES encryption, abstract
+
+# Copyright © 2016 Aleksey Cherepanov <[email protected]>
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted.
+
+Var.setup('be', 4)
+
+msg = input_salt()
+key = input_key()
+
+assume_length(msg, 16, 16)
+assume_length(key, 16, 16)
+
+include('aes')
+
+r = AES(key).encrypt(msg)
+
+# We've got 16 ints of 1 byte each. Let's pack them.
+
+# print_many(*r)
+
+o = []
+for i in range(0, 16, 4):
+    o.append((r[i + 0] << 24) ^ (r[i + 1] << 16) ^ (r[i + 2] << 8) ^ r[i + 3])
+
+s = bytes_join_nums(*o)
+output_bytes(s)
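
Both AES listings end by packing 16 one-byte values into four 32-bit words, most significant byte first. The same arithmetic can be checked outside the DSL; the sketch below is plain Python working on ordinary integers, and `pack_be32` is a name made up for this note rather than a john-devkit function.

```python
import struct


def pack_be32(byte_values):
    """Pack byte values (0..255) into big-endian 32-bit words."""
    if len(byte_values) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    words = []
    for i in range(0, len(byte_values), 4):
        b0, b1, b2, b3 = byte_values[i:i + 4]
        # Same expression as in the listings: the first byte lands in the top bits.
        words.append((b0 << 24) ^ (b1 << 16) ^ (b2 << 8) ^ b3)
    return words


block = list(range(16))  # stand-in for the 16 output bytes of one AES block
words = pack_be32(block)

# Cross-check against the standard library's big-endian unpacking.
assert words == list(struct.unpack(">4I", bytes(block)))

print(" ".join("%08x" % w for w in words))
# prints: 00010203 04050607 08090a0b 0c0d0e0f
```

Since the shifted bytes never overlap, `^` and `|` give the same result here.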

algo_crypt_sha2.py

+210
@@ -0,0 +1,210 @@
+# crypt with sha2 scheme
+# %% draft implementation
+# %% sha512 only
+
+# Copyright © 2016 Aleksey Cherepanov <[email protected]>
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted.
+
+# http://www.akkadia.org/drepper/SHA-crypt.txt
+
+# Var.setup('be', args['size'])
+
+sha = load_hfun('sha512')
+
+# Algo
+
+key = input_key()
+salt = input_salt()
+rounds = input_rounds()
+
+key_length = bytes_length(key)
+salt_length = bytes_length(salt)
+# key_bit_length = get_bit_length(key)
+
+ks = bytes_concat(key, salt)
+ksk = bytes_concat(ks, key)
+b = invoke_hfun(sha, ksk)
+
+a = sha_init()
+ksa = bytes_concat(ks, b)
+
+# 11
+# maximal password length is 125
+# %% it would be good to use a different type here
+c = new_var()
+c // key_length
+bb = new_bytes()
+bytes_assign(bb, ksa)
+cycle_while_begin('step11')
+cycle_while('step11', c > 0)
+
+if_condition('c1', c & 1)
+bytes_assign(bb, bytes_concat(bb, b))
+if_else('c1')
+bytes_assign(bb, bytes_concat(bb, key))
+if_end('c1')
+
+c // (c >> 1)
+cycle_end('step11')
+
+# 12
+a = invoke_hfun(sha, bb)
+
+# %% not inclusive
+kkk = new_bytes()
+unused = cycle_range('step14', 0, key_length - 1, 1)
+bytes_assign(kkk, bytes_concat(kkk, key))
+cycle_end('step14')
+dp = invoke_hfun(sha, kkk)
+
+
+# %%% I stopped here
+
+p = fill_string(key_length, dp_string, dp_string_length)
+p_length = key_length
+
+# 18
+ds = sha_init()
+# for i in range(16 + a[0]):
+#     sha_update(ds, salt)
+# %% big endian?
+# %% not inclusive
+c = cycle_range('setup_s', 0, new_const(16) + ((a[0] & (0xff << 56)) >> 56) - 1, 1)
+sha_update(ds, salt, salt_length)
+cycle_end('setup_s')
+ds = sha_final(ds)
+
+ds_string = digest_to_string(ds)
+ds_string_length = 8 * 8
+
+s = fill_string(salt_length, ds_string, ds_string_length)
+s_length = salt_length
+
+# 21
+
+# # pseudo code
+# for r in range(rounds - 1):
+#     c = sha_init()
+#     sha_update(c, p if r & 1 else ac)
+#     if r % 3 != 0: sha_update(c, s)
+#     if r % 7 != 0: sha_update(c, p)
+#     sha_update(c, ac if r & 1 else p)
+#     c = sha_final(c)
+#     ac = c
+
+# Notice that computation of some blocks may be lifted from the loop
+
+# Statistics of sequences for first 5k rounds:
+# perl -e 'for (0 .. 5000) { if ($_ & 1) { print " ac," } else { print " p," } print " s," if $_ % 3; print " p," if $_ % 7; if ($_ & 1) { print " p" } else { print " ac" } print "\n" }' | sort | uniq -c
+#  119 ac, p
+#  714 ac, p, p
+#  238 ac, s, p
+# 1429 ac, s, p, p
+#  120 p, ac
+#  714 p, p, ac
+#  238 p, s, ac
+# 1429 p, s, p, ac
+# =5001, 1667 ps and psp may be brought upper
+
+# The full cycle is 2 * 3 * 7 == 42 rounds, having such unroll it is
+# possible to avoid most "if"s in the loop. There are 21 variants
+# without ac/p.
+# %% Right?
+
+# %% It'd be nice to do such optimization automatically...
+
+# # %% add 0x80 and compute lengths
+# # Only ac is variable. Other parts may be extracted:
+# pp = string_concat(p, p)
+# sp = string_concat(s, p)
+# spp = string_concat(s, p, p)
+# ps = string_concat(p, s)
+# psp = string_concat(p, s, p)
+# pp = string_to_ints(pp)
+# sp = string_to_ints(sp)
+# spp = string_to_ints(spp)
+# ps = string_to_ints(ps)
+# psp = string_to_ints(psp)
+
+# %% It is possible to reduce memory footprint storing pp in spp, and
+# sp in psp; also it is possible to store ps inside psp, but it may
+# be needed to pass lengths then
+
+# for r in range(rounds - 1):
+#     c = sha_init()
+#     if r & 1:
+#         if r % 3 and r % 7:
+#             sha_update(c, psp)
+#         elif r % 7:
+#             sha_update(c, pp)
+#         elif r % 3:
+#             sha_update(c, ps)
+#         else:
+#             sha_update(c, p)
+#         sha_update(c, ac)
+#     else:
+#         sha_update(c, ac)
+#         if r % 3 and r % 7:
+#             sha_update(c, spp)
+#         elif r % 7:
+#             sha_update(c, pp)
+#         elif r % 3:
+#             sha_update(c, sp)
+#         else:
+#             sha_update(c, p)
+#     c = sha_final(c)
+#     ac = c
+
+# r = cycle_range('main', 0, rounds, 1)
+# # %% Compute 8?
+# c_array = [Var() for i in range(8)]
+# # ints_concat is "macro", free
+# f = lambda x, y: sha_ints(ints_concat(x, y))
+# c = iff(r & 1,
+#         # %% lift computation of some blocks from the cycle
+#         iff(r % 3 and r % 7, f(psp, c),
+#             iff(r % 7, f(pp, c),
+#                 iff(r % 3, f(ps, c),
+#                     f(p, c)))),
+#         iff(r % 3 and r % 7, f(c, spp),
+#             iff(r % 7, f(c, pp),
+#                 iff(r % 3, f(c, sp),
+#                     f(c, p)))))
+# for i, v in enumerate(c):
+#     c_array[i] // c[i]
+# cycle_end('main')
+
+ac = a
+
+# %% not inclusive
+r = cycle_range('main', 0, rounds - 1, 1)
+c = sha_init()
+# print_var(r)
+# print_digest(ac)
+ac_string = digest_to_string(ac)
+ac_string_length = 8 * 8
+if_condition('r1', r & 1)
+sha_update(c, p, p_length)
+if_else('r1')
+sha_update(c, ac_string, ac_string_length)
+if_end('r1')
+if_condition('r3', r % 3)
+sha_update(c, s, s_length)
+if_end('r3')
+if_condition('r7', r % 7)
+sha_update(c, p, p_length)
+if_end('r7')
+if_condition('r2', r & 1)
+sha_update(c, ac_string, ac_string_length)
+if_else('r2')
+sha_update(c, p, p_length)
+if_end('r2')
+# we overwrite 'a'
+# %% memory leak?
+ac // sha_final(c)
+cycle_end('main')
+
+# %% avoid hardcoded value, how?
+for i in range(8):
+    output(ac[i])
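
The comments in this file observe that the round schedule repeats every 2 * 3 * 7 == 42 rounds and that only `ac` changes between rounds. A rough way to check that observation outside the DSL is the plain-Python sketch below, which uses `hashlib` instead of john-devkit's primitives. The helper names (`round_parts`, `rounds_naive`, `rounds_table`) and the sample inputs are invented for this note, the loop simply runs the number of rounds it is given, and the earlier steps that derive `p`, `s` and the initial `ac` are not reproduced.

```python
import hashlib


def round_parts(r):
    """Names of the pieces hashed in round r, as in the pseudo code."""
    parts = ["p" if r & 1 else "ac"]
    if r % 3:
        parts.append("s")
    if r % 7:
        parts.append("p")
    parts.append("ac" if r & 1 else "p")
    return tuple(parts)


def rounds_naive(ac, p, s, rounds):
    """Straightforward loop with the three conditions evaluated every round."""
    for r in range(rounds):
        c = hashlib.sha512()
        c.update(p if r & 1 else ac)
        if r % 3:
            c.update(s)
        if r % 7:
            c.update(p)
        c.update(ac if r & 1 else p)
        ac = c.digest()
    return ac


def rounds_table(ac, p, s, rounds):
    """Same loop, with the constant pieces precomputed for one 42-round cycle."""
    table = []
    for r in range(42):
        middle = (s if r % 3 else b"") + (p if r % 7 else b"")
        if r & 1:
            table.append((p + middle, b""))  # constant bytes go before ac
        else:
            table.append((b"", middle + p))  # constant bytes go after ac
    for r in range(rounds):
        prefix, suffix = table[r % 42]
        ac = hashlib.sha512(prefix + ac + suffix).digest()
    return ac


# Arbitrary sample inputs; in the real scheme they come from earlier steps.
p = b"example-p-bytes"
s = b"example-salt"
ac0 = hashlib.sha512(b"initial").digest()

shapes = {round_parts(r) for r in range(42)}
print(len(shapes), "distinct round shapes in one cycle")
assert rounds_naive(ac0, p, s, 100) == rounds_table(ac0, p, s, 100)
```

The table version trades 42 small precomputed byte strings for the removal of all per-round branching on `r % 3` and `r % 7`.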
