Friday, March 29, 2019

0CTF/TCTF 2019 Quals - zer0lfsr

Description

Please enjoy the classical lfsr.

zer0lfsr.tar.gz


Attachment content

chall.py
keystream

chall.py script

from secret import init1,init2,init3,FLAG
import hashlib
assert(FLAG=="flag{"+hashlib.sha256(init1+init2+init3).hexdigest()+"}")

class lfsr():
    def __init__(self, init, mask, length):
        self.init = init
        self.mask = mask
        self.lengthmask = 2**(length+1)-1

    def next(self):
        nextdata = (self.init << 1) & self.lengthmask 
        i = self.init & self.mask & self.lengthmask 
        output = 0
        while i != 0:
            output ^= (i & 1)
            i = i >> 1
        nextdata ^= output
        self.init = nextdata
        return output

def combine(x1,x2,x3):
    return (x1*x2)^(x2*x3)^(x1*x3)

if __name__=="__main__":
    l1 = lfsr(int.from_bytes(init1,"big"),0b100000000000000000000000010000000000000000000000,48)
    l2 = lfsr(int.from_bytes(init2,"big"),0b100000000000000000000000000000000010000000000000,48)
    l3 = lfsr(int.from_bytes(init3,"big"),0b100000100000000000000000000000000000000000000000,48)

    with open("keystream","wb") as f:
        for i in range(8192):
            b = 0
            for j in range(8):
                b = (b<<1)+combine(l1.next(),l2.next(),l3.next())
            f.write(chr(b).encode())

Write-up

As Wikipedia states a linear-feedback shift register (LFSR) is a shift register whose input bit is a linear function of its previous state. I had never heard of LFSR before (or maybe I had and forgot about it) so I looked at this script for several hours.

At some point I started to print the b variable written to file and it was never greater than 255. So, since the range goes from 0 to 8191, why does the keystream file have more than 8192 bytes?

$ cat keystream | wc -c
11990

I found out that the encode() sometimes returned 1 and sometimes 2 bytes for each character. So the first thing was to create a decoded-keystream of 8192 bytes.

I think there are better methods, but this was the one that I came up with:

barr = b''
with open('keystream','rb') as f:
    data = f.read()
    i = 0
    while i < len(data):
        if data[i] == 194 or data[i] == 195:
            barr += bytes([ord(bytes([data[i], data[i+1]]).decode())])
            i += 1
        else:
            barr += bytes([ord(bytes([data[i]]).decode())])
        i += 1
with open('decoded-keystream', 'wb') as f:
    f.write(barr)
$ cat decoded-keystream | wc -c
8192

Ok. As I did not know nothing about LFSR, I studied how the b was generated:

b = (b<<1)+combine(l1.next(),l2.next(),l3.next())

The operation b<<1 multiplies b by 2. If b = 16, b<<1 = 32. Sometimes an odd number like 33 appears and I found out this was because of the combine method. b<<1 can only generate even numbers, so when b was odd in some iteration I knew combine returned 1. So, I understood I could retrieve every output of combine with this algorithm:

b = 193

iteration 8 -> 193 = 96 * 2 + 1
iteration 7 -> 96  = 48 * 2 + 0
iteration 6 -> 48  = 24 * 2 + 0
iteration 5 -> 24  = 12 * 2 + 0
iteration 4 -> 12  = 6  * 2 + 0
iteration 3 -> 6   = 3  * 2 + 0
iteration 2 -> 3   = 1  * 2 + 1
iteration 1 -> 1   = 0  * 2 + 1

If the b = 193, I know the combine method generated 1, 1, 0, 0, 0, 0, 0 and 1 as outputs. So, I came up with a method to generate all the outputs for me, for each iteration:

with open("decoded-keystream", "rb") as f:
    data = f.read()
    for b in data:
        arr = []
        tmp = b
        for i in range(8):
            arr += [tmp]
            tmp //= 2
        tmp = 0
        for l in arr[::-1]:
            outputs += [l - (tmp * 2)]
            tmp = l
>>> print(len(outputs))
65536

Good signal: I have 65536 outputs for the 8192 * 8 interactions.

To make sure this was right, I created a method to generate a proof and compare it to the original keystream:

with open('proof', 'wb') as f:
    o = 0
    for i in range(8192):
        b = 0
        for j in range(8):
            b = (b<<1) + outputs[o]
            o += 1
        f.write(chr(b).encode())
$ sha256sum keystream
44b4e996f4790b2940c596b006e3f0916ce589b960bb1a7c600831d42004d12b  keystream
$ sha256sum proof
44b4e996f4790b2940c596b006e3f0916ce589b960bb1a7c600831d42004d12b  proof

Same result, so my output is reliable!

Then I got stuck. As I knew nothing about LFSR, I can’t understand how to find the original seeds.

I asked our crypto specialist for help.

Marzano knew that with Z3 it was possible to reverse the keystream into the seeds. As the LFSR length was 48 bits, he created this script:

from z3 import *

class lfsr():
    def __init__(self, init, mask, length):
        self.init = init
        self.mask = mask
        self.lengthmask = 2**(length+1)-1
        self.length = length

    def next(self):
        nextdata = (self.init << 1) & self.lengthmask
        i = self.init & self.mask & self.lengthmask

        output = 0
        for j in range(self.length):
            output ^= (i & 1)
            i = i >> 1

        nextdata ^= output
        self.init = nextdata
        return output

def combine(x1,x2,x3):
    return (x1*x2)^(x2*x3)^(x1*x3)

init1 = BitVec('init1', 48)
init2 = BitVec('init2', 48)
init3 = BitVec('init3', 48)

l1 = lfsr(init1, 0b100000000000000000000000010000000000000000000000, 48)
l2 = lfsr(init2, 0b100000000000000000000000000000000010000000000000, 48)
l3 = lfsr(init3, 0b100000100000000000000000000000000000000000000000, 48)

s = Solver()
with open('decoded-keystream', 'rb') as f:
    keystream = f.read()

outputs = []
for b in keystream:
    arr = []
    tmp = b
    for i in range(8):
        arr += [tmp]
        tmp //= 2

    tmp = 0
    for l in arr[::-1]:
        outputs += [l - (tmp * 2)]
        tmp = l

for i in range(len(outputs)):
    s.add(outputs[i] == combine(l1.next(), l2.next(), l3.next()))
    print(i)

s.check()
print(s.model())

We tested the script with the first bits of output and everything worked. However, when we ran the script with full length, the script crashed with no memory left.

I thought for some time and I came up with the solution: let’s try to run a script with only x bits, then run again with a y greater than x and see if the result is the same.

The line

for i in range(len(outputs)):

was changed to:

for i in range(bits):
bits = 100
[init3 = 319098525903, init2 = 159339250822750, init1 = 165948368634684]

bits = 200
[init3 = 191532558614761, init2 = 181037482648735, init1 = 70989122156399]

bits = 300
[init3 = 191532558614761, init2 = 181037482648735, init1 = 70989122156399]

200 bits and 300 bits returned the same result, so it seemed like 200 bits was enough.

The flag was asserted like this:

assert(FLAG=="flag{"+hashlib.sha256(init1+init2+init3).hexdigest()+"}")

In the original script, the seeds were converted from bytes to int:

int.from_bytes(init1,"big")

Based on this I assumed that init1, init2 and init3 were in byte array format. This was my final code to generate the flag:

from Crypto.Util.number import long_to_bytes
from hashlib import sha256

flag = 'flag{' + sha256(long_to_bytes(70989122156399) + long_to_bytes(181037482648735) + long_to_bytes(191532558614761)).hexdigest() + '}'

print(flag)

Flag

flag{b527e2621131134ec22250cfbca75e8c9f5ae4f40370871fd55910927f66a1b4}

Cryptography , Capture the Flag , Writeup