Code Background

Periodically, the mind of an engineer will be prone to wandering and thankfully, here at DomainTools, we have a process that corrals these thoughts into productivity: writing blog posts. While corporal punishment may seem the most effective remedy, it’s actually been a rather enlightening adventure to try to turn daydreams into codified experiments worth writing about. We leave the assessment of that latter part to the reader at the end of this blog…

The particular daydream in question arose from a post on HackerNews linking to a 2001 paper from Albert-László Barabási, Vincent W. Freeh, Hawoong Jeong & Jay B. Brockman Parasitic Computing (henceforth BFJB), in which TCP checksums are exploited to solve 2-satisfiability problems. Because 2-satisfiability is a generalized problem format into which more specific tasks can be mapped, this opens up the possibility of using the communication protocols provided by Internet hosts as a massive distributed computer. What’s more, the computers that participate in the endeavor are unwitting participants—from their perspective, they are merely responding (or not) to TCP traffic. Parasitic computing is especially interesting because the exploit doesn’t compromise the security of the affected computers, it piggybacks a math problem onto the TCP checksum work which TCP enabled hosts carry out under routine operating conditions.

Parasitic Computing Described

TCP checksums are normally used to ensure data corruption hasn’t occurred somewhere along a packet’s journey from one computer to another along what is usually a multi-hop route across the Internet. The transmitting computer adds a two-byte checksum field to the TCP header which is a function of the routing information and the data payload of the packet. The idea is that if corruption occurs in the transport or physical layers, the receiving computer will detect this because the presented checksum no longer corresponds with the data received.

Parasitic computing on TCP checksums maps a new problem set onto the TCP checksum function. In the particular instance discussed in BFJB, the technique is to compute a checksum which corresponds to an answer set for a particular boolean satisfiability problem, and then to send out packets with data payloads which are potential solutions to that problem. Receiving computers will attempt to validate the checksum against the data payload, effectively checking the solution they were offered against the problem under consideration. If the checksum validates, properly configured hosts will reply, and the parasitic computer knows that a solution to the problem has been found. The value of this model is that problems for which there is no known efficient algorithm can be solved by brute-force through parallelization to many hosts.

 

Image courtesy of: https://www.nature.com/articles/35091039

 

Schematic illustration of Parasitic Computing from BFJB. Our experiment implemented a modified version of the Parasitic Computing idea: we didn’t rely on the HTTP layer as shown in the drawing. We modified the SYN request packet and listened for a SYN-ACK response.  Using the handshake step avoids the overhead of establishing the connection beforehand, but may have introduced the false positives we discuss below.

The TCP checksum function breaks the information it checks into 16 bit words’ and then sums them. Whenever the sum gets larger than 0xffff, the additional column is carried back around (so 0xffff + 0x0001 = 0x0001). The 1’s complement of this sum is written down as the checksum. The trick presented in BFJB is to take advantage of the correlation between a numeric sum and a boolean operation: if a and b are bits, and summing results in 2, then upon treatment as booleans ab is TRUE; similarly, if summing a + b and results in a 1, then ab is TRUE. BFJB provides the truth table showing the relationship between these two boolean operators and the mathematical sum. This is the basis of mapping boolean satisfiability into the TCP checksum space.

In more detail, a boolean satisfiability problem asks whether an assignment of truth-values exists which will allow a given formula to evaluate to TRUE. Typically, these formulae are presented in conjunctive normal form (an AND of ORs). However, the problem exemplified in BFJB allows either ⊕ or ∧ to appear in a clause (aside: it’s noteworthy that ∨ is derivable from ⊕ and ∧: [ab] = [ab] ⊕ [ab], but more on this below). The example in BFJB is built from the specific case where there are 16 unique variables and 8 clauses.

The first step, then, in finding a solution to this problem parasitically is to calculate a checksum which is in correspondence with an assignment which solves the problem. To do this, BFJB take each of the 8 clauses in some order and generate a checksum based on the operator involved: for every ∧, write down 10, and for every ⊕, we write down 01. Then, a checksum is created by taking the 1’s complement. This value is the “magic checksum” which corresponds to a solution to our problem.

 

 

After that packets with variable assignments as data payloads can be generated and sent to TCP enabled hosts on the Internet. Each variable is padded with a 0 and then column-aligned with its clause operator according to the ordering used for the checksum creation. The padding is crucial because it is what allows a pair of variables with the ∧ operator to sum to 10, without overflowing into the column to the left.

For example, under the assignment where x1 = 1 and x2 = 1 and x3 = 1 and x4 = 0 and x15 = 0 and x16 = 1 we generate two 16 bit words with padding and the operands lined up in columns.

0101...00
0100...01

This packet is then sent to a network host which will evaluate the checksum against the data. Because TCP_CHECKSUM(data) = “magic checksum” only when the data solves our problem, only those hosts which received a good solution will reply. Each logically possible assignment is generated as a packet payload and sent to a TCP enabled host for evaluation and in this way we have the Internet as a parallel distributed computer for finding solutions to our boolean formulae!

Unfortunately, this relatively high-level description leaves a fair bit of implementation detail to the reader and so we set out to fill in the gaps in order to reproduce the reported results.

The Missing Bits

BFJB provided the conceptual skeleton, but we wanted to see if we could build an actual parasitic node as described. We downloaded scapy, a handy dandy-library and Domain-Specific Language for network packet manipulation. After that, the first step for us to take was to take a closer look at the TCP checksum calculation. As we had suspected when reading BFJB, the TCP checksum sums not only the data payload, but also a pseudo-header. “Pseudo”, because this header isn’t actually sent with the packet, its purpose is to include some basic metadata inside the sum. As detailed in this blog post, the pseudo-header structure is built from the source IP, the destination IP, some constants, and the packet length. Algorithmically, what this means for parasitic computing is that you don’t get to construct a single pseudo-header for the problem you want to solve and then send that to various remote hosts for computation. Since the pseudo-header will change with the destination IP, we’re going to have to compute a new checksum for packet. BFJB suggests that we can evaluate:

TCP_CHECKSUM(data_payload) == magic_checksum

The reality is that we’re going to have to accommodate:

TCP_CHECKSUM(pseudo_header + data_payload) == magic_checksum

We accounted for this wrinkle using a bit of algebra. Our solution is to include an additional 16 bit word in the data-payload which is the value that zeros out the pseudo-header. We have:

anti_pseudo_header = 0xffff - sum(pseudo_header) TCP_CHECKSUM(pseudo_header + data_payload + anti_pseudo_header) == magic_checksum

Because anti-pseudo-header is the 16 bit word which, when added to sum of pseudo-header, results in 0x0000, we are able to simplify to the situation described in BFJB, where we merely have to check that the TCP_CHECKSUM of the data-payload (ie, a proposed solution) is equal to the magic checksum (ie, the solution key) for a given problem.

In more detail: In this simplistic version, we’re working on the class of problems that has 16 literals in 8 clauses. So, just like in BFJB, our data payload is 2 16-bit words long. We start by calculating our magic_checksum for a given problem as described in BFJB. This is probably most clearly expressed by showing some Python/scapy code (DISCLAIMER: non-optimized, barely-legible, proof-of-concept code, shared here for the edification of all):

HEXCHARS_PER_BYTE = 2
DATA_OFFSET = 5 # this isn't always 5, but it is in our case, see IP.ihl
BYTES_PER_SECTION = 4
CHARS_PER_WORD = 4 * HEXCHARS_PER_BYTE
IP_HEADER_LEN = DATA_OFFSET * HEXCHARS_PER_BYTE * BYTES_PER_SECTION

def sum_only(words):
    accumulator = 0
    for w in words:
        accumulator += w
    if accumulator >= 0x10000:
        accumulator = (accumulator + 1) % 0x10000

    return accumulator

def remove_chksum(hexstr, payload_byte_count):
    """surgically remove the checksum bytes and replace with 0000
       checksum is pay_len_chars + 8 chars counting from the right"""
    idx = -1*(8 + (payload_byte_count * HEXCHARS_PER_BYTE))
    left = hexstr[:idx]
    right = hexstr[idx+4:]
    return left + '0000' + right

def build_pseudo_header(pak_str, tcp_pak_str):
    src = pak_str[3*CHARS_PER_WORD:4*CHARS_PER_WORD]
    dst = pak_str[4*CHARS_PER_WORD:5*CHARS_PER_WORD]
    reserved = '00'
    proto = '06'
    tcp_len = int(len(tcp_pak_str) / HEXCHARS_PER_BYTE)
    tcp_len_str = hex(tcp_len)[2:].zfill(4)
    return src + dst + reserved + proto + tcp_len_str
PAYLOAD_FORMAT = '!HHH'

S0 = ['0'+b for i,b in enumerate(self._assignment) if i % 2 == 0]
S1 = ['0'+b for i,b in enumerate(self._assignment) if i % 2 == 1]
w0 = int(''.join(S0), 2)
w1 = int(''.join(S1), 2)
payload = pack(PAYLOAD_FORMAT, w0, w1, 0)
syn_pak=ip/TCP(dport=80, flags='S')/payload
pak_str = hexlify(bytes(syn_pak)).decode('ascii')
payload_byte_count = calcsize(PAYLOAD_FORMAT)
tcp_pak_str = remove_chksum(pak_str[IP_HEADER_LEN:], payload_byte_count)
ph = build_pseudo_header(pak_str, tcp_pak_str)
idx = -1 * (payload_byte_count * HEXCHARS_PER_BYTE)
to_sum = ph + tcp_pak_str[:idx]

# four hexchar per 16-bit checksum word
words = []
for i in range(int(len(to_sum)/4)):
words.append(int(to_sum[i * 4:i * 4 + 4], 16))

fake_sum = sum_only(words)
anti_header = 0xffff - fake_sum

# now build a new syn pak with the calculated w3
syn_pak=ip/TCP(dport=80, flags='S')/pack(PAYLOAD_FORMAT, w0, w1, anti_header)

del syn_pak[TCP].chksum
syn_pak[TCP].chksum = magic_chksum
r = sr1(syn_pak, timeout=1, verbose=0)

The calculations above show how we used an algebraic trick to build the anti-header and stick with a constant “magic checksum” per problem. The ‘assignment’ tuple there is just a vector of {0,1} in correspondence with a variable assignment into the literal space.

Takeaways

Our experiment in parasitic computing turned up an interesting thing; false positives. BFJB talked about the fact that if you override the TCP checksum and use it to instead validate the solution to a boolean algebra problem, you have removed the seat belt provided by the checksum, so you no longer have a data-integrity check on your packet. This creates the danger of false negatives: if a packet you send out with a good solution for your problem gets corrupted in transport such that the checksum fails, you won’t get a reply from that host. Additionally, if your implementation isn’t robust enough to minimize false negatives, you can miss results.The obvious solution here is just to send out each packet more than once, as the odds that successive tries are each corrupted are quick-diminishingly low. However, BFJB didn’t mention false positives: hosts which will reply no matter what the TCP checksum contains.

It’s worth noting here that we didn’t completely follow the technique specified in BFJB. They suggest establishing a TCP connection with a remote host, then using an HTTP/GET request for the actual computational work. We decided to skip the initial handshake and instead use the handshake request for the parasitic computation. We craft our parasitic packet directly, setting the SYN flag. We then look for SYN-ACK. Our reading of the TCP standard suggests that on properly configured hosts, we should only expect a reply if the checksum passes validation. Our approach, while quicker for initially coding something up that sorta worked, has some drawbacks: among them is that probably increase the chances of false negatives (for example, perhaps the host is no longer online).

What we didn’t expect was to find hosts which would consistently reply to our SYN packet no matter what the contents of the checksum contained. These false positives were surprisingly common in our testing when we moved out of our local network and began sending packets to external servers. To test our ability to distribute many solution packets out to many servers, we started discussing what type of servers would be an ethical target for testing. As far as we could discern, our packed solutions shouldn’t cause undue stress on a large public facing web server and looked for viable candidates. We settled on using the Alexa Top Sites list, servers on this list are built to handle large volume and are distributed fairly well around the globe.

While we were successful in getting responses from a number of these servers and even found solutions to the problem sets we were testing, false positives began to emerge. A number of the servers on the list that we used would seemingly return SYN-AWK responses no matter the validity of the TCP checksums. Among all of the oddities that arose in re-engineering BFJB, this was one that stuck out. Whether this justifies our ramblings or not, we leave to you, dear reader.

We also did a bit of reading and thinking about how boolean algebra satisfiability maps to other problems we wanted to solve. Joshua’s favorite was discrete tomography, because he really wants to make a thingy where the internet will paint-by-numbers for him. Another of our colleagues, however, pointed out that the factoring of large prime numbers, is describable in terms of boolean algebra.  Depending on your mapping into the algebra space, there’s clearly an upper limit on how many clauses could be computed within a single packet. Nevertheless, even very large problems (because all clauses are conjoined by AND) could be collected into a series of packets for parasitic computing.

About the Authors

Joshua Crowgey is a Junior Engineer on the Backend Team at DomainTools as well as a PhD candidate in the Department of Linguistics at the University of Washington. Eats vegetables, prefers vim, uses LaTeX. Like many human beings, he has activities and hobbies which he will speak with you about upon request.

Matt Erhart, Software Engineer at DomainTools and meddling amateur strongman, has been involved in web development for roughly a decade. When not messing about in various web stacks, he can be found participating in in deep philosophical “discussions” about silly topics.

Appendix

Below is a collection of some of the scripts and code we were playing with when exploring this space.

DISCLAIMER: THE FOLLOWING IS UNOPTIMIZED, BACK-OF-THE-ENVELOPE CODE. WE ARE NOT RESPONSIBLE FOR THE CONSEQUENCES OF YOUR USE OF THIS CODE INCLUDING BUT NOT LIMITED TO THERMONUCLEAR WARFARE, THE HOT-TOASTING OF YOUR CPU, THE BLACKLISTING OF YOUR IP ADDRESS, OR THE FACT THAT YOUR BREAKFAST NEVER TASTES GOOD ANYMORE.

#!/usr/bin/env python
#
# implementation of the nsat problem using
# tcp parasitic computing
from collections import namedtuple

Clause = namedtuple('Clause', ['x', 'y', 'negx', 'negy', 'op'])
OPTABLE = { 'and': lambda x, y: x and y,
            'or' : lambda x, y: x or y,
            'xor': lambda x, y: x ^ y,
            'nor': lambda x, y: not (x or y)}

class TwoSat(object):

    def __init__(self, clauses):
        self.clauses = [Clause(*c) for c in clauses]

    def fake_checksum(self):
        chcksum = ''
        for clause in self.clauses:
            # chcksum += '01' if clause.op == 'xor' else '10'
            # NO! it's the opposite
            if clause.op == 'xor':
                chcksum += '10'
            elif clause.op == 'and':
                chcksum += '01'
            elif clause.op == 'nor':
                chcksum += '00'
            else:
                return None
 return chcksum

    def check_clause(self, clause, x, y):
        x = x ^ clause.negx
        y = y ^ clause.negy
        return OPTABLE[clause.op](x, y)

    def check(self, assignment):
        for clause in self.clauses:
            if not self.check_clause(clause, assignment[clause.x], assignment[clause.y]):
                return False
        return True


#!/usr/bin/env python3
import sys
import itertools
from binascii import hexlify
from nsat import TwoSat
from scapy.all import *
from struct import pack, calcsize
from threading import Thread
from queue import Queue

# taobao.com seems to reply no matter what!?
# so, uh, false positives :(

HEXCHARS_PER_BYTE = 2
DATA_OFFSET = 5  # this isn't always 5, but it is in our case, see IP.ihl
BYTES_PER_SECTION = 4
CHARS_PER_WORD = 4 * HEXCHARS_PER_BYTE
IP_HEADER_LEN = DATA_OFFSET * HEXCHARS_PER_BYTE * BYTES_PER_SECTION

IN_FLIGHT_MAX = 512
in_flight = Queue(maxsize=IN_FLIGHT_MAX)

# instantiate the problem
ns = TwoSat([('x1', 'x2', False, False, 'xor'), ('x3', 'x4', False, False, 'and'),
        ('x5', 'x6', False, False, 'and'), ('x7', 'x8', False, False, 'and'),
        ('x9', 'x10', False, False, 'xor'), ('x11', 'x12', False, False, 'and'),
        ('x13', 'x14', False, False, 'and'), ('x15', 'x16', False, False, 'xor')])

magic_chksum = int(ns.fake_checksum(), 2)

results = {}

class ScapySender(Thread):
    def __init__(self, ass, host):
        Thread.__init__(self)
        self._assignment = ass
        self._host = host

    def run(self):
        try:
            ip=IP(dst=self._host)
        except socket.gaierror:
            # todo, need a way to put this back on the queue
            print("socket gaierror: {}".format(self._host), file=sys.stderr)
            return
        except OSError:
            print("OSError: {}".format(self._host), file=sys.stderr)
            # todo, need a way to put this back on the queue
            return

        PAYLOAD_FORMAT = '!HHH'

        S0 = ['0'+b for i,b in enumerate(self._assignment) if i % 2 == 0]
        S1 = ['0'+b for i,b in enumerate(self._assignment) if i % 2 == 1]

        w0 = int(''.join(S0), 2)
        w1 = int(''.join(S1), 2)
        payload = pack(PAYLOAD_FORMAT, w0, w1, 0)

        try:
            syn_pak=ip/TCP(dport=80, flags='S')/payload
            pak_str = hexlify(bytes(syn_pak)).decode('utf-8')
        except struct.error as exc:
            print("struct error: {} {}".format(payload, self._host), file=sys.stderr)
            print("ass: {}, s0 {}, s1 {}, w0 {}, w1 {}".format(self._assignment, S0, S1, w0, w1))
            print(str(exc))
            print("---")
            return

        payload_byte_count = calcsize(PAYLOAD_FORMAT)
        tcp_pak_str = remove_chksum(pak_str[IP_HEADER_LEN:], payload_byte_count)

        ph = build_pseudo_header(pak_str, tcp_pak_str)
        idx = -1 * (payload_byte_count * HEXCHARS_PER_BYTE)
        to_sum = ph + tcp_pak_str[:idx]

        # four hexchar per word
        words = []
        for i in range(int(len(to_sum)/4)):
            words.append(int(to_sum[i*4:i*4+4], 16))

        fake_sum = sum_only(words)
        w3 = 0xffff - fake_sum

        # now build a new syn pak with the calculated w3
        syn_pak=ip/TCP(dport=80, flags='S')/pack(PAYLOAD_FORMAT, w0, w1, w3)
 del syn_pak[TCP].chksum
        syn_pak[TCP].chksum = magic_chksum
        r = sr1(syn_pak, timeout=1, verbose=0)

        in_flight.get()
        if r:
            ass_dict = build_assdict(self._assignment)
            is_good = ns.check(ass_dict)
            results[self._assignment] = { "host": self._host, "reply": True, "good": is_good }
        else:
            results[self._assignment] = { "host": self._host, "reply": False }

        return



def build_pseudo_header(pak_str, tcp_pak_str):
    src = pak_str[3*CHARS_PER_WORD:4*CHARS_PER_WORD]
    dst = pak_str[4*CHARS_PER_WORD:5*CHARS_PER_WORD]
    reserved = '00'
    proto = '06'
    tcp_len = int(len(tcp_pak_str) / HEXCHARS_PER_BYTE)
    tcp_len_str = hex(tcp_len)[2:].zfill(4)
    return src + dst + reserved + proto + tcp_len_str


def remove_chksum(hexstr, payload_byte_count):
    """surgically remove the checksum bytes and replace with 0000
       checksum is pay_len_chars + 8 chars counting from the right"""
    idx = -1*(8 + (payload_byte_count * HEXCHARS_PER_BYTE))
    left = hexstr[:idx]
    right = hexstr[idx+4:]
    return left + '0000' + right


def sum_only(words):
    accumulator = 0
    for w in words:
        accumulator += w
        if accumulator >= 0x10000:
            accumulator = (accumulator + 1) % 0x10000

    return accumulator

def build_assdict(ass):
    return {'x'+str(i+1) : True if v == '1' else False  for i, v in enumerate(ass)}


def try_all(hostsfile):
    # 2^16 baby!
    all_assignments = itertools.product(('1', '0'), repeat=16)
    i = 0
    try:
        while all_assignments:
            if not in_flight.full():
                ass = next(all_assignments)
                if i % 256 == 0:
                    print("loading {}th".format(i))
                host = hostsfile.readline().split(',')[0]
                t = ScapySender(ass, host)
                in_flight.put(ass)
                t.start()
                i +=1
    except StopIteration:
        pass


    for key, value in results.items():
        if value['reply'] == True and value['is_good'] == True:
            print("good solution: {}".format(key))
            print("thank you {}".format(value['host']))

if __name__ == '__main__':
    with open('top1mscannable.csv') as hostsfile:
        # try_one(('1', '0', '1', '1', '1', '0', '1', '0', '1', '1', '1', '0', '1', '1', '1', '0' ))
        try_all(hostsfile)