Architecting teamtask / list: The early years

Teamtask (a.k.a. List) is a pet project and algorithm I developed back in 2000 as part of a brute-force password-cracking experiment. Now that I think about it, though, I originally started working on the algorithm in 5th grade.

Back in my early years I wanted to password protect my computer, which wasn’t a thing back then. I set about writing a program with a prompt for a username and password. It worked. I could start it up when the computer started, and though you could just ctrl-c out of it (not super sophisticated), my next thought was how someone could get around it. I began investigating how to generate all possible passwords so they could be tested one after another until the password was guessed. I worked out the following two ways of listing them, shown here for the 3-character string ‘abc’ and a length of 3:

aaa 111 000
aab 112 001
aac 113 002

abc 123
acb 132
bac 213
bca 231
cab 312
cba 321

The algorithms were thus: one generates every string with characters reused (the first listing), and one generates every arrangement without reusing a character (the second listing). The first worked best for brute-force password cracking (though I didn’t know the term at the time, if it even existed). But in 5th grade I wasn’t able to create the algorithm to generate the strings. Later in life, though, I was able to create an algorithm for both by counting in a base equal to the number of characters, rather than base 10, along with factorials(!).
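As a rough illustration of the idea, here is a minimal Python sketch that turns an index into a string for each listing. It is my own reconstruction, not the original code: the function names are hypothetical, indices are assumed to start at 0, and the no-reuse version assumes the factorials are used as place values (the "factorial number system").

```python
from math import factorial


def nth_with_reuse(chars: str, length: int, index: int) -> str:
    """Return the index-th string of `length` over `chars`, reuse allowed.

    The index is read as a number written in base len(chars).
    """
    base = len(chars)
    if not 0 <= index < base ** length:
        raise IndexError("index out of range")
    picked = []
    for _ in range(length):
        index, digit = divmod(index, base)  # peel off the lowest digit
        picked.append(chars[digit])
    return "".join(reversed(picked))  # highest digit goes first


def nth_without_reuse(chars: str, index: int) -> str:
    """Return the index-th arrangement of `chars`, each character used once.

    The index is read in the factorial number system: the first position
    is worth (n-1)! arrangements, the next (n-2)!, and so on.
    """
    pool = list(chars)
    if not 0 <= index < factorial(len(pool)):
        raise IndexError("index out of range")
    picked = []
    for remaining in range(len(pool), 0, -1):
        pick, index = divmod(index, factorial(remaining - 1))
        picked.append(pool.pop(pick))  # remove so it cannot be reused
    return "".join(picked)


if __name__ == "__main__":
    print([nth_with_reuse("abc", 3, i) for i in range(3)])
    print([nth_without_reuse("abc", i) for i in range(6)])
```

With zero-based indices, the first call lists aaa, aab, aac and the second lists the six arrangements in the same order as above.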

With these two algorithms the following became possible. If there were 6 possible arrangements of characters, I could give a client just a number, such as 1, along with the string of characters ‘abc’, and the client could translate that into an arrangement such as ‘abc’ and do something with it (test whether the password works). Since the client only needs the index and the string of characters, we could also give out a block of indices, such as index=0, size=3. With 6 arrangements, that results in two blocks that two different clients can work on simultaneously. Each client takes a block, processes its three arrangements, then reports back the result.
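To make the block idea concrete, here is a small Python sketch of splitting the index range into blocks and of a client working through one block. All names are hypothetical, indices are assumed to start at 0 (so index 0 is ‘abc’), and `is_match` stands in for whatever check the client actually performs.

```python
from math import factorial
from typing import Callable, Optional


def nth_arrangement(chars: str, index: int) -> str:
    """Translate a zero-based index into an arrangement of `chars`."""
    pool = list(chars)
    picked = []
    for remaining in range(len(pool), 0, -1):
        pick, index = divmod(index, factorial(remaining - 1))
        picked.append(pool.pop(pick))
    return "".join(picked)


def split_into_blocks(total: int, size: int) -> list[tuple[int, int]]:
    """Cut the range 0..total-1 into (index, size) blocks."""
    return [(start, min(size, total - start)) for start in range(0, total, size)]


def process_block(
    chars: str, index: int, size: int, is_match: Callable[[str], bool]
) -> Optional[str]:
    """What a client does with one block: try each candidate in turn."""
    for i in range(index, index + size):
        candidate = nth_arrangement(chars, i)
        if is_match(candidate):
            return candidate
    return None  # nothing in this block matched


if __name__ == "__main__":
    blocks = split_into_blocks(factorial(3), 3)
    print(blocks)
    for index, size in blocks:
        print(index, size, process_block("abc", index, size, lambda s: s == "bca"))
```

The server only ever hands out two small numbers per block; the client rebuilds the candidates itself.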

There’s one more bit of magic in the implementation. One might initially implement the scheme above as follows: given 100 items to complete, broken into chunks of 10, you could add 10 records to a database to reflect the blocks pending completion:

index = 0, size = 10, status = pending
index = 10, size = 10, status = pending

index = 90, size = 10, status = pending

After each completes you could mark the status as ‘completed’ and once all blocks are completed flag the Job as done.
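A minimal sketch of this straightforward bookkeeping, using a plain Python list in place of a database table. The function names are hypothetical; the field names follow the records shown above.

```python
def make_blocks(total: int, size: int) -> list[dict]:
    """Create one pending record per block of `size` items."""
    return [
        {"index": start, "size": min(size, total - start), "status": "pending"}
        for start in range(0, total, size)
    ]


def complete(blocks: list[dict], index: int) -> None:
    """Mark the block starting at `index` as completed."""
    for block in blocks:
        if block["index"] == index:
            block["status"] = "completed"
            return
    raise KeyError(index)


def job_done(blocks: list[dict]) -> bool:
    """The job is done once every block is completed."""
    return all(block["status"] == "completed" for block in blocks)


if __name__ == "__main__":
    blocks = make_blocks(100, 10)
    for start in range(0, 100, 10):
        complete(blocks, start)
    print(len(blocks), job_done(blocks))
```

Note that the number of records grows with the number of blocks.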

However, when processing more extreme lists with thousands of blocks or more (think of testing for the largest prime number ever found), you wouldn’t want to hold the status of every block. One more algorithm makes this concern disappear: merge sibling blocks. Writing each block as (index, size, status), with status 1 meaning completed, suppose you have three blocks in a row that clients are working on: (0, 10, 0), (10, 10, 0), (20, 10, 0). If the second and third complete, giving (0, 10, 0), (10, 10, 1), (20, 10, 1), you can merge them for tracking purposes: (0, 10, 0), (10, 20, 1). If the first then completes, you can merge again into (0, 30, 1), indicating that everything from position 0 through a size of 30 has been completed. Conveniently, this means that when the whole list has been processed you are left with one block covering the whole size in completed status, for example (0, 100000000, 1).
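The merge step might look something like the following Python sketch. It assumes blocks are (index, size, status) tuples with status 1 meaning completed, and that only completed neighbours are merged, since blocks still being worked on need to stay separate. The function name is hypothetical.

```python
Block = tuple[int, int, int]  # (index, size, status); status 1 = completed


def merge_siblings(blocks: list[Block]) -> list[Block]:
    """Collapse runs of adjacent completed blocks into single blocks."""
    merged: list[Block] = []
    for index, size, status in sorted(blocks):
        if merged:
            prev_index, prev_size, prev_status = merged[-1]
            touching = prev_index + prev_size == index
            if touching and prev_status == 1 and status == 1:
                # Extend the previous completed block over this one.
                merged[-1] = (prev_index, prev_size + size, 1)
                continue
        merged.append((index, size, status))
    return merged


if __name__ == "__main__":
    print(merge_siblings([(0, 10, 0), (10, 10, 1), (20, 10, 1)]))
    print(merge_siblings([(0, 10, 1), (10, 20, 1)]))
```

Run on the example above, the first call leaves two tracked blocks and the second leaves one.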

The algorithm has since evolved to support timeouts on blocks (to handle a client disappearing or crashing while working on a block), a limit on the number of blocks a client can hold at one time (to make it harder for someone to interfere with processing by requesting blocks and not working on them), and OIDC so it can work within an enterprise infrastructure.
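To give a feel for the timeout and per-client limit, here is a small in-memory Python sketch. It is not the actual teamtask implementation: the class, its method names, and the default values are all hypothetical, and OIDC is left out entirely.

```python
import time
from typing import Callable, Optional

Block = tuple[int, int]  # (index, size)


class BlockLeases:
    """Hands out blocks with a deadline and a cap per client (illustrative)."""

    def __init__(self, total: int, size: int, timeout_s: float = 60.0,
                 max_per_client: int = 2,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.pending = [(s, min(size, total - s)) for s in range(0, total, size)]
        self.leases: dict[Block, tuple[str, float]] = {}  # block -> (client, deadline)
        self.completed: list[Block] = []
        self.timeout_s = timeout_s
        self.max_per_client = max_per_client
        self.clock = clock

    def _expire(self) -> None:
        """Return blocks whose deadline has passed to the pending list."""
        now = self.clock()
        for block, (_, deadline) in list(self.leases.items()):
            if deadline <= now:
                del self.leases[block]
                self.pending.append(block)
        self.pending.sort()  # hand out the lowest index first

    def request(self, client_id: str) -> Optional[Block]:
        """Give `client_id` a block, unless it already holds too many."""
        self._expire()
        held = sum(1 for owner, _ in self.leases.values() if owner == client_id)
        if held >= self.max_per_client or not self.pending:
            return None
        block = self.pending.pop(0)
        self.leases[block] = (client_id, self.clock() + self.timeout_s)
        return block

    def complete(self, client_id: str, block: Block) -> bool:
        """Accept a result only from the client currently holding the block."""
        self._expire()
        lease = self.leases.get(block)
        if lease is None or lease[0] != client_id:
            return False
        del self.leases[block]
        self.completed.append(block)
        return True
```

Passing the clock in as a parameter is just a convenience for trying the timeout behaviour without actually waiting.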

Roadmap:
– implement teamtask (a.k.a. list) as a container
– implement a webapp GUI and a mobile GUI, both from a single implementation using Flutter

