Lake Oswego CD
Standard Deviation of the image:
The disc contains the following files:
|filename||lines||words||bytes||contents|
|otp.txt||2376276||11881376||71288256||five columns* of 5-letter "words" (aka 5grams). Every permutation exists in the file.|
|p1.txt||11881376||23762752||142576512||two columns of 5-letter words, each containing every permutation; the first column is in alphabetical order and the second is shuffled.|
|p2.txt||11881376||23762752||142576512||same as p1.txt, with the second column shuffled differently.|
|p3.txt||11881376||23762752||142576512||same as p1.txt, with the second column shuffled differently.|
|p4.txt||11881376||23762752||142576512||same as p1.txt, with the second column shuffled differently.|
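The counts in the table agree with each other if every 5gram occupies 6 bytes: five letters plus one separator (a space or a single-byte newline). That layout is an assumption inferred from the numbers, not something stated on the disc. A minimal sketch of the check:

```python
# (lines, words, bytes) copied from the table above.
FILES = {
    "otp.txt": (2376276, 11881376, 71288256),
    "p1.txt": (11881376, 23762752, 142576512),
    "p2.txt": (11881376, 23762752, 142576512),
    "p3.txt": (11881376, 23762752, 142576512),
    "p4.txt": (11881376, 23762752, 142576512),
}

# Assumption: 5 letters + 1 separator byte (space or LF) per word.
BYTES_PER_WORD = 6


def check_counts(files=FILES):
    """Map each filename to True if bytes == words * BYTES_PER_WORD."""
    return {
        name: words * BYTES_PER_WORD == size
        for name, (lines, words, size) in files.items()
    }


if __name__ == "__main__":
    for name, ok in check_counts().items():
        print(name, "consistent" if ok else "inconsistent")
```

If the files used two-byte line endings, the byte totals would be larger than the table shows, so single-byte newlines seem the likelier reading.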
Px.txt sequential letters
On January 20th, 2014, it was found that when you compare the first 5gram in the right column of each file (the first "random" / OTP encoding), the last letters run in sequence from one file to the next. This could mean that the first 5gram was chosen by hand, or that the generation method is incremental.
|filename||first righthand entry||last letter|
|p1.txt (New York*)||QCUPP||P|
*The New York p1.txt file was not included in the Lake Oswego CD, but illustrates this pattern continuing later.
This pattern does not seem to apply to the second row or to later rows.
OTP.txt contains primarily lines representing 5 columns of 5-letter words. The file contains every 5-letter sequence possible.
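The claim that the file contains every possible 5-letter sequence can be checked mechanically. The sketch below is one way to do it, not the method actually used by the investigators; the function names and the small demo alphabet are illustrative only.

```python
import string


def covers_every_sequence(words, alphabet=string.ascii_uppercase, length=5):
    """True if `words` holds each possible sequence exactly once."""
    letters = set(alphabet)
    seen = set()
    for word in words:
        # Reject wrong-length words, foreign characters, and repeats.
        if len(word) != length or not set(word) <= letters or word in seen:
            return False
        seen.add(word)
    # All words are valid and distinct, so matching the total means full coverage.
    return len(seen) == len(alphabet) ** length


def expected_letter_count(alphabet_size=26, length=5):
    """Occurrences of each letter when every sequence appears exactly once."""
    return length * alphabet_size ** (length - 1)


if __name__ == "__main__":
    # Tiny demo with a 2-letter alphabet and 2-letter words.
    print(covers_every_sequence(["AA", "AB", "BA", "BB"], "AB", 2))
    print(expected_letter_count())
```

For the real file, the words could be streamed with something like `(w for line in open("otp.txt") for w in line.split())`. The value returned by `expected_letter_count()` is 2284880, which matches the per-letter count quoted in the IRC log on this page.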
Since there are only 11881376 (26^5) words possible, and that number is not divisible by 5, the file cannot consist solely of full 5-word lines, so a discrepancy eventually has to occur. Why this discrepancy does not appear only at the very end of the file is a mystery.
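The arithmetic can be spelled out as follows. The short-line sizes (one line of four words and a final line of two words) are taken from the end-of-file listing and the IRC log on this page; the variable names are illustrative.

```python
TOTAL_WORDS = 26 ** 5   # 11881376 possible 5-letter words
WORDS_PER_LINE = 5

# Layout one might expect: full lines, then a single leftover word.
full_lines, leftover = divmod(TOTAL_WORDS, WORDS_PER_LINE)
print(full_lines, leftover)   # 2376275 full lines, 1 word left over

# Layout actually observed: 2376276 lines, of which two are short.
observed_lines = 2376276
short_line_sizes = [4, 2]     # line 2376260 and the last line
observed_words = (observed_lines - len(short_line_sizes)) * WORDS_PER_LINE
observed_words += sum(short_line_sizes)
print(observed_words == TOTAL_WORDS)   # True
```

Both layouts use the same number of lines; the observed one spreads the shortfall of four words over two lines instead of leaving a single word on the last line.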
Lines 2376259-2376277 (end of file):
BLKYM IQLXY NBQJZ YKZYE ZGAZQ
UXAXQ MSCJP FBAAC QWAUP
KZDWN TPFJP JIEGZ HFVYQ MYIJH
LKMFS LYEHY TLOCT DJIPU KQGTV
UZQWO IVHKG XJKYC EACIC LLRZV
SSZJK KMOYG MYITG WTZXC ZNZCE
ITKOX RGPMQ ELPXY BHMDM ZINCT
MWUNQ YCEGO XRNBB KLHGV ZAIYR
AAFPQ ESSPV VXIXQ CBFAR ZHCPE
JUOLP KALWP JCJBK RLUHX AUYUF
WAUBI PNYHY USTMV XQMHO QQALE
GSVDZ NXMVD ASXXQ MCPOG SYRVE
LRXSC PEAMU IOKGK IPIHA GPALI
GIJQM SCEYT QXVAJ EPNJN TLYJB
UVPQZ ILKBW LZBXD RWUPE IBTPZ
JKCAP MQMPU VZWWQ CTPFD HUSYG
TECQY JDJXC PALQU BHHCE VVJBK
LWGPH FEENV
It is unknown at this time whether this fluke has any importance or is simply an artifact of the generation program.
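Short lines like these are easy to locate with a script. The following is a minimal sketch, assuming whitespace-separated words and 1-based line numbers; `find_short_lines` is a hypothetical helper, and the sample is the first three lines of the listing above.

```python
def find_short_lines(lines, words_per_line=5, first_line_number=1):
    """Return (line_number, words) for every line without the expected word count."""
    short = []
    for offset, line in enumerate(lines):
        words = line.split()
        if len(words) != words_per_line:
            short.append((first_line_number + offset, words))
    return short


# Lines 2376259-2376261 of otp.txt, as listed on this page.
SAMPLE = [
    "BLKYM IQLXY NBQJZ YKZYE ZGAZQ",
    "UXAXQ MSCJP FBAAC QWAUP",
    "KZDWN TPFJP JIEGZ HFVYQ MYIJH",
]

if __name__ == "__main__":
    for number, words in find_short_lines(SAMPLE, first_line_number=2376259):
        print(number, len(words), " ".join(words))
```

Run over the whole file (for example by passing an open file object as `lines`), this should report only the two short lines, provided the file has no blank lines.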
The method of generating these files remains unknown. Most likely the files were either generated by shuffling the list of all permutations or (less resource-intensively) generated from a formula with an increasing index.
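To make the two hypotheses concrete, here is a sketch of each. Neither is known to be the method actually used; the affine formula in particular is a stand-in chosen only because it is simple and provably visits every word once, and every name below is illustrative.

```python
import random
import string
from math import gcd

ALPHABET = string.ascii_uppercase


def index_to_word(index, length=5, alphabet=ALPHABET):
    """Convert 0 -> 'AAAAA', 1 -> 'AAAAB', ... (base-26 digits as letters)."""
    letters = []
    for _ in range(length):
        index, remainder = divmod(index, len(alphabet))
        letters.append(alphabet[remainder])
    return "".join(reversed(letters))


def shuffled_words(length=5, alphabet=ALPHABET, seed=None):
    """Hypothesis 1: build the full list of indices in memory and shuffle it."""
    indices = list(range(len(alphabet) ** length))
    random.Random(seed).shuffle(indices)
    return [index_to_word(i, length, alphabet) for i in indices]


def formula_words(multiplier, offset, length=5, alphabet=ALPHABET):
    """Hypothesis 2: derive each word from an increasing index, no list needed.

    Uses an affine map (multiplier * i + offset) mod N, which is a bijection
    whenever the multiplier is coprime to N. This is an assumed example formula.
    """
    total = len(alphabet) ** length
    if gcd(multiplier, total) != 1:
        raise ValueError("multiplier must be coprime to the number of words")
    for i in range(total):
        yield index_to_word((multiplier * i + offset) % total, length, alphabet)


if __name__ == "__main__":
    # Small demo with 2-letter words (676 entries) to keep it fast.
    print(shuffled_words(length=2, seed=1)[:5])
    print(list(formula_words(5, 3, length=2))[:5])
```

An affine map produces a visibly regular sequence, so a real generator would need something less predictable; the point here is only that index-based generation avoids holding all 11881376 words in memory at once.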
The existence of multiple different files seems to imply that a different seed value, formula, or variable was used when generating each file.
Use as order codes
Use as encoding keys
The P1-P4.txt files can be, and have been, used as replacement keys for 5gram messages. This is often seen in the List of NATO messages we received. In this scheme, a message is encoded by splitting it into sections of 5 letters, finding each section in the first column, and outputting the corresponding value from the second column. To decode the message, the reverse lookup is done: each section is found in the second column and the first-column value is output.
So far, all of our messages except for one have used Right-To-Left decoding (encoded Left-to-Right).
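The lookup scheme described above can be sketched in a few lines. The three key rows are the sample rows quoted in the IRC log on this page (it does not say which P file they come from); the function names are illustrative, and a real key file would supply all 11881376 rows.

```python
def load_key(lines):
    """Build a left-column -> right-column mapping from two-column lines."""
    key = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            left, right = parts
            key[left] = right
    return key


def invert(key):
    """Right-column -> left-column mapping (valid because the key is one-to-one)."""
    return {right: left for left, right in key.items()}


def translate(message, table, size=5):
    """Split the message into 5-letter sections and look each one up."""
    if len(message) % size != 0:
        raise ValueError("message length must be a multiple of %d" % size)
    sections = [message[i:i + size] for i in range(0, len(message), size)]
    return "".join(table[section] for section in sections)


SAMPLE_ROWS = ["AAAAA TADLL", "AAAAB EVQVQ", "AAAAC APKRK"]

if __name__ == "__main__":
    key = load_key(SAMPLE_ROWS)
    encoded = translate("AAAAAAAAAB", key)    # left-to-right lookup
    decoded = translate(encoded, invert(key))  # right-to-left lookup
    print(encoded, decoded)
```

In these terms, "Right-To-Left decoding" corresponds to translating with `invert(key)`, and a message encoded the other way round would be decoded with `key` itself.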
Files similar to OTP.txt have been generated by the device found at the Bursage Loop drop.
Relevant IRC posts
Abridged marcusw explanation of CD contents, #otp22, Sat Sep 15 17:50:00 UTC 2012
12:51 < marcusw> anyway, line/word/byte counts for the files are as follows:
12:51 < marcusw> 2376276 11881376 71288256 otp.txt
12:51 < marcusw> 11881376 23762752 142576512 p1.txt
12:51 < marcusw> 11881376 23762752 142576512 p2.txt
12:51 < marcusw> 11881376 23762752 142576512 p3.txt
12:51 < marcusw> 11881376 23762752 142576512 p4.txt
12:51 < marcusw> 49901780 106932384 641594304 total
12:51 < marcusw> so we see that p[1-4] are very similar to each other
12:52 < marcusw> looking at these files, we also see that the are 26^5 lines long
12:52 < marcusw> and also that they go like this:
12:52 < marcusw> AAAAA TADLL
12:52 < marcusw> AAAAB EVQVQ
12:52 < marcusw> AAAAC APKRK
12:53 < marcusw> all the way through
12:53 < marcusw> ZZZZX ZFZXT
12:53 < marcusw> ZZZZY KHDSO
12:53 < marcusw> ZZZZZ QMCTH
12:53 < marcusw> the first column is in order, every permutation from AAAAA to ZZZZZ
12:53 < marcusw> and the second column appears random at first glance
12:54 < marcusw> anyway, the second column is NOT random
12:54 < marcusw> it contains exactly the same thing as the first column
12:55 < marcusw> it's just shuffled around a lot
12:55 < marcusw> so each 5gram appears exactly twice in the file, once in each column
12:55 < marcusw> which means there's a one-to-one mapping both ways
12:55 < marcusw> which means you can encrypt and decrypt both ways
12:57 < marcusw> anyway, p[1-4] are identical except for the order of column 2
12:57 < marcusw> they are 4 versions of the same key
12:59 < marcusw> so we move on to otp.txt
12:59 < marcusw> otp.txt is a very, very interesting file
12:59 < marcusw> as camelCase (IIRC) observed, it has the exact same number of every letter
13:00 < marcusw> 2284880 of each, if my calculations are correct
13:00 < marcusw> this means that this file is again NOT random
13:00 < marcusw> even though it looks like it to begin with
13:00 < marcusw> see:
13:00 < marcusw> OICOV ZDNOU EKKYO GVKMD CSIRC
13:00 < marcusw> HXDGF DTKAS WTDAT QMBKK PKMVL
13:00 < marcusw> QGJIR HAVHB JXPMQ PZYIN HOJTK
13:00 < marcusw> FJVDE DWINR SOPLM GDGEG RPIGL
13:00 < marcusw> so we see, 5 words of 5 letters each line
13:01 < marcusw> 25 letters per line, 30 bytes per line
13:01 < marcusw> otp.txt...OTP=one time pad, which tells us how to use the file
13:02 < marcusw> 2376276 11881376 71288256 otp.txt
13:02 < marcusw> that's lines, words, bytes, if you're just joining us
13:02 < marcusw> 1 character = 1 byte
13:02 < marcusw> that includes newlines and spaces
13:02 < marcusw> so we see that 2376276 lines * 5 words/line = 11881380 words total
13:03 < marcusw> right?
13:03 < marcusw> WRONG
13:03 < marcusw> we only have 11881376 words
13:03 < marcusw> otp.txt is missing four words
13:03 < marcusw> before we figure out WHY, we're looking at WHERE
13:04 < marcusw> thankfully, they're very easy to find
13:06 < marcusw> the words missing from otp.txt are:
13:06 < marcusw> 2376260 UXAXQ MSCJP FBAAC QWAUP
13:06 < marcusw> 2376276 LWGPH FEENV
13:06 < marcusw> one missing from a line near the end, three missing from the last line
13:06 < marcusw> why are they missing from these lines specifically?
13:06 < marcusw> fuck if I know
13:07 < marcusw> an unpredictability of some sort in their script
13:07 < marcusw> it would appear the file was generated in chunks
13:07 < marcusw> but I'm going to talk about HOW the file was generated before that
13:07 < marcusw> remember how it has only 11881376 words instead of 11881380?
13:07 < marcusw> 11881376 is a special number, 26^5
13:08 < marcusw> and every letter appears the same number of times
13:08 < marcusw> so the solution is that otp.txt is simply all words AAAAA-ZZZZZ in random order
13:08 < marcusw> they just took a column of a p file
13:08 < marcusw> and scrambled it
13:09 < marcusw> and divided it into 5 columns
13:09 < marcusw> now the funny thing about 11881376 is that it's only divisible by 2 and 13
13:09 < marcusw> no other factors, especially not 5
13:10 < marcusw> and now we see why the words are missing
13:10 < marcusw> they wanted to have 11881376 words and 5 per line, but that's not possible
13:11 < marcusw> so as for exactly why the words are missing from those lines, I don't know
13:12 < marcusw> it looks like they were grabbed from a bucket or something, and when the bucket got empty, shit got weird
13:12 < marcusw> this concludes the report on otp.txt
13:13 < marcusw> ok, gonna talk about the CD
13:14 < marcusw> it probably has date+time stamps...IDK
13:14 < marcusw> I just looked at it with hexdump
13:14 < anonen_> marcusw: can you see what type of computer made it or any info at all
13:14 < marcusw> anonen_: yes, actually
13:15 <@asdgfsadg> marcusw, dump infoz
13:15 < marcusw> CDBURNERXP 18.104.22.16843
13:15 < marcusw> CD001
13:15 < marcusw> DISC