An analysis of Schtserv's rare enemy rates

With schthack's relaunching of their blue burst server, I was curious if the infamous rare enemy reader prank addon worked on it. However, this took a strange turn into us analyzing the rates of rare enemies compared to other servers.

But first, let's go over how stock PSO handles rare enemies. In memory there is a [u16; 16] where each element is either an index into the object table or 0xFFFF for a null value. For example, if the first rag rappy were rare in the quest Respective Tomorrow, the packet would look like this:
0000 | DE 00 18 00 00 00 00 00 1D 00 FF FF FF FF FF FF
0010 | FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF
0020 | FF FF FF FF FF FF FF FF
Now, removing the packet header and keeping just the payload:
0000 | 1D 00 FF FF FF FF FF FF FF FF FF FF FF FF FF FF
0010 | FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF
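As a rough illustration of that layout, the payload can be decoded as sixteen u16 values. This is only a sketch: the name parse_rare_table is made up here, and the little-endian reading is an assumption based on the bytes 1D 00 standing for enemy id 0x1D.

```python
import struct

NULL_ENTRY = 0xFFFF  # marks an unused slot in the table

def parse_rare_table(payload: bytes) -> list[int]:
    """Decode a 32-byte payload into the rare enemy ids it contains."""
    # "<16H" = sixteen little-endian unsigned 16-bit integers (assumed byte order)
    entries = struct.unpack("<16H", payload)
    return [entry for entry in entries if entry != NULL_ENTRY]

# payload from the Respective Tomorrow example above
payload = bytes.fromhex("1D00" + "FFFF" * 15)
print([hex(enemy_id) for enemy_id in parse_rare_table(payload)])
```

Only the first slot is filled in this example, so the decoded list holds the single rappy id.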

This blob of data just gets directly copied into memory and is later referenced in a function:
// this function generated by ghidra's decompiler
bool __cdecl is_rare_enemy(short enemy_id)
{
  if ((rare_enemy_table != (short *)0x0) &&
     (((((enemy_id == *rare_enemy_table || (enemy_id == rare_enemy_table[1])) ||
        (enemy_id == rare_enemy_table[2])) ||
       (((enemy_id == rare_enemy_table[3] || (enemy_id == rare_enemy_table[4])) ||
        ((enemy_id == rare_enemy_table[5] ||
         ((enemy_id == rare_enemy_table[6] || (enemy_id == rare_enemy_table[7])))))))) ||
      ((enemy_id == rare_enemy_table[8] ||
       (((((enemy_id == rare_enemy_table[9] || (enemy_id == rare_enemy_table[10])) ||
          (enemy_id == rare_enemy_table[0xb])) ||
         ((enemy_id == rare_enemy_table[0xc] || (enemy_id == rare_enemy_table[0xd])))) ||
        ((enemy_id == rare_enemy_table[0xe] || (enemy_id == rare_enemy_table[0xf])))))))))) {
    return true;
  }
  return false;
}

The server would typically generate this data by going through each enemy, checking if it could be rare, rolling the die to see if it is rare, and if it is then add the enemy's id to the array.
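A minimal sketch of that loop, assuming a 1/512 roll per eligible enemy and a table capped at 16 slots. The (enemy_id, can_be_rare) pair shape and the name build_rare_table are hypothetical, not taken from any real server.

```python
import random

TABLE_SIZE = 16
NULL_ENTRY = 0xFFFF
RARE_RATE = 512  # assumed: 1/512 chance per eligible enemy

def build_rare_table(enemies, rng=random):
    """enemies: iterable of (enemy_id, can_be_rare) pairs (hypothetical shape)."""
    table = []
    for enemy_id, can_be_rare in enemies:
        if len(table) == TABLE_SIZE:
            break  # the table only has room for 16 ids
        # roll the die only for enemies that have a rare variant
        if can_be_rare and rng.randrange(RARE_RATE) == 0:
            table.append(enemy_id)
    # pad the unused slots with the null value
    return table + [NULL_ENTRY] * (TABLE_SIZE - len(table))
```

The key property is that every eligible enemy gets exactly one roll, no matter what its id is.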
Anyway, the packet schtserv sends out is quite strange:
0000 | 2D 01 A9 02 00 03 BC 04 9B 06 C8 07 B8 08 D6 09
0010 | 01 0B D6 0C 46 0E 17 10 88 10 93 12 84 13 F8 14
At this point we have myself, a Curious Bystander, an Anonymous Expert, and the Original Prankster all workshopping this trying to figure out what is going on. I immediately think this is all junk data and begin digging through schty.dll to see if any of these functions are getting monkeypatched. The Anonymous Expert, however, is certain the data is correct and the server is just doing something fucky.

Turns out he was right and I spent hours in schty.dll for nothing.

These values are properly used when the client checks for rare enemies; it just appears that they are generated in a strange way. And by strange I mean this is mostly junk. For example, the largest quest in standard server rotation is Tyrell's Ego, which has 1054 (0x41E) enemies. That number only encompasses the first three elements in this array. 0xB50 is the max amount of enemies a quest could possibly contain, which gets us all the way to the 9th element, the other 7 being completely useless.
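To double check those counts, here is a quick sketch that decodes the schtserv dump above and counts how many entries fall below each enemy limit. It assumes the 32 bytes are sixteen little-endian u16 values, same as the stock table.

```python
import struct

# the data schtserv sent, copied from the dump above
payload = bytes.fromhex(
    "2D01A9020003BC049B06C807B808D609"
    "010BD60C460E1710881093128413F814"
)
table = struct.unpack("<16H", payload)

TYRELLS_EGO_ENEMIES = 0x41E  # 1054 enemies
MAX_ENEMIES = 0xB50          # most enemies a quest could contain

print([hex(value) for value in table])
# entries small enough to point at an enemy in Tyrell's Ego
print(sum(value < TYRELLS_EGO_ENEMIES for value in table))
# entries small enough to point at an enemy in any quest
print(sum(value < MAX_ENEMIES for value in table))
```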

Disclaimer: everything past this point is speculation on how schthack generates the ids for rare enemies based on observing the rare enemy packet that the server sends.

I speculate that the way these numbers are generated is that the server continually generates a number 0-511, adds it to the previous number, then puts it in the array until it is full.
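Spelled out as code, the guess looks something like the sketch below. To be clear, this is my speculation written down, not schtserv's actual code; the 0-511 range and the name speculated_scht_table are assumptions.

```python
from itertools import accumulate
from random import randrange

TABLE_SIZE = 16
ROLL_RANGE = 512  # speculated: each step adds a number in 0..511

def speculated_scht_table(roll_range=ROLL_RANGE):
    """Running sum of 16 random rolls; a guess, not schtserv's actual code."""
    rolls = [randrange(roll_range) for _ in range(TABLE_SIZE)]
    # accumulate() yields rolls[0], rolls[0]+rolls[1], ... so each entry
    # is the previous entry plus a fresh roll
    return list(accumulate(rolls))

print([hex(value) for value in speculated_scht_table()])
```

Every table built this way is sorted from low to high, which matches the shape of the dump above.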

So this feels wrong, but I can't quite place how. I'm sure there's some math-y way to prove it but I'm shit at math so here we are. So to prove to myself that the logic I assume the server is doing is worse, I'm just gonna write a script that mocks out some runs of just temple in Respective Tomorrow looking at rappies.
from random import randrange
from functools import reduce
RAPPY_IDS =  {0x1D, 0x1E, 0x21, 0x22, 0x23, 0x6B, 0x6C, 0x46, 0x47, 0x48}
ITERATIONS = 1_000_000

def scht_rare_table():
    return set(reduce(lambda a,k: a + [k+a[-1]], [randrange(0, 512) for _ in range(15)], [randrange(0,512)]))

def vanilla_rare_table():
    return {rappy_id for rappy_id in RAPPY_IDS if randrange(0, 512) == 0}

def test_table(table_function):
    rare_counts = [len(table_function() & RAPPY_IDS) for _ in range(ITERATIONS)]
    return sum(rare_counts)
    
def report(id, count):
    print(f"{id}: {count/(len(RAPPY_IDS)*ITERATIONS)}")
    
report("scht", test_table(scht_rare_table))
report("vanilla", test_table(vanilla_rare_table))
print(f"ideal: {1/512.0}")
jake@sharnoth ~/post/0006 $ python rarerates.py
scht: 0.0022089
vanilla: 0.0019388
ideal: 0.001953125
Well that's interesting, if my assumption is correct then scht has slightly better rare enemy rates. Maybe this way has a better chance of getting multiple rare enemies in a single run?
from random import randrange
from functools import reduce
from collections import Counter
RAPPY_IDS = {0x1D, 0x1E, 0x21, 0x22, 0x23, 0x6B, 0x6C, 0x46, 0x47, 0x48}
ITERATIONS = 1_000_000

def scht_rare_table():
    return set(reduce(lambda a,k: a + [k+a[-1]], [randrange(0, 512) for _ in range(15)], [randrange(0,512)]))

def vanilla_rare_table():
    return {rappy_id for rappy_id in RAPPY_IDS if randrange(0, 512) == 0}

def test_table(table_function):
    rare_counts = [len(table_function() & RAPPY_IDS) for _ in range(ITERATIONS)]
    return (sum(rare_counts), Counter(rare_counts))
    
def report(id, count, counter):
    print(f"{id}: {count/(len(RAPPY_IDS)*ITERATIONS)} {counter}")
    
report("scht", *test_table(scht_rare_table))
report("vanilla", *test_table(vanilla_rare_table))
print(f"ideal: {1/512.0}")
jake@sharnoth ~/post/0006 $ python rarerates.py
scht: 0.0022244 Counter({0: 977939, 1: 21878, 2: 183})
vanilla: 0.0019589 Counter({0: 980597, 1: 19219, 2: 182, 3: 2})
ideal: 0.001953125
NOPE.

Disclaimer: from here on out I am really going to be hitting the limits of my math knowledge, anything I say here is possibly totally incorrect as I did not pay attention in school.

For science, what happens if instead RAPPY_IDS = set(range(10))?
jake@sharnoth ~/post/0006 $ python rarerates.py
scht: 0.0019614 Counter({0: 980556, 1: 19274, 2: 170})
vanilla: 0.00197 Counter({0: 980452, 1: 19397, 2: 150, 3: 1})
ideal: 0.001953125
WOW look at that almost perfect 1/512.

Let's go through some different rappy id tables:
RAPPY_IDS = set(range(200, 210))
jake@sharnoth ~/post/0006 $ python rarerates.py
scht: 0.002896 Counter({0: 971291, 1: 28458, 2: 251})
vanilla: 0.0019588 Counter({0: 980588, 1: 19239, 2: 170, 3: 3})
ideal: 0.001953125
RAPPY_IDS = set(range(300, 310))
jake@sharnoth ~/post/0006 $ python rarerates.py
scht: 0.0035257 Counter({0: 965062, 1: 34621, 2: 315, 3: 2})
vanilla: 0.0019513 Counter({0: 980644, 1: 19200, 2: 155, 3: 1})
ideal: 0.001953125
RAPPY_IDS = set(range(400, 410))
jake@sharnoth ~/post/0006 $ python rarerates3.py
scht: 0.004307 Counter({0: 957297, 1: 42338, 2: 363, 3: 2})
vanilla: 0.0019622 Counter({0: 980550, 1: 19280, 2: 168, 3: 2})
ideal: 0.001953125
It seems that the higher the enemy id the more likely it is to be rare.

RAPPY_IDS = set(range(10, 0xb50, int(0xb50/10)))
jake@sharnoth ~/post/0006 $ python rarerates.py
scht: 0.0036391 Counter({0: 964174, 1: 35264, 2: 559, 3: 3})
vanilla: 0.0019534 Counter({0: 980630, 1: 19207, 2: 162, 3: 1})
ideal: 0.001953125
And an evenly distributed rappy id set has a much higher rate than standard.

To control for maybe being incorrect about 512 being the number that the server adds on each iteration, let's try with 1024 and see if the general properties of this remain the same.
from random import randrange
from functools import reduce
from collections import Counter
RAPPY_IDS = set(range(10, 0xb50, int(0xb50/10)))
ITERATIONS = 1_000_000

def scht_rare_table():
    return set(reduce(lambda a,k: a + [k+a[-1]], [randrange(0, 1024) for _ in range(15)], [randrange(0,1024)]))

def vanilla_rare_table():
    return {rappy_id for rappy_id in RAPPY_IDS if randrange(0, 512) == 0}

def test_table(table_function):
    rare_counts = [len(table_function() & RAPPY_IDS) for _ in range(ITERATIONS)]
    return (sum(rare_counts), Counter(rare_counts))
    
def report(id, count, counter):
    print(f"{id}: {count/(len(RAPPY_IDS)*ITERATIONS)} {counter}")
    
report("scht", *test_table(scht_rare_table))
report("vanilla", *test_table(vanilla_rare_table))
print(f"ideal: {1/512.0}")
jake@sharnoth ~/post/0006 $ python rarerates.py
scht: 0.0010403 Counter({0: 989644, 1: 10309, 2: 47})
vanilla: 0.0019608 Counter({0: 980565, 1: 19262, 2: 173})
ideal: 0.001953125
RAPPY_IDS = set(range(200, 210))
jake@sharnoth ~/post/0006 $ python rarerates.py
scht: 0.0011827 Counter({0: 988240, 1: 11693, 2: 67})
vanilla: 0.0019461 Counter({0: 980693, 1: 19154, 2: 152, 3: 1})
ideal: 0.001953125
RAPPY_IDS = set(range(300, 310))
jake@sharnoth ~/post/0006 $ python rarerates.py
scht: 0.0013261 Counter({0: 986798, 1: 13143, 2: 59})
vanilla: 0.0019439 Counter({0: 980747, 1: 19068, 2: 184, 3: 1})
ideal: 0.001953125
RAPPY_IDS = set(range(400, 410))
jake@sharnoth ~/post/0006 $ python rarerates.py
scht: 0.0014332 Counter({0: 985724, 1: 14220, 2: 56})
vanilla: 0.0019609 Counter({0: 980565, 1: 19263, 2: 170, 3: 2})
ideal: 0.001953125
RAPPY_IDS = set(range(10, 0xb50, int(0xb50/10)))
jake@sharnoth ~/post/0006 $ python rarerates.py
scht: 0.0017881 Counter({0: 982232, 1: 17655, 2: 113})
vanilla: 0.0019483 Counter({0: 980684, 1: 19149, 2: 167})
ideal: 0.001953125
So even if the constant I have figured out is incorrect, this property should hold for whatever number they use.
As for why this is happening, I have no goddamn idea. Statistics classes are for sleeping, not listening. But I do suspect it has something to do with the fact that higher valued enemy ids can effectively have multiple rolls for a single enemy.

To control for variation with multiple enemy ids we will only have 1 id, and it will be the largest possible enemy id the game can handle.
RAPPY_IDS={0xB50}
jake@sharnoth ~/post/0006 $ python rarerates.py
scht: 0.003873 Counter({0: 996127, 1: 3873})
vanilla: 0.001917 Counter({0: 998083, 1: 1917})
ideal: 0.001953125
The result is approximately 1/256, twice as likely as a normal rare enemy.

However, that takes a minimum of 6 rolls to reach, so to figure out the formula we should use something a bit lower, like 0xFF:
RAPPY_IDS={0xFF}
jake@sharnoth ~/post/0006 $ python rarerates.py
scht: 0.003202 Counter({0: 996798, 1: 3202})
vanilla: 0.001918 Counter({0: 998082, 1: 1918})
ideal: 0.001953125
So let's just take the first 4 rolls, for speed of computation.
# rolls   # of possibilities that hit target
1         1
2         256
3         32896
4         2829056
Where the # of possibilities values come from:
#include <stdio.h>

int rolls2(int target)
{
    int result = 0;
    for(int x = 0; x < 512; x++) {
        for(int y = 0; y < 512; y++) {
            if (x + y == target) {
                result++;
            }
        }
    }

    return result;
}

int rolls3(int target)
{
    int result = 0;
    for(int x = 0; x < 512; x++) {
        for(int y = 0; y < 512; y++) {
            for(int z = 0; z < 512; z++) {
                if (x + y + z == target) {
                    result++;
                }
            }
        }
    }

    return result;
}

int rolls4(int target)
{
    int result = 0;
    for(int w = 0; w < 512; w++) {
        for(int x = 0; x < 512; x++) {
            for(int y = 0; y < 512; y++) {
                for(int z = 0; z < 512; z++) {
                    if (w + x + y + z == target) {
                        result++;
                    }
                }
            }
        }
    }

    return result;
}

int main()
{
    int r2 = rolls2(0xFF);
    printf("r2: %d\n", r2);
    int r3 = rolls3(0xFF);
    printf("r3: %d\n", r3);
    int r4 = rolls4(0xFF);
    printf("r4: %d\n", r4);
}
So we have here 1/512 + 256/(512*512) + 32896/(512*512*512) + 2829056/(512*512*512*512) which equals 0.003215949982404709. That's pretty close to the empirically measured value above.
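The same counts can be had without the nested loops, and extended to all 16 slots, with a small dynamic-programming pass. This is a sketch under the same speculation as before (16 rolls, each uniform on 0-511), and expected_hits is a made-up name. It adds up the chance that each slot equals the target, exactly like the sum above, so it very slightly overcounts tables where a roll of 0 repeats the target in two slots.

```python
from fractions import Fraction

ROLL_RANGE = 512  # speculated: each roll is uniform on 0..511
TABLE_SIZE = 16

def expected_hits(target, roll_range=ROLL_RANGE, table_size=TABLE_SIZE):
    """Sum over every slot k of P(slot k == target)."""
    # ways[s] = number of roll sequences of the current length that sum to s
    ways = [0] * (target + 1)
    ways[0] = 1  # zero rolls sum to 0 in exactly one way
    total = Fraction(0)
    for k in range(1, table_size + 1):
        # prefix[i] = ways[0] + ... + ways[i-1], so a window sum is one subtraction
        prefix = [0]
        for count in ways:
            prefix.append(prefix[-1] + count)
        # adding one roll r in 0..roll_range-1: new ways[s] = sum of old ways[s - r]
        ways = [prefix[s + 1] - prefix[max(0, s + 1 - roll_range)]
                for s in range(target + 1)]
        total += Fraction(ways[target], roll_range ** k)
    return total

# first four slots only, i.e. the same four terms as the sum above
print(float(expected_hits(0xFF, table_size=4)))
# all 16 slots, for the two ids tested earlier
print(float(expected_hits(0xFF)), float(expected_hits(0xB50)))
```

Using Fraction keeps the arithmetic exact until the final float conversion, and the prefix sums keep each pass linear in the target, so even 0xB50 with all 16 slots finishes right away.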

With this, I am comfortable saying that Schtserv's rare enemy rates are probably not what they intend, and given their new push towards a pure vanilla experience, they should fix this.


How this affects pranksnake kondreiu is left as an exercise for the reader.