Compute derivation only once per tx, instead of once per output. Approx 33% faster while using 75% as much CPU on my machine. Note old functions in cryptonote_core (lookup_acc_outs and is_out_to_acc) are still used by tests.