I was not questioning the ability to do what you describe; I was questioning the effectiveness.
Let's take a closer look at the example you provided and pretend you could encode the length/offset code for the "00100" pattern at the cost of a single escape symbol (which is not actually possible, but it puts a lower bound on the order-0 entropy):
data: 00110010011001001100100110010000100000100001000001 00010011001001
prevalent bit pattern: 00100 (let's call this "A")
new data: 0011A11A11A11AA0AA0A010011A1
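In case anyone wants to check the substitution, here's a throwaway Python sketch (I'm treating the space in the data line as a line break, since 43 + 21 = 64 accounts for every bit; the variable names are just mine):

from collections import Counter

data = ("00110010011001001100100110010000100000100001000001"
        "00010011001001")            # the 64-bit example above
pattern = "00100"                    # the prevalent bit pattern, "A"

new_data = data.replace(pattern, "A")
assert new_data == "0011A11A11A11AA0AA0A010011A1"
assert new_data.replace("A", pattern) == data   # substitution round-trips losslessly

print(Counter(data))       # 43 0's, 21 1's
print(Counter(new_data))   # 7 0's, 12 1's, 9 A's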
original data: 43 0's, 21 1's => order-0 entropy
58.43 bits (43 x 0.574 + 21 x 1.608)
new data: 7 0's, 12 1's, 9 A's, plus 4 0's, 1 1 and 1 escape for the dictionary entry (total: 11 0's, 13 1's, 9 A's, 1 escape) => order-0 entropy
58.28 bits (13 x 1.387 + 11 x 1.628 + 9 x 1.918 + 1 x 5.087)
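For anyone who wants to reproduce those two totals: the per-symbol costs are just -log2(count/total), i.e. the ideal order-0 code lengths. Here's the same arithmetic as a small sketch (the helper name is mine):

from math import log2

def order0_bits(counts):
    # ideal order-0 cost: each symbol costs -log2(count/total) bits
    total = sum(counts.values())
    return sum(n * -log2(n / total) for n in counts.values())

print(order0_bits({'0': 43, '1': 21}))                    # ~58.43 bits, original data
print(order0_bits({'0': 11, '1': 13, 'A': 9, 'esc': 1}))  # ~58.28 bits, new data + dictionary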
That would give you 0.15 bits of savings if you really could encode the length/offset code in just one escape symbol, but you can't, so you have not compressed this data at all.