Improve Remarkup parsing performance for certain large input blocks
Summary: Fixes T13487. In PHI1628, an install has a 4MB Remarkup corpus that takes a long time to render. This is broadly expected at that size, but a few reasonable improvements fell out of running it through the profiler.
Test Plan:
- Saw local cold-cache end-to-end rendering time drop from 12s to 4s for the highly secret input corpus.
- Verified output has the same hashes before/after.
- Ran all remarkup unit tests.
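For reference, the before/after hash check can be sketched roughly as follows. This is an illustration only, not the script that was actually used: the helper names are hypothetical, SHA-256 is an assumed choice of hash, and the two strings stand in for output rendered by the old and new code.

```python
import hashlib


def output_digest(rendered: str) -> str:
    """Return a hex digest of rendered output (SHA-256 is an assumption)."""
    return hashlib.sha256(rendered.encode("utf-8")).hexdigest()


def same_output(before: str, after: str) -> bool:
    """True when the old and new renderers produced byte-identical output."""
    return output_digest(before) == output_digest(after)


if __name__ == "__main__":
    # Placeholder strings; in practice these would be the rendered corpus
    # captured before and after the change.
    old_html = "<p>example</p>"
    new_html = "<p>example</p>"
    print("match" if same_output(old_html, new_html) else "MISMATCH")
```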
Maniphest Tasks: T13487
Differential Revision: https://secure.phabricator.com/D20968