It allocates no memory, but it runs a quicksort on every call. Other C++ implementations will probably not help. But if someone knows a trick to calculate percentiles of arbitrary arrays without sorting, we'll gratefully implement it.

For the percentile of a time series, if you need speed, you could keep a sorted copy of that series and use a binary search (bsearch) to find where to insert each new value and where to remove the oldest one. This would be faster than calling the percentile function every time, but it needs a bit more code in your script.
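
A minimal C++ sketch of that idea, in case it helps. The class name RollingPercentile and its methods are made up for illustration and are not part of any library. It assumes a fixed window length, no NaN values in the data, and linear interpolation between neighboring ranks, which may not match the convention the built-in percentile function uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical helper: keeps the last `window` values of a series
// in arrival order and, in parallel, in sorted order.
class RollingPercentile {
public:
    explicit RollingPercentile(std::size_t window) : window_(window) {
        assert(window > 0);  // a zero-length window makes no sense here
    }

    // Add a new value; once the window is full, drop the oldest one first.
    void push(double value) {
        if (order_.size() == window_) {
            double oldest = order_.front();
            order_.pop_front();
            // Binary search for the oldest value in the sorted copy.
            // It is guaranteed to be there, so the iterator is valid.
            auto it = std::lower_bound(sorted_.begin(), sorted_.end(), oldest);
            sorted_.erase(it);
        }
        order_.push_back(value);
        // Binary search for the insert position that keeps the copy sorted.
        auto pos = std::upper_bound(sorted_.begin(), sorted_.end(), value);
        sorted_.insert(pos, value);
    }

    // percent in 0..100. Assumption: linear interpolation between the
    // two nearest ranks. Returns 0 when no values were pushed yet.
    double percentile(double percent) const {
        if (sorted_.empty()) return 0.0;
        percent = std::min(100.0, std::max(0.0, percent));
        double rank = percent / 100.0 * static_cast<double>(sorted_.size() - 1);
        std::size_t lo = static_cast<std::size_t>(rank);
        std::size_t hi = std::min(lo + 1, sorted_.size() - 1);
        double frac = rank - static_cast<double>(lo);
        return sorted_[lo] + frac * (sorted_[hi] - sorted_[lo]);
    }

private:
    std::size_t window_;
    std::deque<double> order_;    // values in arrival order
    std::vector<double> sorted_;  // the same values, kept sorted
};
```

Each update costs two binary searches plus shifting part of the sorted array, and reading a percentile is then a direct lookup with no sort. The price is a second copy of the window in memory.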