Accuracy of Discretely Sampled Stochastic Policies in Continuous-time Reinforcement Learning