One more question. When we say that the function returns the run with the best posterior probability, what is an intuitive explanation of what that means in the context of LDA?

Also, I see that my original comment in this thread is still awaiting moderation, which I’m sure is simply a mistake!

]]>Apologies for the delay in my response. Yes, you are right. The clearest explanation I’ve seen of why burn-in is not necessary is this one by John Cook, where he states: *A Markov chain has no memory. That’s its defining characteristic: its future behavior depends solely on where it is, not how it got there. So if you “burn in” a thousand samples, your future calculations are absolutely no different than if you had started where the first thousand samples left off. Also, any point you start at is a point you might return to, or at least return arbitrarily close to again.*

Many thanks for pointing this out; it is a timely reminder that I must go back and review some of my older pieces!

Regards,

Kailash.

]]>