Can’t understand entropy in tensorflow-probability.


I am new to tensorflow-probability. I am using a Categorical distribution to sample a value and then get its probability and entropy, but every time I sample from the distribution I get the same entropy. This comes up in a policy gradient algorithm: the NN outputs logits, which are fed to a Categorical distribution, and an action is then sampled from it. Please let me know what I am missing here.
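One likely explanation, sketched below in plain Python rather than with tensorflow-probability itself: entropy is a property of the whole distribution (that is, of the logits), not of any single sample, so it should stay constant for as long as the logits do. Only the log-probability depends on which action was drawn. The helpers `softmax`, `entropy`, and `sample`, and the logit values, are made up for illustration; in TFP the corresponding calls would be `tfp.distributions.Categorical(logits=...)` with its `.sample()`, `.log_prob()`, and `.entropy()` methods.

```python
import math
import random


def softmax(logits):
    """Convert logits to probabilities (shifted by the max for numerical stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats: H = -sum_i p_i * log(p_i)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def sample(probs, rng):
    """Draw one category index according to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical logits, standing in for the network output for one fixed state.
logits = [2.0, 0.5, -1.0]
probs = softmax(logits)
rng = random.Random(0)

results = []
for _ in range(5):
    action = sample(probs, rng)          # varies from draw to draw
    log_prob = math.log(probs[action])   # depends on the sampled action
    results.append((action, log_prob, entropy(probs)))  # entropy ignores the action

for action, log_prob, ent in results:
    print(f"action={action} log_prob={log_prob:.4f} entropy={ent:.4f}")

# The entropy only changes when the logits change, for example for a
# different state or after a gradient update to the network.
other_entropy = entropy(softmax([0.0, 0.0, 0.0]))
print(f"entropy for uniform logits={other_entropy:.4f}")
```

If this matches the setup in question, seeing the same entropy on every sample is expected behaviour. In a policy gradient loop the value should differ across states and drift as training updates the network.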

submitted by /u/Better-Ad8608