AI Engineering Glossary
Search

Policy Gradient

Policy Gradient methods are a class of algorithms that optimize policies directly by adjusting the parameters to maximize expected rewards. Unlike value-based methods, which evaluate and improve actions directly from the value function, policy gradients learn parameterized policies that directly map states to actions. This approach is powerful in environments with high-dimensional action spaces, such as robotics, where traditional methods struggle to represent all possible actions efficiently.

Search Perplexity | Ask ChatGPT | Ask Clade

a

b

c

d

e

f

g

h

i

j

k

l

m

n

o

p

q

r

s

t

u

v

w

z