by Daan Wout, Jan Scholten, Carlos E. Celemin, and Jens Kober
Reference:
Daan Wout, Jan Scholten, Carlos E. Celemin, and Jens Kober. Learning Gaussian Policies from Corrective Human Feedback. arXiv:1903.05216 [cs.LG], 2019.
BibTeX Entry:
@Misc{Wout2019arXiv,
author = {Wout, Daan AND Scholten, Jan AND Celemin, Carlos E. AND Kober, Jens},
note = {arXiv:1903.05216 [cs.LG]},
title = {Learning {G}aussian Policies from Corrective Human Feedback},
year = {2019},
doi = {10.48550/ARXIV.1903.05216},
code = {https://github.com/DWout/GPC},
file = {https://arxiv.org/pdf/1903.05216.pdf},
project = {TERI},
oa = {bronze},
}