This project is no longer actively maintained. While the code remains available for reference and use, no updates, bug fixes, or new features will be provided. Users are encouraged to seek alternative ...
RULER (Relative Universal LLM-Elicited Rewards) eliminates the need for hand-crafted reward functions by using an LLM-as-judge to automatically score agent trajectories. Simply define your task in the ...
PyPI, the default platform for Python's package management tools, is warning users of a fresh phishing campaign.