Inspired by the diversity and depth of XLand and the simplicity and minimalism of MiniGrid, we present XLand-MiniGrid, a suite of tools and grid-world environments for meta-reinforcement learning research. Written in JAX, XLand-MiniGrid is designed to be highly scalable and can potentially run on GPU or TPU accelerators
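Inspired by the diversity and depth of XLand and the simplicity and minimalism of MiniGrid, we present XLand-MiniGrid, a suite of tools and grid-world environments for meta-reinforcement learning research. XLand-MiniGrid is written in JAX, is designed to be highly scalable, and can potentially run on GPU or TPU accelerators, democratizing large-scale experimentation with limited resources.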
We present XLand-MiniGrid, a suite of tools and grid-world environments for meta-reinforcement learning research inspired by the diversity and depth of XLand and the simplicity and minimalism of MiniGrid. XLand-MiniGrid is written in JAX, designed to be highly scalable, and can potentially run on GPU or TPU accelerators, democratizing large-scale
We present XLand-100B, a large-scale dataset for in-context reinforcement learning based on the XLand-MiniGrid environment, as a first step to alleviate this problem. It contains complete learning histories for nearly 30,000 different tasks, covering 100B transitions and 2.5B episodes.
XLand-MiniGrid was created to close this gap," explained Viacheslav Sinii of T-Bank AI Research. Vladislav Kurenkov, head of the Adaptive Agents group, added that thanks to the diversity of tasks
XLand-MiniGrid is written in JAX, designed to be highly scalable, and can potentially run on GPU or TPU accelerators, democratizing large-scale experimentation with limited resources. To demonstrate the generality of our library, we have implemented some well-known single-task environments as well as new meta-learning environments capable of
Written in JAX, XLand-MiniGrid is designed to be highly scalable and can potentially run on GPU or TPU accelerators, democratizing large-scale experimentation with limited resources. Along with the environments, XLand-MiniGrid provides pre-sampled benchmarks with millions of unique tasks of varying difficulty and easy-to-use baselines that
Minigrid contains simple and easily configurable grid world environments to conduct Reinforcement Learning research. This library was previously known as gym-minigrid.
XLand-MiniGrid is a suite of tools, grid-world environments and benchmarks for meta-reinforcement learning research inspired by the diversity and depth of XLand and the simplicity and minimalism of MiniGrid. Despite the similarities, XLand-MiniGrid is written in JAX from scratch and designed to be highly scalable, democratizing large-scale
We present XLand-MiniGrid, a suite of tools and grid-world environments for meta-reinforcement learning research inspired by the diversity and depth of XLand and the simplicity and minimalism of MiniGrid. XLand-MiniGrid is written in JAX, designed to be highly scalable, and can potentially run on GPU or TPU accelerators, democratizing large
introduce XLand-MiniGrid, a library of grid-world environments for meta-RL research. It does not compromise on task complexity in favour of affordability, democratizing large-scale experimentation with limited resources.
2 XLand-MiniGrid
We present an initial release of XLand-MiniGrid (v0.0.1), a suite of tools and grid-world environments
xland-minigrid open-source project tutorial. xland-minigrid: JAX-accelerated Meta-Reinforcement Learning Environments Inspired by XLand and MiniGrid
Along with the environments, XLand-MiniGrid provides pre-sampled benchmarks with millions of unique tasks of varying difficulty and easy-to-use baselines that allow users to quickly start training adaptive agents.
ck time. While we do not introduce any novel algorithmic improvements in our work, we hope that the proposed highly scalable XLand-MiniGrid environments will help practitioners perform meta-reinforcement learning experiments at scale faster and with fewer r
Similar to Jumanji (Bonnet et al., 2023), the XLand-MiniGrid Environment interface is inspired by the dm_env API (Muldal et al., 2019), which is particularly well suited for meta-RL, as it separates episodes from trials by design (see Section D.1). Thus, each environment should provide jit-compatible reset, reset_trial and step methods.
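To make the episode/trial separation concrete, here is a minimal JAX sketch of a toy 1-D environment exposing jit-compatible reset, reset_trial and step functions. It is not the XLand-MiniGrid API: the TimeStep fields, LINE_LENGTH, and the function signatures are hypothetical, and the split of responsibilities (reset_trial samples a new task, reset starts a new episode on the current task) is an assumption made for illustration only.

```python
from typing import NamedTuple

import jax
import jax.numpy as jnp

LINE_LENGTH = 8             # hypothetical size of a 1-D toy world
FIRST, MID, LAST = 0, 1, 2  # dm_env-style step types


class TimeStep(NamedTuple):
    """Hypothetical timestep container, loosely following dm_env."""
    position: jax.Array   # agent position on the line
    goal: jax.Array       # task definition, fixed within a trial
    step_type: jax.Array  # FIRST, MID or LAST
    reward: jax.Array


@jax.jit
def reset_trial(key):
    """Assumed semantics: sample a new task (goal) and start its first episode."""
    goal = jax.random.randint(key, (), 1, LINE_LENGTH)
    return TimeStep(jnp.int32(0), goal, jnp.int32(FIRST), jnp.float32(0.0))


@jax.jit
def reset(timestep):
    """Assumed semantics: start a new episode while keeping the current task."""
    return TimeStep(jnp.int32(0), timestep.goal, jnp.int32(FIRST), jnp.float32(0.0))


@jax.jit
def step(timestep, action):
    """Move left (action 0) or right (action 1); the episode ends at the goal."""
    delta = jnp.where(action == 1, 1, -1)
    position = jnp.clip(timestep.position + delta, 0, LINE_LENGTH - 1)
    done = position == timestep.goal
    return TimeStep(
        position=position.astype(jnp.int32),
        goal=timestep.goal,
        step_type=jnp.where(done, LAST, MID).astype(jnp.int32),
        reward=done.astype(jnp.float32),
    )


# One trial with two episodes on the same task.
ts = reset_trial(jax.random.PRNGKey(0))
while int(ts.step_type) != LAST:
    ts = step(ts, jnp.int32(1))  # always move right
ts = reset(ts)                   # new episode, same goal
```

Because every function is pure and works on arrays only, each can be wrapped in jax.jit and later batched with jax.vmap without changes.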
For single-task environments, we consider a random policy and PPO. As can be seen, compared to the commonly used MiniGrid (Chevalier-Boisvert et al., 2023) environments with gymnasium (Towers et al., 2023) asynchronous vectorization, XLand-MiniGrid achieves at least 10x higher throughput, reaching tens of millions of steps per second.
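The general pattern behind such throughput measurements with a random policy can be sketched as follows: batch a pure step function with jax.vmap, roll it out with jax.lax.scan inside a single jitted call, and time the call after a warm-up run. The toy_step function, NUM_ENVS, NUM_STEPS and GRID are hypothetical stand-ins, not the library's environments or the benchmark settings from the text, so the number this produces says nothing about the reported results.

```python
import time

import jax
import jax.numpy as jnp

NUM_ENVS = 1024   # hypothetical number of parallel environments
NUM_STEPS = 100   # hypothetical rollout length
GRID = 8          # hypothetical size of a 1-D toy world


def toy_step(position, action):
    """Stand-in for an environment step: move left or right on a line."""
    delta = jnp.where(action == 1, 1, -1)
    return jnp.clip(position + delta, 0, GRID - 1)


@jax.jit
def rollout(key):
    """Run NUM_STEPS random-policy steps in NUM_ENVS environments at once."""

    def body(positions, step_key):
        # Random policy: one uniformly sampled action per environment.
        actions = jax.random.randint(step_key, (NUM_ENVS,), 0, 2)
        positions = jax.vmap(toy_step)(positions, actions)
        return positions, None

    keys = jax.random.split(key, NUM_STEPS)
    positions, _ = jax.lax.scan(body, jnp.zeros(NUM_ENVS, jnp.int32), keys)
    return positions


# The first call includes compilation, so it is excluded from the timing.
final = rollout(jax.random.PRNGKey(0)).block_until_ready()

start = time.perf_counter()
final = rollout(jax.random.PRNGKey(1)).block_until_ready()
elapsed = time.perf_counter() - start

steps_per_second = NUM_ENVS * NUM_STEPS / elapsed
print(f"{steps_per_second:.0f} steps per second")
```

Calling block_until_ready matters here: JAX dispatches work asynchronously, so timing without it would measure only the dispatch, not the computation.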
The full-scale XLand environment can use more than five rules, according to Team et al. (2023). To test XLand-MiniGrid under similar conditions, we report simulation throughput while varying the number of rules. For testing purposes, we simply replicated the same NEAR rule multiple times in the PutNear environment.
Baselines
With the release of XLand-MiniGrid, we are providing near-single-file implementations of recurrent PPO (Schulman et al., 2017) for single-task environments and its extension to RL2 (Duan et al., 2016; Wang et al., 2016) for meta-learning as b