Adversarial Reinforcement Learning

This was a project under the IEEE NITK Computer Society. Its objective was to explore adversarial attacks and defenses in single-agent as well as multi-agent systems. In the single-agent domain, we focused on pixel-based attacks to explore the vulnerabilities of the function approximators (such as neural networks) used with popular deep reinforcement learning algorithms (such as PPO and DQN) that take pixel input in Atari games from the Gym environments. In the multi-agent domain, we concentrated on attacking state-of-the-art (SOTA) self-play policies by training adversarial policies, whose actions are naturally adversarial to the other agent, in 1-vs-1 zero-sum continuous-control robotic environments from the MuJoCo simulator. We also studied potential defense procedures to counter such attacks.
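To make the idea of a pixel-based attack concrete, here is a minimal sketch of one well-known method, the fast gradient sign method (FGSM). This is an illustration only: the overview above does not say which attacks were used, and the toy linear softmax policy, the function names (`softmax`, `logits_for`, `fgsm_perturb`) and the value of `EPSILON` are all assumptions made for this example, not code from this repository.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_for(weights, pixels):
    """Toy linear policy (an assumption): one weight row per action."""
    return [sum(w * p for w, p in zip(row, pixels)) for row in weights]


def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def fgsm_perturb(weights, pixels, epsilon):
    """Shift every pixel by at most epsilon to make the chosen action less likely."""
    probs = softmax(logits_for(weights, pixels))
    action = max(range(len(probs)), key=probs.__getitem__)
    # Gradient of the loss -log p[action] with respect to the logits.
    dlogits = [p - (1.0 if i == action else 0.0) for i, p in enumerate(probs)]
    # Chain rule through the linear layer gives the gradient per pixel.
    grad = [
        sum(dlogits[i] * weights[i][j] for i in range(len(weights)))
        for j in range(len(pixels))
    ]
    # Step in the sign of the gradient, then clip back to the valid range [0, 1].
    return [
        min(1.0, max(0.0, x + epsilon * sign(g)))
        for x, g in zip(pixels, grad)
    ]


# Hypothetical setup: 3 actions and a flattened 4x4 "frame" scaled to [0, 1].
random.seed(0)
EPSILON = 0.05
weights = [[random.uniform(-1.0, 1.0) for _ in range(16)] for _ in range(3)]
pixels = [random.random() for _ in range(16)]
adv_pixels = fgsm_perturb(weights, pixels, EPSILON)

before = softmax(logits_for(weights, pixels))
after = softmax(logits_for(weights, adv_pixels))
print("action probabilities before:", before)
print("action probabilities after: ", after)
```

With a deep network in place of the linear policy, the per-pixel gradient would typically come from automatic differentiation rather than the hand-written chain rule used here; the perturbation step itself stays the same.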

More details about the work done during the project can be found here.

A detailed article about the methods and approaches studied during the project can be found here. We have also implemented some of these in this repository.

The following are some other links: