Enhancing deep reinforcement learning for scale flexibility in real-time strategy games


We published, in partnership with the DCC at UFMG, a new technique for training reinforcement learning agents to play real-time strategy (RTS) games. Our architecture allows the agent to train across maps of different sizes, so a single trained agent can make decisions on differently sized maps at execution time. This method is a first step towards more adaptable AI agents, able to handle sequential decision problems with observations of varying size.

This work was originally presented as a paper at SBGames, the main conference on digital games in Brazil. An extended version was recently published in the journal Entertainment Computing. The extended version is available here, and our source code is available on GitHub.

Automatic On-line Configuration of Multi-Arm Bandit Algorithms

We presented at the European Conference on Artificial Intelligence (ECAI) our work “An Online Incremental Learning Approach for Configuring Multi-arm Bandits Algorithms”. Our approach employs Bayesian optimisation in an on-line fashion to dynamically adapt the hyper-parameters of multi-arm bandit algorithms. We are interested in uncertain and dynamic environments, such as on-line web server optimisation. Our paper can be found here, and our source code is on GitHub.
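To make the idea concrete, here is a minimal toy sketch of hyper-parameter configuration for a bandit algorithm. It implements a standard UCB1 learner with a tunable exploration constant `c`, then selects the best `c` from a small candidate set by simulated evaluation. Note the assumptions: the paper configures hyper-parameters incrementally with Bayesian optimisation, whereas this sketch uses a naive grid search over full runs purely for illustration; the Bernoulli arm means are made up.

```python
import math
import random

class UCB:
    """UCB1 bandit with a tunable exploration constant c."""
    def __init__(self, n_arms, c):
        self.c = c
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms

    def select(self, t):
        for a, n in enumerate(self.counts):
            if n == 0:              # play each arm once first
                return a
        return max(range(len(self.counts)),
                   key=lambda a: self.values[a] +
                       self.c * math.sqrt(math.log(t) / self.counts[a]))

    def update(self, arm, reward):
        self.counts[arm] += 1
        # incremental mean of observed rewards for this arm
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

def run(c, arm_means, horizon, seed=0):
    """Average reward of UCB(c) on a Bernoulli bandit."""
    rng = random.Random(seed)
    bandit = UCB(len(arm_means), c)
    total = 0.0
    for t in range(1, horizon + 1):
        arm = bandit.select(t)
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        bandit.update(arm, reward)
        total += reward
    return total / horizon

# Naive outer configuration loop (the paper replaces this grid
# search with on-line Bayesian optimisation):
best_c = max([0.1, 0.5, 1.0, 2.0],
             key=lambda c: run(c, [0.2, 0.5, 0.8], horizon=2000))
print(best_c)
```

The outer loop is where the interesting work happens in the paper: instead of re-running each candidate, a Bayesian model of the reward surface proposes the next hyper-parameter value to try as data streams in.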

Best Thesis Award

Photo from DCC-UFMG.

Our former PhD student, Washington L. S. Ramos, received the Best Thesis Award at the Workshop of Theses and Dissertations of the SIBGRAPI conference. SIBGRAPI is an international conference on Computer Graphics and Computer Vision, organised by the Brazilian Computing Society. It is a key venue in Brazil, where Washington is based.

Washington developed reinforcement learning techniques to automatically accelerate unedited videos based on textual data. His agents dynamically estimate the relevance of the current video frame and adjust the acceleration at execution time, generating a video whose speed-up rate varies according to the relevance of the content.

A demonstration video is available at https://youtu.be/u6ODTv7-9C4, and his thesis extended abstract is available here.

Identifying Adversaries in Teamwork

Our work “It Is Among Us: Identifying Adversaries in Ad-hoc Domains Using Q-valued Bayesian Estimations” was presented at the International Conference on Autonomous Agents and Multiagent Systems (AAMAS). That work integrates a Bayesian framework within the Monte Carlo Tree Search algorithm, allowing an autonomous agent to identify hidden adversaries in a team. Our paper can be found here, and the source code is on GitHub.
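The core idea can be sketched in a few lines: maintain a belief that a given agent is an adversary, and update it with Bayes' rule using likelihoods derived from Q-values of the actions that agent is observed taking. This is a heavily simplified illustration, assuming softmax action likelihoods and hand-picked Q-values; the paper's estimator operates inside Monte Carlo Tree Search.

```python
import math

def softmax_likelihood(q_values, action, temperature=1.0):
    """Probability of `action` under a softmax policy over Q-values."""
    exps = [math.exp(q / temperature) for q in q_values]
    return exps[action] / sum(exps)

def update_adversary_belief(prior, q_team, q_adv, observed_action):
    """One Bayesian update of P(agent is an adversary).

    q_team / q_adv: Q-values the agent would hold if it were a
    teammate / an adversary; observed_action: the action it took.
    """
    like_adv = softmax_likelihood(q_adv, observed_action)
    like_team = softmax_likelihood(q_team, observed_action)
    num = like_adv * prior
    return num / (num + like_team * (1.0 - prior))

# An agent repeatedly picks the action an adversary would prefer,
# so the belief that it is an adversary grows towards 1:
belief = 0.5
for _ in range(5):
    belief = update_adversary_belief(
        belief,
        q_team=[1.0, 0.2],   # teammates prefer action 0
        q_adv=[0.2, 1.0],    # adversaries prefer action 1
        observed_action=1)
print(round(belief, 3))
```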

Enhancing robustness in video recognition models: Sparse adversarial attacks and beyond


We published in the journal Neural Networks an extended version of our previous conference paper “Sparse Adversarial Video Attacks with Spatial Transformations”. In this journal extension, titled “Enhancing robustness in video recognition models: Sparse adversarial attacks and beyond”, we explored how our attack technique can increase the robustness of video classification models through an adversarial training process. The paper can be found here. As usual, our source code is also available on GitHub.
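For readers unfamiliar with adversarial training, the sketch below shows the general recipe on a toy logistic-regression classifier: at each training step, perturb the input with a fast-gradient-sign (FGSM) attack and train on the perturbed example. This is deliberately not our video attack (which uses sparse spatial transformations); it is a generic, self-contained illustration of the adversarial training loop, with made-up 2-D Gaussian data.

```python
import math
import random

def sigmoid(z):
    z = max(-60.0, min(60.0, z))   # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def grad_x(w, b, x, y):
    """Gradient of the logistic loss w.r.t. the input x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]

def fgsm(w, b, x, y, eps):
    """Fast-gradient-sign perturbation of a single example."""
    g = grad_x(w, b, x, y)
    return [xi + eps * (1 if gi > 0 else -1) for xi, gi in zip(x, g)]

def train(data, eps=0.0, lr=0.5, epochs=200):
    """Logistic regression; eps > 0 trains on FGSM-perturbed inputs."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            if eps > 0:
                x = fgsm(w, b, x, y, eps)   # adversarial training step
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi - lr * (p - y) * xi for wi, xi in zip(w, x)]
            b -= lr * (p - y)
    return w, b

def accuracy(w, b, data, eps=0.0):
    correct = 0
    for x, y in data:
        if eps > 0:
            x = fgsm(w, b, x, y, eps)       # attack at test time
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        correct += (p > 0.5) == (y == 1)
    return correct / len(data)

random.seed(0)
data = [([random.gauss(m, 0.3), random.gauss(m, 0.3)], y)
        for m, y in [(-1, 0), (1, 1)] for _ in range(50)]
plain = train(data, eps=0.0)
robust = train(data, eps=0.4)
```

The journal extension applies the same principle at scale: the sparse video attack generates hard examples, and retraining on them hardens the video classifier.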

ReCePS: Reward Certification for Policy Smoothed Reinforcement Learning


We presented the paper “ReCePS: Reward Certification for Policy Smoothed Reinforcement Learning” at the AAAI Conference on Artificial Intelligence (AAAI 2024). Our work introduces a certification method for smoothed policies, which are more robust than traditional reinforcement learning policies. Our certification method is more general than the previous state of the art, as it takes a black-box approach. Our paper is freely available. The source code is also available.
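As background, a policy is "smoothed" by acting on randomly perturbed observations, in the spirit of randomised smoothing. The toy sketch below wraps a deterministic base policy so that it votes over Gaussian-noised observations; the base policy, noise level, and sample count are all illustrative assumptions, and the certification machinery of the paper is not shown.

```python
import random
from collections import Counter

def smooth_policy(base_policy, sigma, n_samples=100, rng=None):
    """Wrap a deterministic policy: act on Gaussian-perturbed
    observations and return the majority-vote action (a simple
    randomised-smoothing construction)."""
    rng = rng or random.Random(0)
    def policy(obs):
        votes = Counter(
            base_policy([o + rng.gauss(0.0, sigma) for o in obs])
            for _ in range(n_samples))
        return votes.most_common(1)[0][0]
    return policy

# Toy base policy: go "right" when the first observation is positive.
base = lambda obs: "right" if obs[0] > 0 else "left"
smoothed = smooth_policy(base, sigma=0.5)
print(smoothed([2.0, 0.0]))   # far from the boundary -> "right"
```

The certification question ReCePS answers is: given such a smoothed policy, how much can the total reward drop under a bounded perturbation of the observations?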

Leveraging Synthetic Data to Learn Video Stabilization Under Adverse Conditions

Our work “Leveraging Synthetic Data to Learn Video Stabilization Under Adverse Conditions” was presented at the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2024). Our paper presents a new technique that is trained on synthetic data to perform video stabilisation on real videos. In particular, our approach performs well even under adverse weather conditions, since it does not rely on the usual feature-extraction techniques. The paper is available here. Our code and datasets are also available.

Information-guided Planning: An Online Approach for Partially Observable Problems

Our work “Information-guided Planning: An Online Approach for Partially Observable Problems” was presented at the 37th Conference on Neural Information Processing Systems (NeurIPS 2023). The work integrates entropy into the decision-making process of the Monte Carlo simulations of an on-line planner, improving the agent’s performance, especially in scenarios with sparse rewards. The paper is freely available. Our source code is also available in the paper’s GitHub.
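The key ingredient can be illustrated with a toy calculation: during simulations, the environment reward is augmented with the reduction in the entropy of the agent's belief over hidden states, so that information-gathering actions earn credit even when external rewards are sparse. The belief values and the weight `lam` below are illustrative; the paper integrates this bonus inside the Monte Carlo simulations of an on-line planner.

```python
import math

def entropy(belief):
    """Shannon entropy of a discrete belief over hidden states."""
    return -sum(p * math.log(p) for p in belief if p > 0)

def information_reward(env_reward, belief_before, belief_after, lam=1.0):
    """Augment the environment reward with the entropy reduction of
    the agent's belief (a simplified information-guided bonus)."""
    gain = entropy(belief_before) - entropy(belief_after)
    return env_reward + lam * gain

# A sparse-reward step: zero environment reward, but the action
# sharpened the belief considerably, so the augmented reward is positive.
r = information_reward(0.0,
                       [0.25, 0.25, 0.25, 0.25],
                       [0.97, 0.01, 0.01, 0.01])
print(round(r, 3))
```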

Scale-Invariant Reinforcement Learning in Real-Time Strategy Games

We presented at the Brazilian Symposium on Games and Digital Entertainment (SBGames) our work “Scale-Invariant Reinforcement Learning in Real-Time Strategy Games”. We integrate Spatial Pyramid Pooling (SPP) with deep reinforcement learning to allow a trained agent to play on maps of different dimensions in real-time strategy games. The paper is freely available, as is our source code.
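The reason SPP yields scale invariance is that it max-pools any input into a fixed set of grid cells, so the feature vector has the same length regardless of map size. Below is a minimal pure-Python sketch of an SPP layer over a 2-D feature map; the pyramid levels (1x1, 2x2, 4x4) are a common choice, not necessarily the configuration used in the paper, and our actual agent applies SPP to convolutional feature maps inside a deep network.

```python
def spp(feature_map, levels=(1, 2, 4)):
    """Spatial Pyramid Pooling: max-pool a 2-D map into 1x1, 2x2 and
    4x4 grids and concatenate, giving a fixed-length output
    (1 + 4 + 16 = 21 values here) for any input size."""
    h, w = len(feature_map), len(feature_map[0])
    out = []
    for n in levels:
        for i in range(n):
            for j in range(n):
                # bin boundaries; each bin covers at least one cell
                r0, r1 = i * h // n, max((i + 1) * h // n, i * h // n + 1)
                c0, c1 = j * w // n, max((j + 1) * w // n, j * w // n + 1)
                out.append(max(feature_map[r][c]
                               for r in range(r0, r1)
                               for c in range(c0, c1)))
    return out

# Maps of different sizes yield feature vectors of identical length:
small = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
large = [[float(r * 16 + c) for c in range(16)] for r in range(16)]
print(len(spp(small)), len(spp(large)))   # 21 21
```

Because the downstream policy network only ever sees this fixed-length vector, the same trained weights apply to any map size at execution time.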

Robust Federated Learning Method against Data and Model Poisoning Attacks with Heterogeneous Data Distribution

We presented at the European Conference on Artificial Intelligence (ECAI) the work “Robust Federated Learning Method against Data and Model Poisoning Attacks with Heterogeneous Data Distribution”. The work introduces a novel technique for defending against data and model poisoning attacks in federated learning, even under high data heterogeneity. The paper is available for free here, and the source code is available on GitHub.
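To illustrate the kind of threat involved, the sketch below shows a standard robust-aggregation baseline, the coordinate-wise median (this is a textbook defence, not the method proposed in the paper): it bounds the influence a poisoned client update can have on the aggregated model, where a plain average would be dragged arbitrarily far. The client updates are made-up toy vectors.

```python
import statistics

def median_aggregate(client_updates):
    """Coordinate-wise median of client model updates: a classic
    robust-aggregation baseline that limits the effect of a small
    number of poisoned updates."""
    return [statistics.median(coords) for coords in zip(*client_updates)]

honest = [[0.9, -1.1], [1.0, -1.0], [1.1, -0.9], [1.0, -1.0]]
poisoned = honest + [[100.0, 100.0]]     # one malicious client
print(median_aggregate(poisoned))        # [1.0, -1.0]
```

Defences like this degrade when honest clients are heterogeneous, since their updates legitimately disagree; handling that regime is precisely the contribution of the paper.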