About Health Gym
‘Health Gym’ is an open platform providing synthetic health-related data to the machine learning and clinical research communities. The primary purpose of these data is to allow researchers to easily prototype, test and compare offline reinforcement learning (RL) algorithms. Our goal is the development of safe and reproducible applications of RL in health and medicine, and ultimately to improve patient care. The distributed data can also be used for educational purposes. We encourage re-use of the software that created these data, so that further synthetic patient records can be shared openly on the Health Gym platform.


Reinforcement Learning
Reinforcement learning (RL) is an area of artificial intelligence which centres on the problem of learning a behavioural ‘policy’, a mapping from states or situations to actions, which maximizes cumulative long-term reward in an evolving, time-varying environment. The recent combination of RL with neural network modelling (deep RL) has led to algorithms with super-human performance in video games and complex board games, including Go and chess. We refer to Lilian Weng’s blog for an overview of RL, and to Sutton and Barto’s textbook and David Silver’s lectures for a more comprehensive introduction to RL.
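To make the terms ‘policy’ and ‘cumulative long-term reward’ concrete, the following minimal Python sketch may help. The state and action names, the reward values and the discount factor `gamma` are purely illustrative assumptions and are not taken from the Health Gym datasets; discounting is one common convention for weighting future rewards, not the only one.

```python
from typing import Dict, Sequence

# Hypothetical states and actions, chosen only for illustration.
State = str
Action = str

# A deterministic policy is a mapping from states to actions.
policy: Dict[State, Action] = {
    "stable": "maintain_dose",
    "deteriorating": "increase_dose",
}


def discounted_return(rewards: Sequence[float], gamma: float = 0.99) -> float:
    """Cumulative long-term reward: sum over t of gamma**t * rewards[t]."""
    total = 0.0
    # Accumulate from the last reward backwards: G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Look up the action the policy prescribes for a given state.
chosen_action = policy["deteriorating"]

# Return of a short, made-up reward sequence.
example_return = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
```

An RL algorithm searches for the policy whose expected return is highest; in the offline setting this search uses only previously recorded data.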
RL in Healthcare
Managing a patient's condition generally requires modifying the duration, dose or type of treatment over time, and is challenging due to patient heterogeneity in response to treatment, potential relapse and side-effects. Clinicians often rely on clinical judgement and instinct, rather than formal evidence-based processes, to optimize sequences of treatments. Thus, there is vast potential for the application of RL algorithms for adaptive personalisation of treatment regimens, as shown by early research on optimizing antiretroviral therapy in HIV [1], radiotherapy planning in lung cancer [2], and the management of sepsis [3].
Nonetheless, some authors have highlighted the lack of reproducibility and potential for patient harm inherent in these methods [4]. In particular, recommendations made by RL algorithms may not be safe if the training data omit variables that influence clinical decision making, or if the effective sample size is small [5].
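The effective sample size concern can be illustrated with a short sketch. It assumes that a candidate policy is evaluated on recorded data by importance sampling and that effective sample size is measured with Kish's formula, (sum of weights)^2 / (sum of squared weights); this is one common convention and an assumption here, not a description of the method used in the cited guidelines. The function names and probabilities are hypothetical.

```python
from typing import Sequence


def importance_weight(target_probs: Sequence[float],
                      behaviour_probs: Sequence[float]) -> float:
    """Per-trajectory weight: product over steps of pi_target / pi_behaviour.

    Each entry is the probability that the respective policy assigns to the
    action actually recorded at that step (hypothetical inputs).
    """
    weight = 1.0
    for p_target, p_behaviour in zip(target_probs, behaviour_probs):
        weight *= p_target / p_behaviour
    return weight


def effective_sample_size(weights: Sequence[float]) -> float:
    """Kish's effective sample size: (sum w)^2 / (sum w^2)."""
    sum_w = sum(weights)
    sum_w_sq = sum(w * w for w in weights)
    if sum_w_sq == 0.0:
        return 0.0  # no trajectory is consistent with the target policy
    return sum_w * sum_w / sum_w_sq


# Four trajectories with equal weight all contribute to the estimate.
ess_uniform = effective_sample_size([1.0, 1.0, 1.0, 1.0])

# If a single trajectory dominates, the estimate rests on about one patient.
ess_skewed = effective_sample_size([4.0, 0.0, 0.0, 0.0])
```

When the learned policy differs strongly from the clinicians' recorded behaviour, most weights shrink towards zero and the effective sample size can fall far below the number of patients in the dataset.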

Synthetic Data Generation
One of the difficulties in developing robust RL algorithms for healthcare lies in the highly sensitive and confidential nature of clinical data, which often requires scientists to establish formal collaborations and execute extensive data use agreements before sharing data. Our approach to overcoming these barriers consists of generating synthetic data that closely resemble the original dataset but do not allow re-identification of individual patients and can therefore be freely distributed. The synthetic data distributed through the Health Gym platform were created using generative adversarial networks (GANs) and underwent extensive statistical and disclosure-related validation, as described in detail in our related publication [6]. The developed software is also freely available and can be used for the generation and open sharing of further synthetic clinical datasets.
The Team Behind the Research
Our team has a diverse background in mathematics, computer science, epidemiology and medicine, with a shared enthusiasm for machine learning applications in health and medicine.

Nicholas Kuo
Research Fellow at the Centre for Big Data Research in Health

Prof. Mark Polizzotto
Professor of Medicine at The Australian National University College of Health and Medicine

Prof. Simon Finfer
Professorial Fellow in the Critical Care Division at The George Institute
Adj Prof, UNSW and Chair of Critical Care, ICL

Prof. Louisa Jorm
Foundation Director of the Centre for Big Data Research in Health

Dr Sebastiano Barbieri
Senior Research Fellow at the Centre for Big Data Research in Health
References
[1] Yu C, Dong Y, Liu J, Ren G. Incorporating causal factors into reinforcement learning for dynamic treatment regimes in HIV. BMC Medical Informatics and Decision Making. 2019;19(2):60.
[2] Tseng HH, Luo Y, Cui S, Chien JT, Ten Haken RK, Naqa IE. Deep reinforcement learning for automated radiation adaptation in lung cancer. Medical Physics. 2017;44(12):6690-705.
[3] Komorowski M, Celi LA, Badawi O, Gordon AC, Faisal AA. The Artificial Intelligence Clinician learns optimal treatment strategies for sepsis in intensive care. Nature Medicine. 2018;24(11):1716.
[4] Challen R, Denny J, Pitt M, Gompels L, Edwards T, Tsaneva-Atanasova K. Artificial intelligence, bias and clinical safety. BMJ Qual Saf. 2019;28(3):231-7.
[5] Gottesman O, Johansson F, Komorowski M, Faisal A, Sontag D, Doshi-Velez F, Celi LA. Guidelines for reinforcement learning in healthcare. Nat Med. 2019;25(1):16-8.
[6] Kuo N et al. The Health Gym: An Open Platform with Health-Related Benchmark Problems for the Development of Reinforcement Learning Algorithms. Nat Scientific Data. 2021.