A1 Journal article (refereed)
Sima – an Open-source Simulation Framework for Realistic Large-scale Individual-level Data Generation (2021)


Tikka, S., Hakanen, J., Saarela, M., & Karvanen, J. (2021). Sima – an Open-source Simulation Framework for Realistic Large-scale Individual-level Data Generation. International Journal of Microsimulation, 14(3), 27-53. https://doi.org/10.34196/IJM.00240


Publication details

All authors or editors: Tikka, Santtu; Hakanen, Jussi; Saarela, Mirka; Karvanen, Juha

Journal or series: International Journal of Microsimulation

eISSN: 1747-5864

Publication year: 2021

Publication date: 31/12/2021

Volume: 14

Issue number: 3

Pages range: 27-53

Publisher: International Microsimulation Association

Publication country: United Kingdom

Publication language: English

DOI: https://doi.org/10.34196/IJM.00240

Publication open access: Openly available

Publication channel open access: Open Access channel

Publication is parallel published (JYX): https://jyx.jyu.fi/handle/123456789/80356


Abstract

We propose a framework for realistic data generation and the simulation of complex systems and demonstrate its capabilities in a health domain example. The main use cases of the framework are predicting the development of variables of interest, evaluating the impact of interventions and policy decisions, and supporting statistical method development. We present the fundamentals of the framework by using rigorous mathematical definitions. The framework supports calibration to a real population as well as various manipulations and data collection processes. The freely available open-source implementation in R embraces efficient data structures, parallel computing, and fast random number generation, hence ensuring reproducibility and scalability. With the framework, it is possible to run daily-level simulations for populations of millions of individuals for decades of simulated time. An example using the occurrence of stroke, type 2 diabetes, and mortality illustrates the usage of the framework in the Finnish context. In the example, we demonstrate the data collection functionality by studying the impact of nonparticipation on the estimated risk models and interventions related to controlling excessive salt consumption.


Keywords: statistical methods; mathematical models; data systems; data structures; data processing; health sector; simulation; forecasts; source codes; open source code


Ministry reporting: Yes

Reporting Year: 2022

JUFO rating: 1


Last updated on 2024-03-04 at 18:15