MODELLING REAL WORLD PRESCRIPTIVE PROCESS: DATA MODELLING, MODEL SIMULATION AND ANALYSIS

Suchismita Sahu
Published in Analytics Vidhya · 12 min read · Sep 15, 2021


Modeling is the process of producing a model which is a representation of the construction and working of some system of interest. Simulation is used before an existing system is altered or a new system built, to reduce the chances of failure to meet specifications, to eliminate unforeseen bottlenecks, to prevent under or over-utilization of resources, and to optimize system performance.

In this article, we are going to learn the following :

  • What is Simulation Modelling
  • Benefits of having Simulation Modelling
  • Elements of Simulation Analysis
  • A Use Case: A Hospital Ward Bed Occupancy Model.
  • Implementation of the Use Case with Python code
  • Limitations of Simulation Modelling.
  • What is Data Modelling?
  • Limitations of Data Modelling.
  • Why is Simulation Modelling along with Data Science?
  • Conclusion

So let’s start..

WHAT IS SIMULATION MODELLING?

Simulation Modelling is the process of creating and analyzing a digital prototype of a physical system to predict its performance in the real world. It helps designers and engineers understand whether, under what conditions, and in which ways a part of the system could fail, and what loads it can withstand. Simulation modelling provides valuable solutions by giving clear insights into complex systems across industries and disciplines. Physics-based simulation modelling is a more classical, but powerful, approach to representing causal relationships between a set of controlled inputs and the corresponding outputs.

BENEFITS OF HAVING SIMULATION MODELLING

  • Risk-free Environment: Simulation modelling provides a safe way to explore different “what-if” scenarios. The effect of changing bed capacity in a hospital (see the Use Case below) can be seen without putting real operations at risk, which helps in making the right decision before making real-world changes. A mathematical model built from the historical data of a healthcare system enables a prospective analysis of the pressures on the system if current trends continue. The results help in understanding and planning for scenarios that would place significant pressure on the system. For example, in a significant flu outbreak, some parameters can be predicted and added to the model to understand the impact on service delivery: besides creating extra demand for services, an outbreak would also affect healthcare staff, so the supply of services would be affected too. Such models are useful when considering long-term or significant changes whose impact is difficult to predict. Used before a change is initiated, they help in planning and implementing more effective service improvements. They also show what the demand for services may look like in the future, so that any service developments are ‘future proofed’.
  • Save Money and Time: Virtual experiments with simulation models are less expensive and take less time than experiments with real assets.
  • Increased accuracy: A simulation model can capture many more details than an analytical model, providing increased accuracy and more precise forecasting.
  • Visualization: Simulation models allow concepts and ideas to be more easily verified, communicated, and understood using various kinds of 2D/3D visualizations. Analysts gain trust in a model by visualizing it in action and can clearly demonstrate findings to management.
  • Handle uncertainty: Uncertainty in operation times and outcome can be easily represented in simulation models, allowing risk quantification, and for more robust solutions to be found.

ELEMENTS OF SIMULATION ANALYSIS

Following are the basic elements of Simulation Analysis:

  • Problem Formulation
  • Data Collection and Analysis
  • Model development
  • Model Verification and Validation
  • Model Experimentation and Optimization
  • Implementation of Simulation Results

USE CASE: A HOSPITAL WARD BED OCCUPANCY MODEL.

Problem Statement: Bed management has been an issue for as long as hospitals have existed, but with increasing demand it has become more critical and is now an important criterion in delivering a quality, cost-effective health service. This calls for a Bed Management team. So, how does such a team manage it?

Data Collection and Analysis: The patient arrival rate is the rate at which patients arrive at the hospital. We model it by sampling the time between any two consecutive arrivals from an exponential distribution. The length of stay in the hospital is also sampled from an exponential distribution. In the code below the ward has a fixed number of beds (g.beds), and patients who arrive when every bed is occupied wait in a queue. We use the model to look at the expected natural variation in ward bed occupancy and in the queue for beds.
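
As a rough illustration of the sampling described above, the sketch below draws inter-arrival times and lengths of stay from exponential distributions using Python's standard random module. The means (1 day and 10 days) are the values used later in the model; the helper name sample_exponential and the seeds are assumptions made only for this example.

```python
import random


def sample_exponential(mean, n, seed=42):
    """Draw n samples from an exponential distribution with the given mean."""
    rng = random.Random(seed)
    # expovariate takes the rate (lambda), which is 1 / mean
    return [rng.expovariate(1 / mean) for _ in range(n)]


# Time (days) between consecutive patient arrivals, mean 1 day
gaps = sample_exponential(mean=1, n=10_000)
# Length of stay (days) for each patient, mean 10 days
stays = sample_exponential(mean=10, n=10_000, seed=7)

print(f"mean gap between arrivals: {sum(gaps) / len(gaps):.2f} days")
print(f"mean length of stay:       {sum(stays) / len(stays):.2f} days")
```

The exponential distribution is right-skewed: most sampled stays are short and a few are very long, which is one source of the natural variation in occupancy that the model explores.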

It is a goal-oriented task. The goals of bed management are to give each patient access to an appropriate bed in a timely way and to reduce the number of patients who are turned away and directed to another facility for lack of an available bed. Bed management has numerous benefits, including customer satisfaction, increased profits, capacity forecasting, and an increased level of care. To be competitive and profitable, hospitals must focus on reliability, accuracy, and level of care, and a key way to accomplish this is to continuously improve their bed management system. Planning is becoming more complex because of increasing day-to-day variation in demand and insufficient resources.

Stepping back a little, Goal Programming can be regarded as a branch of Linear Programming that is important and useful for optimization problems. We optimize a scenario subject to a number of constraints that govern the situation.

Many techniques can be used to build such models, but they are not our topic of discussion in this blog. What we will discuss is how to simulate the model once it is built.

MODEL DEVELOPMENT WITH PYTHON CODE

In Python, the SimPy library provides discrete-event simulation.

There are three steps to run a simulation in Python:

  1. Establish the environment: env = simpy.Environment()
  2. Pass in the parameters: env.process(checkpoint_run(env, num_booths, check_time, passenger_arrival)) (a generic example; the process function and its arguments depend on the model)
  3. Run the simulation: env.run(until=10)

Once we have our environment established, we’ll pass in all of the variables that will act as our parameters. These are the things we can vary to see how the system will react to changes.
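
To make the three steps concrete without depending on SimPy itself, here is a minimal plain-Python sketch of the same idea: processes are generators that yield the delay until their next event, and a small scheduler resumes them in time order. This is not how SimPy is implemented; Clock, patient and run are hypothetical names introduced only for this illustration, and there is no bed limit in this toy version.

```python
import heapq


class Clock:
    """Holds the current simulated time (a stand-in for an environment)."""

    def __init__(self):
        self.now = 0.0


def patient(clock, name, arrival, stay, log):
    """A process: wait until arrival, stay in hospital, then leave."""
    yield arrival
    log.append((clock.now, name, "admitted"))
    yield stay
    log.append((clock.now, name, "discharged"))


def run(clock, processes, until):
    """Resume each process when its delay has elapsed, in time order."""
    queue = []  # entries are (wake_time, tie_breaker, process)
    for tie_breaker, process in enumerate(processes):
        heapq.heappush(queue, (clock.now, tie_breaker, process))
    tie_breaker = len(queue)
    while queue and queue[0][0] <= until:
        clock.now, _, process = heapq.heappop(queue)
        try:
            delay = next(process)  # run the process up to its next yield
        except StopIteration:
            continue  # this process has finished
        heapq.heappush(queue, (clock.now + delay, tie_breaker, process))
        tie_breaker += 1


# Step 1: establish the environment (here just a clock)
clock = Clock()
log = []
# Step 2: pass in the parameters and register the processes
processes = [
    patient(clock, "A", arrival=1, stay=3, log=log),
    patient(clock, "B", arrival=2, stay=1, log=log),
]
# Step 3: run the simulation until day 10
run(clock, processes, until=10)
for time, name, event in log:
    print(f"day {time:g}: patient {name} {event}")
```

Changing the arrival and stay values in step 2 and re-running is the simplest form of the "vary the parameters and watch the system react" experiment described above.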

Step 1: Importing all the required libraries.

SimPy is a process-based discrete-event simulation framework based on standard Python.

import random

import matplotlib.pyplot as plt
import pandas as pd
import simpy

Step 2: We will store global variables in a class ‘g’. All the global variables are declared with required comments

class g:
    inter_arrival_time = 1  # Average time (days) between arrivals
    los = 10  # Average length of stay in hospital (days)
    sim_duration = 500  # Duration of simulation (days)
    audit_interval = 1  # Interval between audits (days)
    beds = 15  # Bed capacity of hospital
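
Before running anything, the parameter values above allow a quick analytical cross-check. The sketch below is an addition for orientation, not part of the original model: it computes the average number of beds in demand (arrival rate times length of stay) and, assuming the ward behaves like a textbook M/M/c queue (exponential inter-arrival times and stays, first come first served), the Erlang C probability that an arriving patient has to wait. The function names offered_load and erlang_c are hypothetical.

```python
import math


def offered_load(inter_arrival_time, los):
    """Average number of beds in demand: arrival rate x length of stay."""
    return los / inter_arrival_time


def erlang_c(load, beds):
    """Probability that an arriving patient must wait (M/M/c assumption)."""
    if load >= beds:
        return 1.0  # demand meets or exceeds capacity: the queue keeps growing
    # Sum of load^k / k! for k = 0 .. beds - 1
    partial = sum(load ** k / math.factorial(k) for k in range(beds))
    tail = load ** beds / math.factorial(beds) * beds / (beds - load)
    return tail / (partial + tail)


load = offered_load(inter_arrival_time=1, los=10)
print(f"offered load: {load:.1f} beds")
print(f"utilisation with 15 beds: {load / 15:.0%}")
print(f"P(wait) with 15 beds: {erlang_c(load, 15):.3f}")
```

With one arrival per day and a ten-day average stay, about 10 of the 15 beds are expected to be occupied on average, so the simulated median occupancy should land in that neighbourhood.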

Step 3: Second Class: ‘Hospital’ contains methods for audit of beds occupied, summarizing audit (at end of run), and plotting bed occupancy over time (at end of run).

class Hospital:
    """
    Hospital class holds:
    1) Dictionary of patients present
    2) List of audit times
    3) List of beds occupied at each audit time
    4) Current total beds occupied
    5) Admissions to date

    Methods:
    __init__: Set up hospital instance
    audit: Records number of beds occupied
    build_audit_report: Builds audit report at end of run (calculates 5th,
        50th and 95th percentile bed occupancy)
    chart: Plots beds occupied over time (at end of run)
    """

Constructor method for hospital class. Initialize object with attributes.

    def __init__(self):
        self.patients = {}  # Dictionary of patients present
        self.patients_in_queue = {}  # Patients waiting for a bed
        self.patients_in_beds = {}  # Patients currently in a bed
        self.audit_time = []  # List of audit times
        self.audit_beds = []  # List of beds occupied at each audit time
        self.audit_queue = []  # List of queue lengths at each audit time
        self.bed_count = 0  # Current total beds occupied
        self.queue_count = 0  # Current number of patients queuing
        self.admissions = 0  # Admissions to date

Audit method. When called appends current simulation time to audit_time list, and appends current bed count to audit_beds.

    def audit(self, time):
        self.audit_time.append(time)
        self.audit_beds.append(self.bed_count)
        self.audit_queue.append(self.queue_count)

This method is called at end of run. It creates a pandas DataFrame, transfers audit times, bed counts and queue lengths to the DataFrame, and calculates and stores their 5th, 50th and 95th percentiles.

    def build_audit_report(self):
        self.audit_report = pd.DataFrame()
        self.audit_report['Time'] = self.audit_time
        self.audit_report['Occupied_beds'] = self.audit_beds
        self.audit_report['Median_beds'] = \
            self.audit_report['Occupied_beds'].quantile(0.5)
        self.audit_report['Beds_5_percent'] = \
            self.audit_report['Occupied_beds'].quantile(0.05)
        self.audit_report['Beds_95_percent'] = \
            self.audit_report['Occupied_beds'].quantile(0.95)
        self.audit_report['Queue'] = self.audit_queue
        self.audit_report['Median_queue'] = \
            self.audit_report['Queue'].quantile(0.5)
        self.audit_report['Queue_5_percent'] = \
            self.audit_report['Queue'].quantile(0.05)
        self.audit_report['Queue_95_percent'] = \
            self.audit_report['Queue'].quantile(0.95)
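
To show what the percentile columns contain, here is a small standalone sketch that computes a quantile with linear interpolation between sorted values, which is the default interpolation of the pandas quantile method. The function name percentile and the sample bed counts are made up for illustration.

```python
def percentile(values, q):
    """Return the q-th quantile (0 <= q <= 1) using linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    position = q * (len(ordered) - 1)  # fractional index into the sorted data
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


# Hypothetical daily bed counts from a short audit
occupied = [8, 12, 9, 11, 10, 14, 7, 10, 13, 9, 10]
for q in (0.05, 0.5, 0.95):
    print(f"{q:.0%} percentile: {percentile(occupied, q):.1f} beds")
```

The 5th and 95th percentiles bracket the range in which occupancy falls on roughly nine days out of ten, which is more informative for capacity planning than the median alone.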

This method is called at end of run. It plots bed occupancy over the model run, with 5th, 50th and 95th percentiles.

    def chart(self):
        """
        This method is called at end of run. It plots bed occupancy over the
        model run, with 5th, 50th and 95th percentiles.
        """

Plotting the occupied Beds

        # Plot occupied beds
        plt.plot(self.audit_report['Time'],
                 self.audit_report['Occupied_beds'],
                 color='k',
                 marker='o',
                 linestyle='solid',
                 markevery=1,
                 label='Occupied beds')
        plt.plot(self.audit_report['Time'],
                 self.audit_report['Beds_5_percent'],
                 color='0.5',
                 linestyle='dashdot',
                 markevery=1,
                 label='5th percentile')
        plt.plot(self.audit_report['Time'],
                 self.audit_report['Median_beds'],
                 color='0.5',
                 linestyle='dashed',
                 label='Median')
        plt.plot(self.audit_report['Time'],
                 self.audit_report['Beds_95_percent'],
                 color='0.5',
                 linestyle='dashdot',
                 label='95th percentile')
        plt.xlabel('Day')
        plt.ylabel('Occupied beds')
        plt.title('Occupied beds (individual days with 5th, 50th and 95th ' +
                  'percentiles)')
        plt.legend()
        plt.show()

Plot the Queue of beds

        # Plot queue for beds
        plt.plot(self.audit_report['Time'],
                 self.audit_report['Queue'],
                 color='k',
                 marker='o',
                 linestyle='solid',
                 markevery=1,
                 label='Queue for beds')
        plt.plot(self.audit_report['Time'],
                 self.audit_report['Queue_5_percent'],
                 color='0.5',
                 linestyle='dashdot',
                 markevery=1,
                 label='5th percentile')
        plt.plot(self.audit_report['Time'],
                 self.audit_report['Median_queue'],
                 color='0.5',
                 linestyle='dashed',
                 label='Median')
        plt.plot(self.audit_report['Time'],
                 self.audit_report['Queue_95_percent'],
                 color='0.5',
                 linestyle='dashdot',
                 label='95th percentile')
        plt.xlabel('Day')
        plt.ylabel('Queue for beds')
        plt.title('Queue for beds (individual days with 5th, 50th and 95th' +
                  ' percentiles)')
        plt.legend()
        plt.show()

Third Class: ‘Model’ contains the model environment. The modelling environment is set up, and the patient arrival and audit processes are initiated. Patient arrival triggers a spell for that patient in hospital. Arrivals and audits continue for the duration of the model run. The audit is then summarized, and bed occupancy and the number of people waiting for beds (with 5th, 50th and 95th percentiles) are plotted.

class Model:
    """
    The main model class.

    The model class contains the model environment. The modelling environment
    is set up, and patient arrival and audit processes initiated. Patient
    arrival triggers a spell for that patient in hospital. Arrivals and audit
    continue for the duration of the model run. The audit is then summarised
    and bed occupancy (with 5th, 50th and 95th percentiles) plotted.

    Methods are:
    __init__: Set up model instance
    audit_beds: Call for bed audit at regular intervals (after initial delay
        for model warm-up)
    new_admission: Trigger new admissions to hospital at randomly sampled
        intervals. Call for patient generation with patient id and length of
        stay, then call for patient spell in hospital.
    run: Controls the main model run. Initialises model and patient arrival
        and audit processes. Instigates the run. At end of run calls for an
        audit summary and bed occupancy plot.
    spell_gen: Stores patient in hospital patient list and bed queue
        dictionaries, waits for bed resource to become available, then removes
        patient from bed queue dictionary and adds patient to hospital bed
        dictionary and increments beds occupied. Waits for the patient length
        of stay in the hospital and then decrements beds occupied and removes
        patient from hospital patient dictionary and beds occupied dictionary.
    """

Constructor method for a new model.

    def __init__(self):
        self.env = simpy.Environment()

Bed audit process. Begins by applying a delay, then calls for an audit at intervals set in g.audit_interval. :param delay: delay (days) at start of model run for model warm-up.

    def audit_beds(self, delay):
        # Delay first audit
        yield self.env.timeout(delay)
        # Continually generate audit requests until end of model run
        while True:
            # Call audit (pass simulation time to hospital.audit)
            self.hospital.audit(self.env.now)
            # Delay until next call
            yield self.env.timeout(g.audit_interval)

New admissions to hospital. :param interarrival_time: average time (days) between arrivals. :param los: average length of stay (days).

    def new_admission(self, interarrival_time, los):
        while True:
            # Increment hospital admissions count
            self.hospital.admissions += 1
            # Generate new patient object (from Patient class). Give patient
            # id and sample length of stay from an exponential distribution
            # with mean los.
            p = Patient(patient_id=self.hospital.admissions,
                        los=random.expovariate(1 / los))
            # Add patient to hospital patient dictionary
            self.hospital.patients[p.id] = p
            # Generate a patient spell in hospital (by calling spell method).
            # This triggers a patient admission and allows the next arrival
            # to be set before the patient spell is finished
            self.spell = self.spell_gen(p)
            self.env.process(self.spell)
            # Set and call delay before looping back to new patient admission
            next_admission = random.expovariate(1 / interarrival_time)
            yield self.env.timeout(next_admission)

Controls the main model run. Initializes the model and the patient arrival and audit processes. Instigates the run. At end of run, calls for an audit summary and bed occupancy plot.

    def run(self):
        # Set up hospital (calling Hospital class)
        self.hospital = Hospital()
        # Set up resources (beds)
        self.resources = Resources(self.env, g.beds)
        # Set up starting processes: new admissions and bed audit (with delay)
        self.env.process(self.new_admission(g.inter_arrival_time, g.los))
        self.env.process(self.audit_beds(delay=20))
        # Start model run
        self.env.run(until=g.sim_duration)
        # At end of run call for bed audit summary and bed occupancy plot
        self.hospital.build_audit_report()
        self.hospital.chart()

Patient hospital stay generator. Increment bed count, wait for patient length of stay to complete, then decrement bed count and remove patient from hospital patient dictionary. :param p: patient object (contains length of stay for patient)

    def spell_gen(self, p):
        # The following 'with' defines the required resources and
        # automatically releases resources when no longer required
        with self.resources.beds.request() as req:
            # Increment queue count
            self.hospital.queue_count += 1
            # Add patient to dictionary of queuing patients. This is not used
            # further in this model.
            self.hospital.patients_in_queue[p.id] = p
            # Yield resource request. Sim continues after yield when resources
            # are available (so there is no delay if resources are immediately
            # available)
            yield req
            # Resource now available. Remove from queue count and dictionary
            # of queued objects
            self.hospital.queue_count -= 1
            del self.hospital.patients_in_queue[p.id]
            # Add to count of patients in beds and to dictionary of patients
            # in beds
            self.hospital.patients_in_beds[p.id] = p
            self.hospital.bed_count += 1
            # Trigger length of stay delay
            yield self.env.timeout(p.los)
            # Length of stay complete. Remove patient from counts and
            # dictionaries
            self.hospital.bed_count -= 1
            del self.hospital.patients_in_beds[p.id]
            del self.hospital.patients[p.id]

Fourth Class: ‘Patient’ is the template for all patients generated (each new patient arrival creates a new patient object). The patient object contains patient id and length of stay.

class Patient:
    def __init__(self, patient_id, los):
        """
        Constructor for new patient.

        :param patient_id: id of patient (set in self.new_admission)
        :param los: length of stay (days, set in self.new_admission)
        """
        self.id = patient_id
        self.los = los

Fifth Class: ‘Resources’ holds the beds resource (it could also hold other resources, such as doctors).

class Resources:
    def __init__(self, env, number_of_beds):
        """Constructor method to initialise beds resource."""
        self.beds = simpy.Resource(env, capacity=number_of_beds)

Finally, the ‘main’ function creates the model object and runs the model.

def main():
    model = Model()
    model.run()


# Code entry point. Calls main function.
if __name__ == '__main__':
    main()

Observation: The model plots the occupied beds and the queue for beds over time, together with their 5th, 50th and 95th percentiles. These show how occupancy and waiting vary for a given bed capacity, which helps to determine how many beds are needed to meet patient demand. In this way, the entire system can be optimized.

LIMITATIONS OF SIMULATION MODELLING

Simulation modelling is a theory-based approach that uses physical or operational laws. Because a theory is a statement of what causes what and why, it can clearly represent the causality between a set of controlled inputs and the corresponding outputs of the system, in contrast to the data-modelling approach. However, the simulation-modelling approach is based on prior knowledge of the target system, and its completeness depends on how well we understand that system. Hence it cannot, on its own, be a perfect solution to our problem.

WHAT IS DATA MODELLING

Big data has received growing attention in diverse research fields, where the concept of modelling with data is referred to as data modelling and focuses on representing correlations in data. This approach has been classified and studied in two ways: data mining and machine learning.

Data modelling consists of a sequence of steps: Data Acquisition, Modelling, Validation, and Prediction. However, data modelling is not always a powerful approach; it has some limitations.

LIMITATIONS OF DATA MODELLING

  • One representative limitation is that a data model can only describe correlations in the data; it cannot represent causal relationships between controlled inputs and the corresponding outputs. It cannot cope with anomalies and changing circumstances of the system, and its quality depends on how much data we have about the target system.
  • Prediction under a changed environment is difficult for a data model.
  • Another limitation is that a data model cannot cope with unexpected events. In a real system, unexpected events, such as a rare event with a very low probability of occurrence, may occur because of the system's high complexity and uncertainty. Usually the data set does not include these events, because they are treated as outliers. When such an event happens, a data model that is tied to the data set cannot predict its result precisely.

To overcome such limitations of data modeling, it is necessary to use another approach through simulation modeling.

To sum up, each modeling approach has its advantages and disadvantages.

WHY IS SIMULATION MODELLING ALONG WITH DATA SCIENCE?

To mitigate the disadvantages of both modelling approaches, a new modelling method is required that combines the advantages of data modelling and simulation modelling.

Analytics is of 3 types:

  • Descriptive Analytics: Provides the insight of the current data pattern.
  • Predictive Analytics: Provides a predictive outcome.
  • Prescriptive Analytics: Provides discernment of how we can make things happen.

Consider the last one, Prescriptive Analytics. A data set and a data model can be used for descriptive and predictive analysis, respectively, but they cannot be used for prescriptive analysis because they lack causality. This is where Simulation Modelling comes into the picture.

A Data Mining or Machine Learning model, which captures the correlations in the data, is built first, and its outputs are fed to the Simulation Model, which captures the causal relationship between cause and effect. The two combined are called Prescriptive Modelling.
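
The combination can be sketched in a few lines of plain Python. This is only a toy illustration under stated assumptions, not the architecture of a real prescriptive system: the "data model" is reduced to estimating two means from historical records (generated synthetically here), and the "simulation model" is a simple first-come-first-served bed queue. The names fit_mean and mean_wait, the candidate bed counts and the seeds are all hypothetical.

```python
import heapq
import random


def fit_mean(samples):
    """'Data model': estimate the mean of an exponential distribution."""
    return sum(samples) / len(samples)


def mean_wait(mean_gap, mean_los, beds, n_patients=5000, seed=1):
    """'Simulation model': average wait (days) for a bed, first come first served."""
    rng = random.Random(seed)
    now = 0.0
    discharges = []  # min-heap of discharge times for beds currently assigned
    total_wait = 0.0
    for _ in range(n_patients):
        now += rng.expovariate(1 / mean_gap)  # time of the next arrival
        stay = rng.expovariate(1 / mean_los)  # this patient's length of stay
        while discharges and discharges[0] <= now:
            heapq.heappop(discharges)  # beds vacated before this arrival
        if len(discharges) < beds:
            start = now  # a bed is free immediately
        else:
            start = heapq.heappop(discharges)  # wait for the next discharge
        total_wait += start - now
        heapq.heappush(discharges, start + stay)
    return total_wait / n_patients


# Hypothetical historical records (days); real data would come from the hospital
history_rng = random.Random(0)
observed_gaps = [history_rng.expovariate(1.0) for _ in range(1000)]
observed_stays = [history_rng.expovariate(0.1) for _ in range(1000)]

# Step 1: the data model learns the parameters from the records
gap_hat = fit_mean(observed_gaps)
los_hat = fit_mean(observed_stays)

# Step 2: the simulation model answers "what if we had this many beds?"
results = {beds: mean_wait(gap_hat, los_hat, beds) for beds in (11, 13, 15, 20)}
for beds, wait in results.items():
    print(f"{beds} beds -> mean wait {wait:.2f} days")
```

The data step only describes what has been observed, while the simulation step lets us change a controlled input (the number of beds) and see the effect, which is the causal question a prescriptive analysis needs answered.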

CONCLUSION:

In this post, we learnt about Simulation Modelling and Data Modelling and their respective benefits and limitations. We applied Simulation Modelling to a Bed Occupancy Model, and we outlined the Data Flow and System Architecture of a Prescriptive Modelling system.

Happy Learning…See you in my next blog. Till then, Stay Tuned!


Suchismita Sahu works as a Technical Product Manager at Jumio Corporation, India, and is passionate about Technology, Business and System Design.