Shah Zeb · Follow
--
Are you Facing the challenge of optimizing a slow, cumbersome VBA script? The solution is to transform VBA to Python.
Why Python Over VBA?
The leap from VBA to Python is driven by a compelling set of advantages:
- Speed: Python’s processing time dramatically outpaces VBA.
- Scalability: Capable of handling more complex analyses and larger data sets with ease.
- Libraries: Access to powerful libraries like Pandas and NumPy enhances data manipulation and analysis capabilities.
- Cross-Platform: Runs on various operating systems, providing greater flexibility.
- Community: A vast, active community for support, resources, and innovation.
To test Python’s ability over VBA, here are the project details we tested out recently and found results accordingly
The Project: Insurance Policy Fee Projection Model
Overview
Transforming an existing VBA model into a Python-powered application, this project aimed to project monthly fees from a block of insurance policies more efficiently, utilizing xlwings for seamless Excel integration.
Goal
To enhance model performance, reducing processing times while leveraging Excel for input and output manipulation, across a span of 600 months (50 years) and various stochastic scenarios.
Methodology
The Python model emulates and optimizes the VBA script’s logic:
Scenario Projection: Loops through stochastic scenarios, assessing policy status monthly.
- Pandemic Adjustment: Adjusts mortality rates for pandemic effects.
- Randomization: Uses uniform random numbers for lapse/survival decisions.
Inputs and Outputs
- Inputs: Policy details, lapse and mortality assumptions from the ‘Inforce’ and ‘Assumptions’ tabs in Excel.
- Output: A CSV file with detailed scenario projections and total fees for each month.
Quick Start Guide
1. Prerequisites: Install Python and xlwings.
2. Execution: Run the Python script with the Excel workbook open.
3. Results: Check the specified directory for the output CSV file.
Embracing Python: The Path Forward
This project not only showcased the significant efficiency gains from migrating to Python but also illuminated the path for future project enhancements and complex modeling capabilities, proving Python’s superiority over VBA in modern data processing tasks.
Here is the VBA code which was transformed to Python (coming ahead)
The VBA code took a processing time of almost 10 mins, whereas the same processing was done under 30 seconds with Python
Sub Projection() Application.ScreenUpdating = False
Application.Calculation = False
Dim Age As Integer
Dim MortalityTable As Integer
Dim LapseTable As Integer
Dim Duration As Integer
Dim Fee As Double
Dim FeeMode As Integer
Dim oFile As String
Dim TotalFee(1 To 1200) As Double
Dim SurvivalRate As Double
Dim LapseRate As Double
Dim MortalityRate As Double
Dim RunNumber As Integer
Dim PandemicIncidence As Double
Dim PandemicSeverity As Double
Dim PandemicYear(1 To 50) As Double
Dim PandemicFactor(1 To 600) As Double
Set LapseRange = Range("LapseRange") 'lapse rates dataframe
Set MortalityRange = Range("MortalityRange") 'mortality rates dataframe
RunNumber = Range("NumScen") 'number of stochastic scenarios
PandemicIncidence = Range("PandemicIncidence") 'Probability of a pandemic
PandemicSeverity = Range("PandemicSeverity") 'Severity of a pandemic
oFile = Range("file") 'output file
'Open oFile For Output As #1
For i = 1 To RunNumber 'loop through stochastic scenarios
For j = 1 To 600
TotalFee(j) = 0
Next j
Calculate
For j = 1 To 50 'determine if a pandemic happens in a year
If j = 1 Then
If Rnd < PandemicIncidence Then
PandemicYear(j) = 1
Else
PandemicYear(j) = 0
End If
Else
If PandemicYear(j - 1) = 1 Then
PandemicYear(j) = 0.5
Else
If Rnd < PandemicIncidence Then
PandemicYear(j) = 1
Else
PandemicYear(j) = 0
End If
End If
End If
Next j
For j = 1 To 600 'determine the pandemic factors by month
PandemicFactor(j) = PandemicYear(Int((j - 0.1) / 12) + 1) * PandemicSeverity
If Sheets("Inforce").Cells(j + 1, 4) > 0 Then
Age = Sheets("Inforce").Cells(j + 1, 4)
Duration = Sheets("Inforce").Cells(j + 1, 6)
MortalityTable = Sheets("Inforce").Cells(j + 1, 7)
LapseTable = Sheets("Inforce").Cells(j + 1, 8)
Fee = Sheets("Inforce").Cells(j + 1, 9)
FeeMode = Sheets("Inforce").Cells(j + 1, 10)
For k = 1 To 600 'loop through 600 months (50 years)
If k = 1 Then ' determine the policy survival rate
SurvivalRate = 1
LapseRate = 0
MortalityRate = 0
Else
LapseRate = LapseRange(Duration, LapseTable) 'lapse rates are monthly rates
MortalityRate = 1 - ((1 - MortalityRange(Age, MortalityTable) * PandemicFactor(k)) ^ (1 / 12)) 'mortality rates are annual rates and are converted to monthly
If RunType = 0 Then
If LapseRate < Rnd Then
LapseRate = 0
Else
LapseRate = 1
End If
If MortalityRate < Rnd Then
MortalityRate = 0
Else
MortalityRate = 1
End If
End If
SurvivalRate = SurvivalRate * (1 - LapseRate) * (1 - MortalityRate)
End If
TotalFee(k) = TotalFee(k) + SurvivalRate * Fee
Duration = Duration + 1
Age = Age + 1
If SurvivalRate = 0 Then 'if policy doesn't survive all future cashflows are zero and go to next policy
GoTo 123:
End If
Next k
End If
123:
Next j
'Write data to output csv file
For j = 1 To 599
Write #1, TotalFee(j),
Next j
Write #1, TotalFee(600)
Next i
Close #1
Application.ScreenUpdating = True
Application.Calculation = True
End Sub
Here is the transformed Python Code
import openpyxl
import xlwings as xw
import pandas as pd
import csv
import random
import time# @echo off
# "C:\Users\shahz\AppData\Local\Programs\Python\Python311\python.exe" "C:\path\to\your\excelV3.py"
# pause
def projection():
# Load the Excel workbook and sheets
wb = xw.Book("C:/Users/shahz/Desktop/excel225/StochasticScenarios.xlsm") #CLIENT EDIT : PASTE YOUR Main Macro Enabled Workbook Directory here
sht_control = wb.sheets['Control']
sht_inforce = wb.sheets['Inforce']
sht_assumptions = wb.sheets['Assumptions']
# Load ranges from Assumptions tab
lapse_range = sht_assumptions.range('LapseRange').value
mortality_range = sht_assumptions.range('MortalityRange').value
# Read values from Control tab
run_number = int(sht_control.range("NumScen").value)
o_file = sht_control.range("file").value
pandemic_incidence = sht_assumptions.range("PandemicIncidence").value
pandemic_severity = sht_assumptions.range("PandemicSeverity").value
# Loop through stochastic scenarios
with open(o_file, 'w', newline='') as file:
writer = csv.writer(file)
for i in range(run_number):
total_fee = [0] * 600
pandemic_year = [0] * 50
pandemic_factor = [0] * 600
# Determine if a pandemic happens in a year
for j in range(50):
if j == 0:
pandemic_year[j] = 1 if random.random() < pandemic_incidence else 0
else:
pandemic_year[j] = 0.5 if pandemic_year[j - 1] == 1 else (1 if random.random() < pandemic_incidence else 0)
# Determine the pandemic factors by month
for j in range(1,601):
# pandemic_factor[j] = pandemic_year[int((j - 0.1 / 12)+1)] * pandemic_severity
pandemic_factor[j-1] = pandemic_year[int((j-0.1)/ 12)] * pandemic_severity
# In VBA we called the last row as 100000, but in python its best to dynamically calculate the last row,
last_row = sht_inforce.range('A' + str(sht_inforce.cells.last_cell.row)).end('up').row
# Load the Inforce data
df_inforce = sht_inforce.range(f'A1:J{last_row}').options(pd.DataFrame, header=1, index=False).value
# Loop through policies
for _, row in df_inforce.iterrows():
if row['Current Age (Months)'] > 0:
age = int(row['Current Age (Months)'])
duration = int(row['Policy Duration (Months)'])
mortality_table = int(row['Mortality Table'])
lapse_table = int(row['Lapse Table'])
fee = float(row['Monthly Fee (in dollars)'])
fee_mode = int(row['Premium Mode'])
survival_rate = 1
for k in range(600):
if k == 0: # k = 0, because of 0 based indexing
lapse_rate = 0
mortality_rate = 0
else:
if mortality_range[age - 1][mortality_table - 1] is None:
mortality_rate = 0
else:
mortality_rate = 1 - ((1 - mortality_range[age - 1][mortality_table - 1] * pandemic_factor[k]) ** (1/12))
lapse_rate = lapse_range[duration - 1][lapse_table - 1] # Considering 0-based indexing
if lapse_rate < random.random():
lapse_rate = 0
else:
lapse_rate = 1
if mortality_rate < random.random():
mortality_rate = 0
else:
mortality_rate = 1
survival_rate = survival_rate * (1 - lapse_rate) * (1 - mortality_rate)
total_fee[k] += survival_rate * fee
duration += 1
age += 1
if survival_rate == 0:
break
# Write to output csv file
writer.writerow(total_fee)
print(f'Progress {(i/100)*100}%')
if __name__ == '__main__':
start_time = time.time()
projection()
end_time = time.time()
elapsed_time = end_time - start_time
print(f"Program executed in {elapsed_time} seconds.")
For more details on the project, including the full code and methodology, check out the GitHub repository