Fire Damage Statistical Analysis

Quick Links: GitHub Repo, Presentation (HTML)

Table of Contents:

  1. Introduction

  2. Process

  3. Results

  4. Insights

  5. Learnings

Introduction:

This project applied statistical methods, specifically Simple Linear Regression (SLR), to real data using R. I analyzed a fire damage dataset taken from our textbook to examine the relationship between the distance of a fire from the fire station and the resulting damage in thousands of dollars. Ultimately, the project sought to support better-informed decision-making and resource allocation in the prevention of damaging fires.

  • Goals: The primary objectives of the project were to:

    • Describe and analyze the fire damage dataset using SLR in R.

    • Produce a comprehensive R Markdown report, which includes data storytelling, the theoretical basis of SLR, and a conclusion with actionable, data-driven recommendations.

  • Key Questions:

    • To what extent does fire damage linearly depend on the distance of a fire from the fire station?

    • What methods can be used to improve the accuracy of the statistical analysis, and how might future studies build on it?

    • What actionable insights can be derived from the analysis for stakeholders like fire insurance companies and urban planners?

Process:

  • Tools: R, R Markdown (Rmd), HTML, and LaTeX, all written in RStudio.

  • Methodology:

    • Research: Conducted initial research to understand the significance of the study and the dataset, identifying gaps in existing knowledge that this analysis could address.

    • Data Analysis: Explored the dataset to identify patterns, then fitted the SLR model. I validated the model through residual analysis, t-tests on the coefficients, an F-test, and Cook’s Distance to flag potentially influential outliers.

    • Insights: Based on the statistical findings, I formulated recommendations and insights to provide actionable suggestions to key stakeholders.
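
The data analysis step above can be sketched in R roughly as follows. This is a minimal illustration, not the project's actual code: the `fires` data frame, its column names, and its values are hypothetical, generated synthetically as a stand-in for the textbook dataset.

```r
# Synthetic stand-in data (hypothetical; NOT the textbook dataset).
set.seed(42)
fires <- data.frame(distance = runif(15, min = 0.5, max = 6.5))
# Damage in thousands of dollars, simulated with a linear trend plus noise.
fires$damage <- 10 + 5 * fires$distance + rnorm(15, sd = 2)

# Fit the simple linear regression by least squares.
fit <- lm(damage ~ distance, data = fires)

# Coefficient t-tests, residual standard error, R-squared, and the F-test.
print(summary(fit))

# ANOVA table with the regression and residual sums of squares.
print(anova(fit))

# Basic residual check: a numeric summary of the residuals.
# (plot(fitted(fit), resid(fit)) would draw the residuals-vs-fitted plot.)
print(summary(resid(fit)))
```

With real data, `fires` would be replaced by the imported dataset and the simulation lines dropped.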

Results:

My analysis resulted in the following:

  • Validating the SLR Model: I fitted the initial model by the method of least squares and validated it by checking the sum of squares identity and the residuals. I also examined the coefficient t-tests and the overall F-test to confirm the model's reliability.

  • Linear Dependency: The analysis revealed a strong linear relationship between the distance of a fire from the fire station and the damage caused. The high Adjusted R-Squared, low Residual Standard Error (RSE), and significant F-test indicated a good fit, with fire damage increasing as distance increases.

  • Addressing Outliers: I used Cook’s Distance to identify influential outliers and refitted the model without the most influential one. Although this adjustment lowered the R-Squared value, the refitted model had a lower RSE, meaning its fitted values sat closer to the observed damage on average.
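
The sum of squares identity check and the Cook's Distance refit described above might look like the sketch below. The data are again synthetic and hypothetical, so the printed numbers will not match the project's results; dropping only the single most influential point follows the description above, while the variable names are my own.

```r
# Synthetic stand-in data (hypothetical; NOT the textbook dataset).
set.seed(1)
fires <- data.frame(distance = runif(15, min = 0.5, max = 6.5))
fires$damage <- 10 + 5 * fires$distance + rnorm(15, sd = 2)
fit <- lm(damage ~ distance, data = fires)

# Sum of squares identity: total = regression + error.
sst <- sum((fires$damage - mean(fires$damage))^2)
ssr <- sum((fitted(fit) - mean(fires$damage))^2)
sse <- sum(resid(fit)^2)
print(all.equal(sst, ssr + sse))

# Cook's Distance for every observation; pick the most influential one.
cooks <- cooks.distance(fit)
most_influential <- unname(which.max(cooks))

# Refit without that single observation.
refit <- lm(damage ~ distance, data = fires[-most_influential, ])

# Compare fit statistics before and after the removal.
comparison <- data.frame(
  model = c("all observations", "most influential removed"),
  adj_r_squared = c(summary(fit)$adj.r.squared, summary(refit)$adj.r.squared),
  rse = c(sigma(fit), sigma(refit))
)
print(comparison)
```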

Insights:

This analysis provided actionable insights that can benefit fire insurance companies, urban planners, and other stakeholders involved in fire safety and mitigation efforts. Key insights include:

  • Revisiting Urban Planning: In suburban areas where fire stations are sparse, this analysis could inform decisions about where to locate new fire stations. This matters especially to fire insurance companies, since higher fire damage leads to larger claims.

  • Evaluating Additional Variables: To enhance the analysis, other relevant variables such as response time, building materials, and fire suppression systems should be considered. These factors may have a significant impact on fire damage severity and could provide a more comprehensive understanding of the issue.

  • Predictive Models for Insurance Companies: Understanding which factors most significantly influence fire damage can assist insurance companies in determining risk and pricing policies. Predictive models based on these variables can improve the accuracy of damage estimations and help mitigate future fire damage costs.
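
As a rough illustration of the predictive use mentioned above, a fitted SLR model can produce damage estimates with prediction intervals for new distances. This sketch uses synthetic, hypothetical data and only the single distance predictor; the additional variables discussed above were not part of this analysis.

```r
# Synthetic stand-in data (hypothetical; NOT the textbook dataset).
set.seed(7)
fires <- data.frame(distance = runif(15, min = 0.5, max = 6.5))
fires$damage <- 10 + 5 * fires$distance + rnorm(15, sd = 2)
fit <- lm(damage ~ distance, data = fires)

# Hypothetical new fires at three distances from the station.
new_fires <- data.frame(distance = c(1, 3.5, 6))

# Point estimates with 95% prediction intervals (columns fit, lwr, upr).
pred <- predict(fit, newdata = new_fires, interval = "prediction", level = 0.95)
print(cbind(new_fires, pred))
```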

Learnings:

This project allowed me to combine technical skills with practical problem-solving to address real-world needs. Key takeaways include:

  • Technical Skills: I gained hands-on experience with R and R Markdown, conducting detailed statistical analysis in R and presenting the results in a clear, accessible HTML format.

  • Real-World Applications: The project deepened my understanding of how data analysis can influence decision-making in real-world contexts. I learned how to tell a compelling story through data and how important it is to consider the broader impact of analytical findings on stakeholders like fire departments, urban planners, and insurance companies.