Georgia Tech — Decision-Focused Risk Modeling Under Data Scarcity
Outcome: Built an interpretable risk-modeling framework to support occupational health decision-making when direct exposure data are limited, emphasizing early risk detection and conservative safety thresholds.
Decision Context
- Workplace exposure risks are difficult to monitor directly in many operational environments
- Decision-makers still need tools to prioritize monitoring and prevention efforts
- False negatives (missed high-risk cases) carry a higher cost than false positives in safety-critical settings
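A minimal sketch of how this cost asymmetry can translate into a decision threshold. It assumes the model outputs calibrated risk probabilities and that relative error costs can be estimated. The function name and the 9:1 cost ratio are hypothetical illustrations, not values from the project.

```python
def cost_based_threshold(cost_fn: float, cost_fp: float) -> float:
    """Probability above which flagging a case has lower expected cost.

    Flag when p * cost_fn > (1 - p) * cost_fp, which rearranges to
    p > cost_fp / (cost_fp + cost_fn). Assumes calibrated probabilities.
    """
    if cost_fn <= 0 or cost_fp <= 0:
        raise ValueError("costs must be positive")
    return cost_fp / (cost_fp + cost_fn)


# Hypothetical costs: a missed high-risk case is 9x as costly as a false alarm.
balanced = cost_based_threshold(cost_fn=1.0, cost_fp=1.0)      # 0.5
conservative = cost_based_threshold(cost_fn=9.0, cost_fp=1.0)  # 0.1
```

The more a missed case costs relative to a false alarm, the lower the threshold falls, so more cases are flagged for review.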
What I Built
- Simulation-based cohort: Generated structured synthetic data grounded in published exposure research to enable scenario analysis
- Predictive modeling pipeline: Trained and compared multiple tree-based classifiers to capture nonlinear risk relationships
- Safety-oriented evaluation: Tuned classification thresholds to prioritize recall and minimize missed high-risk cases
- Interpretability layer: Applied feature attribution to support transparent review and communication of risk drivers
- Reproducible workflow: Delivered an end-to-end pipeline from data generation through evaluation
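The safety-oriented evaluation step can be sketched as follows. This is an illustrative, standard-library-only version, not the project's actual code: the function name, the toy scores, and the target recall values are all assumptions, and the scores stand in for the output of any trained classifier.

```python
from typing import Sequence


def threshold_for_recall(
    scores: Sequence[float], labels: Sequence[int], target_recall: float
) -> float:
    """Return the highest threshold whose recall meets the target.

    A case is flagged when its score is >= the threshold. Choosing the
    highest qualifying threshold limits false alarms while still catching
    the required share of true high-risk cases.
    """
    if not 0.0 < target_recall <= 1.0:
        raise ValueError("target_recall must be in (0, 1]")
    positives = sum(labels)
    if positives == 0:
        raise ValueError("need at least one positive label")
    # Candidate thresholds are the distinct scores, tried from highest down.
    for t in sorted(set(scores), reverse=True):
        caught = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        if caught / positives >= target_recall:
            return t
    return min(scores)  # defensive fallback; the lowest score always gives recall 1.0


# Toy validation data (hypothetical): 1 = high-risk case, 0 = low-risk case.
scores = [0.9, 0.8, 0.4, 0.35, 0.1]
labels = [1, 0, 1, 1, 0]

strict = threshold_for_recall(scores, labels, target_recall=1.0)   # 0.35
relaxed = threshold_for_recall(scores, labels, target_recall=0.6)  # 0.4
```

Requiring full recall pushes the threshold down to 0.35, which also flags the low-risk case scored 0.8. That is the extra false positive accepted in exchange for missing no high-risk case.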
Impact
- Demonstrated a practical approach for decision support when real-world data are incomplete
- Provided interpretable signals to guide monitoring and prevention prioritization
- Established a reusable framework adaptable to other low-observability risk domains
This work focuses on decision support and risk prioritization rather than clinical diagnosis. Simulation is used deliberately to explore feasibility and trade-offs under realistic constraints.
Decision Support
Risk Modeling
Interpretable ML
Class Imbalance
Safety-Critical Systems