EXPLAINABLE CATBOOST-BASED PREDICTION OF CONCRETE COMPRESSIVE STRENGTH USING MIX PROPORTIONS AND CONCRETE PROPERTIES
Keywords:
Concrete compressive strength; CatBoost regression; Machine learning; SHAP analysis; Explainable artificial intelligence; Sustainable concrete
Abstract
Accurate prediction of concrete compressive strength is essential for ensuring structural safety, optimizing mix design, and reducing experimental time and cost in construction engineering. Traditional empirical models often struggle to capture the nonlinear interactions among concrete constituents, particularly in mixes incorporating supplementary cementitious materials. This study presents an explainable machine learning framework based on the CatBoost regression algorithm to predict concrete compressive strength from material composition and curing age. A dataset of 1,133 concrete samples was employed, with cement, blast-furnace slag, fly ash, water, superplasticizer, coarse aggregate, fine aggregate, and testing age as input variables. Exploratory data analysis, comprising statistical characterization, correlation assessment, and distribution analysis, was conducted to understand feature behavior. The CatBoost model was developed through systematic training, testing, and validation with optimized hyperparameters to ensure robust generalization. Model performance was evaluated using multiple statistical metrics, including variance-based, absolute, relative, and normalized error measures, along with residual analysis to assess bias and error distribution. Furthermore, Shapley Additive Explanations (SHAP) were integrated to interpret feature contributions and enhance model transparency. The results demonstrate high predictive accuracy and consistent performance across the training, testing, and validation sets, while SHAP analysis identifies curing age and cement content as the dominant contributors to strength development, followed by water content and supplementary cementitious materials. The proposed framework combines strong predictive capability with explainability, offering a reliable and interpretable decision-support tool for concrete mix design and performance prediction.
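As an illustration of the evaluation step described in the abstract, the sketch below computes one representative metric from each named family. The specific choices are assumptions, since the abstract does not name the metrics: R² (variance-based), MAE (absolute), MAPE (relative), and range-normalized RMSE (normalized), plus the mean residual as a simple bias check. The function name and the strength values are hypothetical and are not taken from the study's dataset.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return one example metric per error family (assumed choices)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n

    ss_res = sum(r * r for r in residuals)            # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    rmse = math.sqrt(ss_res / n)

    return {
        # Variance-based: share of variance explained by the model
        "r2": 1.0 - ss_res / ss_tot,
        # Absolute: mean absolute error, in the units of the target (MPa)
        "mae": sum(abs(r) for r in residuals) / n,
        # Relative: mean absolute percentage error (requires non-zero targets)
        "mape_percent": 100.0 * sum(abs(r / t) for r, t in zip(residuals, y_true)) / n,
        # Normalized: RMSE divided by the observed target range
        "nrmse": rmse / (max(y_true) - min(y_true)),
        # Bias check: a mean residual near zero suggests no systematic offset
        "mean_residual": sum(residuals) / n,
    }


# Hypothetical measured and predicted strengths in MPa (illustrative only)
measured = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 28.0, 41.0, 49.0]
metrics = regression_metrics(measured, predicted)
```

Reporting several families together is useful because each one hides a different failure: a high R² can coexist with a large relative error on low-strength mixes, and a small MAE can mask a systematic bias that the mean residual would reveal.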