Bayesian Optimization for Material and Product Optimization

Personal Project, Summer 2025

This project explores the application of Bayesian optimization techniques to various regression problems, demonstrating how to use Bayesian optimization for both model hyperparameter tuning and finding optimal input feature values to maximize target variables.

Project Overview

The project analyzes three datasets from the UCI Machine Learning Repository, applying a structured machine learning approach to each:

  1. Superconductivity Dataset: Predicting and optimizing the critical temperature of superconducting materials
  2. Concrete Dataset: Predicting and optimizing the compressive strength of concrete mixtures
  3. Wine Quality Dataset: Predicting and optimizing the quality of wine based on physicochemical properties

Methodology

Each notebook follows a consistent approach:

  • Data acquisition and preparation from UCI ML Repository
  • Exploratory data analysis (EDA)
  • Model training using Bayesian optimization for hyperparameter tuning (Random Forest and XGBoost)
  • Prediction and evaluation with performance metrics
  • Feature optimization using Bayesian optimization to maximize target variables
  • Visualization of model performance and feature relationships

Key Features

  • Hyperparameter Tuning: Uses scikit-optimize’s Bayesian optimization to tune Random Forest and XGBoost model hyperparameters efficiently
  • Feature Optimization: Applies Bayesian optimization to find optimal input feature combinations that maximize target variables (critical temperature, compressive strength, wine quality)
  • Comparative Analysis: Demonstrates consistent methodology across three diverse domains
  • Visualization: Includes comprehensive visualizations of model performance and feature relationships

Technologies Used

  • Python, Jupyter Notebooks
  • scikit-learn, scikit-optimize
  • XGBoost
  • NumPy, Pandas
  • Matplotlib, Seaborn
  • ucimlrepo

GitHub Repository

View the project on GitHub