SpaceX Falcon 9 Rocket Landing Predictions

Machine Learning | Data Science | Classification | Model Evaluation

This project explores the use of machine learning techniques to predict whether a SpaceX Falcon 9 first stage will successfully land after launch. The work combines real-world launch data, rigorous preprocessing, feature engineering, and comparative model evaluation to demonstrate the value of data-driven aerospace predictions.

Problem Statement

SpaceX has revolutionized spaceflight by making its Falcon 9 rockets partially reusable. A key part of this innovation is the rocket's ability to land its first stage after launch. Accurately predicting the success of such landings can optimize launch planning and risk assessment. This project aims to build predictive models that determine the likelihood of a successful landing, given launch-related parameters.

What I Did:

  • Exploratory Data Analysis (EDA): Conducted in-depth visual and statistical analysis to understand the launch parameters (e.g., payload mass, orbit type, flight number).

  • Hyperparameter Tuning: Used GridSearchCV to optimize model performance.

Flight Number vs. Payload Mass (kg)

  • Feature Engineering: Encoded categorical variables (like orbit and landing pad) and normalized numerical data.

Feature Engineering

Logistic Regression and GridSearchCV

  • Model Development:
    Trained and evaluated multiple classification models:

    • Logistic Regression

    • Support Vector Machines (SVM)

    • Decision Trees

    • K-Nearest Neighbors

Decision Tree Classifier

  • Model Comparison: Evaluated models using accuracy to identify the best-performing classifier.

Finding Model with Best Accuracy
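The steps listed above (encoding and scaling features, tuning with GridSearchCV, and comparing the four classifiers by accuracy) can be sketched roughly as follows. This is a minimal illustration and not the project's actual notebook: the data is synthetic, the column names (`FlightNumber`, `PayloadMass`, `Orbit`, `LandingPad`, `Class`) and the parameter grids are assumptions, and the printed accuracies say nothing about the real results.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the launch table; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 90
df = pd.DataFrame({
    "FlightNumber": np.arange(1, n + 1),
    "PayloadMass": rng.uniform(500, 15000, n),
    "Orbit": rng.choice(["LEO", "GTO", "ISS"], n),
    "LandingPad": rng.choice(["pad_a", "pad_b"], n),
})
# Made-up label for illustration only: later flights succeed more often.
p_success = 0.2 + 0.7 * df["FlightNumber"] / n
df["Class"] = (rng.random(n) < p_success).astype(int)

X = df.drop(columns="Class")
y = df["Class"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=2, stratify=y
)

# Feature engineering: scale numeric columns, one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["FlightNumber", "PayloadMass"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["Orbit", "LandingPad"]),
])

# Candidate models with small, assumed hyperparameter grids.
candidates = {
    "logreg": (LogisticRegression(max_iter=1000), {"clf__C": [0.01, 0.1, 1]}),
    "svm": (SVC(), {"clf__C": [0.1, 1, 10], "clf__kernel": ["linear", "rbf"]}),
    "tree": (
        DecisionTreeClassifier(random_state=0),
        {"clf__max_depth": [2, 4, 6], "clf__criterion": ["gini", "entropy"]},
    ),
    "knn": (KNeighborsClassifier(), {"clf__n_neighbors": [3, 5, 7]}),
}

test_accuracy = {}
for name, (estimator, grid) in candidates.items():
    pipe = Pipeline([("prep", preprocess), ("clf", estimator)])
    # 5-fold cross-validated grid search on the training split only.
    search = GridSearchCV(pipe, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    # Score the refit best estimator on the held-out test split.
    test_accuracy[name] = search.score(X_test, y_test)

best_name = max(test_accuracy, key=test_accuracy.get)
print(test_accuracy)
print("Best model by test accuracy:", best_name)
```

Putting the preprocessing inside the pipeline means the scaler and encoder are refit on each cross-validation fold, so no information from the validation folds leaks into tuning.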

This project was part of the IBM Data Science Specialization, which I completed on 06 July 2023. The code is available on my GitHub.

Credit to Authors and Other Contributors: Rav Ahuja, Lakshmi Holla, Yan Luo, Joseph Santarcangelo, Nayef Abou Tayoun, and Pratiksha Verma

Key Results:

  • The Decision Tree classifier achieved the highest accuracy, while the Logistic Regression model had the lowest.

  • Visualizations and metrics helped explain how features like payload mass and orbit contribute to landing outcomes.

  • The final model accurately predicted landing success on the test data. However, the dataset was small, so little can be concluded about how the model would perform on a larger set of launches.

Tools and Technologies Used:

  • Python (Pandas, NumPy, Scikit-learn, Seaborn, Matplotlib)

  • Jupyter Notebooks

  • SpaceX API for real-world data

  • Machine Learning Pipelines

  • Model evaluation metrics

Confusion Matrix for the Decision Tree Classifier
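For readers unfamiliar with how such a confusion matrix is produced, here is a minimal sketch using scikit-learn. It assumes a synthetic binary dataset from `make_classification` and an arbitrary `max_depth=4`; neither reflects the project's actual data or tuned model.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary data standing in for landed (1) vs. did not land (0).
X, y = make_classification(n_samples=90, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=2, stratify=y
)

tree = DecisionTreeClassifier(max_depth=4, random_state=0)
tree.fit(X_train, y_train)
y_pred = tree.predict(X_test)

# Rows are true labels, columns are predicted labels, in the order [0, 1].
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()

# Accuracy is the share of predictions on the diagonal.
accuracy = (tp + tn) / cm.sum()
print(cm)
print("Accuracy:", accuracy)
```

Reading the off-diagonal cells separately shows whether the errors are mostly false positives (predicted landing, did not land) or false negatives, which a single accuracy number hides.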

Takeaways:

This project was an exercise in end-to-end data science, covering the full pipeline from sourcing and cleaning raw data to training models, tuning hyperparameters, and evaluating predictions. It demonstrates how machine learning can support decision-making in complex, high-stakes systems such as orbital launch operations.