Machine Learning Driven Wearable Sensor for Foot Landing Classification in Badminton

Introduction

Badminton involves rapid, abrupt movements such as jumping, lunging, and pivoting, which place substantial load on the knee and ankle joints. Players are therefore susceptible to a range of lower-limb injuries, including ankle sprains, Achilles tendonitis, and patellar tendinopathy. The primary objective of this research project is to present a robust and accurate machine learning algorithm that classifies high-impact foot landings in badminton players. To this end, we developed a wearable motion sensor system intended to help reduce faulty foot landings, prevent injuries, and enhance player performance. The system was later used to study the landing techniques of players with different experience levels across a range of shots, in order to discuss preventive measures for various player groups.

Methodology

Portable Sensor System: Our sensor system comprises a microcontroller board, a 6-degree-of-freedom Inertial Measurement Unit (IMU), a data logging and charging module, and a 2000 mAh Li-Ion battery, which together handle data acquisition and storage. The electronics are housed in a 3D-printed casing attached to a flexible strap worn on the player’s foot/shank, with the IMU positioned at the heel. The X-axis of the IMU was aligned approximately perpendicular to the tibia, and the Y-axis ran along the shank’s longitudinal axis. The positive local Z-axis pointed outwards from the body in the left-to-right direction (approximately aligned with the mediolateral walking axis). The system records accelerometer and gyroscope data on all six IMU axes, with the primary goal of classifying foot landings during gameplay.

Experiment Protocol: The study used a two-stage data-collection procedure. Subjects first wore the portable sensor system on their foot and performed a series of calibration activities comprising natural movements such as walking, jogging, and jumping. These calibration exercises reduced the variation in the acquired data caused by player-specific differences, allowing more accurate and consistent calibration of the threshold that defines a high-impact activity, as well as further data analysis. Subsequently, the subjects played a natural 7-point game of badminton while the system logged their data. A total of 15 participants (12 male, 3 female) provided written consent under a protocol reviewed and approved by the Ethics Committee of the Indian Institute of Technology Gandhinagar (Identifier Number: IEC/2022-2023/EXP/VV/001).

Below is sample code I wrote to train the models:

import pandas as pd
import numpy as np
from scipy.stats import skew, kurtosis
from scipy.signal import welch
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import accuracy_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Load the data
data = pd.read_csv('toe_heel_data.CSV')

# Extract the features and labels
# Copy the slice so that new feature columns can be added without pandas warnings
X = data[['AccX', 'AccY', 'AccZ', 'GyroX', 'GyroY', 'GyroZ']].copy()
y = data['Labels']

# Magnitude of acceleration vector
X['Acc_mag'] = np.sqrt(X['AccX']**2 + X['AccY']**2 + X['AccZ']**2)

# Magnitude of gyroscope vector
X['Gyro_mag'] = np.sqrt(X['GyroX']**2 + X['GyroY']**2 + X['GyroZ']**2)

# Skewness and kurtosis of acceleration and gyroscope signals.
# Calling skew()/kurtosis() on a whole column returns a single scalar, which
# would become a constant (uninformative) feature, so the statistics are
# computed over a rolling window instead.
# WINDOW is an assumed length in samples; tune it to the sampling rate.
WINDOW = 50
for axis in ['AccX', 'AccY', 'AccZ', 'GyroX', 'GyroY', 'GyroZ']:
    X[f'{axis}_skew'] = X[axis].rolling(WINDOW).skew()
    X[f'{axis}_kurtosis'] = X[axis].rolling(WINDOW).kurt()

# Drop the leading rows that do not yet have a full window; keep labels aligned
X = X.dropna()
y = y.loc[X.index]


# Set up the pipeline with scaling, classifier, and hyperparameters
pipelines = {
    # liblinear supports both the 'l1' and 'l2' penalties searched below
    'lr': Pipeline([('scaler', StandardScaler()), ('lr', LogisticRegression(solver='liblinear'))]),
    'knn': Pipeline([('scaler', StandardScaler()), ('knn', KNeighborsClassifier())]),
    'nb': Pipeline([('scaler', StandardScaler()), ('nb', GaussianNB())]),
    'rf': Pipeline([('scaler', StandardScaler()), ('rf', RandomForestClassifier())])
}

# Set up the hyperparameters for each classifier
params = {
    'lr': {
        'lr__C': [0.1, 1, 10],
        'lr__penalty': ['l1', 'l2']
    },
    'knn': {
        'knn__n_neighbors': [3, 5, 7],
        'knn__weights': ['uniform', 'distance'],
        'knn__p': [1, 2]
    },
    'nb': {},
    'rf': {
        'rf__n_estimators': [10, 20, 50],
        'rf__max_depth': [None, 5, 10, 20],
        'rf__min_samples_split': [2, 5, 10]
    }
}

# Split the data into training and testing sets (must happen before fitting)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Perform grid search for each classifier
for classifier_name, pipeline in pipelines.items():
    clf = GridSearchCV(pipeline, params[classifier_name], cv=5, n_jobs=-1)
    clf.fit(X_train, y_train)
    print(f"Best hyperparameters for {classifier_name}: {clf.best_params_}")
    # best_score_ is the mean cross-validated accuracy of the best candidate
    print(f"Mean cross-validation accuracy: {clf.best_score_}")
    print(f"Test accuracy: {clf.score(X_test, y_test)}")

Data Processing: In the processing phase, an impact threshold was determined from the mean of the peak maxima of the vertical acceleration recorded during the calibration activities. This threshold was then used to detect peaks in the gyroscope readings from gameplay, and the regions of interest were labeled with their time stamps using a video of the game as ground truth. (The algorithm is shown in Fig. 2 and explained in our manuscript, currently under review.)
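The thresholding step described above can be sketched roughly as follows. This is a minimal illustration, not the algorithm from the manuscript: the function names, the `min_distance` parameter, and the use of `scipy.signal.find_peaks` are all assumptions, and the detector accepts any 1-D signal so that it can be applied to whichever gameplay channel is being screened.

```python
import numpy as np
from scipy.signal import find_peaks


def impact_threshold(calib_signal, min_distance=20):
    """Mean height of the local maxima in a calibration recording.

    calib_signal: 1-D array, e.g. vertical acceleration during calibration.
    min_distance: assumed minimum spacing (in samples) between peaks.
    """
    calib_signal = np.asarray(calib_signal, dtype=float)
    peaks, _ = find_peaks(calib_signal, distance=min_distance)
    if len(peaks) == 0:
        raise ValueError("no peaks found in the calibration signal")
    return float(np.mean(calib_signal[peaks]))


def detect_high_impact(game_signal, threshold, min_distance=20):
    """Sample indices of gameplay peaks whose height reaches the threshold."""
    game_signal = np.asarray(game_signal, dtype=float)
    peaks, _ = find_peaks(game_signal, height=threshold, distance=min_distance)
    return peaks
```

The returned indices could then be converted to time stamps and matched against the video to label each region of interest.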

Model training: Several feature engineering and selection techniques were applied to the training data, including the magnitude, skewness, and kurtosis of the accelerometer and gyroscope signals. Signal preprocessing involved tuning Butterworth and Savitzky-Golay filters. Four machine learning classifiers (Random Forest, K-Nearest Neighbour, Naive Bayes, Logistic Regression) were compared using Accuracy, Precision, and Recall. Hyperparameters were optimized with GridSearchCV to support model selection. Finally, the predictions for all data points within the time window of a single landing were aggregated, and the landing was labeled as toe or heel according to the majority class.
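As a rough sketch of two of the steps above, the snippet below smooths a signal and then assigns one label per landing by majority vote. The sampling rate, cutoff frequency, filter order, and Savitzky-Golay settings are placeholder assumptions rather than the tuned values from the study, and chaining the two filters is only one possible arrangement. The helper names and the `(start, end)` window format are likewise hypothetical.

```python
from collections import Counter

import numpy as np
from scipy.signal import butter, filtfilt, savgol_filter


def smooth(signal, fs=100.0, cutoff=20.0, order=4, sg_window=11, sg_poly=3):
    """Zero-phase Butterworth low-pass followed by Savitzky-Golay smoothing.

    All default values are illustrative assumptions, not tuned settings.
    """
    b, a = butter(order, cutoff, btype="low", fs=fs)
    low_passed = filtfilt(b, a, np.asarray(signal, dtype=float))
    return savgol_filter(low_passed, window_length=sg_window, polyorder=sg_poly)


def majority_label(predictions):
    """Most frequent label; ties go to the label that appears first."""
    return Counter(predictions).most_common(1)[0][0]


def label_landings(sample_predictions, windows):
    """One label per landing.

    sample_predictions: per-sample predicted labels (e.g. 'toe' / 'heel').
    windows: list of (start, end) sample indices, end exclusive.
    """
    return [majority_label(sample_predictions[s:e]) for s, e in windows]
```

Voting over the whole window means that a few misclassified samples inside a landing do not change the label assigned to that landing.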

Classification based on Experience level and Type of Shot: The algorithm was later used to detect toe and heel landings for players at different experience levels (Beginner, Moderate, Experienced; 5 subjects in each group), grouped by how frequently they play and when they started playing the sport. We recorded the time window of each high-impact foot landing together with the ground-truth shot type. These data were then fed to our model to classify the foot landings, enabling analysis of landing patterns across experience levels and shot types.
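One way to summarize such an analysis is a contingency table of landing type by experience level and shot type. The sketch below assumes a table with one row per classified landing and hypothetical column names (`experience`, `shot`, `landing`); it is not the analysis code used in the study.

```python
import pandas as pd


def landing_pattern_table(landings):
    """Share of toe vs. heel landings for each (experience, shot) pair.

    landings: DataFrame with the assumed columns 'experience', 'shot',
    and 'landing', one row per classified high-impact landing.
    """
    counts = pd.crosstab(
        [landings["experience"], landings["shot"]],
        landings["landing"],
    )
    # Normalize each row so the landing types sum to 1 per group
    return counts.div(counts.sum(axis=1), axis=0)
```

Rows that concentrate on a single landing type would correspond to a consistent landing pattern for that group and shot, while rows closer to an even split would indicate more scattered behaviour.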

Results & Discussion

The classification results showed high accuracy, indicating that the applied methods were effective. Random Forest outperformed the other classifiers, with the highest accuracy of 97.53%. This can be attributed to ensemble learning, which combines multiple decision trees to produce robust predictions. With the hyperparameters selected by the grid search (no limit on ’max_depth’, a minimum of 2 samples to split a node, and an ensemble of 50 trees), the model captured the intricate relationships between the input features and the target variable.

The analysis based on players’ experience levels yielded interesting findings. Although the frequency of high-impact shots was similar across groups, indicating comparable playing proficiencies, intensities, and errors, professional players exhibited a linear trend between shot type and foot landing that the other groups did not. This suggests a correlation between these variables that players develop over time. Overall, toe landings were the most common, especially during shots like the smash and drop, while heel landings occurred during lunges or overhand returns. These patterns were more scattered among beginners and moderate players but clearly clustered in the professional group. Furthermore, foot landing times varied for different shots among the groups, suggesting that familiarity leads to more strategic play and offering beginners a way to compare their patterns with those of experienced players. The detailed results of the study have been submitted for review and are not presented here.

Conclusion & Future Work

In conclusion, the study achieved high classification accuracy, with Random Forest outperforming the other algorithms at 97.53%. This success can be attributed to its ensemble learning approach, which captures intricate patterns effectively. The findings based on players’ experience levels revealed nuanced relationships between shot types and foot landings, particularly in professional players. The prevalence of toe landings during specific shots and the clustering of patterns in experienced players provide valuable insights. These results show the potential of machine learning for classifying high-impact foot landings and its applicability to understanding player dynamics and enhancing sports performance.

Several directions remain for future work. Real-time analysis would make the system considerably more useful in live games by providing immediate insight into player performance. Reducing the reliance on pre-recorded videos is essential for aligning the analyzed data more closely with real-time movements. Enlarging the participant pool to include individuals of diverse skill levels and competitive backgrounds would improve the generalizability of the findings. The model’s accuracy and adaptability could be refined further through deeper exploration of feature engineering techniques and the use of larger, more diverse training datasets. Addressing these aspects will contribute to technology-driven biomechanical analysis in competitive sports.

Additional Documents

Guide: Vineet Vashista  |  Collaborators: Dhyey Shah, Ronak Vyas 
