Human-Action-Classifier

A human activity classifier that assigns smartphone sensor recordings to one of six classes (WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, LAYING), based on the Human Activity Recognition dataset hosted on Kaggle.

About Dataset

The Human Activity Recognition database was built from the recordings of 30 study participants performing activities of daily living (ADL) while carrying a waist-mounted smartphone with embedded inertial sensors. The objective is to classify activities into one of the six activities performed.

Description of experiment

The experiments have been carried out with a group of 30 volunteers within an age bracket of 19-48 years. Each person performed six activities (WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, LAYING) wearing a smartphone (Samsung Galaxy S II) on the waist. Using its embedded accelerometer and gyroscope, we captured 3-axial linear acceleration and 3-axial angular velocity at a constant rate of 50 Hz. The experiments have been video-recorded to label the data manually. The obtained dataset has been randomly partitioned into two sets, where 70% of the volunteers were selected for generating the training data and 30% for the test data.

The sensor signals (accelerometer and gyroscope) were pre-processed by applying noise filters and then sampled in fixed-width sliding windows of 2.56 sec and 50% overlap (128 readings/window). The sensor acceleration signal, which has gravitational and body motion components, was separated using a Butterworth low-pass filter into body acceleration and gravity. The gravitational force is assumed to have only low frequency components, therefore a filter with 0.3 Hz cutoff frequency was used. From each window, a vector of features was obtained by calculating variables from the time and frequency domain.
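The windowing arithmetic above (2.56 s at 50 Hz gives 128 readings, and 50% overlap gives a step of 64 readings) can be sketched as follows. This is only an illustration of the described segmentation, not the dataset authors' code: the helper name `sliding_windows` and the toy signal are assumptions, and the noise filtering and Butterworth separation are not reproduced here.

```python
import numpy as np

SAMPLING_RATE_HZ = 50   # from the dataset description
WINDOW_SECONDS = 2.56   # from the dataset description
OVERLAP = 0.5           # 50% overlap between consecutive windows


def sliding_windows(signal, rate_hz=SAMPLING_RATE_HZ,
                    seconds=WINDOW_SECONDS, overlap=OVERLAP):
    """Split a 1-D signal into fixed-width, overlapping windows (hypothetical helper)."""
    signal = np.asarray(signal)
    window = int(round(rate_hz * seconds))        # 128 readings per window
    step = int(round(window * (1.0 - overlap)))   # 64 readings between window starts
    if len(signal) < window:
        # Not enough readings for a single full window
        return np.empty((0, window), dtype=signal.dtype)
    n_windows = (len(signal) - window) // step + 1
    return np.stack([signal[i * step:i * step + window] for i in range(n_windows)])


# Toy signal: 320 consecutive readings (6.4 s at 50 Hz)
windows = sliding_windows(np.arange(320))
print(windows.shape)  # (4, 128)
```

Each row is one window; in the real dataset, the 561 features would be computed per window from signals like these.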

Attribute information

For each record in the dataset the following is provided:

Triaxial acceleration from the accelerometer (total acceleration) and the estimated body acceleration.
Triaxial Angular velocity from the gyroscope.
A 561-feature vector with time and frequency domain variables.
Its activity label.
An identifier of the subject who carried out the experiment.

About the project

The project is divided into 5 parts:

Part 1: Data Analysis
Part 2: Dimensionality Reduction using PCA
Part 3: Further Dimensionality Reduction using LDA
Part 4: Classification using KNN
Part 5: Report including the results of our project and some implementations from sklearn library

Part 1: Data Analysis

The dataset is loaded with pandas and analyzed with pandas and matplotlib. We start with some general data analysis to get a better understanding of the dataset, such as a pie chart of the number of samples in each class and a bar chart of the number of samples for each subject.

import pandas as pd

training_set = pd.read_csv("dataset/train.csv")
print(training_set.shape)  # (7352, 563)

As we can see, the data is fairly evenly distributed across the classes. However, the number of features is very large, so the first step is to reduce it using PCA.
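The per-class and per-subject counts behind those charts can be computed along these lines. This is a minimal sketch on a toy DataFrame rather than the real train.csv; the column names "Activity" and "subject" are assumptions about the CSV layout and should be checked against the actual file.

```python
import pandas as pd

# Toy stand-in for train.csv; the column names "Activity" and "subject"
# are assumptions about the CSV layout.
toy = pd.DataFrame({
    "subject": [1, 1, 1, 2, 2, 3],
    "Activity": ["WALKING", "SITTING", "WALKING", "LAYING", "WALKING", "SITTING"],
})

# Number of samples per class, and the same as a fraction of all samples
samples_per_class = toy["Activity"].value_counts()
class_share = toy["Activity"].value_counts(normalize=True)

# Number of samples recorded for each subject
samples_per_subject = toy.groupby("subject").size()

print(samples_per_class)
print(samples_per_subject)
```

With matplotlib installed, `samples_per_class.plot.pie()` and `samples_per_subject.plot.bar()` would turn these counts into the kind of charts described above.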

Part 2: Dimensionality Reduction using PCA

Using PCA for dimensionality reduction, we go from 561 features to 155. The implementation is in the file pca.py; it returns the projection matrix and the number of components that retain 99% of the variance.

projection_matrix, component_num = PCA(x, show_plots=True)
pca = x @ projection_matrix.T

Inside the PCA function we calculate the number of components that retain 99% of the variance and we plot the cumulative sum of the explained variance ratio.

# smallest number of components whose cumulative explained variance reaches 99%
component_num = np.argmax(cumulative_variance >= 0.99) + 1
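For readers without pca.py at hand, the idea can be sketched with an eigendecomposition of the covariance matrix. This is not the code from pca.py: the function name `pca_sketch`, the centering step, the use of `np.linalg.eigh`, and the synthetic data are all assumptions made for illustration. It follows the same convention as above, where the projection matrix has one row per component so that the data is projected with `x @ projection_matrix.T`.

```python
import numpy as np


def pca_sketch(x, retained_variance=0.99):
    """Return (projection_matrix, component_num); hypothetical stand-in for pca.py."""
    centered = x - x.mean(axis=0)
    covariance = np.cov(centered, rowvar=False)

    # eigh returns eigenvalues in ascending order, so re-sort them descending
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

    # Smallest number of components whose cumulative variance ratio reaches the threshold
    cumulative_variance = np.cumsum(eigenvalues) / np.sum(eigenvalues)
    component_num = int(np.argmax(cumulative_variance >= retained_variance) + 1)

    # One row per retained component: shape (component_num, n_features)
    projection_matrix = eigenvectors[:, :component_num].T
    return projection_matrix, component_num


# Synthetic data: 10 features driven by 3 latent directions plus tiny noise
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 10))
x = latent @ mixing + 1e-3 * rng.normal(size=(500, 10))

projection_matrix, component_num = pca_sketch(x)
reduced = (x - x.mean(axis=0)) @ projection_matrix.T
print(component_num, reduced.shape)
```

On this synthetic data at most three components are kept, since only three directions carry meaningful variance.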

We do a scatter plot of the first 3 components to have a better understanding of the data.

We can see that the data is still not clearly separable, so we apply further dimensionality reduction using LDA, a supervised method that uses the labels to find the best projection matrix.

Part 3: Further Dimensionality Reduction using LDA

The implementation of LDA is in the file lda.py. Similarly to the PCA function, it returns a projection matrix, taking the data and the labels as input. Since we have k = 6 classes, we project onto k − 1 = 5 axes.

lda_proj = LDA(pca, y_train, n_classes=6)
lda = np.matmul(pca, lda_proj.T)
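A rough sketch of how such a projection can be obtained is shown below, using the classic Fisher formulation with within-class and between-class scatter matrices. It is not the code from lda.py: the function name `lda_sketch`, the use of a pseudo-inverse, and the toy three-class data are assumptions. With k classes it returns k − 1 rows, matching the `pca @ lda_proj.T` convention used above.

```python
import numpy as np


def lda_sketch(x, y):
    """Return a (n_classes - 1, n_features) projection matrix; hypothetical stand-in for lda.py."""
    classes = np.unique(y)
    n_features = x.shape[1]
    overall_mean = x.mean(axis=0)

    within = np.zeros((n_features, n_features))    # within-class scatter S_W
    between = np.zeros((n_features, n_features))   # between-class scatter S_B
    for label in classes:
        members = x[y == label]
        class_mean = members.mean(axis=0)
        deviations = members - class_mean
        within += deviations.T @ deviations
        mean_shift = (class_mean - overall_mean)[:, None]
        between += len(members) * (mean_shift @ mean_shift.T)

    # Eigenvectors of pinv(S_W) @ S_B; the pseudo-inverse guards against a singular S_W
    eigenvalues, eigenvectors = np.linalg.eig(np.linalg.pinv(within) @ between)
    order = np.argsort(eigenvalues.real)[::-1][:len(classes) - 1]
    return eigenvectors[:, order].real.T


# Toy data: three well-separated classes in 4 dimensions
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0, 0.0, 0.0], [5.0, 0.0, 0.0, 0.0], [0.0, 5.0, 0.0, 0.0]])
x = np.vstack([center + rng.normal(size=(100, 4)) for center in centers])
y = np.repeat(np.array(["SITTING", "STANDING", "WALKING"]), 100)

lda_proj = lda_sketch(x, y)
projected = x @ lda_proj.T
print(projected.shape)  # (300, 2)
```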

We have managed to reduce the number of features from 561 to 5. We do a scatter plot of the first 3 components to look at the current situation.

We can see that the data is now clearly separable (except SITTING and STANDING, which are still close), so we can start the classification, for which I decided to use KNN.

Part 4: Classification using KNN

The implementation of KNN is in the file knn.py; it takes the data, the labels and the number of neighbors as input and returns the predicted class. Execution takes a while (around 2 min), but the accuracy is good.

knn = KNN(lda, y_train, n_neighbors=21)

The number of neighbors, 21, was chosen empirically. The resulting accuracy is about 0.955:

Accuracy:  0.9548693586698337
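The k-nearest-neighbors idea itself fits in a few lines of NumPy. The sketch below is not the code from knn.py: the function name `knn_predict`, its signature, the Euclidean distance, and the toy data are assumptions made for illustration. It computes all pairwise distances at once and takes a majority vote among the nearest training points.

```python
import numpy as np


def knn_predict(x_train, y_train, x_test, n_neighbors=21):
    """Majority-vote KNN with Euclidean distance (hypothetical stand-in for knn.py)."""
    classes, y_index = np.unique(y_train, return_inverse=True)

    # Squared Euclidean distances, shape (n_test, n_train)
    squared_distances = (
        np.sum(x_test ** 2, axis=1)[:, None]
        - 2.0 * x_test @ x_train.T
        + np.sum(x_train ** 2, axis=1)[None, :]
    )

    # Indices of the n_neighbors closest training points for each test point
    nearest = np.argpartition(squared_distances, n_neighbors - 1, axis=1)[:, :n_neighbors]

    # Count the votes per class and pick the most frequent one
    votes = y_index[nearest]
    counts = np.apply_along_axis(np.bincount, 1, votes, minlength=len(classes))
    return classes[np.argmax(counts, axis=1)]


# Toy data: two well-separated clusters
rng = np.random.default_rng(0)
x_train = np.vstack([
    rng.normal(0.0, 1.0, size=(30, 2)),
    rng.normal(10.0, 1.0, size=(30, 2)),
])
y_train = np.array(["SITTING"] * 30 + ["WALKING"] * 30)
x_test = np.array([[0.1, 0.2], [9.8, 10.1]])
y_test = np.array(["SITTING", "WALKING"])

predictions = knn_predict(x_train, y_train, x_test, n_neighbors=5)
accuracy = np.mean(predictions == y_test)
print(predictions, accuracy)
```

Building the full distance matrix trades memory for speed; for large test sets it may be necessary to process the test points in batches.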

Part 5: Report

I decided to try the well-known sklearn library and compare its results with my implementation.

Accuracy:  0.9548693586698337 (me)
Accuracy:  0.9562266711910418 (RandomForestClassifier)
Accuracy:  0.9416355615880556 (DecisionTreeClassifier)
Accuracy:  0.9548693586698337 (KNeighborsClassifier)
Accuracy:  0.9572446555819477 (SVMClassifier)

Relevant papers

Davide Anguita, Alessandro Ghio, Luca Oneto, Xavier Parra and Jorge L. Reyes-Ortiz. Human Activity Recognition on Smartphones using a Multiclass Hardware-Friendly Support Vector Machine. International Workshop of Ambient Assisted Living (IWAAL 2012). Vitoria-Gasteiz, Spain. Dec 2012
Davide Anguita, Alessandro Ghio, Luca Oneto, Xavier Parra, Jorge L. Reyes-Ortiz. Energy Efficient Smartphone-Based Activity Recognition using Fixed-Point Arithmetic. Journal of Universal Computer Science. Special Issue in Ambient Assisted Living: Home Care. Volume 19, Issue 9. May 2013
Davide Anguita, Alessandro Ghio, Luca Oneto, Xavier Parra and Jorge L. Reyes-Ortiz. Human Activity Recognition on Smartphones using a Multiclass Hardware-Friendly Support Vector Machine. 4th International Workshop of Ambient Assisted Living, IWAAL 2012, Vitoria-Gasteiz, Spain, December 3-5, 2012. Proceedings. Lecture Notes in Computer Science 2012, pp 216-223.
Jorge Luis Reyes-Ortiz, Alessandro Ghio, Xavier Parra-Llanas, Davide Anguita, Joan Cabestany, Andreu Català. Human Activity and Motion Disorder Recognition: Towards Smarter Interactive Cognitive Environments. 21st European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, ESANN 2013. Bruges, Belgium 24-26 April 2013.
