dakshrandom/CODTECH.task33


NAME : DAKSH LODHA

DOMAIN : MACHINE LEARNING

COMPANY : CODTECH IT SOLUTIONS

DURATION : JUNE TO JULY 2024

ID : CT04ML2337

Overview of Credit Card Fraud Detection Project

Project Goal

The primary goal of this project is to develop a machine learning model that accurately detects fraudulent credit card transactions. Because the dataset is highly imbalanced, the project covers data preprocessing, model training, and evaluation, with the aim of identifying fraudulent activity reliably.

Dataset

The dataset used for this project is the Credit Card Fraud Detection Dataset from Kaggle. It consists of 284,807 transactions, of which only 492 (about 0.17%) are fraudulent, making it highly imbalanced. The features include:

- Time: Seconds elapsed between this transaction and the first transaction.
- V1 to V28: Principal component analysis (PCA) components.
- Amount: Transaction amount.
- Class: Target variable (1 for fraud, 0 for non-fraud).

Methodology
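The class imbalance shapes every methodology step, so it helps to quantify it first. The sketch below is not part of the project code: it uses only the two counts quoted in the Dataset section, and the function names are hypothetical, chosen for illustration.

```python
# Counts quoted in the Dataset section; nothing is loaded from disk here.
TOTAL_TRANSACTIONS = 284_807
FRAUD_TRANSACTIONS = 492


def fraud_rate(total, fraud):
    """Fraction of transactions labelled as fraud."""
    return fraud / total


def all_negative_accuracy(total, fraud):
    """Accuracy of a trivial model that labels every transaction as non-fraud."""
    return (total - fraud) / total


rate = fraud_rate(TOTAL_TRANSACTIONS, FRAUD_TRANSACTIONS)
baseline = all_negative_accuracy(TOTAL_TRANSACTIONS, FRAUD_TRANSACTIONS)

print(f"Fraud rate: {rate:.4%}")
print(f"Accuracy of always predicting non-fraud: {baseline:.4%}")
```

With these counts, fraud makes up roughly 0.17% of transactions, so a model that never flags fraud still scores above 99.8% accuracy. This is why accuracy alone is a weak yardstick for this dataset.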

  1. Data Preprocessing
     - Data Cleaning: Handle missing values and remove irrelevant features.
     - Data Normalization: Scale the numerical features so that they contribute equally to the model.
     - Handling Imbalanced Data: Apply techniques such as oversampling, undersampling, and SMOTE to address class imbalance.
  2. Exploratory Data Analysis (EDA)
     - Perform EDA to understand data distribution and relationships.
     - Visualize data using histograms, box plots, scatter plots, and correlation matrices to identify patterns and anomalies.
  3. Feature Engineering
     - Feature Selection: Identify and select the most relevant features based on their importance and correlation with the target variable.
     - Feature Creation: Create new features that might enhance model performance.
  4. Model Training
     - Train various machine learning models, such as:
       - Logistic Regression
       - Decision Trees
       - Random Forest
       - Gradient Boosting
       - Support Vector Machines (SVM)
       - Neural Networks
     - Use techniques like cross-validation to ensure the models generalize well to unseen data.
  5. Model Evaluation
     - Evaluate models using metrics including:
       - Accuracy
       - Precision
       - Recall
       - F1 Score
       - Area Under the Receiver Operating Characteristic Curve (ROC-AUC)
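Two of the steps above, random oversampling (step 1) and the threshold-based metrics (step 5), can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's implementation: the helper names `random_oversample` and `classification_metrics` are hypothetical, the data is a toy example rather than the Kaggle dataset, and simple random duplication stands in for SMOTE. ROC-AUC is left out because it needs predicted scores rather than hard labels.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = fraud)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy data: 95 legitimate rows and 5 fraud rows.
rows = [[i] for i in range(100)]
labels = [0] * 95 + [1] * 5

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))

# Hypothetical predictions that catch only one of the five fraud cases.
predictions = [0] * 95 + [1, 0, 0, 0, 0]
metrics = classification_metrics(labels, predictions)
print(metrics)
```

In the toy predictions, accuracy looks strong while recall shows that four of the five fraud cases were missed, which is the reason the evaluation step reports precision, recall, and F1 alongside accuracy. When oversampling is combined with cross-validation, it should be applied to the training folds only, so that duplicated rows do not leak into the validation data.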
