
Confusion matrix graph output x-axis display #19

@amschu

The current Random Forest confusion matrix graph produced by ML_classification.py displays an additional set of negative ("0") and positive ("1") labels on the predicted axis (see the attached image).
base2_model_cofunction.csv_mod.txt_RF_CM.pdf

Also, in the RF_results output file, the Mean Balanced Confusion Matrix shows extra columns for a second negative and positive class.

Mean Balanced Confusion Matrix:
Class	0	1	0	1
0.0			4542.62	2905.38
1.0			4119.38	3328.62

The expected output should have only two categories on each axis, giving a total of four regions in the graph: true negatives, false positives, true positives, and false negatives.
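For reference, here is a minimal sketch of the 2x2 layout I would expect, written in plain Python and independent of the pipeline code. The function name `confusion_matrix_2x2` and the toy labels are hypothetical and are not taken from ML_classification.py or from the attached data. The label normalization step reflects only a guess that mixed label types (for example `0`, `0.0`, and `"0"`) could be behind the extra columns; I have not confirmed that this is the cause.

```python
from collections import Counter


def confusion_matrix_2x2(y_true, y_pred, labels=(0, 1)):
    """Return a matrix with rows = true class and columns = predicted class."""
    # Assumption: coerce every label to int so that 0, 0.0 and "0" are
    # counted as the same class rather than as separate categories.
    counts = Counter(
        (int(float(t)), int(float(p))) for t, p in zip(y_true, y_pred)
    )
    return [[counts[(t, p)] for p in labels] for t in labels]


# Hypothetical toy labels, not taken from the issue's data.
y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 1, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix_2x2(y_true, y_pred)

# Print in the same tab-separated layout as the RF_results file,
# with exactly one column per class.
print("Class\t0\t1")
for label, row in zip((0, 1), cm):
    print(f"{label}\t{row[0]}\t{row[1]}")
```

With this layout, the row for class 0 holds the true negatives and false positives, and the row for class 1 holds the false negatives and true positives.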

A subset of the data can be used to run the pipeline and reproduce the graph format.
subset_matrixdata_CM_GitIssue.csv

The incorrect graph format can be reproduced using the provided subset data and the commands below:

  1. ML_preprocess.py: python /ML-Pipeline/ML_preprocess.py -df subset_matrixdata_CM_GitIssue.csv -sep ',' -onehot f
  2. test_set.py: python /ML-Pipeline/test_set.py -df subset_matrixdata_CM_GitIssue.csv_mod.txt -sep ',' -type c -p 0.1 -save <test_set_file.txt>
  3. ML_classification.py: python /ML-Pipeline/ML_classification.py -df subset_matrixdata_CM_GitIssue.csv_mod.txt -sep ',' -test <test_set_file.txt> -cl_train 1,0 -alg RF -cm t -plots t -n_jobs 8
