From 2f2ef8e3eb20ba68f0153fded1e00906c5e19a3c Mon Sep 17 00:00:00 2001
From: Meghana Palukuri <meghana.palukuri@utexas.edu>
Date: Tue, 21 Jun 2022 16:58:17 -0500
Subject: [PATCH 01/14] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 3440b850..1205a821 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,5 @@
 # Reinforcement Learning Complex Detection
-This reinforcement learning algorithm is a machine learning method for complex detection in networks. Using known communities, it is trained and learns to find new complexes in the network.
+This is a reinforcement learning algorithm for community detection in networks. Trained on known communities, it learns to find new communities in a network.
 
 # Installation:
 Required python3                                  

From 533e3d6b1f25f22d994b2094af69677384428def Mon Sep 17 00:00:00 2001
From: Meghana Palukuri <meghana.palukuri@utexas.edu>
Date: Tue, 21 Jun 2022 17:03:59 -0500
Subject: [PATCH 02/14] Update README.md

---
 README.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 1205a821..43d664ea 100644
--- a/README.md
+++ b/README.md
@@ -10,15 +10,15 @@ Requirements installation:
 1. For a toy network use input_toy.yaml
 2. For hu.MAP - use input file input_humap.yaml
 
-
 # Instructions:
 To run this pipeline on a new network, construct an input file similar to input_toy.yaml specifying where to find the required inputs.
-1. Specify input options relating to network: Set options dir_nm (directory containing the network) and netf_nm (file name of the network)
-2. Specify input options relating to known communities in network: If you already have sepearated known communities into train and test communitites, specify their paths in the options comf_nm and comf_test_nm (relative to the directory specified in the option:dir_nm) Otherwise, Split complex list into train and test: Set option split_flag = 1 Verify that train test size distributions in figure are the similar. Also check that number of training complexes is not too low by looking at the res_metrics.out file. Set options comf_nm and comf_test_nm with these two files. All the above paths are set relative to the directory specified in the option:dir_nm Make sure to change the option split_flag back to 0 after this step
+1. Specify the network input file: Set options dir_nm (directory containing the network) and netf_nm (file name of the network)
+2. Specify the paths for train and test communitites, in the options comf_nm and comf_test_nm (relative to the directory specified in the option- dir_nm)
 
 An example bash script to run the RL pipeline after the above steps is shown below: This is for hu.MAP complexes
 ```
 #!/bin/bash
+
 mtype = humap
 input_file_name = input_$mtype.yaml
 graph_file = hu.MAP_network_experiments/input_data/humap_network_weighted_edge_lists.txt
@@ -28,6 +28,7 @@ out_dir_name = /results_$mtype
 train_results = $out_dir_name/train_results
 pred_results = $out_dir_name/pred_results
 id_map_path = convert_ids/humap_gene_id_name_map.txt
+
 echo Training Algorithm....
 python3 functions/main_training.py --input_training_file $input_training_file --graph_file $graph_file --train_results $train_results
 

From cd2bc43341d0497e2c9c23b5a99c52a65e3456a3 Mon Sep 17 00:00:00 2001
From: Meghana Palukuri <meghana.palukuri@utexas.edu>
Date: Tue, 21 Jun 2022 17:09:58 -0500
Subject: [PATCH 03/14] Added reference

---
 README.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 43d664ea..d9a374b3 100644
--- a/README.md
+++ b/README.md
@@ -43,6 +43,9 @@ python3 functions/eval_complex_RL --input_file_name $input_file_name  --input_tr
 
 ```
 
-# Additional tips:
+## Additional tips:
 For each of the scripts, optional arguments can be viewed by running: python3 script_name.py --help
 For each command, add the desired argument directly on the terminal.
+
+# References:
+[Molecular complex detection in protein interaction networks through reinforcement learning](https://doi.org/10.1101/2022.06.20.496772) 

From 2086dacaf9986e6736afa87a45149677d2306d74 Mon Sep 17 00:00:00 2001
From: Meghana Palukuri <meghana.palukuri@utexas.edu>
Date: Tue, 21 Jun 2022 17:16:42 -0500
Subject: [PATCH 04/14] Added link to visualizations

---
 README.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index d9a374b3..fa39f657 100644
--- a/README.md
+++ b/README.md
@@ -15,7 +15,7 @@ To run this pipeline on a new network, construct an input file similar to input_
 1. Specify the network input file: Set options dir_nm (directory containing the network) and netf_nm (file name of the network)
 2. Specify the paths for train and test communitites, in the options comf_nm and comf_test_nm (relative to the directory specified in the option- dir_nm)
 
-An example bash script to run the RL pipeline after the above steps is shown below: This is for hu.MAP complexes
+An example bash script to run the RL pipeline after the above steps is shown below: This is for complexes learned on the human PPI network, hu.MAP 1.0:
 ```
 #!/bin/bash
 
@@ -49,3 +49,5 @@ For each command, add the desired argument directly on the terminal.
 
 # References:
 [Molecular complex detection in protein interaction networks through reinforcement learning](https://doi.org/10.1101/2022.06.20.496772) 
+
+Interactive visualizations of complexes learned by the RL algorithm on two human PPI networks, hu.MAP 1.0 and hu.MAP 2.0 are available here: [https://marcottelab.github.io/RL_humap_prediction/](https://marcottelab.github.io/RL_humap_prediction/)

From 902e89ce5bad9e76f90cb1e1000fdfcc77470239 Mon Sep 17 00:00:00 2001
From: "meghana.palukuri" <meghana.palukuri@gmail.com>
Date: Tue, 21 Jun 2022 17:29:21 -0500
Subject: [PATCH 05/14] rmd unnecessary files

---
 .DS_Store                | Bin 6148 -> 0 bytes
 hu.MAP_network/.DS_Store | Bin 6148 -> 0 bytes
 toy_network/.DS_Store    | Bin 6148 -> 0 bytes
 3 files changed, 0 insertions(+), 0 deletions(-)
 delete mode 100644 .DS_Store
 delete mode 100644 hu.MAP_network/.DS_Store
 delete mode 100644 toy_network/.DS_Store

diff --git a/.DS_Store b/.DS_Store
deleted file mode 100644
index bc93ed102d4f0c9d8f22f8b46bd6a253ee7890c2..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001

literal 6148
zcmeHKJ5Iwu5S@h-2`Pe7QM#c?fe47+MCJq$f)11q5ar{1K}ZN((9m!jH~|$E5)C(?
z<p{jlU6GA!lh7c9W~7~GuV>b8p6y*P5vj_gQ6^d?qAVI?W(6+6_&T?kE!iGUR(gyk
z4QWhA)S^Z#S~^4lQQ)sBz~643>J(Cs&dL3LN1fAhwH|i+4&{`{ZWsi$PFO=^_HJ-{
z{Z@S1O&_-#HhvJ@KZ~)*(Fsk!w~o1{rI{yjO^9-N-WT?ZmS<w*l;ii23H%C}1)DSi
z#|{{dFdq)UbCWh`o3?1j)wOgLJVU8D87}Ylm-cHer}dqiHLYRsEVrK0%?CgHJgNxk
z5_R-Y*O)GFHl(&&2ZzB@B3=FmrN@WEb^CrKj_bZ3^P^q9hJ^S96~LO!mdzU!FA9hP
zqQI8|JU@77jK0Olpq@J5^cDb^LAN#>^DhPGcouz&kwIL6DU%8`smg9Klu1XwXXAW}
zkwKGA%5FZCy|c1A6s7Nu@jV?*$~P!p6c7bw6<9W>tGxd2rSJb|ljKYk5C#5~0xA<!
zg9=KrYiq1HUTZzHV>C97iwx=%IJ+F{173>P(6r&2#|J>)Vq_3KF!>SCGKdoeepP`_
DdO@he

diff --git a/hu.MAP_network/.DS_Store b/hu.MAP_network/.DS_Store
deleted file mode 100644
index 15d62dd426e905284354a79a9c9a6e6540bd5f90..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001

literal 6148
zcmeHK&1w`u5Ux&EGn*i+U?j&uZxVE)Mi4KP&5C&N;)otp)|t$Z49-jsyEECtvM}H!
zc^YrNfIfpD9()DCi#M&Ru9#i23oC+%bV1eEKULjb-_YID0H8Wmv<1)t00)(@v5IDf
zP(SGzS(27DM8#v2(1Qp9C{e9g1}p>rIs>$KS0RBM3aD`3{@T9}Q@t0aI>dfD9NwZ{
z|G|Q#IWggIP)N+Xut<|Z*=~PSOUsRO=U1E+XU%!(ALyZ924ywq1pVh!d!lt3o|J>|
zX%Y{6txLN)FM}kH`y7x&F}l2XmgJEhc660T1CI4ghtqVLz1HQ?X!~Y+(`}D8-O*ht
z-M)2aJZ?JcS8v>V+<Tp7xjy2xgTm8_<$~Y|9AjZ78|nU*zo)Z;9oN4+nCF-2ZdVwf
zBlFGYYwvgF`$h}<76+5_f2C#2T<4+R&r_Z6V?X71k-uySG4vrr8)2PJ=lJwHPi02q
zE)+GH=c8}mt4~)xP@W6Sm+8~2e=l;1174^#f5d?70*B53E0zJvfMsBk0XiRCR6<X%
zHpoW@8h!#GHn5om%k;ZIF|ME|SR2G16lFpYO{kJ5hBD#U?uwr$SQ|9qpyctPWM(BV
z6veaS_%6GH@(fyQ8L$jYGq7x`ExP~j&wu|<2iclsz%uY(F`yb<zuUo-<ZeBa9No1p
s>MK+d@~aJU2pYMLxj<L(1}Y2oxl};(1Z#uXf?|IJm<DSs1Amo)pZ?Rawg3PC

diff --git a/toy_network/.DS_Store b/toy_network/.DS_Store
deleted file mode 100644
index 5008ddfcf53c02e82d7eee2e57c38e5672ef89f6..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001

literal 6148
zcmeH~Jr2S!425mzP>H1@V-^m;4Wg<&0T*E43hX&L&p$$qDprKhvt+--jT7}7np#A3
zem<@ulZcFPQ@L2!n>{z**<q8>++&mCkOWA81W14cNZ<zv;LbK1Poaz?KmsK2CSc!(
z0ynLxE!0092;Krf2c+FF_Fe*7ECH>lEfg7;MkzE(HCqgga^y>{tEnwC%0;vJ&^%eQ
zLs35+`xjp>T0<F0fCPF1$Cyrb|F7^5{eNG?83~ZUUlGt@xh*qZDeu<Z%US-OSsOPv
j)R!Z4KLME7ReXlK;d!wEw5GODWMKRea10D2@KpjYNUI8I


From 0df4afbdc06af378dcb4a55cf39cb2e3152cebc9 Mon Sep 17 00:00:00 2001
From: "meghana.palukuri" <meghana.palukuri@gmail.com>
Date: Tue, 21 Jun 2022 17:37:35 -0500
Subject: [PATCH 06/14] rmd unnecessary reqs

---
 requirements_py3.txt | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/requirements_py3.txt b/requirements_py3.txt
index 78c0bc1c..c6d28759 100644
--- a/requirements_py3.txt
+++ b/requirements_py3.txt
@@ -8,13 +8,5 @@ joblib==1.1.0
 tqdm==4.63.0
 numpy==1.22.0
 pandas==1.4.1
-scikit-MDR==0.4.4
-deap==1.3.1
-update-checker==0.18.0
-stopit==1.1.2
-TPOT==0.11.7
 seaborn==0.11.2
-xgboost
-tensorflow
-pytest
 pypiwin32
\ No newline at end of file

From 94d4ee33c0c7a5d63be0d2885eca5d56abf70f83 Mon Sep 17 00:00:00 2001
From: "meghana.palukuri" <meghana.palukuri@gmail.com>
Date: Tue, 21 Jun 2022 19:41:17 -0500
Subject: [PATCH 07/14] removed unnecessary options

---
 main6_eval.py     |  3 ---
 read_complexes.py | 54 +----------------------------------------------
 2 files changed, 1 insertion(+), 56 deletions(-)

diff --git a/main6_eval.py b/main6_eval.py
index baf11886..ca8fe1b1 100644
--- a/main6_eval.py
+++ b/main6_eval.py
@@ -93,10 +93,7 @@ def main():
     parser = argparse_ArgumentParser("Input parameters")
     parser.add_argument("--input_file_name", default="input_toy.yaml", help="Input parameters file name")
     parser.add_argument("--out_dir_name", default="/results", help="Output directory name, by default - /results")
-    parser.add_argument("--seed_mode", help="Seed mode - specify 'cliques' for the cliques algo")
     parser.add_argument("--train_test_files_dir", default="", help="Train test file path")
-    
-    parser.add_argument("--search_method", help="Sampling algorithm")
     parser.add_argument("--model_dir", help="Directory containing model")
     parser.add_argument("--python_command", default="python", help="python / python3")
     parser.add_argument("--read_flag", default=0, help="1 when you want to read from file for evaluation")
diff --git a/read_complexes.py b/read_complexes.py
index af10f6cc..4cfc3a7e 100644
--- a/read_complexes.py
+++ b/read_complexes.py
@@ -9,7 +9,7 @@
 from jaccard_coeff import jaccard_coeff
 from numpy import mean as np_mean, argmax as np_argmax, var as np_var, sqrt as sqrt
 from numpy.random import permutation as rand_perm, choice as rand_choice
-from logging import info as logging_info
+from logging import info as logging_info, debug as debug_info
 from networkx import write_weighted_edgelist as nx_write_weighted_edgelist, is_connected as nx_is_connected
 from scipy.stats import norm as norm_dist
 from convert_humap_ids2names import convert2names
@@ -199,55 +199,6 @@ def split_ratio(perm_lines, ratio):
     return train_list, test_list
 
 
-def split_meth_orig(perm_lines, inputs):
-    fact = inputs['fact']  # 0.99
-    split_pt = int(round(len(perm_lines) * fact))
-    train_list = [line for line in perm_lines[0:split_pt]]
-    test_list = [line for line in perm_lines[split_pt:]]
-    # Start with something that has a biased size distribution !!
-
-    sizes = [len(line) for line in train_list]
-    train_mean = np_mean(sizes)
-
-    # Transferring some of the smaller complexes to the test list
-    train_list_lower_mean = [line for line in train_list if len(line) < train_mean]
-    perc_transfer = inputs['perc_transfer']  # 0.3 # You can optimize these parameters !
-    to_transfer = train_list_lower_mean[:int(round(len(train_list_lower_mean) * perc_transfer))]
-    test_list = test_list + to_transfer
-
-    # Now remove from train set
-    for line in to_transfer:
-        train_list.remove(line)
-
-    # Finding complexes in train that share an edge with a complex in test
-    com_comp = 10
-    while com_comp != 0:  # Do until train and test sets are completely separated
-
-        # Removing super huge complexes also (nodes >30 ) from test set
-        test_list = [line for line in test_list if len(line) < 30]
-
-        # REMOVE OVERLAP B/W TRAIN AND TEST DATA
-        # Remove complexes from train set sharing two proteins with test set
-        train_rem = []
-        train_rem_append = train_rem.append
-        com_comp = 0
-        for train_line in train_list:
-            pres = 0
-            for test_line in test_list:
-                common = len(set(train_line.edges()).intersection(set(test_line.edges)))
-                if common >= 1:
-                    pres = 1
-                    break
-            if pres == 1:
-                train_rem_append(train_line)
-                com_comp += 1
-
-        logging_info("No. of train complexes transferred = %s", str(com_comp))
-        test_list = test_list + train_rem
-        for t_line in train_rem:
-            train_list.remove(t_line)
-    return train_list, test_list
-
 
 def merge_overlapped(list_comp,overlap_thres = 0.6):
     logging_info("Merging complexes...")
@@ -354,7 +305,6 @@ def split_train_test_complexes(inputs, G):
 
     perm_lines = rand_perm(complexes)
     ratio = (70, 30)
-    # train_list, test_list = split_meth_orig(perm_lines, inputs)
     train_list, test_list = split_ratio(perm_lines, ratio)
     plot_size_dists(train_list, test_list, out_comp_nm)
     with open(out_comp_nm + "_train_complexes_new.txt", "w") as f:
@@ -366,8 +316,6 @@ def split_train_test_complexes(inputs, G):
             f.write(sep.join(line) + "\n")
     with open(out_comp_nm + '_metrics.out', "a") as fid:
         print("Split ratio = %.3f" % str(float(len(train_list)) / len(test_list)), file=fid)
-        # print("Initial train_test split = ", fact, file=fid)
-        # print("Percentage of low sizes transferred from train to test = ", perc_transfer, file=fid)
     return train_list, test_list
 
 
From 31fb6d281c830424cfc537c0fb260fa27d07a6b7 Mon Sep 17 00:00:00 2001
From: "meghana.palukuri" <meghana.palukuri@gmail.com>
Date: Tue, 21 Jun 2022 19:53:06 -0500
Subject: [PATCH 08/14] bash file for humap exp

---
 humap_steps.bash | 23 +++++++++++++++++++++++
 input_humap.yaml | 32 --------------------------------
 input_toy.yaml   | 34 ----------------------------------
 3 files changed, 23 insertions(+), 66 deletions(-)
 create mode 100644 humap_steps.bash

diff --git a/humap_steps.bash b/humap_steps.bash
new file mode 100644
index 00000000..9f28dac1
--- /dev/null
+++ b/humap_steps.bash
@@ -0,0 +1,23 @@
+#!/bin/bash
+
+mtype = humap
+input_file_name = input_$mtype.yaml
+graph_file = hu.MAP_network_experiments/input_data/humap_network_weighted_edge_lists.txt
+input_training_file = hu.MAP_network_experiments/intermediate_output_results_data/training_CORUM_complexes_node_lists.txt
+input_testing_file = hu.MAP_network_experiments/intermediate_output_results_data/testing_CORUM_complexes_node_lists.txt
+out_dir_name = /results_$mtype
+train_results = $out_dir_name/train_results
+pred_results = $out_dir_name/pred_results
+id_map_path = convert_ids/humap_gene_id_name_map.txt
+
+echo Training Algorithm....
+python3 functions/main_training.py --input_training_file $input_training_file --graph_file $graph_file --train_results $train_results
+
+echo Predicting new complexes from known communities...
+python3 functions/main_prediction.py --graph_file $graph_file --train_results $train_results --out_dir_name $out_dir_name --pred_results $pred_results
+
+echo Merging similar communities...
+python3 functions/postprocessing.py --input_file_name $input_file_name --graph_file $graph_file --out_dir_name $out_dir_name --pred_results $pred_results --train_results $train_results --input_training_file $input_training_file --input_testing_file $input_testing_file --id_map_path $id_map_path
+
+echo Comparing predicted and known communitites...
+python3 functions/eval_complex_RL --input_file_name $input_file_name  --input_training_file $input_training_file --input_testing_file $input_testing_file --out_dir_name $out_dir_name
diff --git a/input_humap.yaml b/input_humap.yaml
index 525b6826..d75a259f 100644
--- a/input_humap.yaml
+++ b/input_humap.yaml
@@ -6,43 +6,11 @@ comf_nm: "/res_train_complexes_new_73_more.txt"
 comf_test_nm: "/res_test_complexes_new_73_more.txt" # Make sure no extra rows are present
 comf_nm_all: "/all_complexes.txt"
 out_comp_nm: "/results_qi0.325/res"
-scale_factor: 10  # Number of times negatives should be higher than positives 
-use_full: 1
 split_flag: 0
-fact: 0.99
-perc_transfer: 0.275
-mode: non_gen # gen means only feature extraction, non_gen is all 
 # -------------------Training parameters--------------------------------
-feats: 6
 
-classifier_file: "humap/results_73_neg_unif_10x/res_classifiers_new.txt" # or remove new - CHECK
-model_type: "tpot" # Options: tpot, NN
-train_feat_mat: "humap/results_73_neg_unif_10x/res_train_dat.csv"
-test_feat_mat: "humap/results_73_neg_unif_10x/res_test_dat.csv"
-model_name: "tpot_select" #Options: FF_1hidden, log_reg, SVM, rand_forest, extra_trees, estimator_SVM
 model_dir: "/results_73_neg_unif_10x/res"
 # --------------------Search parameters ------------------------------
-seed_mode: "all_nodes" # Options:all_nodes_known_comp, all_nodes, n_nodes,cliques
-num_comp: 5 # Options: 10, 7778, 1500 - only for n_nodes mode
-classi_thresh: 0.5
-
-run_mode: "parallel" # Options: serial, parallel
-max_size_thres: 11 
-search_method: "isa" # isa, metropolis, search_top_neigs, search_max_neig 
-
-# All methods except max_neig    
-use_all_neigs: 1
-thres_neig: 30 # Maximum number of neighbors sampled for checking 
-min_thres_neig_sorted: 100 # Threshold above which only a percentage of neigs are considered as per sorted weights
-perc: 0.7 # Percentage of neighbors to check for adding new node
-explore_prob: 0.01 # use 0.1 for top_neigs 
-
-# Metropolis algorithm params
-prob_metropolis: 0.1
-
-# ISA params
-T0: 1.75
-alpha: 0.005
 
 over_t: 0.325 # Overlap threshold = 0.7/0.9
 overlap_method: "qi" # testing_qi_0.3 or 1
diff --git a/input_toy.yaml b/input_toy.yaml
index 3dd71693..2678ae56 100644
--- a/input_toy.yaml
+++ b/input_toy.yaml
@@ -3,44 +3,10 @@ dir_nm: "toy_network" # Options: toy_network, toy_network_old, humap, humap2
 sep: " " # Options: " ", "\t"
 out_comp_nm: "/results/res" 
 split_flag: 0
-fact: 0.7
-perc_transfer: 0.2
-use_full: 1
-scale_factor: 1.1  # Number of times negatives should be higher than positives 
-mode: non_gen # gen means only feature extraction, non_gen is all
 # -------------------Training parameters--------------------------------
-feats: 6
-
-model_type: "tpot" # Options: tpot, NN
-train_feat_mat: "toy_network/results_train_dat.csv"
-test_feat_mat: "toy_network/results_train_dat.csv"
-model_name: "SVM" #Options: FF_1hidden, log_reg, SVM, rand_forest, extra_trees
-# humap with separted train and test sets - tpot result - extra_trees
  
 model_dir: "/results/res"
 # --------------------Search parameters ------------------------------
-seed_mode: "all_nodes" # Options:all_nodes_known_comp, all_nodes, n_nodes, cliques
-num_comp: 40 # Options: 10, 7778, 1500 -  only for n_nodes mode
-
-run_mode: "parallel" # Options: serial, parallel
-max_size_thres: 50 
-
-search_method: "search_top_neigs" # isa, metropolis, search_top_neigs, search_max_neig
-    
-# All methods except search_max_neig    
-# No. of neighbors considered params
-use_all_neigs: 1
-thres_neig: 30 # Maximum number of neighbors sampled for checking
-min_thres_neig_sorted: 30 # Threshold above which only a percentage of neigs are considered as per sorted weights
-perc: 0.7 # Percentage of neighbors to check for adding new node
-
-explore_prob: 0.01 # use 0.1 for top_neigs
-# Metropolis algorithm params
-prob_metropolis: 0.1
-
-# ISA params
-T0: 0.88
-alpha: 1.8
 
 over_t: 0.1 # Overlap threshold = 0.7/0.9
 infer_overlap_threshold: "y"

From 3502a93c7d1d0d404e34da0c5a13bf3e390f8350 Mon Sep 17 00:00:00 2001
From: Meghana Palukuri <meghana7@marccomp01.ccbb.utexas.edu>
Date: Tue, 21 Jun 2022 21:00:13 -0500
Subject: [PATCH 09/14] matplotlib backend opt

---
 README.md        | 26 +++++++++++++-------------
 humap_steps.bash | 23 -----------------------
 humap_steps.sh   | 23 +++++++++++++++++++++++
 main_training.py |  5 ++++-
 4 files changed, 40 insertions(+), 37 deletions(-)
 delete mode 100644 humap_steps.bash
 create mode 100755 humap_steps.sh

diff --git a/README.md b/README.md
index fa39f657..00302b83 100644
--- a/README.md
+++ b/README.md
@@ -19,27 +19,27 @@ An example bash script to run the RL pipeline after the above steps is shown bel
 ```
 #!/bin/bash
 
-mtype = humap
-input_file_name = input_$mtype.yaml
-graph_file = hu.MAP_network_experiments/input_data/humap_network_weighted_edge_lists.txt
-input_training_file = hu.MAP_network_experiments/intermediate_output_results_data/training_CORUM_complexes_node_lists.txt
-input_testing_file = hu.MAP_network_experiments/intermediate_output_results_data/testing_CORUM_complexes_node_lists.txt
-out_dir_name = /results_$mtype
-train_results = $out_dir_name/train_results
-pred_results = $out_dir_name/pred_results
-id_map_path = convert_ids/humap_gene_id_name_map.txt
+mtype=humap
+input_file_name=input_$mtype.yaml
+graph_file=hu.MAP_network/input_data/humap_network_weighted_edge_lists.txt
+input_training_file=hu.MAP_network/intermediate_data/training_CORUM_complexes_node_lists.txt
+input_testing_file=hu.MAP_network/intermediate_data/testing_CORUM_complexes_node_lists.txt
+out_dir_name=/results_$mtype
+train_results=$out_dir_name/train_results
+pred_results=$out_dir_name/pred_results
+id_map_path=convert_ids/humap_gene_id_name_map.txt
 
 echo Training Algorithm....
-python3 functions/main_training.py --input_training_file $input_training_file --graph_file $graph_file --train_results $train_results
+python3 main_training.py --input_training_file $input_training_file --graph_file $graph_file --train_results $train_results
 
 echo Predicting new complexes from known communities...
-python3 functions/main_prediction.py --graph_file $graph_file --train_results $train_results --out_dir_name $out_dir_name --pred_results $pred_results
+python3 main_prediction.py --graph_file $graph_file --train_results $train_results --out_dir_name $out_dir_name --pred_results $pred_results
 
 echo Merging similar communities...
-python3 functions/postprocessing.py --input_file_name $input_file_name --graph_file $graph_file --out_dir_name $out_dir_name --pred_results $pred_results --train_results $train_results --input_training_file $input_training_file --input_testing_file $input_testing_file --id_map_path $id_map_path
+python3 postprocessing.py --input_file_name $input_file_name --graph_file $graph_file --out_dir_name $out_dir_name --pred_results $pred_results --train_results $train_results --input_training_file $input_training_file --input_testing_file $input_testing_file --id_map_path $id_map_path
 
 echo Comparing predicted and known communitites...
-python3 functions/eval_complex_RL --input_file_name $input_file_name  --input_training_file $input_training_file --input_testing_file $input_testing_file --out_dir_name $out_dir_name
+python3 eval_complex_RL --input_file_name $input_file_name  --input_training_file $input_training_file --input_testing_file $input_testing_file --out_dir_name $out_dir_name
 
 ```
 
diff --git a/humap_steps.bash b/humap_steps.bash
deleted file mode 100644
index 9f28dac1..00000000
--- a/humap_steps.bash
+++ /dev/null
@@ -1,23 +0,0 @@
-#!/bin/bash
-
-mtype = humap
-input_file_name = input_$mtype.yaml
-graph_file = hu.MAP_network_experiments/input_data/humap_network_weighted_edge_lists.txt
-input_training_file = hu.MAP_network_experiments/intermediate_output_results_data/training_CORUM_complexes_node_lists.txt
-input_testing_file = hu.MAP_network_experiments/intermediate_output_results_data/testing_CORUM_complexes_node_lists.txt
-out_dir_name = /results_$mtype
-train_results = $out_dir_name/train_results
-pred_results = $out_dir_name/pred_results
-id_map_path = convert_ids/humap_gene_id_name_map.txt
-
-echo Training Algorithm....
-python3 functions/main_training.py --input_training_file $input_training_file --graph_file $graph_file --train_results $train_results
-
-echo Predicting new complexes from known communities...
-python3 functions/main_prediction.py --graph_file $graph_file --train_results $train_results --out_dir_name $out_dir_name --pred_results $pred_results
-
-echo Merging similar communities...
-python3 functions/postprocessing.py --input_file_name $input_file_name --graph_file $graph_file --out_dir_name $out_dir_name --pred_results $pred_results --train_results $train_results --input_training_file $input_training_file --input_testing_file $input_testing_file --id_map_path $id_map_path
-
-echo Comparing predicted and known communitites...
-python3 functions/eval_complex_RL --input_file_name $input_file_name  --input_training_file $input_training_file --input_testing_file $input_testing_file --out_dir_name $out_dir_name
diff --git a/humap_steps.sh b/humap_steps.sh
new file mode 100755
index 00000000..6a9c5288
--- /dev/null
+++ b/humap_steps.sh
@@ -0,0 +1,23 @@
+#!/bin/bash
+
+mtype=humap
+input_file_name=input_$mtype.yaml
+graph_file=hu.MAP_network/input_data/humap_network_weighted_edge_lists.txt
+input_training_file=hu.MAP_network/intermediate_data/training_CORUM_complexes_node_lists.txt
+input_testing_file=hu.MAP_network/intermediate_data/testing_CORUM_complexes_node_lists.txt
+out_dir_name=/results_$mtype
+train_results=$out_dir_name/train_results
+pred_results=$out_dir_name/pred_results
+id_map_path=convert_ids/humap_gene_id_name_map.txt
+
+echo Training Algorithm....
+python3 main_training.py --input_training_file $input_training_file --graph_file $graph_file --train_results $train_results
+
+echo Predicting new complexes from known communities...
+python3 main_prediction.py --graph_file $graph_file --train_results $train_results --out_dir_name $out_dir_name --pred_results $pred_results
+
+echo Merging similar communities...
+python3 postprocessing.py --input_file_name $input_file_name --graph_file $graph_file --out_dir_name $out_dir_name --pred_results $pred_results --train_results $train_results --input_training_file $input_training_file --input_testing_file $input_testing_file --id_map_path $id_map_path
+
+echo Comparing predicted and known communitites...
+python3 eval_complex_RL --input_file_name $input_file_name  --input_training_file $input_training_file --input_testing_file $input_testing_file --out_dir_name $out_dir_name
diff --git a/main_training.py b/main_training.py
index 9266506c..d8c1123d 100755
--- a/main_training.py
+++ b/main_training.py
@@ -136,7 +136,10 @@ def main():
     start_time = time.time()
     matplotlib.use('Agg')
     logging.basicConfig(level=logging.WARNING)
-    matplotlib.use('tkagg')
+    try:
+        matplotlib.use('tkagg')
+    except:
+        print("Can't use tkagg backend")
     # input data
     parser = argparse_ArgumentParser("Input parameters")
     parser.add_argument("--input_training_file", default="", help="Training Complexes file path")

From 1f82c3247f45553cb9321238fdbe752c07765bdd Mon Sep 17 00:00:00 2001
From: Meghana Palukuri <meghana7@marccomp01.ccbb.utexas.edu>
Date: Tue, 21 Jun 2022 21:31:31 -0500
Subject: [PATCH 10/14] creating dirs if absent

---
 README.md          | 3 ++-
 humap_steps.sh     | 3 ++-
 main_prediction.py | 9 +++++++++
 main_training.py   | 3 +++
 4 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 00302b83..7bc9a407 100644
--- a/README.md
+++ b/README.md
@@ -24,7 +24,8 @@ input_file_name=input_$mtype.yaml
 graph_file=hu.MAP_network/input_data/humap_network_weighted_edge_lists.txt
 input_training_file=hu.MAP_network/intermediate_data/training_CORUM_complexes_node_lists.txt
 input_testing_file=hu.MAP_network/intermediate_data/testing_CORUM_complexes_node_lists.txt
-out_dir_name=/results_$mtype
+mkdir results_$mtype
+out_dir_name=./results_$mtype
 train_results=$out_dir_name/train_results
 pred_results=$out_dir_name/pred_results
 id_map_path=convert_ids/humap_gene_id_name_map.txt
diff --git a/humap_steps.sh b/humap_steps.sh
index 6a9c5288..6fd042aa 100755
--- a/humap_steps.sh
+++ b/humap_steps.sh
@@ -5,7 +5,8 @@ input_file_name=input_$mtype.yaml
 graph_file=hu.MAP_network/input_data/humap_network_weighted_edge_lists.txt
 input_training_file=hu.MAP_network/intermediate_data/training_CORUM_complexes_node_lists.txt
 input_testing_file=hu.MAP_network/intermediate_data/testing_CORUM_complexes_node_lists.txt
-out_dir_name=/results_$mtype
+mkdir results_$mtype
+out_dir_name=./results_$mtype
 train_results=$out_dir_name/train_results
 pred_results=$out_dir_name/pred_results
 id_map_path=convert_ids/humap_gene_id_name_map.txt
diff --git a/main_prediction.py b/main_prediction.py
index cb811709..a260fb5f 100644
--- a/main_prediction.py
+++ b/main_prediction.py
@@ -113,14 +113,21 @@ def pred_complex(n, nodes_list, G, gg, value_functions, intervals,args):
 
    # args.pred_results = "../results/pred_results"
     file = args.pred_results + '/nodes_complexes/'
+
     with open(file + str(n), 'wb') as f:
         pickle_dump(tup_cmplx, f)
     with open(file + str(n), 'rb') as f:
         pickle_load(f)
 
+import os
 
 def network(G, gg, nodes, intervals, value_functions,args):
     ## input data
+
+    fol = args.pred_results + '/nodes_complexes/'
+    if not os.path.exists(fol):
+        os.mkdir(fol)
+
     nodes_list = list(nodes)
     # make sure all intervals are accounted for
     for i in intervals:
@@ -128,6 +135,8 @@ def network(G, gg, nodes, intervals, value_functions,args):
             val_fn = interpolate(value_functions, i)
             value_functions[i] = val_fn
     filename = args.pred_results + '/value_fns_pred.pkl'
+    if not os.path.exists(args.pred_results):
+        os.mkdir(args.pred_results)
     with open(filename, 'wb') as f:
         pickle.dump(value_functions, f)
     fname = args.pred_results + '/value_fns_interp.txt'
diff --git a/main_training.py b/main_training.py
index d8c1123d..25845b2e 100755
--- a/main_training.py
+++ b/main_training.py
@@ -131,6 +131,7 @@ def network(G, gg, value_dict, dens_counter, valuefn_update, intervals, subgraph
     return gg
     # e += 1
 
+import os
 
 def main():
     start_time = time.time()
@@ -184,6 +185,8 @@ def main():
     network(G, gg, value_dict, dens_counter, valuefn_update, intervals, subgraphs)
     # save value function scores in dictionary
     #args.train_results = "../results/train_results"
+    if not os.path.exists(args.train_results):
+        os.mkdir(args.train_results)
     fname = args.train_results + "/value_fn_dens_dict.txt"
     file = open(fname, "w")
     value_dict_sorted = sorted(value_dict.items())

From fdada3517fd445fd1d83115ee42783af4ed5128c Mon Sep 17 00:00:00 2001
From: Meghana Palukuri <meghana7@marccomp01.ccbb.utexas.edu>
Date: Wed, 22 Jun 2022 11:43:56 -0500
Subject: [PATCH 11/14] added id2name map input option during eval

---
 README.md                  |  2 +-
 convert_humap_ids2names.py |  4 ++--
 eval_cmplx_sc.py           | 38 +++++++++++++++++++-------------------
 eval_complex_RL.py         | 10 +++++-----
 humap_steps.sh             |  2 +-
 main6_eval.py              |  2 +-
 postprocessing.py          |  7 +++++++
 7 files changed, 36 insertions(+), 29 deletions(-)

diff --git a/README.md b/README.md
index 7bc9a407..9ec03f97 100644
--- a/README.md
+++ b/README.md
@@ -40,7 +40,7 @@ echo Merging similar communities...
 python3 postprocessing.py --input_file_name $input_file_name --graph_file $graph_file --out_dir_name $out_dir_name --pred_results $pred_results --train_results $train_results --input_training_file $input_training_file --input_testing_file $input_testing_file --id_map_path $id_map_path
 
 echo Comparing predicted and known communitites...
-python3 eval_complex_RL --input_file_name $input_file_name  --input_training_file $input_training_file --input_testing_file $input_testing_file --out_dir_name $out_dir_name
+python3 eval_complex_RL.py --input_file_name $input_file_name  --input_training_file $input_training_file --input_testing_file $input_testing_file --out_dir_name $out_dir_name --id_name_path $id_map_path
 
 ```
 
diff --git a/convert_humap_ids2names.py b/convert_humap_ids2names.py
index d68a3e18..7b398331 100644
--- a/convert_humap_ids2names.py
+++ b/convert_humap_ids2names.py
@@ -184,6 +184,6 @@ def convert2names_wscores(complexes, filename, G, filename_edges, ids_map):
     convert_edges_wscore(lines, G, filename_edges, id_name_map)
 
 
-def convert2names_wscores_matches(complex_matches, filename):
-    id_name_map = read_gene_id_map()
+def convert2names_wscores_matches(complex_matches, filename, id_name_map_path):
+    id_name_map = read_gene_id_map(id_name_map_path)
     convert_nodes_matches_wscore(complex_matches, filename, id_name_map)    
diff --git a/eval_cmplx_sc.py b/eval_cmplx_sc.py
index e2287b7d..0a2c5ef7 100644
--- a/eval_cmplx_sc.py
+++ b/eval_cmplx_sc.py
@@ -16,11 +16,11 @@
 
 
-def write_best_matches(best_matches_for_known,out_comp_nm,dir_nm,suffix):
+def write_best_matches(best_matches_for_known,out_comp_nm,dir_nm,suffix,id_name_map):
        
     sorted_matches = sorted(best_matches_for_known,key=lambda x: x[2],reverse=True)
     if dir_nm == "humap":
-        convert2names_wscores_matches(sorted_matches, out_comp_nm + suffix + '_known_pred_matches_names.out')
+        convert2names_wscores_matches(sorted_matches, out_comp_nm + suffix + '_known_pred_matches_names.out',id_name_map)
  
     with open(out_comp_nm + suffix + '_known_pred_matches.out', "w") as fn:
         fn_write = fn.write
@@ -127,7 +127,7 @@ def f1_similarity(P,T):
     return F1_score, C 
 
 
-def one2one_matches(known_complex_nodes_list, fin_list_graphs, N_pred_comp, N_test_comp,out_comp_nm,suffix,dir_nm):
+def one2one_matches(known_complex_nodes_list, fin_list_graphs, N_pred_comp, N_test_comp,out_comp_nm,suffix,dir_nm, id_name_map):
 
     Metric = np_zeros((N_test_comp, N_pred_comp))
     Common_nodes = np_zeros((N_test_comp, N_pred_comp))
@@ -174,8 +174,8 @@ def one2one_matches(known_complex_nodes_list, fin_list_graphs, N_pred_comp, N_te
     avg_f1_score = (avged_f1_score4known + avged_f1_score4pred)/2
     net_f1_score = 2 * avged_f1_score4known * avged_f1_score4pred / (avged_f1_score4known + avged_f1_score4pred)
     
-    write_best_matches(best_matches_4known,out_comp_nm,dir_nm,'_best4known' + suffix)
-    write_best_matches(best_matches_4predicted,out_comp_nm,dir_nm,'_best4predicted' + suffix)
+    write_best_matches(best_matches_4known,out_comp_nm,dir_nm,'_best4known' + suffix, id_name_map)
+    write_best_matches(best_matches_4predicted,out_comp_nm,dir_nm,'_best4predicted' + suffix, id_name_map)
 
     prec_MMR, recall_MMR, f1_MMR, max_matching_edges = f1_mmr(Metric)
     
@@ -295,28 +295,28 @@ def remove_unknown_prots(fin_list_graphs_orig, prot_list):
     return fin_list_graphs
 
 
-def compute_metrics(known_complex_nodes_list, fin_list_graphs,out_comp_nm,N_test_comp,N_pred_comp,inputs,suffix):
+def compute_metrics(known_complex_nodes_list, fin_list_graphs,out_comp_nm,N_test_comp,N_pred_comp,inputs,suffix, id_name_map):
 
     if N_test_comp != 0 and N_pred_comp != 0:
         Precision, Recall, F1_score = node_comparison_prec_recall(known_complex_nodes_list,fin_list_graphs, N_pred_comp, N_test_comp, inputs["eval_p"],out_comp_nm+suffix)
         
-        avg_f1_score, net_f1_score,PPV,Sn,acc_unbiased,prec_MMR, recall_MMR, f1_MMR,n_matches = one2one_matches(known_complex_nodes_list, fin_list_graphs, N_pred_comp, N_test_comp,out_comp_nm,suffix,inputs['dir_nm'])
+        avg_f1_score, net_f1_score,PPV,Sn,acc_unbiased,prec_MMR, recall_MMR, f1_MMR,n_matches = one2one_matches(known_complex_nodes_list, fin_list_graphs, N_pred_comp, N_test_comp,out_comp_nm,suffix,inputs['dir_nm'], id_name_map)
         
         with open(out_comp_nm + '_metrics.out', "a") as fid:
             print("No. of matches in MMR = ", n_matches, file=fid)            
-            print("MMR Precision = %.3f" % prec_MMR, file=fid)
-            print("MMR Recall = %.3f" % recall_MMR, file=fid)
-            print("MMR F1 score = %.3f" % f1_MMR, file=fid)               
-            print("Net F1 score = %.3f" % net_f1_score, file=fid)   
+            print("FMM Precision = %.3f" % prec_MMR, file=fid)
+            print("FMM Recall = %.3f" % recall_MMR, file=fid)
+            print("FMM F1 score = %.3f" % f1_MMR, file=fid)               
+            print("CMMF = %.3f" % net_f1_score, file=fid)   
             print("Unbiased PPV = %.3f" % PPV, file=fid)
             print("Unbiased Sn = %.3f" % Sn, file=fid)
-            print("Unbiased accuracy= %.3f" % acc_unbiased, file=fid)             
+            print("Unbiased accuracy (UnSPA)= %.3f" % acc_unbiased, file=fid)             
             print("Net Averaged F1 score (Average of Precision and Recall based on F1 score) = %.3f" % avg_f1_score, file=fid)
-            print("Prediction Precision = %.3f" % Precision, file=fid)
-            print("Prediction Recall = %.3f" % Recall, file=fid)
-            print("Prediction F1 score = %.3f" % F1_score, file=fid)    
+            print("Qi et al Precision = %.3f" % Precision, file=fid)
+            print("Qi et al Recall = %.3f" % Recall, file=fid)
+            print("Qi et al F1 score = %.3f" % F1_score, file=fid)    
     
-def eval_complex(rf=0, rf_nm=0, inputs={}, known_complex_nodes_list=[], prot_list=[], fin_list_graphs=[], out_comp_nm = '',suffix="both"):
+def eval_complex(rf=0, rf_nm=0, inputs={}, known_complex_nodes_list=[], prot_list=[], fin_list_graphs=[], out_comp_nm = '',suffix="both", id_name_map = ""):
     # rf - read flag to read complexes from file
     logging_info("Evaluating complexes..." + suffix)
     if rf == 1:
@@ -338,7 +338,7 @@ def eval_complex(rf=0, rf_nm=0, inputs={}, known_complex_nodes_list=[], prot_lis
         print("No. of Predicted complexes = ", N_pred_comp, file=fid)
         print("\n -- Metrics on complexes with all proteins -- ", file=fid)       
     print(out_comp_nm)
-    compute_metrics(known_complex_nodes_list, fin_list_graphs, out_comp_nm,N_test_comp,N_pred_comp,inputs,suffix+'_all_prots')            
+    compute_metrics(known_complex_nodes_list, fin_list_graphs, out_comp_nm,N_test_comp,N_pred_comp,inputs,suffix+'_all_prots',id_name_map)            
     
     fin_list_graphs = remove_unknown_prots(fin_list_graphs, prot_list)
     plot_size_dists(known_complex_nodes_list, fin_list_graphs, sizes_orig, out_comp_nm)
@@ -348,8 +348,8 @@ def eval_complex(rf=0, rf_nm=0, inputs={}, known_complex_nodes_list=[], prot_lis
         print("No. of Predicted complexes after removing non-gold std proteins = ", N_pred_comp, file=fid)
         print("\n -- Metrics on complexes with only gold std proteins -- ", file=fid)   
     
-    compute_metrics(known_complex_nodes_list, fin_list_graphs, out_comp_nm,N_test_comp,N_pred_comp,inputs,suffix+'_gold_std_prots')            
+    compute_metrics(known_complex_nodes_list, fin_list_graphs, out_comp_nm,N_test_comp,N_pred_comp,inputs,suffix+'_gold_std_prots', id_name_map)            
     with open(out_comp_nm + '_metrics.out', "a") as fid:
         print("-- Finished writing main metrics -- \n", file=fid)   
 
-    logging_info("Finished evaluating basic metrics for complexes " + suffix)
\ No newline at end of file
+    logging_info("Finished evaluating basic metrics for complexes " + suffix)
diff --git a/eval_complex_RL.py b/eval_complex_RL.py
index 19f15ace..5e09a0ad 100644
--- a/eval_complex_RL.py
+++ b/eval_complex_RL.py
@@ -1,8 +1,7 @@
 from argparse import ArgumentParser as argparse_ArgumentParser, ArgumentParser
 from pickle import load as pickle_load
 from yaml import load as yaml_load, dump as yaml_dump, Loader as yaml_Loader
-from eval_cmplx_sc import eval_complex
-from eval_cmplx_sc import remove_unknown_prots
+from eval_cmplx_sc import eval_complex, remove_unknown_prots
 from main6_eval import run_metrics
 import os
 def main():
@@ -13,6 +12,7 @@ def main():
     parser.add_argument("--input_testing_file", default="", help="Testing Graph file path")
     parser.add_argument("--out_dir_name", default="", help="Output directory name")
     parser.add_argument("--evaluate_additional_metrics", default=1, help="complexes file name")
+    parser.add_argument("--id_name_path", default="", help="Path for id to gene name file")
     args = parser.parse_args()
     print(args.input_file_name)
     with open(args.input_file_name, 'r') as f:
@@ -52,7 +52,7 @@ def main():
     # Remove all proteins in Predicted complexes that are not present in known complex protein list
     fin_list_graphs = remove_unknown_prots(fin_list_graphs_orig, prot_list)
     suffix = ''
-    eval_complex(0, 0, inputs, known_complex_nodes_list, prot_list, fin_list_graphs, out_comp_nm, suffix="_train")
+    eval_complex(0, 0, inputs, known_complex_nodes_list, prot_list, fin_list_graphs, out_comp_nm, suffix="_train", id_name_map = args.id_name_path)
     if args.evaluate_additional_metrics:
         try:
             run_metrics(known_complex_nodes_list, fin_list_graphs, out_comp_nm, "_train")
@@ -75,7 +75,7 @@ def main():
     # Remove all proteins in Predicted complexes that are not present in known complex protein list
     fin_list_graphs = remove_unknown_prots(fin_list_graphs_orig, prot_list)
     suffix = ''
-    eval_complex(0, 0, inputs, known_complex_nodes_list, prot_list, fin_list_graphs, out_comp_nm,suffix="_train")
+    eval_complex(0, 0, inputs, known_complex_nodes_list, prot_list, fin_list_graphs, out_comp_nm,suffix="_train", id_name_map = args.id_name_path)
 
     if args.evaluate_additional_metrics:
         try:
@@ -109,7 +109,7 @@ def main():
     N_pred_comp = len(fin_list_graphs)
     suffix = ''
 
-    eval_complex(0, 0, inputs, known_complex_nodes_list, prot_list, fin_list_graphs, out_comp_nm,suffix="_train")
+    eval_complex(0, 0, inputs, known_complex_nodes_list, prot_list, fin_list_graphs, out_comp_nm,suffix="_train", id_name_map = args.id_name_path)
 
     if args.evaluate_additional_metrics:
         try:
diff --git a/humap_steps.sh b/humap_steps.sh
index 6fd042aa..bccb6f6f 100755
--- a/humap_steps.sh
+++ b/humap_steps.sh
@@ -21,4 +21,4 @@ echo Merging similar communities...
 python3 postprocessing.py --input_file_name $input_file_name --graph_file $graph_file --out_dir_name $out_dir_name --pred_results $pred_results --train_results $train_results --input_training_file $input_training_file --input_testing_file $input_testing_file --id_map_path $id_map_path
 
 echo Comparing predicted and known communitites...
-python3 eval_complex_RL --input_file_name $input_file_name  --input_training_file $input_training_file --input_testing_file $input_testing_file --out_dir_name $out_dir_name
+python3 eval_complex_RL.py --input_file_name $input_file_name  --input_training_file $input_training_file --input_testing_file $input_testing_file --out_dir_name $out_dir_name --id_name_path $id_map_path
diff --git a/main6_eval.py b/main6_eval.py
index ca8fe1b1..39ae2f5a 100644
--- a/main6_eval.py
+++ b/main6_eval.py
@@ -21,7 +21,7 @@
 sys_path.insert(1, 'functions_py3/')
 from yaml import load as yaml_load, dump as yaml_dump, Loader as yaml_Loader
 from argparse import ArgumentParser as argparse_ArgumentParser
-from humap.functions.eval_cmplx_sc import eval_complex
+from eval_cmplx_sc import eval_complex
 # from random_walk_control import control
 
 from logging import basicConfig as logging_basicConfig, INFO as logging_INFO
diff --git a/postprocessing.py b/postprocessing.py
index 3ee52121..bd5c577c 100644
--- a/postprocessing.py
+++ b/postprocessing.py
@@ -61,15 +61,22 @@ def main():
     file = ''
     if inputs['dir_nm'] == 'toy_network':
         file = args.out_dir_name + '/qi_results'
+
+        if not os.path.exists(args.out_dir_name + '/qi_results'):
+            os.mkdir(args.out_dir_name + '/qi_results')
         filename = file + '/res'
     else:
         if inputs['overlap_method'] == 'qi':
            file = args.out_dir_name + '/qi_results'
+           if not os.path.exists(args.out_dir_name + '/qi_results'):
+               os.mkdir(args.out_dir_name + '/qi_results')
            #os.makedirs(args.out_dir_name + '/qi_results', exist_ok=True)
            filename = file + '/res'  # inputs['out_comp_nm']
            #os.makedirs(file + '/results_qi', exist_ok=True)
         elif inputs["overlap_method"] == '1':  # jaccard coeff
            file = args.out_dir_name + '/jacc_results'
+           if not os.path.exists(file):
+               os.mkdir(file)
            #os.makedirs(args.out_dir_name + '/jacc_results', exist_ok=True)
            filename = file + '/res'  # inputs['out_comp_nm']
            #os.makedirs(file + '/results_jacc', exist_ok=True)

From 8fafc2d277939f3cc044655c0370ffd8daf3f11d Mon Sep 17 00:00:00 2001
From: Meghana Palukuri <meghana.palukuri@utexas.edu>
Date: Wed, 22 Jun 2022 11:52:59 -0500
Subject: [PATCH 12/14] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 9ec03f97..d0f87ca4 100644
--- a/README.md
+++ b/README.md
@@ -49,6 +49,6 @@ For each of the scripts, optional arguments can be viewed by running: python3 sc
 For each command, add the desired argument directly on the terminal.
 
 # References:
-[Molecular complex detection in protein interaction networks through reinforcement learning](https://doi.org/10.1101/2022.06.20.496772) 
+[Molecular complex detection in protein interaction networks through reinforcement learning](https://www.biorxiv.org/content/10.1101/2022.06.20.496772v1) 
 
 Interactive visualizations of complexes learned by the RL algorithm on two human PPI networks, hu.MAP 1.0 and hu.MAP 2.0 are available here: [https://marcottelab.github.io/RL_humap_prediction/](https://marcottelab.github.io/RL_humap_prediction/)

From d3b4fc0a4a7761e2e5231d2e130fb08026c0a655 Mon Sep 17 00:00:00 2001
From: Meghana Palukuri <meghana.palukuri@utexas.edu>
Date: Wed, 22 Jun 2022 12:07:03 -0500
Subject: [PATCH 13/14] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index d0f87ca4..1ea0ec90 100644
--- a/README.md
+++ b/README.md
@@ -49,6 +49,6 @@ For each of the scripts, optional arguments can be viewed by running: python3 sc
 For each command, add the desired argument directly on the terminal.
 
 # References:
-[Molecular complex detection in protein interaction networks through reinforcement learning](https://www.biorxiv.org/content/10.1101/2022.06.20.496772v1) 
+M. V. Palukuri, R. S. Patil, and E. M. Marcotte, “Molecular complex detection in protein interaction networks through reinforcement learning.” bioRxiv, p. 2022.06.20.496772. doi: [10.1101/2022.06.20.496772](https://www.biorxiv.org/content/10.1101/2022.06.20.496772v1).
 
 Interactive visualizations of complexes learned by the RL algorithm on two human PPI networks, hu.MAP 1.0 and hu.MAP 2.0 are available here: [https://marcottelab.github.io/RL_humap_prediction/](https://marcottelab.github.io/RL_humap_prediction/)

From 1b2c45a8a48d8e4460c477b38c1e19a598b179b8 Mon Sep 17 00:00:00 2001
From: Meghana Palukuri <meghana7@marccomp01.ccbb.utexas.edu>
Date: Wed, 3 Aug 2022 11:27:18 -0500
Subject: [PATCH 14/14] added n cores option

---
 main_prediction.py | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/main_prediction.py b/main_prediction.py
index a260fb5f..dc6c7c32 100644
--- a/main_prediction.py
+++ b/main_prediction.py
@@ -144,7 +144,11 @@ def network(G, gg, nodes, intervals, value_functions,args):
         f.write(str(value_functions))
 
     # parallel running
-    num_cores = mul_cpu_count()
+    if args.n_cores == "all":
+        num_cores = mul_cpu_count()
+    else:
+        num_cores = int(args.n_cores)
+    print("No. of cores used = ",num_cores)
     Parallel(n_jobs=num_cores, backend='loky')(
         delayed(pred_complex)(node, nodes_list, G, gg, value_functions, intervals,args) for node in tqdm(nodes_list))
 
@@ -171,6 +175,7 @@ def main():
     parser.add_argument("--train_results", default="", help="Directory for training results")
     parser.add_argument("--pred_results", default="", help="Directory for prediction results")
     parser.add_argument("--out_dir_name", default = "", help = 'Main output directory')
+    parser.add_argument("--n_cores", default = "all", help = 'No. of cores to use for parallel processing')
     args = parser.parse_args()
     #os.makedirs(args.pred_results + '/nodes_complexes', exist_ok=True)