Corley-Kilgore-McCown Exercise 11 Submission #3
Changes from all commits
1c5055c
1fac0c3
7958470
623a186
e41ba11
2b6c17c
8f3960c
e80c18c
e742d6e
81e1998
c55f4f9
b7b2543
609ee42
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| #!/usr/bin/env bash | ||
|
|
||
| for i in Problem1/*.ref | ||
| do | ||
| ../../.././muscle3.8.31_i86win32 -in $i -out $i.aln | ||
| ../../../hmmer-3.1b2-cygwin64/binaries/./hmmbuild $i.hmm $i.aln | ||
| done | ||
| cat Problem1/*.fasta > all-files.fasta | ||
| ../../../hmmer-3.1b2-cygwin64/binaries/./hmmsearch Problem1/sigma.ref.hmm all-files.fasta > sigma.hmms | ||
| ../../../hmmer-3.1b2-cygwin64/binaries/./hmmsearch Problem1/sporecoat.ref.hmm all-files.fasta > sporecoat.hmms | ||
| ../../../hmmer-3.1b2-cygwin64/binaries/./hmmsearch Problem1/transporter.ref.hmm all-files.fasta > transporter.hmms | ||
|
|
||
|
|
||
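A reviewer-side sketch of the same align, build, and search sequence, planned in Python so that each step is an explicit argument list and a failing tool stops the run. The tool paths and the `-in`/`-out` arguments are copied from the script above; `plan_pipeline` and `run_pipeline` are hypothetical helper names, and the sketch assumes `all-files.fasta` has already been assembled.

```python
import subprocess
from pathlib import PurePosixPath

# Paths copied from the shell script above; adjust them for your own checkout.
MUSCLE = "../../../muscle3.8.31_i86win32"
HMMBUILD = "../../../hmmer-3.1b2-cygwin64/binaries/hmmbuild"
HMMSEARCH = "../../../hmmer-3.1b2-cygwin64/binaries/hmmsearch"


def plan_pipeline(ref_files, database="all-files.fasta"):
    """Return (argv, stdout_path) steps; stdout_path is None when nothing is redirected."""
    steps = []
    for ref in ref_files:
        ref = PurePosixPath(ref)
        aln = f"{ref}.aln"
        hmm = f"{ref}.hmm"
        # sigma.ref -> sigma.hmms, matching the output names used in the script
        results = ref.name.split(".")[0] + ".hmms"
        steps.append(([MUSCLE, "-in", str(ref), "-out", aln], None))
        steps.append(([HMMBUILD, hmm, aln], None))
        steps.append(([HMMSEARCH, hmm, database], results))
    return steps


def run_pipeline(steps):
    """Run each planned step in order; check=True raises if a tool exits non-zero."""
    for argv, stdout_path in steps:
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdout_path, "w") as handle:
                subprocess.run(argv, check=True, stdout=handle)
```

Separating planning from running makes it possible to print and inspect the commands before anything is executed.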
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,36 @@ | ||
| import numpy | ||
| import pandas | ||
| import re | ||
|
|
||
| # open files to read and write | ||
| infile=open("motifsort.fasta","r") | ||
| out1=open("motif1.fasta","w") | ||
| out2=open("motif2.fasta","w") | ||
| null=open("null.fasta","w") | ||
|
|
||
| # define the motif patterns (matched below as plain substrings, not regexes) | ||
| Pat1=r"AKKPRVZE" | ||
| Pat2=r"AAQWWRNYGG" | ||
|
|
||
| for line in infile: | ||
|     line=line.strip() | ||
|     if not line: | ||
|         continue  # skip blank lines so line[0] cannot raise IndexError | ||
|     if line[0] == ">": | ||
|         seqid=line | ||
|     else: | ||
|         if Pat1 in line: | ||
|             out1.write(seqid + "\n") | ||
|             out1.write(line + "\n") | ||
|         elif Pat2 in line: | ||
|             out2.write(seqid + "\n") | ||
|             out2.write(line + "\n") | ||
|         else: | ||
|             null.write(seqid + "\n") | ||
|             null.write(line + "\n") | ||
|
|
||
| #Close files | ||
| infile.close() | ||
| out1.close() | ||
| out2.close() | ||
| null.close() | ||
|
|
||
|
|
||
|
Owner
Good job |
||
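One caveat with the line-by-line approach is that a motif split across wrapped sequence lines would be missed. A minimal sketch that joins wrapped lines before matching is shown here; the motif strings come from the script above, while `read_fasta` and `sort_by_motif` are hypothetical helper names, and it assumes records should go to the first motif that matches, as in the original `if`/`elif` chain.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs, joining wrapped sequence lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def sort_by_motif(records, motifs):
    """Group records under the first motif found in the sequence, else under 'null'."""
    bins = {name: [] for name in motifs}
    bins["null"] = []
    for header, seq in records:
        for name, motif in motifs.items():
            if motif in seq:
                bins[name].append((header, seq))
                break
        else:
            # no motif matched this sequence
            bins["null"].append((header, seq))
    return bins


if __name__ == "__main__":
    motifs = {"motif1": "AKKPRVZE", "motif2": "AAQWWRNYGG"}
    example = [">a", "XXAKKP", "RVZEYY", ">b", "MMMM"]
    for name, records in sort_by_motif(read_fasta(example), motifs).items():
        print(name, len(records))
```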
Large diffs are not rendered by default.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,131 @@ | ||
| # hmmsearch :: search profile(s) against a sequence database | ||
| # HMMER 3.1b2 (February 2015); http://hmmer.org/ | ||
| # Copyright (C) 2015 Howard Hughes Medical Institute. | ||
| # Freely distributed under the GNU General Public License (GPLv3). | ||
| # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - | ||
| # query HMM file: ../sigma70hmm.hmm | ||
| # target sequence database: ../Roseobacter.fasta | ||
| # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - | ||
|
|
||
| Query: sigma70 [M=70] | ||
| Scores for complete sequences (score includes all domains): | ||
| --- full sequence --- --- best 1 domain --- -#dom- | ||
| E-value score bias E-value score bias exp N Sequence Description | ||
| ------- ------ ----- ------- ------ ----- ---- -- -------- ----------- | ||
| 5.4e-22 75.3 0.4 1.1e-21 74.3 0.4 1.6 1 tr|B7RH33|B7RH33_9RHOB | ||
| 1.2e-18 64.7 0.1 2.5e-18 63.6 0.1 1.6 1 tr|B7RGX5|B7RGX5_9RHOB | ||
| 2.2e-18 63.8 0.2 5.2e-18 62.6 0.2 1.7 1 tr|B7RH17|B7RH17_9RHOB | ||
| 4.7e-18 62.7 0.4 1.8e-17 60.9 0.1 2.0 2 tr|B7RI57|B7RI57_9RHOB | ||
| 4.8e-18 62.7 0.2 8.4e-18 61.9 0.2 1.4 1 tr|B7RJL9|B7RJL9_9RHOB | ||
| 3.7e-16 56.6 0.1 7.4e-16 55.7 0.1 1.5 1 tr|B7RH51|B7RH51_9RHOB | ||
| 7.7e-14 49.2 0.4 1.4e-13 48.4 0.4 1.4 1 tr|B7RSA6|B7RSA6_9RHOB | ||
|
|
||
|
|
||
| Domain annotation for each sequence (and alignments): | ||
| >> tr|B7RH33|B7RH33_9RHOB | ||
| # score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc | ||
| --- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ---- | ||
| 1 ! 74.3 0.4 1.9e-24 1.1e-21 1 69 [. 437 506 .. 437 507 .. 0.97 | ||
|
|
||
| Alignments for each domain: | ||
| == domain 1 score: 74.3 bits; conditional E-value: 1.9e-24 | ||
| sigma70 1 lverylplvrriarrllgdgadaeDLvQegflrllraierfdpek.asfstwlyriarnaiidylRkarr 69 | ||
| +ve++l+lv +ia++++++g ++ DL+Qeg+++l++a+++f+ ++ ++fst++++++r+ai + + +++r | ||
| tr|B7RH33|B7RH33_9RHOB 437 MVEANLRLVISIAKKYTNRGLQFLDLIQEGNIGLMKAVDKFEYRRgYKFSTYATWWIRQAITRSIADQAR 506 | ||
| 589***************************************************************9998 PP | ||
|
|
||
| >> tr|B7RGX5|B7RGX5_9RHOB | ||
| # score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc | ||
| --- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ---- | ||
| 1 ! 63.6 0.1 4.2e-21 2.5e-18 1 68 [. 30 98 .. 30 100 .. 0.97 | ||
|
|
||
| Alignments for each domain: | ||
| == domain 1 score: 63.6 bits; conditional E-value: 4.2e-21 | ||
| sigma70 1 lverylplvrriarrllgdgadaeDLvQegflrllraierfdpek.asfstwlyriarnaiidylRkar 68 | ||
| l+++y +l + a ++ ++ga + DL+Qe+ l+l++a+++fdp++ ++fst+++++++ i d++ +++ | ||
| tr|B7RGX5|B7RGX5_9RHOB 30 LITAYMRLAISMAGKFKRYGAPMNDLIQEAGLGLMKAADKFDPDRgVRFSTYAVWWIKASIQDHVMRNW 98 | ||
| 7999************************************************************99976 PP | ||
|
|
||
| >> tr|B7RH17|B7RH17_9RHOB | ||
| # score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc | ||
| --- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ---- | ||
| 1 ! 62.6 0.2 8.7e-21 5.2e-18 5 69 .. 274 338 .. 270 339 .. 0.96 | ||
|
|
||
| Alignments for each domain: | ||
| == domain 1 score: 62.6 bits; conditional E-value: 8.7e-21 | ||
| sigma70 5 ylplvrriarrllgdgadaeDLvQegflrllraierfdpekasfstwlyriarnaiidylRkarr 69 | ||
| p+v++ a r+lgd+ +aeD++Q++++rl++ ++ ++ a++stwlyr++ n++ d+lR+++r | ||
| tr|B7RH17|B7RH17_9RHOB 274 LTPRVFGHAFRVLGDSSEAEDVTQDALMRLWKIAPDWRIGEAKVSTWLYRVVANLCTDRLRRRGR 338 | ||
| 679**************************************99*******************987 PP | ||
|
|
||
| >> tr|B7RI57|B7RI57_9RHOB | ||
| # score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc | ||
| --- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ---- | ||
| 1 ! 60.9 0.1 3e-20 1.8e-17 7 69 .. 30 92 .. 24 93 .. 0.92 | ||
| 2 ? -1.0 0.0 0.63 3.7e+02 38 54 .. 153 169 .. 150 172 .. 0.70 | ||
|
|
||
| Alignments for each domain: | ||
| == domain 1 score: 60.9 bits; conditional E-value: 3e-20 | ||
| sigma70 7 plvrriarrllgdgadaeDLvQegflrllraierfdpekasfstwlyriarnaiidylRkarr 69 | ||
| +++++ r+l+++a aeD +Q++f++++ +++r+ ++ s+ twl++iarn+ id+lR +++ | ||
| tr|B7RI57|B7RI57_9RHOB 30 AKLFGVCLRVLNKRAAAEDAMQDTFVKIWNNADRYHSNGLSPMTWLITIARNTSIDRLRARKK 92 | ||
| 67999**********************************77******************9875 PP | ||
|
|
||
| == domain 2 score: -1.0 bits; conditional E-value: 0.63 | ||
| sigma70 38 ierfdpekasfstwlyr 54 | ||
| ++rf + +twl r | ||
| tr|B7RI57|B7RI57_9RHOB 153 ADRFGVPLNTMRTWLRR 169 | ||
| 57777777447888765 PP | ||
|
|
||
| >> tr|B7RJL9|B7RJL9_9RHOB | ||
| # score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc | ||
| --- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ---- | ||
| 1 ! 61.9 0.2 1.4e-20 8.4e-18 4 69 .. 12 76 .. 10 77 .. 0.96 | ||
|
|
||
| Alignments for each domain: | ||
| == domain 1 score: 61.9 bits; conditional E-value: 1.4e-20 | ||
| sigma70 4 rylplvrriarrllgdgadaeDLvQegflrllraierfdpekasfstwlyriarnaiidylRkarr 69 | ||
| ++lp +r++a +l+++ga a+D vQ+++++++ +i++f+ ++ ++wl++i+rn+ + Rka+r | ||
| tr|B7RJL9|B7RJL9_9RHOB 12 EHLPAMRAFAISLTRNGAIADDMVQDTLVKAWTNIDKFEVG-TNMRAWLFTILRNTYYSSRRKANR 76 | ||
| 69***************************************.69*******************998 PP | ||
|
|
||
| >> tr|B7RH51|B7RH51_9RHOB | ||
| # score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc | ||
| --- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ---- | ||
| 1 ! 55.7 0.1 1.2e-18 7.4e-16 2 66 .. 54 119 .. 53 122 .. 0.95 | ||
|
|
||
| Alignments for each domain: | ||
| == domain 1 score: 55.7 bits; conditional E-value: 1.2e-18 | ||
| sigma70 2 verylplvrriarrllgdgadaeDLvQegflrllraierfdpek.asfstwlyriarnaiidylRk 66 | ||
| v+++l+l +ia+ + g+g ++++e++++l++a++rfdpek ++++t++++++r i y+ + | ||
| tr|B7RH51|B7RH51_9RHOB 54 VTSHLRLAAKIAMGYRGYGLPQAEVISEANVGLMQAVKRFDPEKgFRLATYAMWWIRASIQEYILR 119 | ||
| 889**********************************************************99977 PP | ||
|
|
||
| >> tr|B7RSA6|B7RSA6_9RHOB | ||
| # score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc | ||
| --- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ---- | ||
| 1 ! 48.4 0.4 2.3e-16 1.4e-13 6 69 .. 42 105 .. 38 106 .. 0.95 | ||
|
|
||
| Alignments for each domain: | ||
| == domain 1 score: 48.4 bits; conditional E-value: 2.3e-16 | ||
| sigma70 6 lplvrriarrllgdgadaeDLvQegflrllraierfdpekasfstwlyriarnaiidylRkarr 69 | ||
| ++++++a+r+l+++ ae+ +Q++++ ++r++ + ++s + w+y i+rn++++ lR+ +r | ||
| tr|B7RSA6|B7RSA6_9RHOB 42 GRQLLGVAYRILRRQDLAEEALQDAMVQVWRKAGTQGAGSGSARGWIYAILRNRCLNILRDGKR 105 | ||
| 689***********************************9999******************9776 PP | ||
|
|
||
|
|
||
|
|
||
| Internal pipeline statistics summary: | ||
| ------------------------------------- | ||
| Query model(s): 1 (70 nodes) | ||
| Target sequences: 4185 (1301909 residues searched) | ||
| Passed MSV filter: 119 (0.0284349); expected 83.7 (0.02) | ||
| Passed bias filter: 101 (0.0241338); expected 83.7 (0.02) | ||
| Passed Vit filter: 13 (0.00310633); expected 4.2 (0.001) | ||
| Passed Fwd filter: 7 (0.00167264); expected 0.0 (1e-05) | ||
| Initial search space (Z): 4185 [actual number of targets] | ||
| Domain search space (domZ): 7 [number of targets reported over threshold] | ||
| # CPU time: 0.02u 0.00s 00:00:00.01 Elapsed: 00:00:00.01 | ||
| # Mc/sec: 7594.47 | ||
| // | ||
| [ok] |
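The numbers in the pipeline summary appear to be consistent with simple arithmetic: each parenthesised fraction is the number of sequences that passed divided by the 4185 targets, and each "expected" count is the listed rate times 4185. A small check, with all values copied from the summary above and `filter_report` being a hypothetical helper name:

```python
TARGETS = 4185  # "Target sequences" in the summary above

# (filter name, sequences passed, expected pass rate), copied from the summary
FILTERS = [
    ("MSV", 119, 0.02),
    ("bias", 101, 0.02),
    ("Vit", 13, 0.001),
    ("Fwd", 7, 1e-05),
]


def filter_report(targets, filters):
    """Return (name, observed fraction, expected count) for each filter stage."""
    return [(name, passed / targets, rate * targets) for name, passed, rate in filters]


for name, fraction, expected in filter_report(TARGETS, FILTERS):
    print(f"{name:>4}: observed fraction {fraction:.6g}, expected count {expected:.1f}")
```

Seven sequences survive the final Fwd filter against an expectation of well under one, which matches the seven hits reported in the table.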
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,8 @@ | ||
| Incorrect number of command line arguments. | ||
| Usage: hmmsearch [options] <hmmfile> <seqdb> | ||
|
|
||
| where most common options are: | ||
| -h : show brief help on version and usage | ||
|
|
||
| To see more help on available options, do ./hmmsearch -h | ||
|
|
You need to generate a single results file that contains proteome, HMM result and e-value
cat *.hmms | grep -v "#" | awk '{print $1,$3,$5}' | sed -E 's/tr\|[A-Z0-9]+\|[A-Z0-9]+_9//g' > hmmOut.txt
-0.1
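For comparison, a minimal Python sketch of the kind of combined results file the comment asks for. It reads hmmsearch text output, keeps only rows of the per-sequence score table, and writes one line per hit. Treating the suffix of the sequence name as the proteome label is an assumption that mirrors the `sed` expression in the comment; `summarize_hits` and `write_summary` are hypothetical helper names.

```python
import re

# Strips the "tr|ACCESSION|ACCESSION_9" prefix, as the sed expression above does.
NAME_PREFIX = re.compile(r"^tr\|[A-Z0-9]+\|[A-Z0-9]+_9")


def is_float(text):
    """Return True if text parses as a float."""
    try:
        float(text)
    except ValueError:
        return False
    return True


def summarize_hits(lines):
    """Return (proteome, hmm, e_value) tuples from hmmsearch text output."""
    rows, query = [], "unknown"
    for line in lines:
        fields = line.split()
        if line.startswith("Query:") and len(fields) >= 2:
            query = fields[1]
        # Score-table rows have seven numeric columns, a domain count, then the name.
        elif (len(fields) >= 9
              and all(is_float(f) for f in fields[:7])
              and fields[7].isdigit()):
            rows.append((NAME_PREFIX.sub("", fields[8]), query, fields[0]))
    return rows


def write_summary(result_paths, out_path="hmmOut.txt"):
    """Write one 'proteome hmm e_value' line per hit across all result files."""
    with open(out_path, "w") as out:
        for path in result_paths:
            with open(path) as handle:
                for row in summarize_hits(handle):
                    out.write(" ".join(row) + "\n")
```

A call such as `write_summary(["sigma.hmms", "sporecoat.hmms", "transporter.hmms"])` would cover the three result files produced by the shell script.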