Binary file added Exercise-Files/.DS_Store
Binary file not shown.
13 changes: 13 additions & 0 deletions Exercise-Files/Ex11Q1-1.sh
@@ -0,0 +1,13 @@
#!/usr/bin/env bash

# Align each reference set with MUSCLE, then build an HMM from the alignment
for i in Problem1/*.ref
do
    ../../.././muscle3.8.31_i86win32 -in "$i" -out "$i.aln"
    ../../../hmmer-3.1b2-cygwin64/binaries/./hmmbuild "$i.hmm" "$i.aln"
done
# Pool all proteomes; overwrite rather than append so reruns do not duplicate sequences
cat Problem1/*.fasta > all-files.fasta
# Search the pooled proteomes with each of the three HMMs
../../../hmmer-3.1b2-cygwin64/binaries/./hmmsearch Problem1/sigma.ref.hmm all-files.fasta > sigma.hmms
../../../hmmer-3.1b2-cygwin64/binaries/./hmmsearch Problem1/sporecoat.ref.hmm all-files.fasta > sporecoat.hmms
../../../hmmer-3.1b2-cygwin64/binaries/./hmmsearch Problem1/transporter.ref.hmm all-files.fasta > transporter.hmms


@lyy005 (Owner), Dec 11, 2017:

You need to generate a single results file that contains the proteome, the HMM result, and the e-value, for example:
cat *.hmms | grep -v "#" | awk '{print $1,$3,$5}' | sed -E 's/tr\|[A-Z0-9]+\|[A-Z0-9]+_9//g' > hmmOut.txt

-0.1
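The same summary step can be sketched in Python. This is only an illustration of the idea in the comment above, not the expected solution. It assumes the `.hmms` files hold whitespace-delimited tabular rows (as written by hmmsearch's `--tblout` option) in which column 1 is the target name, column 3 the query name, and column 5 the full-sequence E-value, which is what the `awk` column choice implies. The helper name `summarize_tblout` and the sample row are hypothetical.

```python
import re

# Strips the UniProt-style prefix "tr|ACCESSION|ENTRY_9" from a hit name,
# leaving the organism code (e.g. "RHOB") as a stand-in for the proteome.
ID_PREFIX = re.compile(r"^tr\|[A-Z0-9]+\|[A-Z0-9]+_9")


def summarize_tblout(lines):
    """Return (proteome, hmm_name, e_value) tuples from tabular hit lines."""
    rows = []
    for line in lines:
        # Skip blank lines and the '#' comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        target, query, evalue = fields[0], fields[2], fields[4]
        rows.append((ID_PREFIX.sub("", target), query, evalue))
    return rows


# Illustrative row assembled from values in this pull request's output.
sample = [
    "# target name accession query name accession E-value score bias",
    "tr|B7RH33|B7RH33_9RHOB - sigma70 - 5.4e-22 75.3 0.4",
]
for proteome, hmm_name, evalue in summarize_tblout(sample):
    print(proteome, hmm_name, evalue)
```

To build the single results file, the rows from every `.hmms` file would be written to one output such as `hmmOut.txt`. Keeping the E-value as a string avoids reformatting the number that hmmsearch reported.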

36 changes: 36 additions & 0 deletions Exercise-Files/Ex11Q2.py
@@ -0,0 +1,36 @@
import numpy
import pandas
import re

# open files to read and write
infile=open("motifsort.fasta","r")
out1=open("motif1.fasta","w")
out2=open("motif2.fasta","w")
null=open("null.fasta","w")

# define the motif patterns (plain strings, matched as substrings below)
Pat1=r"AKKPRVZE"
Pat2=r"AAQWWRNYGG"

for line in infile:
    line=line.strip()
    if not line:
        continue  # skip blank lines so line[0] cannot raise IndexError
    if line[0] == ">":
        seqid=line
    else:
        if Pat1 in line:
            out1.write(seqid + "\n")
            out1.write(line + "\n")
        elif Pat2 in line:
            out2.write(seqid + "\n")
            out2.write(line + "\n")
        else:
            null.write(seqid + "\n")
            null.write(line + "\n")

#Close files
infile.close()
out1.close()
out2.close()
null.close()
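The script above tests each line on its own, so it relies on every sequence sitting on a single line. A possible variant, sketched here under the assumption that sequences may be wrapped across several lines, joins the lines of each record before matching and uses compiled regular expressions. The function names `read_fasta` and `classify` are hypothetical; the two motifs are the ones defined in the script.

```python
import re

MOTIF1 = re.compile(r"AKKPRVZE")
MOTIF2 = re.compile(r"AAQWWRNYGG")


def read_fasta(lines):
    """Yield (header, sequence) pairs, joining wrapped sequence lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def classify(sequence):
    """Return 'motif1', 'motif2', or 'null', checking motif 1 first."""
    if MOTIF1.search(sequence):
        return "motif1"
    if MOTIF2.search(sequence):
        return "motif2"
    return "null"
```

The order of the checks mirrors the `if`/`elif` chain in the script: a sequence containing both motifs is still assigned to the first one.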


Owner:

Good job

22,215 changes: 22,215 additions & 0 deletions Exercise-Files/Problem1/Arthrobacter.fasta

Large diffs are not rendered by default.

23,952 changes: 23,952 additions & 0 deletions Exercise-Files/Problem1/Bacillus.fasta

Large diffs are not rendered by default.

11,021 changes: 11,021 additions & 0 deletions Exercise-Files/Problem1/Clostridium.fasta

Large diffs are not rendered by default.

31,988 changes: 31,988 additions & 0 deletions Exercise-Files/Problem1/Flavobacterium.fasta

Large diffs are not rendered by default.

21,882 changes: 21,882 additions & 0 deletions Exercise-Files/Problem1/Limnohabitans.fasta

Large diffs are not rendered by default.

32,137 changes: 32,137 additions & 0 deletions Exercise-Files/Problem1/Rhizobium.fasta

Large diffs are not rendered by default.

27,930 changes: 27,930 additions & 0 deletions Exercise-Files/Problem1/Roseobacter.fasta

Large diffs are not rendered by default.

12,477 changes: 12,477 additions & 0 deletions Exercise-Files/Problem1/Verrucomicrobia.fasta

Large diffs are not rendered by default.

Binary file added Exercise-Files/Problem1/binaries/alimask.exe
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/esl-afetch.exe
Binary file not shown.
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/esl-alimap.exe
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/esl-alimask.exe
Binary file not shown.
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/esl-alipid.exe
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/esl-alistat.exe
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/esl-cluster.exe
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/esl-mask.exe
Binary file not shown.
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/esl-selectn.exe
Binary file not shown.
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/esl-seqstat.exe
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/esl-sfetch.exe
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/esl-shuffle.exe
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/esl-ssdraw.exe
Binary file not shown.
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/esl-weight.exe
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/hmmalign.exe
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/hmmbuild.exe
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/hmmc2.exe
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/hmmconvert.exe
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/hmmemit.exe
Binary file not shown.
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/hmmfetch.exe
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/hmmlogo.exe
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/hmmpgmd.exe
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/hmmpress.exe
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/hmmscan.exe
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/hmmsearch.exe
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/hmmsim.exe
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/hmmstat.exe
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/jackhmmer.exe
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/makehmmerdb.exe
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/nhmmer.exe
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/nhmmscan.exe
Binary file not shown.
Binary file added Exercise-Files/Problem1/binaries/phmmer.exe
Binary file not shown.
131 changes: 131 additions & 0 deletions Exercise-Files/Problem1/binaries/works.out
@@ -0,0 +1,131 @@
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.1b2 (February 2015); http://hmmer.org/
# Copyright (C) 2015 Howard Hughes Medical Institute.
# Freely distributed under the GNU General Public License (GPLv3).
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file: ../sigma70hmm.hmm
# target sequence database: ../Roseobacter.fasta
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query: sigma70 [M=70]
Scores for complete sequences (score includes all domains):
--- full sequence --- --- best 1 domain --- -#dom-
E-value score bias E-value score bias exp N Sequence Description
------- ------ ----- ------- ------ ----- ---- -- -------- -----------
5.4e-22 75.3 0.4 1.1e-21 74.3 0.4 1.6 1 tr|B7RH33|B7RH33_9RHOB
1.2e-18 64.7 0.1 2.5e-18 63.6 0.1 1.6 1 tr|B7RGX5|B7RGX5_9RHOB
2.2e-18 63.8 0.2 5.2e-18 62.6 0.2 1.7 1 tr|B7RH17|B7RH17_9RHOB
4.7e-18 62.7 0.4 1.8e-17 60.9 0.1 2.0 2 tr|B7RI57|B7RI57_9RHOB
4.8e-18 62.7 0.2 8.4e-18 61.9 0.2 1.4 1 tr|B7RJL9|B7RJL9_9RHOB
3.7e-16 56.6 0.1 7.4e-16 55.7 0.1 1.5 1 tr|B7RH51|B7RH51_9RHOB
7.7e-14 49.2 0.4 1.4e-13 48.4 0.4 1.4 1 tr|B7RSA6|B7RSA6_9RHOB


Domain annotation for each sequence (and alignments):
>> tr|B7RH33|B7RH33_9RHOB
# score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc
--- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ----
1 ! 74.3 0.4 1.9e-24 1.1e-21 1 69 [. 437 506 .. 437 507 .. 0.97

Alignments for each domain:
== domain 1 score: 74.3 bits; conditional E-value: 1.9e-24
sigma70 1 lverylplvrriarrllgdgadaeDLvQegflrllraierfdpek.asfstwlyriarnaiidylRkarr 69
+ve++l+lv +ia++++++g ++ DL+Qeg+++l++a+++f+ ++ ++fst++++++r+ai + + +++r
tr|B7RH33|B7RH33_9RHOB 437 MVEANLRLVISIAKKYTNRGLQFLDLIQEGNIGLMKAVDKFEYRRgYKFSTYATWWIRQAITRSIADQAR 506
589***************************************************************9998 PP

>> tr|B7RGX5|B7RGX5_9RHOB
# score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc
--- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ----
1 ! 63.6 0.1 4.2e-21 2.5e-18 1 68 [. 30 98 .. 30 100 .. 0.97

Alignments for each domain:
== domain 1 score: 63.6 bits; conditional E-value: 4.2e-21
sigma70 1 lverylplvrriarrllgdgadaeDLvQegflrllraierfdpek.asfstwlyriarnaiidylRkar 68
l+++y +l + a ++ ++ga + DL+Qe+ l+l++a+++fdp++ ++fst+++++++ i d++ +++
tr|B7RGX5|B7RGX5_9RHOB 30 LITAYMRLAISMAGKFKRYGAPMNDLIQEAGLGLMKAADKFDPDRgVRFSTYAVWWIKASIQDHVMRNW 98
7999************************************************************99976 PP

>> tr|B7RH17|B7RH17_9RHOB
# score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc
--- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ----
1 ! 62.6 0.2 8.7e-21 5.2e-18 5 69 .. 274 338 .. 270 339 .. 0.96

Alignments for each domain:
== domain 1 score: 62.6 bits; conditional E-value: 8.7e-21
sigma70 5 ylplvrriarrllgdgadaeDLvQegflrllraierfdpekasfstwlyriarnaiidylRkarr 69
p+v++ a r+lgd+ +aeD++Q++++rl++ ++ ++ a++stwlyr++ n++ d+lR+++r
tr|B7RH17|B7RH17_9RHOB 274 LTPRVFGHAFRVLGDSSEAEDVTQDALMRLWKIAPDWRIGEAKVSTWLYRVVANLCTDRLRRRGR 338
679**************************************99*******************987 PP

>> tr|B7RI57|B7RI57_9RHOB
# score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc
--- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ----
1 ! 60.9 0.1 3e-20 1.8e-17 7 69 .. 30 92 .. 24 93 .. 0.92
2 ? -1.0 0.0 0.63 3.7e+02 38 54 .. 153 169 .. 150 172 .. 0.70

Alignments for each domain:
== domain 1 score: 60.9 bits; conditional E-value: 3e-20
sigma70 7 plvrriarrllgdgadaeDLvQegflrllraierfdpekasfstwlyriarnaiidylRkarr 69
+++++ r+l+++a aeD +Q++f++++ +++r+ ++ s+ twl++iarn+ id+lR +++
tr|B7RI57|B7RI57_9RHOB 30 AKLFGVCLRVLNKRAAAEDAMQDTFVKIWNNADRYHSNGLSPMTWLITIARNTSIDRLRARKK 92
67999**********************************77******************9875 PP

== domain 2 score: -1.0 bits; conditional E-value: 0.63
sigma70 38 ierfdpekasfstwlyr 54
++rf + +twl r
tr|B7RI57|B7RI57_9RHOB 153 ADRFGVPLNTMRTWLRR 169
57777777447888765 PP

>> tr|B7RJL9|B7RJL9_9RHOB
# score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc
--- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ----
1 ! 61.9 0.2 1.4e-20 8.4e-18 4 69 .. 12 76 .. 10 77 .. 0.96

Alignments for each domain:
== domain 1 score: 61.9 bits; conditional E-value: 1.4e-20
sigma70 4 rylplvrriarrllgdgadaeDLvQegflrllraierfdpekasfstwlyriarnaiidylRkarr 69
++lp +r++a +l+++ga a+D vQ+++++++ +i++f+ ++ ++wl++i+rn+ + Rka+r
tr|B7RJL9|B7RJL9_9RHOB 12 EHLPAMRAFAISLTRNGAIADDMVQDTLVKAWTNIDKFEVG-TNMRAWLFTILRNTYYSSRRKANR 76
69***************************************.69*******************998 PP

>> tr|B7RH51|B7RH51_9RHOB
# score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc
--- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ----
1 ! 55.7 0.1 1.2e-18 7.4e-16 2 66 .. 54 119 .. 53 122 .. 0.95

Alignments for each domain:
== domain 1 score: 55.7 bits; conditional E-value: 1.2e-18
sigma70 2 verylplvrriarrllgdgadaeDLvQegflrllraierfdpek.asfstwlyriarnaiidylRk 66
v+++l+l +ia+ + g+g ++++e++++l++a++rfdpek ++++t++++++r i y+ +
tr|B7RH51|B7RH51_9RHOB 54 VTSHLRLAAKIAMGYRGYGLPQAEVISEANVGLMQAVKRFDPEKgFRLATYAMWWIRASIQEYILR 119
889**********************************************************99977 PP

>> tr|B7RSA6|B7RSA6_9RHOB
# score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc
--- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ----
1 ! 48.4 0.4 2.3e-16 1.4e-13 6 69 .. 42 105 .. 38 106 .. 0.95

Alignments for each domain:
== domain 1 score: 48.4 bits; conditional E-value: 2.3e-16
sigma70 6 lplvrriarrllgdgadaeDLvQegflrllraierfdpekasfstwlyriarnaiidylRkarr 69
++++++a+r+l+++ ae+ +Q++++ ++r++ + ++s + w+y i+rn++++ lR+ +r
tr|B7RSA6|B7RSA6_9RHOB 42 GRQLLGVAYRILRRQDLAEEALQDAMVQVWRKAGTQGAGSGSARGWIYAILRNRCLNILRDGKR 105
689***********************************9999******************9776 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s): 1 (70 nodes)
Target sequences: 4185 (1301909 residues searched)
Passed MSV filter: 119 (0.0284349); expected 83.7 (0.02)
Passed bias filter: 101 (0.0241338); expected 83.7 (0.02)
Passed Vit filter: 13 (0.00310633); expected 4.2 (0.001)
Passed Fwd filter: 7 (0.00167264); expected 0.0 (1e-05)
Initial search space (Z): 4185 [actual number of targets]
Domain search space (domZ): 7 [number of targets reported over threshold]
# CPU time: 0.02u 0.00s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 7594.47
//
[ok]
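The numbers in the pipeline statistics summary can be cross-checked with a little arithmetic. The sketch below recomputes the pass fractions and the expected counts from the 4185 target sequences. It also illustrates one reading of the two domain E-values: that the independent E-value is the conditional E-value rescaled from the domain search space (domZ = 7) to the full search space (Z = 4185). That scaling is an assumption here, although it agrees with the values reported above. All function names are hypothetical.

```python
# Values copied from the "Internal pipeline statistics summary" above.
TARGETS = 4185
PASSED = {"MSV": 119, "bias": 101, "Vit": 13, "Fwd": 7}
EXPECTED_RATE = {"MSV": 0.02, "bias": 0.02, "Vit": 0.001, "Fwd": 1e-05}


def pass_fraction(count, total):
    """Fraction of target sequences that passed a filter stage."""
    return count / total


def expected_count(rate, total):
    """Number of sequences expected to pass at the given rate."""
    return rate * total


def independent_evalue(conditional_evalue, z, dom_z):
    """Assumed relation: rescale the c-Evalue from domZ to Z."""
    return conditional_evalue * z / dom_z


for stage, count in PASSED.items():
    print(f"{stage}: {pass_fraction(count, TARGETS):.6g} observed, "
          f"{expected_count(EXPECTED_RATE[stage], TARGETS):.1f} expected")

# First domain of the top hit: c-Evalue 1.9e-24, Z = 4185, domZ = 7.
print(f"{independent_evalue(1.9e-24, 4185, 7):.2g}")
```

The observed counts at the later filter stages are well above the expected counts, which is what a proteome with several genuine matches to the model should produce.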
8 changes: 8 additions & 0 deletions Exercise-Files/Problem1/binaries/works1.out
@@ -0,0 +1,8 @@
Incorrect number of command line arguments.
Usage: hmmsearch [options] <hmmfile> <seqdb>

where most common options are:
-h : show brief help on version and usage

To see more help on available options, do ./hmmsearch -h

Binary file not shown.