38 changes: 38 additions & 0 deletions bioinformaticsProject/BioinformaticsProject.sh
@@ -0,0 +1,38 @@
#Searches a set of proteomes for the mcrA and hsp70 genes and writes the results as a table to "Summary.csv". Also writes "Candidates.txt", a list of the proteomes that are candidates for the growth experiments.
#Note: The paths to the tools (muscle, bin/hmmbuild, bin/hmmsearch) and to the input files are relative to the project directory, so they may need adjusting on other systems.
#Usage: bash BioinformaticsProject.sh (run from the project directory)

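#Combine the reference sequences for each gene into one file, then align them with muscle
cd ref_sequences || exit 1
#Remove alignments left by a previous run so the globs below do not pick them up
rm -f mcrAgene_muscle.fasta hsp70gene_muscle.fasta
cat mcrAgene_*.fasta > mcrAgene.fasta
cat hsp70gene_*.fasta > hsp70gene.fasta
../muscle -in mcrAgene.fasta -out mcrAgene_muscle.fasta
../muscle -in hsp70gene.fasta -out hsp70gene_muscle.fasta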

#Build a profile for the mcrA and hsp70 genes
../bin/hmmbuild --amino mcrAgene_hmm.txt mcrAgene_muscle.fasta
../bin/hmmbuild --amino hsp70gene_hmm.txt hsp70gene_muscle.fasta

#Search each proteome for the mcrA gene (which identifies methanogens) and for the hsp70 gene
cd ../proteomes || exit 1
for number in proteome_*.fasta
do
#Extract the proteome number, e.g. proteome_01.fasta -> 01
name=$(echo "$number" | cut -d_ -f 2 | cut -d. -f 1)
../bin/hmmsearch --tblout "${name}resultsmcrAgene_match" ../ref_sequences/mcrAgene_hmm.txt "$number"
../bin/hmmsearch --tblout "${name}resultshsp70gene_match" ../ref_sequences/hsp70gene_hmm.txt "$number"
done

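#Produce a table with proteome number, number of mcrA matches, number of hsp70 matches
#Each hit is one non-comment line in the --tblout file; comment lines start with '#'
#Summary.csv is written once per run, so rerunning the script overwrites it rather than appending duplicate rows
for n in {01..50}
do
column2=$(grep -vc '^#' "$n"resultsmcrAgene_match)
column3=$(grep -vc '^#' "$n"resultshsp70gene_match)
echo "Proteome $n" "$column2" "$column3"
done > ../Summary.csv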

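#Provide a list of candidate pH-resistant methanogens
#Based on the results, the cutoff for sufficient pH resistance was set at 1 copy of hsp70. The average number of hsp70 copies was 2.62 across the proteomes, but only 14/50 proteomes had exactly one mcrA match and were treated as methanogens. Requiring just one copy of hsp70 allows you to proceed with a larger sample for the growth experiments.
#Note: Average was calculated by awk '{ total += $4; count++ } END { print total/count }' Summary.csv (field 4 is the hsp70 column, because "Proteome" and the proteome number are fields 1 and 2)
cd ..
#Keep rows with exactly one mcrA match (field 3) and at least one hsp70 match (field 4).
#Rows with two mcrA matches (proteomes 19 and 23 in Summary.csv) are not selected.
awk '$3 == 1 && $4 >= 1' Summary.csv > Candidates.txt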
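
The cutoff reasoning in the comments above can be explored with a small field-based filter. This is a minimal sketch, not the script's own method: `select_candidates` is a hypothetical helper, the demo rows are copied from Summary.csv, and the sketch assumes that any proteome with one or more mcrA matches counts as a methanogen.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script).
# Prints rows of a whitespace-separated table laid out like Summary.csv
# (label, proteome number, mcrA matches, hsp70 matches) that have at
# least one mcrA match and at least `min_hsp70` hsp70 matches.
# Usage: select_candidates <min_hsp70> <table>
select_candidates() {
    min_hsp70=$1
    table=$2
    awk -v min="$min_hsp70" '$3 >= 1 && $4 >= min' "$table"
}

# Demo table: five rows taken from Summary.csv.
demo=$(mktemp)
cat > "$demo" <<'EOF'
Proteome 01 0 4
Proteome 03 1 3
Proteome 05 1 2
Proteome 15 1 1
Proteome 29 1 0
EOF

echo "hsp70 cutoff 1:"
select_candidates 1 "$demo"
echo "hsp70 cutoff 3:"
select_candidates 3 "$demo"
rm -f "$demo"
```

Comparing fields by value avoids accidental matches that a text pattern can make across neighbouring columns, and raising the cutoff shows how quickly the candidate list shrinks.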

13 changes: 13 additions & 0 deletions bioinformaticsProject/Candidates.txt
@@ -0,0 +1,13 @@
Proteome 03 1 3
Proteome 05 1 2
Proteome 07 1 2
Proteome 15 1 1
Proteome 16 1 1
Proteome 24 1 2
Proteome 38 1 1
Proteome 39 1 1
Proteome 42 1 3
Proteome 44 1 1
Proteome 45 1 3
Proteome 48 1 1
Proteome 50 1 3
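
Because Candidates.txt is derived from Summary.csv, every candidate row should also appear verbatim in the summary table. A hedged sketch of that sanity check follows; `rows_missing_from` is a hypothetical helper name and the demo files are temporary stand-ins for the two real files.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script).
# Prints every line of the candidates file that does not appear as a
# whole, identical line in the summary file.
# Usage: rows_missing_from <summary> <candidates>
rows_missing_from() {
    # -F fixed strings, -x whole-line match, -v invert, -f patterns from file
    grep -F -x -v -f "$1" "$2"
}

# Demo with rows copied from the tables in this diff.
summary=$(mktemp)
candidates=$(mktemp)
cat > "$summary" <<'EOF'
Proteome 02 0 2
Proteome 03 1 3
Proteome 05 1 2
EOF
cat > "$candidates" <<'EOF'
Proteome 03 1 3
Proteome 05 1 2
EOF

if [ -z "$(rows_missing_from "$summary" "$candidates")" ]; then
    echo "all candidate rows are present in the summary"
fi
rm -f "$summary" "$candidates"
```

On the real files the call would be `rows_missing_from Summary.csv Candidates.txt`; empty output means the two tables agree.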
50 changes: 50 additions & 0 deletions bioinformaticsProject/Summary.csv
@@ -0,0 +1,50 @@
Proteome 01 0 4
Proteome 02 0 2
Proteome 03 1 3
Proteome 04 0 4
Proteome 05 1 2
Proteome 06 0 0
Proteome 07 1 2
Proteome 08 0 5
Proteome 09 0 1
Proteome 10 0 3
Proteome 11 0 6
Proteome 12 0 6
Proteome 13 0 3
Proteome 14 0 2
Proteome 15 1 1
Proteome 16 1 1
Proteome 17 0 4
Proteome 18 0 8
Proteome 19 2 1
Proteome 20 0 3
Proteome 21 0 5
Proteome 22 0 9
Proteome 23 2 2
Proteome 24 1 2
Proteome 25 0 5
Proteome 26 0 1
Proteome 27 0 1
Proteome 28 0 1
Proteome 29 1 0
Proteome 30 0 1
Proteome 31 0 7
Proteome 32 0 4
Proteome 33 0 0
Proteome 34 0 2
Proteome 35 0 1
Proteome 36 0 3
Proteome 37 0 1
Proteome 38 1 1
Proteome 39 1 1
Proteome 40 0 2
Proteome 41 0 1
Proteome 42 1 3
Proteome 43 0 3
Proteome 44 1 1
Proteome 45 1 3
Proteome 46 0 2
Proteome 47 0 1
Proteome 48 1 1
Proteome 49 0 3
Proteome 50 1 3
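
When summarizing this table with awk, note that "Proteome" and the proteome number are separate whitespace-delimited fields, so the mcrA counts are field 3 and the hsp70 counts are field 4. The sketch below computes a column mean under that layout; `column_mean` is a hypothetical helper and the demo uses the first five rows of the table above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script).
# Prints the mean of one whitespace-separated column of a table.
# Usage: column_mean <field number> <table>
column_mean() {
    col=$1
    table=$2
    awk -v c="$col" '{ total += $c; count++ } END { if (count) print total / count }' "$table"
}

# Demo: the first five rows of Summary.csv.
demo=$(mktemp)
cat > "$demo" <<'EOF'
Proteome 01 0 4
Proteome 02 0 2
Proteome 03 1 3
Proteome 04 0 4
Proteome 05 1 2
EOF

echo "mean mcrA matches:  $(column_mean 3 "$demo")"   # 2/5  -> 0.4
echo "mean hsp70 matches: $(column_mean 4 "$demo")"   # 15/5 -> 3
rm -f "$demo"
```

Run against the full table, `column_mean 4 Summary.csv` gives the hsp70 average, while field 3 gives the mcrA average instead.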
Binary file added bioinformaticsProject/bin/alimask
Binary file not shown.
Binary file added bioinformaticsProject/bin/hmmalign
Binary file not shown.
Binary file added bioinformaticsProject/bin/hmmbuild
Binary file not shown.
Binary file added bioinformaticsProject/bin/hmmconvert
Binary file not shown.
Binary file added bioinformaticsProject/bin/hmmemit
Binary file not shown.
Binary file added bioinformaticsProject/bin/hmmfetch
Binary file not shown.
Binary file added bioinformaticsProject/bin/hmmlogo
Binary file not shown.
Binary file added bioinformaticsProject/bin/hmmpgmd
Binary file not shown.
Binary file added bioinformaticsProject/bin/hmmpgmd_shard
Binary file not shown.
Binary file added bioinformaticsProject/bin/hmmpress
Binary file not shown.
Binary file added bioinformaticsProject/bin/hmmscan
Binary file not shown.
Binary file added bioinformaticsProject/bin/hmmsearch
Binary file not shown.
Binary file added bioinformaticsProject/bin/hmmsim
Binary file not shown.
Binary file added bioinformaticsProject/bin/hmmstat
Binary file not shown.
Binary file added bioinformaticsProject/bin/jackhmmer
Binary file not shown.
Binary file added bioinformaticsProject/bin/makehmmerdb
Binary file not shown.
Binary file added bioinformaticsProject/bin/nhmmer
Binary file not shown.
Binary file added bioinformaticsProject/bin/nhmmscan
Binary file not shown.
Binary file added bioinformaticsProject/bin/phmmer
Binary file not shown.
48 changes: 48 additions & 0 deletions bioinformaticsProject/hmmer/.gitignore
@@ -0,0 +1,48 @@
configure
config.log
config.status
Makefile
easel
*.o
*.dSYM
*.stamp
src/*_example
src/*_example?
src/*_utest
src/*_benchmark
src/*_stats
src/impl
src/impl*/*_example
src/impl*/*_utest
src/impl*/*_benchmark
documentation/userguide/copyright.tex
documentation/userguide/titlepage.tex
documentation/man/*.man
libdivsufsort/divsufsort.h
libdivsufsort/libdivsufsort.a
src/libhmmer.a
src/p7_config.h
profmark/create-profmark
profmark/rocplot
src/alimask
src/hmmalign
src/hmmbuild
src/hmmc2
src/hmmconvert
src/hmmemit
src/hmmerfm-exactmatch
src/hmmfetch
src/hmmlogo
src/hmmpgmd
src/hmmpgmd_shard
src/hmmpress
src/hmmscan
src/hmmsearch
src/hmmsim
src/hmmstat
src/itest_brute
src/jackhmmer
src/makehmmerdb
src/nhmmer
src/nhmmscan
src/phmmer
14 changes: 14 additions & 0 deletions bioinformaticsProject/hmmer/.travis.yml
@@ -0,0 +1,14 @@
language: c

env:
- SQC_NONZERO_EXIT=1

script:
- git clone -b develop https://github.com/${TRAVIS_REPO_SLUG/hmmer/easel}.git
- ln -s easel/aclocal.m4 aclocal.m4
- autoconf
- ./configure
- make
- make dev
- make check

20 changes: 20 additions & 0 deletions bioinformaticsProject/hmmer/INSTALL
@@ -0,0 +1,20 @@
Brief installation instructions
HMMER 3.3.2 (Nov 2020)

Starting from a source distribution, hmmer-3.3.2.tar.gz:
    uncompress hmmer-3.3.2.tar.gz
    tar xf hmmer-3.3.2.tar
    cd hmmer-3.3.2
    ./configure
    make
    make check                 # optional: automated tests
    make install               # optional: install HMMER programs and man pages
    (cd easel; make install)   # optional: install Easel tools too

For more details including customization, supported platforms, and
troubleshooting, see the Installation chapter in the HMMER User's
Guide (Userguide.pdf).




93 changes: 93 additions & 0 deletions bioinformaticsProject/hmmer/LICENSE
@@ -0,0 +1,93 @@
HMMER - Biological sequence analysis with profile hidden Markov models
Copyright (C) 1992-2020 Sean R. Eddy
Copyright (C) 2000-2020 Howard Hughes Medical Institute
Copyright (C) 2015-2020 President and Fellows of Harvard College
Copyright (C) 1992-2004 Washington University School of Medicine
Copyright (C) 1992-1994 MRC Laboratory of Molecular Biology
-----------------------------------------------------------------------

The code includes contributions and input from current and past
members of the HMMER development team, as well as other colleagues and
sources, including:

Bill Arndt
Jeremy Buhler
Tyler Camp
Nick Carter
Sergi Castellano
Goran Ceric
Michael Farrar
Rob Finn
Ian Holmes
Bjarne Knudsen
Diana Kolbe
Erik Lindahl
Graeme Mitchison
Eric Nawrocki
Lee Newberg
Elena Rivas
Walt Shands
Travis Wheeler

HMMER also includes copyrighted and licensed code that has been
incorporated from other sources, including:

Yuta Mori (libdivsufsort-lite)
Apple Computer
Free Software Foundation, Inc.
IBM TJ Watson Research Center
X Consortium

HMMER uses the Easel software library, which has its own license and
copyright information. See easel/LICENSE.

HMMER includes patent-pending SIMD technology under a nonexclusive
license from the estate of Michael Farrar. You are sublicensed to use
this technology specifically for the use, modification, and
redistribution of HMMER.

HMMER development is supported in part by the National Human Genome
Research Institute of the US National Institutes of Health under grant
number R01HG009116. The content is solely the responsibility of the
authors and does not necessarily represent the official views of the
National Institutes of Health.

HMMER source code is distributed as open source under the terms of the
BSD three-clause license:

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following
disclaimer in the documentation and/or other materials provided
with the distribution.

3. Neither the name of any copyright holder nor the names of
contributors may be used to endorse or promote products derived
from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
OF THE POSSIBILITY OF SUCH DAMAGE.







