burton_doherty_submission #8
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,32 @@ | ||
| #Exercise 8, Python question 1 | ||
| #10/13/17, MMD | ||
| import vcf | ||
| import re | ||
| #Open files to read and write | ||
| vcffile = open("Cflorida.vcf","r") | ||
| outfile = open("CfloridaCounts.txt","w") | ||
|
|
||
| #assign regex to variable name, or compile to variable name | ||
|
|
||
| lineNumber=0 | ||
| #loop over file | ||
| for line in vcffile:#look at old code to see how you looped over a file | ||
| #strip end of line | ||
| line=line.strip() | ||
| if lineNumber==0: #how can you tell if this is the header line? | ||
| outfile.write(line+"\n") | ||
| #write unchanged header line to file | ||
| elif lineNumber==1: #how can you tell if this is the line with the column headings? | ||
| #standardize (replace) sample names with TX and FL regexes | ||
| re.sub(((CF|cf){1}.?{4}\.,(Cf.Sfa.),line) ###Having trouble gettig re.sub to work, syntax not working. Trying to replace CF and then any 4 characters leading to a period | ||
| re.sub(((CF|cf){1}.?{4}\.,(Cf.Gai.),line) | ||
|
Owner
Quotation mark is missing. |
||
| #write new version of line to file | ||
| outfile.write(line+"\n") | ||
| else: #now you're in the data | ||
| #replace full SNP info with allele counts only | ||
|
Owner
sub = re.sub("[01.]/[01.]:([0-9,.]+):[0-9.]+:[0-9.]+:[0-9,.]+",r"\1",line) # extract allele information |
||
| #replace missing data with NA | ||
| #write new version of line to new file | ||
|
|
||
| #Close files | ||
|
|
||
|
Owner
-0.5 points |
||
|
|
||
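For reference, a minimal sketch of how the two `re.sub` calls discussed in the comments above could look once the patterns are quoted. The sample names and the data line here are hypothetical stand-ins, not values taken from `Cflorida.vcf`, and the sample-name pattern is only one reading of "CF, then any 4 characters leading to a period". The key points are that the pattern and replacement are both strings, and that `re.sub` returns a new string that has to be assigned.

```python
import re

# Hypothetical column-heading line; real sample names in Cflorida.vcf may differ.
header = "#CHROM\tPOS\tcf.sfa.001\tCF.SFA.002"

# Pattern and replacement are quoted strings. re.sub does not modify `header`
# in place, so the result is assigned to a new variable.
# Assumed rule: "CF" or "cf", then any 4 characters, then a literal period.
fixed_header = re.sub(r"(CF|cf).{4}\.", "Cf.Sfa.", header)

# Hypothetical data line with GT:AD:DP:GQ:PL style fields, one of them missing.
data = "chr1\t100\t0/1:12,5:17:99:200,0,300\t./.:.:.:.:."

# Same pattern as in the review comment: keep only the captured allele counts.
counts_only = re.sub(
    r"[01.]/[01.]:([0-9,.]+):[0-9.]+:[0-9.]+:[0-9,.]+", r"\1", data
)

print(fixed_header)
print(counts_only)
```

With these inputs the missing genotype collapses to a single `.`, which a later substitution could turn into `NA`.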
Or you can use
if line[0:2] == "##":
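
A small sketch of that idea, assuming the usual VCF layout in which meta-information lines start with `##` and the column-heading line starts with a single `#`. The helper name `classify_line` is made up for illustration. Note that a slice needs to cover two characters to compare against `"##"`; `str.startswith` avoids counting indices altogether.

```python
def classify_line(line):
    """Return which part of a VCF file a line belongs to (hypothetical helper)."""
    if line.startswith("##"):
        # Meta-information line: written to the output unchanged.
        return "meta"
    elif line.startswith("#"):
        # Column-heading line: the place to standardize sample names.
        return "columns"
    # Everything else is a data line.
    return "data"


# line[0:1] is a single character, so it can never equal "##".
print("##fileformat"[0:1])  # "#"
print("##fileformat"[0:2])  # "##"
print(classify_line("##fileformat=VCFv4.1"))
```

This would replace the `lineNumber` counter in the submission, which only works if the file has exactly one header line.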