Hi class-
It was just pointed out to me that some of the protein structure files I posted contain complications that can result in multiple alpha carbon coordinates for the same amino acid. I have now preprocessed and cleaned up these files. Please download the new proteins.zip or proteins.tar.gz file from the website.
Also, here are some more programming hints:
When you are parsing a PDB file, be sure to take the exact columns that the PDB format specifies. The fields are not necessarily separated by spaces, so splitting a line on whitespace can fail. You can access a specific portion of a string using slice notation. For example, if line is the variable containing a line read in from a file, you can get columns 18-20 (inclusive) using line[17:20]. The offsets differ from the column numbers because Python indices start at 0 and a slice excludes its end index.
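To make the column hint concrete, here is a minimal sketch of pulling the residue name and coordinates out of an alpha carbon line by fixed columns. The function name parse_ca_line and the example record are made up for illustration; the column positions follow the ATOM record layout of the PDB format. The example is chosen so that the x and y fields touch, which is exactly the case where splitting on whitespace goes wrong.

```python
def parse_ca_line(line):
    """Return (residue name, (x, y, z)) for an alpha carbon ATOM line, else None.

    Hypothetical helper for illustration; not part of any posted course code.
    """
    if line[0:6] != "ATOM  ":            # columns 1-6: record type
        return None
    if line[12:16].strip() != "CA":      # columns 13-16: atom name
        return None
    res_name = line[17:20]               # columns 18-20: residue name
    x = float(line[30:38])               # columns 31-38
    y = float(line[38:46])               # columns 39-46
    z = float(line[46:54])               # columns 47-54
    return res_name, (x, y, z)


# A made-up record, assembled field by field so the columns are explicit.
example = (
    "ATOM  "      # columns 1-6: record type
    "    2"       # columns 7-11: atom serial number
    " "           # column 12: unused
    " CA "        # columns 13-16: atom name
    " "           # column 17: alternate location indicator
    "ALA"         # columns 18-20: residue name
    " A"          # columns 21-22: blank, then chain ID
    "   1"        # columns 23-26: residue sequence number
    "    "        # columns 27-30: insertion code and blanks
    " -11.104"    # columns 31-38: x
    "-106.134"    # columns 39-46: y
    "  -6.504"    # columns 47-54: z
)

print(parse_ca_line(example))
print(example.split())   # x and y come out fused as '-11.104-106.134'
```

Slicing recovers all three coordinates, while split() returns the x and y values as a single token because no space separates them.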
Also, you can use Python's list comprehensions to your advantage. For your function that returns the IsPhobic array, you can build the result in a single expression of the form:
[Res in Phobics for Res in ResNames]
where Phobics is a list containing the hydrophobic amino acid names and ResNames is the list of residue names. The "in" operator here evaluates to either True or False, so the constructed list will contain only Booleans.
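As a quick illustration of that expression, here is a small sketch you can paste into the interpreter. The contents of Phobics and ResNames below are placeholders I made up for the example; use whichever set of hydrophobic residues the assignment specifies.

```python
# Placeholder lists for illustration only.
Phobics = ["ALA", "VAL", "LEU", "ILE", "MET", "PHE", "TRP"]
ResNames = ["MET", "LYS", "VAL", "ASP", "LEU"]

# One Boolean per residue: True if its name appears in Phobics.
IsPhobic = [Res in Phobics for Res in ResNames]

print(IsPhobic)   # [True, False, True, False, True]
```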
Cheers,
MSS
Monday, April 6, 2009