====================================================
PepcDB files
Last updated: July 16, 2009
====================================================

Database files:
****************************************************

-- pepcTargets.xml.gz - complete list of targets in PepcDB in XML format.
   This file includes all target metadata, including text protocols,
   protein and DNA sequences, and descriptions of experiments.

-- pepcTargets.fasta.gz - protein sequences of PepcDB targets.

-- pepcTargetsDNA.fasta.gz - DNA sequences of PepcDB targets.
   Please note that the DNA sequence is an optional data item in the
   PepcDB schema. Currently only four centers provide DNA sequences in
   addition to the protein sequences. These are ATCG3D, CESG, NYSGXRC,
   and SGX.

-- pepcTrialSequences.fasta.gz - protein sequences of experimental trials.
   A protein target can be associated with multiple trials or experiments.
   The trial sequences might be identical to the target sequences or
   represent truncations, mutations, and other sequence modifications
   introduced to achieve the experimental goal.

-- trial_experimental_history.txt - experimental status history of each
   reported protein production experiment in PepcDB. The file contains
   the dates of selection, cloning, expression, purification,
   crystallization, etc. for each experiment.

-- protein_expression_trials.txt - list of expressed proteins in PepcDB
   and the corresponding expression system:
     - bacteria
     - yeast
     - insect cells
     - cell free

   Note: this table lists only those expression trials for which the
   expression system was reported by the depositors.

   Table header:
     1. TargetID
     2. TrialID
     3. SG center
     4. Organism Name
     5. Organism TaxID
     6. Protein Sequence => if the expressed protein is a complex, all
        protein sequences are included in the string, separated by ','
     7. Status => final status of the experiment
     8. ProtocolID => expression protocol identifier in the database
     9. Expression System => example: bacteria, yeast, insect cells,
        cell free
    10. Expression Host Strain Code
    11. Expression Vector
    12. Expression Host TaxID
    13. Specific Expression Details => any specific modifications to the
        expression protocol
    14. Experiment Stop Info => information on whether the experiment was
        terminated due to experimental failure or for another reason

Data deposited by each structural genomics center in XML format:

-- CENTER_NAME.xml.gz
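
Example: reading protein_expression_trials.txt

The 14-column layout listed under "Table header" can be read with a short script. The sketch below is an illustration only, not part of the PepcDB distribution. It assumes the file is tab-delimited with one trial per line, which this README does not state, and whether the file starts with a header row is also unknown, so that is left as an option. The field names in COLUMNS and the function name parse_trials are hypothetical, chosen to mirror the column list. Column 6 is split on ',' because a complex stores all of its protein sequences in one string.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical field names mirroring the 14 columns of the table header.
COLUMNS = [
    "target_id", "trial_id", "sg_center", "organism_name",
    "organism_taxid", "protein_sequence", "status", "protocol_id",
    "expression_system", "host_strain_code", "expression_vector",
    "host_taxid", "expression_details", "stop_info",
]


def parse_trials(lines: Iterable[str], delimiter: str = "\t",
                 skip_header: bool = False) -> Iterator[Dict[str, object]]:
    """Yield one dict per trial row.

    Assumption: fields are separated by `delimiter` (tab by default);
    the README does not specify the separator.
    """
    for index, line in enumerate(lines):
        if skip_header and index == 0:
            continue
        line = line.rstrip("\r\n")
        if not line:
            continue  # ignore blank lines
        fields = line.split(delimiter)
        # Pad short rows so every record has all 14 keys.
        fields += [""] * (len(COLUMNS) - len(fields))
        record: Dict[str, object] = dict(zip(COLUMNS, fields))
        # Column 6 holds several comma-separated sequences for a complex.
        sequences = str(record.pop("protein_sequence")).split(",")
        record["protein_sequences"] = [s.strip() for s in sequences if s.strip()]
        yield record
```

To use it, pass an open text file handle for protein_expression_trials.txt to parse_trials and iterate over the records. If the rows turn out to use a different separator, change the delimiter argument.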