No, the database aims to curate invertebrate-active pesticidal proteins of bacterial origin only. At present, toxin complex (Tc) proteins from bacteria are not included in the database or the nomenclature, but a separate site listing these proteins has been established here. Other bacterial pesticidal proteins reported in the literature may currently be absent from the database, but we plan to add them over time.
Some proteins have been assigned new names in the latest nomenclature revision. Please check this link to find the new name (new name/old name chart). The BPPRC database is an expansion of the previous Bt toxin database and some invertebrate-active proteins may not yet have been added (see question below).
How do I find proteins with identical sequences?
Sequences in the BPPRC database are assigned separate names for each entry, even if there is already an entry with the same protein sequence. To find identical proteins within the database, search for your protein of interest (e.g. Cyt1Aa1) and select the entry. Beneath the sequence you will find the predicted molecular mass, the predicted isoelectric point and the names of any entries with an identical sequence.
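If you have downloaded sequences as a FASTA file, the same check can be done offline. The following is a minimal sketch, not the method used by the BPPRC site: it groups entry names by exact sequence match. The function names and the entry names in the example (ProtA1, ProtB1, ProtC1) are hypothetical.

```python
from collections import defaultdict


def parse_fasta(fasta_text):
    """Yield (name, sequence) pairs from FASTA-formatted text.

    The name is taken as the first word on each header line.
    """
    name, chunks = None, []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            words = line[1:].split()
            name = words[0] if words else ""
            chunks = []
        elif name is not None:
            # Upper-case so that case differences do not hide identical sequences
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def group_identical(fasta_text):
    """Return lists of entry names that share exactly the same sequence."""
    groups = defaultdict(list)
    for name, sequence in parse_fasta(fasta_text):
        groups[sequence].append(name)
    return [names for names in groups.values() if len(names) > 1]


# Hypothetical entries: ProtA1 and ProtC1 have the same sequence
example = """>ProtA1
MKTAYIAKQR
>ProtB1
MKTAYIAKQK
>ProtC1
MKTAYIAKQR
"""
print(group_identical(example))  # [['ProtA1', 'ProtC1']]
```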
I have a pesticidal protein derived from a plant/fungus/invertebrate. Can I add this sequence to the database?
Probably not. This site only aims to deal with proteins of bacterial origin. However, if your protein of non-bacterial origin has a high degree of identity with a protein that is already in the database, you may make a case for its inclusion. The committee will consider these requests on a case-by-case basis.
Yes, enter “pending” in the accession number box on the BPPRC submission form. However, the protein sequence will not be added to the database until we have the accession number.
No. Even though you can identify the most closely related proteins, you can only acquire an official name by use of this site. Self-naming of proteins will inevitably lead to different users adopting the same protein name, confusing the literature. Only the nomenclature committee can issue new protein names.
No. Xpp is a holding group containing members of several disparate protein families. As a result, Xpp-like would not actually indicate any significant homology. You could, however, refer to your protein in relation to a specific Xpp protein (as Xpp37-like, for example). The numeric value will be retained when the protein is re-categorized to a specific protein family in the future.
BPPRC General FAQs
Help & Support
To send feedback, please use https://camtech-bpp.ifas.ufl.edu/feedback_home/.
How do I cite BPPRC Web services?
Crickmore, N., Berry, C., Panneerselvam, S., Mishra, R., Connor, T. R., & Bonning, B. C. (2020). A structure-based nomenclature for Bacillus thuringiensis and other bacteria-derived pesticidal proteins. Journal of Invertebrate Pathology, 107438. https://doi.org/10.1016/j.jip.2020.107438.
How often is the BPPRC database updated?
Currently, we update the database about once a month on average.
What information is available in BPPRC?
The BPPRC site provides information on the recently revised nomenclature system, which clarifies the naming of newly discovered proteins, along with details on bacterial sources and an interactive database containing protein sequences and associated information.
Database FAQ
Can I use the database locally on my laptop?
Yes. You can download it from https://github.com/bpprc/database, where installation instructions are provided.
How can I download all the sequences?
You can download publicly available sequences using the link here https://camtech-bpp.ifas.ufl.edu/category_form.
I tried to download domain information for a Cry protein to the Cart (N-terminal, middle domain, C-terminal) but it does not appear. Why is this?
Domain information is normally extracted from the NCBI entry for the protein. If it is not present there (and we have not yet added it manually to the BPPRC database), it will not appear in your Cart. You can find the domain boundaries yourself by copying the full-length sequence and performing a “Sequence Search” here.
Which platforms are supported?
Mac OS X, Linux, UNIX and other UNIX-like systems, and MS Windows. Python 3 is required.
Where can I find the developer’s documentation of the project?
We are currently updating the documentation. Soon it will be available here.
Submit Sequence FAQs
You can find it here https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi.
You can write to us at https://camtech-bpp.ifas.ufl.edu/feedback_home/.
Guide Tree
How do I download a guide tree?
There is an export PDF/PNG option button above the guide tree.
What input formats can I use?
Currently raw or FASTA sequences are accepted formats.
What is the difference between a guide tree and a phylogenetic tree?
It is called a guide tree to emphasize that it is only used to guide the progressive alignment; it is not a reliable guide to the phylogeny of the sequences. To build a phylogenetic tree, you can use one of the external programs listed at https://en.wikipedia.org/wiki/List_of_phylogenetics_software.
What version of Clustal Omega is used?
We use Clustal Omega version 1.2.4. Please check this publication https://www.embopress.org/doi/full/10.1038/msb.2011.75.
Why do I get a ‘Two sequences cannot share the same identifier’ error?
Clustal Omega needs unique sequence identifiers, which it defines as the first word on the sequence identifier line. Check that you've not got a duplicate identifier somewhere in the input, that you're not using spaces or tabs in your identifiers, and that the first 30 characters of your identifier are unique if using Clustal format files.
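Before submitting, you can check a FASTA input for repeated identifiers yourself. This is a minimal sketch under the rule stated above (the identifier is the first word on the header line); the function name and example sequences are hypothetical, and it does not cover the 30-character rule for Clustal format files.

```python
def find_duplicate_ids(fasta_text):
    """Return identifiers that occur more than once in FASTA-formatted text.

    The identifier is the first whitespace-delimited word after '>'.
    """
    seen = set()
    duplicates = []
    for line in fasta_text.splitlines():
        if not line.startswith(">"):
            continue
        words = line[1:].split()
        identifier = words[0] if words else ""
        if identifier in seen and identifier not in duplicates:
            duplicates.append(identifier)
        seen.add(identifier)
    return duplicates


# Hypothetical input: "seqA" is used twice, with different descriptions
example = """>seqA first copy
MKTAYIAK
>seqB
MKTAYIAR
>seqA second copy
MKTAYIAK
"""
print(find_duplicate_ids(example))  # ['seqA']
```

Note that the two seqA headers differ after the first space, yet they still collide, because only the first word counts as the identifier.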
Why do I get a ‘minimum of three sequences required’ error?
Clustal Omega is a multiple sequence alignment program that uses seeded guide trees and HMM profile-profile techniques to generate alignments between three or more sequences. Your input must therefore contain at least three sequences.
How many sequences can I upload at once in the custom sequence?
The current limit is 1000 sequences. Because our computational resources are limited, we suggest using the services of the European Bioinformatics Institute (EMBL-EBI) https://www.ebi.ac.uk/services and NCBI https://www.ncbi.nlm.nih.gov/home/analyze/ for larger analyses.
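The three-sequence minimum and the 1000-sequence limit can both be checked before you upload a FASTA file. The sketch below is illustrative only: it counts header lines rather than reproducing the site's own validation, and the function and constant names are hypothetical.

```python
MIN_SEQUENCES = 3     # Clustal Omega aligns three or more sequences
MAX_SEQUENCES = 1000  # upload limit stated in this FAQ


def count_fasta_records(fasta_text):
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in fasta_text.splitlines() if line.startswith(">"))


def check_sequence_count(fasta_text):
    """Return 'ok' or a short message describing why the input is out of range."""
    n = count_fasta_records(fasta_text)
    if n < MIN_SEQUENCES:
        return f"too few sequences: {n} (minimum {MIN_SEQUENCES})"
    if n > MAX_SEQUENCES:
        return f"too many sequences: {n} (limit {MAX_SEQUENCES})"
    return "ok"


two_records = ">a\nMKT\n>b\nMKS\n"
print(check_sequence_count(two_records))  # too few sequences: 2 (minimum 3)
```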
Where can I find the parameters used?
We use the default parameters for Clustal Omega and BLAST.
BestMatchFinder FAQs
What do consensus symbols represent in a Pairwise Alignment? (BestMatchFinder)
The alignments come from EMBOSS Needle and NCBI BLAST. You can read the explanations for Needle at https://www.ebi.ac.uk/Tools/psa/emboss_needle/ and for BLAST at https://www.ncbi.nlm.nih.gov/books/NBK1762/.
What input formats can I use?
Currently raw or FASTA sequences are accepted formats.