GUIDELINES FOR NAVIGATING COMBREX

No login or registration is required to browse COMBREX. However, to submit bids and communicate information about predictions, genes, or experiments, you must register and log in. For information on how to register and log in, please see: REGISTERING AND LOGGING IN.

ENTERING A SEARCH TERM

THE PROTEIN CLUSTERS RESULTS PAGE

THE PROTEIN CLUSTERS DETAIL PAGE

THE GENE DETAIL PAGE

SUBMITTING A BID

REGISTERING AND LOGGING IN



ENTERING A SEARCH TERM

When you first visit the database query page, you supply a search term. This can be a gene name or name fragment, a gene symbol, a UniProt accession number (e.g. Q02PQ5), a RefSeq ID (e.g. YP_780411), a Protein Cluster ID (e.g. PRK09601), an Enzyme Commission (EC) number (e.g. 2.1.7.40), or an Entrez Gene accession number (e.g. 899833). Entrez Gene accession numbers can be found here: Entrez Gene
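As a rough illustration of how the identifier types listed above differ in format, the following Python sketch guesses the type of a search term from its shape. The function name classify_search_term and the labels it returns are hypothetical, and this is not how COMBREX itself interprets queries. Only the UniProt pattern follows UniProt's published accession format; the RefSeq, Protein Cluster, EC, and Entrez Gene patterns are simplified assumptions based on the examples in this guide.

```python
import re

# Patterns are tried in order; the first one that matches the whole term wins.
# Only the UniProt pattern follows a published format. The others are
# simplified assumptions derived from the examples in this guide.
PATTERNS = [
    ("uniprot", re.compile(
        r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}")),
    ("refseq", re.compile(r"[A-Z]{2}_[0-9]+(?:\.[0-9]+)?")),   # e.g. YP_780411
    ("protein_cluster", re.compile(r"[A-Z]{3,4}[0-9]{5,}")),   # e.g. PRK09601
    ("ec_number", re.compile(r"[0-9]+(?:\.[0-9]+){3}")),       # four dotted fields
    ("entrez_gene", re.compile(r"[0-9]+")),                    # e.g. 899833
]


def classify_search_term(term: str) -> str:
    """Return a guessed identifier type, or "text" for names and fragments."""
    term = term.strip()
    for label, pattern in PATTERNS:
        if pattern.fullmatch(term):
            return label
    return "text"


if __name__ == "__main__":
    for example in ["Q02PQ5", "YP_780411", "PRK09601", "2.1.7.40", "899833", "isomerase"]:
        print(example, "->", classify_search_term(example))
```

Anything that matches none of the patterns, such as "isomerase" or a gene symbol, falls through to "text". A format check like this only says what an identifier looks like, not whether it exists in the database.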

For example, you could enter the term “isomerase” in the search box as shown below:

home_page_isomerase_example

Press the search button, and after a few seconds you will receive 2473 protein cluster results:

search_results

THE PROTEIN CLUSTERS RESULTS PAGE

Each of these 2473 results represents a protein cluster. A protein cluster is a group of orthologous genes that are thought to perform the same function. These assignments are based primarily upon sequence similarity. The results can be sorted to favor clusters with wider phylogenetic spread, clusters with higher numbers of organisms, or clusters with higher numbers of proteins. For each of the 2473 clusters, we highlight whether the cluster contains a protein from either of our two focus organisms, E. coli K12 MG1655 and Helicobacter pylori 26695. However, we are interested in both predictions and experimental validations in any sequenced bacterial organism.

For more information on the details of cluster formation, please see:

The National Center for Biotechnology Information's Protein Clusters Database: Protein Clusters at NCBI

For the first listed cluster, "phosphoheptose isomerase", there are 69 proteins in 69 organisms. COMBREX considers only completely sequenced bacterial genomes present in RefSeq. The initial cluster results are ranked by a prioritization scheme, with important considerations being a broad phylogenetic spread and the lack of experimentally validated genes.

To the left of the cluster name, a colored symbol gives a quick summary of that cluster's experimental validation status. The symbols are based on the familiar classification scheme used for ski slopes. A green circle indicates that the cluster contains at least one experimentally validated gene. A blue square indicates that the cluster does not contain any experimentally validated genes, but that there are fairly specific functional predictions for the proteins in this cluster. A black diamond indicates that the cluster does not contain any experimentally validated genes and that there are no specific predictions for gene function of its members.

To the right of the cluster name, numbers encased in black, blue, green, or gold boxes indicate the number of black, blue, green, and gold genes within that cluster. The color assigned to a gene reflects its experimental validation status.

A gold gene has been experimentally validated, and the DNA sequence coding for the exact protein has been determined. The publication(s) reporting the sequence and the biochemistry of a gold gene are documented.

A green gene either (1) has been experimentally validated, but manual curation is incomplete or information required for gold status is lacking, or (2) has greater than 98% full-length homology to a gold gene.

A blue gene has a specific prediction of molecular function but has not been experimentally validated, or its experimental validation status has yet to be established in COMBREX.

A black gene does not have a specific prediction of molecular function, but it may have predictions of "general" or "non-specific" functionality.
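The relationship between gene colors and the cluster symbol can be sketched in a few lines of Python. This is only an illustration of the rules as described in this guide, not COMBREX's actual implementation. The function name cluster_symbol is hypothetical, and the sketch assumes that both gold and green genes count as "experimentally validated" and that a single blue gene is enough for a blue square.

```python
from collections import Counter


def cluster_symbol(gene_colors):
    """Derive a cluster's symbol from the colors of its member genes.

    gene_colors: iterable of "gold", "green", "blue", or "black".
    Assumption: gold and green genes both count as experimentally validated.
    """
    counts = Counter(gene_colors)
    if counts["gold"] or counts["green"]:
        return "green circle"    # at least one validated gene
    if counts["blue"]:
        return "blue square"     # no validated gene, but a specific prediction
    return "black diamond"       # neither validation nor a specific prediction


if __name__ == "__main__":
    print(cluster_symbol(["blue", "black", "gold"]))
    print(cluster_symbol(["blue", "black"]))
    print(cluster_symbol(["black", "black"]))
```

The per-color counts held in counts correspond to the numbers shown in the colored boxes to the right of the cluster name.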

THE PROTEIN CLUSTERS DETAIL PAGE

If you click on the name of the cluster, you will view a page that provides details about the cluster on the top of the page, and the individual genes that comprise the cluster at the bottom:

Protien_Cluster_Detail_Page_Top

At the top of the cluster page, you will find summarized information about the cluster which includes links to the cluster page at NCBI and links to Pfam and CDD conserved domains found within proteins comprising that cluster.

Protien_Cluster_Histogram

In the middle of the cluster page you will find information about the experimental validation status of the constituent genes, information about the phylogenetic distribution of proteins, and, for curated clusters only, a histogram of the average distances from each protein to every other protein in the cluster. This histogram helps COMBREX users identify a good target for experimental validation within a cluster. Please refer to the FAQ sections on Histogram for Average Distances.

cluster_page_bottom

At the bottom of the page you will find a list of the genes that comprise the cluster. Once again, genes from our two focus organisms, E. coli K12 MG1655 and H. pylori 26695, are highlighted at the top of the list if they are present in the cluster, followed by the rest of the cluster members. The remaining genes are ranked roughly in order of their perceived value for experimental validation. We stress that this is a very weak dependence, and we welcome bids on any of the genes.

Once again, a symbol (green circle, blue square, or black diamond) next to each gene indicates its validation status. We expect that most bids will be made on genes marked by blue squares, which indicate that there is a relatively specific prediction about the gene's function but that it has not been experimentally verified. We introduce a new symbol here, a gold star, for genes whose function has published experimental validation. This information has been gathered from a number of sources, including UniProt, EcoCyc, BRENDA, and others.

Identifying a comprehensive set of "gold standard" genes with functions that have been carefully experimentally validated is a major goal of COMBREX, and very much a work in progress. We welcome your participation in contributing to this effort every bit as much as to the external validation of genes of unknown function. For more information on the Gold Standard Project, please refer to the Gold Standard Genes document in the Help Center.
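The ordering of the gene list described above (focus-organism genes first, then the rest by perceived value) amounts to a two-part sort key. The Python sketch below shows one way to express it. The field names, the numeric priority score, and the example records other than HP0857 are hypothetical, since this guide does not say how COMBREX computes or stores the ranking.

```python
FOCUS_ORGANISMS = ("E. coli K12 MG1655", "H. pylori 26695")


def order_genes(genes):
    """Sort genes so that focus-organism genes come first.

    genes: list of dicts with "symbol", "organism", and "priority" keys.
    "priority" is a hypothetical score; higher means more valuable to validate.
    """
    return sorted(
        genes,
        # False sorts before True, so focus-organism genes lead the list;
        # within each group, higher priority comes first.
        key=lambda g: (g["organism"] not in FOCUS_ORGANISMS, -g["priority"]),
    )


if __name__ == "__main__":
    example = [
        {"symbol": "geneA", "organism": "other organism 1", "priority": 0.9},
        {"symbol": "HP0857", "organism": "H. pylori 26695", "priority": 0.2},
        {"symbol": "geneB", "organism": "other organism 2", "priority": 0.4},
    ]
    for gene in order_genes(example):
        print(gene["symbol"], gene["organism"])
```

In this sketch the focus-organism gene is listed first even though its hypothetical priority is the lowest, which mirrors the layout of the gene list on the cluster page.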

THE GENE DETAIL PAGE

Clicking on the official gene symbol, in this example "HP0857", brings you to a page with detailed information about the specific protein. On this page you will find links to the NCBI gene database, links to the protein sequence, the curated functional annotation provided by Gene Ontology (with evidence codes), links to external reference sites, and more.

Gene Detail A

At the top of the gene page, you will find summarized information about the gene, which includes links to the gene page at NCBI and UniProt, and links to Pfam and CDD conserved domains found within the protein. In addition, there is a link listed as "Experimental Determined Interaction" that lets you explore relationships between the gene of interest and other genes using VisANT, an "Integrative Visual Analysis Tool for Biological Networks and Pathways" developed by the laboratory of Charles DeLisi.

For more details on VisANT, please visit:

VisANT 3.5: multi-scale network visualization, analysis and inference based on the gene ontology

Gene Detail B

If you scroll further down the page, you will find information about the experimental validation status of the gene as well as predictions of molecular function for the gene, and a Functional Linkage Table. The “Submit a bid” button links to a form which allows users to submit a bid to receive funding in order to experimentally test the predicted molecular function(s) of that gene.

How to Submit a Bid.

Under the “Predicted Function” section users can press the “Submit” button which links to forms that allow the submission of annotations, predictions, and comments relevant to that gene.

How to Submit Predictions

How to Submit Annotations

Gene Detail C

If you scroll down even further, you will find metabolic network predictions, the curated functional annotation provided by Gene Ontology (with evidence codes), and links to external reference sites.

Gene Detail D

At the bottom of the gene detail page, you will find the “User Comments” section, where users can submit any comments related to that gene.

For more information on how to search for your gene of interest, please refer to the Tips for Searching COMBREX documentation in the help center.

SUBMITTING A BID

Once you click on the “Submit a bid” button in the gene detail page and have logged in, you will be taken to a page that has most of the fields required for the bid pre-populated.

Submit A Bid

In the bid submission page you will also see a field for entering the amount of direct funds you are requesting. COMBREX will fund at different levels depending on justified costs and need; however, the most competitive bids will be for $10,000 or less. Indirect funds will be provided according to your institution's agreement with NIH, so include only direct costs in this field. There is also a field for uploading a completed Bid Submission Form, which can be found here: Bid Submission Form.

An example of a completed bid submission form can be found here: Bid Submission Example.

If your bid is judged to be competitive by the COMBREX executive committee and its external reviewers, you will be asked to submit a full proposal, which will include a detailed budget, and the normal institutional assurances and paperwork required by NIH to establish a subcontract with Boston University. These details are not needed at the time of the initial bid.

REGISTERING AND LOGGING IN

In order to participate in bids or discussions, submit predictions, or nominate genes for high priority status, one needs to register to create an account. No account is needed for browsing the database.

registration

At the very bottom, we ask you to indicate the roles you foresee participating in, including signing up for email alerts whenever new predictions are entered.

After registering, you will be asked to log in when starting a browsing session. There is also an option to reset your password: click on the link and enter your email address. If the email is valid, you will receive a link to a page where you can enter and submit your new password. You will then be able to log in using your new password.

More information on how to submit a bid can be found within the How to Submit a Bid documentation in the help center.

If you have any questions please contact us at: help-desk@combrex.bu.edu