GOfetcher Help

The GOfetcher web application and search engine are written in PHP. The application communicates with a local MySQL database on the back end, which stores the data.

Search Capabilities

GOfetcher provides a comprehensive search facility and a variety of output formats for the results.
The search options let users enter simple as well as complex queries. The advanced search panel lets users define specific queries by connecting multiple fields with Boolean operators. Each search returns a result list including species ID, species unique ID, symbol, GO term, name, and category, as well as a summary of the distinct matching entries with a pie chart of the categories. Users can also print query results or export them in multiple formats, including Excel, Word, and XML.
GOfetcher offers three levels of GO search:

  • Quick Search: Searches for any keyword as a species ID, species unique ID, symbol, GO term, name, or category. Keywords should be separated by commas or whitespace (space, tab, or line break). There are also options for matching "any words", "all words", or an "exact phrase".

  • Advanced Search (Figure 1): In the "advanced search" tab, users can build complex combinations of keywords for the species ID, species unique ID, symbol, GO term, name, or category. Each keyword can be matched using “exact match”, “contain”, “not contain”, or “starts with”.

    Figure 1. GOfetcher Advanced Search

     

  • Upload Files (Figure 2): In the “upload files” tab, users can upload one or more files containing keywords, separated by commas or whitespace as in Quick Search. GOfetcher then searches for any of the words in the files and shows the results.
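The keyword handling shared by Quick Search and Upload Files can be illustrated with a short Python sketch. This is not GOfetcher's PHP code; the function names, the case-insensitive comparison, and the substring matching are assumptions made for illustration only.

```python
import re


def split_keywords(text):
    """Split raw input on commas and any whitespace (space, tab, line break)."""
    return [token for token in re.split(r"[,\s]+", text) if token]


def matches(field_value, query, mode="any"):
    """Check one field value against a query.

    mode is "any" (any words), "all" (all words), or "exact" (exact phrase).
    Case-insensitive substring matching is an assumption of this sketch.
    """
    value = field_value.lower()
    phrase = query.strip().lower()
    if not phrase:
        return False
    if mode == "exact":
        # The whole phrase must appear as typed, including word order.
        return phrase in value
    hits = [word in value for word in split_keywords(phrase)]
    return all(hits) if mode == "all" else any(hits)
```

For uploaded files, the same `split_keywords` function could be applied to the text of each file, with the resulting words searched in "any" mode.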



Figure 2. GOfetcher File Upload
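The four match modes of the Advanced Search tab, combined with a Boolean connective, could be modelled roughly as follows. This is a minimal sketch in Python, not the application's own implementation; the record keys, the `advanced_search` function, the sample records, and the single AND/OR connective applied across all conditions are hypothetical simplifications.

```python
# Match modes named in the Advanced Search tab, as simple predicates.
OPERATORS = {
    "exact match": lambda value, keyword: value == keyword,
    "contain": lambda value, keyword: keyword in value,
    "not contain": lambda value, keyword: keyword not in value,
    "starts with": lambda value, keyword: value.startswith(keyword),
}

# Invented sample records; symbols and names are placeholders, not real data.
SAMPLE_RECORDS = [
    {"species_id": "MGI", "symbol": "abc1", "name": "example kinase activity",
     "category": "molecular function"},
    {"species_id": "FB", "symbol": "abd2", "name": "example membrane part",
     "category": "cellular component"},
    {"species_id": "MGI", "symbol": "xyz3", "name": "example transport",
     "category": "biological process"},
]


def condition_holds(record, field, operator, keyword):
    """Apply one (field, operator, keyword) condition, ignoring case."""
    value = str(record.get(field, "")).lower()
    return OPERATORS[operator](value, keyword.lower())


def advanced_search(records, conditions, connective="AND"):
    """Return the records satisfying the conditions joined by AND or OR."""
    if connective not in ("AND", "OR"):
        raise ValueError("connective must be 'AND' or 'OR'")
    combine = all if connective == "AND" else any
    return [
        record for record in records
        if combine(condition_holds(record, *cond) for cond in conditions)
    ]
```

For example, `advanced_search(SAMPLE_RECORDS, [("species_id", "exact match", "MGI"), ("name", "contain", "kinase")])` keeps only the first sample record.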

Browse by Species

From the Browse menu, the database can be browsed by species. It currently holds information on 18 model organisms, including Arabidopsis thaliana, Bacillus anthracis, Caenorhabditis elegans, Campylobacter jejuni, Candida albicans, Drosophila melanogaster, Mus musculus, Oryza sativa, Rattus norvegicus, Saccharomyces cerevisiae, and Vibrio cholerae. Table 10 lists the organisms and their annotation records in detail. In total, the database contains 847,510 annotation records.

Search Results

If the user selects this option, a summary of the distinct matching entries, with a pie chart of the categories, appears at the top of the search results page (Figure 3). The summary table gives the number of unique values for species unique ID, symbol, GO term, and term name, with no redundancy. Clicking a number displays the corresponding list.



Figure 3. Distinct matching entries with a pie chart for categories
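The summary in Figure 3 amounts to counting distinct values per field and tallying the categories. A minimal Python sketch of that idea is given below; the row keys and the `summarize` function are hypothetical, and counting categories per result row is an assumption about how the pie chart is derived.

```python
from collections import Counter

# Fields summarized in the table (hypothetical key names).
SUMMARY_FIELDS = ("species_unique_id", "symbol", "go_term", "name")


def summarize(rows):
    """Return distinct values, their counts, and category shares for a result set."""
    # Distinct values per field: the lists shown when a number is clicked.
    distinct = {
        field: sorted({row[field] for row in rows}) for field in SUMMARY_FIELDS
    }
    # The numbers shown in the summary table.
    counts = {field: len(values) for field, values in distinct.items()}
    # Category tallies and fractions, i.e. the data behind the pie chart.
    categories = Counter(row["category"] for row in rows)
    total = sum(categories.values())
    shares = {cat: n / total for cat, n in categories.items()} if total else {}
    return {"distinct": distinct, "counts": counts,
            "categories": dict(categories), "shares": shares}
```

Here the `counts` entry corresponds to the numbers in the summary table, and `distinct` holds the lists a user would see after clicking one of them.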


Results include the following:

  1. Species ID: Often a two- or three-letter abbreviation for a species, for instance FB for FlyBase and MGI for mouse.

    Species                      Species ID
    Arabidopsis thaliana         TAIR
    Bacillus anthracis Ames      BA
    Caenorhabditis elegans       WB
    Campylobacter jejuni         CMR
    Candida albicans             CGD
    Coxiella burnetii            CBU
    Danio rerio                  ZFIN
    Dictyostelium discoideum     DDB
    Drosophila melanogaster      FB
    Homo sapiens                 GH
    Listeria monocytogenes       LMO
    Mus musculus                 MGI
    Oryza sativa                 GR
    Plasmodium falciparum        PF
    Rattus norvegicus            RGD
    Saccharomyces cerevisiae     SGD
    Trypanosoma brucei           TB
    Vibrio cholerae              VC


  2. DB-specific ID: The accession ID specific to a species, e.g. “MGI:1918918” for mouse and “FBgn0015567” for FlyBase Drosophila. Information about the specific organism from related external databases is provided here.
  3. Gene Symbol: The gene name, with access to NCBI Gene database information.
  4. GO Term ID: The Gene Ontology term ID, with both tree and graphic views.
  5. GO Name: The Gene Ontology term name, with related information from the Gene Ontology database (geneontology.org).
  6. Category: One of the three organizing principles of GO: “cellular component”, “biological process”, and “molecular function”.
  7. References: The literature reference(s) or database record(s) from a model organism database or PubMed.
  8. Evidence Code: The code indicating the nature of the evidence that supports a particular annotation. See the GO evidence code guide for the list of valid evidence codes for GO annotations.

GOfetcher output can be saved in several formats: Excel spreadsheet, Microsoft Word document, comma-separated values (CSV), Extensible Markup Language (XML), and a printer-friendly format. Although GOfetcher is not intended as a visualization tool, it provides tree and graphical views of the results.
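To give a sense of what the CSV and XML exports involve, here is a minimal Python sketch using only the standard library. It is not GOfetcher's export code: the column names, the `results`/`record` element names, and both functions are assumptions chosen for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical column names mirroring the result fields described above.
FIELDS = ["species_id", "species_unique_id", "symbol", "go_term", "name", "category"]


def to_csv(rows):
    """Serialize result rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_xml(rows):
    """Serialize result rows to XML, one <record> element per row."""
    root = ET.Element("results")
    for row in rows:
        record = ET.SubElement(root, "record")
        for field in FIELDS:
            # Missing fields are written as empty elements.
            ET.SubElement(record, field).text = str(row.get(field, ""))
    return ET.tostring(root, encoding="unicode")
```

The `csv` module quotes values that contain commas, so term names with embedded commas survive a round trip through a spreadsheet.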

Fetching

Figure 4 illustrates the flow chart for the searching and fetching process. When search results appear, each record contains information fetched from related external databases. GOfetcher extracts information from a variety of databases, including NCBI, ArabidopsisDB, GeneDB, Saccharomyces Genome Database, FlybaseDB, Mouse Genome Informatics, WormBase, and TIGR Annotation.


Figure 4. Flow chart for searching and fetching process

UML Diagram

Figure 5 illustrates the Unified Modeling Language (UML) Schema of the GOfetcher database.

 

 

 

Copyright © 2007-2008, Mississippi Computational Biology Consortium (MCBC)