SuCComBase

Data Sources

SCC gene data

SuCComBase integrates genes related to SCC biosynthesis in A. thaliana from three main types of source: databases, expression studies and publications.
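As an illustration only, the sketch below shows one way gene lists from several sources could be merged while recording which sources report each gene. The function name `merge_gene_sources`, the normalisation step and the gene identifiers are assumptions made for this example; they are placeholders and do not describe SuCComBase's actual pipeline or records.

```python
from collections import defaultdict


def merge_gene_sources(sources):
    """Map each gene ID to the sorted list of sources that report it.

    `sources` is a dict of {source name: iterable of gene IDs}.
    """
    provenance = defaultdict(set)
    for source_name, gene_ids in sources.items():
        for gene_id in gene_ids:
            # Normalise case and whitespace so the same ID matches across sources.
            provenance[gene_id.strip().upper()].add(source_name)
    return {gene: sorted(names) for gene, names in sorted(provenance.items())}


# Placeholder identifiers, not real SuCComBase records.
example_sources = {
    "AraCyc": ["geneA", "geneB"],
    "KEGG": ["geneB", "geneC"],
    "Publications": ["GENEA "],
}

merged = merge_gene_sources(example_sources)
for gene, names in merged.items():
    print(gene, names)
```

Keeping the source names alongside each gene makes it easy to see which entries are supported by more than one source.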

Databases

AraCyc

KEGG

Figure 1. Example of genes that were computationally identified and experimentally validated.

Co-expression Data

AraNet (Lee et al. 2015)
GeneMANIA (Warde-Farley et al. 2010)
ATTED (Aoki et al. 2016)

Figure 2. Identification of potential SCC genes from three co-expression databases.
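This page does not describe how candidates from the three co-expression databases were combined. As a hedged sketch, one plausible approach is to count how many databases report each candidate and keep those with enough support. The function `rank_by_support`, the `min_support` threshold and the gene names are hypothetical and used for illustration only.

```python
from collections import Counter


def rank_by_support(candidates_by_db, min_support=2):
    """Return (gene, n_databases) pairs reported by at least `min_support` databases."""
    counts = Counter()
    for genes in candidates_by_db.values():
        # Use a set so each database contributes at most once per gene.
        counts.update(set(genes))
    supported = [(gene, n) for gene, n in counts.items() if n >= min_support]
    # Highest support first; ties broken alphabetically for a stable order.
    return sorted(supported, key=lambda item: (-item[1], item[0]))


# Placeholder candidates, not real query results.
example = {
    "AraNet": {"geneX", "geneY", "geneZ"},
    "GeneMANIA": {"geneX", "geneY"},
    "ATTED-II": {"geneX", "geneW"},
}

ranked = rank_by_support(example)
print(ranked)
```

Setting `min_support=3` would keep only candidates found in all three databases, which corresponds to the central region of a three-set Venn diagram.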

Publications

SCC data

SuCComBase also provides a list of SCCs produced in various Brassicales plants: papaya (C. papaya), cabbage (B. rapa), broccoli (B. oleracea) and A. thaliana.

Databases

KNApSAcK

Publications

Relevant information data

Relevant information data are data related to SCCs and SCC genes. This section lists the sources from which these data were taken.

AraCyc: AraCyc (https://www.arabidopsis.org/biocyc/) is a tool for visualizing biochemical pathways of Arabidopsis thaliana (Mueller et al. 2003).
AraNet: AraNet (http://www.inetbio.org/aranet) is a genome-scale functional gene network for Arabidopsis thaliana, constructed by integrating 19 types of genomics data. It can be explored through a web server to identify candidate genes for traits of interest (Lee et al. 2015).
ATTED: ATTED-II (http://atted.jp) is a coexpression database for plant species to aid in the discovery of relationships of unknown genes within a species (Aoki et al. 2016).
GeneMANIA: GeneMANIA (http://www.genemania.org/) is a flexible, user-friendly web interface for generating hypotheses about gene function, analyzing gene lists and prioritizing genes for functional assays (Warde-Farley et al. 2010).
KEGG: Kyoto Encyclopedia of Genes and Genomes (KEGG) (https://www.genome.jp/kegg/) is a database resource for understanding high-level functions and utilities of the biological system, such as the cell, the organism and the ecosystem, from molecular-level information, especially large-scale molecular datasets generated by genome sequencing and other high-throughput experimental technologies (Kanehisa et al. 2016).
KNApSAcK: The KNApSAcK Metabolite Activity DB (http://kanaya.naist.jp/KNApSAcK/) is integrated within the KNApSAcK Family DBs to facilitate further systematized research in various omics fields, especially metabolomics, nutrigenomics and foodomics (Nakamura et al. 2014).
NCBI Gene: NCBI Gene (https://www.ncbi.nlm.nih.gov/gene/) integrates information from a wide range of species. A record may include nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes, and links to genome-, phenotype-, and locus-specific resources worldwide (Brown et al. 2015).
PubMed: PubMed (https://www.ncbi.nlm.nih.gov/pubmed/) contains more than 27 million publications from MEDLINE, life sciences journals and online books.
UniProtKB/Swiss-Prot: UniProt (https://www.uniprot.org/) supplies a comprehensive, high-quality and freely accessible resource of protein sequence and functional information (Bateman et al. 2017).

References

Aoki,Y. et al. (2016) ATTED-II in 2016: a plant coexpression database towards lineage-specific coexpression. Plant Cell Physiol., 57, e5.
Bateman,A. et al. (2017) UniProt: The universal protein knowledgebase. Nucleic Acids Res., 45, D158–D169.
Brown,G.R. et al. (2015) Gene: A gene-centered information resource at NCBI. Nucleic Acids Res., 43, D36–D42.
Kanehisa,M. et al. (2016) KEGG as a reference resource for gene and protein annotation. Nucleic Acids Res., 44, D457–D462.
Lee,T. et al. (2015) AraNet v2: an improved database of co-functional gene networks for the study of Arabidopsis thaliana and 27 other nonmodel plant species. Nucleic Acids Res., 43, D996–D1002.
Mueller,L.A. et al. (2003) AraCyc: a biochemical pathway database for Arabidopsis. Plant Physiol., 132, 453–460.
Nakamura,Y. et al. (2014) KNApSAcK metabolite activity database for retrieving the relationships between metabolites and biological activities. Plant Cell Physiol., 55, 1–9.
Warde-Farley,D. et al. (2010) The GeneMANIA prediction server: biological network integration for gene prioritization and predicting gene function. Nucleic Acids Res., 38, W214–W220.