Database Catalog
Proteins found in human salivary intercalated duct cell line
,
Two-Dimensional Electrophoresis Database of HSG cells proteins
2-dimensional gel electrophoresis of the proteins in HSG (Human salivary intercalated duct cell line, a cell line established with radiation on human salivary glands) is registered. Peptides have been identified from each spot using MALDI-TOF or peptide sequencers.
Protein Structure of proteins that contain hydrogen and hydration water
Analysis of Protein 3D structures containing hydrogen and hydration water with neutron diffraction method. The database contains PDB-derived X-ray structural analysis data and neutron diffraction data.
SAGE of human immune system cells
SAGE analyses of gene expression in cells of human immune systems.
Gene Diversity DataBase System (GDBS)
Gene expression profile of mouse brain during postnatal development
Gene expression profiles were analyzed using Affymetrix GeneChip for the postnatal developmental stages of the mouse cerebellum (2001) and for the developmental processes of the mouse embryonic brain (2005). The 2001 and 2005 datasets are supplementary data for 15018818 and 15893506, respectively.
A Database of Japanese Single Nucleotide Polymorphism for Geriatric Research (JG-SNP)
,
Database of Japanese SNP for geriatric disease
The database contains the sexes, ages, disease status, and polymorphisms of geriatric disease patients at Tokyo Metropolitan Geriatric Hospital. This information is registered in a separate database called GEAD (geriatric autopsy database), which contains the above-mentioned data as well as smoking history, drinking history, pathological findings, and the extent of atherosclerosis.
IINO lab. Germline index
,
Phenotypes of worm, gene function assessed by RNAi
Phenotypes from RNAi gene function inhibition targeted to genes that are specifically expressed in C.elegans germ lines.
Genome Medicine Database of Japan (GeMDBJ)
,
Polymorphism, gene/ protein expression of diseases
Analyses of genetic polymorphisms (SNP), gene expression (with GeneChip) and protein expression (2D-DIGE, LC-MS/MS) related to Alzheimer disease, gastric cancer, diabetes, hypertension, and asthma. Limited patient information (sexes, stratified ages, living prefectures, past history, smoking history) is included. User registration is required to view some of the information.
Characteristics of endogenous? peptides
,
PEPTIDOME
Peptides in living organisms were comprehensively separated and identified, and have been arranged with their electric charges, hydrophobicity, and molecular weights.
Extremobiospheres Research Center
,
Genomes of Extremophiles/ annotation
Extremophile genome sequences and annotations analyzed at JAMSTEC. The database also contains a list of links to extremophile genome information analyzed at other institutes.
Full-length human cDNA Database
A database from the METI "full-length cDNA sequencing and analysis project", containing 30,000 human full-length cDNA sequences collected using the Oligo-cap method.
Standard spectrum of metabolites
,
Standard Spectrum Search
NMR and MS reference spectra for metabolic products (chemical compounds) have been analyzed. They are available only via the search interface. The relation to the NMR reference spectra is not described.
ESTs of Lotus japonicus (model legume)
,
Lotus japonicus EST Index
Lotus japonicus EST sequences, 3'EST consensus sequences, and the annotations are registered. Supplement data for 10819328.
EST analysis of Physcomitrella patens (moss) gene expression
,
Physcomitrella patens Full-Length cDNA Clone Database Search
A list of Physcomitrella patens subsp. patens full-length cDNA clones distributed by RIKEN. Full-length sequences for the clones are registered.
Database of Chimpanzee cDNAs (PRIGEN)
,
Full-length cDNAs of chimpanzee (Pan troglodytes verus)
Full-length cDNA libraries were constructed from chimpanzee brain, liver, testis, and epithelial tissues. The 5' ESTs were then sequenced, and some of the clones were sequenced in full length. Supplementary data for 12727913 and 15677748.
Protein 3000 Project Outcomes Database
A database of the present status and the outcomes of the MEXT Protein 3000 Project.
ESTs of Porphyra yezoensis (red alga)
,
Porphyra yezoensis EST Index
Porphyra yezoensis ESTs and BLAST annotations are registered. Supplement data for 10907854 and Journal of Phycology 39, 923-930 (2003).
Arabidopsis Full-Length Clone Database Search
,
Catalogue of Arabidopsis, full-length cDNA
A list of Arabidopsis full-length cDNA clones distributed by RIKEN. Full length sequences for the clones are registered.
Activation T-DNA tags of mutant Arabidopsis lines
,
Activation Tagging Line Database
The database contains phenotypes and the photos of Arabidopsis activation tag lines that showed phenotypical changes in the T1 generation. The initial intent of the database was to distribute mutant strains.
Genome Network Platform
A database of human and mouse cDNA CAGE tag sequencing data and of molecular interaction data between transcription factors based on the yeast two-hybrid method. Both are outcomes of the genome function information analyses in the MEXT Genome Network Project.
Chlamydomonas reinhardtii EST index
,
ESTs of Chlamydomonas (single-celled green alga)
Chlamydomonas ESTs, the contig sequences, and BLASTX annotations of the contigs are registered. Supplement data for 11089912 and Phycologia 43, 722-726 (2004).
Database for Macaca fascicularis (cynomolgus monkey) full-length cDNA
,
QFbase - Macaca fascicularis cDNA database
EST and full-length sequences and their homology to human sequences of Macaca fascicularis cDNA clones are registered. The full-length cDNAs are referred to in 17194215.
Gene expression database of Ciona intestinalis (sea squirt)
Ciona intestinalis gene expression analyses with EST and in situ hybridization are registered. Sixteen types of cDNA libraries for tissues and developmental stages have been constructed. EST clustering results and mapping results on the genome (obtained from JGI) are registered. In situ hybridization images are searchable by expression location and stage.
Full-length cDNA of silkworm
,
Insect Genome Databases, IGB lab., Univ. of Tokyo
EST sequences of the silkworm full-length cDNA clones.
Immunostaining images of whole mouse sections by all matrix proteins
,
mouse basement membrane bodymap
Immunostaining images of whole mouse sections for all matrix proteins, offered as a resource for matrix researchers and tissue engineers. It covers poly/monoclonal antibodies against each of 44 matrix proteins and immunohistochemical images of E16.5 whole-body embryo sections. List of target proteins: http://www.matrixome.com/bm/EnterBodymap/Protein/protein.asp
Database of functional RNA sequences and literature information
,
fRNAdb
It is one of the outcome databases of METI functional RNA project. It is a database of known and novel functional RNA sequences and the literature.
SNPs in the transcriptional promoter regions in human
,
dbQSNP
A database of SNP sequences and allele frequency information (experimental data) in the promoter regions of the human genome (mainly the 1.0 kb upstream and 0.2 kb downstream regions of the TSS). SNP typing and quantification were performed by SSCP analysis.
Bacteria genomes/ annotation
,
a genome database of microorganisms sequenced at NITE. (DOGA
Annotations of the genomic sequences, proteomic analyses (if any), ORF comparisons with other species (if any), and general descriptions of the microorganisms whose genomes NITE has sequenced. Supplement data for 9679194, 10382966, 11572479, 11418146, 12044378, 16237012, 12840036, 12692562, 16372010.
Database of small nucleolar RNAs from budding yeast
,
Yeast snoRNA Database
A database of budding yeast snoRNA structures and their interactions with other RNAs.
Annotation of yeast, introns
,
Yeast Intron Database
A database of budding yeast introns. Known introns were checked with microarrays, and those confirmed to exist are compiled together with the experimental data into a public database.
XDB
,
Xenopus laevis (frog) gene expression
The database contains Xenopus laevis EST, their assemblies, and WISH images. The assembly sequences are annotated with BLAST searches targeted to NCBI-NR, TIGR-XGI, Xenopus protein database of NIH, and InterProScan. WISH images have been taken from each direction at developmental stages.
Database for worm, genome/annotation
,
WormBase
A database of biological information of C.elegans and other worms. It contains genomes (structures, functions, genetic polymorphisms, comparative genomic studies), genes (structures, expressions, phenotypes, and RNAi), lineages (strains, genetics, and markers), and literature information.
Database of Vibrio parahaemolyticus (gram-negative marine bacterium), genome
,
VPARA(Vibrio parahaemolyticus)
A database of the whole-genome sequences of Vibrio parahaemolyticus and the annotations. Downloadable.
UT Genome Browser (Medaka)
A project support environment to confirm clone information and assembly status.
ESTs of Nicotiana tabacum cell line (BY-2)
,
Transcription Analysis of BY-2
cDNA libraries of the Nicotiana tabacum-derived cell line BY-2 were constructed and the ESTs were sequenced. BLASTX annotations for each EST and the clustering (with BLASTN searches between ESTs) are registered.
Database of disrupted mouse genes by gene trap methods
,
The NAISTrap database
Mutant mouse ES cell lines were produced using random gene disruption by a new gene trap method (UPATrap). Partial trapped gene sequences as well as the homology search results are registered.
Database for mouse genome/annotation
,
The Mouse Genome Informatics (MGI)
An integrated database of mouse genome research at the Jackson Laboratory. It integrates MGD (sequences, gene definitions, mapping, phenotypes, mutants, strains, comparative studies with other mammals), GXD (gene expression, collected from the literature or submitted by researchers), and MTB (information on mouse models that develop tumors).
Database for Arabidopsis genome/annotation
,
The Arabidopsis Information Resource (TAIR)
A database of Arabidopsis genome, genes, and molecular biological data. It contains annotated genome, gene products, metabolism, gene expression, markers, strain resource information, and literature information.
Genome of Streptomyces avermitilis (industrial microorganism)/annotation
,
Streptomyces avermitilis Genome Database
A database of the genome sequences and annotations of Streptomyces avermitilis (a microorganism producing avermectin, an anthelmintic). It contains physical maps, KEGG pathway analyses, protein families, secondary metabolic products, conserved genes, phylogenetic trees based on 16S rRNA, and instructions for requesting cosmid clones. Supplement data for articles (PubMed:11572948, 12692562).
Genome database for anthropoids/ phylogenetic comparison
,
Silver Project (Ape Genome Sequencing)
Genome sequences for the chimpanzee and gorilla. Comparison of DNA sequences between human and these anthropoids, and the comparisons based on the DNA sequences that have been sequenced in the Apes Genome Project (Silver) are registered.
EST analysis of silkworm gene expression
,
SilkBase
Silkworm ESTs and the cDNA library information. Annotation with BLASTX is registered.
Database for baker’s or budding yeast genome/annotation
,
SGD - Saccharomyces Genome Database
A gene-based database of the molecular biological and genetic data of budding yeasts. Most of the annotations rely on manual information extraction from literature. Genomic locations of the genes, GO annotations, sequences for nucleic acids and amino acids, phenotypes, and expression data are registered.
SGD
,
Sugi Genome Database
Micrographs of the morphology of budding yeast mutants
,
SCMD - Saccharomyces cerevisiae Morphological Database
A database that classifies the morphology (budding states) of budding yeast mutants. Feature extraction from the morphology photographs and the classification were conducted computationally.
Database of gene expression profiles in human tissues and organs
,
SBM Database(Systems Biology and Medicine Database)
RefExA registers gene expression of human normal tissues, normal cells, and various cancer cell lines. LSBM GeNet registers gene expression analyses of cells in various pathological states and in drug administration. HUVEC DB registers gene expression changes of HUVEC after stimuli such as TNF-alpha. All the gene expression has been analyzed with GeneChip.
Database for yeast genome/annotation
,
S.pombe genome project
A database containing data from a fission yeast genome project at Sanger Institute. Genome sequences and the annotation, GO annotation, clone libraries, mapping resources (tiling path, gene map, physical map) are contained.
Rice Tos17 Insertion Mutant Database
,
rice, transposon gene disruption mutant strain, strain list
Rice strains with disrupted genes are produced by transferring endogenous transposons (Tos17) and are distributed as resources. The database contains flanking DNA sequences to the transposons.
Database of rice proteome
,
Rice Proteome Database
A database that collected spots of 2D gel electrophoresis targeted to rice tissues and organelles. Protocols of proteomic analyses are registered.
Rice Mitochondrial Genome Information (RMG)
Microarray gene expression of rice
,
Rice Expression Database (RED)
Gene expression data analyzed with rice cDNA microarrays are registered. The database contains NIAS and STAFF-derived data, as well as analyses in other projects using the same microarrays. The article for the database is Trends in Plant Science (2002) Dec 7 (12):563-564.
Genome database for Rhizobia with annotation
,
RhizoBase
A genome database for Rhizobia and a photosynthetic bacterium (Rhodopseudomonas palustris). Genome sequences, ORF information, genes, and the gene categories are registered.
Database for rat genome/annotation
,
Rat Genome Database (RGD)
A rat genome and gene information database. Maps (gene and RH), genes, QTL, SSLP, EST/cDNA, strains, and sequences are registered. A separate user interface allows comparisons between rat, mouse, and human from a disease perspective.
Cross-sectional images of rat brain
,
Rat Brain Sections: Super-fine images
The database contains images of transverse and sagittal sections of rat brains. To display the images, Viewpoint Media Player is required. Scrolling and expansion/reduction is possible using the mouse.
RPG
,
Ribosomal protein gene database
A database of genes encoding ribosomal proteins. It contains nucleic acid sequences, amino acid sequences, gene structures, orthologs for the genes, as well as multiple alignments for each orthologous group. Human data were obtained in a project. Other species data were obtained from public databases.
ROUGE
,
cDNA of mouse, unidentified gene-encoded large proteins?/annotation
A database of cDNA clones (mKIAA/mFLJ) and the analyses from Kazusa mouse cDNA project. A mouse version of HUGE database.
Rice Microarray Opening Site (RMOS)
ESTs of rice
,
RGP Rice cDNA Sequence Database
The database contains ESTs sequenced in NIAS and the clustering analyses.
Arabidopsis transposon mutant strains
,
RARGE [Ac/Ds Transposon Mutants]
A database of insertion positions and the adjacent genes of Arabidopsis transposon mutants. A part of RARGE.
Portal site for Arabidopsis research
,
RARGE- RIKEN Arabidopsis Genome Encyclopedia
A Web site for searching data and resources related to Arabidopsis research at RIKEN. Full-length cDNA, microarray analyses, transposon mutants, genomic locations of the genes, and splicing patterns are registered.
Arabidopsis full-length cDNA
,
RAFL cDNAs
A database of Arabidopsis full-length cDNA sequences and the BLASTX annotation. A part of RARGE.
Human microsatellite polymorphism
,
Polymorphism of Microsatellite Marker Loci in the Japanese P
Heterozygosity of microsatellite markers in Japanese populations. Targeted markers are from deCODE Genetics 2002 (Kong A et al. Nat. Genet. 2002)
Pig genomics information system
A portal page of the China-Denmark joint genome project. Genome sequences and EST sequences are generated by the project and annotated. All of the data are reportedly available in INSD.
Phenome Analysis of Ds transposon-tagging line in Arabidopsi
,
Phenotypes of transposon-insertional mutants? in Arabidopsis
A list of Arabidopsis Ds transposon-inserted mutants with the mutated gene loci and the genomic locations (also categorized as UTR, coding region, exon, or intron). The morphologies of the mutants are classified (eight primary categories and 50 secondary categories), but the list of phenotypes cannot be obtained. Detailed information display uses MIPS sites.
PhenoSITE :Phenotype Semantic Information with Terminology of Experiments
,
Standardized vocabulary for describing phenotypes of mouse mutants
Terms for mouse mutant phenotypes are defined hierarchically in a form similar to GO. The database consists of four categories: Simple category for GSC mouse; Extend representation build of GSCMPE; Mammalian Phenotype (used in MGI at the Jackson Laboratory, not restricted to the mouse); and Mouse adult gross anatomy. Standardized methods (protocols) of phenotypic screening and the terms used are defined. The database targets ENU-induced mouse mutants established at RIKEN, and a table of the phenotypes and the chromosomal mapping is registered.
PRIDE (PSC-RIKEN Database of EST/Gene Expression)
,
Zinnia elegans EST /microarray gene expression
Gene expression analyses of Zinnia elegans with ESTs and microarrays are registered. ESTs are shown with the BLASTX search results for each sequence. Microarray analyses are accessible only via the GeNet system (the service appears to have stopped?).
ESTs of Physcomitrella patens (moss)
,
PHYSCObase
A database of Physcomitrella patens subsp. patens mRNA, ESTs, contigs, and experiment protocols. Downloadable.
Overview of Arabidopsis transposon mutant strains
,
Plant Functional Genomics Research Group
A page describing the outline of a transposon mutation database
Full-length cDNA of pig
,
Pig EST Data Explorer (PEDE)
The database contains pig ESTs from full-length cDNA clones, the assembled contigs, the annotations, and the full-length sequences of selected clones. The database functions as a resource bank to distribute the clones. Pig cSNP Database, a database of SNPs identified in the assembly processes, has been made. It is not clear whether it has ESTs or cDNA sequences that are registered only here. Should they have been registered in INSD?
Database for prostate gene expression
,
PEDB
A database of gene expression in human and mouse prostate, analyzed with ESTs, microarrays, and protein mass spectrometry.
Nocardia farcinica genome
A database of the whole genome sequence, including two plasmids, and the annotation of Nocardia farcinica IFM 10152 (a gram-positive aerobic actinomycete that causes nocardia infections). Supplement data for 15466710.
EST analysis of barley and seed images
,
NBRP-Barley
The database contains EST sequences from nine cDNA libraries of three strains and wild-type barleys at various developmental stages and in various tissues. The EST data overlap with HarvEST. It contains a list of germplasms that can be distributed from Okayama University, some of which are accompanied by photos of the seeds and the sprouts. The database also contains a list of representative strains (chosen in consideration of genetic diversity) called the Core Collection.
Genome of Mycoplasma penetrans (bacterial pathogen)/annotation
,
Mycoplasma penetrans genome
Mycoplasma (Mycoplasma penetrans) genome sequences and the gene predictions are registered. Supplement data for 12466555.
Differential gene expression profiles of mouse strains
,
Mouse DNA Microarray
Gene expressions compared between mouse C57BL/6J and 129X1SvJ strains in newborn brains, in adult spleens, and in adult livers. Agilent microarrays are used for the analysis. Supplement data for 15029957.
Microarray analysis of Ciona intestinalis (ascidian), gene expression
,
Microarray analysis of embryonic retinoic acid target genes
Gene expression profiles of 9,287 candidates of embryonic retinoic acid target genes analyzed with microarrays using cDNA libraries for the Ciona intestinalis EST analyses. Supplement data for an article 12828686. In addition, in situ hybridization images for 91 genes are shown.
ESTs of tomato
,
MiBASE
EST sequences of cDNA libraries from the fruits and the leaves of Micro-Tom tomato are registered. The annotations include homologous genes and clusters in UNIGENE database containing ESTs from other projects and GO terms. Gene expression using microarrays made of the cDNA libraries as described above seems to have been analyzed for tissues, developmental stages, and breeds, but no raw data can be obtained. Supplement data for 15975739, Plant Biotechnol. 22: 161-165(2005)
Medaka EST database
,
Medaka gene expression
A database that contains Medaka ESTs, the library information, an explanation of mutation mapping system using ESTs, and the organization of a microarray (Medaka Microarray 8K).
Database for maize genome/annotation
,
MaizeGDB
Maize genomes and resources database. It contains genome sequences, conserved strains and phenotypes, mutant strains, and genetic maps.
Database for Arabidopsis genome/ annotation
,
MAtDB
The database contains the whole sequences that were sequenced and annotated in Arabidopsis Genome Initiative. Mitochondria genomes and chloroplast genomes are also annotated and are contained in the database.
EST of Halocynthia roretzi (ascidian)
,
MAGEST
A database of Halocynthia roretzi EST and the clustering.
MAEDA (Micro Array Expression DAta search)
,
Microarray analysis of Arabidopsis gene expression
Arabidopsis gene expression analyses using microarrays that were manufactured using 7,000 full-length cDNA clones. URL on GEO: GSE4203, GPL3181.
Genome of Lotus japonicus (model legume) / annotation
,
Lotus japonicus Genome Sequence Project
Lotus japonicus genome clones (TAC: transformation-competent artificial chromosomes), the annotations, chromosomal gene maps, and chloroplast gene sequences are registered. Supplement data for 12056416, 11853318, 11214967, 11853317.
ESTs of Lentinus edodes (shiitake mushroom)
,
LeEST
Lentinus edodes cDNA libraries have been constructed, and the 5' ESTs are sequenced. They are registered with BLASTN search results.
Full-length cDNA clones from rice
,
Knowledge-Oriented Molecular Biological Encyclopedia (KOME)
Rice full-length cDNA sequences and the annotations (homologous genes, clustering analyses, InterPro motif searches, GO assignments) are registered. Supplement data for 12869764.
Distribution pattern of antigens in embryonic worm
,
The Sugimoto Lab C. elegans Monoclonal Antibody Collection
Immunostaining images of stage specificity and localization of proteins in worm embryos are registered. The antibodies are distributed.
EST of silkworm
,
KAIKO cDNA
Silkworm ESTs are organized by cDNA libraries (strain, developmental stages, tissues, and sexes). BLASTX searches by each EST are registered.
Database of Japanese SNPs
,
JSNP
A database of genetic polymorphisms in Japanese populations which are generally observed on or adjacent to the genes.
Database for human metabolic disorder-related SNPs
,
JMDBase/Japan Metabolic disease DataBase
SNPs related to human metabolic diseases (especially hypertension and diabetes) are identified using original algorithms. 15716494 is an article about the effectiveness of the method, and it provides verification on five genes in three populations (Japanese, African Americans, and Caucasians).
Integrated Rice Genome Explorer (INE)
,
Rice genome/annotation
A database of rice genome on which gene maps, physical maps, PCR markers, ESTs, BAC/PAC contigs are mapped.
EST database-viewing software? of crops
,
HarvEST
EST sequences and the assemblies for barley, Brachypodium, citrus, coffee, cowpea, soybean, rice, and wheat. The database has been constructed by Univ. California, Riverside, but sequence data are accepted from cooperating institutes in the project (e.g., Univ. Okayama provides barley ESTs). It also includes genome sequences provided by Affymetrix, which were the source data for the genome chip, for barley, wheat, rice, and soybean.
HUGE - Human Unidentified Gene-Encoded large proteins
,
Unidentified long human cDNA/annotation
A database of cDNA sequences and the analyses of KIAA/FLJ clones collected in Kazusa human cDNA project. The initial purpose was to analyze unidentified genes corresponding to large (>50kDa) proteins. The database registers cDNA sequences, restriction maps, signal sequences, amino acid sequences, motif searches with Pfam and SOSUI.
HIV Infectious Disease Integrated Database
The virus sequences account for only a small part of INSD sequences, but clinical information for about 600 virus hosts appears to be supplemented. The usage is unknown. Registration is required to view the clinical information.
HGS (Human Genome Sequencing)
GiiB-JST mtSNP
,
Human mitochondrial genome polymorphism database
Distribution of polymorphism in mitochondria genomes from patients of seven diseases (96 patients for each disease) has been analyzed. This is a database of the functional differences between individuals related to the paired polymorphisms. 672 INSD entries of mitochondria genomes cite the same article.
GenoBase
,
Integrated database of E.coli K-12 (W3110)
The database has been constructed in NAIST. It is a database of various information related to E.coli. It contains genome sequences, ORF, amino acid sequences, 2D-PAGE proteomics data, microarray gene expression data, literature information, bioinformatics analyses (ORF clustering, codon usage skewness, and clustering of gene expression profiles).
Drosophila genome/annotation
,
GadFly
A site that summarizes various data produced in the Drosophila genome project: (1) genome sequences and annotation; (2) gene expression patterns analyzed with in situ hybridization, validated with microarrays and annotated manually using controlled vocabulary; (3) EST and full-length cDNA sequences; (4) transposon sequences; (5) gene disruption strains using a single P transposable element; (6) comparative genomic analysis of Drosophila; (7) SNP map.
Drosophila Gal4 enhancer trap insertion lines
,
GETDB - Gal4 Enhancer Trap Insertion Database -
Analyses of insertion positions, gene expression patterns, and the phenotypes of Drosophila strains with inserted Gal4 enhancer traps. Resources can be distributed.
GALAXY
,
Structures of N-glycans in glycoproteins
The structures of sugar chains linked to asparagine residues (N-type sugar chains) were analyzed with the original 2D/3D sugar chain mapping method, and the obtained structures (combinations of sugar residue units) are registered.
Full-Toxoplasma
,
Full-length cDNA database of Toxoplasma
A database of the analyses of full-length cDNA clones of the toxoplasma. ESTs of each clone are mapped on draft genome sequences (contigs).
Full-Malaria
,
Plasmodium falciparum (malaria), full-length cDNA Database
A database of full-length cDNA clones and the analyses of two Plasmodia causing human malaria and two Plasmodia causing mouse malaria. ESTs for each clone are mapped to the genome as well as to ESTs in the public domain. The database also contains confirmed homology between Plasmodia with TBLASTX (Other Plasmodia contigs were mapped to the template P. falciparum genome).
Database for human full-length cDNAs
,
FLJ Human cDNA Database
Outcomes of the sequencing and sequence analyses of METI Protein function analysis project/development of splicing variant acquisition technology. It consists of human splicing variant cDNA sequence database and an entire FLJ oligo-cap method based human full-length cDNA sequence database (about 50,000 full-length analyzed sequences and about 1,500,000 5'end analyzed sequences)
ExtremoBase
ENU-based gene-driven mutagenesis in progress (location list
,
Name and chromosomal location of mutant gene in mutant mouse
The same as above. A list sorted by chromosomal location.
ENU-based gene-driven mutagenesis in progress (gene-name lis
,
Name and chromosomal location of mutant gene in mutant mouse
Mouse mutant lineages induced by ENU are selected and listed based on the mutations on the genes. The lineages are based on sperms of G1 mice with no G2 lineage due to lethality or infertility at the time of the construction of Mutant library with phenotypic screening. Resources are distributed. A list sorted by the mutated genes.
ENA
,
European Nucleotide Archive
The European Nucleotide Archive (ENA) captures and presents information relating to experimental workflows that are based around nucleotide sequencing. A typical workflow includes the isolation and preparation of material for sequencing, a run of a sequencing machine in which sequencing data are produced, and a subsequent bioinformatic analysis pipeline. ENA records this information in a data model that covers input information, output machine data, and interpreted information.
EGTC
,
Mouse genes, trapped with gene trap method
Trap clones were obtained from Mouse ES cells using a new gene trap system (exchangeable gene trap system). Based on them, gene transfer or disruption was conducted, and lineage resources (oocytes and sperms) are distributed. The database registers tag sequences for the trapped genes, homology search results, and mutated lineages.
Database of pathogenetic E.coli O157 genome
,
E.coli O157:H7 Sakai genome project
A database of E.coli O157:H7 Sakai genome sequences, pOSAK1 plasmid sequences, and the annotations. Download is possible.
Database of genomes and transcriptional regulations for fila
,
ESTs of Aspergillus oryzae
Aspergillus oryzae cDNA libraries were constructed under several conditions and the 5' ESTs were sequenced. The sequences are available only via FASTA homology search. Promoter analyses seem to be planned, because methods for constructing genome clones for that purpose are described. In addition, a cosmid clone of Aspergillus nidulans is registered.
DBTSS
,
Database of transcription start sites
A database of transcription start sites that have been decided by mapping ESTs of full-length cDNA clones to genomes. Presently, the target species are human, mouse, zebrafish, Plasmodia, and Protoflorideophycidae.
D-HaploDB
,
SNPs of human definitive haplotypes determined by complete hydatidiform moles
Haplotype analysis of 280,000 SNPs from the samples of Japanese complete hydatidiform mole that have been typed with microarrays.
CyanoBase
,
Genome database for Cyanobacteria
A database of the genomes of Cyanobacteria, a photosynthetic bacterium (Chlorobium tepidum TLS), and a purple bacterium (Rhodopseudomonas palustris). Genome sequences, ORF information, genes and their categories, mutations, and proteomic analyses (Cyano2Dbase) are registered. Supplement data for 8590279, 8905231, 9435137, 11759840, 11858227, 12240834, 14621292.
CropNet
Genome mapping in crop plants
Cricket EST DB and Expression DB
,
ESTs of cricket
Cricket EST sequences, clustering analyses, and homology search results are registered.
CluSTr - Clusters of Swiss-Prot and TrEMBL proteins
Automatic classification of SWISS-PROT+TrEMBL proteins
Ciona intestinalis EST project database
Chick Eye EST DB and Expression DB
Cerebellar Development Transcriptome Database (CDT-DB)
,
Gene expression of mouse cerebellum in postnatal development?
Gene expression in the mouse cerebellum after birth, analyzed with various methods. The methods used are fluorescence differential display, cDNA microarray, and GeneChip. The expression patterns of genes expressed in the cerebellum were studied with RT-PCR and in situ hybridization. The results of in situ hybridization are displayed as images, and no verbal description is found.
Cancer Gene Expression Database (CGED)
CGED (Cancer Gene Expression Database) is a database of gene expression profiles and accompanying clinical information. The data of CGED were obtained through collaborative efforts of Nara Institute of Science and Technology and Osaka University School of Medicine to identify genes of clinical importance.
CYORF (Cyanobacteria Gene Annotation Database)
,
Workbench for Cyanobacteria gene annotation
A workbench for the Cyanobacteria research community to annotate genes. General users can search, browse, and download the database.
CREAT portal
,
Gene/protein expression profiles and protein-protein interactions of mouse mKIAA genes
Gene expression analyzed using microarrays based on mKIAA clones obtained in the Kazusa mouse cDNA project. Ectopic expression seems to have been analyzed using hybridization (with images). The InGap database contains protein expression analyzed with western blot, immunohistochemical analysis, and immunoprecipitation using antibodies based on mKIAA. The InCeP database contains protein-protein interactions between mKIAA-expressed proteins analyzed with immunoprecipitation and MS/MS. The interactions can be searched, displayed, and downloaded, but displaying them requires dedicated software.
CGH Database
,
Chromosome aberration in human tumor cells
A database of chromosomal abnormalities in human tumor cell lines analyzed with Comparative Genomic Hybridization. Loss, gain, and amplification are detected.
CASP8
,
Critical Assessment of Techniques for Protein Structure Prediction
Results of the recent Critical Assessment of Techniques for Protein Structure Prediction, CASP8, present several valuable sources of information. First, CASP targets comprise a realistic sample of currently solved protein structures and exemplify the corresponding challenges for predictors. Second, the plethora of predictions by all possible methods provides an unusually rich material for evolutionary analysis of target proteins. Third, CASP results show the current state of the field and highlight specific problems in both predicting and assessing. Finally, these data can serve as grounds to develop and analyze methods for assessing prediction quality. Here we present results of our analysis in these areas. Our objective is not to duplicate CASP assessment, but to use our unique experience as former CASP5 assessors and CASP8 predictors to (i) offer more insights into CASP targets and predictions based on expert analysis, including invaluable analysis prior to target structure release; and (ii) develop an assessment methodology tailored towards current challenges in the field. Specifically, we discuss preparing target structures for assessment, parsing protein domains, balancing evaluations based on domains and on whole chains, dividing targets into categories and developing new evaluation scores. We also present evolutionary analysis of the most interesting and challenging targets.
CAGE
,
CAGE/ transcription start site
A database of CAGE tag mapping to the genome. Human and mouse libraries by the tissues and developmental stages were constructed, and the sequences are mapped to UCSC (golden path) genome. The mapping results are referred to by FANTOM.
C. elegans RNAi Phenome Database
,
Database of RNAi gene disrupted worms
Phenotypes of worm lineages that have undergone RNAi gene disruption are comprehensively registered. Clusterings of the genes based on the phenotypes are also registered and are presented in the form of lineage trees.
Brain Gene Expression Database (BGED)
,
Database of mouse brain gene expression
A database of gene expression in the mouse brain analyzed in various physiological and pathological processes. ATAC-PCR was used for gene expression analysis.
3D brain image database of humans, Japanese monkeys and Rhesus monkeys
,
Brain Atlas Database of Japanese Monkey for WWW
Three-dimensional brain images of humans, Japanese macaques, and rhesus macaques, reconstructed from MRI.
BombMap
,
Bombyx genome map
BodyMap
,
Human and mouse gene expression
A database of gene expression in human and mouse tissues and cells. The gene expression is analyzed based on 3'ESTs.
BloodSAGE
A database of SAGE analyses of gene expression in blood cells.
BSORF
,
Genome database of Bacillus subtilis (soil bacterium)/ annotation
ORFs and annotations of the Bacillus subtilis genome, mutations of the corresponding genes, and microarray gene expression patterns are registered.
BED (Brain EST Database)
Brain EST Database (BED) is based on a collection of 3' end ESTs generated in the Taisho Laboratory of Functional Genomics.
Atlas (ISH Data Base)
,
Gene expression of Dictyostelium (social amoeba)
Gene expression information for Dictyostelium discoideum, a cellular slime mold, is registered. Registered data are cDNA clones from each stage, ESTs, their assemblies, and gene expression images from in situ hybridization.
Aspergillus oryzae RIB 40 genome DB
Aspergillus oryzae EST DataBase
Archaeal Gene Network (Arch GeNet)
,
Protein and gene expression of Thermoplasma volcanium GSS1 (thermophilic archaebacterium)
Protein expression in a thermophilic archaebacterium under aerobic and anaerobic conditions was analyzed with 2-dimensional gel electrophoresis. In addition, gene expression under three types of environments was analyzed using microarrays.
Arabidopsis EST analysis database
,
Arabidopsis thaliana EST Index
Arabidopsis ESTs and the annotation (based on BLASTX). Supplement data for 10907847.
ASSETs (Alternative Splicing Sequence Enriched Tags)
,
Mouse ESTs that are rich in alternative splicing
A database of tag sequences from cDNA libraries that were established from mouse cell lines and that are rich in alternative splicing. The sequences seem to have been classified into patterns based on mapping to Ensembl genomes.
ARTADEdb
,
Arabidopsis exon detection, tool and validation result
A database of Arabidopsis exons detected with tiling arrays at RIKEN. The programs used in the data analysis are also made public.
ARCHAebacterial Information Collection (ARCHAIC)
,
Genome of archaebacteria/annotation
A database of genome sequences, gene structures and their sequences (nucleic acids and amino acids), pseudogenes, operons, and lineages for several archaebacteria. Supplementary data for PubMed:9679194, PubMed:11121031.
ABA
,
Images of Ciona intestinalis (ascidian chordate) morphology in different developmental stages
Ciona intestinalis morphology database that registers images for developmental stages from the fertilized egg to the tadpole larva. It contains 3D reconstruction images in the mid-tailbud stage and cell lineage figures.
5'SAGE
,
5'end serial analysis of gene expression database
A database of 5'SAGE analysis of human gene transcription start sites and the number of expressed tags.
Databank
World-2DPAGE Repository
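A database built by curating 2D-PAGE data from published papers. Detailed information, down to the intensity of individual spots, can be accessed, and the text can be retrieved in an EMBL-like format.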
SRA
,
Sequence Read Archive
The Sequence Read Archive (SRA) stores sequencing data from the next generation of sequencing platforms including Roche 454 GS System®, Illumina Genome Analyzer®, Applied Biosystems SOLiD® System, Helicos Heliscope®, and others.
Probe
,
sequence-specific regions
The NCBI Probe Database is a public registry of nucleic acid reagents designed for use in a wide variety of biomedical research applications, together with information on reagent distributors, probe effectiveness, and computed sequence similarities.
Nucleotide
,
The Entrez Nucleotide database
The Entrez Nucleotide database is a collection of sequences from several sources, including GenBank, RefSeq, and PDB. The number of bases in these databases continues to grow at an exponential rate.
Genome
,
whole genome sequences
The Genome database provides views for a variety of genomes, complete chromosomes, sequence maps with contigs, and integrated genetic and physical maps. The database is organized in six major organism groups: Archaea, Bacteria, Eukaryotae, Viruses, Viroids, and Plasmids and includes complete chromosomes, organelles and plasmids as well as draft genome assemblies.
GSS
,
The Genome Survey Sequences Database
The Genome Survey Sequences Database (GSS) division of GenBank is similar to the EST division, with the exception that most of the sequences are genomic in origin, rather than cDNA (mRNA). It should be noted that two classes (exon trapped products and gene trapped products) may be derived via a cDNA intermediate. Care should be taken when analyzing sequences from either of these classes, as a splicing event could have occurred and the sequence represented in the record may be interrupted when compared to genomic sequence. The GSS division contains (but is not limited to) the following types of data:
GENSAT
,
gene expression atlas of the mouse central nervous system
The GENSAT project aims to map the expression of genes in the central nervous system of the mouse, using both in situ hybridization and transgenic mouse techniques.
Database for Aquatic-vertebrate Science
This database consists of photographs of live fish and fish specimens. Registered photographs increased from 40,000 to 54,583 (Jul. 2006).
Life Science Database Archive Service
Data bank of single nucleotide polymorphisms in the United States
,
dbSNP
A database that collects SNPs and short insertion/deletion polymorphisms.
Data bank of ESTs in the United States
,
dbEST
A dataset collected as GenBank EST division.
Database for mouse genome/annotation
,
The Mouse Genome Informatics (MGI)
An integrated database of mouse genome research at the Jackson Laboratory. MGD (sequences, gene definitions, mapping, phenotypes, mutants, strains, comparative studies with other mammals), GXD (gene expression: collected from literature or submitted by researchers), and MTB (information on mouse models that develop tumors) are integrated.
Database for Arabidopsis genome/annotation
,
The Arabidopsis Information Resource (TAIR)
A database of Arabidopsis genome, genes, and molecular biological data. It contains annotated genome, gene products, metabolism, gene expression, markers, strain resource information, and literature information.
Database for baker’s or budding yeast genome/annotation
,
SGD - Saccharomyces Genome Database
A gene-based database of the molecular biological and genetic data of budding yeasts. Most of the annotations rely on manual information extraction from literature. Genomic locations of the genes, GO annotations, sequences for nucleic acids and amino acids, phenotypes, and expression data are registered.
Database for rat genome/annotation
,
Rat Genome Database (RGD)
A rat genome and gene information database. Maps (gene and RH), genes, QTL, SSLP, EST/cDNA, strains, and sequences are registered. A separate user interface allows comparisons between rat, mouse, and human from a disease point of view.
PubChem Substance
,
deposited chemical substance records
The PubChem Substances Database contains descriptions of chemical samples, from a variety of sources, and links to PubMed citations, protein 3D structures, and biological screening results that are available in PubChem BioAssay. If the contents of a chemical sample are known, the description includes links to PubChem Compound.
Bioactivity screens of chemical substances
,
PubChem Bio Assay
The PubChem BioAssay Database contains bioactivity screens of chemical substances described in PubChem Substance. It provides searchable descriptions of each bioassay, including descriptions of the conditions and readouts specific to that screening procedure.
MS/MS proteomic experiments
,
Peptidome
Peptidome is a public repository that archives and freely distributes tandem mass spectrometry peptide and protein identification data generated by the scientific community. Several layers of data are captured to promote understanding of the experiment and analysis of the underlying data.
PDDB
,
the Prion Disease Database
Prion diseases reflect conformational conversion of benign isoforms of prion protein (PrPC) to malignant PrPSc isoforms. Networks perturbed by PrPSc accumulation and their ties to pathological events are poorly understood. Time-course transcriptomic and phenotypic data in animal models are critical for understanding prion-perturbed networks in systems biology studies. Here, we present the Prion Disease Database (PDDB), the most comprehensive data resource on mouse prion diseases to date. The PDDB contains: (i) time-course mRNA measurements spanning the interval from prion inoculation through appearance of clinical signs in eight mouse strain-prion strain combinations and (ii) histoblots showing temporal PrPSc accumulation patterns in brains from each mouse–prion combination. To facilitate prion research, the PDDB also provides a suite of analytical tools for reconstructing dynamic networks via integration of temporal mRNA and interaction data and for analyzing these networks to generate hypotheses.
Data bank of protein structures in Japan
,
PDBj (Protein Data Bank Japan)
The Japanese node of the database of protein three-dimensional structures. It manages wwPDB in cooperation with U.S. RCSB and European MSD-EBI.
Data bank for protein 3D structures
,
PDB
A databank that collects protein 3D structures determined mainly by X-ray crystal structure analysis or NMR.
Data bank for 3D structural information about nucleic acids
,
NDB
3D structural information on nucleic acids is accepted from researchers and made public as a database.
Microsatellite markers of mouse strains
,
Mouse Microsatellite Data Base of Japan (MMDBJ)
A database of mouse microsatellite markers, as well as a repository that accepts new data from researchers. Analyses showing SSLP differences between mouse strains are registered together with the PCR conditions. Emphasis is on Japanese mice (MSM and JF1).
KEGG - Kyoto Encyclopedia of Genes and Genomes
Metabolic and regulatory pathways
Japanese GenomeNet
Data bank of nucleic acid sequences in the United States
,
GenBank(R)
It is a databank that accepts deposition of nucleic acid sequences from researchers in various countries. It routinely exchanges data with EBI (EMBL) and DDBJ.
GEO - Gene Expression Omnibus
,
Data bank of gene expression in the United States
A database run by NCBI that accepts and publishes deposited gene expression data. The database supports microarray (DNA chip) and SAGE data.
ExPASy-SWISS 2D PAGE database
The method of data collection is unclear. The data contain the materials, spot IDs, spot protein names, and gel images. Searching by protein name retrieves the 2D gels on which the protein is identified; searching by material name retrieves the 2D-PAGE images and spot lists for that material. The database resembles a collection of 2D gel figures from articles and is useful for generating ideas. The site is slow, so the Korean mirror site may be a bit faster.
EST
,
Expressed Sequence Tags database
Expressed Sequence Tags database (dbEST) (Nature Genetics 4:332-3;1993) is a division of GenBank that contains sequence data and other information on "single-pass" cDNA sequences, or "Expressed Sequence Tags", from a number of organisms.
ENTREZ
ENTREZ is a cross-database, web-based search engine.
Data bank of nucleic acid sequences in Europe
,
EMBL Nucleotide Sequence Database: developments in 2005.
It is a databank that accepts deposition of nucleic acid sequence data from researchers in European countries. It routinely exchanges data with U.S. NCBI (GenBank) and Japan DDBJ.
DDBJ Trace Archive
DDBJ - DNA Data Bank of Japan
,
Nucleic acid sequence data bank of Japan
A databank that accepts nucleic acid sequence data mainly from Japanese researchers. Data from all over the world are collected by routinely exchanging data with U.S. NCBI (GenBank) and European EBI (EMBL).
Cell Line Catalog
A cell line catalogue of cancer cell lines that the cancer cell bank distributes.
CIBEX
,
Gene expression databank of Japan
Deposition of gene expression data from microarrays by researchers is accepted and published.
BioImage
,
Data bank of biological images
A database of biologically informative images (especially microscopic photos) that have been deposited by researchers. Literature information is required as metadata at the time of image registration. Compatible with multi-slice images.
BMRB - BioMagResBank
,
Biological Magnetic Resonance Data Bank
The database accepts depositions from researchers of 3D structural data of biological macromolecules determined by NMR. It was constructed in cooperation with PDB and NDB, and data are deposited in these databanks via BMRB.
ArrayExpress
,
European data bank of microarray gene expression
A database operated by EBI that accepts and publishes microarray gene expression data. Arrays (design), gene expression, and protocols can be registered separately. Accepted data types are ChIP-chip, CGH, gene expression, protein arrays, and RNAi.
Androgen Receptor Gene Mutations Database
Mutations in the androgen receptor gene
Program
UniSTS
,
markers and mapping data
UniSTS is a comprehensive database of sequence tagged sites (STSs) derived from STS-based maps and other experiments. STSs are defined by PCR primer pairs and are associated with additional information such as genomic position, genes, and sequences.
Gene-oriented clusters of transcript sequences
,
UniGene
UniGene is an experimental system for automatically partitioning GenBank sequences into a non-redundant set of gene-oriented clusters. Each UniGene cluster contains sequences that represent a unique gene, as well as related information such as the tissue types in which the gene has been expressed and map location.
HomoloGene
,
eukaryotic homology groups
HomoloGene is a system for automated detection of homologs among the annotated genes of several completely sequenced eukaryotic genomes.
Protein-Nucleic Acid complex database
Protein-nucleic acid complex entries have been collected from PDB.
PDB-REPRDB Representative protein chains from PDB
,
Representative protein chains
A database that groups PDB proteins based on similarity in sequence and structure. An interface is provided with which representative proteins for the groups can be selected by applying specified rules.
Cross reference of public databases
,
LinkDB Search
Cross-references between multiple molecular biology databases. Starting from an entry in one database, it searches for relevant data in other databases.
DIGITized Genes
A list of candidate genes explored through application of DIGIT program to human genome.
GENIUS Ⅱ
A database of predicted coding regions in genome sequences and their protein 3D structures. It is essentially built on searches for homologous proteins using PSI-BLAST.
Base-Amino Acid Interaction Database
The database has extracted entries containing nucleic acids from PDB, and organized them to enable searches by the presence of nucleic acid-amino acid interactions.
Database of yeast gene expression
,
yMGV - Yeast microarray global viewer
A database of microarray analyses of gene expression in budding and fission yeasts.
viral probe database
All the entries of the INSD Virus division are organized, and PCR primers have been designed for their identification. There are many databases with the same purpose, enough to make up a category of their own.
snoOPY
The database contains snoRNAs (small nucleolar RNAs) for 10 organisms.
MutationView
The mi-R ontology database
,
miRò
miRò is a web-based knowledge base that provides users with miRNA–phenotype associations in humans. It integrates data from various online sources, such as databases of miRNAs, ontologies, diseases and targets, into a unified database equipped with an intuitive and flexible query interface and data mining facilities. The main goal of miRò is the establishment of a knowledge base which allows non-trivial analysis through sophisticated mining techniques and the introduction of a new layer of associations between genes and phenotypes inferred based on miRNAs annotations. Furthermore, a specificity function applied to validated data highlights the most significant associations.
Database for molecular surfaces of protein’s functional sites
,
eF-site
A database of calculated surface characteristics of protein functional domains. PDB structures were used as the material, and surface electrostatic potentials, hydrophobic characteristics, and Connolly surfaces were calculated and are contained in the database.
Changes in the nucleic acid sequences that cause protein polymorphism
,
dbProP (a Protein Polymorphism database)
A database containing gene sequence alterations that affect the amino acid sequences of proteins, namely, SNPs in coding regions and alternative splicing. The target is human.
Annotation of tmRNA sequences
,
The tmRNA website
A database of tmRNA information. It contains sequences, secondary structures (the bases in the sequences are colored by the structural elements), corresponding proteolysis tag peptides, as well as multiple alignments of all the tmRNA sequence sets and all the tag peptide sets. The sequences consist of those identified by direct sequencing and those obtained from public databases.
Database for tmRNA sequences
,
The tmRDB and SRPDB resources.
A database of tmRNA (a stable low molecular weight RNA that has the functions of both tRNA and mRNA and is widely observed in bacteria), SRP RNA (signal recognition particle RNA), proteolysis tag peptides, and related proteins. In addition to the sequences, multiple alignments, topological structures (only tmRNA and SRP RNA), and 3D structures are contained.
Automatic annotation of genomes
,
The UCSC Genome Browser Database: update 2006.
A database of automated annotations of vertebrates and major model organisms that have public genome sequences. Annotations of each type (including markers, BAC end map positions, RefSeq entries, GeneScan results, and mapped EST locations) are managed as separate tracks. Proteomes (Proteome Browser) and in situ images (VisiGene) are managed separately.
Automatic annotation of Bacillus subtilis (soil bacterium) genome
,
SubtiList
A database of annotations of the Bacillus subtilis genome (gene location, functional assignment, and links to references).
Automatically clustered human ESTs
,
STACK
A database that categorizes human ESTs and mRNAs by expressed tissue, developmental stage, and related disease, and clusters the sequences in each category with an original method. Descriptions of splicing variants are included. The system used to build this database is made available.
Amino acid sequence comparisons
,
SSDB (Sequence Similarity Database)
A database of comparative analyses of amino acid sequences of protein coding genes on known genomes.
Rice PIPELINE
Automatically annotated rice genome
,
Rice Genome Automated Annotation System (Rice GAAS)
A database of algorithmic annotations of rice genomes. They include gene prediction (GENESCAN, RiceHMM, FGENESH, and MZEF), splicing site prediction (SplicePredictor), homology searches (BLAST, HMMer, ProfileScan, and MOTIF), repetitive sequence searches (RepeatMasker, Printrepeats), signal sequence search (SignalScan), protein localization signal prediction (PSORT), and transmembrane protein secondary structure prediction (SOSUI).
Database for rRNA sequences
,
Ribosomal Database Project (RDP-II)
A database of known rRNA sequences from GenBank. The alignments that are also provided have been constructed with original algorithms that take into account sequences and secondary structures at the same time. The database has a browser that organizes the entries based on taxonomic hierarchies.
RPSD
,
Rice Protein Structure Database
A database that collects rice protein structures. The data consist of 3D structures derived from PDB and predicted structures from GTOP.
Protein Clusters
,
a collection of related protein sequences
This collection of related protein sequences (clusters) consists of Reference Sequence proteins encoded by complete genomes. This database contains both curated and non-curated clusters.
PrognoScan
ProMode
Fluctuations of various proteins in their normal vibration modes have been calculated. The results are shown as animations.
Database for protein domain families
,
ProDom
A database of domains extracted from algorithmically classified known amino acid sequences. Family classification was conducted based on SWISS-PROT and TrEMBL using PSI-BLAST. Domains were explored through multiple alignments in each family. The database contains ProDom, based on all the amino acid sequences, and ProDom-CG, based only on amino acid sequences derived from completed genomes.
Database for human non-synonymous SNPs
,
PicSNP/A Catalog of Non-Synonymous SNP
Non-synonymous SNPs are collected automatically. The source material is the NCBI human draft genome sequences; the gene sequences and SNPs in the Feature were compared with SwissProt to select non-synonymous SNPs. GO classification of the identified genes is also shown.
PRIME (PRotein Interaction and Molecular information databas
A revised version of the Kinase Pathway Database that additionally contains interaction types and the reliability of the evidence.
PIR-PSD
,
Resource of protein sequence information
The database functioned as a databank collecting amino acid sequences until 2004. Presently, it provides protein annotation resources and tools as a part of UniProt. PIRSF is a database that classifies full-length protein sequences from an evolutionary point of view. iProClass is a database that assigns to UniProtKB entries cross-references to PDB, COG, Pfam, GenBank, GEO, OMIM, PubMed, GO, DIP, Swiss-2DPAGE, and KEGG.
PDBSTR
A reconstruction of PDB for KEGG.
Automatic annotation of Bacillus subtilis (soil bacterium) genome
,
NRSub
A database of Bacillus subtilis genome on which entries of SWISS-PROT, ENZYME, and HOBACGEN are mapped and cross-referenced.
Microsatellite markers of mouse strains
,
Mouse Microsatellite Data Base of Japan (MMDBJ)
A database of mouse microsatellite markers, as well as a repository that accepts new data from researchers. Analyses showing SSLP differences between mouse strains are registered together with the PCR conditions. Emphasis is on Japanese mice (MSM and JF1).
MitoFish
,
Mitochondrial genome database of fish
Fish mitochondrial genome sequences were collected from GenBank and RefSeq. Taxonomical information was obtained from FishBase, NCBI Taxonomy Browser, The Catalog of Fishes, and Fish Database of Japan. Literature information was obtained from PubMed. Related biological sequences were obtained from DDBJ.
Microbial Genome Workbench
Publicly available genome sequences of bacteria and archaebacteria were collected and made searchable by homology, keywords, protein molecular weight, and pI.
MBGD
,
Ortholog/ homolog of microbial genomes
A database of microorganism full-length genomes and the orthologous/homologous relations between the genes.
LigandBox
The database contains 3D images of all ligands in KEGG DRUG. All images were created with the molecular simulation system myPresto.
Chemical compounds relevant to life
,
LIGAND
A database of chemical compounds and the reactions which are relevant to biological processes. It consists of COMPOUND, DRUGS, GLYCAN, REACTION, RPAIR, ENZYME (from Enzyme Nomenclature).
Database of kinase pathways
,
Kinase Pathway Database
A database of protein kinases of major eukaryotes with completely sequenced genomes. Protein classes and functions, orthologous relations between species, protein interactions, protein domains, and protein structures are registered as well. Protein interactions were collected with natural language processing of literature.
Kaiko Genome Automated Annotation System (KAIKO GAAS)
A database of predicted genes and functional regions on silkworm genomes (BAC and WGS assemblies). The annotations were conducted algorithmically, and GeneScan, FGENESH, MZEF, SplicePredictor, BLAST, HMMer, ProfileScan, MOTIF, tRNAscan-SE, PSORT, SOSUI were used.
KATANA
A collection of Arabidopsis gene annotations from various databases, summarized in searchable and browsable formats.
KAIKO BLAST
Annotations of protein characteristics and function
,
InterPro
A database that contains information on the protein family domains, and functional parts that were collected from several sources and were integrated by specialists. Amino acid sequences are collected separately as InterProtKB.
Annotations of proteins related to the immune system
,
IMGT/LIGM-DB, the IMGT comprehensive database of immunoglobu
A database that collects immunoglobulins (IG), T cell receptors (TR), major histocompatibility complexes (MHC), and other proteins related to immunity (RPI) of human and other vertebrates. The database consists of five datasets, including IMGT/LIGM-DB (IG and TR sequences), IMGT/PRIMER-DB (primers for IG and TR), IMGT/GENE-DB (human and mouse immunity-related genes organized based on the genomes), and IMGT/3Dstructure-DB (immunity-related genes with known 3D structures).
Het-PDB Navi
3D structures of low molecular weight molecules that appear in PDB entries are collected and classified by molecular type.
Analysis results of hepatitis virus gene lineages
,
Hepatitis Virus Database
Evolutionary lineage analyses of hepatitis virus (types B, C, E) genes are registered. The analyses are performed algorithmically on hepatitis virus sequences from DDBJ INSD entries. The database is updated after major updates.
Automatically annotated whole human genome
,
HOWDY
Human genome data are collected from public databases all over the world, and each entry is located on the chromosomes so that the entries can be related to each other.
HAL
A database of human genes that have been discovered using original algorithms. The genes seem to have been defined from integration of predicted genes in various databases and various prediction algorithms. There are datasets for human, chimpanzee, mouse, rat, dog, and chicken.
Glycan
,
Glycome related pathways
A database of sugar chains and the relevant pathways that have been collected from KEGG, CarbBank, and literature. Tools to generate possible structures of sugar chains are included. A part of KEGG LIGAND.
Automatically annotated microbial genomes
,
Genome Information Broker
It is a database of full-length genomes mainly of microorganisms. Structures, gene names, functions, and characteristics are allocated to each genome's ORFs and are contained in the database.
GTOP
,
Prediction of protein structures
A database of all the predicted or confirmed ORFs on known genomes, as well as prediction of 3D structures and the functions of the encoded proteins. The database contains 3D structure prediction (by Reverse PSI-BLAST), analyses of protein functions (by BLAST), motif analyses (by PROSITE), gene family classification (by Pfam), prediction of transmembrane regions (by SOSUI), prediction of coiled-coil regions (by Multicoil), and analyses of repetitive sequences (by RepAlign).
Database of G-protein coupled receptor ligands
,
GPCRDB
A database that gathers information on G-protein coupled receptors from public databases, adds its own analyses, and organizes the results. It contains sequences, mutation positions, 3D structures (PDB-derived data and predictive models), ligand binding constants, and multiple alignments and lineage trees for each family.
Database for G-protein coupled receptor (GPCR) and ligand interaction
,
GLIDA
Interactions between G-protein coupled receptors and the ligands are collected from public databases and linked.
GENES
,
Gene sequences of known genomes
Genes on the known genomes are collected from public data, and the ID numbers are allocated to use them in an integrated way with other KEGG systems.
Database of functional repeats in mouse cDNA
,
FREP
A database of repetitive sequences in CDS regions of mouse cDNA sequences that are predicted to be functional. Locations on cDNA/genome, polyA signal locations, motifs in translated proteins, and related MeSH terms are registered.
FACTS
,
Link between mouse cDNA and literature
A database that maps literature information onto RIKEN mouse full-length cDNA clones. It consists of links between functions predicted from sequence homology searches and information on protein functions obtained from keyword searches of the literature.
EpoDB - Erythropoiesis Database
,
Genes related to red blood cell hematopoiesis
A database that collects and organizes genes expressed during erythropoiesis in vertebrates, together with their sequences and expression information, from public databases.
Automatically annotated genomes
,
Ensembl
A database of automatic annotation for eukaryotes (especially vertebrates) whose genome sequences are in the public domain. The annotation platform is also provided, and many other groups use it to build their own genome databases.
EXPRESSION
,
Microarray gene expression data
This database was constructed to integrate microarray gene expression data with KEGG genome and KEGG pathway data. Gene expression data for Cyanophyceae, Bacillus subtilis, E. coli, human, and budding yeast are registered.
Gene expression of imprinted genes in mouse
,
EICO DB
A database of mouse imprinted gene candidates and their gene expression as confirmed with microarrays. SNPs on the genes and the corresponding human genome regions are also included. It is aimed at the discovery of new imprinted genes.
DBGET
,
Simple search system of KEGG
It is a simple interface to make keyword searches in KEGG and generic molecular biology databases.
DB-SPIRE
DART
Cytokine Signaling Pathway Database
,
Signaling pathway of cytokine
Information related to cytokine signaling pathways is collected. Ligand-receptor relations of the chemokines, 3D structures and domain structures of receptors, interspecies lineages of receptors, a list of kinases, and links to other databases are registered.
Cytokine Family cDNA Database (dbCFC)
,
Genes and proteins of cytokines
A database and portal site that collects information on cytokine genes, cDNA, and proteins from public databases and arranges it by family and by gene. All detailed information is provided as links to the original databases.
ConfC
CUTG - Codon Usage Tabulated from GenBank
,
Database of codon usage frequency
A database of codon usage probability in various species based on GenBank CDS.
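As a rough illustration of what a codon usage table records, the sketch below counts the codons of a single in-frame coding sequence and reports each as a frequency per thousand codons. The function name `codon_usage` and the toy sequence are hypothetical and chosen only for illustration; this is not the procedure CUTG itself uses.

```python
from collections import Counter


def codon_usage(cds: str) -> dict[str, float]:
    """Return codon frequencies (per thousand codons) for one in-frame CDS."""
    seq = cds.upper().replace("U", "T")
    # Ignore a trailing partial codon, if any.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: 1000.0 * n / total for codon, n in counts.items()}


# Toy CDS (hypothetical): ATG GCT GCT AAA TAA
usage = codon_usage("ATGGCTGCTAAATAA")
```

A real table would pool counts over all CDS entries of a species before normalizing, rather than working from one sequence.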
CSDBase - Cold Shock Domain database
Cold shock domain-containing proteins
CSA - Catalytic Site Atlas
Enzyme active sites and catalytic residues in enzymes of known 3D structure
COG - Clusters of Orthologous Groups of proteins
,
Orthologous groups of proteins
A database that classifies the proteins of prokaryotes and eukaryotes with sequenced genomes into orthologous groups. The groups are organized by functional categories (GO annotations). A separate database, KOG, classifies predicted orthologs in eukaryote genomes in the same way.
Conserved Domains and Protein Classification
Curated alignments of protein domains from Pfam, SMART and COG databases
BodyMap-Xs
,
Gene expression organized and compared across species
The database enables inter-species comparison of gene expression patterns based on orthologous relations in the organs and the genes between species. Breakdown of the cDNA libraries into source organs is conducted using original taggers that automatically analyze material description from the latest dbEST (an EST division of DDBJ), while the genes are classified based on UniGene. The genes between species are linked using InParanoid ortholog relations.
Blocks
,
Highly conserved regions of proteins
A database of ungapped alignments of highly conserved regions for each known protein family.
Alternative Splicing and Transcription Archives (ASTRA)
Alternative exon patterns by alternative splicing are predicted by comparison of the genome and cDNA. Target species are human, mouse, Drosophila, worm, Arabidopsis, and rice.
AMINOACYL-tRNA SYNTHETASES DATABASE
,
Database for aminoacyl-tRNA synthetases
A database of known aminoacyl-tRNA synthetases (enzymes believed to have appeared in the early stages of evolution). They are classified by species and by the corresponding amino acid.
Predictions of G-protein coupled receptor genes
,
SEVENS database
A database of predicted genes encoding G-protein coupled receptors (proteins with seven transmembrane helices) from known genome sequences.
3DinSight
Several protein databases are linked to enable inter-database tracing of the links between the database entries.
3DMET: a database collecting three-dimensional structures of natural metabolites.
BioResource
Pathogenic microbes
NBRP Oryzabase
,
Rice research portal site
A portal site for data and resources related to rice research. Information related to species (lineages, wild species, mutants), gene expression (relationships with tissues and developmental stages),
KOMUGI
,
Wheat Genetic Resources Database
National BioResource Project (NBRP)::C.elegans
The Japanese Morning Glory (=Asagao)
National BioResource Project (NBRP Information Site)
This is a metadata site connecting resource centers at 28 institutes. It has the most organized list of resource sites at other institutes in Japan. The URL is http://shigen.lab.nig.ac.jp/wgr/jgr/jgrUrlList.jsp. The National BioResource Project (NBRP) is a national project that arranges the systematic collection, conservation, and distribution of bioresources that are widely used as experimental materials in life science research and whose importance the Japanese government specifically recognizes. Bioresources here means experimental animals and plants, cells, and genetic materials such as DNA; the term is treated as a synonym of biological genetic resources.
Distribution of Full-length Human cDNA Clones
NBRC Culture Catalogue
JMSR (Japan Mouse Strain Resources Database)
JCRB Cellbank
,
Japanese Collection of Research Bioresources (JCRB)
JCRB consists of a gene bank (http://genebank.nibio.go.jp/), a cell bank (http://cellbank.nibio.go.jo/cellbank.html), and an experimental animal research resource bank (http://animal.nibio.go.jp/). The gene bank develops, conserves, and distributes research resources that are the basis of disease and drug discovery research; the cell bank accepts deposits of, conserves, and distributes cell lines; the experimental animal research resource bank provides services for the distribution and nutrition of mice, including production of frozen embryos and sperm, as well as fertilization and development. These resources are distributed for a fee in Japan and abroad from the master bank of JCRB via the human science research resource bank (HSRRB) (http://www.jhsf.or.jp/index_b.html).
Clone Registry
,
A database for genomic clones and libraries
A database that integrates information about genomic clones and libraries, including sequence data, genomic position, and distributor information.
CARD R-BASE
BRC JCM
RIKEN BRC DNA BANK
RIKEN Bioresource Center CELL BANK
AddGene
The website of Addgene, a nonprofit organization that distributes plasmids for making iPS cells.
Knowledge Model
Taxonomy
,
The NCBI Entrez Taxonomy
The NCBI taxonomy database contains the names of all organisms that are represented in the genetic databases with at least one nucleotide or protein sequence. Click on the tree if you want to browse the taxonomic structure or retrieve sequence data for a particular group of organisms.
Dictionary of carbohydrate antigens and antibodies
,
Sugar chain database (GlycoEpitope)
A dictionary of carbohydrate antigens and the antibodies that recognize them. Information on the antigens includes sugar chains, antibodies that recognize them, glycoproteins carrying the antigens, glycolipids containing them as a building block, and the enzymes participating in their biosynthesis and degradation. Information on the antibodies includes the antibodies themselves, the sugar chain sequences that they recognize, examples of immunoprecipitation, immunoblotting, and histochemistry experiments using them, and places to obtain them.
Database for Protein-Ligand Interactions
,
Dictionary of protein-ligand interactions
A database of protein-ligand interactions from literature. The database contains information on ligands (name, molecular structures, molecular weights, etc), proteins (name, organisms, ID numbers in PIR/SWISS-PROT/PDB, etc) and experiments (binding/inhibitory activity, etc), and the citations.
A dictionary of technical terms in the life sciences
,
LIFE SCIENCE DICTIONARY PROJECT
Dictionaries of technical terms that are used in life sciences. English-Japanese, Japanese-English, and co-occurrence searches are available, and the usages of the terms in article abstracts can be confirmed.
Multidimensional genome annotation viewer
,
OmicBrowse
A genome annotation database browser built on OmicSpace. OmicSpace is a genome annotation database described in a dedicated XML format called OSML (OmicSpace Markup Language). To describe interactions, the data are placed on a two-dimensional plane whose coordinates are genomes. The planes are stacked in the order genomic, transcriptomic, proteomic, metabolomic, and phenomic, which enables tracing between planes. In addition, a dataset coordinate and an ontology coordinate are defined. Data are collected from literature, and interaction data are collected from KEGG and MetaCyc.
Transcription Product Database (TraP)
Knowledge model of signal transduction pathways
,
The Signaling PAthway Database (SPAD)
A database that is a collection and visualization of signal transduction pathways. The pathways are classified as growth factors, cytokines, hormones, and stresses, in correspondence to extracellular signal molecules.
Knowledge model of gene ontology
,
The Gene Ontology (GO) project in 2006.
A database that provides structured and controlled vocabularies to describe genes, gene products, and sequences.
Database on Translational Signals
Knowledge model of transmembrane proteins
,
TMPDB
Experimentally verified transmembrane proteins have been collected from literature and are topologically classified.
SIDER Side Effect Resource
Rice Research
Dictionary of restriction enzymes
,
REBASE
A database of restriction enzymes, DNA methyltransferases, and proteins related to modification of restriction sites. It contains enzyme sequences, 3D structures, sources, recognition/cutting sequences, similar enzymes, methylation susceptibility, and commercial suppliers. The database is accessible via the web and is downloadable.
Dictionary of thermodynamic parameters of proteins
,
ProTherm
A database of thermodynamic data for wild-type and mutant proteins, focusing mainly on thermodynamic parameters but also covering secondary structure, solvent accessibility, experimental conditions, methods, and protein activity. Parameters included are Gibbs free energy, enthalpy changes, heat capacity changes, and transition temperature.
Knowledge model of thermodynamic interactions between proteins and nucleic acids
,
ProNIT
A database of experimentally determined thermodynamic data on interactions between proteins and nucleic acids.
PhenoSITE: Phenotype Semantic Information with Terminology of Experiments
,
Standardized vocabulary for describing phenotypes of mouse mutants
Terms for mouse mutant phenotypes are hierarchically defined in a form similar to GO. The database consists of four categories: Simple category for GSC mouse, Extend representation build of GSCMPE, Mammalian Phenotype (used in MGI at the Jackson Lab; not restricted to the mouse), and Mouse adult gross anatomy. Standardized methods (protocols) of phenotypic screening and the terms used are defined. The database targets ENU-induced mouse mutants established in RIKEN, and a table of the phenotypes and the chromosomal mapping is registered.
PLAnt Cis-acting Regulatory DNA Elements Database
,
PLAnt Cis-acting Regulatory DNA Elements Database (PLACE)
A database of plant cis-element motifs collected from literature. Downloadable.
Knowledge model of operons
,
ODB
A database of operons from various species that have been collected and reconstructed from literature and databases. A function to explore operon candidates based on prediction is provided.
Dictionary of standard NMR spectra
,
PRIMe: SpinAssign
Reference NMR chemical shifts of metabolites (chemical compounds) have been collected. The contributing databases are not documented in detail, but ExPASy Biochemical Pathways, BMRB (BioMagResBank, www.bmrb.wisc.edu), SDBS (spectral database for organic compounds, www.aist.go.jp/RIODB/SDBS/cgi-bin/cre_index.cgi), and NMRShiftDB (http://www.nmrshiftdb.org/) seem to be used. Articles describing use of the database for plants include 17035691.
MeSH
,
detailed information about NLM's controlled vocabulary
MeSH is the U.S. National Library of Medicine's controlled vocabulary used for indexing articles for MEDLINE/PubMed. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts.
MDeR
,
The metadata element repository in life science
KEGG Pathway
,
Knowledge model of biomolecular interactions/reactions
A database collecting pathways (molecular interactions). Metabolism maps, inter/intracellular information processing maps, and human disease association maps are registered.
INOH pathway database
Genomic Object Net Pathway Database
GENA (Gene Name Dictionary)
Database of Genomic Variants
The objective of the Database of Genomic Variants is to provide a comprehensive summary of structural variation in the human genome. We define structural variation as genomic alterations that involve segments of DNA that are larger than 1 kb. For the purpose of this database, we focus on variants that are not directly correlated with specific phenotypes. The Database of Genomic Variants provides a useful catalog of control data for studies aiming to correlate genomic variation with phenotypic data. The database is continuously updated with new data from peer reviewed research studies. We always welcome suggestions and comments regarding the database from the research community.
DBTGR
,
Knowledge model of Tunicate gene expression regulation
A database of Ascidian gene expression loci, control regions, promoter sequences, and transcription factors. It includes inter-species comparisons of promoters (C. intestinalis vs. C. savignyi).
Bacillus subtilis (soil bacterium) promoters / Knowledge model of transcription factors
,
DBTBS
A database of promoters, transcription factors, and regulated genes of Bacillus subtilis. Only experimentally verified targets from the literature are collected.
CleanEx
Expression reference database, linking heterogeneous expression data to facilitate cross-dataset comparisons
Cell System Markup Language (CSML)
Biomarker Candidates
,
Biomarker search service
A service (and the supporting databases) to search for biomarker candidates from given keywords. User-specified keywords are searched in medical documents (originating from Medline, OMIM, and PPI), and chemical compounds that appear specifically in the hit documents are listed as marker candidates. At the same time, genes/proteins related to the keywords are searched in the same way, the chemical compounds corresponding to those genes/proteins are looked up, and the hits are also treated as marker candidates. The locations of the hit compounds in the documents can be viewed.
BioTermNet
BRITE
,
Knowledge model of functional hierarchies and binary relationships of biological systems
A database of hierarchical expressions of the relationships in biological systems. Gene orthologs, protein families, protein interactions, chemical compounds and the reactions, drugs, and diseases are included.
BRENDA
,
Dictionary of enzymes
A database of information on enzymes manually extracted from literature. The information is organized by EC number and includes structures, sequences, functions, reaction characteristics, isolation methods, stability, source organisms/tissues/localization, and relations to diseases.
AAindex
,
Dictionary of physicochemical and biological properties of amino acids
A database of physicochemical and biological indices of the 20 amino acids and of similarity matrices between them.
A Database of Enzyme Catalytic Mechanisms (EzCatDB)
,
Knowledge model of enzyme catalytic mechanisms
Classification of the enzymes registered in PDB and SWISS-PROT according to the domain structures, EC numbers, catalytic mechanisms, and ligand structures. The information sources are literature and PDB entry descriptions.
Dictionary
Genotype and Phenotype
,
dbGaP
The database of Genotypes and Phenotypes (dbGaP) was developed to archive and distribute the results of studies that have investigated the interaction of genotype and phenotype. Such studies include genome-wide association studies, medical sequencing, molecular diagnostic assays, as well as association between genotype and non-clinical traits. The advent of high-throughput, cost-effective methods for genotyping and sequencing has provided powerful tools that allow for the generation of the massive amount of genotypic data required to make these analyses possible.
PubMed
,
biomedical literature citations and abstracts
PubMed lets you search millions of journal citations and abstracts in the fields of medicine, nursing, dentistry, veterinary medicine, the health care system, and preclinical sciences. It includes access to MEDLINE® and to citations for selected articles in life science journals not included in MEDLINE. PubMed also provides access to additional relevant Web sites and links to the other NCBI molecular biology resources.
OMIM
,
Online Mendelian Inheritance in Man
OMIM is a comprehensive, authoritative, and timely compendium of human genes and genetic phenotypes. The full-text, referenced overviews in OMIM contain information on all known mendelian disorders and over 12,000 genes. OMIM focuses on the relationship between phenotype and genotype. It is updated daily, and the entries contain copious links to other genetics resources.
OMIA
,
Online Mendelian Inheritance in Animals
Online Mendelian Inheritance in Animals (OMIA) is a database of genes, inherited disorders and traits in animal species (other than human and mouse) authored by Professor Frank Nicholas of the University of Sydney, Australia, with help from many people over the years. The database contains textual information and references, as well as links to relevant records from OMIM, PubMed, Gene, and soon to NCBI's Phenotype database.
Pathogenic Fungi Database (PFDB)
Japanese Ant Image Database
Japan Drosophila Database
Mammalian Crania Photographic Archive
Protist Information Server
This server provides 61,094 images of protists and other microorganisms (714 genera, 3067 species and 11855 samples) and 1294 movie clips as research and educational resources. This database is supported by the Soken-Taxa project Construction of Biological Image Databases (1997-1999) at The Graduate University for Advanced Studies, and by the Bio-Resource project Fundamental research and development for making databases and networking culture collection information (1997-2001) at JST (Japan Science and Technology Corporation). The database has received a grant-in-aid (07558052).
Database for human disease-associated gene mutations
,
KMDB/MutationView
Gene mutations related to human diseases have been collected from OMIM, GDB, and literature; the loci are mapped to the chromosomes, and the affected sites are mapped to the human body and organs. Structural changes in the genes caused by the mutations (in amino acids, in splicing, etc.) are registered. Nine sub-databases for disease categories have been constructed (KMaiDB, KMbloodDB, KMbrainDB, KMcancerDB, KMearDB, KMeyeDB, KMheartDB, KMmuscleDB, KMsyndromeDB).
DNABook
Aqua DNA Book
Dictionary of protein 3D structures
,
eProtS: Encyclopedia of Protein Structures
Biologically important proteins and their 3D structures are explained. The registered proteins are classified into groups.
Dictionary of Wnt proteins
,
Wnt Database
A portal site for information on Wnt proteins (highly conserved proteins that control intercellular interactions during embryogenesis). The database formats differ by organism and topic. Signaling pathway information (images) is included.
Rice Microarray Opening Site (RMOS)
Overview of protein mutant data (compiled from the literature)
,
PMD
A literature database of protein mutations. Literature information includes authors, journals and pages, abstracts, organism species, N-terminal sequences, mutation locations and patterns, and links to Swiss-Prot, PIR, and PDB.
Dictionary of protease and protease inhibitors
,
Prolysis
It is a database of proteases and protease inhibitors. It contains the biochemical properties and 3D structures of the proteases, and the structures and characteristics of the inhibitors. It provides a tool to search for protease cleavage sites in input amino acid sequences.
A dictionary of protein sequences and related literature
,
PRF
PRF/LITDB (protein literature database), PRF/SEQDB (amino acid sequence database), PRF/SYNDB (synthetic chemical compounds database) are registered. All the data are obtained from literature.
Dictionary of E. coli genes
,
PEC
NBRP Oryzabase
,
Rice research portal site
A portal site for data and resources related to rice research. Information related to species (lineages, wild species, mutants), gene expression (relationships with tissues and developmental stages),
Database for lipids
,
LipidBank
A database of bioactive lipids covering structures, physical and chemical characteristics, spectral data, metabolic pathways, fatty acid compositions, and citations. The structures are provided in ChemDraw format.
KMsyndromeDB
It is a sub-database of MutationView. It registers data related to Waardenburg syndrome and long QT syndrome.
KMmuscleDB
It is a sub-database of MutationView. It registers data related to Duchenne muscular dystrophy and Fukuyama muscular dystrophy.
KMheartDB
It is a sub-database of MutationView. It registers data related to cardiomyopathy.
KMeyeDB
It is a sub-database of MutationView. It registers data related to retinitis pigmentosa, glaucoma, and cataract.
KMearDB
It is a sub-database of MutationView. It registers data related to deafness.
KMcancerDB
It is a sub-database of MutationView. It registers data related to breast cancer, retinoblastoma, and neurofibromatosis.
KMbrainDB
It is a sub-database of MutationView. It registers data related to Parkinson disease and Alzheimer disease.
KMbloodDB
It is a sub-database of MutationView. It registers data related to chronic myelocytic leukemia.
KMaiDB
It is a sub-database of MutationView. It registers data related to APECED and APT1 (an autoimmune disease and the related gene?). The information source is literature. The database is based on manual remapping work that brought the coordinates of Mendelian disease mutations onto a common coordinate system.
Dictionary of Drosophila
,
J*FLY
The resource contains a list of known Drosophila genes, experiment protocols with movies, morphology photos, stock centers, and laboratories and researchers in Japan and abroad.
HGMD® - Human Gene Mutation Database
Known (published) gene lesions underlying human inherited disease
GiiB-JST mtSNP
,
Human mitochondrial genome polymorphism database
The distribution of polymorphisms in mitochondrial genomes from patients with seven diseases (96 patients for each disease) has been analyzed. This is a database of the functional differences between individuals related to the paired polymorphisms. 672 INSD entries of mitochondrial genomes cite the same article.
Gallery of Biomolecules
Dictionary of drug transporters
,
Drug transporter DB
A database of information on drug transporters. It contains transporters, tissues expressing the genes, targeted chemical compounds, drug-drug interactions, knockout mice/rats for the genes, pathophysiology for the genes, genetic polymorphisms, and relevant inherited diseases.
Database for Genetic Engineering of Microalgae
COPE
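Cytokines On-line Pathfinder Encyclopedia. Dictionary-style explanations of all cytokines, compiled by a mix of automated and manual work. It can be viewed free of charge, but it explicitly declares copyright. It is a long-running, well-established resource, and its content appears to be well regarded. The opinion of cytokine specialists would be welcome.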
Atlas of Genetics and Cytogenetics in Oncology and Haematolo
,
Overview of the cytogenetics of cancer cells
A database that has collected cytogenetics information and clinical information of tumors and cancer-related diseases. Genes (nucleic acids, proteins, mutations, related diseases), cytogenetics/clinical information (clinical manifestation, cytogenetics data, related genes, complex genes and fused proteins), cancer-related diseases (inheritance modes, clinical manifestation, risk of tumorigenesis, cytogenetics data, related genes, proteins, mutations) have been collected separately. Clinical case reports are collected.
Ancient Genome Encyclopedia : AGE
Analysis Service
PRIMe: Correlated Gene Search
,
Tool that searches for correlated Arabidopsis genes
It provides a service that searches pre-computed correlations in Arabidopsis GeneChip gene expression data. Genes correlated with the query genes are output with their correlation coefficients and annotations. The underlying correlation analysis is public as ATTED-II (www.atted.bio.titech.ac.jp).
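To make the idea of a correlated gene search concrete, the sketch below ranks genes by their correlation with a query gene across a small set of expression profiles. It assumes the Pearson coefficient; the gene names, the toy expression values, the threshold, and the function names `pearson` and `correlated_genes` are all hypothetical. The actual service works from pre-computed correlations, and its exact measure is not stated here.

```python
import math


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / math.sqrt(var_x * var_y)


def correlated_genes(query, profiles, threshold=0.8):
    """Return (gene, r) pairs with r >= threshold, strongest first."""
    ref = profiles[query]
    hits = [(g, pearson(ref, p)) for g, p in profiles.items() if g != query]
    return sorted((h for h in hits if h[1] >= threshold),
                  key=lambda h: h[1], reverse=True)


# Toy expression profiles (hypothetical values, four conditions each).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}
hits = correlated_genes("geneA", profiles, threshold=0.9)
```

Pre-computing the full gene-by-gene correlation matrix once, as the service does, turns each query into a simple lookup instead of a pass over every profile.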
RIKEN Hub Database Project
GenomeNet
Cluster cutting tool for gene expression data
,
PRIMe: Cluster Cutting
A tool to extract gene clusters containing specified genes from clustered GeneChip gene expression data. The results are displayed graphically with Java, and a part of the tree can be extracted further by selecting it in the tree. Presently, Affymetrix GeneChip gene expression data for Arabidopsis generated at RIKEN and Max Planck are available on the server.
Multidimensional genome annotation viewer
,
OmicBrowse
A genome annotation database browser built on OmicSpace. OmicSpace is a genome annotation database described in a dedicated XML format called OSML (OmicSpace Markup Language). To describe interactions, the data are placed on a two-dimensional plane whose coordinates are genomes. The planes are stacked in the order genomic, transcriptomic, proteomic, metabolomic, and phenomic, which enables tracing between planes. In addition, a dataset coordinate and an ontology coordinate are defined. Data are collected from literature, and interaction data are collected from KEGG and MetaCyc.
siDirect
SayaMatcher
Rice PIPELINE
Protein Interaction Prediction Server (PIPS)
PosMed (Positional MEDLINE)
,
Tool that assists the search of literature on positional cloning
It provides a service that, from given keywords, searches for and infers the corresponding genes, chromosomal regions, and biological interactions, and outputs them with the original articles. First, the specified keywords are full-text searched in literature databases such as MEDLINE. Then, genes and symbols that appear in the hit documents with significant specificity are extracted, statistically tested, and ranked. If chromosomal regions are specified, only hits in those regions are considered. In addition, genes related to the key genes are searched for by referring to biological interaction information in OmicSpace, and the corresponding information (including original articles) is output. Detailed information is displayed using OmicBrowse.
Full-text search tool for biological literature
,
OmicScan
A service to search for biological objects (e.g. terms defined in OmicSpace) indirectly related to given keywords. User-specified keywords are full-text searched in biological documents (Medline, OMIM, PPI, and probably OmicSpace itself), and biological objects that are inferred from specific terms in the hit documents and that are directly or indirectly related to the keywords are listed. Details are displayed using OmicBrowse.
MusBanks
,
Portal site of mutant mouse strains
A service to search for and infer the corresponding genes and mutations from given keywords. PosMed is used to make the inferences. First, the specified keywords are searched in MEDLINE. Then, genes and symbols that appear significantly in the hit documents are extracted, statistically tested, and ranked. In addition, mutants for the genes are output. The system seems to contain lists of the genes and mutants as databases.
MEDIE
MEDIE is an intelligent search engine to retrieve biomedical correlations from MEDLINE, based on indexing by Natural Language Processing and Text Mining techniques.
KEGG DAS
GSCope3
,
Ranking of pathways with SOM clustered microarray data
A tool that analyzes the correlation between SOM-clustered gene expression data and KEGG pathway data in order to rank and extract the pathways most likely to be supported by the expression data. The required inputs are the KEGG pathways, the SOM clusters, and an auxiliary file that links the IDs of the two.
Biomarker Candidates
,
Biomarker search service
A service (and the supporting databases) to search for biomarker candidates from given keywords. User-specified keywords are searched in medical documents (originating from Medline, OMIM, and PPI), and chemical compounds that appear specifically in the hit documents are listed as marker candidates. At the same time, genes/proteins related to the keywords are searched in the same way, the chemical compounds corresponding to those genes/proteins are looked up, and the hits are also treated as marker candidates. The locations of the hit compounds in the documents can be viewed.
BRITE - Biomolecular Relations in Information Transmission a
Molecular interactions and pathways database, part of the KEGG system
ALICE
Catalog
Gene Expression Omnibus (GEO) Overview
Fungus and Actinomycete Gallery
PRIMe: Platform for RIKEN Metabolomics
,
Portal site for Metabolome Research Group, Plant Science Center, RIKEN
A portal site for the RIKEN Plant Science Research Center's NMR reference spectra, metabolite reference spectra, gene correlation search, and gene cluster extraction software. It also hosts the secondary metabolite database (NAIST KNApSAcK, kanaya.naist.jp/KNApSAcK/) and BL-SOM clustering.
The Union of Japan Societies for Systematic Biology, academic community homepage
,
the Union of Japanese Societies for Systematic Biology
The Union of Japan Societies for Systematic Biology was established as a federation of academic societies in Japan that are committed to the classification of organisms, for the purpose of promoting research and education in systematics in general and contributing to the dissemination and expansion of the field. The federation is organized and managed by academic societies and communities that agree with its spirit.
Biodiversity Websites in Japan
,
Index to biological diversity web sites
The site lists 448 websites on biological diversity and states that the list is still under construction.
biometadatabase
,
world, molecule database, catalogue
For future database interoperability, programs need to look up databases and connect to them automatically. This is a machine-readable database catalogue. An entry for a database describes the interface programs for searches and analyses. It is also human-readable.
WINGpro
The Wellcome Trust Sanger Institute
Sanger Centre (Hinxton / U.K.)
Pathguide
MEDALS
,
METI Life science integrated database portal site
A portal site for the MEXT “Integrated Database Project”
,
LSDB
It is a database of the outcomes of the MEXT "Integrated Database Project", which consists of: development and management of the integrated database; linking to literature information and annotation of data; integration of medical science databases in the fields of chemical compounds, drugs, clinical data, and diseases at the task-sharing institutes (Tokyo, TMD, Kyoto); acceleration of database integration by the institutes handling supplementary tasks (RIKEN, AIST, NIG, KyuTech); acceptance of useful databases that are hard to maintain; and human resource development for database development.
KEGG
,
Portal site of KEGG
The top page of the whole KEGG system. From here, PATHWAY, BRITE, GENES, LIGAND, DRUG, GLYCAN, REACTION, and KAAS are linked.
J-GLOBAL
INSD overview and search
GBIF Portal
DBSB
,
Database for Systems Biology
Annotation
popset
,
population study data set
The PopSet database contains aligned sequences submitted as a set resulting from a population, phylogenetic, or mutation study. These alignments describe such events as evolution and population variation. The PopSet database contains both nucleotide and protein sequence data.
Structure
,
three-dimensional macromolecular structures
The resources developed by the Structure Group of the NCBI Computational Biology Branch (CBB) are freely available to the public and focus on four areas:
Protein
,
Protein sequence database
The protein entries in the Entrez search and retrieval system have been compiled from a variety of sources, including SwissProt, PIR, PRF, PDB, and translations from annotated coding regions in GenBank and RefSeq.
Genome Project
,
genome project information
The NCBI Entrez Genome Project database is intended to be a searchable collection of complete and incomplete (in-progress) large-scale sequencing, assembly, annotation, and mapping projects for cellular organisms. The database is organized into organism-specific overviews that function as portals from which all projects in the database pertaining to that organism can be browsed and retrieved
Gene
,
gene-centered information
Entrez Gene is a searchable database of genes, from RefSeq genomes, and defined by sequence and/or located in the NCBI Map Viewer
GEO profiles
,
expression and molecular abundance profiles
This database stores individual gene expression profiles from curated DataSets in the Gene Expression Omnibus (GEO) repository. Search for specific profiles of interest based on gene annotation or pre-computed profile characteristics. GEO Profiles facilitates powerful searching and linking to additional information sources.
CDD
,
conserved protein domain database
Conserved domains are functional units within a protein that have been used as building blocks in molecular evolution and recombined in various arrangements to make proteins with different functions.
Database for Genes Contributing to Sustainable World
A database of annotations for more than 5,000,000 gene candidates that have no function assigned in public databases. The candidates originate from the large number of nucleic acid sequences collected by large-scale sequencing of environmental biological samples (metagenome analyses), which is being advanced all over the world to discover genes that might be useful for the cleanup and conservation of the natural environment, leading to a sustainable society. It is a part of the human resource development program of the database integration project.
tRNA gene database curated manually by experts
,
tRNADB-CE
Students took the initiative in predicting tRNA candidate sequences in a large part of prokaryotic DNA sequences, including fragmentary genome sequences of uncultured microorganisms, using three programs (tRNAscan-SE, ARAGORN, tRNAfinder). Where the programs' predictions differed (about three percent of the predicted sequences), three experts in tRNA experiments (Hachiro Iguchi, former professor of Kyoto University, Akira Muto, professor emeritus of Hirosaki University, and Yuko Yamada, former lecturer of Jichi Medical College) inspected the candidates closely and registered the confirmed ones in the database as tRNA. Because fragmentary genome sequences were added to the analysis targets, more than 140,000 tRNA genes are registered, about four times as many as in earlier databases. It is an outcome of the integrated database project.
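The triage described above (accept calls on which all three programs agree, send the rest to experts) can be sketched roughly as follows. This is only an illustration: the tuple format `(seq_id, start, end, strand)`, the function name `split_by_agreement`, and the rule that agreement means identical coordinates are all assumptions, not the project's actual procedure.

```python
from collections import defaultdict


def split_by_agreement(predictions):
    """Split tRNA calls into those all programs agree on and those in dispute.

    predictions: dict mapping a program name to a set of hypothetical
    (seq_id, start, end, strand) tuples, one per predicted tRNA gene.
    """
    votes = defaultdict(set)
    for program, calls in predictions.items():
        for call in calls:
            votes[call].add(program)

    n_programs = len(predictions)
    agreed = sorted(c for c, progs in votes.items() if len(progs) == n_programs)
    disputed = sorted(c for c, progs in votes.items() if len(progs) < n_programs)
    return agreed, disputed


# Made-up example calls; real coordinates would come from each program's output.
predictions = {
    "tRNAscan-SE": {("contig1", 100, 172, "+"), ("contig1", 500, 575, "-")},
    "ARAGORN": {("contig1", 100, 172, "+"), ("contig1", 500, 575, "-"),
                ("contig2", 40, 112, "+")},
    "tRNAfinder": {("contig1", 100, 172, "+"), ("contig1", 500, 575, "-")},
}
agreed, disputed = split_by_agreement(predictions)
```

In this sketch the `disputed` list stands in for the candidates that would be passed to the expert reviewers.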
Annotation of yeast, introns
,
Yeast Intron Database
A database of budding yeast introns. Known introns were checked with microarrays, and those that actually exist are collected together with the experimental data into a public database.
Database of functional RNAs that utilizes the UCSC Genome Browser
,
UCSC GenomeBrowser for Functional RNA
It is a database that displays functional RNAs on UCSC Genome Browser. One of the outcomes of METI functional RNA project.
Annotation of tmRNA sequences
,
The tmRNA website
A database of tmRNA information. It contains sequences, secondary structures (the bases in the sequences are colored by the structural elements), corresponding proteolysis tag peptides, as well as multiple alignments of all the tmRNA sequence sets and all the tag peptide sets. The sequences consist of those identified by direct sequencing and those obtained from public databases.
The Homeodomain Resource
,
a comprehensive collection of sequence, structure, interaction, genomic and functional information on the homeodomain protein family
The Homeodomain Resource is a curated collection of sequence, structure, interaction, genomic and functional information on the homeodomain family. The current version builds upon previous versions by the addition of new, complete sets of homeodomain sequences from fully sequenced genomes, the expansion of existing curated homeodomain information and the improvement of data accessibility through better search tools and more complete data integration. This release contains 1534 full-length homeodomain-containing sequences, 93 experimentally derived homeodomain structures, 101 homeodomain protein–protein interactions, 107 homeodomain DNA-binding sites and 206 homeodomain proteins implicated in human genetic disorders.
SubtiWiki
,
the Bacillus subtilis centred wiki SubtiWiki
Bacillus subtilis is the model organism for Gram-positive bacteria, with a large amount of publications on all aspects of its biology. To facilitate genome annotation and the collection of comprehensive information on B. subtilis, we created SubtiWiki as a community-oriented annotation tool for information retrieval and continuous maintenance. The wiki is focused on the needs and requirements of scientists doing experimental work. This has implications for the design of the interface and for the layout of the individual pages. The pages can be accessed primarily by the gene designations. All pages have a similar flexible structure and provide links to related gene pages in SubtiWiki or to information in the World Wide Web. Each page gives comprehensive information on the gene, the encoded protein or RNA as well as information related to the current investigation of the gene/protein. The wiki has been seeded with information from key publications and from the most relevant general and B. subtilis-specific databases. We think that SubtiWiki might serve as an example for other scientific wikis that are devoted to the genes and proteins of one organism.
Annotation of protein domain models
,
SMART
A database of manually constructed protein domain models. Domain families for signal transduction, extracellular, and chromatin-related proteins are registered. The database contains the phylogenetic distribution, functional classes, 3D structures, and functionally important residues of the domains. Normal SMART is built upon SWISS-PROT, TrEMBL, and the Ensembl proteome (stable version). Genomic SMART is built upon proteins derived from completely sequenced genomes.
Annotation and classification of protein 3D structure
,
SCOP - Structural Classification Of Proteins
A database that hierarchically classifies proteins with known 3D structures according to the evolutionary and structural relationships between their domains. Specialists make the classification manually.
Database of rice, genome/annotation
,
Rice Annotation Database (RAD)
A database of rice genes (including predicted genes) and transcripts that are mapped on rice PAC/BAC contigs. The database contains summary tables of splice site patterns, amino acid compositions, codon usage, gene lengths, GO annotations, and MIPS functional classification.
Database for rice genome/annotation
,
Rice Annotation Project DataBase (RAP-DB)
A database of the rice genome on which genes (including predictions), transcripts (including ESTs from several plants other than rice), BACs, and mutant information are mapped.
Annotation of protein motifs
,
Pfam
A database of shared protein domains, multiple alignment constructed from protein families, and HMM models for them. Pfam-A is a curated data set and Pfam-B is automatically constructed from PRODOM families.
OLIGAMI - OLIGomer Architecture and Molecular Interface
,
Quaternary Structural Classification of Proteins
OLIGAMI is a database of the verified coordinates (curated entries) and new chain formulas for biological molecules that allows you to browse oligomers through the SCOP hierarchy and to interactively view three-dimensional structures of biological molecules for all PDB entries.
Annotation of O- and C-glycosylated proteins
,
O-GLYCBASE
A database that collected and organized information from literature and public databases on glycoproteins with at least one experimentally verified O-linked glycosylation site. Each entry describes sequences, glycan types, O-linked Ser/Thr positions, N-linked Asn positions, C-linked Trp positions, source organisms, and citation information.
Functional annotations of peptidases and peptidase inhibitors
,
MEROPS: the peptidase database.
A database of peptidases and their protein inhibitors. Peptidases and inhibitors are classified hierarchically based on their structures. Proteins that have been assigned to protein families are classified by family; otherwise (recent data) they are classified by similarity (the latter is called CLAN). Low molecular weight inhibitors are also included.
Annotations of protein characteristics and function
,
InterPro
A database of information on protein families, domains, and functional sites, collected from several sources and integrated by specialists. Amino acid sequences are collected separately as InterProtKB.
Annotations of proteins related to the immune system
,
IMGT/LIGM-DB, the IMGT comprehensive database of immunoglobu
It is a database that collected immunoglobulin (IG), T cell receptors (TR), major histocompatibility complexes (MHC), and other proteins related to immunity (RPI) of human and other vertebrates. The database consists of five datasets of IMGT/LIGM-DB (IG and TR sequences), IMGT/PRIMER-DB (primers to IG and TR), IMGT/GENE-DB (human and mouse immunity-related genes that are organized based on the genomes), and IMGT/3Dstructure-DB (immunity related genes with known 3D structures).
Annotations of 3D biopolymer structures
,
IMB Jena Image Library
A database that reorganizes PDB and NDB data in terms of molecular types, characteristic partial structures (SITE), and hetero compounds.
Annotations of human genome variation
,
HGVbase
A database of human genome polymorphisms (mostly SNPs, but indels and tandem repeats are also included) that records their physical and functional relationships to adjacent genes. The data have been obtained from public databases and the literature. Submissions from research groups were accepted in the past but are no longer taken. The database routinely exchanges data with dbSNP.
An integrated database of human genes and annotation
,
H-Invitational Database
A database resulting from an annotation jamboree in which the computed clustering and overlap analyses of genes with cDNA sequences in INSD were partly judged manually. It issues its own gene IDs. The functional information is based on Entrez Gene, because NCBI OMIM and Entrez are used. A substantial part of the full-length cDNA was made public by a METI/NEDO cDNA project.
Glycoconjugate Data Bank:Structure
GEO Dataset
,
experimental sets of GEO data
This database stores curated gene expression DataSets, as well as original Series and Platform records in the Gene Expression Omnibus (GEO) repository. DataSet records contain additional resources including cluster tools and differential expression queries.
Annotation of full-length mouse cDNA
,
FANTOM
A database of annotations to mouse full-length cDNA clone sequences. CAGE tags and GSC ditags information are also used to identify transcription start sites.
Dr. Zompo
,
Zostera marina and Posidonia oceanica ESTs
As ecosystem engineers, seagrasses are angiosperms of paramount ecological importance in shallow shoreline habitats around the globe. Furthermore, the ancestors of independent seagrass lineages have secondarily returned into the sea in separate, independent evolutionary events. Thus, understanding the molecular adaptation of this clade not only makes significant contributions to the field of ecology, but also to principles of parallel evolution as well. With the use of Dr. Zompo, the first interactive seagrass sequence database presented here, new insights into the molecular adaptation of marine environments can be inferred. The database is based on a total of 14 597 ESTs obtained from two seagrass species, Zostera marina and Posidonia oceanica, which have been processed, assembled and comprehensively annotated. Dr. Zompo provides experimentalists with a broad foundation to build experiments and consider challenges associated with the investigation of this class of non-domesticated monocotyledon systems. Our database, based on the Ruby on Rails framework, is rich in features including the retrieval of experimentally determined heat-responsive transcripts, mining for molecular markers (SSRs and SNPs), and weighted key word searches that allow access to annotation gathered on several levels including Pfam domains, GeneOntology and KEGG pathways. Well established plant genome sites such as The Arabidopsis Information Resource (TAIR) and the Rice Genome Annotation Project are interfaced by Dr. Zompo. With this project, we have initialized a valuable resource for plant biologists in general and the seagrass community in particular. The database is expected to grow together with more data to come in the near future, particularly with the recent initiation of the Zostera genome sequencing project.
DeinoBase
Annotations of E.coli DNA-binding sites of proteins
,
DPInteract
A database of protein binding sites on the E.coli genome. A recognition matrix is constructed based on known sites, and predicted sites based on the matrix are also registered. Each site is annotated.
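As a rough illustration of how a recognition matrix built from known sites can be used to predict new ones, the sketch below builds a log-odds position weight matrix and scans a sequence with it. The example sites, the pseudocount, the uniform 0.25 background, and the threshold are all assumptions made for the illustration; they are not DPInteract's actual data or parameters.

```python
import math


def build_matrix(sites, pseudocount=1.0):
    """Build a log-odds matrix from aligned, equal-length binding sites.

    Assumes a uniform background (0.25 per base), which is a simplification.
    """
    matrix = []
    for i in range(len(sites[0])):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        matrix.append({b: math.log2((counts[b] / total) / 0.25) for b in "ACGT"})
    return matrix


def scan(sequence, matrix, threshold):
    """Return (start, window, score) for every window scoring >= threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


# Made-up "known sites" and target sequence, for illustration only.
known_sites = ["TGTGA", "TGTGA", "TGTGC", "TGAGA"]
matrix = build_matrix(known_sites)
hits = scan("CCTGTGACC", matrix, threshold=4.0)
```

Only the forward strand is scanned here; a fuller treatment would also score the reverse complement.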
Annotation of protein-protein interactions
,
DIP - Database of Interacting Proteins
A database that collects and organizes experimental evidence of protein-protein interactions from the literature. The database evaluates the reliability of each entry based on the experimental methods, and provides the most reliable subset as CORE.
DBTGR
,
Knowledge model of Tunicate gene expression regulation
A database of Ascidian gene expression loci, control regions, promoter sequences, and transcription factors. It registers inter-species comparison of promoters (C.intestinalis vs. C.savignyi).
DBH2H
,
head-to-head (h2h) gene pairs
DBH2H collects head-to-head (h2h) gene pairs identified from human, mouse, rat, chicken and fugu genomes, and distinguishes the ortholog mapping relationship among them. The gene pairs in DBH2H are annotated with sequential features including single nucleotide polymorphisms, CpG islands and transcription factor binding sites, as well as functional terms and genetic disorders. In addition, the expression correlation information based on 117 microarray datasets is included. By providing user-friendly access to these data, DBH2H represents a valuable resource for further analyses of this important gene arrangement in terms of transcriptional regulation mechanisms, evolutionary conservation, disease relevance, etc.
CleanEx
Expression reference database, linking heterogeneous expression data to facilitate cross-dataset comparisons
ChEBI - Chemical Entities of Biological Interest
Small molecules, atoms, ions and radicals of biological interest
Conserved Domains and Protein Classification
Curated alignments of protein domains from Pfam, SMART and COG databases
CCDS
,
Consensus CDS project
Annotation of genes is provided by multiple public resources, using different methods, and resulting in information that is similar but not always identical. The human and mouse genome sequence is now sufficiently stable to start identifying those gene placements that are identical, and to make those data public and supported as a core set by the three major public genome browsers. The long term goal is to support convergence towards a standard set of gene annotations.
NAR category
Functional genomics database of miRNA related to ultraconserved sequences
,
UCbase & miRfunc
UCbase & miRfunc is a database of (i) human, mouse and rat microRNAs and (ii) Ultraconserved elements providing information about function, expression and correlation between these classes of non-coding RNAs and the disorders related to their aberrant expression
Transcriptome annotation database of translation regulatory sequences contained in transcripts
,
Transterm
A database providing access to mRNA sequences and associated regulatory elements
Programs library to develop database malls coupled to a regulatory sequences annotation toolkit
,
The PAZAR database
PAZAR is a software framework for the construction and maintenance of regulatory sequence data annotations; a framework which allows multiple boutique databases to function independently within a larger system (or information mall). Our goal is to be the public repository for regulatory data.
PTM-Switchboard
,
Gene database of budding yeast transcription factors (TF), TF targets, and enzymes involved in posttranslational modification of TF.
PTM-Switchboard is designed to catalog known cases of TF-PTMs affecting gene transcription.
Genomics Databases (non-vertebrate)
Unicellular eukaryotes genome databases
Full-Malaria
,
Plasmodium falciparum (malaria), full-length cDNA Database
A database of full-length cDNA clones and the analyses of two Plasmodia causing human malaria and two Plasmodia causing mouse malaria. ESTs for each clone are mapped to the genome as well as to ESTs in the public domain. The database also contains confirmed homology between Plasmodia with TBLASTX (Other Plasmodia contigs were mapped to the template P. falciparum genome).
Human Genes and Diseases
Gene-, system- or disease-specific databases
RAPID
,
Resource of Asian Primary Immunodeficiency Diseases
Resource of Asian Primary Immunodeficiency Diseases (RAPID) is a web-based compendium of molecular alterations in primary immunodeficiency diseases.
Human and other Vertebrate Genomics
Human ORFs
HGPD
,
Human Gene and Protein Database
Human Gene and Protein Database presents SDS-PAGE patterns and other information on human genes and proteins.
Model organisms, comparative genomics
Database of functional repeats in mouse cDNA
,
FREP
A database of repetitive sequences in CDS regions of mouse cDNA sequences that are predicted to be functional. Locations on cDNA/genome, polyA signal locations, motifs in translated proteins, and related MeSH terms are registered.
Immunological Database
IMGT/HLA
Polymorphic sequences of human MHC and related genes
Nucleotide Sequence Databases
Coding and non-coding DNA
TAndem Splice Site DataBase
STRBase
,
Short Tandem Repeat DNA Internet DataBase
RECODE
,
The database of the translational recoding events
Pseudogene.org
PANDIT
,
Protein and Associated Nucleotide Domains with Inferred Trees
DNA Replication Origin Database
,
OriDB
DNA Methylation Database
,
MethDB
MICAS
,
Microsatellite Analysis Server
Annotation of Full-Length, Intact L1 Elements
,
L1Base
InSatDb
,
Insect Microsatellite Database
Bacterial Insertion Sequences Database
,
IS Finder
HumHot
,
Human Meiotic Recombination Hot Spots
GyDB
,
Gypsy Database
A Scientific Wiki Networking devoted to the Molecular Diversity and Evolutionary Relationships of Mobile Genetic Elements, Viruses and related host genes.
GISSD
,
Group I intron sequence and structure Database
Database of functional repeats in mouse cDNA
,
FREP
A database of repetitive sequences in CDS regions of mouse cDNA sequences that are predicted to be functional. Locations on cDNA/genome, polyA signal locations, motifs in translated proteins, and related MeSH terms are registered.
ECRbase
Database of Evolutionary Conserved Regions, Promoters, and Transcription Factor Binding Sites in Vertebrate Genomes
DiProDB
,
Thermodynamic properties database of dinucleotides
The Dinucleotide Property Database is designed to collect and analyse thermodynamic, structural and other dinucleotide properties. The table presenting all the dinucleotide properties can be pruned and rearranged by different criteria. The database contains different export and analysis functions.
Ciliate IES-MDS database
,
Macronuclear and micronuclear genes in spirotrichous ciliates
CUTG - Codon Usage Tabulated from GenBank
,
Database of codon usage frequency
A database of codon usage probability in various species based on GenBank CDS.
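A codon usage table of this kind can be approximated by counting codons over a set of coding sequences. The sketch below is a minimal example under stated assumptions: the input sequences are made up, the function name `codon_usage` is hypothetical, and reporting frequencies per thousand codons is a convention assumed here rather than taken from the entry above.

```python
from collections import Counter


def codon_usage(cds_sequences):
    """Return codon frequencies per thousand codons over in-frame CDS strings.

    Trailing bases that do not complete a codon are ignored.
    """
    counts = Counter()
    for cds in cds_sequences:
        cds = cds.upper()
        usable = len(cds) - len(cds) % 3
        for i in range(0, usable, 3):
            counts[cds[i:i + 3]] += 1

    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: 1000.0 * n / total for codon, n in counts.items()}


# Two toy coding sequences (not real GenBank entries).
usage = codon_usage(["ATGGCTGCTTAA", "ATGGCTTAA"])
```

Here `usage` maps each observed codon to its per-thousand frequency across both toy sequences.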
CORG
,
COmparative Regulatory Genomics
A CLAssification of Mobile genetic Elements
,
ACLAME
Gene structure, introns and exons, splice sites
Gene Modeling with Alternative Splicing
A Knowledgebase for fusion sequences
Alternative Splicing and Transcript Diversity
Alternative Splicing DB
ARTADEdb
,
Arabidopsis exon detection, tool and validation result
A database of Arabidopsis exons detected with tiling arrays conducted in RIKEN. Programs that were used in the data analysis are also made public.
International Nucleotide Sequence Database Collaboration
Data bank of nucleic acid sequences in the United States
,
GenBank(R)
It is a databank that accepts deposition of nucleic acid sequences from researchers in various countries. It routinely exchanges data with EBI (EMBL) and DDBJ.
Data bank of nucleic acid sequences in Europe
,
EMBL Nucleotide Sequence Database: developments in 2005.
It is a databank that accepts deposition of nucleic acid sequence data from researchers in European countries. It routinely exchanges data with U.S. NCBI (GenBank) and Japan DDBJ.
DDBJ - DNA Data Bank of Japan
,
Nucleic acids, sequence data bank of Japan
A databank that accepts nucleic acid sequence data mainly from Japanese researchers. Data from all over the world are collected by routinely exchanging data with U.S. NCBI (GenBank) and European EBI (EMBL).
Transcriptional regulator sites and transcription factors
MachiBase
,
a Drosophila melanogaster 5'-end mRNA transcription database
Organelle
Plant database
Arabidopsis thaliana
ATTED-II
,
Gene expression database of Arabidopsis coexpression
ATTED-II simply presents graphs of the gene expression pattern for each gene.
Protein sequence databases
Protein sequence motifs and active sites
Functional Database of Membrane Proteins
,
TMFunction
RNA sequence database
5S Ribosomal RNA Database
5S rRNA sequences
Structure Databases
Protein structure
Database of Protein interaction SITEs
,
PiSITE
PiSITE is a web-based database of protein interaction sites. PiSITE provides not only information on the interaction sites of a protein from a single PDB entry, but also information on the interaction sites of a protein from multiple PDB entries, including similar proteins.
GTOP
,
Prediction of protein structures
A database of all the predicted or confirmed ORFs on known genomes, as well as prediction of 3D structures and the functions of the encoded proteins. The database contains 3D structure prediction (by Reverse PSI-BLAST), analyses of protein functions (by BLAST), motif analyses (by PROSITE), gene family classification (by Pfam), prediction of transmembrane regions (by SOSUI), prediction of coiled-coil regions (by Multicoil), and analyses of repetitive sequences (by RepAlign).
AS-ALPS
,
Alternative Splicing - ALteration of Protein Structure
AS-ALPS (Alternative Splicing-induced ALteration of Protein Structure), is aimed at providing useful information to analyze effect of AS on protein interaction and network through alteration of protein structure.
Organism species
Animalia
Chordata
Aves
Chicken
DBH2H
,
head-to-head (h2h) gene pairs
DBH2H collects head-to-head (h2h) gene pairs identified from human, mouse, rat, chicken and fugu genomes, and distinguishes the ortholog mapping relationship among them. The gene pairs in DBH2H are annotated with sequential features including single nucleotide polymorphisms, CpG islands and transcription factor binding sites, as well as functional terms and genetic disorders. In addition, the expression correlation information based on 117 microarray datasets is included. By providing user-friendly access to these data, DBH2H represents a valuable resource for further analyses of this important gene arrangement in terms of transcriptional regulation mechanisms, evolutionary conservation, disease relevance, etc.
Chick Eye EST DB and Expression DB
Mammalia
Mouse
Gene expression profile of mouse brain during postnatal development
The gene expression profiles were analyzed using Affymetrix GeneChip in the mouse cerebellum developmental stages after birth (2001) and in the mouse embryo brains developmental processes (2005). (2001) and (2005) are supplement data for 15018818 and 15893506, respectively.
READ: Riken Expression Array Database
ANIMAL SEARCH SYSTEM
Mouse DNA Book
Immunostaining images of whole mouse sections by all matrix proteins
,
mouse basement membrane bodymap
Immunostaining images of whole mouse sections by all matrix proteins. Good data for matrix researchers and tissue engineers. Poly/monoclonal antibodies against each of 44 matrix proteins. Immunohistochemical images of E16.5 whole-body embryo sections. List of target proteins: http://www.matrixome.com/bm/EnterBodymap/Protein/protein.asp
Database of disrupted mouse genes by gene trap methods
,
The NAISTrap database
Mutant mouse ES cell lines were produced using random gene disruption by a new gene trap method (UPATrap). Partial trapped gene sequences as well as the homology search results are registered.
Database for mouse genome/annotation
,
The Mouse Genome Informatics (MGI)
An integrated database of mouse genome research at the Jackson Laboratory. MGD (sequences, gene definitions, mapping, phenotypes, mutants, strains, comparative studies with other mammals), GXD (gene expression: collected from the literature or submitted by researchers), and MTB (information on mouse models that develop tumors) are integrated.
ROUGE
,
cDNA of mouse, unidentified gene-encoded large proteins/annotation
A database of cDNA clones (mKIAA/mFLJ) and the analyses from Kazusa mouse cDNA project. A mouse version of HUGE database.
Project Specific Custom Tracks
PhenoSITE :Phenotype Semantic Information with Terminology of Experiments
,
Standardized vocabulary for describing phenotypes of mouse mutants
Terms for mouse mutant phenotypes are defined hierarchically, in a form similar to GO. The database consists of four categories: Simple category for GSC mouse, Extend representation build of GSCMPE, Mammalian Phenotype (used in MGI at the Jackson Laboratory, and not restricted to the mouse), and Mouse adult gross anatomy. Standardized methods (protocols) for phenotypic screening and the terms used are defined. The database targets ENU-induced mouse mutants established at RIKEN, and a table of the phenotypes and their chromosomal mapping is registered.
PDDB
,
the Prion Disease Database
Prion diseases reflect conformational conversion of benign isoforms of prion protein (PrPC) to malignant PrPSc isoforms. Networks perturbed by PrPSc accumulation and their ties to pathological events are poorly understood. Time-course transcriptomic and phenotypic data in animal models are critical for understanding prion-perturbed networks in systems biology studies. Here, we present the Prion Disease Database (PDDB), the most comprehensive data resource on mouse prion diseases to date. The PDDB contains: (i) time-course mRNA measurements spanning the interval from prion inoculation through appearance of clinical signs in eight mouse strain-prion strain combinations and (ii) histoblots showing temporal PrPSc accumulation patterns in brains from each mouse–prion combination. To facilitate prion research, the PDDB also provides a suite of analytical tools for reconstructing dynamic networks via integration of temporal mRNA and interaction data and for analyzing these networks to generate hypotheses.
NIG Mouse Phenotype Database
NIG Mouse Genome Database
MusBanks
,
Portal site of mutant mouse strain
A service that searches for and infers the genes and mutations corresponding to given keywords. PosMed is used to make the inferences. First, MEDLINE is searched with the specified keywords. Then, genes and symbols that appear significantly often in the hit documents are extracted, statistically tested, and ranked. In addition, mutants for the genes are output. The system seems to contain lists of the genes and mutants as databases.
Microsatellite markers of mouse strains
,
Mouse Microsatellite Data Base of Japan (MMDBJ)
A database of mouse microsatellite markers, as well as a repository that accepts new data from researchers. Analyses showing SSLP differences between mouse strains are registered together with the PCR conditions. Emphasis is on Japanese mice (MSM and JF1).
Mouse Genome Databases
Differential gene expression profiles of mouse strains
,
Mouse DNA Microarray
Gene expressions compared between mouse C57BL/6J and 129X1SvJ strains in newborn brains, in adult spleens, and in adult livers. Agilent microarrays are used for the analysis. Supplement data for 15029957.
MGC - Mammalian Genome Collection
Full-length open reading frame clones for human, mouse, and rat genes
Database of functional repeats in mouse cDNA
,
FREP
A database of repetitive sequences in CDS regions of mouse cDNA sequences that are predicted to be functional. Locations on cDNA/genome, polyA signal locations, motifs in translated proteins, and related MeSH terms are registered.
FANTOM4
In FANTOM4 the focus has changed to understanding how these components work together in the context of a biological network. Using deepCAGE (deep sequencing with CAGE) we monitored the dynamics of transcription start site (TSS) usage during a time course of monocytic differentiation in the acute myeloid leukemia cell line THP-1. This allowed us to identify active promoters, monitor their relative expression and define relevant regions for carrying out transcription factor binding site predictions. Computational methods were then used to build a network model of gene expression in this leukemia and the transcription factors key to its regulation. This work gives the first picture of the wiring between genes involved in acute myeloid leukemia and provides a strategy for identifying key factors that determine cell fates. In addition to the network, FANTOM4 data was used in two additional analyses. The first identified a novel class of short RNAs associated with transcription start sites and the second focused on the role of repetitive element expression in the transcriptome. (cited from http://fantom.gsc.riken.jp/4/ ) Developed by RIKEN Omics Science Center
FANTOM3 (Functional Annotation of Mouse 3)
Annotation of full-length mouse cDNA
,
FANTOM
A database of annotations to mouse full-length cDNA clone sequences. CAGE tags and GSC ditags information are also used to identify transcription start sites.
ENU-based gene-driven mutagenesis in progress (location list
,
Name and chromosomal location of mutant gene in mutant mouse
The same as above. A list sorted by chromosomal location.
ENU-based gene-driven mutagenesis in progress (gene-name lis
,
Name and chromosomal location of mutant gene in mutant mouse
Mouse mutant lineages induced by ENU are selected and listed based on the mutations in the genes. The lineages are based on sperm of G1 mice that had no G2 lineage, due to lethality or infertility, at the time the Mutant library was constructed with phenotypic screening. Resources are distributed. A list sorted by the mutated genes.
Gene expression of imprinted genes in mouse
,
EICO DB
A database of imprinted gene candidates of the mouse and the gene expression confirmed with microarrays. SNPs on the genes and relevant human genome regions are also contained. It is aimed for exploration of new imprinted genes.
EGTC
,
Mouse genes, trapped with gene trap method
Trap clones were obtained from mouse ES cells using a new gene trap system (the exchangeable gene trap system). Based on these clones, gene transfer or disruption was conducted, and lineage resources (oocytes and sperm) are distributed. The database registers tag sequences for the trapped genes, homology search results, and mutated lineages.
DoTS (Database Of Transcribed Sequences)
A database of human and mouse genomes, ESTs, and transcripts.
DBH2H
,
head-to-head (h2h) gene pairs
DBH2H collects head-to-head (h2h) gene pairs identified from human, mouse, rat, chicken and fugu genomes, and distinguishes the ortholog mapping relationship among them. The gene pairs in DBH2H are annotated with sequential features including single nucleotide polymorphisms, CpG islands and transcription factor binding sites, as well as functional terms and genetic disorders. In addition, the expression correlation information based on 117 microarray datasets is included. By providing user-friendly access to these data, DBH2H represents a valuable resource for further analyses of this important gene arrangement in terms of transcriptional regulation mechanisms, evolutionary conservation, disease relevance, etc.
Cerebellar Development Transcriptome Database (CDT-DB)
,
Gene expression of mouse cerebellum in postnatal development
Gene expression in the mouse cerebellum after birth, analyzed with various methods. The methods used are fluorescence differential display, cDNA microarray, and GeneChip. The expression patterns of genes expressed in the cerebellum were studied with RT-PCR and in situ hybridization. The results of in situ hybridization are displayed as images, without a textual description.
CREAT portal
,
Gene/protein expression profiles and protein-protein interactions of mouse mKIAA genes
Gene expression analyzed using microarrays based on mKIAA clones obtained in the Kazusa mouse cDNA project. Ectopic expression seems to have been analyzed using hybridization (with images). The database (InGap) contains protein expression analyzed with western blot, immunohistochemical analysis, and immunoprecipitation using antibodies based on mKIAA. The database (InCeP) contains protein-protein interactions between mKIAA-expressed proteins analyzed with immunoprecipitation and MS/MS. The interactions can be searched, displayed, and downloaded, but displaying them requires dedicated software.
CARD R-BASE
CAGE
,
CAGE/ transcription start site
A database of CAGE tags mapped to the genome. Human and mouse libraries were constructed for each tissue and developmental stage, and the sequences are mapped to the UCSC (Golden Path) genome. The mapping results are referred to by FANTOM.
Brain Gene Expression Database (BGED)
,
Database of mouse brain gene expression
A database of gene expression in the mouse brain analyzed in various physiological and pathological processes. ATAC-PCR was used for gene expression analysis.
BodyMap
,
Human and mouse gene expression
A database of gene expression in human and mouse tissues and cells. The gene expression is analyzed based on 3'ESTs.
BED (Brain EST Database)
Brain EST Database (BED) is based on a collection of 3'-end ESTs generated in the Taisho Laboratory of Functional Genomics.
AllGenes
Human and mouse gene index integrating gene, transcript, and protein annotation
ASSETs (Alternative Splicing Sequence Enriched Tags)
,
Mouse ESTs that are rich in alternative splicing
A database of tag sequences from cDNA libraries that were established from mouse cell lines and that are rich in alternative splicing. The sequences seem to have been classified by pattern based on mapping to Ensembl genomes.
Rat
Database for rat genome/annotation
,
Rat Genome Database (RGD)
A rat genome and gene information database. Maps (gene and RH), genes, QTLs, SSLPs, ESTs/cDNAs, strains, and sequences are registered. A separate user interface supports comparisons between rat, mouse, and human from a disease point of view.
Cross-sectional images of rat brain
,
Rat Brain Sections: Super-fine images
The database contains images of transverse and sagittal sections of rat brains. To display the images, Viewpoint Media Player is required. Scrolling and expansion/reduction is possible using the mouse.
Project Specific Custom Tracks
MGC - Mammalian Genome Collection
Full-length open reading frame clones for human, mouse, and rat genes
DBH2H
,
head-to-head (h2h) gene pairs
DBH2H collects head-to-head (h2h) gene pairs identified from human, mouse, rat, chicken and fugu genomes, and distinguishes the ortholog mapping relationship among them. The gene pairs in DBH2H are annotated with sequential features including single nucleotide polymorphisms, CpG islands and transcription factor binding sites, as well as functional terms and genetic disorders. In addition, the expression correlation information based on 117 microarray datasets is included. By providing user-friendly access to these data, DBH2H represents a valuable resource for further analyses of this important gene arrangement in terms of transcriptional regulation mechanisms, evolutionary conservation, disease relevance, etc.
Pig
Swine Marker Viewer
Swine Linkage Map Viewer
Pig genomics information system
A portal page of the China-Denmark joint genome project. The project generates genome sequences and EST sequences and annotates them. All the data are reportedly deposited in INSD.
Full-length cDNA of pig
,
Pig EST Data Explorer (PEDE)
The database contains pig ESTs from full-length cDNA clones, the assembled contigs, the annotations, and the full-length sequences of selected clones. The database functions as a resource bank to distribute the clones. Pig cSNP Database, a database of SNPs identified in the assembly processes, has been made. It is not clear whether it has ESTs or cDNA sequences that are registered only here. Should they have been registered in INSD?
Additional data for SSRH
Primate
Database of Chimpanzee cDNAs (PRIGEN)
,
Full-length cDNAs of chimpanzee (Pan troglodytes verus)
Full-length cDNA libraries were constructed from chimpanzee brain, liver, testis, and epithelial tissues. 5' ESTs were then sequenced, and some of the clones were sequenced in full length. Supplementary data for 12727913 and 15677748.
Database for Macaca fascicularis (cynomolgus monkey) full-length cDNA
,
QFbase - Macaca fascicularis cDNA database
EST and full-length sequences of Macaca fascicularis cDNA clones, and their homology to human sequences, are registered. The full-length cDNAs are referred to in 17194215.
Genome database for anthropoids/ phylogenetic comparison
,
Silver Project (Ape Genome Sequencing)
Genome sequences for the chimpanzee and gorilla. Comparison of DNA sequences between human and these anthropoids, and the comparisons based on the DNA sequences that have been sequenced in the Apes Genome Project (Silver) are registered.
3D brain image database of humans, Japanese monkeys and Rhesus monkeys
,
Brain Atlas Database of Japanese Monkey for WWW
Three-dimensional brain images of humans, Japanese macaques, and rhesus macaques reconstructed from MRI.
Human
Proteins found in human salivary intercalated duct cell line
,
Two-Dimensional Electrophoresis Database of HSG cells proteins
2-dimensional gel electrophoresis of the proteins in HSG (Human salivary intercalated duct cell line, a cell line established with radiation on human salivary glands) is registered. Peptides have been identified from each spot using MALDI-TOF or peptide sequencers.
SAGE of human immune system cells SAGE
SAGE analyses of gene expression in cells of the human immune system.
Gene Diversity DataBase System (GDBS)
Gene Diversity DataBase System (GDBS)
A Database of Japanese Single Nucleotide Polymorphism for Geriatric Research (JG-SNP)
,
Database of Japanese SNP for geriatric disease
The database contains the sex, age, disease status, and polymorphisms of geriatric disease patients at Tokyo Metropolitan Geriatric Hospital. This information is registered in a separate database called GEAD (geriatric autopsy database), which contains the above-mentioned data as well as smoking history, drinking history, pathological findings, and the extent of atherosclerosis.
Genome Medicine Database of Japan (GeMDBJ)
,
Polymorphism, gene/ protein expression of diseases
Analyses of genetic polymorphisms (SNPs), gene expression (with GeneChip), and protein expression (2D-DIGE, LC-MS/MS) related to Alzheimer disease, gastric cancer, diabetes, hypertension, and asthma. Limited patient information (sex, age group, prefecture of residence, medical history, smoking history) is included. User registration is required to view some of the information.
Full-length human cDNA Database
A database of the METI full-length cDNA sequencing and analysis project, containing 30,000 human full-length cDNA sequences collected using the oligo-cap method.
Database for human disease-associated gene mutations
,
KMDB/MutationView
Gene mutations related to human diseases have been collected from OMIM, GDB, and the literature; the loci were mapped to chromosomes, and the affected sites were mapped to the human body and organs. Structural changes in the genes due to mutations (in amino acids, splicing, etc.) are registered. Nine sub-databases for disease categories have been constructed (KMaiDB, KMbloodDB, KMbrainDB, KMcancerDB, KMearDB, KMeyeDB, KMheartDB, KMmuscleDB, KMsyndromeDB).
DIGITized Genes
A list of candidate genes explored through application of DIGIT program to human genome.
MutationView
The mi-R ontology database
,
miRò
miRò is a web-based knowledge base that provides users with miRNA–phenotype associations in humans. It integrates data from various online sources, such as databases of miRNAs, ontologies, diseases and targets, into a unified database equipped with an intuitive and flexible query interface and data mining facilities. The main goal of miRò is the establishment of a knowledge base which allows non-trivial analysis through sophisticated mining techniques and the introduction of a new layer of associations between genes and phenotypes inferred based on miRNAs annotations. Furthermore, a specificity function applied to validated data highlights the most significant associations.
Integrated Clinical Omics Database
,
iCOD
Project Specific Custom Tracks
PrognoScan
Database for human, non-synonymous SNPs
,
PicSNP/A Catalog of Non-Synonymous SNP
Non-synonymous SNPs are collected automatically. The source material is the NCBI human draft genome sequence; the gene sequences and SNPs in the feature annotations were compared with SwissProt to select non-synonymous SNPs. The GO classification of the identified genes is also shown.
KMsyndromeDB
It is a sub-database of MutationView. It registers data related to Waardenburg syndrome and long QT syndrome.
KMmuscleDB
It is a sub-database of MutationView. It registers data related to Duchenne muscular dystrophy and Fukuyama muscular dystrophy.
KMheartDB
It is a sub-database of MutationView. It registers data related to cardiomyopathy.
KMeyeDB
It is a sub-database of MutationView. It registers data related to retinitis pigmentosa, glaucoma, and cataract.
KMearDB
It is a sub-database of MutationView. It registers data related to deafness.
KMcancerDB
It is a sub-database of MutationView. It registers data related to breast cancer, retinoblastoma, and neurofibromatosis.
KMbrainDB
It is a sub-database of MutationView. It registers data related to Parkinson disease and Alzheimer disease.
KMbloodDB
It is a sub-database of MutationView. It registers data related to chronic myelocytic leukemia.
KMaiDB
It is a sub-database of MutationView. It registers data related to APECED and APT1 (an autoimmune disease and the related gene?). The information source is the literature. The database is based on manual remapping that consolidated the coordinates of Mendelian disease mutations onto a common coordinate system.
Database of Japanese SNPs
,
JSNP
A database of genetic polymorphisms in Japanese populations which are generally observed on or adjacent to the genes.
Database for human metabolic disorder-related SNPs
,
JMDBase/Japan Metabolic disease DataBase
SNPs related to human metabolic diseases (especially hypertension and diabetes) are identified using original algorithms. The article 15716494 describes the effectiveness of the method and provides verification on five genes in three populations (Japanese, African Americans, and Caucasians).
IBMD
,
Integrated Biomedical Database
HUGE - Human Unidentified Gene-Encoded large proteins
,
Unidentified long human cDNA/annotation
A database of cDNA sequences and analyses of KIAA/FLJ clones collected in the Kazusa human cDNA project. The initial purpose was to analyze unidentified genes corresponding to large (>50 kDa) proteins. The database registers cDNA sequences, restriction maps, signal sequences, amino acid sequences, and motif searches with Pfam and SOSUI.
Automatically annotated whole human genome
,
HOWDY
Human genome data are collected from public databases around the world, and each entry is mapped to a chromosomal location so that the entries can be related to one another.
Annotations of human genome variation
,
HGVbase
A database of human genome polymorphisms (mostly SNPs, but indels and tandem repeats are also included) that contains their physical and functional relationships to the adjacent genes. The data have been obtained from public databases and the literature. Depositions from research groups were accepted in the past but are no longer. The database routinely exchanges data with dbSNP.
HGS (Human Genome Sequencing)
HGMD® - Human Gene Mutation Database
Known (published) gene lesions underlying human inherited disease
HAL
A database of human genes that have been discovered using original algorithms. The genes seem to have been defined from integration of predicted genes in various databases and various prediction algorithms. There are datasets for human, chimpanzee, mouse, rat, dog, and chicken.
An integrated database of human genes and annotation
,
H-Invitational Database
A database resulting from a jamboree in which the computed clustering, classification, and overlap analyses of genes against cDNA sequences in INSD were partly judged by hand. It issues its own gene IDs. The functional information is based on Entrez Gene because NCBI OMIM and Entrez are used. A substantial part of the full-length cDNAs was made public by a METI/NEDO cDNA project.
H-InvDB
Full-length human cDNA clones
GiiB-JST mtSNP
,
Human mitochondrial genome polymorphism database
The distribution of polymorphisms in mitochondrial genomes from patients with seven diseases (96 patients for each disease) has been analyzed. This is a database of the functional differences between individuals related to the paired polymorphisms. 672 INSD entries of mitochondrial genomes cite the same article.
Genew
Human gene nomenclature database: Approved symbols for all human genes
Database for human, full-length cDNAs
,
Full Length cDNA
A database of human full-length cDNA
Database for human full-length cDNAs
,
FLJ Human cDNA Database
Outcomes of the sequencing and sequence analyses of METI Protein function analysis project/development of splicing variant acquisition technology. It consists of human splicing variant cDNA sequence database and an entire FLJ oligo-cap method based human full-length cDNA sequence database (about 50,000 full-length analyzed sequences and about 1,500,000 5'end analyzed sequences)
FANTOM4
In FANTOM4 the focus has changed to understanding how these components work together in the context of a biological network. Using deepCAGE (deep sequencing with CAGE) we monitored the dynamics of transcription start site (TSS) usage during a time course of monocytic differentiation in the acute myeloid leukemia cell line THP-1. This allowed us to identify active promoters, monitor their relative expression and define relevant regions for carrying out transcription factor binding site predictions. Computational methods were then used to build a network model of gene expression in this leukemia and the transcription factors key to its regulation. This work gives the first picture of the wiring between genes involved in acute myeloid leukemia and provides a strategy for identifying key factors that determine cell fates. In addition to the network, FANTOM4 data was used in two additional analyses. The first identified a novel class of short RNAs associated with transcription start sites and the second focused on the role of repetitive element expression in the transcriptome. (cited from http://fantom.gsc.riken.jp/4/ ) Developed by RIKEN Omics Science Center
EpoDB - Erythropoiesis Database
,
Genes related to red blood cell hematopoiesis
A database that has collected and organized genes that are expressed in hematopoiesis of vertebrate red blood cells, as well as their sequences and expression information, from public databases
DoTS (Database Of Transcribed Sequences)
A database of human and mouse genomes, ESTs, and transcripts.
Database of Genomic Variants
The objective of the Database of Genomic Variants is to provide a comprehensive summary of structural variation in the human genome. We define structural variation as genomic alterations that involve segments of DNA that are larger than 1 kb. For the purpose of this database, we focus on variants that are not directly correlated with specific phenotypes. The Database of Genomic Variants provides a useful catalog of control data for studies aiming to correlate genomic variation with phenotypic data. The database is continuously updated with new data from peer-reviewed research studies. We always welcome suggestions and comments regarding the database from the research community.
DBH2H
,
head-to-head (h2h) gene pairs
DBH2H collects head-to-head (h2h) gene pairs identified from human, mouse, rat, chicken and fugu genomes, and distinguishes the ortholog mapping relationship among them. The gene pairs in DBH2H are annotated with sequential features including single nucleotide polymorphisms, CpG islands and transcription factor binding sites, as well as functional terms and genetic disorders. In addition, the expression correlation information based on 117 microarray datasets is included. By providing user-friendly access to these data, DBH2H represents a valuable resource for further analyses of this important gene arrangement in terms of transcriptional regulation mechanisms, evolutionary conservation, disease relevance, etc.
D-HaploDB
,
SNPs of human definitive haplotypes determined by complete hydatidiform moles
Haplotype analysis of 280,000 SNPs from samples of Japanese complete hydatidiform moles that have been typed with microarrays.
Collagen Mutation Database
Human type I and type III collagen gene mutations
CleanEx
Expression reference database, linking heterogeneous expression data to facilitate cross-dataset comparisons
CellMontage
CellMontage is a system for searching gene expression databases for cells or tissues similar to the query gene expression profile.
Cancer Gene Expression Database (CGED)
CGED (Cancer Gene Expression Database) is a database of gene expression profiles and accompanying clinical information. The data of CGED were obtained through collaborative efforts of Nara Institute of Science and Technology and Osaka University School of Medicine to identify genes of clinical importance.
CGH Database
,
Chromosome aberration in human tumor cells
A database of chromosomal abnormalities in human tumor cell lines analyzed with Comparative Genomic Hybridization. Loss, gain, and amplification are detected.
CFTR mutation
Human cystic fibrosis mutation db (CFTR)
CELLPEDIA
,
Human cell Database
CELLPEDIA is a database for human cells, exhaustively collecting various types of information on cells. Each entry consists of various types of information, such as systematic cell classification and morphological information using ontology, with related gene expression and journal information on cell (trans-)differentiation, providing all the information required for cell study at once.
CAGE
,
CAGE/ transcription start site
A database of CAGE tags mapped to the genome. Human and mouse libraries were constructed for each tissue and developmental stage, and the sequences are mapped to the UCSC (Golden Path) genome. The mapping results are referred to by FANTOM.
BodyMap
,
Human and mouse gene expression
A database of gene expression in human and mouse tissues and cells. The gene expression is analyzed based on 3'ESTs.
BloodSAGE
A database of SAGE analyses of gene expression in blood cells.
AllGenes
Human and mouse gene index integrating gene, transcript, and protein annotation
5'SAGE
,
5'end serial analysis of gene expression database
A database of 5'SAGE analysis of human gene transcription start sites and the number of expressed tags.
Pisciformes
Database for Aquatic-vertebrate Science
This database consists of photographs of living fish and of fish specimens. The number of registered photographs increased from 40,000 to 54,583 (Jul. 2006).
Aqua DNA Book
UT Genome Browser (Medaka)
A project support environment to confirm clone information and assembly status.
MitoFish
,
Mitochondrial genome database of fish
Fish mitochondrial genome sequences were collected from GenBank and RefSeq. Taxonomical information was obtained from FishBase, NCBI Taxonomy Browser, The Catalog of Fishes, and Fish Database of Japan. Literature information was obtained from PubMed. Related biological sequences were obtained from DDBJ.
Medaka EST database
,
Medaka gene expression
A database that contains Medaka ESTs, the library information, an explanation of mutation mapping system using ESTs, and the organization of a microarray (Medaka Microarray 8K).
MCTDB
,
Medaka Craniofacial Trait DataBase
BISMaL
,
Biological Information System for Marine Life
Amphibia
NCBI Genome Project Result - Xenopus tropicalis (western clawed frog)
,
Xenopus tropicalis (western clawed frog)
XDB
,
Xenopus laevis (frog) gene expression
The database contains Xenopus laevis ESTs, their assemblies, and WISH images. The assembly sequences are annotated with BLAST searches against NCBI-NR, TIGR-XGI, and the Xenopus protein database of NIH, and with InterProScan. WISH images have been taken from each direction at each developmental stage.
Ascidian
Gene expression database of Ciona intestinalis (sea squirt)
Ciona intestinalis gene expression analyses with ESTs and in situ hybridization are registered. 16 types of cDNA libraries for tissues and developmental stages have been constructed. EST clustering results and genome (obtained from JGI) mapping results are registered. In situ hybridization images are searchable by expression location and stage.
Microarray analysis of Ciona intestinalis (ascidian) gene expression
,
Microarray analysis of embryonic retinoic acid target genes
Gene expression profiles of 9,287 candidate embryonic retinoic acid target genes, analyzed with microarrays using the cDNA libraries from the Ciona intestinalis EST analyses. Supplementary data for article 12828686. In addition, in situ hybridization images for 91 genes are shown.
EST of Halocynthia roretzi (ascidian)
,
MAGEST
A database of Halocynthia roretzi EST and the clustering.
DBTGR
,
Knowledge model of tunicate gene expression regulation
A database of Ascidian gene expression loci, control regions, promoter sequences, and transcription factors. It registers inter-species comparison of promoters (C.intestinalis vs. C.savignyi).
Ciona intestinalis EST project database
CIPRO Ciona intestinalis Protein Database
ABA
,
Images of Ciona intestinalis (ascidian chordate) morphology in different developmental stages
Ciona intestinalis morphology database that registers images for developmental stages from the fertilized egg to the tadpole larva. It contains 3D reconstruction images in the mid-tailbud stage and cell lineage figures.
Echinodermata
Sea Urchin Genome Database
,
SpBase
SpBase is a system of databases focused on the genomic information from sea urchins and related echinoderms.
Platyhelminthes
Echinococcosis Full-length cDNA database
,
Full-Echinococcosis
Arthropoda
Check List of Japanese Insects MOKUROKU
Dictionary of Japanese Insect Names
Japanese Ant Image Database
The brown planthopper EST database
,
UNKA (BPH) EST
Full-Tsetse
,
Tsetse Full-length cDNA database
Full-Mite
,
Mite Full-length cDNA database
Anopheles Full-length cDNA database
,
Full-Anopheles
Cricket EST DB and Expression DB
,
ESTs of cricket
Cricket EST sequences, clustering analyses, and homology search results are registered.
Silkworm
Full-length cDNA of silkworm
,
Insect Genome Databases, IGB lab., Univ. of Tokyo
EST sequences of the silkworm full-length cDNA clones.
EST analysis of silkworm gene expression
,
SilkBase
Silkworm ESTs and the cDNA library information. Annotation with BLASTX is registered.
Kaiko Genome Automated Annotation System (KAIKO GAAS)
A database of predicted genes and functional regions on silkworm genomes (BAC and WGS assemblies). The annotations were conducted algorithmically, and GeneScan, FGENESH, MZEF, SplicePredictor, BLAST, HMMer, ProfileScan, MOTIF, tRNAscan-SE, PSORT, SOSUI were used.
KAIKObase
,
integrated silkworm genome database
KAIKObase is an integrated silkworm genome database and data mining tool with 4 map browsers, 1 gene viewer, and 2 independent databases. (from website)
EST of silkworm
,
KAIKO cDNA
Silkworm ESTs are organized by cDNA library (strain, developmental stage, tissue, and sex). BLASTX search results for each EST are registered.
KAIKO BLAST
KAIKO 2DDB
Bombyx Trap DataBase
,
BombyxTrap DB
Bombyx Mutants Photographs
Silkworm Bombyx mori ESTs, mutants, photographs
Water flea
DaphniaBASE
A database of Daphnia clone sequences and BLAST search results for those sequences.
Drosophila
Japan Drosophila Database
Project Specific Custom Tracks
Dictionary of Drosophila
,
J*FLY
The resource contains a list of known Drosophila genes, experiment protocols with movies, morphology photos, stock centers, and laboratories and researchers in Japan and abroad.
Drosophila genome/annotation
,
GadFly
A site that summarizes various data produced in the Drosophila genome project: (1) genome sequences and annotation, (2) gene expression patterns analyzed with in situ hybridization and validated with microarrays, with annotation conducted manually using a controlled vocabulary, (3) EST and full-length cDNA sequences, (4) transposon sequences, (5) gene disruption strains using a single P transposable element, (6) comparative genomic analysis of Drosophila, and (7) a SNP map.
Drosophila Gal4 enhancer trap insertion lines
,
GETDB - Gal4 Enhancer Trap Insertion Database -
Analyses of insertion positions, gene expression patterns, and the phenotypes of Drosophila strains with inserted Gal4 enhancer traps. Resources can be distributed.
FlyView
Drosophila development and genetics
FlyBase
Drosophila sequences and genomic information
DROSOPHILA GENE SEARCH PROJECT (DGSP) Database
Nematoda
IINO lab. Germline index
,
Phenotypes of worm, gene function assessed by RNAi
Phenotypes from RNAi gene function inhibition targeted to genes that are specifically expressed in C.elegans germ lines.
Database for worm, genome/annotation
,
WormBase
A database of biological information of C.elegans and other worms. It contains genomes (structures, functions, genetic polymorphisms, comparative genomic studies), genes (structures, expressions, phenotypes, and RNAi), lineages (strains, genetics, and markers), and literature information.
WorfDB - Worm ORF Database
Predicted proteins from C. elegans
WorTS
,
Worm TS mutant Database
National BioResource Project (NBRP)::C.elegans
Distribution pattern of antigens in embryonic worm
,
The Sugimoto Lab C. elegans Monoclonal Antibody Collection
Immunostaining images of stage specificity and localization of proteins in worm embryos are registered. The antibodies are distributed.
C.elegans mutant DB
C.elegans WWW server
at University of Texas Southwestern
C. elegans RNAi Phenome Database
,
Database of RNAi gene disrupted worms
Phenotypes of worm lineages that have undergone RNAi gene disruption are comprehensively registered. Clustering of the genes based on the phenotypes is registered and is presented in the form of lineage trees.
C. elegans Project
Genome sequencing data at the Sanger Institute
Fungi
Fungus and Actinomycete Gallery
Pathogenic Fungi Database (PFDB)
ESTs of Lentinus edodes (shiitake mushroom)
,
LeEST
Lentinus edodes cDNA libraries have been constructed, and the 5' ESTs are sequenced. They are registered with BLASTN search results.
Database of genomes and transcriptional regulations for fila
,
ESTs of Aspergillus oryzae
Aspergillus oryzae cDNA libraries were constructed under several conditions and the 5' ESTs were sequenced. The sequences are available only via FASTA homology search. Promoter analyses seem to be planned, because methods for constructing genome clones for that purpose are described. In addition, a cosmid clone of Aspergillus nidulans is registered.
Candida Genome Database
Candida albicans genome database
Candida
Contains genetics, physical map, sequence data and other resources on Candida albicans
Aspergillus oryzae RIB 40 genome DB
Aspergillus oryzae EST DataBase
Aspergillus
Aspergillus oryzae RIB 40 genome DB
Aspergillus oryzae EST DataBase
Yeast
Database of yeast gene expression
,
yMGV - Yeast microarray global viewer
A database of microarray analyses of gene expression in budding and fission yeasts.
Database of small nucleolar RNAs from budding yeast
,
Yeast snoRNA Database
A database of budding yeast snoRNA structures and their interactions with other RNAs.
Annotation of yeast, introns
,
Yeast Intron Database
A database of budding yeast introns. Known introns were checked with microarrays, and those that actually exist are compiled with the experimental data into a public database.
Yeast Interacting Proteins Database
UT Genome Browser (Yeast)
The Homeodomain Resource
,
a comprehensive collection of sequence, structure, interaction, genomic and functional information on the homeodomain protein family
The Homeodomain Resource is a curated collection of sequence, structure, interaction, genomic and functional information on the homeodomain family. The current version builds upon previous versions by the addition of new, complete sets of homeodomain sequences from fully sequenced genomes, the expansion of existing curated homeodomain information and the improvement of data accessibility through better search tools and more complete data integration. This release contains 1534 full-length homeodomain-containing sequences, 93 experimentally derived homeodomain structures, 101 homeodomain protein–protein interactions, 107 homeodomain DNA-binding sites and 206 homeodomain proteins implicated in human genetic disorders.
Database for baker’s or budding yeast genome/annotation
,
SGD - Saccharomyces Genome Database
A gene-based database of the molecular biological and genetic data of budding yeasts. Most of the annotations rely on manual information extraction from literature. Genomic locations of the genes, GO annotations, sequences for nucleic acids and amino acids, phenotypes, and expression data are registered.
Micrographs of the morphology of budding yeast mutants
,
SCMD - Saccharomyces cerevisiae Morphological Database
A database that classifies the morphology (budding states) of budding yeast mutants. Feature extraction from the morphology photographs and the classification were conducted computationally.
Database for yeast genome/annotation
,
S.pombe genome project
A database containing data from a fission yeast genome project at Sanger Institute. Genome sequences and the annotation, GO annotation, clone libraries, mapping resources (tiling path, gene map, physical map) are contained.
MBGD
,
Ortholog/ homolog of microbial genomes
A database of microorganism full-length genomes and the orthologous/homologous relations between the genes.
Génolevures
A comparison of S. cerevisiae and 14 other yeast species
Plantae
Bryophyta
Moss plants index
Physcomitrella patens
EST analysis of Physcomitrella patens (moss) gene expression
,
Physcomitrella patens Full-Length cDNA Clone Database Search
A list of Physcomitrella patens subsp. patens full-length cDNA clones distributed by RIKEN. Full-length sequences for the clones are registered.
ESTs of Physcomitrella patens (moss)
,
PHYSCObase
A database of Physcomitrella patens subsp. patens mRNA, ESTs, contigs, and experimental protocols. Downloadable.
Chlorophyta
Chlamy Base
Chlamydomonas reinhardtii EST index
,
ESTs of chlamydomonas (single celled green alga)
Chlamydomonas ESTs, the contig sequences, and BLASTX annotations of the contigs are registered. Supplement data for 11089912, Phycologia 43, 722-726 (2004).
Dr. Zompo
,
Zostera marina and Posidonia oceanica ESTs
As ecosystem engineers, seagrasses are angiosperms of paramount ecological importance in shallow shoreline habitats around the globe. Furthermore, the ancestors of independent seagrass lineages have secondarily returned into the sea in separate, independent evolutionary events. Thus, understanding the molecular adaptation of this clade not only makes significant contributions to the field of ecology, but also to principles of parallel evolution as well. With the use of Dr. Zompo, the first interactive seagrass sequence database presented here, new insights into the molecular adaptation of marine environments can be inferred. The database is based on a total of 14 597 ESTs obtained from two seagrass species, Zostera marina and Posidonia oceanica, which have been processed, assembled and comprehensively annotated. Dr. Zompo provides experimentalists with a broad foundation to build experiments and consider challenges associated with the investigation of this class of non-domesticated monocotyledon systems. Our database, based on the Ruby on Rails framework, is rich in features including the retrieval of experimentally determined heat-responsive transcripts, mining for molecular markers (SSRs and SNPs), and weighted key word searches that allow access to annotation gathered on several levels including Pfam domains, GeneOntology and KEGG pathways. Well established plant genome sites such as The Arabidopsis Information Resource (TAIR) and the Rice Genome Annotation Project are interfaced by Dr. Zompo. With this project, we have initialized a valuable resource for plant biologists in general and the seagrass community in particular. The database is expected to grow together with more data to come in the near future, particularly with the recent initiation of the Zostera genome sequencing project.
Magnoliophyta
Asteraceae
PRIDE (PSC-RIKEN Database of EST/Gene Expression)
,
Zinnia elegans EST /microarray gene expression
Gene expression analyses of Zinnia elegans with ESTs and microarrays are registered. ESTs are shown with the BLASTX search results for each sequence. Microarray analyses are accessible only via the GeNet system (this service may have been discontinued).
Brassicaceae
Arabidopsis
PRIMe: Correlated Gene Search
,
Tool that searches for correlated Arabidopsis genes
A service that searches pre-computed correlations in Arabidopsis GeneChip gene expression data. Genes correlated with the query genes are output with the correlation coefficient and the annotation. The underlying correlation analysis is public as ATTED-II (www.atted.bio.titech.ac.jp).
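As an illustration of what a correlated-gene search involves, the following is a minimal sketch that ranks genes by correlation with a query gene. It assumes Pearson correlation and computes it on the fly from toy data; the catalog entry does not state which coefficient the service uses, and the real service looks up pre-computed values. The gene names, the expression values, and the functions `pearson` and `correlated_genes` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles.

    Assumes neither profile is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlated_genes(expression, query, top_n=5):
    """Return up to top_n (gene, coefficient) pairs, most correlated first."""
    query_profile = expression[query]
    scores = [
        (gene, pearson(query_profile, profile))
        for gene, profile in expression.items()
        if gene != query
    ]
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:top_n]


# Hypothetical toy data: one expression value per experiment.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}

for gene, r in correlated_genes(expression, "geneA"):
    print(gene, round(r, 3))
```

Pre-computing the coefficients, as the entry describes, trades storage for query speed: a lookup replaces the pairwise calculation shown here.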
SASSC Quick search
RIKEN Arabidopsis Transcription Factor database (RARTF)
Arabidopsis Full-Length Clone Database Search
,
Catalogue of Arabidopsis, full-length cDNA
A list of Arabidopsis full-length cDNA clones distributed by RIKEN. Full length sequences for the clones are registered.
Activation T-DNA tags of mutant Arabidopsis lines
,
Activation Tagging Line Database
The database contains phenotypes and the photos of Arabidopsis activation tag lines that showed phenotypical changes in the T1 generation. The initial intent of the database was to distribute mutant strains.
Cluster cutting tool for gene expression data
,
PRIMe: Cluster Cutting
A tool to extract gene clusters containing specified genes from clustered GeneChip gene expression data. The results are displayed graphically with Java, and a part of the tree can be extracted further by referring to the tree. At present, Affymetrix GeneChip gene expression data of Arabidopsis generated at RIKEN and Max Planck are available on the server.
Database for Arabidopsis genome/annotation
,
The Arabidopsis Information Resource (TAIR)
A database of Arabidopsis genome, genes, and molecular biological data. It contains annotated genome, gene products, metabolism, gene expression, markers, strain resource information, and literature information.
Arabidopsis transposon mutant strains
,
RARGE [Ac/Ds Transposon Mutants]
A database of insertion positions and the adjacent genes of Arabidopsis transposon mutants. A part of RARGE.
Portal site for Arabidopsis research
,
RARGE- RIKEN Arabidopsis Genome Encyclopedia
A Web site for searching data and resources related to Arabidopsis research at RIKEN. Full-length cDNA, microarray analyses, transposon mutants, genomic locations of the genes, and splicing patterns are registered.
Arabidopsis full-length cDNA
,
RAFL cDNAs
A database of Arabidopsis full-length cDNA sequences and the BLASTX annotation. A part of RARGE.
Phenome Analysis of Ds transposon-tagging line in Arabidopsi
,
Phenotypes of transposon-insertional mutants? in Arabidopsis
A list of Arabidopsis Ds transposon insertion mutants with the mutated gene loci and the genomic locations (also categorized as UTR, coding region, exon, or intron). The shapes of the mutants are classified (eight primary categories and 50 secondary categories), but the list of phenotypes cannot be obtained. Detailed information is displayed via MIPS sites.
Overview of Arabidopsis transposon mutant strains
,
Plant Functional Genomics Research Group
A page describing the outline of a transposon mutation database
Database for Arabidopsis genome/ annotation
,
MAtDB
The database contains the whole sequences that were sequenced and annotated in Arabidopsis Genome Initiative. Mitochondria genomes and chloroplast genomes are also annotated and are contained in the database.
MAEDA (Micro Array Expression DAta search)
,
Microarray analysis of Arabidopsis gene expression
Arabidopsis gene expression analyses using microarrays that were manufactured from 7,000 full-length cDNA clones. GEO accessions: GSE4203, GPL3181.
KATANA
Collection of Arabidopsis gene annotations from various databases and the summarization to searchable and referable formats.
DART
CropNet
Genome mapping in crop plants
Arabidopsis EST analysis database
,
Arabidopsis thaliana EST Index
Arabidopsis ESTs and the annotation (based on BLASTX). Supplement data for 10907847.
Arabidopsis Gene Expression profile data base
ATTED-II
,
Gene expression database of Arabidopsis coexpression
ATTED-II simply presents graphs of the gene expression pattern for each gene.
ARTADEdb
,
Arabidopsis exon detection, tool and validation result
A database of Arabidopsis exons detected with tiling arrays conducted in RIKEN. Programs that were used in the data analysis are also made public.
Cassava
Cassava Full-Length cDNA Database
,
Cassava full-length cDNA annotation
Enables cross-database searches of external databases related to the cassava cDNAs collected at RIKEN. Keyword search and BLAST search are supported.
Cassava Clone Database Search
,
Cassava full-length clone sequences
Poaceae
Rice
Rice DNA Book
WhoGA
,
Whole Genome Annotation (Rice)
Rice Tos17 Insertion Mutant Database
,
rice, transposon gene disruption mutant strain, strain list
Rice strains with disrupted genes are produced by transferring endogenous transposons (Tos17) and are distributed as resources. The database contains flanking DNA sequences to the transposons.
Rice Research
Database of rice proteome
,
Rice Proteome Database
A database collecting 2D gel electrophoresis spots from rice tissues and organelles. Protocols for the proteomic analyses are also registered.
Rice Mitochondrial Genome Information (RMG)
Automatically annotated rice genome
,
Rice Genome Automated Annotation System (Rice GAAS)
A database of algorithmic annotations of the rice genome. It contains gene prediction (GENSCAN, RiceHMM, FGENESH, and MZEF), splice site prediction (SplicePredictor), homology searches (BLAST, HMMER, ProfileScan, and MOTIF), repetitive sequence searches (RepeatMasker, Printrepeats), signal sequence search (SignalScan), protein localization signal prediction (PSORT), and transmembrane protein secondary structure prediction (SOSUI).
Microarray gene expression of rice
,
Rice Expression Database (RED)
Gene expression data analyzed with rice cDNA microarrays are registered. The database contains NIAS and STAFF-derived data, as well as analyses from other projects using the same microarrays. The article for the database is Trends in Plant Science (2002) Dec 7 (12):563-564.
Database of rice, genome/annotation
,
Rice Annotation Database (RAD)
A database of rice genes (including predicted genes) and transcripts that are mapped on rice PAC/PAC contigs. The database contains summarized tables of spliced site patterns, amino acid compositions, codon usage compositions, gene length, GO annotations, and MIPS functional classification.
RPSD
,
Rice Protein Structure Database
A database that collected rice protein structures. The data consists of 3D-structures derived from PDB and predicted structures from GTOP.
Rice Microarray Opening Site (RMOS)
ESTs of rice
,
RGP Rice cDNA Sequence Database
The database contains ESTs sequenced in NIAS and the clustering analyses.
Database for rice genome/annotation
,
Rice Annotation Project DataBase (RAP-DB)
A database of rice genome on which genes (including predictions), transcripts (including several plant ESTs other than the rice), BAC, and mutant information are mapped.
NBRP Oryzabase
,
Rice research portal site
A portal site for data and resources related to rice research. Information related to species (lineages, wild species, mutants), gene expression (relationships with tissues and developmental stages),
Full-length cDNA clones from rice
,
Knowledge-Oriented Molecular Biological Encyclopedia (KOME)
Rice full-length cDNA sequences and the annotations (homologous genes, clustering analyses, InterPro motif searches, GO assignments) are registered. Supplement data for 12869764.
Integrated Rice Genome Explorer (INE)
,
Rice genome/annotation
A database of rice genome on which gene maps, physical maps, PCR markers, ESTs, BAC/PAC contigs are mapped.
EST database-viewing software? of crops
,
HarvEST
EST sequences and assemblies for barley, Brachypodium, citrus, coffee, cowpea, soybean, rice, and wheat. The database was constructed by Univ. California, Riverside, but sequence data are accepted from cooperating institutes in the project (e.g., Univ. Okayama provides barley ESTs). It also includes genome sequences provided by Affymetrix, which were source data for the genome chip, for barley, wheat, rice, and soybean.
CROP SCIENCE DATA BASE
Barley, Wheat
EST analysis of barley and seed images
,
NBRP-Barley
The database contains EST sequences from nine cDNA libraries of three strains and wild-type barley across developmental stages and tissues. The EST data overlap with HarvEST. It contains a list of germplasms that can be distributed from Okayama University, some of which are accompanied by photos of the seeds and the sprouts. The database also contains a list of representative strains (chosen in consideration of genetic diversity) called the Core Collection.
KOMUGI
,
Wheat Genetic Resources Database
EST database-viewing software? of crops
,
HarvEST
EST sequences and assemblies for barley, Brachypodium, citrus, coffee, cowpea, soybean, rice, and wheat. The database was constructed by Univ. California, Riverside, but sequence data are accepted from cooperating institutes in the project (e.g., Univ. Okayama provides barley ESTs). It also includes genome sequences provided by Affymetrix, which were source data for the genome chip, for barley, wheat, rice, and soybean.
GrainGenes
Molecular and phenotypic information on wheat, barley, rye, triticale, and oats
CropNet
Genome mapping in crop plants
Maize
Database for maize genome/annotation
,
MaizeGDB
Maize genomes and resources database. It contains genome sequences, conserved strains and phenotypes, mutant strains, and genetic maps.
Solanaceae
Tobacco
Tobacco EST clones from BY-2 cells Database Search
ESTs of Nicotiana tabacum cell line (BY-2)
,
Transcription Analysis of BY-2
cDNA libraries of the Nicotiana tabacum-derived cell line BY-2 were constructed and the ESTs were sequenced. BLASTX annotations for each EST and the clustering (with BLASTN searches between ESTs) are registered.
Tomato
ESTs of tomato
,
MiBASE
EST sequences of cDNA libraries from the fruits and the leaves of Micro-Tom tomato are registered. The annotations include homologous genes and clusters in UNIGENE database containing ESTs from other projects and GO terms. Gene expression using microarrays made of the cDNA libraries as described above seems to have been analyzed for tissues, developmental stages, and breeds, but no raw data can be obtained. Supplement data for 15975739, Plant Biotechnol. 22: 161-165(2005)
Fabaceae
Soybean Full-Length cDNA Database
ESTs of Lotus japonicus (model legume)
,
Lotus japonicus EST Index
Lotus japonicus EST sequences, 3'EST consensus sequences, and the annotations are registered. Supplement data for 10819328.
Genome of Lotus japonicus (model legume) / annotation
,
Lotus japonicus Genome Sequence Project
Lotus japonicus genome clones (TAC: transformation-competent artificial chromosomes), the annotations, chromosomal gene maps, and chloroplast gene sequences are registered. Supplement data for 12056416, 11853318, 11214967, 11853317.
BeanGenes
Beans genome db
Rhodophyta
ESTs of Porphyra yezoensis (red alga)
,
Porphyra yezoensis EST Index
Porphyra yezoensis ESTs and BLAST annotations are registered. Supplement data for 10907854, Journal of Phycology 39, 923-930 (2003).
Bacteria
Extremobiospheres Research Center
,
Genomes of Extremophiles/ annotation
Extremophile genome sequences and the annotations analyzed at JAMSTEC. The database contains a list of links to extremophile genome information analyzed at other institutes.
Database for Genes Contributing to Sustainable World
A database of annotations for more than 5,000,000 gene candidates that have no functional assignment in public databases. The candidates originate from the large number of nucleic acid sequences collected in large-scale sequencing of environmental biological samples (metagenome analyses), which is being advanced worldwide to discover genes that might be useful for the cleanup and conservation of the natural environment, leading to a sustainable society. It is part of the human resource development program of the database integration project.
tRNA gene database curated manually by experts
,
tRNADB-CE
Students took the lead in predicting tRNA candidate sequences across a large part of prokaryotic DNA sequences, including fragmentary genome sequences of uncultured microorganisms, using three programs (tRNAscan-SE, ARAGORN, tRNAfinder). Where the programs disagreed (about three percent of the predicted sequences), three tRNA experiment experts (Hachiro Iguchi, former professor of Kyoto University, Akira Muto, professor emeritus of Hirosaki University, and Yuko Yamada, former lecturer of Jichi Medical College) inspected the candidates closely before registering them in the database as tRNA. Because fragmentary genome sequences were added to the analysis targets, more than 140,000 tRNA genes were registered, four times more than in earlier databases. It is an outcome of the integrated database project.
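The triage step described for tRNADB-CE, separating calls on which all three programs agree from those needing expert inspection, can be sketched as follows. This is only an illustration under stated assumptions: each program's output is assumed to be already parsed into a set of (sequence id, start, end, anticodon) tuples, agreement is assumed to mean an identical tuple, and the function name `split_by_agreement` and all data are hypothetical. The actual parsing and comparison rules used by the project are not given in the catalog.

```python
def split_by_agreement(predictions):
    """Split tRNA calls into unanimous calls and calls needing manual review.

    predictions maps a program name to a set of
    (seq_id, start, end, anticodon) tuples (a hypothetical parsed format).
    """
    programs = list(predictions)
    all_calls = set().union(*predictions.values())
    agreed, needs_review = [], []
    for call in sorted(all_calls):
        if all(call in predictions[name] for name in programs):
            agreed.append(call)
        else:
            needs_review.append(call)
    return agreed, needs_review


# Hypothetical parsed output of the three predictors.
predictions = {
    "tRNAscan-SE": {("contig1", 100, 172, "GCC"), ("contig2", 50, 123, "TTC")},
    "ARAGORN": {("contig1", 100, 172, "GCC"), ("contig2", 50, 123, "TTC")},
    "tRNAfinder": {("contig1", 100, 172, "GCC"), ("contig3", 10, 84, "CAT")},
}

agreed, needs_review = split_by_agreement(predictions)
print("agreed:", agreed)
print("for expert review:", needs_review)
```

Requiring identical coordinates is the strictest possible definition of agreement; a real comparison might tolerate small boundary differences.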
Bacteria genomes/ annotation
,
a genome database of microorganisms sequenced at NITE. (DOGA
Annotations of the genomic sequences and the proteomic analyses (if any), ORF comparisons with other species (if any), and general descriptions of the microorganisms whose genomes NITE has sequenced. Supplement data for 9679194, 10382966, 11572479, 11418146, 12044378, 16237012, 12840036, 12692562, 16372010.
Xanthobase : Xanthomonas oryzae pv. oryzae genome database
Annotation of tmRNA sequences
,
The tmRNA website
A database of tmRNA information. It contains sequences, secondary structures (the bases in the sequences are colored by the structural elements), corresponding proteolysis tag peptides, as well as multiple alignments of all the tmRNA sequence sets and all the tag peptide sets. The sequences consist of those identified by direct sequencing and those obtained from public databases.
The international project of sequencing the Bacillus subtili
Streptomyces griseus Genome Database
Genome of Streptomyces avermitilis (industrial microorganism)/annotation
,
Streptomyces avermitilis Genome Database
A database of Streptomyces avermitilis (a microorganism producing avermectin, an anthelmintic) genome sequences and annotations. It contains physical maps, KEGG pathway analyses, protein families, secondary metabolic products, conserved genes, phylogenetic trees based on 16S rRNA, and instructions for requesting cosmid clones. Supplement data for articles (PubMed:11572948, 12692562).
Rhodococcus Genome Project
Genome database for Rhizobia with annotation
,
RhizoBase
Genome database of Rhizobia and a photosynthetic bacterium (Rhodopseudomonas palustris). Genome sequences, ORF information, genes and the gene categories are registered.
PseudoCAP
Pseudomonas aeruginosa genome database and community annotation project
Nocardia farcinica genome
A database of the whole genome sequence, including two plasmids, and the annotation of Nocardia farcinica IFM 10152 (a gram-positive aerobic actinomycete that causes nocardia infections). Supplement data for 15466710.
Genome of Mycoplasma penetrans (bacterial pathogen)/annotation
,
Mycoplasma penetrans genome
Mycoplasma (Mycoplasma penetrans) genome sequences and the gene predictions are registered. Supplement data for 12466555.
Microbial Genome Workbench
Genome sequences in the public domain of bacteria and archaebacteria were collected, and made searchable with homology, keywords, protein molecular weights, and pI.
MBGD
,
Ortholog/ homolog of microbial genomes
A database of microorganism full-length genomes and the orthologous/homologous relations between the genes.
ExtremoBase
DeinoBase
Whole genomes of Bifidobacterium adolescentis ATCC15703
Cyanobacteria
KazusaMart
Fluorome - The Cyanobacterial Chlorophyll Fluorescence Datab
Database for Genetic Engineering of Microalgae
CyanoClust database
,
Database of homologous proteins in cyanobacteria and plastids
CyanoBase
,
Genome database for Cyanobacteria
A database of the genomes of Cyanobacteria, a photosynthetic bacterium (Chlorobium tepidum TLS), and a purple bacterium (Rhodopseudomonas palustris). Genome sequences, ORF information, genes and the categories, mutations, and proteomic analyses (Cyano2Dbase) are registered. Supplement data for 8590279, 8905231, 9435137, 11759840, 11858227, 12240834, 14621292.
CYORF (Cyanobacteria Gene Annotation Database)
,
Workbench for Cyanobacteria gene annotation
A workbench for the Cyanobacteria research community to annotate the genes. General users can make searches on, refer to, and download the database.
E. coli
Database of Vibrio parahaemolyticus (gram-negative marine bacterium), genome
,
VPARA(Vibrio parahaemolyticus)
A database of the whole-genome sequences of Vibrio parahaemolyticus and the annotations. Downloadable.
Transcriptome analysis of 2 component system in Escherichia coli
Dictionary of E.coli gene
,
PEC
GenoBase
,
Integrated database of E.coli K-12 (W3110)
The database has been constructed in NAIST. It is a database of various information related to E.coli. It contains genome sequences, ORF, amino acid sequences, 2D-PAGE proteomics data, microarray gene expression data, literature information, bioinformatics analyses (ORF clustering, codon usage skewness, and clustering of gene expression profiles).
Database of pathogenic E.coli O157 genome
,
E.coli O157:H7 Sakai genome project
A database of E.coli O157:H7 Sakai genome sequences, pOSAK1 plasmid sequences, and the annotations. Download is possible.
Bacillus subtilis
SubtiWiki
,
the Bacillus subtilis centred wiki SubtiWiki
Bacillus subtilis is the model organism for Gram-positive bacteria, with a large amount of publications on all aspects of its biology. To facilitate genome annotation and the collection of comprehensive information on B. subtilis, we created SubtiWiki as a community-oriented annotation tool for information retrieval and continuous maintenance. The wiki is focused on the needs and requirements of scientists doing experimental work. This has implications for the design of the interface and for the layout of the individual pages. The pages can be accessed primarily by the gene designations. All pages have a similar flexible structure and provide links to related gene pages in SubtiWiki or to information in the World Wide Web. Each page gives comprehensive information on the gene, the encoded protein or RNA as well as information related to the current investigation of the gene/protein. The wiki has been seeded with information from key publications and from the most relevant general and B. subtilis-specific databases. We think that SubtiWiki might serve as an example for other scientific wikis that are devoted to the genes and proteins of one organism.
Automatic annotation of Bacillus subtilis (soil bacterium) genome
,
SubtiList
A database of annotation to Bacillus subtilis genome (gene location, functional assignment, and links to references).
Automatic annotation of Bacillus subtilis (soil bacterium) genome
,
NRSub
A database of Bacillus subtilis genome on which entries of SWISS-PROT, ENZYME, and HOBACGEN are mapped and cross-referenced.
Bacillus subtilis (soil bacterium) promoter/Knowledge model? of transcription factors,
,
DBTBS
A database of promoters, transcription factors, and regulated genes of Bacillus subtilis. Only experimentally verified targets from the literature are collected.
BSORF
,
Genome database of Bacillus subtilis (soil bacterium)/ annotation
ORFs and the annotations of the Bacillus subtilis genome, mutations for the corresponding genes, microarray gene expression patterns are registered.
Archaea
Transcription Product Database (TraP)
Microbial Genome Workbench
Genome sequences in the public domain of bacteria and archaebacteria were collected, and made searchable with homology, keywords, protein molecular weights, and pI.
MBGD
,
Ortholog/ homolog of microbial genomes
A database of microorganism full-length genomes and the orthologous/homologous relations between the genes.
Archaeal Gene Network (Arch GeNet)
,
Protein and gene expression of Thermoplasma volcanium GSS1(thermophilic archaebacterium)
Protein expression in a thermophilic archaebacterium under aerobic and anaerobic conditions was analyzed with 2-dimensional gel electrophoresis. In addition, gene expression under three types of environments was analyzed using microarrays.
ARCHAebacterial Information Collection (ARCHAIC)
,
Genome of archaebacteria/annotation
A database of genome sequences, gene structures and the sequences (nucleic acids and amino acids), pseudogenes, operons, and lineages for several archaebacteria. Supplement data for PubMed:9679194, PubMed:11121031.
Virus
Recombinant Virus Database
viral probe database
All the entries of the INSD Virus division are organized, and PCR primers have been designed for identification. There are many databases for the same purpose, enough to make up a category.
Analysis results of hepatitis virus gene, lineage
,
Hepatitis Virus Database
Phylogenetic analyses of hepatitis virus (types B, C, E) genes are registered. The analysis is performed algorithmically on hepatitis virus sequences from DDBJ INSD entries. It is updated after major updates.
HIV Infectious Disease Integrated Database
Virus sequences account for only a small part of the INSD sequences, but clinical information for about 600 virus hosts appears to be supplemented. The usage is unknown. Registration is required to view the clinical information.
HIV Sequence Database
HIV RNA sequences
HERVd - Human Endogenous Retrovirus database
Human endogenous retrovirus database
Protista
Protist Information Server
This server is providing 61,094 images of protists and other microorganisms (714 genera, 3067 species and 11855 samples) and 1294 movie clips as research and educational resources. This database is supported by the Soken-Taxa project Construction of Biological Image Databases (1997-1999) at The Graduate University for Advanced Studies, and by the Bio-Resource project Fundamental research and development for making databases and networking culture collection information (1997-2001) at JST (Japan Science and Technology Corporation). The database has received grant-in-aid (07558052).
Full-Toxoplasma
,
Full-length cDNA database of Toxoplasma
A database of the analyses of full-length cDNA clones of the toxoplasma. ESTs of each clone are mapped on draft genome sequences (contigs).
Full-Malaria
,
Plasmodium falciparum (malaria), full-length cDNA Database
A database of full-length cDNA clones and the analyses of two Plasmodia causing human malaria and two Plasmodia causing mouse malaria. ESTs for each clone are mapped to the genome as well as to ESTs in the public domain. The database also contains confirmed homology between Plasmodia with TBLASTX (Other Plasmodia contigs were mapped to the template P. falciparum genome).
Cryptosporidium Full-Length cDNA Database
,
Full-Cryptosporidium
Babesia Full-length cDNA database
,
Full-Babesia
Slime mold
Atlas (ISH Data Base)
,
Gene expression of Dictyostelium (social amoeba?)
Gene expression information of Dictyostelium discoideum, a cellular slime mold, is registered. Registered data are cDNA clones from each stage, EST, their assemblies, gene expression images with in-situ hybridization.
Target
DNA
Genome
Bacteria genomes/ annotation
,
a genome database of microorganisms sequenced at NITE. (DOGA
Annotations of the genomic sequences and the proteomic analyses (if any), ORF comparisons with other species (if any), and general descriptions of the microorganisms whose genomes NITE has sequenced. Supplement data for 9679194, 10382966, 11572479, 11418146, 12044378, 16237012, 12840036, 12692562, 16372010.
Sea Urchin Genome Database
,
SpBase
SpBase is a system of databases focused on the genomic information from sea urchins and related echinoderms.
Genome database for Rhizobia with annotation
,
RhizoBase
Genome database of Rhizobia and a photosynthetic bacterium (Rhodopseudomonas palustris). Genome sequences, ORF information, genes and the gene categories are registered.
NIG Mouse Genome Database
Genome of Lotus japonicus (model legume) / annotation
,
Lotus japonicus Genome Sequence Project
Lotus japonicus genome clones (TAC: transformation-competent artificial chromosomes), the annotations, chromosomal gene maps, and chloroplast gene sequences are registered. Supplement data for 12056416, 11853318, 11214967, 11853317.
CyanoBase
,
Genome database for Cyanobacteria
A database of the genomes of Cyanobacteria, a photosynthetic bacterium (Chlorobium tepidum TLS), and a purple bacterium (Rhodopseudomonas palustris). Genome sequences, ORF information, genes and the categories, mutations, and proteomic analyses (Cyano2Dbase) are registered. Supplement data for 8590279, 8905231, 9435137, 11759840, 11858227, 12240834, 14621292.
Aspergillus oryzae RIB 40 genome DB
cDNA
HGPD
,
Human Gene and Protein Database
Human Gene and Protein Database presents SDS-PAGE patterns and other information on human genes and proteins.
Full-Malaria
,
Plasmodium falciparum (malaria), full-length cDNA Database
A database of full-length cDNA clones and the analyses of two Plasmodia causing human malaria and two Plasmodia causing mouse malaria. ESTs for each clone are mapped to the genome as well as to ESTs in the public domain. The database also contains confirmed homology between Plasmodia with TBLASTX (Other Plasmodia contigs were mapped to the template P. falciparum genome).
3D Structure
Data bank for 3D structural information about nucleic acids ?
,
NDB
3D structure information on nucleic acids is accepted from researchers and made public as a database.
Motif
Database of functional repeats in mouse cDNA
,
FREP
A database of repetitive sequences in CDS regions of mouse cDNA sequences that are predicted to be functional. Locations on cDNA/genome, polyA signal locations, motifs in translated proteins, and related MeSH terms are registered.
Polymorphism
Gene
,
gene-centered information
Entrez Gene is a searchable database of genes, from RefSeq genomes, and defined by sequence and/or located in the NCBI Map Viewer
Gene Diversity DataBase System (GDBS)
Gene Diversity DataBase System (GDBS)
A Database of Japanese Single Nucleotide Polymorphism for Geriatric Research (JG-SNP)
,
Database of Japanese SNP for geriatric disease
The database contains sexes, ages, disease status, and polymorphisms of geriatric disease patients at Tokyo Metropolitan Geriatric Hospital. This information is registered in a separate database called GEAD (geriatric autopsy database) that contains the above-mentioned data as well as smoking history, drinking history, pathological findings, and the extent of atherosclerosis.
Genome Medicine Database of Japan (GeMDBJ)
,
Polymorphism, gene/ protein expression of diseases
Analyses of genetic polymorphisms (SNP), gene expression (with GeneChip) and protein expression (2D-DIGE, LC-MS/MS) related to Alzheimer disease, gastric cancer, diabetes, hypertension, and asthma. Limited patient information (sex, stratified age, prefecture of residence, past history, smoking history) is included. User registration is required to view some of the information.
The SNP Consortium database
SNP Consortium data
Database for human disease-associated gene mutations
,
KMDB/MutationView
Gene mutations related to human diseases have been collected from OMIM, GDB, and the literature; the loci were mapped to the chromosomes, and the affected sites were mapped to the human body and organs. Structural changes in the genes due to mutations (in the amino acids, in splicing, etc.) are registered. Nine sub-databases by disease category have been constructed (KMaiDB, KMbloodDB, KMbrainDB, KMcancerDB, KMearDB, KMeyeDB, KMheartDB, KMmuscleDB, KMsyndromeDB).
MutationView
Data bank of single nucleotide polymorphism in the United States.
,
dbSNP
A database that collects SNPs and short insertion/deletion polymorphisms.
SNPs in the transcriptional promoter regions in human
,
dbQSNP
A database of SNP sequences and allele frequency information (experimental data) in the promoter regions of the human genome (mainly the 1.0 kb upstream and 0.2 kb downstream regions of the TSS). SNP typing and quantification were performed by SSCP analysis.
Changes in the nucleic acid sequences that cause protein polymorphism
,
dbProP (a Protein Polymorphism database)
A database containing gene sequence alterations that affect amino acid sequences of the proteins, namely, SNPs in coding regions and alternative splicings. The target is human.
SNPedia
SNPedia shares information about the effects of variations in DNA, citing peer-reviewed scientific publications.
Human microsatellite polymorphism
,
Polymorphism of Microsatellite Marker Loci in the Japanese P
Heterozygosity of microsatellite markers in Japanese populations. Targeted markers are from deCODE Genetics 2002 (Kong A et al. Nat. Genet. 2002)
Database for human, non-synonymous SNPs
,
PicSNP/A Catalog of Non-Synonymous SNP
Non-synonymous SNPs are collected automatically. The source material is the NCBI human draft genome sequences; the gene sequences and the SNPs in the Feature annotations were compared with SwissProt to select non-synonymous SNPs. The GO classification of the identified genes is also shown.
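For readers unfamiliar with the term, a non-synonymous SNP is one that changes the encoded amino acid. The sketch below shows that check on a toy coding sequence using the standard genetic code. It is not the PicSNP pipeline, which compares annotated sequences against SwissProt; the function `is_nonsynonymous` and the example sequence are hypothetical, and the input is assumed to be an in-frame coding sequence on the coding strand.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_nonsynonymous(cds, pos, alt):
    """Return True if replacing cds[pos] with alt changes the amino acid.

    cds is an in-frame coding sequence; pos is a 0-based index into it.
    """
    offset = pos % 3
    codon_start = pos - offset
    ref_codon = cds[codon_start:codon_start + 3]
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    return CODON_TABLE[ref_codon] != CODON_TABLE[alt_codon]


# Toy coding sequence: ATG GCT GAA encodes Met-Ala-Glu.
cds = "ATGGCTGAA"
print(is_nonsynonymous(cds, 5, "C"))  # GCT -> GCC, both Ala
print(is_nonsynonymous(cds, 6, "A"))  # GAA -> AAA, Glu -> Lys
```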
PharmGKB
Variation in drug response based on human variation
Microsatellite markers of mouse strains
,
Mouse Microsatellite Data Base of Japan (MMDBJ)
A database of mouse microsatellite markers, as well as a repository that accepts new data from researchers. Analyses showing SSLP differences between mouse strains are registered together with PCR conditions. Emphasis is on Japanese mice (MSM and JF1).
KMsyndromeDB
It is a sub-database of MutationView. It registers data related to Waardenburg syndrome and long QT syndrome.
KMmuscleDB
It is a sub-database of MutationView. It registers data related to Duchenne muscular dystrophy and Fukuyama muscular dystrophy.
KMheartDB
It is a sub-database of MutationView. It registers data related to cardiomyopathy.
KMeyeDB
It is a sub-database of MutationView. It registers data related to retinitis pigmentosa, glaucoma, and cataract.
KMearDB
It is a sub-database of MutationView. It registers data related to deafness.
KMcancerDB
It is a sub-database of MutationView. It registers data related to breast cancer, retinoblastoma, and neurofibromatosis.
KMbrainDB
It is a sub-database of MutationView. It registers data related to Parkinson disease and Alzheimer disease.
KMbloodDB
It is a sub-database of MutationView. It registers data related to chronic myelocytic leukemia.
KMaiDB
It is a sub-database of MutationView. It registers data related to APECED and APT1 (an autoimmune disease and the related gene?). The information source is literature. The database is based on manual remapping that brings the coordinates of Mendelian disease mutations onto a common coordinate system.
Database of Japanese SNPs
,
JSNP
A database of genetic polymorphisms in Japanese populations which are generally observed on or adjacent to the genes.
Database for human metabolic disorder-related SNPs
,
JMDBase/Japan Metabolic disease DataBase
SNPs related to human metabolic diseases (especially hypertension and diabetes) are identified using original algorithms. The article 15716494 describes the effectiveness of the method and provides verification on five genes in three populations (Japanese, African Americans, and Caucasians).
Annotations of human genome variation
,
HGVbase
A database of human genome polymorphisms (mostly SNPs, but indels and tandem repeats are also included) that contains their physical and functional relationships to adjacent genes. The data have been obtained from public databases and the literature. Submissions from research groups were accepted in the past but are no longer accepted. The database routinely exchanges data with dbSNP.
GiiB-JST mtSNP
,
Human mitochondrial genome polymorphism database
The distribution of polymorphisms in mitochondrial genomes from patients with seven diseases (96 patients for each disease) has been analyzed. This is a database of the functional differences between individuals related to the paired polymorphisms. 672 INSD entries of mitochondrial genomes cite the same article.
Database for genome wide association studies (GWAS) of human
,
GWAS
The GWAS database is a database of allele frequencies and genotype frequencies obtained from GWAS (genome-wide association studies, which target unrelated patients and healthy controls and detect linkage disequilibrium between candidate disease genes and polymorphic markers across the whole genome).
D-HaploDB
,
SNPs of human definitive haplotypes determined by complete hydatidiform moles
Haplotype analysis of 280,000 SNPs typed with microarrays in samples of Japanese complete hydatidiform moles.
Regulatory Region
UniSTS
,
markers and mapping data
UniSTS is a comprehensive database of sequence tagged sites (STSs) derived from STS-based maps and other experiments. STSs are defined by PCR primer pairs and are associated with additional information such as genomic position, genes, and sequences.
Gene
,
gene-centered information
Entrez Gene is a searchable database of genes, from RefSeq genomes, and defined by sequence and/or located in the NCBI Map Viewer
Database on Translational Signals
PlantPromoterDB
,
ppdb
Mammalian Promoter/Enhancer DataBase
,
PEDB
MachiBase
,
a Drosophila melanogaster 5'-end mRNA transcription database
EPD
,
The Eukaryotic Promoter Database
Eukaryotic POL II promoters with experimentally determined transcription start sites
DBTSS
,
Database of transcription start sites
A database of transcription start sites that have been determined by mapping ESTs of full-length cDNA clones to genomes. Presently, the target species are human, mouse, zebrafish, Plasmodia, and Protoflorideophycidae.
Bacillus subtilis (soil bacterium) promoter/Knowledge model? of transcription factors
,
DBTBS
A database of promoters, transcription factors, and controlled genes of Bacillus subtilis. Only experimentally verified targets from the literature are collected.
sequence
Gene-oriented clusters of transcript sequences
,
UniGene
UniGene is an experimental system for automatically partitioning GenBank sequences into a non-redundant set of gene-oriented clusters. Each UniGene cluster contains sequences that represent a unique gene, as well as related information such as the tissue types in which the gene has been expressed and map location.
SRA
,
Sequence Read Archive
The Sequence Read Archive (SRA) stores sequencing data from the next generation of sequencing platforms including Roche 454 GS System®, Illumina Genome Analyzer®, Applied Biosystems SOLiD® System, Helicos Heliscope®, and others.
Probe
,
sequence-specific regions
The NCBI Probe Database is a public registry of nucleic acid reagents designed for use in a wide variety of biomedical research applications, together with information on reagent distributors, probe effectiveness, and computed sequence similarities.
Nucleotide
,
The Entrez Nucleotide database
The Entrez Nucleotide database is a collection of sequences from several sources, including GenBank, RefSeq, and PDB. The number of bases in these databases continues to grow at an exponential rate.
HomoloGene
,
eukaryotic homology groups
HomoloGene is a system for automated detection of homologs among the annotated genes of several completely sequenced eukaryotic genomes.
Genome Project
,
genome project information
The NCBI Entrez Genome Project database is intended to be a searchable collection of complete and incomplete (in-progress) large-scale sequencing, assembly, annotation, and mapping projects for cellular organisms. The database is organized into organism-specific overviews that function as portals from which all projects in the database pertaining to that organism can be browsed and retrieved
Genome
,
whole genome sequences
The Genome database provides views for a variety of genomes, complete chromosomes, sequence maps with contigs, and integrated genetic and physical maps. The database is organized in six major organism groups: Archaea, Bacteria, Eukaryotae, Viruses, Viroids, and Plasmids and includes complete chromosomes, organelles and plasmids as well as draft genome assemblies.
Gene
,
gene-centered information
Entrez Gene is a searchable database of genes, from RefSeq genomes, and defined by sequence and/or located in the NCBI Map Viewer
GSS
,
The Genome Survey Sequences Database
The Genome Survey Sequences Database (GSS) division of GenBank is similar to the EST division, with the exception that most of the sequences are genomic in origin, rather than cDNA (mRNA). It should be noted that two classes (exon trapped products and gene trapped products) may be derived via a cDNA intermediate. Care should be taken when analyzing sequences from either of these classes, as a splicing event could have occurred and the sequence represented in the record may be interrupted when compared to genomic sequence. The GSS division contains (but is not limited to) the following types of data:
GEO profiles
,
expression and molecular abundance profiles
This database stores individual gene expression profiles from curated DataSets in the Gene Expression Omnibus (GEO) repository. Search for specific profiles of interest based on gene annotation or pre-computed profile characteristics. GEO Profiles facilitates powerful searching and linking to additional information sources.
Extremobiospheres Research Center
,
Genomes of Extremophiles/ annotation
Extremophile genome sequences and annotations analyzed at JAMSTEC. The database contains a list of links to extremophile genome information analyzed at other institutes.
Database for Genes Contributing to Sustainable World
A database of annotations for more than 5,000,000 gene candidates that have no function assigned in public databases. The candidates originate from the large number of nucleic acid sequences collected in large-scale sequencing of environmental biological samples (metagenome analyses), which is advancing worldwide to discover genes that might be useful for the cleanup and conservation of the natural environment, leading to a sustainable society. It is part of the human resource development program of the database integration project.
Full-length human cDNA Database
A database of the METI "full-length cDNA sequencing and analysis project", containing 30,000 human full-length cDNA sequences collected using the oligo-cap method.
DIGITized Genes
A list of candidate genes identified by applying the DIGIT program to the human genome.
Genome Network Platform
A database of human and mouse cDNA CAGE tag sequencing data and of molecular interaction data between transcription factors based on the yeast two-hybrid method; both are outcomes of the genome function information analyses in the MEXT Genome Network Project.
viral probe database
All the entries of the INSD Virus division are organized, and PCR primers have been designed for their identification. Many databases serve the same purpose and together make up a category.
Bacteria genomes/ annotation
,
a genome database of microorganisms sequenced at NITE. (DOGA
Annotations of the genomic sequences, proteomic analyses (if any), ORF comparisons with other species (if any), and general descriptions of the microorganisms whose genomes NITE has sequenced. Supplement data for 9679194, 10382966, 11572479, 11418146, 12044378, 16237012, 12840036, 12692562, 16372010.
Annotation of yeast, introns
,
Yeast Intron Database
A database of budding yeast introns. Known introns were checked with microarrays, and those confirmed to exist are compiled with the experimental data into a public database.
Database of Vibrio parahaemolyticus (gram-negative marine bacterium), genome
,
VPARA(Vibrio parahaemolyticus)
A database of the whole-genome sequences of Vibrio parahaemolyticus and the annotations. Downloadable.
SubtiWiki
,
the Bacillus subtilis centred wiki SubtiWiki
Bacillus subtilis is the model organism for Gram-positive bacteria, with a large amount of publications on all aspects of its biology. To facilitate genome annotation and the collection of comprehensive information on B. subtilis, we created SubtiWiki as a community-oriented annotation tool for information retrieval and continuous maintenance. The wiki is focused on the needs and requirements of scientists doing experimental work. This has implications for the design of the interface and for the layout of the individual pages. The pages can be accessed primarily by the gene designations. All pages have a similar flexible structure and provide links to related gene pages in SubtiWiki or to information in the World Wide Web. Each page gives comprehensive information on the gene, the encoded protein or RNA as well as information related to the current investigation of the gene/protein. The wiki has been seeded with information from key publications and from the most relevant general and B. subtilis-specific databases. We think that SubtiWiki might serve as an example for other scientific wikis that are devoted to the genes and proteins of one organism.
Streptomyces griseus Genome Database
Genome of Streptomyces avermitilis (industrial microorganism)/annotation
,
Streptomyces avermitilis Genome Database
A database of Streptomyces avermitilis (a microorganism producing avermectin, an anthelmintic) genome sequences and annotations. It contains physical maps, KEGG pathway analyses, protein families, secondary metabolic products, conserved genes, phylogenetic trees based on 16S rRNA, and instructions for requesting cosmid clones. Supplement data for articles (PubMed:11572948, 12692562).
Genome database for anthropoids/ phylogenetic comparison
,
Silver Project (Ape Genome Sequencing)
Genome sequences for the chimpanzee and gorilla. Comparisons of DNA sequences between human and these anthropoids, based on the DNA sequences determined in the Apes Genome Project (Silver), are registered.
Rice Mitochondrial Genome Information (RMG)
Genome database for Rhizobia with annotation
,
RhizoBase
A genome database for Rhizobia and a photosynthetic bacterium (Rhodopseudomonas palustris). Genome sequences, ORF information, genes, and gene categories are registered.
RPG
,
Ribosomal protein gene database
A database of genes encoding ribosomal proteins. It contains nucleic acid sequences, amino acid sequences, gene structures, orthologs for the genes, as well as multiple alignments for each orthologous group. Human data were obtained in a project. Other species data were obtained from public databases.
ROUGE
,
cDNA of mouse, unidentified gene-encoded large proteins?/annotation
A database of cDNA clones (mKIAA/mFLJ) and the analyses from Kazusa mouse cDNA project. A mouse version of HUGE database.
RAPID
,
Resource of Asian Primary Immunodeficiency Diseases
Resource of Asian Primary Immunodeficiency Diseases (RAPID) is a web-based compendium of molecular alterations in primary immunodeficiency diseases.
PLAnt Cis-acting Regulatory DNA Elements Database
,
PLAnt Cis-acting Regulatory DNA Elements Database (PLACE)
A database of plant cis-element motifs collected from literature. Downloadable.
ESTs of Physcomitrella patens (moss)
,
PHYSCObase
A database of Physcomitrella patens subsp. patens mRNAs, ESTs, contigs, and experimental protocols. Downloadable.
Knowledge model of operons
,
ODB
A database of operons from various species that have been collected and reconstructed from literature and databases. A function to explore operon candidates based on prediction is provided.
Nocardia farcinica genome
A database of the whole genome sequence, including two plasmids, and the annotation of Nocardia farcinica IFM 10152 (a gram-positive aerobic actinomycete that causes Nocardia infections). Supplement data for 15466710.
Automatic annotation of Bacillus subtilis (soil bacterium) genome
,
NRSub
A database of Bacillus subtilis genome on which entries of SWISS-PROT, ENZYME, and HOBACGEN are mapped and cross-referenced.
Genome of Mycoplasma penetrans (bacterial pathogen)/annotation
,
Mycoplasma penetrans genome
Mycoplasma (Mycoplasma penetrans) genome sequences and the gene predictions are registered. Supplement data for 12466555.
MitoFish
,
Mitochondrial genome database of fish
Fish mitochondrial genome sequences were collected from GenBank and RefSeq. Taxonomical information was obtained from FishBase, NCBI Taxonomy Browser, The Catalog of Fishes, and Fish Database of Japan. Literature information was obtained from PubMed. Related biological sequences were obtained from DDBJ.
Microbial Genome Workbench
Publicly available genome sequences of bacteria and archaebacteria were collected and made searchable by homology, keywords, protein molecular weights, and pI.
Genome of Lotus japonicus (model legume) / annotation
,
Lotus japonicus Genome Sequence Project
Lotus japonicus genome clones (TAC: transformation-competent artificial chromosomes), the annotations, chromosomal gene maps, and chloroplast gene sequences are registered. Supplement data for 12056416, 11853318, 11214967, 11853317.
Integrated Rice Genome Explorer (INE)
,
Rice genome/annotation
A database of rice genome on which gene maps, physical maps, PCR markers, ESTs, BAC/PAC contigs are mapped.
Analysis results of hepatitis virus gene, lineage
,
Hepatitis Virus Database
Phylogenetic analyses of hepatitis virus (types B, C, E) genes are registered. The analyses are computed algorithmically from hepatitis virus sequences in DDBJ INSD entries. It is updated after major updates.
HUGE - Human Unidentified Gene-Encoded large proteins
,
Unidentified long human cDNA/annotation
A database of cDNA sequences and the analyses of KIAA/FLJ clones collected in Kazusa human cDNA project. The initial purpose was to analyze unidentified genes corresponding to large (>50kDa) proteins. The database registers cDNA sequences, restriction maps, signal sequences, amino acid sequences, motif searches with Pfam and SOSUI.
Automatically annotated whole human genome
,
HOWDY
Human genome data are collected from public databases all over the world, and each entry is located on the chromosomes so that the entries can be related to each other.
HIV Infectious Disease Integrated Database
Virus sequences account for only a small part of the INSD sequences, but clinical information for about 600 virus hosts appears to have been added. The usage is unknown. Registration is required to view the clinical information.
HGS (Human Genome Sequencing)
Automatically annotated microbial genomes
,
Genome Information Broker
It is a database of full-length genomes mainly of microorganisms. Structures, gene names, functions, and characteristics are allocated to each genome's ORFs and are contained in the database.
Data bank of nucleic acid sequences in the United States
,
GenBank(R)
It is a databank that accepts deposition of nucleic acid sequences from researchers in various countries. It routinely exchanges data with EBI (EMBL) and DDBJ.
Full-Tsetse
,
Tsetse Full-length cDNA database
Full-Toxoplasma
,
Full-length cDNA database of Toxoplasma
A database of the analyses of full-length cDNA clones of the toxoplasma. ESTs of each clone are mapped on draft genome sequences (contigs).
Full-Mite
,
Mite Full-length cDNA database
Echinococcosis Full-length cDNA database
,
Full-Echinococcosis
Cryptosporidium Full-Length cDNA Database
,
Full-Cryptosporidium
Babesia Full-length cDNA database
,
Full-Babesia
Anopheles Full-length cDNA database
,
Full-Anopheles
Database for human full-length cDNAs
,
FLJ Human cDNA Database
Outcomes of the sequencing and sequence analyses of the METI protein function analysis project / development of splicing variant acquisition technology. It consists of a human splicing variant cDNA sequence database and a complete database of human full-length cDNA sequences obtained by the FLJ oligo-cap method (about 50,000 sequences analyzed over their full length and about 1,500,000 sequences analyzed at the 5' end).
ExtremoBase
EST
,
Expressed Sequence Tags database
Expressed Sequence Tags database (dbEST) (Nature Genetics 4:332-3;1993) is a division of GenBank that contains sequence data and other information on "single-pass" cDNA sequences, or "Expressed Sequence Tags", from a number of organisms.
Data bank of nucleic acid sequences in Europe
,
EMBL Nucleotide Sequence Database: developments in 2005.
It is a databank that accepts deposition of nucleic acid sequence data from researchers in European countries. It routinely exchanges data with U.S. NCBI (GenBank) and Japan DDBJ.
EGTC
,
Mouse genes, trapped with gene trap method
Trap clones were obtained from mouse ES cells using a new gene trap system (exchangeable gene trap system). Based on them, gene transfer or disruption was conducted, and lineage resources (oocytes and sperm) are distributed. The database registers tag sequences for the trapped genes, homology search results, and mutated lineages.
Database of pathogenic E. coli O157 genome
,
E.coli O157:H7 Sakai genome project
A database of E. coli O157:H7 Sakai genome sequences, pOSAK1 plasmid sequences, and the annotations. Downloadable.
Dr. Zompo
,
Zostera marina and Posidonia oceanica ESTs
As ecosystem engineers, seagrasses are angiosperms of paramount ecological importance in shallow shoreline habitats around the globe. Furthermore, the ancestors of independent seagrass lineages have secondarily returned into the sea in separate, independent evolutionary events. Thus, understanding the molecular adaptation of this clade not only makes significant contributions to the field of ecology, but also to principles of parallel evolution as well. With the use of Dr. Zompo, the first interactive seagrass sequence database presented here, new insights into the molecular adaptation of marine environments can be inferred. The database is based on a total of 14 597 ESTs obtained from two seagrass species, Zostera marina and Posidonia oceanica, which have been processed, assembled and comprehensively annotated. Dr. Zompo provides experimentalists with a broad foundation to build experiments and consider challenges associated with the investigation of this class of non-domesticated monocotyledon systems. Our database, based on the Ruby on Rails framework, is rich in features including the retrieval of experimentally determined heat-responsive transcripts, mining for molecular markers (SSRs and SNPs), and weighted key word searches that allow access to annotation gathered on several levels including Pfam domains, GeneOntology and KEGG pathways. Well established plant genome sites such as The Arabidopsis Information Resource (TAIR) and the Rice Genome Annotation Project are interfaced by Dr. Zompo. With this project, we have initialized a valuable resource for plant biologists in general and the seagrass community in particular. The database is expected to grow together with more data to come in the near future, particularly with the recent initiation of the Zostera genome sequencing project.
DeinoBase
Database of genomes and transcriptional regulations for fila
,
ESTs of Aspergillus oryzae
Aspergillus oryzae cDNA libraries were constructed under several conditions and the 5' ESTs were sequenced. The sequences are available only via FASTA homology search. Promoter analyses seem to be planned, because methods for constructing genomic clones for that purpose are described. In addition, a cosmid clone of Aspergillus nidulans is registered.
Database of Genomic Variants
The objective of the Database of Genomic Variants is to provide a comprehensive summary of structural variation in the human genome. We define structural variation as genomic alterations that involve segments of DNA that are larger than 1 kb. For the purpose of this database, we focus on variants that are not directly correlated with specific phenotypes. The Database of Genomic Variants provides a useful catalog of control data for studies aiming to correlate genomic variation with phenotypic data. The database is continuously updated with new data from peer-reviewed research studies. We always welcome suggestions and comments regarding the database from the research community.
DaphniaBASE
A database of Daphnia clone sequences and BLAST search results for those sequences.
DDBJ Trace Archive
DDBJ - DNA Data Bank of Japan
,
Nucleic acids, sequence data bank of Japan
A databank that accepts nucleic acid sequence data mainly from Japanese researchers. Data from all over the world are collected through routine data exchange with U.S. NCBI (GenBank) and European EBI (EMBL).
DBH2H
,
head-to-head (h2h) gene pairs
DBH2H collects head-to-head (h2h) gene pairs identified from human, mouse, rat, chicken and fugu genomes, and distinguishes the ortholog mapping relationship among them. The gene pairs in DBH2H are annotated with sequential features including single nucleotide polymorphisms, CpG islands and transcription factor binding sites, as well as functional terms and genetic disorders. In addition, the expression correlation information based on 117 microarray datasets is included. By providing user-friendly access to these data, DBH2H represents a valuable resource for further analyses of this important gene arrangement in terms of transcriptional regulation mechanisms, evolutionary conservation, disease relevance, etc.
Cytokine Family cDNA Database (dbCFC)
,
Genes, and proteins of cytokines
This database and portal site collects information on cytokine genes, cDNAs, and proteins from public databases and arranges it by family and by gene. All detailed information is provided as links to the original databases.
CyanoBase
,
Genome database for Cyanobacteria
A database of the genomes of Cyanobacteria, a photosynthetic bacterium (Chlorobium tepidum TLS), and a purple bacterium (Rhodopseudomonas palustris). Genome sequences, ORF information, genes and their categories, mutations, and proteomic analyses (Cyano2Dbase) are registered. Supplement data for 8590279, 8905231, 9435137, 11759840, 11858227, 12240834, 14621292.
CropNet
Genome mapping in crop plants
Comparasite
CUTG - Codon Usage Tabulated from GenBank
,
Database of codon usage frequency
A database of codon usage probability in various species based on GenBank CDS.
Aspergillus oryzae RIB 40 genome DB
Ancient Genome Encyclopedia : AGE
Alternative Splicing and Transcription Archives (ASTRA)
Alternative exon patterns by alternative splicing are predicted by comparison of the genome and cDNA. Target species are human, mouse, Drosophila, worm, Arabidopsis, and rice.
ARCHAebacterial Information Collection (ARCHAIC)
,
Genome of archaebacteria/annotation
A database of genome sequences, gene structures and the sequences (nucleic acids and amino acids), pseudogenes, operons, and lineages for several archaebacteria. Supplement data for PubMed:9679194, PubMed:11121031.
Medical
The Contents Library of Medical Information
Online learning contents mainly for practicing physicians
(cited from the database top page) This database is an electronic version of cases of organic solvent poisoning that originally were published from 1984 through 2000. The literature information, occupational disease application and decision status are at the time of case collection. All the diagnoses of the cases were made by case reporters, and they contain a wide range of diagnosis confidence, from cases of definite organic solvent poisoning to possible cases of poisoning.
Japan Adult Cardiovascular Surgery Database
Pre- and post-operative physical status, operative procedures, and outcomes of patients who underwent cardiovascular surgery.
A database built from the outcomes of the MHLW study group "EBM evaluation of existing treatments for atopic dermatitis and dissemination of effective treatments" (chief researcher: Masutaka Furue) (2002-2004). It consists of a data collection for healthcare professionals and a Q&A for the general population.
Integrated Clinical Omics Database
,
iCOD
Tropical Medicine and Health DATABASE
A contents database of "Tropical Medicine and Health"
Kidney Development Database
Kidney development and gene expression
IBMD
,
Integrated Biomedical Database
H-ANGEL
Pathology
Database for hematological malignancy
Pathology Core Pictures
A database of the minimum set of pathological images needed to follow the six-year medical education program, based on the medical education model core curriculum.
Gastrointestinal Medical Image Database
Breast Tumor Image Database
PDDB
,
the Prion Disease Database
Prion diseases reflect conformational conversion of benign isoforms of prion protein (PrPC) to malignant PrPSc isoforms. Networks perturbed by PrPSc accumulation and their ties to pathological events are poorly understood. Time-course transcriptomic and phenotypic data in animal models are critical for understanding prion-perturbed networks in systems biology studies. Here, we present the Prion Disease Database (PDDB), the most comprehensive data resource on mouse prion diseases to date. The PDDB contains: (i) time-course mRNA measurements spanning the interval from prion inoculation through appearance of clinical signs in eight mouse strain-prion strain combinations and (ii) histoblots showing temporal PrPSc accumulation patterns in brains from each mouse–prion combination. To facilitate prion research, the PDDB also provides a suite of analytical tools for reconstructing dynamic networks via integration of temporal mRNA and interaction data and for analyzing these networks to generate hypotheses.
H-ANGEL
Disease, Cancer
Genome Medicine Database of Japan (GeMDBJ)
,
Polymorphism, gene/ protein expression of diseases
Analyses of genetic polymorphisms (SNPs), gene expression (with GeneChip), and protein expression (2D-DIGE, LC-MS/MS) related to Alzheimer disease, gastric cancer, diabetes, hypertension, and asthma. Limited patient information (sex, age group, prefecture of residence, medical history, smoking history) is included. User registration is required to view some of the information.
Breast Tumor Image Database
dbGaP
RAPID
,
Resource of Asian Primary Immunodeficiency Diseases
Resource of Asian Primary Immunodeficiency Diseases (RAPID) is a web-based compendium of molecular alterations in primary immunodeficiency diseases.
PrognoScan
Human p53, human hprt, rodent lacI and rodent lacZ databases
Mutations at the human p53 and hprt genes; rodent transgenic lacI and lacZ mutations
HGMD® - Human Gene Mutation Database
Known (published) gene lesions underlying human inherited disease
Cell Line Catalog
A cell line catalogue of cancer cell lines that the cancer cell bank distributes.
Atlas of Genetics and Cytogenetics in Oncology and Haematolo
,
Overview of the cytogenetics of cancer cells
A database that has collected cytogenetics information and clinical information of tumors and cancer-related diseases. Genes (nucleic acids, proteins, mutations, related diseases), cytogenetics/clinical information (clinical manifestation, cytogenetics data, related genes, complex genes and fused proteins), cancer-related diseases (inheritance modes, clinical manifestation, risk of tumorigenesis, cytogenetics data, related genes, proteins, mutations) have been collected separately. Clinical case reports are collected.
ARSA
Pathway
Dictionary of Wnt proteins
,
Wnt Database
A portal site for information on Wnt proteins (highly conserved proteins that control intercellular interactions during embryogenesis). The database format differs by organism and topic. Signal pathway information (images) is included.
Knowledge model of signal transduction pathways
,
The Signaling PAthway Database (SPAD)
A database that is a collection and visualization of signal transduction pathways. The pathways are classified as growth factors, cytokines, hormones, and stresses, in correspondence to extracellular signal molecules.
Pathguide
Macrophage Curated Database
Database of kinase, pathways
,
Kinase Pathway Database
A database of protein kinases of major eukaryotes with completely sequenced genomes. Protein classes and functions, orthologous relations between species, protein interactions, protein domains, and protein structures are registered as well. Protein interactions were collected with natural language processing of literature.
KEGG Pathway
,
Knowledge model ?of biomolecular interactions/reactions?
A database collecting pathways (molecular interactions). Metabolism maps, inter/intracellular information processing maps, human disease association maps are registered.
INOH pathway database
Glycan
,
Glycome related pathways
A database of sugar chains and the relevant pathways that have been collected from KEGG, CarbBank, and literature. Tools to generate possible structures of sugar chains are included. A part of KEGG LIGAND.
GSCope3
,
Ranking of pathways with SOM clustered microarray data
A tool that analyzes the correlation between gene expression data clustered with SOM and KEGG pathway data in order to rank and extract pathways that are likely to be supported by the gene expression data. The required inputs are KEGG pathways, SOM clusters, and an appendix file that connects the two sets of IDs.
Cytokine Signaling Pathway Database
,
Signaling pathway of cytokine
Information related to cytokine signaling pathways is collected. Ligand-receptor relations of the chemokines, 3D structures and domain structures of receptors, interspecies lineages of receptors, a list of kinases, and links to other databases are registered.
BRITE
,
Knowledge model of functional hierarchies and binary relationships of biological systems
A database of hierarchical representations of the relationships in biological systems. Gene orthologs, protein families, protein interactions, chemical compounds and their reactions, drugs, and diseases are included.
ontology
ChEBI - Chemical Entities of Biological Interest
Small molecules, atoms, ions and radicals of biological interest
BioTermNet
phenotype
Genotype and Phenotype
,
dbGaP
The database of Genotypes and Phenotypes (dbGaP) was developed to archive and distribute the results of studies that have investigated the interaction of genotype and phenotype. Such studies include genome-wide association studies, medical sequencing, molecular diagnostic assays, as well as association between genotype and non-clinical traits. The advent of high-throughput, cost-effective methods for genotyping and sequencing has provided powerful tools that allow for the generation of the massive amount of genotypic data required to make these analyses possible.
The mi-R ontology database
,
miRò
miRò is a web-based knowledge base that provides users with miRNA–phenotype associations in humans. It integrates data from various online sources, such as databases of miRNAs, ontologies, diseases and targets, into a unified database equipped with an intuitive and flexible query interface and data mining facilities. The main goal of miRò is the establishment of a knowledge base which allows non-trivial analysis through sophisticated mining techniques and the introduction of a new layer of associations between genes and phenotypes inferred based on miRNAs annotations. Furthermore, a specificity function applied to validated data highlights the most significant associations.
The Homeodomain Resource
,
a comprehensive collection of sequence, structure, interaction, genomic and functional information on the homeodomain protein family
The Homeodomain Resource is a curated collection of sequence, structure, interaction, genomic and functional information on the homeodomain family. The current version builds upon previous versions by the addition of new, complete sets of homeodomain sequences from fully sequenced genomes, the expansion of existing curated homeodomain information and the improvement of data accessibility through better search tools and more complete data integration. This release contains 1534 full-length homeodomain-containing sequences, 93 experimentally derived homeodomain structures, 101 homeodomain protein–protein interactions, 107 homeodomain DNA-binding sites and 206 homeodomain proteins implicated in human genetic disorders.
portal
Dictionary of Wnt proteins
,
Wnt Database
A portal site for information on Wnt proteins (highly conserved proteins that control intercellular interactions during embryogenesis). The database format differs by organism and topic. Signal pathway information (images) is included.
Pathguide
Database
,
The Journal of Databases and Curation
Database: The Journal of Biological Databases and Curation provides a platform for the presentation of novel ideas in database research surrounding biological information, and aims to help strengthen the bridge between database developers and users.
PAGE
World-2DPAGE Repository
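A database of 2D-PAGE data curated from published papers. Detailed information, down to the intensity of individual spots, can be accessed, and the text can be retrieved in an EMBL-like format.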
HGPD
,
Human Gene and Protein Database
Human Gene and Protein Database presents SDS-PAGE patterns and other information on human genes and proteins.
crystal structure analysis
PDDB
,
the Prion Disease Database
Prion diseases reflect conformational conversion of benign isoforms of prion protein (PrPC) to malignant PrPSc isoforms. Networks perturbed by PrPSc accumulation and their ties to pathological events are poorly understood. Time-course transcriptomic and phenotypic data in animal models are critical for understanding prion-perturbed networks in systems biology studies. Here, we present the Prion Disease Database (PDDB), the most comprehensive data resource on mouse prion diseases to date. The PDDB contains: (i) time-course mRNA measurements spanning the interval from prion inoculation through appearance of clinical signs in eight mouse strain-prion strain combinations and (ii) histoblots showing temporal PrPSc accumulation patterns in brains from each mouse–prion combination. To facilitate prion research, the PDDB also provides a suite of analytical tools for reconstructing dynamic networks via integration of temporal mRNA and interaction data and for analyzing these networks to generate hypotheses.
ConfC
3D structure
Structure
,
three-dimensional macromolecular structures
The resources developed by the Structure Group of the NCBI Computational Biology Branch (CBB) are freely available to the public and focus on four areas:
CDD
,
conserved protein domain database
Conserved domains are functional units within a protein that have been used as building blocks in molecular evolution and recombined in various arrangements to make proteins with different functions.
Protein Structure of proteins that contain hydrogen and hydration water
Analysis of Protein 3D structures containing hydrogen and hydration water with neutron diffraction method. The database contains PDB-derived X-ray structural analysis data and neutron diffraction data.
Protein-Nucleic Acid complex database
Protein-nucleic acid complex entries have been collected from PDB.
PDB-REPRDB Representative protein chains from PDB
,
Representative protein chains
A database that groups PDB proteins by similarity in sequence and structure. An interface is provided for selecting representative proteins for the groups by applying specified rules.
RIKEN Systems and Structural Biology Center
GENIUS Ⅱ
A database of predicted coding regions in genome sequences and the corresponding protein 3D structures. It is based essentially on searches for homologous proteins using PSI-BLAST.
Protein 3000 Project Results Database
A database of the present status and the outcomes of the MEXT Protein 3000 Project.
Dictionary of protein, 3D structures
,
eProtS: Encyclopedia of Protein Structures
Biologically important proteins and their 3D structures are explained. The registered proteins are classified into groups.
Database for molecular surfaces of proteins' functional sites
,
eF-site
A database of calculated surface characteristics of protein functional domains. PDB structures were used as the material, and surface electrostatic potentials, hydrophobic characteristics, and Connolly surfaces were calculated and are contained in the database.
Knowledge model of transmembrane proteins
,
TMPDB
Experimentally verified transmembrane proteins have been collected from literature and are topologically classified.
Annotation and classification of protein 3D structure
,
SCOP - Structural Classification Of Proteins
A database that hierarchically classifies proteins of known 3D structure according to the evolutionary and structural relationships between their domains. The classification is done manually by specialists.
RPSD
,
Rice Protein Structure Database
A database of rice protein structures. The data consist of 3D structures derived from PDB and predicted structures from GTOP.
Dictionary of thermodynamic parameters of proteins
,
ProTherm
A database of thermodynamic data for wild-type and mutant proteins. It focuses mainly on thermodynamic parameters, and also records secondary structure, accessibility, experimental conditions, methods, and protein activity. The parameters included are Gibbs free energy, enthalpy change, heat capacity change, and transition temperature.
ProMode
Fluctuations of various proteins have been calculated by normal mode analysis. The results are shown as animations.
Data bank of protein structures in Japan
,
PDBj (Protein Data Bank Japan)
The Japanese node of the protein 3D structure databank. It manages wwPDB in cooperation with RCSB in the U.S. and MSD-EBI in Europe.
PDBSTR
A reconstruction of PDB for KEGG.
Data bank for protein 3D structures
,
PDB
A databank of protein 3D structures determined mainly by X-ray crystallography or NMR.
OLIGAMI - OLIGomer Architecture and Molecular Interface
,
Quaternary Structural Classification of Proteins
OLIGAMI is a database of the verified coordinates (curated entries) and new chain formulas for biological molecules that allows you to browse oligomers through the SCOP hierarchy and to interactively view three-dimensional structures of biological molecules for all PDB entries.
Annotations of 3D biopolymer structures
,
IMB Jena Image Library
A database that reorganizes PDB and NDB data by molecule type, characteristic partial structures (SITE), and hetero compounds.
Het-PDB Navi
3D structures of low molecular weight molecules that appear in PDB entries are collected and classified by molecule type.
Gallery of Biomolecules
GTOP
,
Prediction of protein structures
A database of all the predicted or confirmed ORFs on known genomes, as well as prediction of 3D structures and the functions of the encoded proteins. The database contains 3D structure prediction (by Reverse PSI-BLAST), analyses of protein functions (by BLAST), motif analyses (by PROSITE), gene family classification (by Pfam), prediction of transmembrane regions (by SOSUI), prediction of coiled-coil regions (by Multicoil), and analyses of repetitive sequences (by RepAlign).
ConfC
CSDBase - Cold Shock Domain database
Cold shock domain-containing proteins
Conserved Domains and Protein Classification
Curated alignments of protein domains from Pfam, SMART and COG databases
CASP8
,
Critical Assessment of Techniques for Protein Structure Prediction
Results of the recent Critical Assessment of Techniques for Protein Structure Prediction, CASP8, present several valuable sources of information. First, CASP targets comprise a realistic sample of currently solved protein structures and exemplify the corresponding challenges for predictors. Second, the plethora of predictions by all possible methods provides an unusually rich material for evolutionary analysis of target proteins. Third, CASP results show the current state of the field and highlight specific problems in both predicting and assessing. Finally, these data can serve as grounds to develop and analyze methods for assessing prediction quality. Here we present results of our analysis in these areas. Our objective is not to duplicate CASP assessment, but to use our unique experience as former CASP5 assessors and CASP8 predictors to (i) offer more insights into CASP targets and predictions based on expert analysis, including invaluable analysis prior to target structure release; and (ii) develop an assessment methodology tailored towards current challenges in the field. Specifically, we discuss preparing target structures for assessment, parsing protein domains, balancing evaluations based on domains and on whole chains, dividing targets into categories and developing new evaluation scores. We also present evolutionary analysis of the most interesting and challenging targets.
BMRB - BioMagResBank
,
Biological Magnetic Resonance Data Bank
The database accepts depositions from researchers of 3D structural data of biological macromolecules determined by NMR. It was built in cooperation with PDB and NDB, and data are deposited in these databanks via BMRB.
Predictions of G-protein coupled receptor genes
,
SEVENS database
A database of predicted genes encoding G-protein coupled receptors (proteins with seven transmembrane helices) from known genome sequences.
3DinSight
Several protein databases are linked so that related entries can be traced from one database to another.
Localization
Distribution pattern of antigens in embryonic worm
,
The Sugimoto Lab C. elegans Monoclonal Antibody Collection
Immunostaining images showing the stage specificity and localization of proteins in worm embryos are registered. The antibodies are available for distribution.
Amino Acid Characteristic
DB-SPIRE
CSA - Catalytic Site Atlas
Enzyme active sites and catalytic residues in enzymes of known 3D structure
Conserved Domains and Protein Classification
Curated alignments of protein domains from Pfam, SMART and COG databases
AAindex
,
Dictionary of physicochemical and biological properties of amino acids
A database of physicochemical and biological indices of the 20 amino acids and of the similarity matrices between them.
Antibody
Dictionary of carbohydrate antigens and antibodies
,
Sugar chain database (GlycoEpitope)
A dictionary of carbohydrate antigens and the antibodies that recognize them. Information on the antigens includes sugar chains, antibodies that recognize them, glycoproteins carrying the antigens, glycolipids containing them as a building block, and the enzymes participating in their biosynthesis and degradation. Information on the antibodies includes the antibodies themselves, the sugar chain sequences that they recognize, examples of immunoprecipitation, immunoblotting, and histochemistry experiments using them, and places to obtain them.
Annotations of proteins related to the immune system
,
IMGT/LIGM-DB, the IMGT comprehensive database of immunoglobu
A database of immunoglobulins (IG), T cell receptors (TR), major histocompatibility complexes (MHC), and other proteins related to immunity (RPI) of human and other vertebrates. Its datasets include IMGT/LIGM-DB (IG and TR sequences), IMGT/PRIMER-DB (primers for IG and TR), IMGT/GENE-DB (human and mouse immunity-related genes organized on the basis of the genomes), and IMGT/3Dstructure-DB (immunity-related genes with known 3D structures).
Function
Structure
,
three-dimensional macromolecular structures
The resources developed by the Structure Group of the NCBI Computational Biology Branch (CBB) are freely available to the public and focus on four areas:
CDD
,
conserved protein domain database
Conserved domains are functional units within a protein that have been used as building blocks in molecular evolution and recombined in various arrangements to make proteins with different functions.
Immunostaining images of whole mouse sections by all matrix proteins
,
mouse basement membrane bodymap
Immunostaining images of whole mouse sections for all matrix proteins. Good data for matrix researchers and tissue engineers. Poly/monoclonal antibodies against each of 44 matrix proteins. Immunohistochemical images of E16.5 whole-body embryo sections. List of target proteins: http://www.matrixome.com/bm/EnterBodymap/Protein/protein.asp
Dictionary of Wnt proteins
,
Wnt Database
A portal site for information on Wnt proteins (highly conserved proteins that control intercellular interactions during embryogenesis). The database format differs by organism and topic. Signal pathway information (images) is included.
Dictionary of restriction enzymes
,
REBASE
A database of restriction enzymes, DNA methyltransferases, and proteins related to modification of restriction sites. It contains enzyme sequences, 3D structures, sources, recognition/cutting sequences, similar enzymes, methylation susceptibility, and commercial suppliers. The database is accessible on the web and is downloadable.
Protein Clusters
,
a collection of related protein sequences
This collection of related protein sequences (clusters) consists of Reference Sequence proteins encoded by complete genomes. This database contains both curated and non-curated clusters.
Dictionary of protease and protease inhibitors
,
Prolysis
A database of proteases and protease inhibitors. It contains the biochemical properties and 3D structures of the proteases, and the structures and characteristics of the inhibitors. It provides a tool to search for protease cleavage sites in input amino acid sequences.
Functional annotations of peptidases and peptidase inhibitors
,
MEROPS: the peptidase database.
A database of peptidases and their protein inhibitors. Peptidases and inhibitors are classified hierarchically: homologous proteins are grouped into families, and related families are grouped into clans. Low molecular weight inhibitors are also included.
Database of G-protein coupled receptor ligands
,
GPCRDB
A database that gathers information on G-protein coupled receptors from public databases, supplements it with analyses, and organizes it. It contains sequences, mutation positions, 3D structures (PDB-derived data and predictive models), ligand binding constants, and multiple alignments and phylogenetic trees for each family.
DB-SPIRE
Conserved Domains and Protein Classification
Curated alignments of protein domains from Pfam, SMART and COG databases
BRENDA
,
Dictionary of enzymes
A database of enzyme information manually extracted from the literature. The information is organized by EC number and includes structures, sequences, functions, reaction characteristics, isolation methods, stability, source organisms/tissues/localization, and relations to diseases.
AMINOACYL-tRNA SYNTHETASES DATABASE
,
Database for aminoacyl-tRNA synthetases
A database of known aminoacyl-tRNA synthetases (enzymes believed to have appeared in the early stages of evolution). They are classified by species and by corresponding amino acid.
A Database of Enzyme Catalytic Mechanisms (EzCatDB)
,
Knowledge model of enzyme catalytic mechanisms
Classification of the enzymes registered in PDB and SWISS-PROT according to the domain structures, EC numbers, catalytic mechanisms, and ligand structures. The information sources are literature and PDB entry descriptions.
Motif, domain
CDD
,
conserved protein domain database
Conserved domains are functional units within a protein that have been used as building blocks in molecular evolution and recombined in various arrangements to make proteins with different functions.
Annotation of protein domain models
,
SMART
A database of manually constructed protein domain models. Domain families related to signal transduction, extracellular proteins, and chromatin are registered. The database contains the phylogenetic distribution, functional classes, 3D structures, and functionally important residues of the domains. Normal SMART is built upon SWISS-PROT, TrEMBL, and Ensembl proteomes (stable version). GenomicSMART is built upon proteins derived from completely sequenced genomes.
Annotation of protein motifs
,
Pfam
A database of shared protein domains, multiple alignments constructed from protein families, and HMMs for them. Pfam-A is a curated data set and Pfam-B is automatically constructed from ProDom families.
Annotation of O- and C-glycosylated proteins
,
O-GLYCBASE
A database that collected and organized information from literature and public databases on glycoproteins with at least one experimentally verified O-linked glycosylation site. Each entry describes sequences, glycan types, O-linked Ser/Thr positions, N-linked Asn positions, C-linked Trp positions, source organisms, and citation information.
Annotations of protein characteristics and function
,
InterPro
A database that contains information on protein families, domains, and functional sites that was collected from several sources and integrated by specialists. Amino acid sequences are collected separately as InterProtKB.
DB-SPIRE
CASP8
,
Critical Assessment of Techniques for Protein Structure Prediction
Results of the recent Critical Assessment of Techniques for Protein Structure Prediction, CASP8, present several valuable sources of information. First, CASP targets comprise a realistic sample of currently solved protein structures and exemplify the corresponding challenges for predictors. Second, the plethora of predictions by all possible methods provides an unusually rich material for evolutionary analysis of target proteins. Third, CASP results show the current state of the field and highlight specific problems in both predicting and assessing. Finally, these data can serve as grounds to develop and analyze methods for assessing prediction quality. Here we present results of our analysis in these areas. Our objective is not to duplicate CASP assessment, but to use our unique experience as former CASP5 assessors and CASP8 predictors to (i) offer more insights into CASP targets and predictions based on expert analysis, including invaluable analysis prior to target structure release; and (ii) develop an assessment methodology tailored towards current challenges in the field. Specifically, we discuss preparing target structures for assessment, parsing protein domains, balancing evaluations based on domains and on whole chains, dividing targets into categories and developing new evaluation scores. We also present evolutionary analysis of the most interesting and challenging targets.
Blocks
,
Highly conserved regions of proteins
A database of ungapped alignments of highly conserved regions for each known protein family.
Mutation
Overview of protein mutant data (compiled from the literature)
,
PMD
A literature database of protein mutations. Literature information includes authors, journals and pages, abstracts, organism species, N-terminal sequences, mutation locations and patterns, and links to Swiss-Prot, PIR, and PDB.
PDDB
,
the Prion Disease Database
Prion diseases reflect conformational conversion of benign isoforms of prion protein (PrPC) to malignant PrPSc isoforms. Networks perturbed by PrPSc accumulation and their ties to pathological events are poorly understood. Time-course transcriptomic and phenotypic data in animal models are critical for understanding prion-perturbed networks in systems biology studies. Here, we present the Prion Disease Database (PDDB), the most comprehensive data resource on mouse prion diseases to date. The PDDB contains: (i) time-course mRNA measurements spanning the interval from prion inoculation through appearance of clinical signs in eight mouse strain-prion strain combinations and (ii) histoblots showing temporal PrPSc accumulation patterns in brains from each mouse–prion combination. To facilitate prion research, the PDDB also provides a suite of analytical tools for reconstructing dynamic networks via integration of temporal mRNA and interaction data and for analyzing these networks to generate hypotheses.
Androgen Receptor Gene Mutations Database
Mutations in the androgen receptor gene
Proteome
Proteins found in human salivary intercalated duct cell line
,
Two-Dimensional Electrophoresis Database of HSG cells proteins
2-dimensional gel electrophoresis of the proteins in HSG (Human salivary intercalated duct cell line, a cell line established with radiation on human salivary glands) is registered. Peptides have been identified from each spot using MALDI-TOF or peptide sequencers.
Characteristics of endogenous peptides
,
PEPTIDOME
Peptides in living organisms were comprehensively separated and identified, and have been organized by charge, hydrophobicity, and molecular weight.
Solubility database of all E.coli proteins
,
eSOL
eSOL is a database of the solubility of the entire ensemble of E. coli proteins, each synthesized individually with the chaperone-free PURE system.
Database of rice proteome
,
Rice Proteome Database
A database of 2D gel electrophoresis spots from rice tissues and organelles. Protocols for proteomic analyses are also registered.
MS/MS proteomic experiments
,
Peptidome
Peptidome is a public repository that archives and freely distributes tandem mass spectrometry peptide and protein identification data generated by the scientific community. Several layers of data are captured to promote understanding of the experiment and analysis of the underlying data.
ExPASy-SWISS 2D PAGE database
The method of data collection is unclear. The data contain the materials, spot IDs, spot protein names, and gel images. Searching by protein name retrieves the 2D gels on which the protein was identified. Searching by material name retrieves the 2D-PAGE images and spot lists for that material. The database resembles a collection of 2D gel figures taken from articles and is useful for generating ideas. It is slow, so the Korean mirror site may be a bit faster.
Sequence
Protein
,
Protein sequence database
The protein entries in the Entrez search and retrieval system have been compiled from a variety of sources, including SwissProt, PIR, PRF, PDB, and translations from annotated coding regions in GenBank and RefSeq.
SALAD Database
Protein Clusters
,
a collection of related protein sequences
This collection of related protein sequences (clusters) consists of Reference Sequence proteins encoded by complete genomes. This database contains both curated and non-curated clusters.
Database for protein domain families
,
ProDom
A database of domains extracted from algorithmically classified known amino acid sequences. Family classification was based on SWISS-PROT and TrEMBL and performed with PSI-BLAST. Domains were identified through multiple alignments within each family. The database contains ProDom, based on all the amino acid sequences, and ProDom-CG, based only on amino acid sequences derived from completed genomes.
A dictionary of protein sequences and related literature
,
PRF
PRF/LITDB (protein literature database), PRF/SEQDB (amino acid sequence database), PRF/SYNDB (synthetic chemical compounds database) are registered. All the data are obtained from literature.
PIR-PSD
,
Resource of protein sequence information
The database functioned as a databank collecting amino acid sequences until 2004. At present, it provides protein annotation resources and tools as a part of UniProt. PIRSF is a database that categorizes full-length protein sequences into classes from an evolutionary point of view. iProClass is a database that provides UniProtKB entries with cross-references to PDB, COG, Pfam, GenBank, GEO, OMIM, PubMed, GO, DIP, Swiss-2DPAGE, and KEGG.
Automatic annotation of Bacillus subtilis (soil bacterium) genome
,
NRSub
A database of Bacillus subtilis genome on which entries of SWISS-PROT, ENZYME, and HOBACGEN are mapped and cross-referenced.
Microbial Genome Workbench
Publicly available genome sequences of bacteria and archaea were collected and made searchable by homology, keyword, protein molecular weight, and pI.
MIPS
Databases at Munich Information Center for Protein Sequences
Analysis results of hepatitis virus gene, lineage
,
Hepatitis Virus Database
Phylogenetic analyses of hepatitis virus (types B, C, E) genes are registered. They are algorithmic analyses of hepatitis virus sequences from DDBJ INSD entries. The database is updated after major updates.
HIV Infectious Disease Integrated Database
The virus sequences account for only a small part of INSD sequences, but clinical information for about 600 virus hosts appears to have been added. How the database is meant to be used is unclear. Registration is required to view the clinical information.
GENES
,
Gene sequences of known genomes
Genes in known genomes are collected from public data and assigned ID numbers so that they can be used in an integrated way with other KEGG systems.
Cytokine Family cDNA Database (dbCFC)
,
Genes, and proteins of cytokines
A database and portal site that collects information on cytokine genes, cDNAs, and proteins from public databases and arranges it by family and by gene. All detailed information is provided as links to the original databases.
CluSTr - Clusters of Swiss-Prot and TrEMBL proteins
Automatic classification of SWISS-PROT+TrEMBL proteins
COG - Clusters of Orthologous Groups of proteins
,
Orthologous groups of proteins
A database that classifies the proteins of prokaryotes/eukaryotes with sequenced genomes into orthologous groups. The groups are organized by functional categories (GO annotations). A database that classifies predicted orthologs in eukaryote genomes in the same way is available separately as KOG.
Conserved Domains and Protein Classification
Curated alignments of protein domains from Pfam, SMART and COG databases
Annotation
Genotype and Phenotype
,
dbGaP
The database of Genotypes and Phenotypes (dbGaP) was developed to archive and distribute the results of studies that have investigated the interaction of genotype and phenotype. Such studies include genome-wide association studies, medical sequencing, molecular diagnostic assays, as well as association between genotype and non-clinical traits. The advent of high-throughput, cost-effective methods for genotyping and sequencing has provided powerful tools that allow for the generation of the massive amount of genotypic data required to make these analyses possible.
Cassava Full-Length cDNA Database
,
Cassava full-length cDNA annotation
Enables cross-searching of external databases related to the cassava cDNAs collected at RIKEN. Keyword searches and BLAST searches are supported.
RIKEN Hub Database Project
fRNA Database
Database for worm, genome/annotation
,
WormBase
A database of biological information of C.elegans and other worms. It contains genomes (structures, functions, genetic polymorphisms, comparative genomic studies), genes (structures, expressions, phenotypes, and RNAi), lineages (strains, genetics, and markers), and literature information.
Automatic annotation of genomes
,
The UCSC Genome Browser Database: update 2006.
A database of automated annotations for vertebrates and major model organisms that have public genome sequences. Annotations of each type (the many types include markers, BAC end map positions, RefSeq entries, GeneScan results, and mapped EST locations) are managed as separate tracks. Proteomes (Proteome Browser) and in situ images (VisiGene) are managed separately.
Database for mouse genome/annotation
,
The Mouse Genome Informatics (MGI)
An integrated database of mouse genome research at the Jackson Laboratory. MGD (sequences, gene definitions, mapping, phenotypes, mutants, strains, comparative studies with other mammals), GXD (gene expression: collected from literature or submitted by researchers), and MTB (information on mouse models that develop tumors) are integrated.
The Homeodomain Resource
,
a comprehensive collection of sequence, structure, interaction, genomic and functional information on the homeodomain protein family
The Homeodomain Resource is a curated collection of sequence, structure, interaction, genomic and functional information on the homeodomain family. The current version builds upon previous versions by the addition of new, complete sets of homeodomain sequences from fully sequenced genomes, the expansion of existing curated homeodomain information and the improvement of data accessibility through better search tools and more complete data integration. This release contains 1534 full-length homeodomain-containing sequences, 93 experimentally derived homeodomain structures, 101 homeodomain protein–protein interactions, 107 homeodomain DNA-binding sites and 206 homeodomain proteins implicated in human genetic disorders.
Database for Arabidopsis genome/annotation
,
The Arabidopsis Information Resource (TAIR)
A database of Arabidopsis genome, genes, and molecular biological data. It contains annotated genome, gene products, metabolism, gene expression, markers, strain resource information, and literature information.
SubtiWiki
,
the Bacillus subtilis centred wiki SubtiWiki
Bacillus subtilis is the model organism for Gram-positive bacteria, with a large amount of publications on all aspects of its biology. To facilitate genome annotation and the collection of comprehensive information on B. subtilis, we created SubtiWiki as a community-oriented annotation tool for information retrieval and continuous maintenance. The wiki is focused on the needs and requirements of scientists doing experimental work. This has implications for the design of the interface and for the layout of the individual pages. The pages can be accessed primarily by the gene designations. All pages have a similar flexible structure and provide links to related gene pages in SubtiWiki or to information in the World Wide Web. Each page gives comprehensive information on the gene, the encoded protein or RNA as well as information related to the current investigation of the gene/protein. The wiki has been seeded with information from key publications and from the most relevant general and B. subtilis-specific databases. We think that SubtiWiki might serve as an example for other scientific wikis that are devoted to the genes and proteins of one organism.
Automatic annotation of Bacillus subtilis (soil bacterium) genome
,
SubtiList
A database of annotations of the Bacillus subtilis genome (gene locations, functional assignments, and links to references).
Genome of Streptomyces avermitilis (industrial microorganism)/annotation
,
Streptomyces avermitilis Genome Database
A database of genome sequences and annotations for Streptomyces avermitilis (a microorganism that produces avermectin, an anthelmintic). It contains physical maps, KEGG pathway analyses, protein families, secondary metabolic products, conserved genes, phylogenetic trees based on 16S rRNA, and instructions for requesting cosmid clones. Supplementary data for articles (PubMed: 11572948, 12692562).
Database for baker’s or budding yeast genome/annotation
,
SGD - Saccharomyces Genome Database
A gene-based database of the molecular biological and genetic data of budding yeasts. Most of the annotations rely on manual information extraction from literature. Genomic locations of the genes, GO annotations, sequences for nucleic acids and amino acids, phenotypes, and expression data are registered.
Database for yeast genome/annotation
,
S.pombe genome project
A database containing data from the fission yeast genome project at the Sanger Institute. It contains genome sequences and their annotation, GO annotations, clone libraries, and mapping resources (tiling path, gene map, physical map).
Automatically annotated rice genome
,
Rice Genome Automated Annotation System (Rice GAAS)
A database of algorithmic annotations of the rice genome. The annotations cover gene prediction (GENSCAN, RiceHMM, FGENESH, and MZEF), splice site prediction (SplicePredictor), homology searches (BLAST, HMMer, ProfileScan, and MOTIF), repetitive sequence searches (RepeatMasker, Printrepeats), signal sequence searches (SignalScan), protein localization signal prediction (PSORT), and transmembrane protein secondary structure prediction (SOSUI).
Database of rice, genome/annotation
,
Rice Annotation Database (RAD)
A database of rice genes (including predicted genes) and transcripts that are mapped on rice PAC/BAC contigs. The database contains summary tables of splice site patterns, amino acid compositions, codon usage, gene lengths, GO annotations, and MIPS functional classifications.
Database for rat genome/annotation
,
Rat Genome Database (RGD)
A rat genome and gene information database. Maps (gene and RH), genes, QTLs, SSLPs, ESTs/cDNAs, strains, and sequences are registered. A separate user interface supports disease-oriented comparisons among rat, mouse, and human.
Database for rice genome/annotation
,
Rice Annotation Project DataBase (RAP-DB)
A database of the rice genome onto which genes (including predictions), transcripts (including ESTs from several plants other than rice), BACs, and mutant information are mapped.
Dictionary of E.coli gene
,
PEC
Database for maize genome/annotation
,
MaizeGDB
A database of maize genomes and resources. It contains genome sequences, preserved strains and their phenotypes, mutant strains, and genetic maps.
A portal site for the MEXT “Integrated Database Project”
,
LSDB
A database of the outcomes of the MEXT “Integrated Database Project”. The project consists of the development and management of the integrated database; linking to literature information and annotating data; integration of medical science databases in the fields of chemical compounds, drugs, clinical data, and diseases at the task-sharing institutes (Tokyo, TMD, Kyoto); acceleration of database integration by the institutes handling supplementary tasks (RIKEN, AIST, NIG, KyuTech); hosting of useful databases that are hard to maintain; and human resource development for database development.
KazusaMart
KazusaAnnotation
Kaiko Genome Automated Annotation System (KAIKO GAAS)
A database of predicted genes and functional regions on silkworm genomes (BAC and WGS assemblies). The annotations were generated algorithmically using GENSCAN, FGENESH, MZEF, SplicePredictor, BLAST, HMMer, ProfileScan, MOTIF, tRNAscan-SE, PSORT, and SOSUI.
KATANA
A collection of Arabidopsis gene annotations gathered from various databases and summarized in searchable, referable formats.
HAL
A database of human genes that have been discovered using original algorithms. The genes seem to have been defined from integration of predicted genes in various databases and various prediction algorithms. There are datasets for human, chimpanzee, mouse, rat, dog, and chicken.
An integrated database of human genes and annotation
,
H-Invitational Database
A database resulting from an annotation jamboree in which the computational clustering, classification, and overlap analyses of genes with cDNA sequences in INSD were partly judged manually. It issues its own gene IDs. The functional information is based on Entrez Gene, because NCBI OMIM and Entrez are used. A substantial part of the full-length cDNAs was made public by a METI/NEDO cDNA project.
Drosophila genome/annotation
,
GadFly
A site that summarizes various data produced in the Drosophila genome project: (1) genome sequences and annotation; (2) gene expression patterns analyzed with in situ hybridization, validated with microarrays and manually annotated using a controlled vocabulary; (3) EST and full-length cDNA sequences; (4) transposon sequences; (5) gene disruption strains carrying a single P transposable element; (6) comparative genomic analysis of Drosophila; and (7) an SNP map.
Annotation of full-length mouse cDNA
,
FANTOM
A database of annotations of mouse full-length cDNA clone sequences. CAGE tag and GSC ditag information is also used to identify transcription start sites.
Automatically annotated genomes
,
Ensembl
A database of algorithmic annotations of eukaryotes (especially vertebrates) whose genome sequences are in the public domain. An annotation platform is also provided, and many other groups use it to build their own genome databases.
Gene expression of imprinted genes in mouse
,
EICO DB
A database of candidate imprinted genes in the mouse and their gene expression as confirmed with microarrays. SNPs in the genes and the corresponding human genome regions are also included. It is intended for the discovery of new imprinted genes.
Dr. Zompo
,
Zostera marina and Posidonia oceanica ESTs
As ecosystem engineers, seagrasses are angiosperms of paramount ecological importance in shallow shoreline habitats around the globe. Furthermore, the ancestors of independent seagrass lineages have secondarily returned into the sea in separate, independent evolutionary events. Thus, understanding the molecular adaptation of this clade not only makes significant contributions to the field of ecology, but also to principles of parallel evolution as well. With the use of Dr. Zompo, the first interactive seagrass sequence database presented here, new insights into the molecular adaptation of marine environments can be inferred. The database is based on a total of 14 597 ESTs obtained from two seagrass species, Zostera marina and Posidonia oceanica, which have been processed, assembled and comprehensively annotated. Dr. Zompo provides experimentalists with a broad foundation to build experiments and consider challenges associated with the investigation of this class of non-domesticated monocotyledon systems. Our database, based on the Ruby on Rails framework, is rich in features including the retrieval of experimentally determined heat-responsive transcripts, mining for molecular markers (SSRs and SNPs), and weighted key word searches that allow access to annotation gathered on several levels including Pfam domains, GeneOntology and KEGG pathways. Well established plant genome sites such as The Arabidopsis Information Resource (TAIR) and the Rice Genome Annotation Project are interfaced by Dr. Zompo. With this project, we have initialized a valuable resource for plant biologists in general and the seagrass community in particular. The database is expected to grow together with more data to come in the near future, particularly with the recent initiation of the Zostera genome sequencing project.
DeinoBase
DBH2H
,
head-to-head (h2h) gene pairs
DBH2H collects head-to-head (h2h) gene pairs identified from human, mouse, rat, chicken and fugu genomes, and distinguishes the ortholog mapping relationship among them. The gene pairs in DBH2H are annotated with sequential features including single nucleotide polymorphisms, CpG islands and transcription factor binding sites, as well as functional terms and genetic disorders. In addition, the expression correlation information based on 117 microarray datasets is included. By providing user-friendly access to these data, DBH2H represents a valuable resource for further analyses of this important gene arrangement in terms of transcriptional regulation mechanisms, evolutionary conservation, disease relevance, etc.
CYORF (Cyanobacteria Gene Annotation Database)
,
Workbench for Cyanobacteria gene annotation
A workbench with which the Cyanobacteria research community annotates genes. General users can search, browse, and download the database.
ArkDB
Genome databases for farm and other animals
Cell
JCRB Cellbank
,
Japanese Collection of Research Bioresources (JCRB)
JCRB consists of a gene bank (http://genebank.nibio.go.jp/), a cell bank (http://cellbank.nibio.go.jo/cellbank.html), and an experimental animal research resource bank (http://animal.nibio.go.jp/). The gene bank develops, conserves, and distributes research resources that form the basis of disease and drug discovery research; the cell bank accepts deposits of, conserves, and distributes cell lines; the experimental animal research resource bank provides services for the distribution and nutrition of mice, including the production of frozen embryos and sperm, as well as fertilization and development. These resources are distributed for a fee in Japan and abroad from the JCRB master bank via the Health Science Research Resources Bank (HSRRB) (http://www.jhsf.or.jp/index_b.html).
CellMontage
CellMontage is a system for searching gene expression databases for cells or tissues similar to the query gene expression profile.
Cell Line Catalog
A catalogue of the cancer cell lines distributed by the cancer cell bank.
CELLPEDIA
,
Human Cell Database
CELLPEDIA is a database of human cells that exhaustively collects various types of information on cells. Each entry combines a systematic, ontology-based cell classification and morphological information with related gene expression data and journal information on cell (trans-)differentiation, providing the information required for cell studies in one place.
Organelle
Organellome Database
Organelles Movie Database
Functional Analysis Database
macrophage
Macrophage Curated Database
Comparative Genomics
popset
,
population study data set
The PopSet database contains aligned sequences submitted as a set resulting from a population, phylogenetic, or mutation study. These alignments describe such events as evolution and population variation. The PopSet database contains both nucleotide and protein sequence data.
HomoloGene
,
eukaryotic homology groups
HomoloGene is a system for automated detection of homologs among the annotated genes of several completely sequenced eukaryotic genomes.
Amino acid sequence comparisons
,
SSDB (Sequence Similarity Database)
A database of comparative analyses of amino acid sequences of protein coding genes on known genomes.
MBGD
,
Orthologs/homologs of microbial genomes
A database of complete microbial genomes and the orthologous/homologous relations among their genes.
DeinoBase
Database of Genomic Variants
The objective of the Database of Genomic Variants is to provide a comprehensive summary of structural variation in the human genome. We define structural variation as genomic alterations that involve segments of DNA that are larger than 1 kb. For the purpose of this database, we focus on variants that are not directly correlated with specific phenotypes. The Database of Genomic Variants provides a useful catalog of control data for studies aiming to correlate genomic variation with phenotypic data. The database is continuously updated with new data from peer reviewed research studies. We always welcome suggestions and comments regarding the database from the research community.
DBH2H
,
head-to-head (h2h) gene pairs
DBH2H collects head-to-head (h2h) gene pairs identified from human, mouse, rat, chicken and fugu genomes, and distinguishes the ortholog mapping relationship among them. The gene pairs in DBH2H are annotated with sequential features including single nucleotide polymorphisms, CpG islands and transcription factor binding sites, as well as functional terms and genetic disorders. In addition, the expression correlation information based on 117 microarray datasets is included. By providing user-friendly access to these data, DBH2H represents a valuable resource for further analyses of this important gene arrangement in terms of transcriptional regulation mechanisms, evolutionary conservation, disease relevance, etc.
CropNet
Genome mapping in crop plants
BodyMap-Xs
,
Gene expression organized and compared across species
The database enables inter-species comparison of gene expression patterns based on orthologous relations between organs and between genes across species. The cDNA libraries are broken down by source organ using original taggers that automatically analyze the material descriptions in the latest dbEST (an EST division of DDBJ), while the genes are classified based on UniGene. Genes are linked across species using InParanoid ortholog relations.
Compound
Standard spectrum of metabolites
,
Standard Spectrum Search
Reference NMR and MS spectra of metabolic products (chemical compounds) have been analyzed. The data are available only via the search interface. The relation to the NMR reference spectra is not described.
Bioactivity screens of chemical substances
,
PubChem Bio Assay
The PubChem BioAssay Database contains bioactivity screens of chemical substances described in PubChem Substance. It provides searchable descriptions of each bioassay, including descriptions of the conditions and readouts specific to that screening procedure.
PharmGKB
Variation in drug response based on human variation
Dictionary of standard NMR spectrum
,
PRIMe: SpinAssign
Reference NMR chemical shifts of metabolic products (chemical compounds) have been collected. Details of the contributing databases are unknown, but ExPASy Biochemical Pathways, BMRB (BioMagResBank, www.bmrb.wisc.edu), SDBS (a spectrum database for organic chemical compounds, www.aist.go.jp/RIODB/SDBS/cgi-bin/cre_index.cgi), and NMRShiftDB (http://www.nmrshiftdb.org/) seem to be used. Articles describing use of the database for plants include 17035691.
LigandBox
The database contains 3D images of all ligands in KEGG DRUG. All images were created with the molecular simulation system myPresto.
KNApSAcK
Dictionary of drug transporters
,
Drug transporter DB
A database of information on drug transporters. It contains transporters, tissues expressing the genes, targeted chemical compounds, drug-drug interactions, knockout mice/rats for the genes, pathophysiology for the genes, genetic polymorphisms, and relevant inherited diseases.
ChEBI - Chemical Entities of Biological Interest
Small molecules, atoms, ions and radicals of biological interest
3DMET: a database collecting three-dimensional structures of natural metabolites.
Medicine
SIDER Side Effect Resource
PubChem Substance
,
deposited chemical substance records
The PubChem Substances Database contains descriptions of chemical samples, from a variety of sources, and links to PubMed citations, protein 3D structures, and biological screening results that are available in PubChem BioAssay. If the contents of a chemical sample are known, the description includes links to PubChem Compound.
PharmGKB
Variation in drug response based on human variation
Dictionary of drug transporters
,
Drug transporter DB
A database of information on drug transporters. It contains transporters, tissues expressing the genes, targeted chemical compounds, drug-drug interactions, knockout mice/rats for the genes, pathophysiology for the genes, genetic polymorphisms, and relevant inherited diseases.
Metabolic Product
Standard spectrum of metabolites
,
Standard Spectrum Search
Reference NMR and MS spectra of metabolic products (chemical compounds) have been analyzed. The data are available only via the search interface. The relation to the NMR reference spectra is not described.
PubChem Substance
,
deposited chemical substance records
The PubChem Substances Database contains descriptions of chemical samples, from a variety of sources, and links to PubMed citations, protein 3D structures, and biological screening results that are available in PubChem BioAssay. If the contents of a chemical sample are known, the description includes links to PubChem Compound.
Dictionary of standard NMR spectrum
,
PRIMe: SpinAssign
Reference NMR chemical shifts of metabolic products (chemical compounds) have been collected. Details of the contributing databases are unknown, but ExPASy Biochemical Pathways, BMRB (BioMagResBank, www.bmrb.wisc.edu), SDBS (a spectrum database for organic chemical compounds, www.aist.go.jp/RIODB/SDBS/cgi-bin/cre_index.cgi), and NMRShiftDB (http://www.nmrshiftdb.org/) seem to be used. Articles describing use of the database for plants include 17035691.
KNApSAcK
Form
WorTS
,
Worm TS mutant Database
Micrographs of the morphology of budding yeast mutants
,
SCMD - Saccharomyces cerevisiae Morphological Database
A database that classifies the morphology (budding states) of budding yeast mutants. Feature extraction from the micrographs and the classification were performed computationally.
Cross-sectional images of rat brain
,
Rat Brain Sections: Super-fine images
The database contains images of transverse and sagittal sections of rat brains. Viewpoint Media Player is required to display the images. Scrolling and zooming are possible with the mouse.
Phenome Analysis of Ds transposon-tagging line in Arabidopsis
,
Phenotypes of transposon-insertional mutants in Arabidopsis
A list of Arabidopsis Ds transposon insertion mutants with the mutated gene loci and the genomic locations of the insertions (also categorized as UTR, coding region, exon, or intron). The morphology of the mutants is classified (eight primary categories and 50 secondary categories), but the list of phenotypes cannot be obtained. Detailed information is displayed via MIPS sites.
MCTDB
,
Medaka Craniofacial Trait DataBase
C. elegans RNAi Phenome Database
,
Database of RNAi gene disrupted worms
Phenotypes of worm strains that have undergone RNAi gene disruption are comprehensively registered. Clusters of genes based on the phenotypes are also registered and are displayed as tree diagrams.
3D brain image database of humans, Japanese monkeys and Rhesus monkeys
,
Brain Atlas Database of Japanese Monkey for WWW
Three-dimensional brain images of humans, Japanese macaques, and rhesus macaques reconstructed from MRI.
ABA
,
Images of Ciona intestinalis (ascidian chordate) morphology in different developmental stages
Ciona intestinalis morphology database that registers images for developmental stages from the fertilized egg to the tadpole larva. It contains 3D reconstruction images in the mid-tailbud stage and cell lineage figures.
Gene Expression
Gene-oriented clusters of transcript sequences
,
UniGene
UniGene is an experimental system for automatically partitioning GenBank sequences into a non-redundant set of gene-oriented clusters. Each UniGene cluster contains sequences that represent a unique gene, as well as related information such as the tissue types in which the gene has been expressed and map location.
GEO profiles
,
expression and molecular abundance profiles
This database stores individual gene expression profiles from curated DataSets in the Gene Expression Omnibus (GEO) repository. Search for specific profiles of interest based on gene annotation or pre-computed profile characteristics. GEO Profiles facilitates powerful searching and linking to additional information sources.
GENSAT
,
gene expression atlas of mouse central nervous system
The GENSAT project aims to map the expression of genes in the central nervous system of the mouse, using both in situ hybridization and transgenic mouse techniques.
PRIMe: Correlated Gene Search
,
Tool that searches for correlated Arabidopsis genes
A service for searching pre-computed correlations in Arabidopsis GeneChip gene expression data. Genes correlated with the query genes are output with their correlation coefficients and annotations. The underlying correlation analysis is publicly available as ATTED-II (www.atted.bio.titech.ac.jp).
SAGE of human immune system cells
SAGE analyses of gene expression in cells of the human immune system.
Gene expression profile of mouse brain during postnatal development
Gene expression profiles were analyzed using Affymetrix GeneChip across postnatal developmental stages of the mouse cerebellum (2001) and during embryonic brain development in the mouse (2005). The 2001 and 2005 datasets are supplementary data for 15018818 and 15893506, respectively.
IINO lab. Germline index
,
Phenotypes of worm, gene function assessed by RNAi
Phenotypes resulting from RNAi-mediated inhibition of genes that are specifically expressed in the C. elegans germ line.
Genome Medicine Database of Japan (GeMDBJ)
,
Polymorphism, gene/protein expression of diseases
Analyses of genetic polymorphisms (SNP), gene expression (with GeneChip), and protein expression (2D-DIGE, LC-MS/MS) related to Alzheimer disease, gastric cancer, diabetes, hypertension, and asthma. Limited patient information (sex, age group, prefecture of residence, medical history, smoking history) is included. User registration is required to view some of the information.
ESTs of Lotus japonicus (model legume)
,
Lotus japonicus EST Index
Lotus japonicus EST sequences, 3' EST consensus sequences, and their annotations are registered. Supplementary data for 10819328.
EST analysis of Physcomitrella patens (moss) gene expression
,
Physcomitrella patens Full-Length cDNA Clone Database Search
A list of Physcomitrella patens subsp. patens full-length cDNA clones distributed by RIKEN. Full-length sequences for the clones are registered.
Database of Chimpanzee cDNAs (PRIGEN)
,
Full-length cDNAs of chimpanzee (Pan troglodytes verus)
Full-length cDNA libraries were constructed from chimpanzee brain, liver, testis, and epithelial tissues. 5' ESTs were then sequenced, and some of the clones were sequenced in full. Supplementary data for 12727913 and 15677748.
ESTs of Porphyra yezoensis (red alga)
,
Porphyra yezoensis EST Index
Porphyra yezoensis ESTs and BLAST annotations are registered. Supplementary data for 10907854 and Journal of Phycology 39, 923-930 (2003).
Arabidopsis Full-Length Clone Database Search
,
Catalogue of Arabidopsis, full-length cDNA
A list of Arabidopsis full-length cDNA clones distributed by RIKEN. Full-length sequences for the clones are registered.
Chlamydomonas reinhardtii EST index
,
ESTs of Chlamydomonas (single-celled green alga)
Chlamydomonas ESTs, the contig sequences, and BLASTX annotations of the contigs are registered. Supplementary data for 11089912 and Phycologia 43, 722-726 (2004).
Cluster cutting tool for gene expression data
,
PRIMe: Cluster Cutting
A tool to extract gene clusters containing specified genes from clustered GeneChip gene expression data. The results are displayed graphically with Java, and a part of the tree can be extracted further by selecting it on the tree. At present, Affymetrix GeneChip gene expression data for Arabidopsis generated at RIKEN and Max Planck are available on the server.
Database for Macaca fascicularis (cynomolgus monkey) full-length cDNA
,
QFbase - Macaca fascicularis cDNA database
EST and full-length sequences of Macaca fascicularis cDNA clones and their homology to human sequences are registered. The full-length cDNAs are described in 17194215.
Gene expression database of Ciona intestinalis (sea squirt)
Ciona intestinalis gene expression analyses with ESTs and in situ hybridization are registered. Sixteen types of cDNA libraries for tissues and developmental stages have been constructed. EST clustering results and the results of mapping to the genome (obtained from JGI) are registered. In situ hybridization images can be searched by expression location and stage.
Full-length cDNA of silkworm
,
Insect Genome Databases, IGB lab., Univ. of Tokyo
EST sequences of the silkworm full-length cDNA clones.
Database of yeast gene expression
,
yMGV - Yeast microarray global viewer
A database of microarray analyses of gene expression in budding and fission yeasts.
The mi-R ontology database
,
miRò
miRò is a web-based knowledge base that provides users with miRNA–phenotype associations in humans. It integrates data from various online sources, such as databases of miRNAs, ontologies, diseases and targets, into a unified database equipped with an intuitive and flexible query interface and data mining facilities. The main goal of miRò is the establishment of a knowledge base which allows non-trivial analysis through sophisticated mining techniques and the introduction of a new layer of associations between genes and phenotypes inferred based on miRNAs annotations. Furthermore, a specificity function applied to validated data highlights the most significant associations.
Data bank of ESTs in the United States
,
dbEST
A dataset collected as the EST division of GenBank.
XDB
,
Xenopus laevis (frog) gene expression
The database contains Xenopus laevis ESTs, their assemblies, and WISH images. The assembly sequences are annotated with InterProScan and with BLAST searches against NCBI-NR, TIGR-XGI, and the Xenopus protein database of NIH. WISH images have been taken from each direction at different developmental stages.
Transcriptome analysis of two-component systems in Escherichia coli
ESTs of Nicotiana tabacum cell line (BY-2)
,
Transcription Analysis of BY-2
cDNA libraries of the Nicotiana tabacum-derived cell line BY-2 were constructed and the ESTs were sequenced. BLASTX annotations for each EST and the clustering (with BLASTN searches between ESTs) are registered.
Database of disrupted mouse genes by gene trap methods
,
The NAISTrap database
Mutant mouse ES cell lines were produced by random gene disruption using a new gene trap method (UPATrap). Partial sequences of the trapped genes and the homology search results are registered.
EST analysis of silkworm gene expression
,
SilkBase
Silkworm ESTs and the cDNA library information. Annotation with BLASTX is registered.
Automatically clustered human ESTs
,
STACK
A database that categorizes human ESTs and mRNAs by expressed tissue, developmental stage, and related disease, and clusters the sequences in each category using an original method. Descriptions of splicing variants are included. The system used to build the database is also available.
Micrographs of the morphology of budding yeast mutants
,
SCMD - Saccharomyces cerevisiae Morphological Database
A database that classifies the morphology (budding states) of budding yeast mutants. Feature extraction from the micrographs and the classification were performed computationally.
Database of gene expression profiles in human tissues and organs
,
SBM Database(Systems Biology and Medicine Database)
RefExA registers gene expression in normal human tissues, normal cells, and various cancer cell lines. LSBM GeNet registers gene expression analyses of cells in various pathological states and under drug administration. HUVEC DB registers gene expression changes in HUVEC after stimuli such as TNF-alpha. All gene expression data were obtained with GeneChip.
Microarray gene expression of rice
,
Rice Expression Database (RED)
Gene expression data analyzed with rice cDNA microarrays are registered. The database contains NIAS- and STAFF-derived data, as well as analyses from other projects using the same microarrays. The article for the database is Trends in Plant Science (2002) Dec 7 (12):563-564.
Rice Microarray Opening Site (RMOS)
ESTs of rice
,
RGP Rice cDNA Sequence Database
The database contains ESTs sequenced at NIAS and their clustering analyses.
Arabidopsis transposon mutant strains
,
RARGE [Ac/Ds Transposon Mutants]
A database of insertion positions and the adjacent genes of Arabidopsis transposon mutants. A part of RARGE.
Arabidopsis full-length cDNA
,
RAFL cDNAs
A database of Arabidopsis full-length cDNA sequences and the BLASTX annotation. A part of RARGE.
PrognoScan
PRIDE (PSC-RIKEN Database of EST/Gene Expression)
,
Zinnia elegans EST /microarray gene expression
Gene expression analyses of Zinnia elegans with ESTs and microarrays are registered. Each EST is shown with its BLASTX search results. Microarray analyses are accessible only via the GeNet system (this service may have been discontinued).
Overview of Arabidopsis transposon mutant strains
,
Plant Functional Genomics Research Group
A page describing the outline of a transposon mutation database
Full-length cDNA of pig
,
Pig EST Data Explorer (PEDE)
The database contains pig ESTs from full-length cDNA clones, the assembled contigs, the annotations, and the full-length sequences of selected clones. The database also functions as a resource bank that distributes the clones. The Pig cSNP Database, a database of SNPs identified during assembly, has also been created. It is not clear whether it contains ESTs or cDNA sequences that are registered only here, and if so, whether they should also have been registered in INSD.
Database for prostate gene expression
,
PEDB
A database of gene expression in human and mouse prostate analyzed with ESTs, microarrays, and protein mass spectrometry.
EST analysis of barley and seed images
,
NBRP-Barley
The database contains EST sequences from nine cDNA libraries of three strains and wild-type barley at different developmental stages and in different tissues. The EST data overlap with HarvEST. It contains a list of germplasms that can be distributed from Okayama University, some of which are accompanied by photos of the seeds and sprouts. The database also contains a list of representative strains (chosen in consideration of genetic diversity) called the Core Collection.
Differential gene expression profiles of mouse strains
,
Mouse DNA Microarray
Gene expression compared between the mouse strains C57BL/6J and 129X1SvJ in newborn brains, adult spleens, and adult livers. Agilent microarrays were used for the analysis. Supplementary data for 15029957.
Microarray analysis of Ciona intestinalis (ascidian), gene expression
,
Microarray analysis of embryonic retinoic acid target genes
Gene expression profiles of 9,287 candidate embryonic retinoic acid target genes, analyzed with microarrays made from the cDNA libraries used for the Ciona intestinalis EST analyses. Supplementary data for article 12828686. In addition, in situ hybridization images for 91 genes are shown.
ESTs of tomato
,
MiBASE
EST sequences from cDNA libraries of the fruits and leaves of Micro-Tom tomato are registered. The annotations include GO terms and homologous genes and clusters in the UniGene database, which contains ESTs from other projects. Gene expression in tissues, developmental stages, and breeds seems to have been analyzed using microarrays made from the above cDNA libraries, but no raw data can be obtained. Supplementary data for 15975739 and Plant Biotechnol. 22: 161-165 (2005).
Medaka EST database
,
Medaka gene expression
A database that contains Medaka ESTs, the library information, an explanation of mutation mapping system using ESTs, and the organization of a microarray (Medaka Microarray 8K).
EST of Halocynthia roretzi (ascidian)
,
MAGEST
A database of Halocynthia roretzi ESTs and their clustering.
MAEDA (Micro Array Expression DAta search)
,
Microarray analysis of Arabidopsis gene expression
Arabidopsis gene expression analyses using microarrays manufactured from 7,000 full-length cDNA clones. GEO accessions: GSE4203, GPL3181.
ESTs of Lentinus edodes (shiitake mushroom)
,
LeEST
Lentinus edodes cDNA libraries have been constructed, and the 5' ESTs are sequenced. They are registered with BLASTN search results.
Full-length cDNA clones from rice
,
Knowledge-Oriented Molecular Biological Encyclopedia (KOME)
Rice full-length cDNA sequences and the annotations (homologous genes, clustering analyses, InterPro motif searches, GO assignments) are registered. Supplement data for 12869764.
EST of silkworm
,
KAIKO cDNA
Silkworm ESTs are organized by cDNA libraries (strain, developmental stages, tissues, and sexes). BLASTX search results for each EST are registered.
Integrated Rice Genome Explorer (INE)
,
Rice genome/annotation
A database of rice genome on which gene maps, physical maps, PCR markers, ESTs, BAC/PAC contigs are mapped.
EST database-viewing software for crops
,
HarvEST
EST sequences and assemblies for barley, Brachypodium, citrus, coffee, cowpea, soybean, rice, and wheat. The database was constructed by the University of California, Riverside, but sequence data are accepted from cooperating institutes in the project (e.g., Okayama University provides barley ESTs). Genome sequences provided by Affymetrix, which were the source data for the genome chip, are also included for barley, wheat, rice, and soybean.
Drosophila Gal4 enhancer trap insertion lines
,
GETDB - Gal4 Enhancer Trap Insertion Database -
Analyses of insertion positions, gene expression patterns, and the phenotypes of Drosophila strains with inserted Gal4 enhancer traps. Resources can be distributed.
GEO Dataset
,
experimental sets of GEO data
This database stores curated gene expression DataSets, as well as original Series and Platform records in the Gene Expression Omnibus (GEO) repository. DataSet records contain additional resources including cluster tools and differential expression queries.
GEO - Gene Expression Omnibus
,
Data bank of gene expression in the United States
A database run by NCBI that accepts and publishes deposited gene expression data. The database supports microarray (DNA chip) and SAGE data.
FANTOM4
In FANTOM4 the focus has changed to understanding how these components work together in the context of a biological network. Using deepCAGE (deep sequencing with CAGE) we monitored the dynamics of transcription start site (TSS) usage during a time course of monocytic differentiation in the acute myeloid leukemia cell line THP-1. This allowed us to identify active promoters, monitor their relative expression and define relevant regions for carrying out transcription factor binding site predictions. Computational methods were then used to build a network model of gene expression in this leukemia and the transcription factors key to its regulation. This work gives the first picture of the wiring between genes involved in acute myeloid leukemia and provides a strategy for identifying key factors that determine cell fates. In addition to the network, FANTOM4 data was used in two additional analyses. The first identified a novel class of short RNAs associated with transcription start sites and the second focused on the role of repetitive element expression in the transcriptome. (cited from http://fantom.gsc.riken.jp/4/ ) Developed by RIKEN Omics Science Center
EpoDB - Erythropoiesis Database
,
Genes related to red blood cell hematopoiesis
A database that has collected and organized genes that are expressed in hematopoiesis of vertebrate red blood cells, as well as their sequences and expression information, from public databases
EXPRESSION
,
Microarray gene expression data
This database was constructed in order to integrate microarray gene expression data to KEGG genome and KEGG pathway. Gene expression data for Cyanophyceae, Bacillus subtilis, E.coli, human, and budding yeasts are registered.
Gene expression of imprinted genes in mouse
,
EICO DB
A database of candidate imprinted genes in the mouse, with gene expression confirmed by microarrays. SNPs in the genes and the corresponding human genome regions are also included. It is intended for the discovery of new imprinted genes.
Dr. Zompo
,
Zostera marina and Posidonia oceanica ESTs
As ecosystem engineers, seagrasses are angiosperms of paramount ecological importance in shallow shoreline habitats around the globe. Furthermore, the ancestors of independent seagrass lineages have secondarily returned into the sea in separate, independent evolutionary events. Thus, understanding the molecular adaptation of this clade not only makes significant contributions to the field of ecology, but also to principles of parallel evolution as well. With the use of Dr. Zompo, the first interactive seagrass sequence database presented here, new insights into the molecular adaptation of marine environments can be inferred. The database is based on a total of 14 597 ESTs obtained from two seagrass species, Zostera marina and Posidonia oceanica, which have been processed, assembled and comprehensively annotated. Dr. Zompo provides experimentalists with a broad foundation to build experiments and consider challenges associated with the investigation of this class of non-domesticated monocotyledon systems. Our database, based on the Ruby on Rails framework, is rich in features including the retrieval of experimentally determined heat-responsive transcripts, mining for molecular markers (SSRs and SNPs), and weighted key word searches that allow access to annotation gathered on several levels including Pfam domains, GeneOntology and KEGG pathways. Well established plant genome sites such as The Arabidopsis Information Resource (TAIR) and the Rice Genome Annotation Project are interfaced by Dr. Zompo. With this project, we have initialized a valuable resource for plant biologists in general and the seagrass community in particular. The database is expected to grow together with more data to come in the near future, particularly with the recent initiation of the Zostera genome sequencing project.
Database of Genomic Variants
The objective of the Database of Genomic Variants is to provide a comprehensive summary of structural variation in the human genome. We define structural variation as genomic alterations that involve segments of DNA that are larger than 1 kb. For the purpose of this database, we focus on variants that are not directly correlated with specific phenotypes. The Database of Genomic Variants provides a useful catalog of control data for studies aiming to correlate genomic variation with phenotypic data. The database is continuously updated with new data from peer reviewed research studies. We always welcome suggestions and comments regarding the database from the research community.
DBTGR
,
Knowledge model of tunicate gene expression regulation
A database of Ascidian gene expression loci, control regions, promoter sequences, and transcription factors. It registers inter-species comparison of promoters (C.intestinalis vs. C.savignyi).
DART
Cricket EST DB and Expression DB
,
ESTs of cricket
Cricket EST sequences, clustering analyses, and homology search results are registered.
CleanEx
Expression reference database, linking heterogeneous expression data to facilitate cross-dataset comparisons
Ciona intestinalis EST project database
Chick Eye EST DB and Expression DB
Cancer Gene Expression Database (CGED)
CGED (Cancer Gene Expression Database) is a database of gene expression profiles and accompanying clinical information. The data of CGED were obtained through collaborative efforts of Nara Institute of Science and Technology and Osaka University School of Medicine to identify genes of clinical importance.
CREAT portal
,
Gene/protein expression profiles and protein-protein interactions of mouse mKIAA genes
Gene expression has been analyzed using microarrays based on mKIAA clones obtained in the Kazusa mouse cDNA project. Ectopic expressions seem to have been analyzed using hybridization (with images). The database (InGap) contains protein expression analyzed with western blot, immunohistochemical analysis, and immunoprecipitation using antibodies based on mKIAA. The database (InCeP) contains protein-protein interactions between mKIAA-expressed proteins analyzed with immunoprecipitation and MS/MS. The interactions can be searched, displayed, and downloaded, but displaying them requires dedicated software.
CIBEX
,
Gene expression databank of Japan
Deposition of gene expression data from microarrays by researchers is accepted and published.
CAGE
,
CAGE/ transcription start site
A database of CAGE tag mapping to the genome. Human and mouse libraries by the tissues and developmental stages were constructed, and the sequences are mapped to UCSC (golden path) genome. The mapping results are referred to by FANTOM.
Brain Gene Expression Database (BGED)
,
Database of mouse brain gene expression
A database of gene expression in the mouse brain analyzed in various physiological and pathological processes. ATAC-PCR was used for gene expression analysis.
BodyMap-Xs
,
Gene expression organized and compared across species
The database enables inter-species comparison of gene expression patterns based on orthologous relations in the organs and the genes between species. Breakdown of the cDNA libraries into source organs is conducted using original taggers that automatically analyze material description from the latest dbEST (an EST division of DDBJ), while the genes are classified based on UniGene. The genes between species are linked using InParanoid ortholog relations.
BodyMap
,
Human and mouse gene expression
A database of gene expression in human and mouse tissues and cells. The gene expression is analyzed based on 3'ESTs.
BloodSAGE
A database of SAGE analyses of gene expression in blood cells.
BED (Brain EST Database)
Brain EST Database (BED) is based on a collection of 3' end ESTs generated in the Taisho Laboratory of Functional Genomics.
Atlas (ISH Data Base)
,
Gene expression of Dictyostelium (social amoeba)
Gene expression information of Dictyostelium discoideum, a cellular slime mold, is registered. Registered data are cDNA clones from each stage, EST, their assemblies, gene expression images with in-situ hybridization.
ArrayExpress
,
European data bank of microarray gene expression
A database operated by EBI that accepts and publishes microarray gene expression data. Arrays (design), gene expression, and protocols can be registered separately. Accepted data types are ChIP-chip, CGH, gene expression, protein arrays, and RNAi.
Archaeal Gene Network (Arch GeNet)
,
Protein and gene expression of Thermoplasma volcanium GSS1(thermophilic archaebacterium)
Protein expression in a thermophilic archaebacterium under aerobic and anaerobic conditions was analyzed with 2-dimensional gel electrophoresis. In addition, gene expression under three types of environments was analyzed using microarrays.
Arabidopsis EST analysis database
,
Arabidopsis thaliana EST Index
Arabidopsis ESTs and the annotation (based on BLASTX). Supplement data for 10907847.
ASSETs (Alternative Splicing Sequence Enriched Tags)
,
Mouse ESTs that are rich in alternative splicing
A database of tag sequences from cDNA libraries that were established from mouse cell lines and that are rich in alternative splicing. The sequences seem to have been pattern-classified based on mapping to Ensembl genomes.
ARTADEdb
,
Arabidopsis exon detection, tool and validation result
A database of Arabidopsis exons detected with tiling arrays conducted in RIKEN. Programs that were used in the data analysis are also made public.
5'SAGE
,
5'end serial analysis of gene expression database
A database of 5'SAGE analysis of human gene transcription start sites and the number of expressed tags.
EST
Gene-oriented clusters of transcript sequences
,
UniGene
UniGene is an experimental system for automatically partitioning GenBank sequences into a non-redundant set of gene-oriented clusters. Each UniGene cluster contains sequences that represent a unique gene, as well as related information such as the tissue types in which the gene has been expressed and map location.
Gene expression profile of mouse brain during postnatal development
The gene expression profiles were analyzed using Affymetrix GeneChip in the mouse cerebellum developmental stages after birth (2001) and in the mouse embryo brains developmental processes (2005). (2001) and (2005) are supplement data for 15018818 and 15893506, respectively.
Chlamydomonas reinhardtii EST index
,
ESTs of chlamydomonas (single celled green alga)
Chlamydomonas ESTs, the contig sequences, and BLASTX annotations of the contigs are registered. Supplement data for 11089912; Phycologia 43: 722-726 (2004).
Database for Macaca fascicularis (cynomolgus monkey) full-length cDNA
,
QFbase - Macaca fascicularis cDNA database
EST and full-length sequences and their homology to human sequences of Macaca fascicularis cDNA clones are registered. The full-length cDNAs are referred to in 17194215.
Gene expression database of Ciona intestinalis (sea squirt)
Ciona intestinalis gene expression analyses with ESTs and in situ hybridization are registered. 16 types of cDNA libraries for tissues and developmental stages have been constructed. EST clustering results and genome (obtained from JGI) mapping results are registered. In situ hybridization images can be searched by expression location and stage.
Full-length cDNA of silkworm
,
Insect Genome Databases, IGB lab., Univ. of Tokyo
EST sequences of the silkworm full-length cDNA clones.
Data bank of ESTs in the United States
,
dbEST
A dataset collected as GenBank EST division.
XDB
,
Xenopus laevis (frog) gene expression
The database contains Xenopus laevis EST, their assemblies, and WISH images. The assembly sequences are annotated with BLAST searches targeted to NCBI-NR, TIGR-XGI, Xenopus protein database of NIH, and InterProScan. WISH images have been taken from each direction at developmental stages.
ESTs of Nicotiana tabacum cell line (BY-2)
,
Transcription Analysis of BY-2
Nicotiana tabacum-derived cell line BY-2 cDNA libraries were constructed and the ESTs were sequenced. BLASTX annotations for each EST and the clustering (with BLASTN searches between ESTs) are registered.
EST analysis of silkworm gene expression
,
SilkBase
Silkworm ESTs and the cDNA library information. Annotation with BLASTX is registered.
Automatically clustered human ESTs
,
STACK
A database that categorizes human ESTs and mRNAs by expressed tissue, developmental stage, and related disease, and clusters the sequences within each category using an original method. Descriptions of splicing variants are included. The system used to build this database is made available.
ESTs of rice
,
RGP Rice cDNA Sequence Database
The database contains ESTs sequenced in NIAS and the clustering analyses.
Arabidopsis transposon mutant strains
,
RARGE [Ac/Ds Transposon Mutants]
A database of insertion positions and the adjacent genes of Arabidopsis transposon mutants. A part of RARGE.
Database for prostate gene expression
,
PEDB
A database of gene expression in human and mouse prostate analyzed with ESTs, microarrays, and protein mass spectrometry.
EST analysis of barley and seed images
,
NBRP-Barley
The database contains EST sequences from nine cDNA libraries of three strains and wild-type barley, covering developmental stages and tissues. The EST data overlap with HarvEST. It contains a list of germplasms that can be distributed from Okayama University, some of which are accompanied by photos of the seeds and sprouts. The database also contains a list of representative strains (selected in consideration of genetic diversity) called the Core Collection.
ESTs of tomato
,
MiBASE
EST sequences of cDNA libraries from the fruits and the leaves of Micro-Tom tomato are registered. The annotations include homologous genes and clusters in UNIGENE database containing ESTs from other projects and GO terms. Gene expression using microarrays made of the cDNA libraries as described above seems to have been analyzed for tissues, developmental stages, and breeds, but no raw data can be obtained. Supplement data for 15975739, Plant Biotechnol. 22: 161-165(2005)
Medaka EST database
,
Medaka gene expression
A database that contains Medaka ESTs, the library information, an explanation of mutation mapping system using ESTs, and the organization of a microarray (Medaka Microarray 8K).
EST of Halocynthia roretzi (ascidian)
,
MAGEST
A database of Halocynthia roretzi EST and the clustering.
ESTs of Lentinus edodes (shiitake mushroom)
,
LeEST
Lentinus edodes cDNA libraries have been constructed, and the 5' ESTs are sequenced. They are registered with BLASTN search results.
EST of silkworm
,
KAIKO cDNA
Silkworm ESTs are organized by cDNA libraries (strain, developmental stages, tissues, and sexes). BLASTX search results for each EST are registered.
Integrated Rice Genome Explorer (INE)
,
Rice genome/annotation
A database of rice genome on which gene maps, physical maps, PCR markers, ESTs, BAC/PAC contigs are mapped.
EST database-viewing software for crops
,
HarvEST
EST sequences and assemblies for barley, Brachypodium, citrus, coffee, cowpea, soybean, rice, and wheat. The database was constructed by the University of California, Riverside, but sequence data are accepted from cooperating institutes in the project (e.g., Okayama University provides barley ESTs). Genome sequences provided by Affymetrix, which were the source data for the genome chip, are also included for barley, wheat, rice, and soybean.
Dr. Zompo
,
Zostera marina and Posidonia oceanica ESTs
As ecosystem engineers, seagrasses are angiosperms of paramount ecological importance in shallow shoreline habitats around the globe. Furthermore, the ancestors of independent seagrass lineages have secondarily returned into the sea in separate, independent evolutionary events. Thus, understanding the molecular adaptation of this clade not only makes significant contributions to the field of ecology, but also to principles of parallel evolution as well. With the use of Dr. Zompo, the first interactive seagrass sequence database presented here, new insights into the molecular adaptation of marine environments can be inferred. The database is based on a total of 14 597 ESTs obtained from two seagrass species, Zostera marina and Posidonia oceanica, which have been processed, assembled and comprehensively annotated. Dr. Zompo provides experimentalists with a broad foundation to build experiments and consider challenges associated with the investigation of this class of non-domesticated monocotyledon systems. Our database, based on the Ruby on Rails framework, is rich in features including the retrieval of experimentally determined heat-responsive transcripts, mining for molecular markers (SSRs and SNPs), and weighted key word searches that allow access to annotation gathered on several levels including Pfam domains, GeneOntology and KEGG pathways. Well established plant genome sites such as The Arabidopsis Information Resource (TAIR) and the Rice Genome Annotation Project are interfaced by Dr. Zompo. With this project, we have initialized a valuable resource for plant biologists in general and the seagrass community in particular. The database is expected to grow together with more data to come in the near future, particularly with the recent initiation of the Zostera genome sequencing project.
Cricket EST DB and Expression DB
,
ESTs of cricket
Cricket EST sequences, clustering analyses, and homology search results are registered.
Ciona intestinalis EST project database
Chick Eye EST DB and Expression DB
BodyMap-Xs
,
Gene expression organized and compared across species
The database enables inter-species comparison of gene expression patterns based on orthologous relations in the organs and the genes between species. Breakdown of the cDNA libraries into source organs is conducted using original taggers that automatically analyze material description from the latest dbEST (an EST division of DDBJ), while the genes are classified based on UniGene. The genes between species are linked using InParanoid ortholog relations.
BodyMap
,
Human and mouse gene expression
A database of gene expression in human and mouse tissues and cells. The gene expression is analyzed based on 3'ESTs.
Atlas (ISH Data Base)
,
Gene expression of Dictyostelium (social amoeba)
Gene expression information of Dictyostelium discoideum, a cellular slime mold, is registered. Registered data are cDNA clones from each stage, EST, their assemblies, gene expression images with in-situ hybridization.
Arabidopsis EST analysis database
,
Arabidopsis thaliana EST Index
Arabidopsis ESTs and the annotation (based on BLASTX). Supplement data for 10907847.
ASSETs (Alternative Splicing Sequence Enriched Tags)
,
Mouse ESTs that are rich in alternative splicing
A database of tag sequences from cDNA libraries that were established from mouse cell lines and that are rich in alternative splicing. The sequences seem to have been pattern-classified based on mapping to Ensembl genomes.
PCR
Cancer Gene Expression Database (CGED)
CGED (Cancer Gene Expression Database) is a database of gene expression profiles and accompanying clinical information. The data of CGED were obtained through collaborative efforts of Nara Institute of Science and Technology and Osaka University School of Medicine to identify genes of clinical importance.
Brain Gene Expression Database (BGED)
,
Database of mouse brain gene expression
A database of gene expression in the mouse brain analyzed in various physiological and pathological processes. ATAC-PCR was used for gene expression analysis.
SAGE, CAGE
SAGE of human immune system cells
SAGE analyses of gene expression in cells of human immune systems.
GEO - Gene Expression Omnibus
,
Data bank of gene expression in the United States
A database run by NCBI that accepts and publishes deposited gene expression data. The database supports microarray (DNA chip) and SAGE data.
FANTOM4
In FANTOM4 the focus has changed to understanding how these components work together in the context of a biological network. Using deepCAGE (deep sequencing with CAGE) we monitored the dynamics of transcription start site (TSS) usage during a time course of monocytic differentiation in the acute myeloid leukemia cell line THP-1. This allowed us to identify active promoters, monitor their relative expression and define relevant regions for carrying out transcription factor binding site predictions. Computational methods were then used to build a network model of gene expression in this leukemia and the transcription factors key to its regulation. This work gives the first picture of the wiring between genes involved in acute myeloid leukemia and provides a strategy for identifying key factors that determine cell fates. In addition to the network, FANTOM4 data was used in two additional analyses. The first identified a novel class of short RNAs associated with transcription start sites and the second focused on the role of repetitive element expression in the transcriptome. (cited from http://fantom.gsc.riken.jp/4/ ) Developed by RIKEN Omics Science Center
CAGE
,
CAGE/ transcription start site
A database of CAGE tag mapping to the genome. Human and mouse libraries by the tissues and developmental stages were constructed, and the sequences are mapped to UCSC (golden path) genome. The mapping results are referred to by FANTOM.
BloodSAGE
A database of SAGE analyses of gene expression in blood cells.
5'SAGE
,
5'end serial analysis of gene expression database
A database of 5'SAGE analysis of human gene transcription start sites and the number of expressed tags.
micro array
PRIMe: Correlated Gene Search
,
Tool that searches for correlated Arabidopsis genes
It provides a service that searches pre-computed correlations in Arabidopsis GeneChip gene expression data. Genes correlated with the query genes are output with their correlation coefficients and annotations. The underlying correlation analysis is public as ATTED-II (www.atted.bio.titech.ac.jp).
Gene expression profile of mouse brain during postnatal development
The gene expression profiles were analyzed using Affymetrix GeneChip in the mouse cerebellum developmental stages after birth (2001) and in the mouse embryo brains developmental processes (2005). (2001) and (2005) are supplement data for 15018818 and 15893506, respectively.
Genome Medicine Database of Japan (GeMDBJ)
,
Polymorphism, gene/ protein expression of diseases
Analyses of genetic polymorphisms (SNP), gene expression (with GeneChip) and protein expression (2D-DIGE, LC-MS/MS) related to Alzheimer disease, gastric cancer, diabetes, hypertension, and asthma. Limited patient information (sexes, stratified ages, living prefectures, past history, smoking history) is included. User registration is required to view some of the information.
Cluster cutting tool for gene expression data
,
PRIMe: Cluster Cutting
A tool to extract gene clusters containing specified genes from clustered GeneChip gene expression data. The results are displayed graphically with Java, and a part of the tree can be further extracted by referring to the tree. Presently, Affymetrix GeneChip gene expression data of Arabidopsis produced at RIKEN and Max Planck are available on the server.
Database of yeast gene expression
,
yMGV - Yeast microarray global viewer
A database of microarray analyses of gene expression in budding and fission yeasts.
Transcriptome analysis of two-component systems in Escherichia coli
Database of disrupted mouse genes by gene trap methods
,
The NAISTrap database
Mutant mouse ES cell lines were produced using random gene disruption by a new gene trap method (UPATrap). Partial trapped gene sequences as well as the homology search results are registered.
Database of gene expression profiles in human tissues and organs
,
SBM Database(Systems Biology and Medicine Database)
RefExA registers gene expression of human normal tissues, normal cells, and various cancer cell lines. LSBM GeNet registers gene expression analyses of cells in various pathological states and in drug administration. HUVEC DB registers gene expression changes of HUVEC after stimuli such as TNF-alpha. All the gene expression has been analyzed with GeneChip.
Microarray gene expression of rice
,
Rice Expression Database (RED)
Gene expression data analyzed with rice cDNA microarrays are registered. The database contains NIAS and STAFF-derived data, as well as analyses in other projects using the same microarrays. The article for the database is Trends in Plant Science (2002) Dec 7 (12):563-564.
Rice Microarray Opening Site (RMOS)
Arabidopsis full-length cDNA
,
RAFL cDNAs
A database of Arabidopsis full-length cDNA sequences and the BLASTX annotation. A part of RARGE.
Differential gene expression profiles of mouse strains
,
Mouse DNA Microarray
Gene expressions compared between mouse C57BL/6J and 129X1SvJ strains in newborn brains, in adult spleens, and in adult livers. Agilent microarrays are used for the analysis. Supplement data for 15029957.
Microarray analysis of Ciona intestinalis (ascidian), gene expression
,
Microarray analysis of embryonic retinoic acid target genes
Gene expression profiles of 9,287 candidates of embryonic retinoic acid target genes analyzed with microarrays using cDNA libraries for the Ciona intestinalis EST analyses. Supplement data for an article 12828686. In addition, in situ hybridization images for 91 genes are shown.
Medaka EST database
,
Medaka gene expression
A database that contains Medaka ESTs, the library information, an explanation of mutation mapping system using ESTs, and the organization of a microarray (Medaka Microarray 8K).
MAEDA (Micro Array Expression DAta search)
,
Microarray analysis of Arabidopsis gene expression
Arabidopsis gene expression analyses using microarrays that were manufactured using 7,000 full-length cDNA clones. URL on GEO: GSE4203, GPL3181.
GEO - Gene Expression Omnibus
,
Data bank of gene expression in the United States
A database run by NCBI that accepts and publishes deposited gene expression data. The database supports microarray (DNA chip) and SAGE data.
EXPRESSION
,
Microarray gene expression data
This database was constructed in order to integrate microarray gene expression data to KEGG genome and KEGG pathway. Gene expression data for Cyanophyceae, Bacillus subtilis, E.coli, human, and budding yeasts are registered.
Gene expression of imprinted genes in mouse
,
EICO DB
A database of candidate imprinted genes in the mouse, with gene expression confirmed by microarrays. SNPs in the genes and the corresponding human genome regions are also included. It is intended for the discovery of new imprinted genes.
Database of Genomic Variants
The objective of the Database of Genomic Variants is to provide a comprehensive summary of structural variation in the human genome. We define structural variation as genomic alterations that involve segments of DNA that are larger than 1 kb. For the purpose of this database, we focus on variants that are not directly correlated with specific phenotypes. The Database of Genomic Variants provides a useful catalog of control data for studies aiming to correlate genomic variation with phenotypic data. The database is continuously updated with new data from peer reviewed research studies. We always welcome suggestions and comments regarding the database from the research community.
DART
CREAT portal
,
Gene/protein expression profiles and protein-protein interactions of mouse mKIAA genes
Gene expression has been analyzed using microarrays based on mKIAA clones obtained in the Kazusa mouse cDNA project. Ectopic expressions seem to have been analyzed using hybridization (with images). The database (InGap) contains protein expression analyzed with western blot, immunohistochemical analysis, and immunoprecipitation using antibodies based on mKIAA. The database (InCeP) contains protein-protein interactions between mKIAA-expressed proteins analyzed with immunoprecipitation and MS/MS. The interactions can be searched, displayed, and downloaded, but displaying them requires dedicated software.
CIBEX
,
Gene expression databank of Japan
Deposition of gene expression data from microarrays by researchers is accepted and published.
ArrayExpress
,
European data bank of microarray gene expression
A database operated by EBI that accepts and publishes microarray gene expression data. Arrays (design), gene expression, and protocols can be registered separately. Accepted data types are ChIP-chip, CGH, gene expression, protein arrays, and RNAi.
Archaeal Gene Network (Arch GeNet)
,
Protein and gene expression of Thermoplasma volcanium GSS1(thermophilic archaebacterium)
Protein expression in a thermophilic archaebacterium under aerobic and anaerobic conditions was analyzed with 2-dimensional gel electrophoresis. In addition, gene expression under three types of environments was analyzed using microarrays.
ARTADEdb
,
Arabidopsis exon detection, tool and validation result
A database of Arabidopsis exons detected with tiling arrays conducted in RIKEN. Programs that were used in the data analysis are also made public.
Image
Database for Aquatic-vertebrate Science
This database consists of living and sample photographs of the fish. Registered photographs increased from 40,000 to 54,583 (Jul. 2006).
Fungus and Actinomycete Gallery
Makino Herbarium Type Specimen Image Database
Japanese Ant Image Database
Mammalian Crania Photographic Archive
Breast Tumor Image Database
JCB DataViewer
3D brain image database of humans, Japanese monkeys and Rhesus monkeys
,
Brain Atlas Database of Japanese Monkey for WWW
Three-dimensional brain images of humans, Japanese macaques, and rhesus macaques are reconstructed from MRI.
BioImage
,
Data bank of biological images
A database of biologically informative images (especially microscopic photos) that have been deposited by researchers. Literature information is required as metadata at the time of image registration. Compatible with multi-slice images.
Interaction
Database for Protein-Ligand Interactions
,
Dictionary of protein-ligand interactions
A database of protein-ligand interactions collected from literature. The database contains information on ligands (names, molecular structures, molecular weights, etc.), proteins (names, organisms, ID numbers in PIR/SWISS-PROT/PDB, etc.), and experiments (binding/inhibitory activity, etc.), together with the citations.
List of 145 protein-protein interactions
Genome Network Platform
A database of human and mouse cDNA CAGE tag sequencing data and of molecular interaction data between transcription factors obtained with the yeast two-hybrid method. Both are outcomes of the genome function information analyses in the MEXT Genome Network Project.
Base-Amino Acid Interaction Database
The database has extracted entries containing nucleic acids from PDB, and organized them to enable searches by the presence of nucleic acid-amino acid interactions.
Transcription Product Database (TraP)
Knowledge model of signal transduction pathways
,
The Signaling PAthway Database (SPAD)
A database that collects and visualizes signal transduction pathways. The pathways are classified by extracellular signal molecule: growth factors, cytokines, hormones, and stresses.
The Homeodomain Resource
,
a comprehensive collection of sequence, structure, interaction, genomic and functional information on the homeodomain protein family
The Homeodomain Resource is a curated collection of sequence, structure, interaction, genomic and functional information on the homeodomain family. The current version builds upon previous versions by the addition of new, complete sets of homeodomain sequences from fully sequenced genomes, the expansion of existing curated homeodomain information and the improvement of data accessibility through better search tools and more complete data integration. This release contains 1534 full-length homeodomain-containing sequences, 93 experimentally derived homeodomain structures, 101 homeodomain protein–protein interactions, 107 homeodomain DNA-binding sites and 206 homeodomain proteins implicated in human genetic disorders.
Knowledge model of thermodynamic interactions between proteins and nucleic acids
,
ProNIT
A database of experimentally determined thermodynamic data on interactions between proteins and nucleic acids.
PRIME (PRotein Interaction and Molecular information databas
A revised version of the Kinase Pathway Database that additionally records interaction types and the reliability of the supporting evidence.
PLAnt Cis-acting Regulatory DNA Elements Database
,
PLAnt Cis-acting Regulatory DNA Elements Database (PLACE)
A database of plant cis-element motifs collected from literature. Downloadable.
Chemical compounds relevant to life
,
LIGAND
A database of chemical compounds and reactions relevant to biological processes. It consists of COMPOUND, DRUGS, GLYCAN, REACTION, RPAIR, and ENZYME (from Enzyme Nomenclature).
Database of kinase pathways
,
Kinase Pathway Database
A database of protein kinases of major eukaryotes with completely sequenced genomes. Protein classes and functions, orthologous relations between species, protein interactions, protein domains, and protein structures are registered as well. Protein interactions were collected with natural language processing of literature.
KEGG Pathway
,
Knowledge model of biomolecular interactions/reactions
A database that collects pathways (molecular interactions). Metabolism maps, inter/intracellular information processing maps, and human disease association maps are registered.
KEGG
,
Portal site of KEGG
The top page of the whole KEGG system. From here, PATHWAY, BRITE, GENES, LIGAND, DRUG, GLYCAN, REACTION, and KAAS are linked.
INOH pathway database
Genomic Object Net Pathway Database
Database for G-protein coupled receptor (GPCR) and ligand interaction
,
GLIDA
Interactions between G-protein coupled receptors and their ligands are collected from public databases and linked.
Database of genomes and transcriptional regulations for fila
,
ESTs of Aspergillus oryzae
Aspergillus oryzae cDNA libraries were constructed under several conditions and the 5' ESTs were sequenced. The sequences are available only via FASTA homology search. Promoter analyses appear to be planned, because methods for constructing genomic clones for that purpose are described. In addition, a cosmid clone of Aspergillus nidulans is registered.
Database of Base-Amino Acid Interactions
(From top page) Database of Base-Amino Acid Interactions enables users to find pairs of bases and amino acids that are within a threshold distance in a given protein-nucleic acid complex structure.
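The kind of query described above can be illustrated with a minimal sketch. It is not the database's own implementation: the function name `pairs_within`, the residue labels, the coordinates, and the 4.5 cutoff are all hypothetical, and each base and amino acid is reduced to a single representative point for brevity.

```python
import math

def pairs_within(bases, residues, threshold):
    """Return (base_id, residue_id, distance) for every base/amino-acid
    pair whose representative points lie within `threshold`.
    Inputs map a label to an (x, y, z) tuple in the same length unit."""
    hits = []
    for base_id, base_xyz in bases.items():
        for res_id, res_xyz in residues.items():
            d = math.dist(base_xyz, res_xyz)  # Euclidean distance (Python 3.8+)
            if d <= threshold:
                hits.append((base_id, res_id, d))
    # Closest pairs first
    return sorted(hits, key=lambda h: h[2])

# Toy coordinates, invented for illustration only
bases = {"A5": (0.0, 0.0, 0.0), "G6": (10.0, 0.0, 0.0)}
residues = {
    "ARG12": (3.0, 0.0, 0.0),
    "LYS40": (10.0, 4.0, 0.0),
    "GLU77": (30.0, 0.0, 0.0),
}
hits = pairs_within(bases, residues, threshold=4.5)
for base_id, res_id, d in hits:
    print(base_id, res_id, round(d, 2))
```

A real search would work on atom-level coordinates parsed from a protein-nucleic acid complex structure rather than on one point per residue.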
Annotations of E.coli DNA-binding sites of proteins
,
DPInteract
A database of protein binding sites on the E. coli genome. A recognition matrix is constructed from the known sites, and sites predicted with that matrix are also registered. Each site is annotated.
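As an illustration of the general idea of a recognition matrix, the sketch below builds a log-odds position weight matrix from aligned known sites and scans a sequence for windows scoring above a cutoff. This is a generic technique, not necessarily the procedure used by DPInteract; the function names, the example sites, the uniform background, the pseudocount, and the cutoff are all assumptions.

```python
import math

def build_matrix(sites, pseudocount=1.0):
    """Build a log2-odds matrix from equal-length aligned sites,
    assuming a uniform background of 0.25 per base."""
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    matrix = []
    for i in range(length):
        column = [s[i] for s in sites]
        matrix.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / 0.25)
            for b in "ACGT"
        })
    return matrix

def scan(sequence, matrix, cutoff):
    """Return (start, window, score) for each window scoring >= cutoff.
    The sequence is assumed to contain only A, C, G, and T."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= cutoff:
            hits.append((start, window, score))
    return hits

# Invented example sites and sequence, for illustration only
known_sites = ["TGTGA", "TGTGA", "TGTGC", "TGAGA"]
matrix = build_matrix(known_sites)
hits = scan("CCTGTGACC", matrix, cutoff=4.0)
for start, window, score in hits:
    print(start, window, round(score, 2))
```

The cutoff trades sensitivity against false positives; in practice it would be calibrated against the scores of the known sites.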
Annotation of protein-protein interactions
,
DIP - Database of Interacting Proteins
A database that collects and organizes experimental evidence of protein-protein interactions from literature. The database evaluates the reliability of each entry based on the experimental methods, and provides the most reliable subset as CORE.
DBTGR
,
Knowledge model of Tunicate gene expression regulation
A database of ascidian gene expression loci, control regions, promoter sequences, and transcription factors. It also registers inter-species promoter comparisons (C. intestinalis vs. C. savignyi).
Cytokine Signaling Pathway Database
,
Signaling pathway of cytokine
Information related to cytokine signaling pathways is collected. Ligand-receptor relations of the chemokines, 3D structures and domain structures of receptors, interspecies lineages of receptors, a list of kinases, and links to other databases are registered.
CREAT portal
,
Gene/protein expression profiles and protein-protein interactions of mouse mKIAA genes
Gene expression analyzed using microarrays based on mKIAA clones obtained in the Kazusa mouse cDNA project. Ectopic expression appears to have been analyzed by hybridization (with images). The database (InGap) contains protein expression analyzed by western blot, immunohistochemical analysis, and immunoprecipitation using antibodies based on mKIAA. The database (InCeP) contains protein-protein interactions between mKIAA-expressed proteins analyzed by immunoprecipitation and MS/MS. The interactions can be searched, displayed, and downloaded, but displaying them requires dedicated software.
BRITE
,
Knowledge model of functional hierarchies and binary relationships of biological systems
A database of hierarchical representations of the relationships in biological systems. Gene orthologs, protein families, protein interactions, chemical compounds and reactions, drugs, and diseases are included.
Androgen Receptor Gene Mutations Database
Mutations in the androgen receptor gene
nucleic acid - protein
Base-Amino Acid Interaction Database
The database has extracted entries containing nucleic acids from PDB, and organized them to enable searches by the presence of nucleic acid-amino acid interactions.
Knowledge model of thermodynamic interactions between proteins and nucleic acids
,
ProNIT
A database of experimentally determined thermodynamic data on interactions between proteins and nucleic acids.
PRIME (PRotein Interaction and Molecular information databas
A revised version of the Kinase Pathway Database that additionally records interaction types and the reliability of the supporting evidence.
Database of Base-Amino Acid Interactions
(From top page) Database of Base-Amino Acid Interactions enables users to find pairs of bases and amino acids that are within a threshold distance in a given protein-nucleic acid complex structure.
Annotations of E.coli DNA-binding sites of proteins
,
DPInteract
A database of protein binding sites on the E. coli genome. A recognition matrix is constructed from the known sites, and sites predicted with that matrix are also registered. Each site is annotated.
protein-protein
List of 145 protein-protein interactions
Database of Protein interaction SITEs
,
PiSITE
PiSITE is a web-based database of protein interaction sites. It provides information on the interaction sites of a protein not only from a single PDB entry but also from multiple PDB entries, including those of similar proteins.
PRIME (PRotein Interaction and Molecular information databas
A revised version of the Kinase Pathway Database that additionally records interaction types and the reliability of the supporting evidence.
Annotation of protein-protein interactions
,
DIP - Database of Interacting Proteins
A database that collects and organizes experimental evidence of protein-protein interactions from literature. The database evaluates the reliability of each entry based on the experimental methods, and provides the most reliable subset as CORE.
CREAT portal
,
Gene/protein expression profiles and protein-protein interactions of mouse mKIAA genes
Gene expression analyzed using microarrays based on mKIAA clones obtained in the Kazusa mouse cDNA project. Ectopic expression appears to have been analyzed by hybridization (with images). The database (InGap) contains protein expression analyzed by western blot, immunohistochemical analysis, and immunoprecipitation using antibodies based on mKIAA. The database (InCeP) contains protein-protein interactions between mKIAA-expressed proteins analyzed by immunoprecipitation and MS/MS. The interactions can be searched, displayed, and downloaded, but displaying them requires dedicated software.
Database for Protein-Ligand Interactions
,
Dictionary of protein-ligand interactions
A database of protein-ligand interactions collected from literature. The database contains information on ligands (names, molecular structures, molecular weights, etc.), proteins (names, organisms, ID numbers in PIR/SWISS-PROT/PDB, etc.), and experiments (binding/inhibitory activity, etc.), together with the citations.
PRIME (PRotein Interaction and Molecular information databas
A revised version of the Kinase Pathway Database that additionally records interaction types and the reliability of the supporting evidence.
Chemical compounds relevant to life
,
LIGAND
A database of chemical compounds and reactions relevant to biological processes. It consists of COMPOUND, DRUGS, GLYCAN, REACTION, RPAIR, and ENZYME (from Enzyme Nomenclature).
Database for G-protein coupled receptor (GPCR) and ligand interaction
,
GLIDA
Interactions between G-protein coupled receptors and their ligands are collected from public databases and linked.
Lipid
Database for lipids
,
LipidBank
A database of bioactive lipids, covering their structures, physical and chemical characteristics, spectrum data, metabolic pathways, fatty acid compositions, and citations. The structures are provided in ChemDraw format.
Marker Molecule
Biomarker Candidates
,
Biomarker search service
A service (and the supporting databases) for searching biomarker candidates from given keywords. User-specified keywords are searched in medical documents (originating from Medline, OMIM, and PPI), and chemical compounds that appear specifically in the hit documents are listed as marker candidates. At the same time, genes/proteins related to the keywords are searched in the same way, the chemical compounds corresponding to those genes/proteins are looked up, and the results are also treated as marker candidates. The locations of the hit compounds in the documents can be viewed.
RNA
tRNA gene database curated manually by experts
,
tRNADB-CE
Students took the lead in predicting tRNA candidate sequences across a large part of the prokaryotic DNA sequences, including fragmentary genome sequences of uncultured microorganisms, using three programs (tRNAscan-SE, ARAGORN, tRNAfinder). Where the programs gave different predictions (about three percent of the predicted sequences), three experts in tRNA experiments (Hachiro Iguchi, former professor of Kyoto University, Akira Muto, professor emeritus of Hirosaki University, and Yuko Yamada, former lecturer of Jichi Medical College) inspected the candidates closely and registered them in the database as tRNA. Because fragmentary genome sequences were added to the analysis targets, more than 140,000 tRNA genes were registered, four times as many as in earlier databases. It is an outcome of the integrated database project.
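The triage step described above, in which candidates that the three programs do not all agree on are set aside for expert inspection, can be sketched as follows. This is a hypothetical illustration only: the function name `triage` and the candidate identifiers are invented, and the real pipeline compares predicted sequences rather than simple labels.

```python
def triage(predictions):
    """Split candidates into those predicted by every program and
    those predicted by only some programs (to be reviewed by experts).
    `predictions` maps a program name to a set of candidate IDs."""
    candidate_sets = list(predictions.values())
    all_candidates = set().union(*candidate_sets)
    agreed = set.intersection(*candidate_sets)
    needs_review = all_candidates - agreed
    return agreed, needs_review

# Invented candidate IDs, for illustration only
predictions = {
    "tRNAscan-SE": {"cand1", "cand2", "cand3"},
    "ARAGORN": {"cand1", "cand2", "cand4"},
    "tRNAfinder": {"cand1", "cand2", "cand3"},
}
agreed, needs_review = triage(predictions)
print(sorted(agreed))
print(sorted(needs_review))
```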
snoOPY
The database contains snoRNAs (small nucleolar RNAs) for 10 organisms.
ncRNAs database
Noncoding RNAs with regulatory functions
miRBase
Database of microRNAs (small noncoding RNAs)
Database of functional RNA sequences and literature information
,
fRNAdb
One of the databases produced by the METI functional RNA project. It contains known and novel functional RNA sequences and the related literature.
fRNA Database
Database of small nucleolar RNAs from budding yeast
,
Yeast snoRNA Database
A database of budding yeast snoRNA structures and their interactions with other RNAs.
Database of functional RNAs that utilizes the UCSC Genome Browser
,
UCSC GenomeBrowser for Functional RNA
A database that displays functional RNAs on the UCSC Genome Browser. It is one of the outcomes of the METI functional RNA project.
Annotation of tmRNA sequences
,
The tmRNA website
A database of tmRNA information. It contains sequences, secondary structures (the bases in the sequences are colored by the structural elements), corresponding proteolysis tag peptides, as well as multiple alignments of all the tmRNA sequence sets and all the tag peptide sets. The sequences consist of those identified by direct sequencing and those obtained from public databases.
Database for tmRNA sequences
,
The tmRDB and SRPDB resources.
A database of tmRNA (a stable, low molecular weight RNA that has the functions of both tRNA and mRNA and is widely observed in bacteria), SRP RNA (signal recognition particle RNA), proteolysis tag peptides, and related proteins. In addition to the sequences, it contains multiple alignments, topological structures (tmRNA and SRP RNA only), and 3D structures.
Database for rRNA sequences
,
Ribosomal Database Project (RDP-II)
A database of known rRNA sequences from GenBank. The alignments it provides were constructed with original algorithms that take sequences and secondary structures into account at the same time. The database has a browser that organizes the entries by taxonomic hierarchy.
RNAmod db
RNA modification db
RNABase
RNA-containing structures from PDB and NDB
RNA Modification Database
Naturally modified nucleosides in RNA
RNA Bibliography
Project Specific Custom Tracks
Plant snoRNA DB
snoRNA genes in plant species
5S Ribosomal RNA Database
5S rRNA sequences
miRNA
The mi-R ontology database
,
miRò
miRò is a web-based knowledge base that provides users with miRNA–phenotype associations in humans. It integrates data from various online sources, such as databases of miRNAs, ontologies, diseases and targets, into a unified database equipped with an intuitive and flexible query interface and data mining facilities. The main goal of miRò is the establishment of a knowledge base which allows non-trivial analysis through sophisticated mining techniques and the introduction of a new layer of associations between genes and phenotypes inferred based on miRNAs annotations. Furthermore, a specificity function applied to validated data highlights the most significant associations.
Project Specific Custom Tracks
rRNA
5S Ribosomal RNA Database
5S rRNA sequences
tRNA
tRNA gene database curated manually by experts
,
tRNADB-CE
Students had initiatives in the prediction of tRNA candidate sequences in a large part of the prokaryotes' DNA sequences, including fragment genome sequences of uncultured microorganisms, using three programs (tRNAscan-SE, ARAGORN, tRNAfinder). In case of different prediction among the programs (about three percent of the predicted sequences), three tRNA experiment experts (Hachiro Iguchi, former professor of Kyoto University, Akira Muto, professor emeritus of Hirosaki University, and Yuko Yamada, former lecturer of Jichi Medical College) made close inspection and registered in the database as tRNA. Because fragment genome sequences were added to analyses target, more than 140,000 tRNA genes were registered, which is four times larger than databases in the past. It is an outcome of integrated database project.
Reference
Genotype and Phenotype
,
dbGaP
The database of Genotypes and Phenotypes (dbGaP) was developed to archive and distribute the results of studies that have investigated the interaction of genotype and phenotype. Such studies include genome-wide association studies, medical sequencing, molecular diagnostic assays, as well as association between genotype and non-clinical traits. The advent of high-throughput, cost-effective methods for genotyping and sequencing has provided powerful tools that allow for the generation of the massive amount of genotypic data required to make these analyses possible.
PubMed
,
biomedical literature citations and abstracts
PubMed lets you search millions of journal citations and abstracts in the fields of medicine, nursing, dentistry, veterinary medicine, the health care system, and preclinical sciences. It includes access to MEDLINE® and to citations for selected articles in life science journals not included in MEDLINE. PubMed also provides access to additional relevant Web sites and links to the other NCBI molecular biology resources.
OMIM
,
Online Mendelian Inheritance in Man
OMIM is a comprehensive, authoritative, and timely compendium of human genes and genetic phenotypes. The full-text, referenced overviews in OMIM contain information on all known mendelian disorders and over 12,000 genes. OMIM focuses on the relationship between phenotype and genotype. It is updated daily, and the entries contain copious links to other genetics resources.
OMIA
,
Online Mendelian Inheritance in Animals
Online Mendelian Inheritance in Animals (OMIA) is a database of genes, inherited disorders and traits in animal species (other than human and mouse) authored by Professor Frank Nicholas of the University of Sydney, Australia, with help from many people over the years. The database contains textual information and references, as well as links to relevant records from OMIM, PubMed, Gene, and soon to NCBI's Phenotype database.
A dictionary of technical terms in the life sciences
,
LIFE SCIENCE DICTIONARY PROJECT
Dictionaries of technical terms that are used in life sciences. English-Japanese, Japanese-English, and co-occurrence searches are available, and the usages of the terms in article abstracts can be confirmed.
Knowledge model of gene ontology
,
The Gene Ontology (GO) project in 2006.
A database that provides structured, controlled vocabularies for describing genes, gene products, and sequences.
RNA Bibliography
Parsed MEDLINE® data download service
MeSH
,
detailed information about NLM's controlled vocabulary
MeSH is the U.S. National Library of Medicine's controlled vocabulary used for indexing articles for MEDLINE/PubMed. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts.
Info-Pubmed
GENA (Gene Name Dictionary)
FACTS
,
Link between mouse cDNA and literature
A database that maps literature information onto RIKEN mouse full-length cDNA clones. It consists of links between functions predicted from sequence homology searches and information on protein functions obtained from keyword searches in literature.
Sugar chain
Dictionary of carbohydrate antigens and antibodies
,
Sugar chain database (GlycoEpitope)
A dictionary of carbohydrate antigens and the antibodies that recognize them. Information on the antigens includes sugar chains, antibodies that recognize them, glycoproteins carrying the antigens, glycolipids having them as a building block, and the enzymes participating in their biosynthesis and degradation. Information on the antibodies includes the antibodies themselves, the sugar chain sequences that they recognize, cases of immunoprecipitation, immunoblotting, and histochemistry experiments using them, and places to obtain them.
JCGGDB
,
Japan Consortium for Glycobiology and Glycotechnology DataBase
First of all, we started sharing and integrating 6 databases developed by Glycogene Function Team, Molecular Medicine Team and Lectin Application and Analysis Team at AIST: I. Mass spectral database of Glycans (GMDB: GlycoMass Database); II. Lectin database (LfDB: Lectin frontier Database); III. Lectin-glycan interaction profiling database (LfDB: Lectin frontier Database); IV. Glycogene database (GGDB: GlycoGene Database); V. Database on substrate specificity of glycosyltransferases (KEM-C); VI. N-glycosylated protein database (C.elegans, mouse, etc.) (GlycoProtDB: GlycoProtein Database). In 2008, 3 Universities (Ritsumeikan University, Nagoya City University and Nagoya University) and an organization (Executive Committee of Lipid Database) newly joined the project. We will integrate the database which each institute has and the one of AIST. We are calling for the participation of research institutes, universities, companies and organizations which have glycan-related resources (data or model organisms, etc.) (cited from http://jcggdb.jp/seturitu_en.html ). Developed by Research Center for Medical Glycoscience, National Institute of Advanced Industrial Science and Technology.
Glycoconjugate Data Bank:Structure
Glycan
,
Glycome related pathways
A database of sugar chains and the relevant pathways that have been collected from KEGG, CarbBank, and literature. Tools to generate possible structures of sugar chains are included. A part of KEGG LIGAND.
GALAXY
,
Structures of N-glycans in glycoproteins
Structures of sugar chains linked to asparagine residues (N-type sugar chains) were analyzed with the original 2D/3D sugar chain mapping method, and the resulting structures (combinations of sugar residue units) are registered.
[Created by
Life Science Databases(LSDB)
]