SciCrunch Registry is a curated repository of scientific resources, with a focus on biomedical resources, including tools, databases, and core facilities - visit SciCrunch to register your resource.
http://cgi-www.daimi.au.dk/cgi-chili/datfap/frontdoor.py
A database of transcription factors from 13 plant species, and PCR primers for around 90% of them.
Proper citation: DATFAP (RRID:SCR_005413)
A publicly available database of Transposed elements (TEs) which are located within protein-coding genes of 7 organisms: human, mouse, chicken, zebrafish, fruit fly, nematode and sea squirt. Using TranspoGene the user can learn about the many aspects of the effect these TEs have on their hosting genes, such as: exonization events (including alternative splicing-related data), insertion of TEs into introns, exons, and promoters, specific location of the TE over the gene, evolutionary divergence of the TE from its consensus sequence and involvement in diseases. The TranspoGene database is quickly searchable through its website, supports many kinds of searches and is available for download. TranspoGene contains information regarding specific type and family of the TEs, genomic and mRNA location, sequence, supporting transcript accession and alignment to the TE consensus sequence. The database also contains host gene specific data: gene name, genomic location, Swiss-Prot and RefSeq accessions, diseases associated with the gene and splicing pattern. The TranspoGene and microTranspoGene databases can be used by researchers interested in the effect of TE insertion on the eukaryotic transcriptome.
Proper citation: TranspoGene (RRID:SCR_005634)
http://www.gene-regulation.com/pub/databases.html#transfac
Manually curated database of eukaryotic transcription factors, their genomic binding sites and DNA binding profiles. Used to predict potential transcription factor binding sites.
Proper citation: TRANSFAC (RRID:SCR_005620)
The Kabat Database determines the combining site of antibodies based on the available amino acid sequences. The precise delineation of complementarity determining regions (CDR) of both light and heavy chains provides the first example of how properly aligned sequences can be used to derive structural and functional information of biological macromolecules. The Kabat database now includes nucleotide sequences, sequences of T cell receptors for antigens (TCR), major histocompatibility complex (MHC) class I and II molecules, and other proteins of immunological interest. The Kabat Database searching and analysis tools package is an ASP.NET web-based portal containing lookup tools, sequence matching tools, alignment tools, length distribution tools, positional correlation tools and much more. The searching and analysis tools are custom made for the aligned data sets contained in both the SQL Server and ASCII text flat file formats. The searching and analysis tools may be run on a single PC workstation or in a distributed environment. The analysis tools are written in ASP.NET and C# and are available in Visual Studio .NET 2003/2005/2008 formats. The Kabat Database was initially started in 1970 to determine the combining site of antibodies based on the available amino acid sequences at that time. Bence Jones proteins, mostly from human, were aligned, using the now-known Kabat numbering system, and a quantitative measure, variability, was calculated for every position. Three peaks, at positions 24-34, 50-56 and 89-97, were identified and proposed to form the complementarity determining regions (CDR) of light chains. Subsequently, antibody heavy chain amino acid sequences were also aligned using a different numbering system, since the locations of their CDRs (31-35B, 50-65 and 95-102) are different from those of the light chains. 
CDRL1 starts right after the first invariant Cys 23 of light chains, while CDRH1 is eight amino acid residues away from the first invariant Cys 22 of heavy chains. During the past 30 years, the Kabat database has grown to include nucleotide sequences, sequences of T cell receptors for antigens (TCR), major histocompatibility complex (MHC) class I and II molecules and other proteins of immunological interest. It has been used extensively by immunologists to derive useful structural and functional information from the primary sequences of these proteins.
Proper citation: Kabat Database of Sequences of Proteins of Immunological Interest (RRID:SCR_006465)
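The per-position "variability" mentioned in the Kabat entry can be illustrated with a short sketch. The entry does not spell out the formula; the code below assumes the commonly cited Wu-Kabat definition (number of distinct residues at a position divided by the frequency of the most common residue). The function names and the toy alignment are hypothetical and are not part of the Kabat tools package.

```python
from collections import Counter


def wu_kabat_variability(column):
    """Variability of one alignment column.

    Assumed definition: (number of distinct residues) divided by
    (frequency of the most common residue). Gap characters are ignored.
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    most_common_freq = counts.most_common(1)[0][1] / len(residues)
    return len(counts) / most_common_freq


def variability_profile(aligned_sequences):
    """Variability for every column of equal-length aligned sequences."""
    return [wu_kabat_variability(col) for col in zip(*aligned_sequences)]


# Toy alignment (made-up sequences, for illustration only).
toy_alignment = ["CRAS", "CKAS", "CQSS", "CRGS"]
profile = variability_profile(toy_alignment)
print(profile)
```

Invariant columns score 1.0 under this definition, and peaks in the profile mark candidate hypervariable positions, which is how the light-chain CDR positions are described as having been identified.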
http://edwardslab.bmcb.georgetown.edu/downloads/
The Peptide Sequence Database contains putative peptide sequences from human, mouse, rat, and zebrafish. Compressed to eliminate redundancy, these are about 40-fold smaller than a brute force enumeration. Current and old releases are available for download. Each species' peptide sequence database comprises peptide sequence data from relevant species-specific UniGene and IPI clusters, plus all sequences from their constituent EST, mRNA and protein sequence databases, namely RefSeq proteins and mRNAs, UniProt's SwissProt and TrEMBL, GenBank mRNA, ESTs, and high-throughput cDNAs, HInv-DB, VEGA, EMBL, IPI protein sequences, plus the enumeration of all combinations of UniProt sequence variants, Met loss PTM, and signal peptide cleavages. The README file contains some information about the non-amino-acid symbols O (digest site corresponding to a protein N- or C-terminus) and J (no digest sequence join) used in these peptide sequence databases and information about how to configure various search engines to use them. Some search engines handle (very) long sequences badly and in some cases must be patched to use these peptide sequence databases. All search engines supported by the PepArML meta-search engine can (or can be patched to) successfully search these peptide sequence databases.
Proper citation: Peptide Sequence Database (RRID:SCR_005764)
http://indel.bioinfo.sdu.edu.cn/gridsphere/gridsphere
THIS RESOURCE IS NO LONGER IN SERVICE, documented September 2, 2016. Indel Flanking Region Database is an online resource for indels and the flanking regions of proteins in SCOP superfamilies, including amino acid sequences, lengths, locations, secondary structure constitutions, hydrophilicity / hydrophobicity, domain information, 3D structures and so on. It aims to provide a comprehensive dataset for analyzing the qualities of amino acid insertions/deletions (indels), substitutions and the relationship between them. The indels were obtained through the pairwise alignment of homologous structures in SCOP superfamilies. The IndelFR database contains 2,925,017 indels with flanking regions extracted from 373,402 structural alignment pairs of 12,573 non-redundant domains from 1053 superfamilies. IndelFR has already been used for molecular evolution studies and may help to promote future functional studies of indels and their flanking regions.
Proper citation: IndelFR - Indel Flanking Region Database (RRID:SCR_006050)
Collection of transmembrane protein datasets containing experimentally derived topology information from the literature and from public databases. The web interface of TOPDB includes tools for searching, relational querying and data browsing, as well as visualisation tools for topology data.
Proper citation: Topology Data Bank of Transmembrane Proteins (RRID:SCR_007964)
SIMAP provides a database based on a pre-computed similarity matrix covering the similarity space formed by >4 million amino acid sequences from public databases and completely sequenced genomes. The database is capable of handling very large datasets and is updated incrementally. For sequence similarity searches and pairwise alignments, we implemented a grid-enabled software system, which is based on FASTA heuristics and the Smith-Waterman algorithm. SimpleSIMAP and AdvancedSIMAP retrieve homologs for given protein sequences, which must already be contained in the SIMAP database. While SimpleSIMAP provides only selected parameters and preconfigured search spaces, AdvancedSIMAP allows the user to specify search space, filtering and sorting parameters in a flexible manner. Both types of queries result in lists of homologs that are linked in turn to their own homologs, so the web interfaces allow users to explore the protein world by homology quickly and interactively. Sponsors: SIMAP is supported by the Department of Genome Oriented Bioinformatics of the Technische Universität München and the Institute for Bioinformatics of the GSF-National Research Center for Environment and Health.
Proper citation: SIMAP (RRID:SCR_007927)
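The SIMAP entry names the Smith-Waterman algorithm as one basis of its pairwise alignments. As a rough illustration of that technique only, here is a minimal local-alignment scoring sketch. The match, mismatch and gap values are assumptions chosen for readability; SIMAP's actual scoring scheme (substitution matrix, gap model) is not given in the entry, and this is not SIMAP code.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of strings a and b.

    Uses a simple match/mismatch score and a linear gap penalty
    (assumed values; not the parameters used by SIMAP).
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores are floored at 0 so an alignment may start anywhere.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Two made-up sequences sharing the local segment "ACDE".
print(smith_waterman_score("XXACDEYY", "ZZACDEWW"))
```

Keeping only two rows of the matrix is enough when just the score is needed; recovering the alignment itself would require storing the full matrix and tracing back from the best cell.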
http://www.grt.kyushu-u.ac.jp/spad/
SPAD is divided into four categories based on the extracellular signal molecules (growth factor, cytokine, and hormone) and stress that initiate the intracellular signaling pathway. SPAD is compiled to describe protein-protein and protein-DNA interactions as well as DNA and protein sequence information. There are multiple signal transduction pathways: cascades of information from the plasma membrane to the nucleus in response to an extracellular stimulus in living organisms. An extracellular signal molecule binds a specific intracellular receptor and initiates the signaling pathway. A large amount of information is now available about the signaling pathways that control gene expression and cellular proliferation. We have developed the integrated database SPAD to give an overview of signal transduction.
Proper citation: Signaling Pathway Database (RRID:SCR_008243)
ITFP is an integrated transcription factor (TF) platform that includes abundant information on mammalian TFs and their targets. A support vector machine (SVM) algorithm combined with an error-correcting output coding (ECOC) algorithm was used to identify and classify transcription factors from protein sequences of human, mouse and rat. For transcription factor targets, a reverse engineering method named ARACNE was used to derive potential interaction pairs between transcription factors and downstream regulated genes from human, mouse and rat gene expression profile data. Detailed information on the gene expression profile data can be found on the help page. All data provided by the platform are free for non-commercial users and can be downloaded through links on the help page.
Proper citation: Intergrated Transcription Factor Platform (RRID:SCR_008119)
http://pbil.univ-lyon1.fr/databases/homolens.php
Database of homologous genes from Ensembl organisms, structured under the ACNUC sequence database management system. It allows users to select sets of homologous genes among species, and to visualize multiple alignments and phylogenetic trees. It is possible to search for orthologous genes in a wide range of taxa. HOMOLENS is particularly useful for comparative sequence analysis, phylogeny and molecular evolution studies. More generally, HOMOLENS gives an overall view of what is known about a particular gene family. Note that HOMOLENS is split into two databases on this server: HOMOLENS contains the protein sequences while HOMOLENSDNA contains the nucleotide sequences. Protein sequences of HOMOLENS have been generated by translating the CDS of HOMOLENSDNA and using associated cross-references to generate the annotations.
Proper citation: Homologous Sequences in Ensembl Animal Genomes (RRID:SCR_008356)
http://www.primervfx.com/#welcome
PrimerParadise is an online PCR primer database for genomics studies. The database contains predesigned PCR primers for amplification of exons, genes and SNPs of almost all sequenced genomes. Primers can be used for genome-wide projects (resequencing, mutation analysis, SNP detection etc). The primers for eukaryotic genomes have been tested with e-PCR to make sure that no alternative products will be generated. Also, all eukaryotic primers have been filtered to exclude primers that bind excessively throughout the genome. Genes are amplified as amplicons. An amplicon is defined as a DNA segment of at most 3000 bp containing the exons of only one gene. If a gene is longer than 3000 bp, it is split into segments of 3000 bp; for example, a 5000 bp gene is split into two segments, and a separate primer pair is designed for each segment. If a gene's exons are longer than 3000 bp, they are split into amplicons as well. Every SNP has one primer pair. In addition to considering repetitive sequences and mono- and dinucleotide repeats, we avoid designing primers in genome regions that contain other SNPs.
There are two ways to search for primers:
- by feature ID: Reference ID for SNP primers; for gene/exon primers, various IDs (Ensembl gene IDs, HUGO IDs for human genes, LocusLink IDs, RefSeq IDs, MIM IDs, NCBI gene names, SWISSPROT IDs for bacterial genes, VEGA gene IDs for human and mouse, Sanger S.pombe systematic gene names and common gene names, S.cerevisiae GenBank Locus, AccNo and GI IDs and common gene names)
- by genome region (chromosome coordinates, or chromosome bands if they exist)
Currently we provide 3 primer collections:
- proPCR: gene primers for prokaryotic organisms
- euPCR: gene/exon primers for eukaryotic organisms
- snpPCR: SNP primers for eukaryotic organisms
Sponsors: PrimerStudio is funded by the University of Tartu.
Proper citation: PrimerStudio (RRID:SCR_008232)
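The 3000 bp amplicon rule in the PrimerParadise entry can be sketched in a few lines. This only illustrates the segmentation step as the entry describes it (a 5000 bp gene yields two segments). The function name, the half-open coordinate convention and the absence of any overlap between neighbouring segments are assumptions, and the sketch does not reproduce the database's actual primer design pipeline.

```python
def split_into_amplicons(start, end, max_len=3000):
    """Split the half-open interval [start, end) into consecutive segments
    of at most max_len bp. Hypothetical helper, for illustration only."""
    if end <= start:
        raise ValueError("end must be greater than start")
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [(s, min(s + max_len, end)) for s in range(start, end, max_len)]


# A 5000 bp gene is split into two segments; each would get its own primer pair.
for seg_start, seg_end in split_into_amplicons(0, 5000):
    print(seg_start, seg_end, seg_end - seg_start)
```

A real design step would presumably also have to place primers in flanking sequence and respect the repeat and SNP filters the entry mentions; none of that is modelled here.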
http://www.thearkdb.org/arkdb/
This website contains genome mapping data for poultry. The ArkDB database system aims to provide a comprehensive public repository for genome mapping data from farmed and other animal species. In doing so, it aims to provide a route into genomic and other sequence data from the initial viewpoint of linkage mapping, RH mapping, physical mapping or - possibly more importantly - QTL mapping data. It is supported, in part, by the USDA-CSREES National Animal Genome Research Program in order to serve the poultry genome mapping community. This system represents a complete rewrite of the original version, with the code migrated to Java and the underlying database targeted at PostgreSQL (although any standards-compliant database engine should suffice). The initial release records details of maps and the markers that they contain. There are alternative entry points that target either a chromosome or a specific mapping analysis as the starting point. Limited relationships between markers are recorded and displayed. As with the previous version, all maps are drawn on the fly using data extracted from the database.
Proper citation: ChickBase (RRID:SCR_008147)
http://locustdb.genomics.org.cn/
The migratory locust (Locusta migratoria) is an orthopteran pest and a representative member of hemimetabolous insects. Its transcriptomic data provide invaluable information for molecular entomology study of the insect and pave the way for comparative studies of other medically, agronomically, and ecologically relevant insects. This first transcriptomic database of the locust (LocustDB) has been developed, building the necessary infrastructure to integrate, organize, and retrieve data that are either currently available or to be acquired in the future. It currently hosts 45,474 high quality EST sequences from the locust, which were assembled into 12,161 unigenes. This database contains original sequence data, including homologous/orthologous sequences, functional annotations, pathway analysis, and codon usage, based on conserved orthologous groups (COG), gene ontology (GO), protein domain (InterPro), and functional pathways (KEGG). LocustDB also provides information from comparative analysis based on data from the migratory locust and five other invertebrate species: the silkworm, the honeybee, the fruitfly, the mosquito and the nematode. It starts with the first transcriptome information for an orthopteran and hemimetabolous insect and will be extended to provide a framework for incorporation of incoming genomic data of relevant insect groups and a workbench for cross-species comparative studies.
Proper citation: Migratory Locust EST Database (RRID:SCR_008201)
http://www.sanger.ac.uk/Projects/C_elegans/index.shtml
The Sanger Institute and the Genome Sequencing Center at the Washington University School of Medicine, St. Louis have collaborated to sequence the genomes of both C. elegans and C. briggsae. The completed C. elegans genome sequence is represented by over 3,000 individual clone sequences which can be accessed through this site (or through WormBase). These sequences are submitted to EMBL whenever the sequence or annotation changes (e.g. modification to gene structures) and these submissions are then mirrored to GenBank and DDBJ. These sequences (along with ESTs and proteins) can be searched on our C. elegans BLAST server. WormBase is the repository of mapping, sequencing and phenotypic information for C. elegans. The worm informatics group at the Sanger Institute play a key role in assembling the whole database. They also curate and develop some of the constituent databases that comprise WormBase.
Proper citation: Caenorhabditis Genome Sequencing Projects (RRID:SCR_008155)
http://www.ebi.ac.uk/asd/aedb/index.html
THIS RESOURCE IS NO LONGER IN SERVICE, documented on March 27, 2013. A manually generated database of alternative exons and their properties from numerous species. The data are gathered from literature in which these exons have been experimentally verified. Most alternative exons are cassette exons and are expressed in more than two tissues. Of all exons whose expression was reported to be specific for a certain tissue, the majority were expressed in the brain. At the moment, the available AEdb products are sequence (a database of alternative exons), function (a database of functions attributed to constitutive and alternative exons), regulatory sequence (a database of transcript regulatory motifs), minigenes (a table of minigenes and their associations to splicing events), and diseases (a table of diseases associated with splicing and their associations to AltSplice). Alternative splicing is an important regulatory mechanism of mammalian gene expression. The alternative splicing database (ASD) consortium is systematically collecting and annotating data on alternative splicing. The continuation and upgrade of the ASD consists of computationally and manually generated data. Its largest part is AltSplice, a value-added database of computationally delineated alternative splicing events. Its data include alternatively spliced introns/exons, events, isoform splicing patterns and isoform peptide sequences. AltSplice data are generated by examining gene-transcript alignments. The data are annotated for various biological features including splicing signals, expression states, SNP-mediated splicing and cross-species conservation. AEdb forms the manually curated component of ASD. It is a literature-based data set containing sequence and properties of alternatively spliced exons, functional enumeration of observed splicing events, characterization of observed splicing regulatory elements, and a collection of experimentally clarified minigene constructs.
Proper citation: Alternative Exon Database (RRID:SCR_008157)
http://mips.gsf.de/services/genomes/uwe25/
THIS RESOURCE IS NO LONGER IN SERVICE, documented on July 15, 2013. This is the official database of the environmental chlamydia genome project. This resource provides access to the finished sequence for Parachlamydia-related symbiont UWE25 and to a wide range of manual annotations, automatic analyses and derived datasets. Functional classification and description have been manually annotated according to the Annotation guidelines. Chlamydiae are the major cause of preventable blindness and sexually transmitted disease. Genome analysis of a chlamydia-related symbiont of free-living amoebae revealed that its genome is twice as large as that of any of the pathogenic chlamydiae and has few signs of recent lateral gene acquisition. We showed that about 700 million years ago the last common ancestor of pathogenic and symbiotic chlamydiae was already adapted to intracellular survival in early eukaryotes and contained many virulence factors found in modern pathogenic chlamydiae, including a type III secretion system. Ancient chlamydiae appear to be the originators of mechanisms for the exploitation of eukaryotic cells. Environmental chlamydiae have recently been recognized as obligate endosymbionts of free-living amoebae and have been implicated as potential human pathogens. Environmental chlamydiae form a deep branching evolutionary lineage within the medically important order Chlamydiales. Despite their high diversity and ubiquitous distribution in clinical and environmental samples, only limited information about the genetics and ecology of these microorganisms is available. The Parachlamydia-related Acanthamoeba symbiont UWE25 was therefore selected as a representative environmental chlamydia strain for whole genome sequencing. Comparative genome analysis was performed using PEDANT and SIMAP. Sponsors: The environmental chlamydia genome project was funded by the bmb+f (German Federal Ministry of Education and Research) and is part of the Competence Network PathoGenoMiK.
Proper citation: Protochlamydia amoebophila UWE25 (RRID:SCR_008222)
http://www.bioinf.mdc-berlin.de/splice/db/
THIS RESOURCE IS NO LONGER IN SERVICE, documented on July 15, 2013. An online compendium of alternative splice forms for several organisms (Arabidopsis thaliana, Bos taurus, Caenorhabditis elegans, Drosophila melanogaster, Danio rerio, Homo sapiens, Mus musculus, Rattus norvegicus, Xenopus laevis). Alternative splice forms are defined by comparing high-scoring ESTs to mRNA sequences (both from GenBank) with known exon-intron information (from the ENSEMBL database) using BLAST. Repetitive sequences of all mRNAs are masked beforehand by MaskerAid. Filtering programs with defined parameters compare the ends of each aligned sequence pair for deletions or insertions in the EST sequence, which suggest the existence of alternative splice forms. The database is accessible by typing in accession numbers (ACC) or keywords such as description, gene names or organism. (If more than one hit is found, a list of all results is given.) The result page is divided into 4 major parts. The first part (General Information About The Entry) summarizes the most important information, such as database IDs, organism, and description. The so-called alternative splice profile (ASP) of each human sequence is shown in the second part (Alternative Splice Frequency). The ASP indicates the number of alternatively spliced ESTs (NAE), the number of constitutively spliced ESTs (NCE) and the number of alternative splice sites (NSS) per mRNA. NAE and NCE correspond to the EST coverage and can be used as a quality value for the predicted alternative splice variants. The NSS value specifies the splice propensity of a gene. Moreover, the number of ESTs from cancerous tissues is shown. The histological source and the developmental stages are illustrated with several colors to enable the user to get an overview of the origins of the matching ESTs. Also, the Splice Site View graphically shows all alternative splice sites for the whole transcript.
Proper citation: Extended Alternatively Spliced EST Database (RRID:SCR_008186)
http://mpr.nci.nih.gov/MPR/BrowseProteins.aspx
THIS RESOURCE IS NO LONGER IN SERVICE, documented on 6/24/13. A repository of information on commercially available phospho-specific antibodies to human phosphorylation sites. It provides a BLAST search for phosphorylation sites using as query the amino acid sequence surrounding the site. It also provides direct links to the relevant antibodies from many companies including BD Pharmingen, Biosource International, Cell Signaling Technology (CST), Santa Cruz Biotechnologies, Upstate Biotechnology.
Proper citation: Mammalian Phosphorylation Resource (RRID:SCR_008210)
http://www.schematikon.org/Nh3D.html
THIS RESOURCE IS NO LONGER IN SERVICE, documented on July 17, 2013. Nh3D is freely available as a reference dataset for the statistical analysis of sequence and structure features of proteins in the PDB. It is a dataset of structurally dissimilar proteins. This dataset has been compiled by selecting well resolved representatives from the Topology level of the CATH database, which hierarchically classifies all protein structures. These have been pruned to remove: i) domains that may contain homologous elements (by pairwise sequence comparison and structural superposition of aligned residues); ii) internal duplications (by repeat detection); iii) regions with high B-Factor. The statistical analysis of protein structures requires datasets in which structural features can be considered independently distributed, i.e. not related through common ancestry, and that fulfill minimal requirements regarding the experimental quality of the structures they contain. However, non-redundant datasets based on sequence similarity invariably contain distantly related homologues. Here a reference dataset of non-homologous protein domains is provided, assuming that structural dissimilarity at the topology level is incompatible with recognizable common ancestry. It contains the best refined representatives of each Topology level, validates structural dissimilarity and removes internally duplicated fragments. The compilation of Nh3D is fully scripted. The current Nh3D list contains 570 domains with a total of 90780 residues. It covers more than 70% of folds at the Topology level of the CATH database and represents more than 90% of the structures in the PDB that have been classified by CATH. Even though all protein pairs are structurally dissimilar, some pairwise sequence identities after global alignment are greater than 30%.
Proper citation: Nh3D: A Reference Dataset of Structures of Non-homologous Proteins (RRID:SCR_008212)