Resource Disambiguator for the Web: Extracting Biomedical Resources and Their Citations from the Scientific Literature.

PloS one | 2016

The NIF Registry, developed and maintained by the Neuroscience Information Framework, is a cooperative project aimed at cataloging research resources, e.g., software tools, databases and tissue banks, funded largely by governments and available as tools to research scientists. Although originally conceived for neuroscience, the NIF Registry has over the years broadened in scope to include research resources of general relevance to biomedical research. The Registry currently lists over 13K research resources. The broadening in scope to biomedical science led us to re-christen the NIF Registry platform as SciCrunch. The NIF/SciCrunch Registry has been cataloging the resource landscape since 2006; as such, it serves as a valuable dataset for tracking the breadth, fate and utilization of these resources. Our experience shows that research resources like databases are dynamic objects that can change location and scope over time. Although each record is entered manually and human-curated, the current size of the registry requires tools that can aid in curation efforts to keep content up to date, including when and where such resources are used. To address this challenge, we have developed an open source tool suite, collectively termed RDW: Resource Disambiguator for the Web. RDW is designed to help in the upkeep and curation of the registry as well as in enhancing the content of the registry by automated extraction of resource candidates from the literature. The RDW toolkit includes a URL extractor from papers, a resource candidate screen, a resource URL change tracker and a resource content change tracker. Curators access these tools via a web-based user interface. Several strategies are used to optimize these tools, including supervised and unsupervised learning algorithms as well as statistical text analysis. The complete tool suite is used to enhance and maintain the resource registry as well as track the usage of individual resources through an innovative literature citation index honed for research resources. Here we present an overview of the Registry and show how the RDW tools are used in curation and usage tracking.
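
As a rough illustration of the first toolkit component named in the abstract, the URL extractor, the sketch below pulls candidate resource URLs out of article text with a regular expression. The pattern, the function name `extract_urls` and the sample sentence are assumptions made for this example; the abstract does not describe how RDW itself performs this step.

```python
import re

# Assumed pattern: http, https or ftp URLs running up to whitespace,
# a quote, an angle bracket or a closing bracket.
URL_PATTERN = re.compile(r"(?:https?|ftp)://[^\s<>\"')\]]+")


def extract_urls(text: str) -> list[str]:
    """Return unique URLs in order of first appearance."""
    seen = set()
    urls = []
    for match in URL_PATTERN.finditer(text):
        # Sentence punctuation often trails a URL in running text.
        url = match.group(0).rstrip(".,;:")
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


# Hypothetical article text, not taken from any real paper.
sample = (
    "Models were obtained from a public database (http://example.org/db/). "
    "Code is at https://example.org/tool, see also https://example.org/tool."
)
print(extract_urls(sample))
```

A real extractor would also need to handle URLs broken across lines in PDF-derived text, which this sketch ignores.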

PubMed ID: 26730820

Research resources used in this publication

Antibodies used in this publication

None found

Associated grants

  • Agency: NIDA NIH HHS, United States
    Id: HHSN271200577531C
  • Agency: NIDA NIH HHS, United States
    Id: U24 DA039832
  • Agency: NIDDK NIH HHS, United States
    Id: U24 DK097771
  • Agency: PHS HHS, United States
    Id: HHSN271200577531C

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


SciCrunch (tool)

RRID:SCR_003115

Community portal for researchers and content management system for data and databases. Intended to provide common source of data to research community and data about Research Resource Identifiers (RRIDs), which can be used in scientific publications. Central service where RRIDs can be searched and created. Designed to help communities of researchers create their own portals to provide access to resources, databases and tools of relevance to their research areas. Adds value to existing scientific resources by increasing their discoverability, accessibility, visibility, utility and interoperability, regardless of their current design or capabilities and without need for extensive redesign of their components or information models. Resources can be searched and discovered at multiple levels of integration, from superficial discovery based on limited description of resource at SciCrunch Registry, to deep content query at SciCrunch Data Federation.


ModelDB (tool)

RRID:SCR_007271

Curated database of published models so that they can be openly accessed, downloaded, and tested to support computational neuroscience. Provides accessible location for storing and efficiently retrieving computational neuroscience models. Coupled with NeuronDB. Models can be coded in any language for any environment. Model code can be viewed before downloading and browsers can be set to auto-launch the models. The model source code has to be available from publicly accessible online repository or WWW site. Original source code is used to generate simulation results from which authors derived their published insights and conclusions.


Gemma (tool)

RRID:SCR_008007

Resource for reuse, sharing and meta-analysis of expression profiling data. Database and set of tools for meta analysis, reuse and sharing of genomics data. Targeted at analysis of gene expression profiles. Users can search, access and visualize coexpression and differential expression results.


CSSP (tool)

RRID:SCR_012932

Software for power computation for ChIP-Seq data based on Bayesian estimation for local Poisson counting process.


Genetic Analysis Software (tool)

RRID:SCR_013155

THIS RESOURCE IS NO LONGER IN SERVICE. Documented on May 4th, 2023. Listing of computer software for the gene mapping community on the following topics: genetic linkage analysis for human pedigree data, QTL analysis for animal/plant breeding data, genetic marker ordering, genetic association analysis, haplotype construction, pedigree drawing, and population genetics. The inclusion of a program should not be interpreted as an endorsement of that program by us. In the last few years, new technology has produced new types of genetic data, and the scope of genetic analyses has changed dramatically. It is no longer obvious whether a program should be included or excluded from this list. Topics such as next-generation-sequencing (NGS), gene expression, genomics annotation, etc. can all be relevant to a genetic study, yet be specialized topics by themselves. Though programs on variant calling from NGS can be in, those on sequence alignment might be out; programs on eQTL can be in, those on differential expression might be out. This page was created by Dr. Wentian Li, when he was at Columbia University (1995-1996). It was later moved to Rockefeller University (1996-2002), and now takes its new home at North Shore LIJ Research Institute (2002-now). The present copy is maintained by Jurg Ott as a single file. More than 240 programs had been listed by December 2004, more than 350 by August 2005, close to 400 by December 2006, close to 480 by November 2008, and over 600 by October 2012. A version of the searchable database was developed by Zhiliang Hu of Iowa State University, and a recent round of updating was assisted by Wei JIANG of Harbin Medical School. Some earlier software can be downloaded from EBI: ftp://ftp.ebi.ac.uk/pub/software/linkage_and_mapping/ (Linkage and Mapping Software Repository), and http://genamics.com/software/index.htm may contain archived copies of some programs.


NIF Registry Automated Crawl Data (data or information resource)

RRID:SCR_012862

An automatic pipeline that identifies new resources in publications every month to support the work of NIF curators. The pipeline can also find the last time a resource's webpage was updated and whether its URL is still valid, which helps curators know which resources need attention. Additionally, the pipeline identifies publications that reference existing NIF Registry resources. These mentions are available through the Data Federation version of the NIF Registry, http://neuinfo.org/nif/nifgwt.html?query=nlx_144509. The RDF is based on an algorithm that measures how related a resource is to neuroscience (hits of neuroscience-related terms). Each potential resource is assigned a score on this basis, and the resources are then ranked and a list is generated.
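
The score-then-rank step described for this pipeline can be sketched as follows. This is a minimal illustration under stated assumptions: the term list `NEURO_TERMS`, the whole-word hit counting, and the candidate descriptions are all hypothetical, since the entry does not give the actual terms or scoring formula.

```python
import re
from collections import Counter

# Hypothetical term list; the terms used by the real pipeline are not given.
NEURO_TERMS = {"neuron", "brain", "synapse", "cortex", "neuroscience"}


def relevance_score(description: str) -> int:
    """Count whole-word hits of neuroscience-related terms."""
    words = re.findall(r"[a-z]+", description.lower())
    counts = Counter(words)
    return sum(counts[term] for term in NEURO_TERMS)


def rank_candidates(candidates: dict[str, str]) -> list[tuple[str, int]]:
    """Return (name, score) pairs, highest score first, ties broken by name."""
    scored = [(name, relevance_score(text)) for name, text in candidates.items()]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical candidate resources and descriptions.
candidates = {
    "AtlasDB": "A brain atlas of cortex regions with neuron morphology data.",
    "SeqTool": "Software for aligning sequencing reads.",
    "SynapseMap": "Database of synapse proteins in the brain.",
}
ranking = rank_candidates(candidates)
for name, score in ranking:
    print(name, score)
```

Exact-word matching misses plurals and variants such as "neurons"; stemming or a larger vocabulary would be a natural refinement.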


PubMed Central (data or information resource)

RRID:SCR_004166

Full-text archive of biomedical and life sciences journal literature at U.S. National Institutes of Health National Library of Medicine (NIH/NLM). With PubMed Central, NCBI is taking lead in preserving and maintaining open access to electronic literature. Value of PubMed Central, in addition to its role as an archive, lies in what can be done when data from diverse sources is stored in common format in single repository. All articles in PMC are free (sometimes on a delayed basis). Some journals go beyond free, to Open Access.


PubMed (data or information resource)

RRID:SCR_004846

Public bibliographic database that provides access to citations for biomedical literature from MEDLINE, life science journals, and online books. Citations may include links to full-text content from PubMed Central and publisher web sites. PubMed citations and abstracts include fields of biomedicine and health, covering portions of life sciences, behavioral sciences, chemical sciences, and bioengineering. Provides access to additional relevant web sites and links to other NCBI molecular biology resources. Publishers of journals can submit their citations to NCBI and then provide access to full-text of articles at journal web sites using LinkOut.


eagle-i research resource ontology (software resource)

RRID:SCR_008784

Ontology that models research resources such as instruments, protocols, reagents, animal models and biospecimens. It has been developed in the context of the eagle-i project (http://eagle-i.net/) and consists of over 3451 classes of which over 1200 were created within the ERO namespace, while the rest come from existent ontologies such as the Ontology for Biomedical Investigation (OBI), the uber-anatomy ontology (Uberon), VIVO, the Ontology for Clinical Research (OCRe), the Sequence Ontology (SO), the Software Ontology (SWO) and we include terms from the NCBI Taxonomy as well. The main ontology can be browsed in OntoBee. All purls resolve to OntoBee.


Biomedical Resource Ontology (data or information resource)

RRID:SCR_004443

THIS RESOURCE IS NO LONGER IN SERVICE. Documented on April 27, 2023. A controlled terminology of resources, which is used to improve the sensitivity and specificity of web searches. It includes "resource_type", "area of research", and "activity". It is under development by a number of NIH-funded researchers who have a combined interest in classification of biomedical resources. The biositemaps site is no longer available, but the Biomedical Resource Ontology (BRO) is still available via BioPortal.


SciCrunch Registry (data or information resource)

RRID:SCR_005400

Interactive portal for finding and submitting biomedical resources. Resources within SciCrunch have assigned RRIDs which are used to cite resources in scientific manuscripts. SciCrunch Registry, formerly NIF Registry, provides resources catalog. Users can add new resources and, after registration, edit existing ones. Curators are tasked with identifying and registering resources, examining data, writing configuration files to index and display data and keeping contents current.
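
Because RRIDs appear as plain strings in manuscripts, citations of registry resources can be located with simple pattern matching. The sketch below is an illustration only: the pattern covers just the `RRID:SCR_` form shown on this page, and the function name `find_rrids` and the methods sentence are hypothetical.

```python
import re

# Assumed pattern: matches only identifiers of the form RRID:SCR_<digits>,
# the form used for the tools listed on this page.
RRID_PATTERN = re.compile(r"RRID:\s?(SCR_\d+)")


def find_rrids(text: str) -> list[str]:
    """Return the distinct SCR identifiers cited in the text, sorted."""
    return sorted(set(RRID_PATTERN.findall(text)))


# Hypothetical methods sentence using RRIDs listed on this page.
methods = (
    "Expression data were analysed with Gemma (RRID:SCR_008007) and models "
    "were deposited in ModelDB (RRID:SCR_007271); Gemma (RRID:SCR_008007) "
    "was also used for meta-analysis."
)
print(find_rrids(methods))
```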


DISCO (service resource)

RRID:SCR_004586

DISCO is an information integration approach designed to facilitate interoperation among Internet resources. It consists of a set of tools and services that allows resource providers who maintain information to share it with automated systems such as NIF. NIF is then able to harvest the information and keep those sets of information up-to-date. How is this accomplished? By using a series of files and/or scripts which are then placed in the root directory of the resource developer's resource. (NIF can also host the files on its servers and crawl for changes there.) Once the files of the resource providers are in place, and DISCO is notified, the DISCO server can then recognize and consume the information shared, providing machine understandable information to NIF Integrator Servers (also known as Aggregators) about your resource. What can DISCO do for my resource?

  • Inform search engines about your resource and keep your NIF Registry resource description up-to-date.
  • Expose your data (semi-structured datasets or fields within your structured database) through NIF's Data Federation; you choose what data will be shared.
  • Create links from an NCBI database (e.g., PubMed, Protein, Nucleotide, etc.) to your data records in NIF using Entrez LinkOut.
  • Advertise your terminology or ontological information.
  • Share your resource's news with the NIF community.


Textpresso (software resource)

RRID:SCR_008737

An information extracting and processing package for biological literature that can be used online or installed locally via a downloadable software package, http://www.textpresso.org/downloads.html. Textpresso's two major elements are (1) access to full text, so that entire articles can be searched, and (2) introduction of categories of biological concepts and classes that relate two objects (e.g., association, regulation, etc.) or describe one (e.g., methods, etc.). A search engine enables the user to search for one or a combination of these categories and/or keywords within an entire literature. The Textpresso project serves the biological and biomedical research community by providing:

  • Full text literature searches of model organism research and subject-specific articles at individual sites. Major elements of these search engines are (1) access to full text, so that the entire content of articles can be searched, and (2) search capabilities using categories of biological concepts and classes that relate two objects (e.g., association, regulation, etc.) or identify one (e.g., cell, gene, allele, etc.). The search engines are flexible, enabling users to query the entire literature using keywords, one or more categories or a combination of keywords and categories.
  • Text classification and mining of biomedical literature for database curation. They help database curators to identify and extract biological entities and facts from the full text of research articles. Examples of entity identification and extraction include new allele and gene names and human disease gene orthologs; examples of fact identification and extraction include sentence retrieval for curating gene-gene regulation, Gene Ontology (GO) cellular components and GO molecular function annotations. In addition they classify papers according to curation needs. They employ a variety of methods such as hidden Markov models, support vector machines, conditional random fields and pattern matches. Our collaborators include WormBase, FlyBase, SGD, TAIR, dictyBase and the Neuroscience Information Framework. We are looking forward to collaborating with more model organism databases and projects.
  • Linking biological entities in PDF and online journal articles to online databases. We have established a journal article mark-up pipeline that links select content of Genetics journal articles to model organism databases such as WormBase and SGD. The entity markup pipeline links over nine classes of objects including genes, proteins, alleles, phenotypes, and anatomical terms to the appropriate page at each database. The first article published with online and PDF-embedded hyperlinks to WormBase appeared in the September 2009 issue of Genetics. As of January 2011, we have processed around 70 articles, to be continued indefinitely. Extension of this pipeline to other journals and model organism databases is planned.

Textpresso is useful as a search engine for researchers as well as a curation tool. It was developed as a part of WormBase and is used extensively by C. elegans curators. Textpresso has currently been implemented for 24 different literatures, among them Neuroscience, and can readily be extended to other corpora of text.
