A portal funded by the National Institutes of Health has added data and more powerful tools to help researchers better understand and treat type 2 diabetes, the most common form of the disease.

Developed in part by a team of scientists and software engineers at the Broad Institute of MIT and Harvard, and the University of Michigan, under NIH’s Accelerating Medicines Partnership, the portal is designed as a central repository for large datasets of human genetic information linked to type 2 diabetes and related traits.

The ultimate goal of the portal, which opened last year, is to serve as a “scientific discovery engine” that the research community can use to help select valid human targets for new diabetes therapies or treatments.

“By putting the data in one place, researchers can actually access it and query it,” says Philip Smith, deputy director of the Division of Diabetes, Endocrinology, and Metabolic Diseases at the National Institute of Diabetes and Digestive and Kidney Diseases. “The more data that’s in the portal, the greater its power.”

The result is a database of DNA sequence, functional and epigenomic information, as well as clinical data from studies on type 2 diabetes and its macro- and micro-vascular complications. The ability to identify and validate changes in DNA that influence onset of type 2 diabetes, disease severity or disease progression is critical.

Smith estimates that the portal currently holds aggregated data from almost 300,000 DNA samples from research supported by NIH and other institutions, including data from Asian and European collaborators, as it seeks to expand international content.

“We continually accrue data, so by next December we plan to have doubled the number of exomes,” he says, revealing that the portal has 26,000 exomes from individuals with diabetes across five major ethnic groups. “We can begin to understand differences between ethnic and racial subgroups, which might provide specific targets.”

Previously, the data was accessible only to approved researchers, while others could view aggregate results. Now, a Google account is all that is needed to use the portal, which is also available in Spanish.

In addition, the portal contains expanded data and search capabilities to accelerate the pace of scientific discovery. Researchers and the public can search for information by gene, genetic variant and region, as well as access summaries of genetic variants and run customized genetic analyses.

“What a user does is simply submit queries and get back answers,” Smith adds. “People who are not necessarily informaticists or geneticists can actually use these tools.”

When it comes to privacy and security protections, he asserts that individual data will remain confidential and behind a firewall. Going forward, Smith says the portal is looking to leverage electronic health records, which he calls a “goldmine” of longitudinal and drug treatment data and an essential element.

“Just getting the data into the system has been a real challenge for us, because there are so many arrangements to make with all of the institutions,” he concludes. “For a typical cohort study, like the one we just did where there were 16 or 17 institutions involved, we have to develop agreements with every one of those institutions, and we have to go through all of the consents for all of the subjects to determine what is shareable and what is not.”

(This article appears courtesy of our sister publication, Health Data Management)
