Genomic data linked with phenotypic data from NAFLD patients show variants that increase risk of comorbidities such as Familial Hypercholesterolemia and Type 1 Diabetes
Although genome sequencing has advanced significantly, we are only beginning to characterize the diverse cohorts of genomes that researchers need to understand the nuances of complex diseases like nonalcoholic fatty liver disease (NAFLD) and nonalcoholic steatohepatitis (NASH). Unmet need among NAFLD patients is high because the disease is complex and accurate biomarkers are difficult to identify.
A pharma company expressed the need for a cohort with genomic sequencing in order to identify genetic risk factors and/or disease biomarkers for NAFLD as part of its research program. Ovation used Datavant tokenization to link deep genomic data with claims-derived clinical data, giving researchers access to whole genome sequencing (WGS) with corresponding clinical characteristics in a cohort of 10 patients.
Data Characterization and Approach:
The sample dataset is a packaged, real-world observational cohort including:
- High-quality WGS data with 50x depth of coverage
- Known common comorbidities, including type 2 diabetes, obesity, hypercholesterolemia, and hypertension
- Biomarkers for each comorbidity and the number of biomarkers found in each sequence (showing propensity for each comorbidity)
- Relevant claims data from a real-world data partner*, linked to the WGS data. The linked real-world data include comorbidity diagnoses, therapeutic exposure, medical procedures, and surgical history
- Other phenotypic data such as liver function tests, Fibrosis-4 (FIB-4) index score, blood counts, lipid profile, and age of onset can also be provided
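For readers unfamiliar with the FIB-4 index mentioned in the list above, here is a minimal sketch of the commonly published formula: age times AST, divided by platelet count times the square root of ALT. The function name, parameter names, and units are assumptions for illustration; they do not describe how the score is stored or computed in this dataset.

```python
import math


def fib4_index(age_years, ast_u_per_l, alt_u_per_l, platelets_1e9_per_l):
    """Return the FIB-4 index from the standard published formula.

    FIB-4 = (age [years] * AST [U/L]) / (platelets [10^9/L] * sqrt(ALT [U/L]))

    The parameter names and units are illustrative assumptions, not
    field names from the dataset.
    """
    if alt_u_per_l <= 0 or platelets_1e9_per_l <= 0:
        raise ValueError("ALT and platelet count must be positive")
    return (age_years * ast_u_per_l) / (
        platelets_1e9_per_l * math.sqrt(alt_u_per_l)
    )


# Made-up example values, not patient data:
# (50 * 40) / (200 * sqrt(25)) = 2000 / 1000
example_score = fib4_index(50, 40, 25, 200)
```

Interpretation thresholds vary by guideline and age group, so they are left out of this sketch.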
This dataset comes pre-packaged with WGS data linked to phenotypic data sourced by Ovation to provide an example turnkey solution. Although this dataset is linked using Datavant tokenization, it can be linked to almost any other phenotypic data that uses one of the commercially available tokenization technologies. Ovation has made 10 genomes from patients with NAFLD freely available for exploration on AWS Data Exchange. AWS used variant annotations from ClinVar and VCFs from this dataset to demonstrate how queries can be run to gain insights from this cohort of 10 patients.
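Conceptually, token-based linkage is a join on a shared de-identified key. The sketch below illustrates that idea only: it does not use or represent Datavant's software, and the `token` field, the record layout, and the sample values are all hypothetical.

```python
def link_by_token(genomic_records, claims_records, key="token"):
    """Inner-join two lists of dicts on a shared de-identified token.

    Hypothetical illustration: real tokenization products generate and
    match tokens with their own tooling; this only shows the join step.
    """
    # Index claims rows by token so each genomic record is matched in O(1).
    claims_by_token = {}
    for claim in claims_records:
        claims_by_token.setdefault(claim[key], []).append(claim)

    linked = []
    for genome in genomic_records:
        for claim in claims_by_token.get(genome[key], []):
            # Merge the two rows; genomic fields win on a name clash.
            linked.append({**claim, **genome})
    return linked


# Made-up placeholder rows, not real tokens or patient data.
genomes = [
    {"token": "tok-001", "vcf_path": "sample_001.vcf.gz"},
    {"token": "tok-002", "vcf_path": "sample_002.vcf.gz"},
]
claims = [
    {"token": "tok-001", "diagnosis": "hypertension"},
    {"token": "tok-001", "diagnosis": "obesity"},
    {"token": "tok-003", "diagnosis": "type 2 diabetes"},
]
linked_rows = link_by_token(genomes, claims)
```

Only records whose token appears on both sides survive the join, which is why a genome without matching claims contributes no linked rows here.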
Analysis shows that all patients have a variant in the APOA2 gene which increases the risk of Familial Hypercholesterolemia
Figure 1 shows, for each patient, the genes that carry Pathogenic or Likely Pathogenic variants and the conditions associated with them. It is interesting to note that in this cohort of patients, who have all been diagnosed with NAFLD, every patient has a variant in the APOA2 gene that increases the risk of Familial Hypercholesterolemia. There is some evidence linking NAFLD with high cholesterol levels and obesity. Several patients in this cohort also have variants that predispose them to Type 1 Diabetes. This example shows how insights can be gained by combining genomic variants with clinical annotations and phenotypes, and how Amazon Omics streamlines this data delivery process. Pharma companies can use this data for biomarker development and validation, target identification, or patient stratification.
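The kind of query behind Figure 1 can be sketched as a filter on ClinVar clinical significance followed by a grouping by patient. This is an illustration under stated assumptions, not the query AWS ran: the row layout, the field names (`patient_id`, `gene`, `clinical_significance`), and the placeholder rows are hypothetical, and the variants are assumed to be already annotated.

```python
from collections import defaultdict

# ClinVar significance labels treated as pathogenic in this sketch.
PATHOGENIC_LABELS = {
    "pathogenic",
    "likely pathogenic",
    "pathogenic/likely pathogenic",
}


def genes_by_patient(annotated_variants):
    """Map each patient to the set of genes with a pathogenic-class variant."""
    result = defaultdict(set)
    for row in annotated_variants:
        label = row["clinical_significance"].strip().lower()
        if label in PATHOGENIC_LABELS:
            result[row["patient_id"]].add(row["gene"])
    return dict(result)


def genes_shared_by_all(per_patient):
    """Return genes that appear for every patient in the mapping."""
    gene_sets = list(per_patient.values())
    return set.intersection(*gene_sets) if gene_sets else set()


# Placeholder rows with made-up gene names, not results from the cohort.
rows = [
    {"patient_id": "P1", "gene": "GENE_A", "clinical_significance": "Pathogenic"},
    {"patient_id": "P1", "gene": "GENE_B", "clinical_significance": "Benign"},
    {"patient_id": "P2", "gene": "GENE_A", "clinical_significance": "Likely pathogenic"},
    {"patient_id": "P2", "gene": "GENE_C", "clinical_significance": "Pathogenic"},
]
per_patient = genes_by_patient(rows)
shared = genes_shared_by_all(per_patient)
```

A patient with no pathogenic-class variants never enters the mapping, so a real analysis would need to account for such patients separately before concluding that a gene is shared by the whole cohort.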
*Thanks to our partners at AWS Data Exchange for helping with this analysis.