Background: Cytosine-guanine (CpG) sites in DNA molecules can be identified as methylated or unmethylated; their combination along the genetic sequence of an individual constitutes a methylation haplotype (methyl-haplotype) for a specific locus. The insulin gene promoter (IGP) is highly regulated by methylation mechanisms, which alter gene expression.
Aim: To identify IGP methyl-haplotypes among children/adolescents with type 1 diabetes (T1D) and to develop a predictive model for the classification of cases and controls, using Next-Generation Sequencing (NGS) methyl-haplotypes as biomarkers.
Patients-Methods: DNA was extracted from peripheral whole blood of 40 participants (20 with T1D and 20 healthy age- and gender-matched controls), and the IGP region was sequenced by NGS; sequence reads were analysed from FASTQ files. A Python-based pipeline for targeted deep bisulfite-sequenced amplicons (ampliMethProfiler) was applied to estimate methylation status. The methylation profile was recorded at 10 CpG sites proximal to the transcription start site of the IGP (site 1/-357, site 2/-345, site 3/-234, site 4/-206, site 5/-180, site 6/-135, site 7/-102, site 8/-69, site 9/-19, site 10/+60). Each site was coded as 0 (unmethylated) or 1 (methylated). A single read covering the 10 CpG sites could therefore yield the methyl-haplotype "1111111111" (all methylated), "0000000000" (all unmethylated) or any other combination. The generated methyl-haplotypes were tested as predictive biomarkers in five different classifiers (Random Forest, Support Vector Machine Radial and Linear, Generalized Linear Regression, Linear Discriminant Analysis). Predictive models were evaluated with Receiver Operating Characteristic analysis under 10-fold cross-validation; their performance was assessed by accuracy, sensitivity and specificity, each computed as the mean over 100 repetitions of a random split of the dataset into training and test sets.
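The 0/1 coding of reads described above can be sketched as follows. This is a minimal illustration, not the ampliMethProfiler implementation: the function names `encode_read` and `count_haplotypes`, the input layout (one list of ten 0/1 calls per read) and the example reads are all assumptions made for the sketch.

```python
from collections import Counter

# CpG positions relative to the transcription start site, as listed in the abstract.
CPG_POSITIONS = [-357, -345, -234, -206, -180, -135, -102, -69, -19, 60]


def encode_read(calls):
    """Turn one read's per-site calls (1 = methylated, 0 = unmethylated)
    into a methyl-haplotype string such as "1110101110"."""
    if len(calls) != len(CPG_POSITIONS):
        raise ValueError("expected one call per CpG site")
    if any(c not in (0, 1) for c in calls):
        raise ValueError("calls must be 0 or 1")
    return "".join(str(int(c)) for c in calls)


def count_haplotypes(reads):
    """Count how many reads in one sample carry each methyl-haplotype."""
    return Counter(encode_read(r) for r in reads)


# Hypothetical reads for a single sample (illustrative values only).
example_reads = [
    [1] * 10,
    [0] * 10,
    [1, 1, 1, 0, 1, 0, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 0, 1, 1, 1, 0],
]
example_counts = count_haplotypes(example_reads)
```

With ten binary sites there are 2^10 = 1024 possible methyl-haplotypes, so per-sample counts of this kind form a sparse feature vector.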
Results: A total of 469 different methyl-haplotypes were recorded. After normalization of the features by the number of reads, three distinct methyl-haplotypes, "1110101110", "1110111110" and "1111111100", were more closely associated with T1D than with controls (Wilcoxon test P-values: 0.00018, 0.00032 and 0.00095, respectively); these were then used as predictors to train the five classifiers. The Support Vector Machine Radial achieved the best accuracy (0.82±0.09) and a balanced performance between the two categories, with sensitivity 0.86±0.12 and specificity 0.77±0.15.
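The normalization step and the reported metrics can be illustrated with a short sketch. It assumes that normalization means dividing each methyl-haplotype count by the sample's total number of reads, which the abstract does not state explicitly, and it treats T1D as the positive class. The function names and the example labels are hypothetical.

```python
def relative_frequencies(counts):
    """Divide each haplotype count by the sample's total read count
    (assumed normalization; the abstract gives no exact formula)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {hap: n / total for hap, n in counts.items()}


def classification_metrics(y_true, y_pred, positive="T1D"):
    """Return (accuracy, sensitivity, specificity) for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs) if pairs else float("nan")
    # Sensitivity: fraction of true cases that were predicted as cases.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of true controls that were predicted as controls.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity


# Hypothetical labels for one train/test split (illustrative only).
y_true = ["T1D", "T1D", "T1D", "control", "control", "control"]
y_pred = ["T1D", "T1D", "control", "control", "control", "T1D"]
acc, sens, spec = classification_metrics(y_true, y_pred)
```

Averaging these three values over repeated random splits would give means and standard deviations of the kind reported above.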
Conclusions: Because methylation quantification approaches cannot reflect the complexity of the methylation substrate, methyl-haplotypes describe the epigenetic profile of an individual in a more holistic manner. Methylation-based biomarkers, such as the IGP methyl-haplotypes 1110101110, 1110111110 and 1111111100, could help identify individuals at high risk of β-cell failure.
19 - 21 Sep 2019
European Society for Paediatric Endocrinology