• NGS accelerates detection of RE insertions as part of a comprehensive testing strategy.
• 37 unique RE insertions were identified in 10 cancer predisposition genes, including 17 in BRCA2.
• Overall, 211 individuals who had hereditary cancer testing were found to carry an RE insertion.
Cancer risks have been previously reported for some retrotransposon element (RE) insertions; however, detection of these insertions is technically challenging and very few oncogenic RE insertions have been reported. Here we evaluate RE insertions identified during hereditary cancer genetic testing using a comprehensive testing strategy. Individuals who had single-syndrome or pan-cancer hereditary cancer genetic testing from February 2004 to March 2017 were included. RE insertions were identified using Sanger sequencing, Next Generation Sequencing, or multiplex quantitative PCR, and further characterized using targeted PCR and sequencing analysis. Personal cancer history, ancestry, and haplotype were evaluated. A total of 37 unique RE insertions were identified in 10 genes, affecting 211 individuals. BRCA2 accounted for 45.9% (17/37) of all unique RE insertions. Several RE insertions were detected with high frequency in populations of conserved ancestry wherein up to 100% of carriers shared a high degree of haplotype conservation, suggesting founder effects. Our comprehensive testing strategy resulted in a substantial increase in the number of reported oncogenic RE insertions, several of which may have possible founder effects. Collectively, these data show that the detection of RE insertions is an important component of hereditary cancer genetic testing and may be more prevalent than previously reported.
Retroelements (REs), also known as retrotransposons, make up more than a third of the human genome, with Long INterspersed Element-1 (LINE-1 or L1) and Alu elements being the most common. There are approximately 500,000 L1 elements and 1.1 million Alu elements, comprising 17% and 11% of human genomic sequence, respectively. SINE-VNTR-ALU (SVA) elements are a rare type of RE that comprises only about 0.2% of human genomic sequence. Collectively, these REs are active areas of research as they can “jump” by retrotransposition, which may cause disease by inserting a copy of the element into critical regions of genes to disrupt transcription, splicing, or translation.
To date, approximately 100 RE insertions associated with human disease have been reported, several of which have been detected in cancer risk genes such as APC, BRCA1, BRCA2, MSH2, and NF1. For example, an Alu insertion in exon 3 of BRCA2 (c.156_157insAlu) has been identified as a founder mutation of Portuguese origin that confers a high risk for hereditary breast and ovarian cancer syndrome (HBOC). Pathogenic variants such as these, when identified during clinical genetic testing, may impact medical management based on professional society guidelines.
Many genetic testing methods utilized in hereditary cancer testing cannot reliably detect RE insertions due to technical limitations. For example, array comparative genomic hybridization (aCGH) probes bind only to known target sequence and cannot detect inserted foreign sequence. Multiplex ligation-dependent probe amplification (MLPA) may detect insertions, but only if the point of insertion is located at or near the MLPA probe ligation site. Historically, RE insertions had been detected by our laboratory using multiplex quantitative PCR (qPCR), Sanger sequencing, or Southern blot analysis. Sanger sequencing is capable of detecting RE insertions, but is limited in its ability to detect larger insertions (e.g. L1 elements). This is due to the preferential amplification of the smaller wild-type allele and the comparatively less favorable amplification of the larger mutant allele. In contrast, RE detection by qPCR is less impacted by amplification bias because the presence of a large insertion can appear as a decrease in dosage that triggers additional scrutiny. That is, the qPCR assay indicates the presence of a possible large rearrangement, but additional targeted PCR and sequence analyses are needed to confirm and characterize RE insertions. In light of the technical challenges associated with these various platforms, RE insertions are likely underreported.
In recent years, Next-Generation Sequencing (NGS) has been incorporated into clinical genetic testing for hereditary cancer risk. NGS allows high-throughput analysis of multiple genes simultaneously and the ability to detect a variety of mutation types and variant allele frequencies. Due to improved coverage, NGS has been successfully utilized to detect both sequence variants and large rearrangements. As such, NGS assays can more consistently reveal the presence of RE insertions relative to traditional genetic testing methods. However, thorough clinical testing still requires optimization of NGS data analysis, confirmation, and proper characterization of detected RE insertions with complementary methods to appropriately assess the functional impact.
The aim of this study was to assess the incidence of RE insertions among individuals for whom hereditary cancer genetic testing was performed. This was done by applying a comprehensive testing strategy to detect, confirm, and assess pathogenicity of RE insertions identified as part of hereditary cancer testing. Analyses of personal cancer history, ancestry, and haplotype were also performed for individuals with pathogenic RE insertions.
Methods and materials
Patient population
This analysis included individuals from the United States of America who underwent clinical hereditary cancer genetic testing (Myriad Genetic Laboratories, Salt Lake City, UT) between February 2004 and March 2017. Patients were consented for clinical genetic testing and all patient information was obtained from provider-completed test request forms (TRF). No information was obtained from the patient or provider for research purposes.
The majority of individuals were ascertained for a clinical suspicion of hereditary cancer risk. Clinical genetic testing included both single-syndrome testing and pan-cancer panel testing. Single-syndrome testing was available over the full time period. NGS panel testing was available starting in September 2013. Individuals who underwent single-site testing for a known family variant or single-gene testing in any of the genes included in single-syndrome or panel testing were also included in the analysis.
Personal cancer history was obtained from the TRF and had specific options for no history of cancer, as well as the diagnosis of breast cancer, endometrial cancer, ovarian cancer, colon/rectal cancer, and colon/rectal adenomas. Write-in fields were available for other cancers. For this analysis, individuals with polyps were considered affected with colorectal cancer. Self-identified ancestry was also indicated on the TRF and included specific options for Western/Northern Europe, Central/Eastern Europe, Africa, Near East/Middle East, Ashkenazi, Latin America/Caribbean, Asia, and Native American with an open text field for other ancestries. Individuals who checked multiple ancestry fields were recorded as “multiple ancestries”. For this analysis, individuals who indicated Western/Northern Europe, Central/Eastern Europe, or both were considered as European.
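The ancestry recoding rules described above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the laboratory's actual data pipeline; the function name `recode_ancestry` and the label "None specified" for an empty selection are assumptions made for the example.

```python
# Illustrative sketch of the ancestry recoding rules stated in the text.
# The function name and the "None specified" label are assumptions.
EUROPEAN = {"Western/Northern Europe", "Central/Eastern Europe"}


def recode_ancestry(checked):
    """Collapse the ancestry boxes checked on a test request form into one label."""
    selected = set(checked)
    if not selected:
        return "None specified"
    # One or both European boxes (and nothing else) count as European.
    if selected <= EUROPEAN:
        return "European"
    # Any other combination of two or more boxes is recorded as multiple.
    if len(selected) > 1:
        return "Multiple ancestries"
    return next(iter(selected))


both_european = recode_ancestry(["Western/Northern Europe", "Central/Eastern Europe"])
mixed = recode_ancestry(["Africa", "Western/Northern Europe"])
single = recode_ancestry(["Ashkenazi"])
```

Note that the European rule is checked before the multiple-ancestry rule, so an individual who checks both European boxes is recorded as European rather than as having multiple ancestries.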
Genetic testing
Single-syndrome testing was available for HBOC (BRCA1, BRCA2), Lynch syndrome (MLH1, MSH2, MSH6, PMS2, EPCAM), and FAP/MYH-associated polyposis (APC, MUTYH) over the full time period (February 2004–March 2017). The pan-cancer panel included all of the genes included in single-syndrome testing with the addition of ATM, BARD1, BMPR1A, BRIP1, CDH1, CDK4, CDKN2A (p14ARF and p16INK4a), CHEK2, GREM1, NBN, PALB2, POLD1, POLE, PTEN, RAD51C, RAD51D, SMAD4, STK11, and TP53. The panel was available starting in September 2013 and included all listed genes except for POLD1, POLE, and GREM1, which were included starting in July 2016. Testing included sequencing and large rearrangement (LR) analysis of all genes, except the POLD1 and POLE genes, for which only sequencing of the exonuclease domains was performed, and the EPCAM and GREM1 genes, for which only LR analysis was performed. Pathogenic variants (PVs) are those that received a laboratory classification of Deleterious Mutation or Suspected Deleterious Mutation. REs that insert into critical gene regions (i.e. coding regions or splice sites) are thought to cause a pathogenic effect and are classified accordingly.
Sequencing analysis for panel testing utilizing NGS has been described previously. In brief, germline DNA was extracted and purified from blood or saliva. Target DNA was enriched by controlled mixing of microdroplets containing DNA and PCR reaction mix with microdroplets containing PCR primer sets (RainDance Technologies, Billerica, MA). A custom primer library was utilized to cover gene regions of interest with an average of >5 amplicons per exon. Gene regions of interest were then amplified and sequenced using Illumina HiSeq 2500 (Illumina Inc., San Diego, CA). Single-syndrome and single-site sequencing analyses were performed using Sanger sequencing.
LR analysis was performed using NGS dosage analysis in pan-cancer gene panel testing. The qPCR assay was incorporated as part of single-syndrome testing in 2006 for HBOC testing and in 2008 for Lynch syndrome testing.
Apparent single exon deletions detected by NGS or qPCR may be attributed to the presence of a true deletion, the presence of an insertion, or a technical artifact of testing. The presence of a single nucleotide polymorphism (SNP) under a qPCR or NGS primer can inhibit primer hybridization and subsequent amplification of the allele carrying the SNP, causing the artifactual appearance of a single exon deletion. The NGS and qPCR assays used here were amplification-based assays and, as such, a large insertion on one allele will look like an apparent deletion due to (1) preferential amplification of the wild-type allele since it produces a smaller product and (2) the inability (or significantly reduced ability) to produce the larger mutant product under standard amplification conditions. Once the affected region was determined to be free of any interfering SNPs, and confirmatory analyses did not demonstrate the presence of a deletion, a long-range PCR was run to confirm the presence of an RE insertion. The resulting products were subsequently sequenced to determine the exact point of insertion and to evaluate pathogenicity. Single-site testing for RE insertions was performed using a targeted PCR assay developed specifically for the known familial insertion identified. Additional characterization of the identified RE insertions is denoted in Table 1 and includes determination of the identity of the RE insertion subfamily via annotation with RepeatMasker.
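The follow-up logic for an apparent single exon deletion can be summarized as a small decision function. The sketch below is a simplified reading of the workflow described above, not the laboratory's software: the names `Call` and `triage_apparent_deletion`, the boolean inputs, and the `UNRESOLVED` fallback are all assumptions introduced for illustration.

```python
# Simplified sketch of the triage described in the text; all names are hypothetical.
from enum import Enum


class Call(Enum):
    ARTIFACT_SNP = "technical artifact (SNP under primer)"
    TRUE_DELETION = "true deletion"
    RE_INSERTION = "RE insertion (confirm insertion point by sequencing)"
    UNRESOLVED = "unresolved; further analysis needed"


def triage_apparent_deletion(snp_under_primer, deletion_confirmed, long_range_pcr_larger_product):
    """Classify an apparent single exon deletion seen by qPCR or NGS dosage analysis."""
    # Step 1: rule out an interfering SNP under a primer.
    if snp_under_primer:
        return Call.ARTIFACT_SNP
    # Step 2: confirmatory analysis may show a genuine deletion.
    if deletion_confirmed:
        return Call.TRUE_DELETION
    # Step 3: long-range PCR yielding a larger product points to an insertion.
    if long_range_pcr_larger_product:
        return Call.RE_INSERTION
    # Assumed fallback when none of the above explains the dosage drop.
    return Call.UNRESOLVED


example_call = triage_apparent_deletion(
    snp_under_primer=False,
    deletion_confirmed=False,
    long_range_pcr_larger_product=True,
)
```

In this sketch the order of the checks mirrors the order in which the text describes them; a positive long-range PCR result would still need sequencing to locate the insertion point and assess pathogenicity.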
b Sequencing results for this patient are limited. BRCA2 c.7075_7076insAlu is presumed to have an insertion of an Alu element based on the size of the insertion product and the presence of a poly A sequence identified in the available data.
c PMS2 c.804-60_804-59insSVA has been previously reported
Haplotype analysis was performed for the most common RE insertions, which were in BRCA2 and ATM. Gene-level haplotypes were estimated computationally using BEAGLE version 4.1 to account for genetic data obtained using a variety of technologies (qPCR, NGS dosage analysis, targeted PCR). These computations were limited to individuals separated by third degree or more as determined from the available information. When pedigree information was provided, related individuals were excluded from haplotype analysis. Individuals were also excluded from haplotype analysis if they underwent single-site testing, which is only performed for individuals with a known familial mutation.
To enable haplotype estimation for two or more samples and to improve accuracy, we used a two-stage haplotyping design. The first stage used phased genotype data from 1000 Genomes Phase 3 data to establish haplotypes for common SNPs. 1000 Genomes Phase 3 data includes individuals from 26 populations that represent 5 super-populations: African, American, East Asian, European and South Asian. The second stage used a combination of phased (common SNPs) and unphased (Alu inserts) variants from patient samples to assign each Alu present in two or more individuals to a BRCA2 or ATM haplotype.
Haplotype analysis for individuals of all ancestries was combined. Although this may confound the analysis, this approach was appropriate here given the limitations of the available ancestry data. These limitations include the fact that 25% of tested individuals did not report any ancestry and an additional 7% of tested individuals reported multiple ancestries. In addition, the gene regions were all less than 150 kb in length. In short regions, linkage disequilibrium between markers is strong while the number of common markers is small. This restricts the total number of possible haplotypes at that locus. This combined ancestry analysis allows the sample size of phased reference haplotypes and clinical samples to be maximized. Phased 1000 Genomes data for 2504 individuals representing 5 super-populations were included as a reference panel for the clinical samples.
Results
Detection of RE insertions
Here, we present RE insertions detected by single-gene, single-syndrome, or pan-cancer clinical genetic testing. Overall, 37 unique, pathogenic RE insertions were identified during clinical genetic testing. This includes 34 Alu insertions, 2 L1 insertions, and 1 SVA insertion. Additional details for the identified RE insertions are described in Table 1, with sequence information in Supplementary Table S1. Two RE insertions detected here have been previously reported in the literature: the Portuguese founder mutation in BRCA2 (c.156_157insAlu) and PMS2 c.804-60_804-59insSVA (Table 1). Detected Alu insertions ranged from about 300 to 500 bp in size while the L1 and SVA insertions were between 1.5 kb and 6 kb.
Testing of our patient cohort spans many years and the LR analysis platforms have changed over time; however, the LR analysis platforms incorporated into single-gene or single-syndrome testing (qPCR, Southern Blot analysis) and pan-cancer panel testing (NGS dosage analysis) are all able to detect insertions. In some of our earliest cases, the RE insertion was detected directly by Sanger sequencing analysis (n = 6) or Southern Blot analysis (n = 1). However, we have observed an improvement in the detection of RE insertions over time due to the increased sensitivity afforded by qPCR and NGS dosage analysis. This increase in sensitivity is due, in small part, to an increase in coverage; however, the primary source of improvement is from the ability of these platforms to detect exon-specific decreases in dosage to trigger additional analysis. As such, the majority of the RE insertions were detected by NGS dosage analysis as part of pan-cancer panel testing (n = 16), qPCR as part of single-syndrome testing (n = 6), or both (n = 6; Table 1).
Representative evidence of one Alu insertion identified during panel testing, BRCA2 c.5007_5008insAlu, is shown in Figure 1. NGS dosage analysis results show decreased dosage in two amplicons covering exon 11, making the insertion appear as a partial exon deletion (Figure 1A). Confirmatory targeted PCR was designed to characterize the affected region in more depth and shows an insertion of approximately 300 bp within exon 11 (Figure 1B). Sequencing of the resulting products permitted identification of the inserted Alu element, a portion of which is provided in Supplementary Table S1. The pathogenicity of this variant was evaluated by determining the exact point of insertion using sequencing analysis (Figure 1C). Targeted PCR was used to confirm that this insertion occurs within the exon, leading to a frame-shift mutation that may cause exon skipping and nonsense-mediated decay of mRNA. As such, the variant was classified as pathogenic (Table 1).
Figure 1. Representative evidence of BRCA2 c.5007_5008insAlu. A) Alu insertion detected by NGS. Red squares represent the average dosage of several overlapping amplicons for each exon. Exons at normal dosage align to 2 on the Y-axis, 1 for deleted exons, and 3 for duplicated exons. The enlarged view shows decreased dosage in amplicons covering exon 11, making the insertion appear as a deletion. B) Scheme (left) and results (right) of targeted PCR analysis designed to characterize the large insertion in the suspected region. Lane 4 contains the patient sample, which demonstrates the presence of a larger fragment (i.e. the allele carrying the Alu insertion) in exon 11. C) Sequencing data showing the point of Alu insertion in exon 11 (highlighted in blue). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Overall, the comprehensive testing strategy employed here has identified RE insertions in 211 tested individuals (0.012%) from this cohort. This includes 72 (34%) individuals who had pan-cancer panel testing, where large rearrangement analysis was performed using NGS dosage analysis (Table 2). Seventy-two individuals with RE insertions had single-syndrome testing (Table 2), including 70 (97.2%) who were tested for HBOC and 2 (2.8%) who were tested for Lynch syndrome. An additional 67 patients with RE insertions had single-site testing for a known familial mutation using targeted PCR and sequencing analysis, or single-gene testing using Southern Blot and Sanger sequencing. Despite the fact that pan-cancer panel testing was only available for a portion of the time period assessed here, both panel and single-syndrome testing identified the same number of individuals with RE insertions. In addition, the prevalence of RE insertions among individuals who had panel testing (0.017%) is more than double the prevalence for single-syndrome testing (0.006%). This is likely due to the additional coverage across all genes on NGS as well as an increase in the overall number of genes included on the panel.
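The prevalence figures quoted above follow directly from the counts of tested individuals and carriers reported in Table 2. The short Python sketch below reproduces that arithmetic; the variable and function names are illustrative only.

```python
# Reproduce the prevalence arithmetic from the Table 2 counts.
# (individuals tested, individuals with an RE insertion) per test type
COUNTS = {
    "Panel": (415_584, 72),
    "Single-syndrome": (1_251_788, 72),
    "Single-site": (146_982, 67),
}


def prevalence_pct(tested, carriers):
    """Carriers as a percentage of tested individuals."""
    return 100.0 * carriers / tested


total_tested = sum(tested for tested, _ in COUNTS.values())
total_carriers = sum(carriers for _, carriers in COUNTS.values())

for name, (tested, carriers) in COUNTS.items():
    print(f"{name}: {prevalence_pct(tested, carriers):.3f}%")
print(f"Total: {prevalence_pct(total_tested, total_carriers):.3f}%")

# Ratio of panel prevalence to single-syndrome prevalence.
panel_vs_single_syndrome = (
    prevalence_pct(*COUNTS["Panel"]) / prevalence_pct(*COUNTS["Single-syndrome"])
)
```

Because the same number of carriers (72) was found by panel and single-syndrome testing, the prevalence ratio reduces to the ratio of the two tested populations, which is about 3.0 and therefore consistent with "more than double".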
Table 2. Summary of individuals identified as carrying a pathogenic RE insertion

| Test type | Individuals tested, N | Male (a) | Female (a) | With RE insertion, N (%) | Male (b) | Female (b) | Personal history of cancer (c) | No personal history of cancer (c) | Age at cancer diagnosis (d), years |
|---|---|---|---|---|---|---|---|---|---|
| Panel (e) | 415,584 | 13,431 | 402,152 | 72 (0.017%) | 3 | 69 | 42 | 29 | 49.98 (10.69) |
| Single-syndrome | 1,251,788 | 46,684 | 1,196,869 | 72 (0.006%) | 1 | 70 | 54 | 14 | 45.36 (10.13) |
| Single-site | 146,982 | 29,820 | 116,110 | 67 (0.046%) | 12 | 55 | 16 | 47 | 43.14 (9.57) |
| Total | 1,814,354 | 89,935 | 1,715,131 | 211 (0.012%) | 16 | 194 | 112 | 90 | 46.87 (10.51) |

a 9288 individuals did not indicate a gender (1 panel testing, 8235 single-syndrome testing, 1052 single-site testing).
b 1 individual with an RE insertion did not indicate a gender (single-syndrome testing).
c 9 individuals with an RE insertion did not specify whether they had a personal history of cancer (1 panel testing, 4 single-syndrome testing, 4 single-site testing).
d Age at first cancer diagnosis was used for individuals with multiple cancer diagnoses.
e Panel testing was available starting in September 2013.
The majority of patients carrying RE insertions (194) were female, which is consistent with the overall testing population (Table 2). Of the 16 men identified as carrying an RE insertion, 12 underwent single-site testing for the specific familial insertion mutations.
RE insertions were identified in 10 of the 26 genes tested for large rearrangements (Table 3). RE insertions were most common in BRCA2 and ATM, with 17 (45.9%) and 6 (16.2%) unique insertions, respectively. The remaining RE insertions occurred in APC, BARD1, BRCA1, CHEK2, MLH1, MSH2, PALB2, and PMS2. Specifically, panel testing resulted in the identification of RE insertions in ATM (6), BARD1 (1), BRCA1 (1), BRCA2 (9), CHEK2 (1), MLH1 (1), PALB2 (2), and PMS2 (1). Nearly half of the unique RE insertions identified with panel testing occurred in genes not included in single-syndrome testing.
Table 3. RE insertions identified during hereditary cancer testing
This includes pathogenic variants detected in all 28 genes included in testing.
37
211
1361
10,631
~1:325
a The total number of pathogenic variants includes sequencing mutations and large rearrangement mutations classified as pathogenic during the time period of this analysis and includes RE insertions.
b Includes BRCA2 c.156_157insAlu, which has been previously reported
Overall, RE insertions accounted for approximately 1 in every 325 unique PVs detected in this clinical testing cohort. However, the gene specific prevalence of RE insertions varied greatly. The 17 insertions in BRCA2 accounted for 149/211 (70.6%) individuals identified as carrying an RE insertion. We determined that about 1 in 185 unique PVs in BRCA2 identified during genetic testing were RE insertions (Table 3). By comparison, only about 1 in 640 PVs in BRCA1 were RE insertions, despite the fact that both genes were available for testing over the full study period. As such, the high rate of RE insertions detected in BRCA2 cannot be attributed solely to the high proportion of individuals who had single-syndrome HBOC testing during the entirety of the study period. Several additional genes (ATM, BARD1, and PALB2) also showed a high ratio of RE insertion PVs; however, this may be an artifact of the relatively small numbers of total PVs that have been observed in these genes.
Personal cancer history
Overall, 112/211 (53.1%) individuals found to carry RE insertions had been diagnosed with at least one cancer at the time of testing (Table 2). Table 4 shows the incidence of cancer by gene and mutation type (RE insertion vs other PV). The overall incidence of cancer represents aggregate clinical information from the three genetic test types: pan-cancer panel, single-syndrome, and single-site. The incidence of cancer will be influenced by the test type (i.e. low incidence of cancer among individuals undergoing single-site testing for a known familial mutation). However, the distribution of pathogenic variants identified by panel, single-syndrome, and single-site testing is similar for RE insertions and other types of PVs (data not shown). As such, a comparison of the aggregate clinical information is appropriate.
Table 4. Personal cancer history among individuals with an RE insertion or other pathogenic variant

| Variant type | Breast cancer, N (%) | Colon cancer, N (%) | Other cancer (a), N (%) | No personal history of cancer, N (%) |
|---|---|---|---|---|
| RE insertions, total (b) | 88 (39.6) | 6 (2.7) | 38 (17.1) | 90 (40.5) |
| Other pathogenic variants, total (b) | 61,484 (33.9) | 9742 (5.4) | 42,130 (23.3) | 67,778 (37.4) |

a Includes all individuals with a personal cancer history other than breast or colon cancer or a cancer that was not specified on the test request form.
b Totals for all 28 genes included in testing. For pathogenic variants other than RE insertions, this includes additional contributions from BMPR1A, BRIP1, CDH1, CDKN2A, EPCAM, MSH6, MUTYH, NBN, PTEN, RAD51C, SMAD4, STK11, and TP53.
The most common cancer diagnosis among individuals found to carry an RE insertion was breast cancer (39.6%), which is consistent with the overall testing population (33.9%; Table 4). In addition, only individuals with an RE insertion in a breast cancer risk gene reported a history of breast cancer. The incidence of colon cancer among individuals with an RE insertion (2.7%) was lower than that observed among individuals with other types of PVs (5.4%; Table 4). However, colon cancer was primarily reported among individuals with an RE insertion in a colon-cancer risk gene (MLH1, MSH2, PMS2). As such, the lower incidence of colon cancer among individuals with an RE insertion is likely related to the decreased incidence of RE insertions in colon cancer-risk genes rather than a decreased penetrance among RE mutation carriers. Among individuals with a personal diagnosis of cancer, the average age at diagnosis was 46.9 years (Table 2). This is consistent with the average age at diagnosis for the overall testing cohort (data not shown).
Ancestry
The distribution of RE insertions according to self-identified ancestry is shown in Table 5 and is compared to the total percentage of PVs for each ancestry. Overall, 32.7% of individuals found to carry an RE insertion were of European ancestry. This is much lower than the proportion of all PVs identified among individuals of European ancestry in the overall testing cohort (56.7%; Table 5). In contrast, 16.1% of RE insertions were found in individuals of African ancestry, which is substantially higher than the proportion of all PVs identified in this population (4.8%). A higher proportion of RE insertions was also identified in individuals of Latin American/Caribbean ancestry (14.7%) relative to all PVs (6.5%; Table 5). The proportion of RE insertions and all PVs identified in individuals of other ancestries were not substantially different (Table 5).
Table 5. Ancestry distribution among all patients with an RE insertion
Ancestry
Individuals with an RE Insertion (N)
Percentage of Individuals with an RE Insertion (%)
Among the 37 unique RE insertions, several were detected with high frequency in populations of conserved ancestry. This includes five Alu insertions in BRCA2 that collectively account for 56.4% (119/211) of all individuals found to carry an RE insertion here. The well-known Portuguese founder mutation, BRCA2 c.156_157insAlu, was detected in 30 individuals (Table 6). Consistent with previous reports, 66.7% of individuals with this PV reported Latin American/Caribbean ancestry. Half of the individuals with this PV reported a history of breast cancer, as would be expected for PVs in BRCA2 (Table 6).
Table 6. Ancestry and personal/family cancer history for individuals with possible founder mutations in BRCA2 and ATM

| Variant / Ancestry | N (%) | Female N (%) | Breast Cancer N (%) | Ovarian Cancer N (%) | Other Cancers N (%) | Unaffected N (%) | None Specified N (%) |
|---|---|---|---|---|---|---|---|
| BRCA2 c.156_157insAlu (N = 30) | | 28 (93.3) | 15 (50.0) | 0 | 0 | 15 (50.0) | 0 |
| Latin American/Caribbean Only | 19 (63.3) | | | | | | |
| Latin American/Caribbean and ≥1 Other Ancestry | 1 (3.3) | | | | | | |
| European Only | 4 (13.3) | | | | | | |
| Multiple Ancestries | 1 (3.3) | | | | | | |
| None specified | 5 (16.7) | | | | | | |
| BRCA2 c.3407_3408insAlu (N = 50) | | 50 (100.0) | 25 (50.0) | 2 (4.0) | 2 (4.0) | 16 (32.0) | 5 (10.0) |
| African Only | 30 (60.0) | | | | | | |
| African and ≥1 Other Ancestry | 5 (10.0) | | | | | | |
| Native American | 2 (4.0) | | | | | | |
| Latin American/Caribbean | 1 (2.0) | | | | | | |
| None specified | 12 (24.0) | | | | | | |
| BRCA2 c.5007_5008insAlu (N = 5) | | 5 (100.0) | 2 (40.0) | 0 | 0 | 3 (60.0) | 0 |
| Latin American/Caribbean | 4 (80.0) | | | | | | |
| Latin American/Caribbean and ≥1 Other Ancestry | 1 (20.0) | | | | | | |
| BRCA2 c.9327_9328insAlu (N = 9) | | 8 (88.9) | 2 (22.2) | 2 (22.2) | 0 | 5 (55.6) | 0 |
| European | 3 (33.3) | | | | | | |
| European and ≥1 Other Ancestry | 2 (22.2) | | | | | | |
| None specified | 4 (44.4) | | | | | | |
| BRCA2 c.9451_9452insAlu (N = 25) | | 22 (88.0) | 9 (36.0) | 0 | 1 (4.0) | 14 (56.0) | 1 (4.0) |
| European | 18 (72.0) | | | | | | |
| European and ≥1 Other Ancestry | 1 (4.0) | | | | | | |
| None specified | 6 (24.0) | | | | | | |
| ATM c.7374_7375insAlu (N = 27) | | 26 (96.3) | 9 (33.3) | 2 (7.4) | 5 (18.5) | 10 (37.0) | 1 (3.7) |
| European | 17 (63.0) | | | | | | |
| Native American | 1 (3.7) | | | | | | |
| None specified | 9 (33.3) | | | | | | |

Note: Cancer diagnosis only considers the first onset of cancer and is not adjusted for multiple cancer types and recurrences.
An Alu insertion in exon 11 of BRCA2 (c.3407_3408insAlu) was identified in 50 patients. This includes 30 individuals (60.0%) who reported only African ancestry and an additional 5 (10.0%) individuals who reported multiple ancestries, at least one of which was African (Table 6). The remaining individuals with this RE insertion reported Native American ancestry (4.0%), Latin American/Caribbean ancestry (2.0%), or did not specify any ancestry (24.0%). Another Alu insertion in exon 11 (BRCA2 c.5007_5008insAlu) was identified in 5 individuals, all of whom indicated full or partial Latin American/Caribbean ancestry. Again, the personal cancer histories of individuals with these RE insertions in exon 11 are consistent with BRCA2-associated cancer risks (Table 6).
There were two common Alu insertions identified in exon 25 of BRCA2, which is part of the DNA binding domain. BRCA2 c.9327_9328insAlu was identified in nine individuals, five of whom reported full or partial European ancestry (Table 6). The remaining four individuals with this insertion did not report any ancestry. Similarly, BRCA2 c.9451_9452insAlu was observed in 25 individuals, 19 (76.0%) of whom reported European ancestry. The remaining individuals with this Alu insertion did not specify an ancestry (24.0%).
The most common RE insertion in ATM was c.7374_7375insAlu in exon 49. Twenty-seven individuals were found to carry this Alu insertion, including 17 individuals (63.0%) who reported European ancestry (Table 6). The remaining individuals with this RE insertion reported Native American ancestry (3.7%) or did not specify any ancestry (33.3%). Individuals with ATM c.7374_7375insAlu reported personal cancer histories that included breast and ovarian cancer (Table 6).
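The carrier counts for the six common Alu insertions can be cross-checked against the totals quoted in the text with simple arithmetic. The sketch below uses only counts reported in Table 6 and the overall total of 211 carriers; the variable names are illustrative.

```python
# Cross-check of carrier tallies for the six common Alu insertions (counts from Table 6).
CARRIERS = {
    "BRCA2 c.156_157insAlu": 30,
    "BRCA2 c.3407_3408insAlu": 50,
    "BRCA2 c.5007_5008insAlu": 5,
    "BRCA2 c.9327_9328insAlu": 9,
    "BRCA2 c.9451_9452insAlu": 25,
    "ATM c.7374_7375insAlu": 27,
}
ALL_RE_CARRIERS = 211  # all individuals found to carry an RE insertion

# Carriers of the five common BRCA2 insertions.
brca2_common_total = sum(n for variant, n in CARRIERS.items() if variant.startswith("BRCA2"))
# Carriers of any of the six common insertions (BRCA2 plus ATM).
six_common_total = sum(CARRIERS.values())
# Share of all RE insertion carriers accounted for by the five BRCA2 insertions.
brca2_common_share_pct = round(100 * brca2_common_total / ALL_RE_CARRIERS, 1)
```

The five BRCA2 insertions sum to 119 carriers (56.4% of 211), and all six insertions together sum to 146 carriers.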
Haplotype analysis of the common Alu insertions
Given the strong conservation of ancestry for these insertions, possible founder effects were assessed by haplotype analysis of 95 of the 146 individuals carrying one of the six common Alu insertions in BRCA2 or ATM. This analysis showed that all 14 lineages with the Portuguese founder mutation c.156_157insAlu shared a single haplotype with two SNPs: rs144848 and rs9534262 (Table 7, Supplementary Table S2). BRCA2 c.3407_3408insAlu was found mostly in individuals of African ancestry, and all 40 lineages with this RE insertion shared a single haplotype (Table 7, Supplementary Table S3). Similarly, a single haplotype was observed for BRCA2 c.5007_5008insAlu in five lineages, all of which reported Latin American/Caribbean ancestry (Table 7, Supplementary Table S4).
Table 7Summary of haplotypes for the BRCA2 founder mutations
The SNPs in bold represent the ones that differentiate the distinct haplotypes. All other SNPs are shared by all apparently unrelated lineages. The detail haplotype is shown in Supplementary Tables S2–S7.
The two Alu insertions in BRCA2 exon 25 were identified primarily among individuals who reported European ancestry. Both lineages carrying BRCA2 c.9327_9328insAlu shared a single haplotype (Table 7, Supplementary Table S5). However, 3 different haplotypes were observed among the 11 lineages carrying BRCA2 c.9451_9452insAlu, the most common of which (rs1799943-rs1801406-rs9534262-c.9451_9452insAlu) was observed in five lineages (Table 7, Supplementary Table S6). These data suggest that the two Alu insertions in exon 25 might have arisen independently on different haplotypes.
ATM c.7374_7375insAlu was also identified primarily among individuals who reported European ancestry, and showed greater haplotype diversity. Overall, four different haplotypes were observed among 23 lineages with this Alu insertion (Table 7, Supplementary Table S7). The most common haplotype was shared by 16 lineages.
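The haplotype findings above can be summarized as the share of lineages that carry the most common haplotype for each insertion. The sketch below is illustrative only: it uses the lineage counts stated in the text, the data structure and function name are assumptions, and it does not reproduce the haplotype analysis itself.

```python
# Each entry: (lineages analyzed, lineages carrying the most common haplotype),
# taken from the counts given in the text above.
LINEAGES = {
    "BRCA2 c.156_157insAlu": (14, 14),
    "BRCA2 c.3407_3408insAlu": (40, 40),
    "BRCA2 c.5007_5008insAlu": (5, 5),
    "BRCA2 c.9327_9328insAlu": (2, 2),
    "BRCA2 c.9451_9452insAlu": (11, 5),
    "ATM c.7374_7375insAlu": (23, 16),
}


def dominant_share(total, top):
    """Percentage of lineages that carry the most common haplotype."""
    return round(100 * top / total, 1)


for insertion, (total, top) in LINEAGES.items():
    share = dominant_share(total, top)
    print(f"{insertion}: {top}/{total} lineages ({share}%)")
```

A share of 100% is consistent with a single shared haplotype, whereas the lower shares for BRCA2 c.9451_9452insAlu (about 45.5%) and ATM c.7374_7375insAlu (about 69.6%) reflect the multiple haplotypes reported for those insertions.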
Discussion
Previous estimates predicted that RE insertions account for 0.1%–0.16% of human genetic diseases. However, the detection of RE insertions is technically challenging and the contribution of these mutations to disease may be underestimated. Here, we utilized a comprehensive testing strategy to detect and subsequently characterize RE insertions as part of genetic testing of 28 cancer predisposition genes. In total, 37 unique pathogenic RE insertions were identified as part of hereditary cancer testing. Although the REs themselves are not pathogenic, follow-up testing showed that these elements inserted into critical gene regions to cause a pathogenic effect.
Only 11 insertions have been previously reported in any of the 28 genes evaluated here. The large proportion of novel RE insertions reported in this study is likely due to the limitations of conventional assays. For example, MLPA is commonly used to identify large rearrangements during genetic testing; however, most MLPA assays place 1–2 probes per exon. As a result, this technique does not provide full coverage of the gene and the detection of REs that insert into coding regions is limited. In contrast, our internally designed, amplification-based qPCR and NGS assays cover all coding regions of the genes included here and can improve detection of REs that insert within exonic and limited flanking intronic regions. This is exemplified by the fact that three-quarters of the unique RE insertions reported here were identified by qPCR and/or NGS dosage analysis. Our comprehensive testing approach was also able to detect some large L1 elements and SVA elements of 1–6 kb. The relatively low prevalence of L1 and SVA elements identified here may in part be due to the difficulty in amplifying such large mutant products as well as the rarity of these REs in the human genome relative to Alu elements.
Previous estimates suggest that RE insertions account for approximately 1 in 600 (0.16%) pathogenic variants; however, we show that RE insertions may be much more common. RE insertions accounted for approximately 1 in 325 (0.3%) pathogenic variants in the cancer risk genes tested in this study. In addition, the incidence of RE insertions was much higher in some genes. For example, RE insertions accounted for 1 in 185 unique pathogenic variants in BRCA2 compared to only 1 in 640 for BRCA1 and 1 in 1165 for APC. The high ratio of RE insertions in BRCA2 may be explained by the larger cDNA size of BRCA2 relative to BRCA1, which would simply provide more real estate for these insertions to occur. In addition, it is possible that RE insertions in some genes are difficult to detect (e.g., large L1 or SVA insertions). However, the high frequency of insertions in BRCA2 also suggests that this gene may be a “hot spot” for oncogenic Alu insertions.
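The "1 in N" frequencies quoted above can be converted to percentages and compared directly. The following sketch is a simple arithmetic illustration that uses only figures stated in the text; the helper name and dictionary labels are hypothetical.

```python
def one_in_to_percent(n):
    """Convert a '1 in n' frequency to a percentage."""
    return 100.0 / n


# 'RE insertions per pathogenic variant', expressed as 1 in N (values from the text).
RATES = {
    "previous estimate": 600,
    "this study, all genes": 325,
    "BRCA2": 185,
    "BRCA1": 640,
    "APC": 1165,
}

for label, n in RATES.items():
    print(f"{label}: 1 in {n} = {one_in_to_percent(n):.2f}%")

# Relative enrichment of RE insertions in BRCA2 compared with BRCA1.
brca2_vs_brca1 = RATES["BRCA1"] / RATES["BRCA2"]
print(f"BRCA2 vs BRCA1: {brca2_vs_brca1:.1f}-fold")
```

On these figures, the proportion of pathogenic variants that are RE insertions is roughly 3.5 times higher in BRCA2 than in BRCA1.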
RE insertions were identified in 211 individuals who had hereditary cancer genetic testing by our laboratory. Although this includes some individuals who were tested for a known familial mutation, 144 individuals had single-syndrome or pan-cancer panel testing based on clinical suspicion of hereditary cancer risk. Notably, NGS pan-cancer panel testing accounted for a third of all detected RE insertions despite being available for only 3.5 years of the study period. Furthermore, the prevalence of RE insertions among individuals who had NGS panel testing (0.017%) was more than double that observed for single-syndrome testing (0.006%). This increased prevalence is partially accounted for by the addition of new genes as part of panel testing. In addition, it reflects improved gene coverage using NGS relative to other methods.
RE insertions were identified in ten cancer-risk genes, most commonly in the breast and ovarian cancer risk gene, BRCA2. The 17 different Alu insertions identified in BRCA2 accounted for 46% of all unique findings and 71% of all individuals with an RE insertion. While this in part reflects the large proportion of individuals who had single-syndrome HBOC testing, RE insertions were also identified in several genes associated with an increased risk of breast cancer that were not included in single-syndrome HBOC testing (ATM, BARD1, CHEK2, PALB2). Notably, RE insertions in ATM were most common after BRCA2, despite the fact that this gene was more recently included as part of panel testing. RE insertions were also identified in several genes associated with colorectal cancer, including the Lynch syndrome genes MLH1, MSH2, and PMS2 as well as APC. Individuals with pathogenic variants in these genes are recommended for increased screening and, in some cases, consideration of prophylactic surgery. Overall, the clinical presentation of individuals with pathogenic RE insertions in these genes was similar to those with other types of pathogenic variants in the same genes, suggesting similar clinical risk. This highlights the importance of not only identifying RE insertions in cancer risk genes, but accurately characterizing these mutations in order to appropriately assign pathogenicity.
The data presented here also provide evidence that pathogenic RE insertions in cancer predisposition genes may have a founder effect and be enriched in certain populations. Among RE insertion mutation carriers, there was a high proportion of individuals of African and Latin American/Caribbean ancestry (Table 5). In addition, several RE insertions were detected in individuals of conserved ancestry, including the Portuguese founder mutation. BRCA2 c.3407_3408insAlu was found almost exclusively in individuals with at least partial African ancestry and haplotype analysis showed 40 lineages with a single haplotype. There is also evidence that BRCA2 c.9451_9452insAlu and ATM c.7374_7375insAlu may be European founder mutations. BRCA2 c.5007_5008insAlu shows preliminary evidence that it may be a founder mutation in individuals of Latin American ancestry; however, the limited number of observations here warrants additional research on this Alu insertion. It should be noted that haplotype analysis was only possible for the most common Alu insertions by pooling individuals of all ancestries. This included analysis for only BRCA2 and ATM among the ten genes with RE insertions identified here.
In summary, our data show that RE insertions in cancer risk genes have likely been historically underreported due to the technical limitations of many genetic testing methods. Although pathogenic RE insertions in cancer-risk genes are relatively rare, the detection of these insertions may impact medical management decisions in appropriate patients. This is especially true for individuals who may have a higher risk of this type of mutation based on ancestry. As such, thorough analysis and characterization of regions exhibiting evidence of an insertion are an essential part of a comprehensive genetic testing strategy in order to ensure quality patient care.
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Conflict of interest
All authors are employees of Myriad Genetic Laboratories, Inc. and receive compensation that can include salaries and stock options.
Supplementary data
The following is the supplementary data to this article: