Chimeric transcripts observed in non-canonical FGFR2 fusions with partner genes' breakpoint located in intergenic region in intrahepatic cholangiocarcinoma
• Four gene-intergenic FGFR2 fusions were identified in 493 ICC samples.
• The four non-canonical FGFR2 fusions can generate chimeric transcripts.
• The transcripts were characterized by exon 17 of 5′ FGFR2 fusing with exon 2 of the 3′ partner.
• Combining DNA-NGS with RNA-NGS may make up for omissions caused by DNA-NGS alone.
Abstract
Intrahepatic cholangiocarcinoma (ICC) is a fatal bile duct cancer with a dismal prognosis and limited therapeutic options. FGFR family fusions have been identified in many diseases, and FGFR2 fusion is a validated oncogenic driver in ICC. A variety of fusion forms have been reported to date, including gene-gene, gene-intergenic, and intergenic-intergenic fusions. Here, by performing RNA and DNA sequencing, we found FGFR2 fusions in 10.1% of ICC samples, including 4 gene-intergenic fusions. We confirmed that these non-canonical rearrangements can generate chimeric transcripts, and we used a conventional splicing mechanism to explain the event. Our study suggests a possible targeted therapy for these 4 patients and a possible analysis scheme for similar cases.
]. With the development of sequencing technology, FGFR2 fusions have been frequently identified in ICC as oncogenic drivers, making them attractive diagnostic biomarkers and therapeutic targets [
]. Fibroblast growth factor receptors (FGFRs) are transmembrane tyrosine kinase receptors that are important in many biological processes. Each consists of three extracellular immunoglobulin domains, a transmembrane domain and a cytoplasmic tyrosine kinase domain [
]. FGFR gene alterations result in aberrant activation of intracellular signaling pathways associated with cellular proliferation and survival, and trigger oncogenic activity [
]. FGFR fusions are of two types. In the first, the FGFR gene is the 3’ partner; the extracellular and transmembrane domains are excluded from the fusion protein, and only the kinase domain is retained, linked to the 5’ fusion partner. The activation mechanism of this fusion type is promoter switching. In the second, more common type, the FGFR gene is the 5’ partner, with a breakpoint usually found in exons 17-19, so that the extracellular, transmembrane and kinase domains remain intact [
]. Oncogenic mechanisms of this fusion type include the introduction of domains that facilitate dimerization, and escape from microRNA regulation caused by loss of the 3’-untranslated region (3’-UTR) [
]. A large number of FGFR2 fusions in ICC belong to the second fusion type. Importantly, clinical studies have shown that FGFR2 fusion-positive ICC patients can benefit from some FGFR inhibitors [
Fusion genes are generated through different chromosomal rearrangements and abnormal transcription. Chromosomal rearrangements are classified into four types: translocation, deletion, inversion, and tandem duplication [
]. DNA-level fusion breakpoints may occur in intragenic (exons and introns) and/or intergenic regions, which generates different fusion types, including gene-gene, gene-intergenic, and intergenic-intergenic [
]. In many cases, in-frame gene-gene fusions can generate chimeric mRNAs, and a 5’-3’ gene fusion product is usually considered functional. However, many rearrangements identified by DNA sequencing show breakpoints located in intergenic regions; these have unknown transcriptional consequences and are generally ignored in clinical diagnostics. Recently, studies revealed that a gene-intergenic fusion may lead to a gene-gene fusion at the RNA level [
], which is likely to be a functional chimeric transcript. The generation of this chimeric transcript may be attributable to a splicing mechanism.
In addition to the conventional methods (FISH, RT-PCR and IHC), next-generation sequencing (NGS) is most commonly used for fusion detection: DNA-based NGS covers exons, introns and intergenic regions, whereas RNA-based NGS surveys only spliced exons [
]. However, the two NGS methods differ in some respects. DNA sequencing can analyze various alteration types simultaneously and characterize both known and unknown fusions, but large introns and repetitive sequences interfere with detection. Furthermore, it cannot determine whether a fusion gene is expressed. RNA sequencing can quantify fusion transcripts, but it may be hampered by RNA quality and quantity [
In our study, we identified 4 gene-intergenic fusions in intrahepatic cholangiocarcinoma patients using DNA-based NGS. In each case, one breakpoint lies in an intron of the 5’ gene FGFR2 and the other in the intergenic region upstream of the 3’ gene. An RNA-based NGS assay confirmed that these non-canonical fusions can generate chimeric transcripts in which exon 1 of the 3’ gene is spliced out. The FGFR2 fusion results obtained by DNA-based and RNA-based NGS are mostly consistent, but some factors still lead to differences.
Results
Fusion detection and diversity of FGFR2 partners
By DNA-based NGS, we observed 59 high-confidence rearrangements involving FGFR2 (RNA fusion positive samples) in 50 of 493 ICC samples (Table S1), for an overall frequency of 10.1%. We found 33 different fusion partners in total; the most common partners were BICC1 (24, 40.7%), KIAA1217 (3, 5.1%), VCL (3, 5.1%), TNIP3 (2, 3.4%), CTNNA3 (2, 3.4%), WAC (2, 3.4%) and GOLGB1 (2, 3.4%) (Fig. 1). Furthermore, 42 of the 59 FGFR2 fusions contained the full 5’ end of FGFR2, and 11 of them had reciprocal fusions for FGFR2 (Table S2).
Fig. 1 Proportion of the most common partners fusing with FGFR2. The chart shows 7 partners, among which BICC1 has the highest frequency.
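The reported percentages can be re-derived from the counts given above. The following is a minimal sketch, assuming that partner percentages are taken over the 59 rearrangements and the overall frequency over the 493 samples; the helper name `percent` is illustrative and not part of the study's pipeline.

```python
# Counts as reported in the text above.
PARTNER_COUNTS = {
    "BICC1": 24, "KIAA1217": 3, "VCL": 3, "TNIP3": 2,
    "CTNNA3": 2, "WAC": 2, "GOLGB1": 2,
}
TOTAL_REARRANGEMENTS = 59   # high-confidence FGFR2 rearrangements
POSITIVE_SAMPLES = 50       # samples carrying at least one of them
TOTAL_SAMPLES = 493         # ICC samples analyzed


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Share of each common partner among all FGFR2 rearrangements.
partner_percent = {
    gene: percent(count, TOTAL_REARRANGEMENTS)
    for gene, count in PARTNER_COUNTS.items()
}
# Share of ICC samples with an FGFR2 rearrangement.
overall_frequency = percent(POSITIVE_SAMPLES, TOTAL_SAMPLES)
```

The seven listed partners account for 38 of the 59 rearrangements; the remainder is spread over the other 26 partners.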
Detection difference of DNA and RNA sequencing of FGFR2 fusions
RNA sequencing was performed to verify the accuracy of the FGFR2 fusions identified by DNA sequencing. To analyze the consistency of fusions detected by DNA-based and RNA-based NGS, we counted the positive and negative FGFR2 fusion results across the 493 samples. With RNA sequencing as the reference, the sensitivity and specificity of DNA sequencing were 84.78% and 99.11%, respectively (calculation shown in Materials and methods), which was sufficient to ensure the accuracy of detection. Table 1 shows that most positive and negative results of the two methods were consistent.
Table 1 Positive and negative samples identified by different methods.
                  R-seq positive    R-seq negative
D-seq positive    39                4
D-seq negative    7                 443
Abbreviations: R-seq = RNA sequencing, D-seq = DNA sequencing.
Note: Samples Cli-I-002, Cli-I-147, Cli-I-450N and Cli-I-469N (Supplementary Table S1) each contain FGFR fusions that are positive by both DNA and RNA sequencing as well as FGFR fusions that are positive by only one of the two methods; we counted these samples as double positives.
Gene-intergenic fusion can generate chimeric transcript
Among the 493 samples, we identified 4 non-canonical FGFR2 fusions in which the partner gene is located downstream of the DNA breakpoint. In all 4 rearrangements, FGFR2 provides exons 1-17, with the DNA breakpoint located in intron 17. Three of the partner breakpoints were observed in intergenic regions, and 1 was observed in the first intron of SFI1, which can only generate a 5’-5’ format fusion (Fig. 2). Generally speaking, fusions formed through gene-intergenic rearrangement may be disregarded. However, RNA-based analysis demonstrated that these four FGFR2-intergenic fusions can generate chimeric transcripts: FGFR2-VCL, FGFR2-BAIAP2L1, FGFR2-CBX5 and FGFR2-EIF4ENIF1. Furthermore, all 4 chimeric mRNAs are formed from exons 1-17 of the 5’ gene FGFR2 and the 3’ partner starting from its second exon. The FGFR2-VCL fusion is formed by inversion, and the FGFR2-BAIAP2L1, FGFR2-CBX5 and FGFR2-EIF4ENIF1 fusions by translocation.
Fig. 2 IGV snapshots and schematic diagrams showing the breakpoints of the fusion genes in 4 samples, as detected by DNA-based and RNA-based NGS. In each of panels A, B, C and D, the top shows the DNA fusion in IGV and the middle shows the RNA fusion in IGV, with the left half of each IGV view showing FGFR2 and the right half showing the partner. Gray bars are sequencing reads that match the reference genome, whereas colored bars are mismatching reads that come from the corresponding partner. The schematic diagram at the bottom describes the DNA fusion and the RNA fusion. Lightning icons indicate breakpoints and gray arrows indicate the direction of transcription. (A) FGFR2-VCL fusion. (B) FGFR2-BAIAP2L1 fusion. (C) FGFR2-CBX5 fusion. (D) FGFR2-EIF4ENIF1 fusion.
Hypothesis of gene-intergenic fusion splicing model
Fig. 3 shows our hypothesis for the transcriptional products of gene-intergenic fusion. The first exon of the 3’ gene is skipped because it lacks the canonical AG sequence upstream of the exon. The sequence between the splice donor provided by the 5’ gene and the splice acceptor of exon 2 can be regarded as one intron (termed the putative intron). Therefore, a gene-intergenic fusion at the DNA level would usually generate a chimeric transcript consisting of the 5’ gene and the 3’ partner without its exon 1.
Fig. 3 Schema of the hypothesis for products of gene-intergenic fusion. Double slashes indicate omitted DNA sequence, black rectangles indicate exons, and black straight lines indicate introns. The remaining shapes are explained in the diagram.
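The hypothesis can be expressed as a simple rule on exon lists. The sketch below is an illustration only: the function name, the exon labels, and the partner exon count are hypothetical, and real transcripts would also have to satisfy the reading-frame and splice-signal conditions, which are not modeled here.

```python
def predict_chimeric_exons(gene5: str, last_exon5: int,
                           gene3: str, n_exons3: int) -> list:
    """Predict the exon order of the chimeric transcript under the
    splicing hypothesis: the 5' gene contributes exons 1..last_exon5,
    and the 3' partner contributes exons 2..n_exons3, because its
    exon 1 has no upstream splice acceptor and is skipped."""
    if last_exon5 < 1:
        raise ValueError("the 5' gene must contribute at least one exon")
    if n_exons3 < 2:
        raise ValueError("the 3' partner needs an exon 2 to supply an acceptor")
    head = [f"{gene5}:exon{i}" for i in range(1, last_exon5 + 1)]
    tail = [f"{gene3}:exon{i}" for i in range(2, n_exons3 + 1)]
    return head + tail


# FGFR2 retains exons 1-17 (from the text); the partner name and its
# exon count of 5 are placeholders for illustration.
transcript = predict_chimeric_exons("FGFR2", 17, "PARTNER", 5)
```

In this toy case the junction lies between `FGFR2:exon17` and `PARTNER:exon2`, which mirrors the exon 17 to exon 2 junction described for all four fusions.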
Discussion
FGFR2 fusion is a biomarker for the diagnosis and therapy of intrahepatic cholangiocarcinoma, so its detection is crucial. In our study, FGFR2 fusions were found in 10.1% of ICC samples, which is comparable to previous studies [
Molecular detection and clinicopathological characteristics of advanced/recurrent biliary tract carcinomas harboring the FGFR2 rearrangements: a prospective observational study (PRELUDE Study).
]. Targeted DNA-based sequencing offers a comprehensive tool for detecting all types of oncogenic alterations, including some structural variants. However, because DNA rearrangements are often complex and assay designs have limitations, it is plausible that some important gene fusions and rearrangements are not accurately detected by DNA-based sequencing, or that the sequencing results are difficult to annotate and interpret [
High yield of RNA Sequencing for targetable kinase fusions in lung adenocarcinomas with no mitogenic driver alteration detected by DNA sequencing and low tumor mutation burden.
]. In some samples, DNA-based and RNA-based fusion results disagree. DNA sequencing is our current mainstream method, while an RNA sequencing fusion assay should be an optional method for screening actionable fusions in cases negative for common drivers [
Molecular detection and clinicopathological characteristics of advanced/recurrent biliary tract carcinomas harboring the FGFR2 rearrangements: a prospective observational study (PRELUDE Study).
High yield of RNA Sequencing for targetable kinase fusions in lung adenocarcinomas with no mitogenic driver alteration detected by DNA sequencing and low tumor mutation burden.
Besides omissions caused by detection methods, gene-intergenic fusions such as those found in our research are often ignored because their transcriptional consequences are unknown. The genomic breakpoint cannot predict the breakpoint of the fusion transcript, and in theory a breakpoint in an intergenic region may or may not produce a chimeric mRNA [
A novel intergenic region between CENPA and DPYSL5-ALK Exon 20 fusion variant responding to crizotinib treatment in a patient with lung adenocarcinoma.
]. An intergenic breakpoint identified by DNA sequencing is therefore an unreliable predictor of the breakpoint at the transcript level. Our study found that FGFR2-intergenic fusions can also generate chimeric transcripts, which we attribute to suitable splice sites: the splice donor (GT) is provided by FGFR2 at the end of exon 17 and the splice acceptor (AG) is provided by the partner. These fusions belong to the downstream-intergenic-breakpoint subtype. Splicing is complex: in addition to the 5’ and 3’ splice sites, the branch point sequence is also essential for intron removal (Fig. 3), because the spliceosome acts by recognizing these three core sequences. Furthermore, cis-acting elements located within introns and exons, referred to as intronic or exonic splicing enhancers or silencers, strongly affect splicing efficiency [
]. This mechanism can explain why the chimeric mRNA always joins exon 17 of FGFR2 to exon 2 of the partner and skips exon 1 of the partner: the sequence in the putative intron that provides 3’ splice site recognition may be located in the first intron of the fusion partner.
ICC triggered by FGFR2 fusion can be explained through the following two aspects. Most FGFR2 fusion proteins share the FGFR2 N-terminus and retain an intact kinase domain. ICC patients with 3’-UTR-truncated FGFR2 transcripts exhibited higher RNA levels than those with wild-type FGFR2 transcripts. Meanwhile, C-terminal truncation of FGFR2 has shown transforming ability in many cancers [
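The GT/AG condition described above can be checked on a candidate intron sequence. This is a minimal sketch under strong simplifications: it tests only the two dinucleotides and ignores the branch point and the splicing enhancers or silencers mentioned in the text. The function names and the toy sequences are hypothetical.

```python
def has_canonical_splice_sites(intron: str) -> bool:
    """True if the sequence starts with the GT donor dinucleotide and
    ends with the AG acceptor dinucleotide (DNA alphabet, any case)."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def exon_has_upstream_acceptor(upstream_flank: str) -> bool:
    """True if the bases immediately upstream of an exon end in AG,
    i.e. the exon could be joined to an upstream donor."""
    return upstream_flank.upper().endswith("AG")


# Toy sequences, not taken from any real locus.
putative_intron = "GTAAGTCCTTTCTCAG"   # donor ... acceptor before exon 2
flank_before_exon1 = "CCGGCGCTCC"      # promoter-like flank without AG
```

Under the hypothesis, a first exon preceded by promoter or intergenic sequence fails the acceptor check and is skipped, whereas the stretch running from the FGFR2 donor to the partner's exon 2 acceptor passes the GT/AG check and behaves like a single intron.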
Aberrant receptor internalization and enhanced FRS2-dependent signaling contribute to the transforming activity of the fibroblast growth factor receptor 2 IIIb C3 isoform.
]. In our study, all 4 FGFR2 fusions lacked the last exon, which satisfies the pathogenic conditions mentioned above. Moreover, diverse FGFR fusion partners contribute domains with known dimerization motifs, including coiled-coil, SAM, LisH, BAR, SPFH, and caspase [
], which may abnormally activate downstream pathways through constitutive receptor autophosphorylation. Oligomerization may serve as the common mechanism of activation of FGFR fusion proteins [
]. Among the four partner genes in our study, a large number of studies have shown that BAIAP2L1, which has a BAR motif, has a dimerization function, whereas the structure of EIF4ENIF1 has hardly been analyzed. We used the SMART and UniProt websites to predict the domains of VCL and CBX5, and the results showed that amino acids 354-393 and 567-596 of VCL form coiled-coil domains. The most common dimerization motifs were not observed in CBX5; however, the chromo shadow domain of CBX5 has been reported to form a homodimer [
All assays, no matter how well designed, have inherent gaps due to technical and biological limitations. In some clinical cases, testing by multiple methodologies is needed to address these gaps and ensure the most accurate molecular diagnosis. Our study may bring new hope to these 4 patients, who could try FGFR2 inhibitors for treatment. Pemigatinib has received FDA approval for the treatment of advanced or metastatic ICC with FGFR2 fusion. Most importantly, our work proposes a hypothesis for the transcriptional processing of FGFR2 gene-intergenic fusions. Gene-intergenic fusions deserve attention, because they can be a key factor in deciding whether to use targeted drugs. Our findings point to the potential oncogenic function of intergenic fusions and highlight the wide-ranging consequences of structural rearrangements in cancer genomes.
Materials and methods
Sample information
Four hundred and ninety-three FFPE tumor tissues from ICC patients were analyzed, regardless of the patients' age and gender. All samples were histologically examined and confirmed to contain at least 20% tumor cells.
Targeted NGS experiments and data analysis approach
Genomic DNA was extracted from each clinical sample with the QIAamp DNA FFPE Tissue Kit (Qiagen) and quantified by PicoGreen fluorescence assay (Invitrogen). 50-200 ng of DNA was fragmented to ∼200 bp by sonication (Covaris) and constructed into sequencing libraries with the KAPA Hyper Prep Kit (Kapa Biosystems) according to the manufacturer's protocol. A 381-cancer-gene panel was used to capture the targeted regions of the DNA. The DNA libraries were then sequenced on an Illumina NextSeq 550 instrument with paired-end 100 bp reads.
Sequenced reads were mapped to the human reference genome (hg19/GRCh37) using BWA version 0.7.12. Duplicated reads were removed using Picard version 1.130.
Gene rearrangements were identified using an in-house script. Briefly, spanning read pairs with an insert size larger than 10 kb or with mates mapped to different chromosomes were extracted from the de-duplicated BAM file and clustered according to mapped genomic position. A rearrangement was called within a cluster if it contained at least 3 unique read pairs with the same clipped position and read orientation. The clipped fragments were then cross-validated at each breakpoint to make sure the read pairs supported a real rearrangement. Rearrangements were annotated to genes as follows. First, breakpoints were mapped to the reference genomic annotation to determine the original arrangement type: 5’-5’, 5’-3’, 3’-3’, 5’-intergenic, 3’-intergenic or intergenic-intergenic. Then, for 5’-5’ and 5’-intergenic rearrangements, we searched for a transcription start site within 50 kb downstream of the 5’ or intergenic breakpoint that could provide the 3’ gene region, and re-annotated the rearrangement as 5’-3’, skipping the first exon of the 3’ gene. Finally, only 5’-3’ rearrangements with non-frameshift chimeric transcripts were reported as candidate gene fusions.
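The in-house script is not published, so the following is only a sketch of the calling rule as described: keep read pairs that are discordant (insert size above 10 kb or mates on different chromosomes), group them by clipped breakpoint and orientation, and call a rearrangement when at least 3 unique read pairs agree. The `ReadPair` record, the grouping key, and all coordinates are assumptions made for illustration; the cross-validation of clipped fragments and the annotation steps are not modeled.

```python
from collections import defaultdict
from typing import NamedTuple


class ReadPair(NamedTuple):
    """Hypothetical summary of one de-duplicated read pair."""
    name: str
    chrom1: str
    chrom2: str
    insert_size: int     # absolute template length; unused if chromosomes differ
    breakpoint: tuple    # ((chrom, pos), (chrom, pos)) from the clipped position
    orientation: str     # e.g. "+-"


MIN_INSERT_SIZE = 10_000   # 10 kb threshold from the text
MIN_UNIQUE_PAIRS = 3       # minimum supporting read pairs from the text


def is_discordant(pair: ReadPair) -> bool:
    """Pair spans a large distance or maps to two chromosomes."""
    return pair.chrom1 != pair.chrom2 or pair.insert_size > MIN_INSERT_SIZE


def call_rearrangements(pairs) -> dict:
    """Map (breakpoint, orientation) to its number of unique supporting
    read pairs, keeping only groups that reach MIN_UNIQUE_PAIRS."""
    support = defaultdict(set)
    for pair in pairs:
        if is_discordant(pair):
            # A set of read names counts each pair once.
            support[(pair.breakpoint, pair.orientation)].add(pair.name)
    return {key: len(names) for key, names in support.items()
            if len(names) >= MIN_UNIQUE_PAIRS}


# Made-up example: three pairs support one breakpoint, one pair is
# concordant, and one interchromosomal pair lacks further support.
bp_a = (("chr10", 1_000_000), ("chr10", 48_000_000))
bp_b = (("chr10", 1_000_000), ("chr22", 5_000_000))
example_pairs = [
    ReadPair("r1", "chr10", "chr10", 47_000_000, bp_a, "+-"),
    ReadPair("r2", "chr10", "chr10", 47_000_000, bp_a, "+-"),
    ReadPair("r3", "chr10", "chr10", 47_000_000, bp_a, "+-"),
    ReadPair("r3", "chr10", "chr10", 47_000_000, bp_a, "+-"),  # same name, counted once
    ReadPair("n1", "chr10", "chr10", 300, bp_a, "+-"),         # concordant, ignored
    ReadPair("x1", "chr10", "chr22", 0, bp_b, "+-"),           # single support only
]
calls = call_rearrangements(example_pairs)
```

Grouping directly on the clipped breakpoint is a simplification of the position-based clustering described above; a real implementation would likely tolerate small positional differences between reads.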
For RNA-seq, total RNA was isolated from the FFPE samples using the ReliaPrep™ FFPE Total RNA Miniprep System (Promega) according to the manufacturer's protocol. The QIAseq Stranded Total RNA Lib Kit (Qiagen) was used to generate cDNA and construct RNA-seq libraries.
Targeted RNA sequencing was performed. Sequencing reads were aligned to hg19 using STAR version 2.6.1d [
] and gene fusions were identified using STAR-Fusion version 1.5.0 with default parameters. The filtering criteria were as follows: (1) three or more supporting reads, including both split reads and spanning reads; (2) FusionInspector (https://github.com/FusionInspector/FusionInspector) was used to inspect the evidence supporting the predicted fusions.
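Criterion (1) can be sketched as a filter over fusion predictions. Two assumptions are made here and should be checked against the actual pipeline: that the criterion means at least one split read and at least one spanning fragment with three or more in total, and that the predictions carry the `JunctionReadCount` and `SpanningFragCount` fields used in STAR-Fusion's tabular output. The fusion names and counts below are placeholders.

```python
MIN_SUPPORTING_READS = 3  # threshold from the text


def passes_read_filter(row: dict) -> bool:
    """Keep a prediction if both evidence types are present and the
    combined support reaches MIN_SUPPORTING_READS (assumed reading of
    criterion 1)."""
    junction = int(row["JunctionReadCount"])   # split (junction) reads
    spanning = int(row["SpanningFragCount"])   # spanning fragments
    return (junction > 0 and spanning > 0
            and junction + spanning >= MIN_SUPPORTING_READS)


# Placeholder predictions; counts are invented for illustration.
predictions = [
    {"#FusionName": "GENEA--GENEB", "JunctionReadCount": "12", "SpanningFragCount": "5"},
    {"#FusionName": "GENEC--GENED", "JunctionReadCount": "4", "SpanningFragCount": "0"},
    {"#FusionName": "GENEE--GENEF", "JunctionReadCount": "1", "SpanningFragCount": "1"},
]
kept = [row["#FusionName"] for row in predictions if passes_read_filter(row)]
```

Criterion (2), the FusionInspector review, is an inspection step and is not represented in this sketch.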
Sensitivity and specificity
The sensitivity and specificity of the DNA-based NGS assay were determined using the 493 ICC samples, with RNA sequencing as the reference. True positives (TPs) are samples in which an FGFR2 fusion was detected by both DNA sequencing and RNA sequencing. True negatives (TNs) are samples in which an FGFR2 fusion was detected by neither DNA sequencing nor RNA sequencing. False positives (FPs) are samples in which an FGFR2 fusion was detected by DNA sequencing but not by RNA sequencing. False negatives (FNs) are samples in which an FGFR2 fusion was detected by RNA sequencing but not by DNA sequencing. Sensitivity was calculated as TP/(TP + FN), and specificity was calculated as TN/(TN + FP).
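Applied to the counts in Table 1, these formulas reproduce the reported values. The snippet below is a small worked check; the variable and function names are illustrative.

```python
# Counts from Table 1, with RNA sequencing as the reference.
TP = 39    # D-seq positive, R-seq positive
FP = 4     # D-seq positive, R-seq negative
FN = 7     # D-seq negative, R-seq positive
TN = 443   # D-seq negative, R-seq negative


def sensitivity(tp: int, fn: int) -> float:
    """TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """TN / (TN + FP)."""
    return tn / (tn + fp)


sens_pct = round(100 * sensitivity(TP, FN), 2)   # 39 / 46
spec_pct = round(100 * specificity(TN, FP), 2)   # 443 / 447
total_samples = TP + FP + FN + TN
```

The four cells sum to 493, matching the number of ICC samples, and the two percentages round to 84.78% and 99.11%.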
Declaration of Competing Interest
No potential conflicts of interest were disclosed.
Molecular detection and clinicopathological characteristics of advanced/recurrent biliary tract carcinomas harboring the FGFR2 rearrangements: a prospective observational study (PRELUDE Study).
High yield of RNA Sequencing for targetable kinase fusions in lung adenocarcinomas with no mitogenic driver alteration detected by DNA sequencing and low tumor mutation burden.
A novel intergenic region between CENPA and DPYSL5-ALK Exon 20 fusion variant responding to crizotinib treatment in a patient with lung adenocarcinoma.
Aberrant receptor internalization and enhanced FRS2-dependent signaling contribute to the transforming activity of the fibroblast growth factor receptor 2 IIIb C3 isoform.