- Karyotype analysis has a great impact on the diagnosis, treatment and prognosis of hematologic neoplasms.
- Identification and characterization of chromosomes is a challenging process that requires experienced personnel.
- The presented novel laboratory approach using a convolutional neural network (CNN) saves time and reduces mistakes in the karyotyping workflow.
Karyotype analysis has a great impact on the diagnosis, treatment and prognosis of hematologic neoplasms. The identification and characterization of chromosomes is a challenging process that requires experienced personnel. Artificial intelligence provides novel support tools. However, their safe and reliable application in diagnostics needs to be evaluated. Here, we present a novel laboratory approach to identify chromosomes in cancer cells using a convolutional neural network (CNN). The CNN identified the correct chromosome class for 98.8% of chromosomes, which led to a time saving of 42% for the karyotyping workflow. These results demonstrate that the CNN has potential application value in chromosome classification of hematologic neoplasms. This study contributes to the development of an automatic karyotyping platform.
Chromosomal aberrations are found in ≥ 40% of hematologic neoplasms such as acute myeloid leukemia (AML) and myelodysplastic syndromes (MDS) [1]. Chromosomal aberrations closely influence treatment and prognosis of AML and MDS patients. Tumor cells derived from AML show mainly balanced rearrangements that result in fusion genes, whereas unbalanced chromosomal aberrations that result in loss or gain of genetic material are observed more frequently in tumor cells derived from MDS patients [1]. A normal karyotype (46,XX or 46,XY) is observed in nearly half of the patients. Visual karyotyping is very repetitive and time-consuming work. Experienced personnel are key to generating accurate karyotypes, especially for the identification of chromosomal aberrations. Chromosome banding generates distinct chromosomal patterns through different staining techniques such as G-, Q- or R-banding [2]. The resulting patterns allow the specific identification of each pair of chromosomes in a given species. These techniques of chromosome identification have led to the development of standardized banded karyotypes with a chromosome numbering system and banding nomenclature for several mammalian species including humans [3]. Spectral karyotyping or multicolor fluorescence in situ hybridization (mFISH) is widely used in diagnostic laboratories to identify chromosomes with multiple aberrations. Although this method allows an easy identification of different chromosomes, it is also very cost-intensive and time-consuming. Therefore, these routine diagnostic methods stand to benefit from supporting techniques based on artificial intelligence (AI) [4].
Medical applications that use AI support for the interpretation of image data (i.e. computer vision) are quickly gaining importance [5]. These applications typically use convolutional neural networks (CNNs) to either classify entire images (image classification) or to identify regions of interest or objects of interest within images (semantic segmentation or object detection) [6]. These CNNs learn from examples (often generated by human annotations). The amount and quality of this training data primarily determine the final performance of the CNN. Classifying chromosomes to build a karyogram is an image classification task that is well suited to CNN-based automation. Moreover, hundreds of thousands of manually generated karyograms are available as training data through routine diagnostics. Here, we developed a convolutional neural network that predicts the chromosome class and the correct orientation of chromosomes in order to automatically fill the karyogram. In a second step, we evaluated the benefits of using a supportive CNN in our routine diagnostic workflow to identify normal and abnormal chromosomes in patients with hematological neoplasms.
Methods and material
Development and evaluation of a convolutional neural network for individual chromosomes
We collected a total of 330,131 normal karyograms and the images of the associated metaphases from routine diagnostic karyotyping from 2012 to 2019 without filtering for image quality. Fluorescence R-banding was applied to obtain all karyograms, since it is the standard staining in our lab to analyze metaphases of cancer cells. An estimated 78% of karyograms came from bone marrow samples; the remaining 22% came from blood samples. The whole dataset was split into three independent, non-overlapping subsets: 297,119 karyograms (90%) were used as the training set, 16,506 karyograms (5%) were used as the validation set and another 16,506 karyograms (5%) were used as the test set. For all karyograms, we extracted images of the individual chromosomes in their original orientation from the metaphase image. In addition, we extracted the associated chromosome classes and the rotation angles that were applied in the routine workflow to align the chromosomes in the karyogram. We then trained a self-developed convolutional neural network (CNN) to predict the chromosome class and the required rotation angle from the individual chromosome image. The architecture of the CNN is further described in US patent US10991098. In short, it makes use of multiple convolutional blocks with DenseNet-like skip connections (https://doi.org/10.1109/CVPR.2017.243) and 1 × 1 “network-in-network” convolutions (https://arxiv.org/abs/1312.4400). After every full pass of all chromosomes in the training set (i.e. one epoch), we evaluated the current performance of the CNN using the validation set. This process was repeated 75 times and the CNN with the best validation performance was used for a final evaluation on the test set. The complete training process took 45 days on an Nvidia RTX 2080 Ti graphics card. The CNN was then integrated into the karyotyping software Ikaros (MetaSystems Hard & Software GmbH, Altlußheim).
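The 90/5/5 split described above can be illustrated with a minimal sketch. It assumes that the split is made at the level of whole karyograms and that karyograms are shuffled with a fixed seed; the function name `split_karyograms`, the seed and the use of integer IDs are illustrative assumptions and are not taken from the study.

```python
import random


def split_karyograms(karyogram_ids, val_fraction=0.05, test_fraction=0.05, seed=0):
    """Split karyogram IDs into non-overlapping train/validation/test lists."""
    ids = list(karyogram_ids)
    # Shuffling with a fixed seed is an assumption made for reproducibility.
    random.Random(seed).shuffle(ids)
    n_val = int(len(ids) * val_fraction)
    n_test = int(len(ids) * test_fraction)
    val = ids[:n_val]
    test = ids[n_val:n_val + n_test]
    train = ids[n_val + n_test:]  # everything else is used for training
    return train, val, test


# Hypothetical integer IDs standing in for the 330,131 karyograms.
train_ids, val_ids, test_ids = split_karyograms(range(330_131))
print(len(train_ids), len(val_ids), len(test_ids))
```

Splitting by karyogram rather than by individual chromosome keeps all chromosomes of one metaphase in the same subset, which is one plausible way to obtain the non-overlapping subsets mentioned in the text.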
Additional analytical metrics about the CNN performance with precision and recall for every chromosome class can be found in supplementary table S3.
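For readers unfamiliar with these metrics, the following sketch shows how precision and recall per chromosome class can be derived from paired true and predicted labels. The toy labels and the function name `per_class_precision_recall` are hypothetical; the values reported in Table S3 were not computed with this code.

```python
from collections import Counter


def per_class_precision_recall(true_labels, predicted_labels):
    """Return {class: (precision, recall)} from paired label lists."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(true_labels, predicted_labels):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the chromosome was not p
            fn[t] += 1  # chromosome of class t was missed
    metrics = {}
    for c in sorted(set(true_labels) | set(predicted_labels)):
        predicted = tp[c] + fp[c]
        actual = tp[c] + fn[c]
        precision = tp[c] / predicted if predicted else float("nan")
        recall = tp[c] / actual if actual else float("nan")
        metrics[c] = (precision, recall)
    return metrics


# Toy example with made-up labels for three small chromosome classes.
true = ["21", "21", "22", "22", "Y", "Y"]
pred = ["21", "22", "22", "22", "Y", "22"]
metrics = per_class_precision_recall(true, pred)
for chromosome_class, (precision, recall) in metrics.items():
    print(chromosome_class, precision, recall)
```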
Study design to test the support of the convolutional neural network in routine diagnostics
In this study, 200 metaphases with normal karyotype from twenty individuals were analyzed (ten metaphases per individual) either with or without CNN support. Acute myeloid leukemia was diagnosed in four patients. Fourteen patients were diagnosed with myelodysplastic syndrome and two patients with a chronic myeloproliferative disorder (CMPD). Cytogenetic analyses were performed in the Department of Human Genetics at Hannover Medical School, Germany. Written consent was obtained from all participants. Cytogenetic analyses were performed on bone marrow aspirates as well as on peripheral blood samples. Metaphase preparations were performed according to the standard procedure [7]. Ten metaphases from each individual were examined after fluorescence R-banding. For this, metaphases were automatically captured and chromosomes were separated by automatic thresholding and, if necessary, manual cluster separation. Individual chromosomes were then automatically classified and rotated either with support of the routinely used classification algorithm, which has been a feature of the Ikaros karyotyping software for almost 30 years and classifies individual chromosomes based on their size and banding characteristics, or with support of the CNN.
Two different observers analyzed the metaphases. The first observer first analyzed 100 metaphases of patients #1-#10 with support of the routinely used classification algorithm. After at least one week, the observer analyzed the same 100 metaphases again with support of the CNN. We measured the required hands-on time (separation of chromosomes, chromosome assignment), as well as the number of necessary correction steps (manual correction of automatic classification and rotation) for every karyogram. To exclude a possible training effect, the second observer first analyzed 100 metaphases of patients #11-#20 with support of the CNN and, after some time, the same metaphases with support of the routinely used classification algorithm only.
Next, we were interested in how the CNN performs in identifying aberrant chromosomes. Therefore, we analyzed 10 cases (10 metaphases each) with different chromosomal aberrations: isolated chromosomal aberrations involving different chromosomes, complex karyotypes, composite karyotypes, isolated whole arm translocations and isolated deletions (Table S2). The karyotype was described according to the guidelines of the International System for Human Cytogenomic Nomenclature [8].
The CNN reliably identifies chromosomes and rotates them in an upright position
First, we evaluated the performance of the CNN on individual chromosomes using a set of 16,506 normal karyograms that the CNN had never seen during training (i.e. the test set). The CNN predicted the correct chromosome class for 98.8% of chromosomes (750,419 out of 759,275 chromosomes, Fig. 1). For autosomal chromosomes, rare misclassifications happened mostly between neighboring classes. The sex chromosomes X and Y were rarely confused with chromosomes 7 and 22, respectively.
For the prediction of the rotation angle, we observed a median absolute error of 2° and for 94.6% of chromosomes the absolute rotation error was below 15° (Fig. 2). While the rotation error was closely distributed around 0° for the majority of chromosomes, we initially expected a subpopulation of chromosomes with a rotation error of around 180° (i.e. the CNN rotated them upside-down; likely for metacentric chromosomes) and more rarely 90° (i.e. the CNN rotated them on the side). In fact, these subpopulations were hardly noticeable and far smaller than we expected. Only 0.7% of chromosomes were predicted in an upside-down position (rotation error of 180±5°) and only 0.2% of chromosomes were predicted in a side position (rotation error of 90±5° or −90±5°).
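The rotation statistics above depend on treating angles as circular quantities, so that, for example, 358° and 0° differ by 2° rather than 358°. The sketch below shows one way to compute such summary values; the wrapping convention, the function names and the toy angles are assumptions for illustration and do not reproduce the study's evaluation code.

```python
from statistics import median


def signed_rotation_error(predicted_deg, reference_deg):
    """Signed angular difference wrapped into the interval [-180, 180) degrees."""
    return (predicted_deg - reference_deg + 180.0) % 360.0 - 180.0


def summarize_rotation_errors(predicted, reference, tolerance=5.0):
    """Summarize rotation errors in the same terms as used in the text."""
    errors = [signed_rotation_error(p, r) for p, r in zip(predicted, reference)]
    abs_errors = [abs(e) for e in errors]
    n = len(abs_errors)
    return {
        "median_abs_error": median(abs_errors),
        "share_below_15": sum(a < 15.0 for a in abs_errors) / n,
        # Upside-down: absolute error within 180 +/- tolerance degrees.
        "share_upside_down": sum(abs(a - 180.0) <= tolerance for a in abs_errors) / n,
        # Sideways: absolute error within 90 +/- tolerance degrees (either direction).
        "share_sideways": sum(abs(a - 90.0) <= tolerance for a in abs_errors) / n,
    }


# Made-up angles: three near-correct, one upside-down, one sideways prediction.
predicted = [1.0, 358.0, 182.0, 93.0, 10.0]
reference = [0.0, 0.0, 0.0, 0.0, 12.0]
summary = summarize_rotation_errors(predicted, reference)
print(summary)
```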
Clinical results: support by the CNN leads to fewer rotation and position correction steps, resulting in time savings in routine diagnostics
In a second step, we tested the developed CNN in routine diagnostics against the routinely used automatic classification algorithm within the Ikaros karyotyping software. Fig. 3 displays three examples of automatically generated karyograms with normal karyotype, generated with or without the support of the CNN, before editing by trained personnel. The routinely used classification algorithm failed to identify three chromosomes of the metaphase obtained from an MDS patient (Fig. 3A). These chromosomes, especially the smaller ones (chromosomes 21 and 22), remained below the karyogram as so-called marker chromosomes. Furthermore, six chromosomes were assigned to the wrong class. The second metaphase was obtained from a patient with AML (Fig. 3B). Without support by the CNN, eight chromosomes could not be identified by the routinely used Ikaros algorithm and nine chromosomes were misidentified. In Fig. 3C, a metaphase obtained from a patient with MDS showed four chromosomes that could not be identified by the Ikaros software and eleven chromosomes that were misidentified. In all three examples, the CNN produced a correct karyogram.
When averaged over all 200 karyograms, the supportive use of the CNN substantially decreased the number of steps that were necessary to generate a correct karyogram. While the routinely used algorithm required 19.1 correction steps per karyogram on average, only 0.15 correction steps per karyogram were required when the CNN was used. Fewer rotation steps to put the chromosomes into the correct orientation as well as fewer position-changing steps led to a reduction in the turnaround time from 102.3 s on average to 58.6 s (Table S1, Fig. 4). This represents an overall time saving of 43%. For individual patient samples (10 metaphases each), the time saving varied between 28% and 55%.
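The relative savings follow directly from the averages quoted above. As a small worked example, the sketch below computes the relative reduction in turnaround time and in correction steps from those averages; the helper name `relative_saving` is an illustrative assumption.

```python
def relative_saving(baseline, supported):
    """Fraction of the baseline value that is saved when CNN support is used."""
    return (baseline - supported) / baseline


# Average values per karyogram as reported in the text.
time_saving = relative_saving(102.3, 58.6)   # turnaround time in seconds
step_reduction = relative_saving(19.1, 0.15)  # manual correction steps

print(f"time saving: {time_saving:.1%}")
print(f"correction-step reduction: {step_reduction:.1%}")
```

The time saving evaluates to just under 43%, and the number of correction steps falls by more than 99%.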
Although we trained the CNN on normal karyotypes and concentrated on normal karyotypes in the clinical study, we were interested in how robustly the CNN classifies chromosomes with structural aberrations. Therefore, we performed a small study with 11 cases including metaphases with structural aberrations such as deletions, whole arm translocations, complex karyotypes and composite karyotypes. Overall, the CNN supported the observer very well and reduced the overall handling time for each metaphase significantly (Table S2, Figs. S1 and S2).
In this study, a convolutional neural network was developed and evaluated to automatically identify individual chromosomes from metaphase cells of hematologic neoplasms such as AML, MDS and CMPDs with normal karyotype.
Recently, the application of convolutional neural networks (CNNs) has found its way into hematological cancer diagnostics. Although the CNN training algorithms vary, they share one basic function: all networks accept a set of inputs and, based on their underlying layer algorithm, generate corresponding outputs. Aghamaleki et al. [9] have described the development of an artificial neural network in order to identify a molecular biomarker for rapid diagnosis of chronic lymphocytic leukemia from blood samples. Furthermore, Sidhom et al. [10] have developed a deep learning method that provides rapid and accurate help in diagnosing t(15;17)-positive acute promyelocytic leukemia.
The first classification of chromosomes by artificial intelligence was published by Cho et al. [11]. Within the last years, the classification accuracy has increased immensely, mostly due to the introduction of CNNs. Our experimental results demonstrate that our CNN was able to correctly detect the chromosomes in all 24 classes (1 to 22, X and Y) with an accuracy of 98.8% and rotate each chromosome into the default orientation. Related work by Joshi et al. [12], who used artificial-intelligence-based methods to classify chromosomes based on the Denver group classification standard, has also shown a very high overall average classification accuracy of 97%. Recently, Hu et al. [13] have developed a classifier using a deep convolutional neural network to classify chromosomes into 24 classes with an accuracy of 93.79%.
Here, metaphases from different specimens, such as peripheral blood and bone marrow, were used for the training of the CNN. Although cytogeneticists consider bone marrow to be the most informative tissue for cytogenetic analysis, the quality of metaphase chromosomes derived from bone marrow is often decreased due to a lower level of chromosome banding. In line with the observations of Wang et al. [14], we therefore recommend training the CNN not only with high-quality metaphases derived from peripheral blood but also with samples of any quality that the CNN will be confronted with in routine diagnostics. Unfortunately, changes in the documentation of tissue type over the 7-year collection phase of the dataset prevented us from determining the classification accuracy for bone marrow metaphases and blood metaphases separately. However, based on the known proportions of bone marrow metaphases and blood metaphases and the overall classification accuracy, we could determine the theoretical minimal classification accuracies for both tissue types. With 78% bone marrow metaphases and 22% blood metaphases in the whole dataset and an overall classification accuracy of 98.8%, the classification accuracy for bone marrow metaphases cannot be lower than 98.46% (if accuracy for blood was 100%) and the classification accuracy for blood metaphases cannot be lower than 94.54% (if accuracy for bone marrow was 100%).
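The lower bounds given above follow from writing the overall accuracy as a weighted average of the two tissue-specific accuracies and setting the accuracy of the other tissue to 100%. A minimal sketch of this arithmetic is shown below; the function name `minimal_group_accuracy` is an illustrative assumption, and the tissue proportions are treated as exact although the text describes them as estimates.

```python
def minimal_group_accuracy(overall_accuracy, group_share):
    """Lowest possible accuracy of one group if the other group is classified perfectly.

    Derived from: overall = group_share * acc_group + (1 - group_share) * 1.0
    """
    other_share = 1.0 - group_share
    return (overall_accuracy - other_share) / group_share


bone_marrow_min = minimal_group_accuracy(0.988, 0.78)
blood_min = minimal_group_accuracy(0.988, 0.22)

print(f"bone marrow: at least {bone_marrow_min:.2%}")
print(f"blood: at least {blood_min:.2%}")
```

The smaller the share of a tissue type in the dataset, the weaker the bound, which is why the guaranteed minimum for blood metaphases (about 94.5%) is lower than that for bone marrow metaphases (about 98.5%).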
In this study, we focused on fluorescent R-band metaphases, since this is the standard staining in our lab to analyze metaphases of cancer cells. Technically, CNN-based karyotyping can be adapted to other stainings (e.g. G-band, R-band, Q-band), given an appropriate set of training data. In addition, the amount of necessary training data can be reduced by using an existing karyotyping CNN as the starting point of the training (transfer learning). CNNs can thus be used either out of the box or as a starting point for transfer learning.
While this study showed a relevant speed-up of the karyotyping workflow through CNN assistance and support for analyzing metaphases of low quality (our average banding resolution is ∼300 bands; the CNN is able to analyze metaphases with ≥ 100 bands), it also revealed opportunities for further development. Our CNN requires images of individual chromosomes as input. These chromosomes are already present in the overview image of the metaphase, but they are often bent, touching or overlapping, and the current workflow requires the user to separate the chromosomes into individual objects. With the CNN-based speed-up of chromosome classification, this chromosome segmentation remains a time-consuming step and should be the next target for AI support. This will further reduce the time trained personnel need to generate a correct karyogram, and the saving in karyotype handling time will translate into a saving in labor costs.
Although we established a large and diverse database in this study, the metaphase chromosomes were all collected from hematologic neoplasms with normal karyotypes. Since more than 50% of the hematological samples arriving in our Department of Human Genetics show chromosomal aberrations, a fully automated karyotyping workflow should at best include the detection of chromosomal aberrations. Unfortunately, chromosomal aberrations pose a great challenge to deep learning algorithms because they are both complex and rare compared to normal chromosomes. However, we demonstrated that the CNN reduced the time to generate a karyotype even in metaphases with diverse chromosomal aberrations such as complex karyotypes.
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
The work described has been carried out in accordance with the Declaration of Helsinki. Informed consent was obtained, approved by the Ethics Committee of Hannover Medical School, Hannover, Germany (Nr. 8657_BO_K_2019).
CRediT authorship contribution statement
Beate Vajen: Conceptualization, Data curation, Formal analysis, Validation, Visualization, Writing – original draft. Siegfried Hänselmann: Conceptualization, Data curation, Formal analysis, Methodology, Software, Validation, Visualization, Writing – original draft. Friederike Lutterloh: Investigation. Simon Käfer: Methodology. Jennifer Espenkötter: Investigation. Anna Beening: Investigation. Jochen Bogin: Methodology, Software. Brigitte Schlegelberger: Project administration, Supervision, Writing – review & editing. Gudrun Göhring: Conceptualization, Project administration, Supervision, Writing – review & editing.
Declaration of Competing Interest
S. Hänselmann and J. Bogin are employees of MetaSystems Hard & Software GmbH, Altlussheim, Germany.
We thank Claudia Davenport for the careful revision of the manuscript.
Appendix. Supplementary materials
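- [1] Swerdlow S., Campo E., Harris N.L., et al. WHO classification of tumours of haematopoietic and lymphoid tissues. 4th ed. IARC, Lyon 2017
- [2] Hsu T.C. Human and mammalian cytogenetics: an historical perspective. Springer, New York 1979
- [3] Harnden D.G., Klinger H.P. ISCN. An international system for human cytogenetic nomenclature: report of the standing committee on human cytogenetic nomenclature. Karger, Basel 1985
- [4] Topol E.J. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019; 25: 44-56. https://doi.org/10.1038/s41591-018-0300-7
- [5] Esteva A., Chou K., Yeung S., Naik N., Madani A., Mottaghi A., et al. Deep learning-enabled medical computer vision. npj Digit Med. 2021; 4: 5. https://doi.org/10.1038/s41746-020-00376-2
- [6] Goodfellow I., Bengio Y., Courville A. Deep learning. MIT Press, London 2016
- [7] Schlegelberger B., Metzke S., Harder S., Zuhlke-Jenisch R., Zhang Y., Siebert R. Classical and molecular cytogenetics of tumor cells. Springer, Berlin, Heidelberg 1999
- [8] McGowan-Jordan J., Hastings R.J., Moore S. An international system for human cytogenomic nomenclature (ISCN 2020). Reprint of: Cytogenetic and Genome Research. Karger, Basel 2020
- [9] Aghamaleki S.F., Mollashahi B., Nosrati M., Moradi A., Sheikhpour M., Movafagh A. Application of an artificial neural network in the diagnosis of chronic lymphocytic leukemia. Cureus. 2019; 11: e4004. https://doi.org/10.7759/cureus.4004
- [10] Sidhom J.W., Siddarthan I.J., Lai B.S., Luo A., Hambley B.C., Bynum J., Duffield A.S., Streiff M.B., Moliterno A.R., Imus P., Gocke C.B., Gondek L.P., DeZern A.E., Baras A.S., Kickler T., Levis M.J., Shenderov E. Deep learning for diagnosis of acute promyelocytic leukemia via recognition of genomically imprinted morphologic features. npj Precis Oncol. 2021; 5: 38. https://doi.org/10.1038/s41698-021-00179-y
- [11] Cho J., Ryu S., Woo S. A study for the hierarchical artificial neural network model for Giemsa-stained human chromosome classification. Conf Proc IEEE Eng Med Biol Soc. 2004; 6: 4588-4591. https://doi.org/10.1109/IEMBS.2004.1404272
- [12] Joshi P., Munot M., Kulkarni P., Joshi M. Efficient karyotyping of metaphase chromosomes using incremental learning. IET Sci Meas Technol. 2013; 7: 287-295. https://doi.org/10.1049/iet-smt.2012.0160
- [13] Hu X., Yi W., Jiang L., Wu S., Zhang Y., Du J., Ma T., Wang T., Wu X. Classification of metaphase chromosomes using deep convolutional neural network. J Comput Biol. 2019; 26: 473-484. https://doi.org/10.1089/cmb.2018.021
- [14] Wang Y., Zheng B., Li S., Mulvihill J.J., Chen X., Liu H. Automated classification of metaphase chromosomes: optimization of an adaptive computerized scheme. J Biomed Inform. 2009; 42: 22-31. https://doi.org/10.1016/j.jbi.2008.05.004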
Published online: November 20, 2021
Accepted: November 18, 2021
Received in revised form: November 15, 2021
Received: July 13, 2021
© 2021 The Authors. Published by Elsevier Inc.
User license: Creative Commons Attribution – NonCommercial – NoDerivs (CC BY-NC-ND 4.0)