
LPCANet: Classification of Laryngeal Cancer Histopathological Images Using a CNN with Position Attention and Channel Attention Mechanisms

Mercaldo F.; Santone A.
2021-01-01

Abstract

Laryngeal cancer is one of the most common malignant tumors in otolaryngology, and histopathological image analysis is the gold standard for its diagnosis. However, pathological diagnosis is highly subjective, which can lead to missed diagnoses and misdiagnoses. In addition, according to a literature search, no computer-aided diagnosis (CAD) algorithm has yet been applied to the classification of laryngeal cancer histopathological images. Convolutional neural networks (CNNs) are widely used in other cancer classification tasks, but they may ignore the potential global and channel relationships in images, which limits their feature representation ability. At the same time, their lack of interpretability makes the results difficult for pathologists to accept. Therefore, we propose a laryngeal cancer classification network (LPCANet) based on a CNN and attention mechanisms. First, the original histopathological images are sequentially cropped into patches. The patches are then fed into a ResNet50 backbone to extract local features. Next, a position attention module and a channel attention module are added in parallel to capture the spatial dependency and the channel dependency, respectively. The outputs of the two modules are fused into a single feature map to enhance the feature representation and improve classification performance. Moreover, the fused feature map is extracted and visually analyzed with gradient-weighted class activation mapping (Grad-CAM) to provide a degree of interpretability for the final results. The three-class classification performance of LPCANet surpasses that of five state-of-the-art classifiers (VGG16, ResNet50, InceptionV3, Xception, and DenseNet121) at both original resolutions (534 × 400 and 1067 × 800). On the 534 × 400 data, LPCANet achieved 73.18% accuracy, 74.04% precision, 73.15% recall, 72.9% F1-score, and 0.8826 AUC.
On the 1067 × 800 data, LPCANet achieved 83.15% accuracy, 83.5% precision, 83.1% recall, 83.1% F1-score, and 0.9487 AUC. The results show that LPCANet enhances the feature representation by capturing global and channel relationships and achieves better classification performance. In addition, the Grad-CAM visual analysis makes the results interpretable, so that they are easier for pathologists to accept and the method can serve as a second tool for auxiliary diagnosis. Graphical Abstract: [Figure not available: see fulltext.]
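The parallel position- and channel-attention branches described in the abstract follow the general pattern of dual self-attention over a backbone feature map. Below is a minimal NumPy sketch of that core idea: a spatial (position) affinity over all pixel locations, a channel affinity over all feature channels, and an element-wise fusion of the two branches. The learnable projection convolutions and scale parameters of the actual network are omitted, and the sum fusion rule is an assumption, not the paper's exact design.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def position_attention(feat):
    """Each spatial position attends to every other position."""
    C, H, W = feat.shape
    flat = feat.reshape(C, H * W)            # (C, N) with N = H*W
    energy = flat.T @ flat                   # (N, N) spatial affinity
    attn = softmax(energy, axis=-1)          # row-normalized attention map
    out = flat @ attn.T                      # aggregate features over positions
    return feat + out.reshape(C, H, W)       # residual connection

def channel_attention(feat):
    """Each channel attends to every other channel."""
    C, H, W = feat.shape
    flat = feat.reshape(C, H * W)
    energy = flat @ flat.T                   # (C, C) channel affinity
    attn = softmax(energy, axis=-1)
    out = attn @ flat                        # reweight channels
    return feat + out.reshape(C, H, W)

def fuse(feat):
    # element-wise sum of the two branches (assumed fusion rule)
    return position_attention(feat) + channel_attention(feat)

# toy backbone output: 8 channels on a 4x4 spatial grid
x = np.random.rand(8, 4, 4).astype(np.float32)
y = fuse(x)
print(y.shape)  # fused map keeps the input shape: (8, 4, 4)
```

In a real implementation the affinities would be computed from learned query/key projections of the ResNet50 features, but the affinity-then-aggregate structure shown here is the mechanism the abstract refers to.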
Files in this record:
There are no files associated with this record.

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11695/107209
Citations
  • PubMed Central: n/a
  • Scopus: 28
  • Web of Science: 19