Cookies on this website
We use cookies to ensure that we give you the best experience on our website. If you click 'Continue' we'll assume that you are happy to receive all cookies and you won't see this message again. Click 'Find out more' for information on how to change your cookie settings.

BACKGROUND: Melanoma has one of the fastest rising incidence rates of any cancer. It accounts for a small percentage of skin cancer cases but is responsible for the majority of skin cancer deaths. History-taking and visual inspection of a suspicious lesion by a clinician is usually the first in a series of 'tests' to diagnose skin cancer. Establishing the accuracy of visual inspection alone is critical to understating the potential contribution of additional tests to assist in the diagnosis of melanoma. OBJECTIVES: To determine the diagnostic accuracy of visual inspection for the detection of cutaneous invasive melanoma and atypical intraepidermal melanocytic variants in adults with limited prior testing and in those referred for further evaluation of a suspicious lesion. Studies were separated according to whether the diagnosis was recorded face-to-face (in-person) or based on remote (image-based) assessment. SEARCH METHODS: We undertook a comprehensive search of the following databases from inception up to August 2016: CENTRAL; CINAHL; CPCI; Zetoc; Science Citation Index; US National Institutes of Health Ongoing Trials Register; NIHR Clinical Research Network Portfolio Database; and the World Health Organization International Clinical Trials Registry Platform. We studied reference lists and published systematic review articles. SELECTION CRITERIA: Test accuracy studies of any design that evaluated visual inspection in adults with lesions suspicious for melanoma, compared with a reference standard of either histological confirmation or clinical follow-up. We excluded studies reporting data for 'clinical diagnosis' where dermoscopy may or may not have been used. DATA COLLECTION AND ANALYSIS: Two review authors independently extracted all data using a standardised data extraction and quality assessment form (based on QUADAS-2). We contacted authors of included studies where information related to the target condition or diagnostic threshold were missing. We estimated summary sensitivities and specificities per algorithm and threshold using the bivariate hierarchical model. We investigated the impact of: in-person test interpretation; use of a purposely developed algorithm to assist diagnosis; and observer expertise. MAIN RESULTS: We included 49 publications reporting on a total of 51 study cohorts with 34,351 lesions (including 2499 cases), providing 134 datasets for visual inspection. Across almost all study quality domains, the majority of study reports provided insufficient information to allow us to judge the risk of bias, while in three of four domains that we assessed we scored concerns regarding applicability of study findings as 'high'. Selective participant recruitment, lack of detail regarding the threshold for deciding on a positive test result, and lack of detail on observer expertise were particularly problematic.Attempts to analyse studies by degree of prior testing were hampered by a lack of relevant information and by the restricted inclusion of lesions selected for biopsy or excision. Accuracy was generally much higher for in-person diagnosis compared to image-based evaluations (relative diagnostic odds ratio of 8.54, 95% CI 2.89 to 25.3, P < 0.001). Meta-analysis of in-person evaluations that could be clearly placed on the clinical pathway showed a general trade-off between sensitivity and specificity, with the highest sensitivity (92.4%, 95% CI 26.2% to 99.8%) and lowest specificity (79.7%, 95% CI 73.7% to 84.7%) observed in participants with limited prior testing (n = 3 datasets). Summary sensitivities were lower for those referred for specialist assessment but with much higher specificities (e.g. sensitivity 76.7%, 95% CI 61.7% to 87.1%) and specificity 95.7%, 95% CI 89.7% to 98.3%) for lesions selected for excision, n = 8 datasets). These differences may be related to differences in the spectrum of included lesions, differences in the definition of a positive test result, or to variations in observer expertise. We did not find clear evidence that accuracy is improved by the use of any algorithm to assist diagnosis in all settings. Attempts to examine the effect of observer expertise in melanoma diagnosis were hindered due to poor reporting. AUTHORS' CONCLUSIONS: Visual inspection is a fundamental component of the assessment of a suspicious skin lesion; however, the evidence suggests that melanomas will be missed if visual inspection is used on its own. The evidence to support its accuracy in the range of settings in which it is used is flawed and very poorly reported. Although published algorithms do not appear to improve accuracy, there is insufficient evidence to suggest that the 'no algorithm' approach should be preferred in all settings. Despite the volume of research evaluating visual inspection, further prospective evaluation of the potential added value of using established algorithms according to the prior testing or diagnostic difficulty of lesions may be warranted.

Original publication

DOI

10.1002/14651858.CD013194

Type

Journal article

Journal

Cochrane Database Syst Rev

Publication Date

04/12/2018

Volume

12

Keywords

Adult, Aged, Algorithms, Diagnostic Errors, Humans, Melanoma, Middle Aged, Physical Examination, Sensitivity and Specificity, Skin Neoplasms