Drugs, Health Technologies, Health Systems

Health Technology Review

Asynchronous Teleultrasound and In-Person Ultrasound: Comparing Diagnostic Accuracy

Key Messages

What Is the Issue?

What Did We Do?

What Did We Find?

What Does This Mean?

Background

Ultrasonography is a portable and noninvasive imaging method that uses sound waves to visualize internal organs, structures, and systems within the body in real time. Ultrasound examinations are primarily conducted by sonographers and interpreted by physicians specializing in medical imaging.

Ultrasound is a highly operator-dependent imaging modality that requires well-trained operators to perform the scan, adjust the protocol based on clinical judgment, take appropriate images, and provide accurate technical impressions for the interpreting physician. Therefore, the accuracy of the final report depends on both the expertise of the interpreting physician and the skill of the sonographer.2,3

The quality of an ultrasound exam varies depending on the sonographer’s experience with operating the equipment, whereas the image quality of CT or MRI exams is less dependent on the operator’s performance.2 As well, ultrasound is much more affordable and portable than CT and MRI, and unlike CT, it does not expose patients to radiation.4 As a result, ultrasound is the preferred method for soft-tissue imaging in cases where the higher image quality of CT and MRI is not needed.4

Access to ultrasound services in rural or underserved regions is often limited by the number of qualified professionals, available equipment, and infrastructure or resources.3,5,6 In Canada, fewer than 28% of rural emergency departments have in-house access to ultrasound, so patients must be transferred from rural communities to facilities with ultrasound capacity.7

Ultrasound exams are conducted by imaging professionals, shortages of whom have been reported in Canada and in many other countries worldwide.8-10 Recruitment and retention challenges have also exacerbated existing staff shortages and contributed to longer wait times for diagnostic exams.8,9

Teleultrasound (TUS) is an imaging technique that uses advances in information technology and ultrasound to support ultrasound delivery and remote clinical decision-making.1,2 TUS involves performing an ultrasound exam at 1 location and then electronically transmitting the images to another location where they are interpreted by an imaging expert.2,6,10 TUS systems support decision-making across a wide range of clinical settings, and examinations may be conducted at the point-of-care or in emergency settings, community settings, or dedicated imaging facilities.

TUS is intended to enhance patient care by offering access to specialized expertise either to complement existing services or to provide care in resource-limited settings. By expanding access to these services, TUS has the potential to improve time to diagnosis, reduce costs for both patients and the health care system, and decrease patient transfers and travel time.3,11-13

How Is Teleultrasound Delivered?

TUS can be conducted using either real-time (synchronous) or asynchronous (“store-and-forward”) video or image transmission.5,10,14

With rapid advances in diagnostic imaging technology, asynchronous TUS has gained greater use as a tool to support the delivery of patient care, particularly in resource-limited settings.2,13,17,18 One such development in asynchronous TUS is the growing use of volume sweep imaging (VSI), which is a standardized technique that involves sweeping the ultrasound probe over a target area using simple, predefined movements to capture images. These images are then later transmitted to an expert for review.5 VSI is a method that enables individuals with little or no prior ultrasound experience to perform scans, which suggests it has potential value in settings with limited access to resources or trained imaging professionals.10

Purpose of This Review

As asynchronous TUS (i.e., unsupervised ultrasound with remote exam interpretation by an expert) continues to expand to different clinical areas, its comparability with traditional in-person ultrasound in terms of health care quality remains uncertain.1,19 This report aims to compare the diagnostic accuracy, patient care quality, and service quality provided by asynchronous TUS with that of traditional in-person ultrasound. This report also aims to summarize the recommendations from evidence-based guidelines regarding the use of TUS in clinical practice for supporting the diagnosis of various medical conditions.

Objectives

We prepared this rapid review to address the following questions:

  1. What are the diagnostic accuracy and agreement of asynchronous TUS compared with the traditional service model of ultrasound with an in-person imaging specialist?

  2. How do asynchronous TUS and the traditional in-person ultrasound service model compare in terms of patient care quality and service quality?

  3. What are the evidence-based guidelines regarding the use of TUS in clinical practice for supporting the diagnosis of various medical conditions?

Methods

Literature Search Methods

The literature search strategy used in this report is an updated version of a strategy developed for a previous report published by Canada's Drug Agency on the comparative effectiveness of real-time teleultrasound versus in-person ultrasound. For the current report, an information specialist conducted a literature search using key resources, including MEDLINE, the Cumulative Index to Nursing and Allied Health Literature (CINAHL), the Cochrane Database of Systematic Reviews, and the International HTA Database. The search also included a review of websites of health technology assessment agencies in Canada and major international health technology assessment agencies, as well as a focused internet search to capture grey literature.

The search approach was customized to retrieve a limited set of results, balancing comprehensiveness with relevance. The search strategy comprised both controlled vocabulary, such as the National Library of Medicine’s MeSH (Medical Subject Headings), and keywords. The main search concepts were ultrasound and telemedicine or remote supervision. The initial search was limited to English-language documents published between January 1, 2019, and August 27, 2024. For the current report, database searches were rerun on March 6, 2025, to capture any articles published or made available since the initial search date. The search of the grey literature was also updated to include documents published since August 2024.

Eligibility Criteria and Study Selection Methods

One reviewer screened records and selected studies based on the eligibility criteria presented in Table 1. In the first level of screening, the titles and abstracts were reviewed, and potentially relevant articles were retrieved for full-text review. In the second level of screening, 1 reviewer assessed potentially relevant full texts for inclusion. Articles published before 2019 were excluded from this report due to the focus on recent evidence and emerging TUS developments.

Table 1: Eligibility Criteria

Criteria: Description

Population: Patients of any age seeking ultrasound exams for any health condition

Index test: Asynchronous TUS (i.e., unsupervised ultrasound with remote exam interpretation by an expert, “store-and-forward” method)

Reference standard: Traditional ultrasound service model (standard in-person ultrasound delivered and interpreted by an onsite imaging specialist)

Outcomes:

  • Q1: Diagnostic accuracy (sensitivity, specificity, diagnostic agreement)

  • Q2: Patient care quality (quality and safety of care, access to care, clinical utility, or operator acceptance)

  • Q2: Service quality (efficiency of the index test, patient acceptance)

  • Q3: Recommendations regarding TUS use in clinical practice for diagnostic purposes

Study designs: Health technology assessments, systematic reviews, single-group designs, nonrandomized studies, evidence-based guidelines

Exclusion criteria:

  • Index test: Real-time TUS or any index test without asynchronous remote exam interpretation

  • Reference standard: Standard in-person ultrasound delivered and/or interpreted by a nonspecialist (e.g., student, nonclinician, patient) or any non-ultrasound-based reference standard

  • Peer-reviewed articles published before 2019

  • Duplicate publications

  • Case reports

  • Non-English-language reports

TUS = teleultrasonography.

Data Extraction

Relevant articles underwent data extraction by 1 reviewer using a standardized form. Information extracted included study design, population characteristics, ultrasound characteristics, ultrasound operator and remote expert profile (i.e., role and experience), study inclusion and exclusion criteria, and relevant results.

Critical Appraisal of Individual Studies

One reviewer assessed the risk of bias of the included studies using Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2).20 The risk of bias of each included study was described narratively and summarized by the following 4 QUADAS-2 domains: patient selection, index test, reference standard, and flow and timing.

Summary of Evidence

Quantity of Research Available

A total of 692 records from the literature search were identified, including 61 potentially relevant records from the grey literature. Following screening of titles and abstracts, 660 records were excluded, and 32 potentially relevant reports were retrieved for full-text review.

Overall, 11 unique studies met the inclusion criteria.21-31 No evidence-based guidelines for TUS were identified. Refer to Appendix 1, Figure 1 for the PRISMA32 flow chart of study selection.

Study Characteristics

  • Eleven primary studies across 8 countries were included in this report, totalling 976 patients who underwent either asynchronous TUS, traditional in-person ultrasound, or both.

  • A total of 35 remote imaging experts located in 14 countries reviewed the TUS images. Detailed characteristics of the 11 included studies are presented in Appendix 2, Table 2.

Study Design

Eleven prospective studies examining diagnostic agreement and accuracy were published between 2019 and 2024. All 11 were cohort selection cross-sectional diagnostic accuracy studies.21-31

Country of Origin

Patient Population

A summary of the patient populations and clinical settings for each included study is presented in Appendix 2, Table 2. The 11 primary studies included 976 adult, pediatric, and pregnant individuals.

Index Test and Reference Standard

The index test used in all studies was delivered through various scanning methods:

In all cases, the reference standard was ultrasound delivered and interpreted in person by a trained sonologist or medical professional with imaging expertise.

A summary of the index test, reference standard, and characteristics of the ultrasound operator and expert TUS interpreter is provided in Appendix 2, Table 2.

Outcomes

Summary of Findings

Main Take-Aways

  • Asynchronous TUS was found to be a diagnostically accurate alternative to the standard in-person model of ultrasound for identifying certain targeted conditions. However, results varied across individual studies depending on the target condition.

  • Asynchronous TUS showed substantial diagnostic agreement with standard in-person ultrasound across most studies, although several studies showed variable findings.

  • In most studies, the image quality of ultrasound images transmitted to the offsite expert for interpretation was reported to be acceptable or excellent, although some studies showed mixed results.

  • Asynchronous TUS was reported to have high utility and acceptance by clinicians and patients, based on 2 studies that examined these outcomes.

Diagnostic Accuracy

The diagnostic accuracy of asynchronous TUS interpretation was evaluated across several key diagnostic test accuracy metrics (including sensitivity, specificity, and agreement). Notably, acceptable thresholds for these metrics may differ depending on the target condition being evaluated.

Sensitivity and Specificity

The reported sensitivity and specificity of asynchronous TUS interpretation varied within the individual studies and by target condition. In 6 of the 11 included studies,21-23,25,27,31 the authors reported on the sensitivity and specificity of asynchronously interpreted TUS compared with standard-of-care interpreted ultrasound (Appendix 3, Table 3).
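
As a reminder of how these two metrics are derived, the following minimal Python sketch computes sensitivity and specificity from a 2 × 2 table in which asynchronous TUS is the index test and in-person ultrasound is the reference standard. The function name and the counts are hypothetical and for illustration only; they are not data from any included study.

```python
from typing import Tuple


def sensitivity_specificity(tp: int, fp: int, fn: int, tn: int) -> Tuple[float, float]:
    """Return (sensitivity, specificity) from 2 x 2 counts.

    tp: index test positive, reference standard positive
    fp: index test positive, reference standard negative
    fn: index test negative, reference standard positive
    tn: index test negative, reference standard negative
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("Need at least 1 reference-positive and 1 reference-negative case")
    sensitivity = tp / (tp + fn)  # proportion of reference-positive cases detected
    specificity = tn / (tn + fp)  # proportion of reference-negative cases ruled out
    return sensitivity, specificity


# Hypothetical counts for illustration only (not from the included studies).
sens, spec = sensitivity_specificity(tp=45, fp=5, fn=5, tn=45)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Both metrics require that reference-negative and reference-positive patients all receive the reference standard; whether a given value is acceptable depends on the target condition.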

Diagnostic Agreement

Overall, asynchronous TUS interpretation showed substantial diagnostic agreement with standard in-person ultrasound across most of the 11 included studies,21-27,29,31 although some studies showed variable findings22,26,28-30 (Appendix 3, Table 3). For studies reporting diagnostic agreement using the kappa statistic, the following classification scale was used for interpretation: “poor” (0.0), “slight” (0.01 to 0.20), “fair” (0.21 to 0.40), “moderate” (0.41 to 0.60), “substantial” (0.61 to 0.80), and “almost perfect” (0.81 to 1.00).33 Of note, these classifications do not account for the clinical significance of agreement levels in real-world practice.
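
The following Python sketch illustrates how a Cohen kappa value can be computed from a 2 × 2 agreement table and mapped to the classification scale described above. It is a minimal illustration, not the method used by any included study: the function names and counts are hypothetical, and treating negative kappa values as “poor” is an assumption because the scale as stated lists only 0.0 for that category.

```python
def cohen_kappa(both_pos: int, tus_only: int, inperson_only: int, both_neg: int) -> float:
    """Cohen kappa for 2 readings (TUS vs. in-person) of a binary finding."""
    n = both_pos + tus_only + inperson_only + both_neg
    if n == 0:
        raise ValueError("No observations")
    observed = (both_pos + both_neg) / n
    # Chance agreement from the marginal totals of each reading.
    tus_pos = both_pos + tus_only
    inperson_pos = both_pos + inperson_only
    expected = (tus_pos * inperson_pos + (n - tus_pos) * (n - inperson_pos)) / (n * n)
    if expected == 1:
        raise ValueError("Kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


def interpret_kappa(kappa: float) -> str:
    """Map kappa to the scale in the text; values <= 0 are treated as 'poor' (assumption)."""
    if kappa <= 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical counts for illustration only (not from the included studies).
kappa = cohen_kappa(both_pos=40, tus_only=5, inperson_only=5, both_neg=50)
print(round(kappa, 2), interpret_kappa(kappa))
```

In this hypothetical example, raw agreement is 90%, but kappa is lower because part of that agreement would be expected by chance.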

Exam Image Quality

The quality of ultrasound images transmitted to remote experts for asynchronous interpretation was generally reported to be acceptable to excellent, though findings varied across studies (Appendix 3, Table 3). The authors of 6 of the 11 included studies reported on this outcome.21,25-27,30,31

Utility and Patient Acceptance

Overall, asynchronous TUS was reported to have high utility and acceptance. This outcome was evaluated in 2 of the 11 included studies27,30 (Appendix 3, Table 3).

Summary of Critical Appraisal

The risk of bias of the included studies was assessed using the QUADAS-2 tool. Appendix 4, Figure 2 presents a summary of the QUADAS-2 results by domain, while Appendix 4, Table 4 contains details about the strengths and limitations of the included studies.21-31

Overall, the included studies were judged to be at a low or unclear risk of bias in the domain of patient selection, with most studies using appropriate and representative recruitment methods. The index test domain was generally judged to have an unclear or high risk of bias: although the authors of most studies blinded interpretation, some studies included operators with limited training or lacked detail on test conduct. The reference standard domain showed a high risk of bias across studies, with several introducing potential bias through unclear blinding procedures. The flow and timing domain raised the most concern, with several studies having a high or unclear risk of bias due to variable intervals between tests and incomplete application of the reference standard (partial verification). These domain-level concerns may reduce the internal validity and reliability of the evidence.

Primary Studies

Patient Selection

The 11 studies21-31 comparing asynchronous TUS with standard in-person ultrasound mostly used appropriate methods for patient selection. However, 2 studies22,30 provided limited or no information about patient characteristics and sampling methods, which may have introduced an uncertain risk of bias and limits our understanding of the generalizability of their findings. The patients included in each study corresponded to the population of interest in this report.

Index Test

In all 11 studies, the choice of index test (i.e., TUS with asynchronous review) aligned with that targeted by this review. In 10 studies,21-26,28-31 the index test results were interpreted without knowledge of the reference standard results, minimizing potential bias from prior knowledge. It was unclear whether this was the case in the remaining study.27

Seven studies included ultrasound-naive operators who underwent a training protocol to administer the TUS exam.22,23,25,26,28,29,31 It is unclear whether their limited training and experience may have affected the applicability of the index test. Additionally, for 1 multicentre study,21 it was unclear whether all patients received the same index test due to variability in ultrasound equipment across participating sites. Overall, the included studies showed a high risk of bias for the index test.

Reference Standard

In 10 of the 11 studies, the reference standard (i.e., in-person delivered and interpreted ultrasound) matched that targeted by this review.21-23,25-31 In the remaining study, a consensus vote approach was used to establish the reference standard.24 In 8 of the 11 studies,21-23,25-27,30,31 the reference standard results were interpreted without knowledge of the results of the index test, reducing potential bias. In all studies, a reference standard likely to correctly classify the target outcomes was used, except for 1 study24 in which the authors reported operator and environmental (i.e., COVID-19) factors that may have impacted the classification accuracy of the reference standard. Overall, across all included studies, there was a high risk of bias related to the reference standard.

Flow and Timing

In 7 of the 11 studies,22,24-26,29-31 the time frame between interpretation of the index tests and reference standards was unclear. An unknown length of time between the index test and reference standard could lead to different results that reflect changes in a patient’s condition during that time rather than inaccuracies in the index test. In these 7 studies, the appropriateness of the time frame between the index test and reference standard could not be assessed. The authors of 1 study reported an average time frame of 2 days for interpretation of the index test,21 whereas the authors of 2 other studies reported a time frame of up to 2 months for both collection and interpretation of the reference standard.27,28 The definition of an appropriate time frame may vary by target condition and requires clinical input, which was not available. In 9 of the 11 studies,21-27,30,31 all patients received the reference standard. In the other 2 studies, only patients who screened positive on the index test (TUS protocol) underwent a confirmatory in-person ultrasound, which may have introduced partial verification bias.28,29 In 1 of these 2 studies,29 56% of patients who screened positive attended the follow-up in-person ultrasound, which may have introduced further bias due to potential differences in the population lost to follow-up. Overall, across all included studies, there was a high risk of bias related to flow and timing.
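
To illustrate why verifying only index test positive patients is a concern, the following Python sketch shows which accuracy metrics can be estimated when the false-negative and true-negative cells of the 2 × 2 table are unobserved. It is a simplified illustration under stated assumptions: the function name and counts are hypothetical, and unobserved cells are represented as None.

```python
from typing import Dict, Optional


def estimable_metrics(tp: int, fp: int, fn: Optional[int], tn: Optional[int]) -> Dict[str, Optional[float]]:
    """Return the metrics that can be computed from the observed 2 x 2 cells.

    fn and tn are None when index-negative patients never received the
    reference standard (partial verification).
    """
    metrics: Dict[str, Optional[float]] = {
        "ppv": tp / (tp + fp) if (tp + fp) > 0 else None,
        "sensitivity": None,
        "specificity": None,
    }
    if fn is not None and (tp + fn) > 0:
        metrics["sensitivity"] = tp / (tp + fn)
    if tn is not None and (tn + fp) > 0:
        metrics["specificity"] = tn / (tn + fp)
    return metrics


# Hypothetical counts for illustration only (not from the included studies).
# Only screen-positive patients were verified, so fn and tn are unknown.
partial = estimable_metrics(tp=30, fp=10, fn=None, tn=None)
print(partial)
```

With only screen-positive patients verified, the positive predictive value can be estimated, but sensitivity and specificity cannot be calculated without further assumptions about the unverified patients.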

Limitations

This report is limited in part by the quality of the primary studies, several of which are at risk of bias due to important limitations outlined in the critical appraisal section. Most notably, the exact time interval between the interpretation of the TUS and in-person ultrasound exams was unclear in most of the studies reviewed in this report. Overall, 9 studies were judged to have unclear risk of bias in at least 1 domain of the QUADAS-2, whereas 2 were assessed as having a high risk of bias.

For the included studies, clinical experts for each target condition were not consulted to confirm the appropriateness of the reference standard used. Additionally, there was a lack of clarity regarding what is considered an acceptable balance of sensitivity and specificity for these conditions. There is additional uncertainty in the QUADAS-2 appraisal about items requiring clinical expertise to determine whether the reference standard was likely to correctly classify the target condition. In the absence of clinical input, it was assumed that traditional ultrasound was an appropriate reference standard (i.e., likely to correctly classify the target condition) in all cases.

All the studies included in this review compared the image quality and the agreement and accuracy between diagnoses obtained using asynchronous TUS and standard in-person ultrasound. Our search identified few, if any, studies within the inclusion time frame that explored outcomes relating to patient care quality, such as direct patient outcomes. These include outcomes outside of diagnostic results, such as the quality and safety of care, access to care, and how the index test performs in real-world health care settings. Notably, no studies used survey or interview tools to capture patient and ultrasound-naive operator experiences with TUS.

A further limitation of this review is the variability in image acquisition methods used across the included studies. Four studies employed VSI to standardize scan protocols, whereas the others used varied techniques. Differences in imaging methods as well as in the expertise of operators and interpreting physicians across studies may impact the comparability of results and limit conclusions about the diagnostic accuracy, agreement, and service quality of asynchronous TUS when implemented at scale. The use of different imaging methods may have led to varying goals and expectations for those involved in the studies (e.g., ultrasound operator, interpreting physician). Furthermore, the studies reviewed in this report varied in both purpose and the model of care, which may not accurately reflect and be generalizable to practice models in Canada.

The literature search was limited to English-language articles and articles published within the past 5 years. Therefore, the results and conclusions do not reflect all available evidence relevant to the review questions. The results may have differed if all available evidence had been reviewed. Additionally, this report used a single-reviewer approach for study selection, data extraction, and risk of bias appraisal. This may have increased the risk of bias and error in these processes.

Conclusions and Implications for Decision- or Policy-Making

We reviewed the clinical evidence from 11 primary studies comparing asynchronous remote interpretation of TUS with standard in-person administered and interpreted ultrasound for various target conditions. Based on the literature search conducted for this review, we identified evidence about diagnostic accuracy, image quality, clinical utility, and acceptance. We did not identify any evidence about patient care quality that met our inclusion criteria. Additionally, although several position statements relevant to the use of TUS have been released, we did not identify any evidence-based guidelines to inform clinical practice.34-36

The role and scope of ultrasound imaging specialists and operators vary worldwide. The studies in our report included a range of imaging professionals who interpreted the TUS and in-person exams, including radiologists, specialty physicians, and family physicians with ultrasound experience. TUS and standard in-person ultrasound operators included sonologists, radiologists, physicians with and without ultrasound experience, medical trainees, and specialty physicians. The range of professionals included in this review reflects the variable exam protocols, expectations, and goals relevant to each study’s context, which may not be generalizable to the context in Canada.

Several studies identified in this report showed unclear or high risk of bias relating to patient selection, index test, reference standard, and flow and timing, which reduced the confidence in the studies’ conclusions.

Overall, asynchronous TUS was found to be a diagnostically accurate alternative to the standard in-person model of ultrasound for identifying certain targeted conditions. Asynchronous TUS was accepted by patients and clinicians in the 2 studies that examined these outcomes.27,30 Notably, the included studies performed a wide range of exam types (i.e., abdominal, thyroid, obstetrics, dermatological, and cardiac exams) and included both comprehensive and point-of-care exams, highlighting the growing role and expanding application of TUS in clinical practice.

High levels of diagnostic accuracy (e.g., sensitivity and specificity), diagnostic agreement, and exam image quality were reported across several, but not all, studies. The quality of evidence and conclusions are impacted by bias concerns, the risk of poor image quality, heterogeneity in results, and limited volume of evidence.

Notably, while 6 of the 11 included studies22,25,26,28,29,31 assessed TUS in low-resource settings, there was a lack of information on the experiences of patients, operators, and teleconsultants regarding factors like acceptability, accessibility, and comfort.

Considering the current limitations in the recent body of evidence, future well-designed and larger-scale studies may be needed to evaluate the quality of care provided by asynchronous TUS beyond feasibility and diagnostic outcomes. This includes exploring patient perspectives on accessibility (equitable access to services, financial burden) and personal preference and expectations, and incorporating surveys and qualitative methods into study designs to examine the impact on outcomes important to patients.

Beyond the current evidence, researchers may consider collecting equity-relevant population characteristics (e.g., gender, education, socioeconomic status, place of residence) to assess potential health disparities related to accessing ultrasound services. Researchers may also consider that equity-deserving populations, such as Indigenous communities, racialized groups, and newcomers to Canada, may face unique barriers to accessing ultrasound services. Therefore, researchers may consider efforts to recruit individuals from diverse groups in future studies. Studies that examine the real-world community, unmet clinical need, and health system impact of asynchronous TUS would also support a better understanding of the role of TUS for increasing access to services and providing timely and accurate diagnoses, particularly in resource-limited settings.3,11

Decision-makers may consider how closely the training, scope of practice, and roles of personnel in their local setting align with those in the reviewed studies. The diagnostic accuracy of asynchronous TUS may depend on the skill level of both image acquirers and interpreters. Therefore, policy or implementation decisions may benefit from being accompanied by clear protocols for training, credentialing, and oversight to ensure diagnostic accuracy, patient safety, and risk management.

References

1.Dearing E, Boniface K. Tele-Ultrasound. In: Sikka N, ed. A Practical Guide to Emergency Telehealth. Oxford University Press; 2021:chap 23.

2.Pian L, Gillman LM, McBeth PB, et al. Potential Use of Remote Telesonography as a Transformational Technology in Underresourced and/or Remote Settings. Emerg Med Int. 2013;2013(1):986160. doi:10.1155/2013/986160 PubMed

3.Adams SJ, Burbridge B, Obaid H, Stoneham G, Babyn P, Mendez I. Telerobotic Sonography for Remote Diagnostic Imaging: Narrative Review of Current Developments and Clinical Applications. Review. J Ultrasound Med. Jul 2021;40(7):1287-1306. doi:https://dx.doi.org/10.1002/jum.15525 PubMed

4.Bhide A, Datar S, Stebbins K. Ultrasound Imaging - Cheap, Versatile, and Safe (Working Paper 20-003). Harvard Business School; 2020. Accessed 2024 Oct 17. https://www.hbs.edu/ris/Publication%20Files/20-003_8157a0c0-71c9-4f6a-88c7-98cba5294123.pdf

5.Dowdy DL, Harris RD. Tele-Ultrasound: Meeting Global Imaging Challenges. Applied Radiology. 2024;53(1):38-41. doi:10.37549/ar2949

6.Duarte ML, Dos Santos LR, Iared W, Peccin MS. Telementored ultrasonography: a narrative review. Review. Sao Paulo Med J. 2022;140(2):310-319. doi:https://dx.doi.org/10.1590/1516-3180.2020.0607.R2.15092021 PubMed

7.Micks T, Sue K, Rogers P. Barriers to point-of-care ultrasound use in rural emergency departments. CJEM. Nov 2016;18(6):475-479. doi:10.1017/cem.2016.337 PubMed

8.Executive Summary. Health Sciences Association of British Columbia; 2016. Accessed 2024 Oct 8. https://www.hsabc.org/sites/default/files/uploads/Executive%20Summary_1.pdf

9.Strategic Plan 2023-2025. Sonography Canada; 2022. Accessed 2024 Oct 8. https://sonographycanada.ca/app/uploads/2022/11/Sonography-Canada-Strategic-Plan-2023-2025-Member-version.pdf

10.Britton N, Miller MA, Safadi S, Siegel A, Levine AR, McCurdy MT. Tele-Ultrasound in Resource-Limited Settings: A Systematic Review. Systematic Review. Front. 2019;7:244. doi:https://dx.doi.org/10.3389/fpubh.2019.00244 PubMed

11.Barberato SH, Lopes M. Echoes of Telecardiology Guideline. Arq Bras Cardiol. 01 2020;114(1):130-132. doi:https://dx.doi.org/10.36660/abc.20190720

12.Hanna TN, Steenburg SD, Rosenkrantz AB, Pyatt RS, Jr., Duszak R, Jr., Friedberg EB. Emerging Challenges and Opportunities in the Evolution of Teleradiology. AJR Am J Roentgenol. Dec 2020;215(6):1411-1416. doi:10.2214/ajr.20.23007 PubMed

13.Uschnig C, Recker F, Blaivas M, Dong Y, Dietrich CF. Tele-ultrasound in the Era of COVID-19: A Practical Guide. Review Research Support, Non-U.S. Gov't. Ultrasound Med Biol. 06 2022;48(6):965-974. doi:https://dx.doi.org/10.1016/j.ultrasmedbio.2022.01.001

14.Chen RJ. Teleultrasound in Remote and Austere Environments. Journal of Mobile Technology in Medicine. 2023;9(1):43-47. doi:10.7309/jmtm.9.1.5

15.Marsh-Feiley G, Eadie L, Wilson P. Telesonography in emergency medicine: A systematic review. PLoS ONE. 2018;13(5):e0194840. doi:10.1371/journal.pone.0194840 PubMed

16.Salerno A, Tupchong K, Verceles AC, McCurdy MT. Point-of-Care Teleultrasound: A Systematic Review. Systematic Review. Telemed J E Health. 11 2020;26(11):1314-1321. doi:https://dx.doi.org/10.1089/tmj.2019.0177

17.Constantinescu EC, Nicolau C, Săftoiu A. Recent Developments in Tele-Ultrasonography. Curr Health Sci J. Apr-Jun 2018;44(2):101-106. doi:10.12865/chsj.44.02.01 PubMed

18.Tenajas R, Miraut D, Illana CI, Alonso-Gonzalez R, Arias-Valcayo F, Herraiz JL. Recent Advances in Artificial Intelligence-Assisted Ultrasound Scanning. Applied Sciences. 2023;13(6):3693.

19.Li XL, Sun YK, Wang Q, et al. Synchronous tele-ultrasonography is helpful for a naive operator to perform high-quality thyroid ultrasound examinations. Ultrasonography. Oct 2022;41(4):650-660. doi:https://dx.doi.org/10.14366/usg.21204 PubMed

20.Whiting PF, Rutjes AW, Westwood ME, et al. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. Oct 18 2011;155(8):529-36. doi:10.7326/0003-4819-155-8-201110180-00009 PubMed

21.Alfageme F, Minguela E, Martinez C, et al. Dermatologic Ultrasound in Primary Care: A New Modality of Teledermatology: A Prospective Multicenter Validation Study. Multicenter Study. J Ultrasound Med. Feb 2021;40(2):351-356. doi:https://dx.doi.org/10.1002/jum.15409 PubMed

22.Dougherty A, Kasten M, DeSarno M, et al. Validation of a Telemedicine Quality Assurance Method for Point-of-Care Obstetric Ultrasound Used in Low-Resource Settings. J Ultrasound Med. Mar 2021;40(3):529-540. doi:https://dx.doi.org/10.1002/jum.15429 PubMed

23.Hjorth-Hansen AK, Andersen GN, Graven T, et al. Feasibility and Accuracy of Tele-Echocardiography, With Examinations by Nurses and Interpretation by an Expert via Telemedicine, in an Outpatient Heart Failure Clinic. J Ultrasound Med. Dec 2020;39(12):2313-2323. doi:https://dx.doi.org/10.1002/jum.15341 PubMed

24.Lu J, Lin J, Yin L, et al. Using remote consultation to enhance diagnostic accuracy of bedside transthoracic echocardiography during COVID-19 pandemic. Echocardiography. 08 2021;38(8):1245-1253. doi:https://dx.doi.org/10.1111/echo.15124

25.Marini TJ, Oppenheimer DC, Baran TM, et al. Testing telediagnostic right upper quadrant abdominal ultrasound in Peru: A new horizon in expanding access to imaging in rural and underserved areas. Research Support, Non-U.S. Gov't. PLoS ONE. 2021;16(8):e0255919. doi:https://dx.doi.org/10.1371/journal.pone.0255919 PubMed

26.Marini TJ, Weiss SL, Gupta A, et al. Testing telediagnostic thyroid ultrasound in Peru: a new horizon in expanding access to imaging in rural and underserved areas. J Endocrinol Invest. Dec 2021;44(12):2699-2708. doi:https://dx.doi.org/10.1007/s40618-021-01584-7 PubMed

27.Morel B, Hellec C, Fievet A, et al. Reliability of 3-D Virtual Abdominal Tele-ultrasonography in Pediatric Emergency: Comparison with Standard-of-Care Ultrasound Examination. Ultrasound Med Biol. Nov 2022;48(11):2310-2321. doi:10.1016/j.ultrasmedbio.2022.07.004 PubMed

28.Nascimento BR, Beaton AZ, Nunes MCP, et al. Integration of echocardiographic screening by non-physicians with remote reading in primary care. Heart. Feb 2019;105(4):283-290. doi:10.1136/heartjnl-2018-313593

29.Nascimento BR, Sable C, Nunes MCP, et al. Echocardiographic screening of pregnant women by non-physicians with remote interpretation in primary care. Fam Pract. Jun 17 2021;38(3):225-230. doi:10.1093/fampra/cmaa115

30.Nieto-Calvache AJ, Benavides-Calvache JP, Aryananda R, et al. Telemedicine ultrasound assessment for placenta accreta spectrum: Utility and interobserver reliability of asynchronous remote imaging review. Int J Gynaecol Obstet. Mar 2025;168(3):1191-1203. doi:10.1002/ijgo.15991 PubMed

31.Toscano M, Marini TJ, Drennan K, et al. Testing telediagnostic obstetric ultrasound in Peru: a new horizon in expanding access to prenatal ultrasound. BMC Pregnancy Childbirth. Apr 26 2021;21(1):328. doi:10.1186/s12884-021-03720-w PubMed

32.Liberati A, Altman DG, Tetzlaff J, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. J Clin Epidemiol. 2009;62(10):e1-e34. PubMed

33.Hartling L HM, Milne A, et al. Validity and Inter-Rater Reliability Testing of Quality Assessment Instruments. Agency for Healthcare Research and Quality; 2012. Accessed 19 Sep 2025. https://www.ncbi.nlm.nih.gov/books/NBK92287/

34.Canadian Association of Radiologists. CAR Position Statement on Remote Reporting. 2025 Apr 29. https://car.ca/wp-content/uploads/2021/12/RR-Position-Statement_forPDF-1.pdf

35.Wei P-F. Chinese recommendations for the implementation of bedside echocardiography and remote consultation in patients with coronavirus disease 2019. Chin Med J (Engl). Dec 5 2020;133(23):2847-2849. doi:10.1097/cm9.0000000000001222 PubMed

36.American College of Radiology. ACR Practice Parameter for Radiologist Coverage of Imaging Performed in Hospital Emergency Departments. 2023. https://gravitas.acr.org/PPTS/DownloadPreviewDocument?DocId=22

Appendix 1: Selection of Included Studies

Please note that this appendix has not been copy-edited.

Figure 1: Selection of Included Studies

A flow chart showing the number of records identified for review, excluded, and studies included in the final report. Out of 692 records from the electronic literature search and grey literature reports identified, 11 studies were included in this review.

Appendix 2: Characteristics of Included Studies

Please note that this appendix has not been copy-edited.

Table 2: Characteristics of Included Studies

Study citation, country, funding source

Study design, outcomes

Population characteristics

Index test and reference standard

Nieto-Calvache et al. (2024)30

Country of publication: Colombia

Funding source: None

Cross-sectional cohort agreement study

Type of ultrasound: Generala

Patient sample size: 5

Relevant outcomes:

  • diagnostic agreement

  • image quality

  • clinical utility and acceptance

Patients from Colombia and Indonesia treated for placenta accreta spectrum (PAS) requiring transabdominal and transvaginal ultrasound for prenatal staging

Index test: Asynchronous TUS interpretation with simulated consultation by 12 experts located in 11 countries: Argentina, Brazil, Colombia, Egypt, England, Ghana, Ireland, Italy (2), Taiwan, US (2)

Teleultrasound interpreter: Imaging medical expert with experience in PAS

Reference standard: Standard in-person ultrasound interpretation

Standard ultrasound interpreter: Onsite physician sonologist

Morel et al. (2022)27

Country of publication: France

Funding source: NR

Cross-sectional cohort diagnostic accuracy study

Type of ultrasound: 3D Ultrasound Aplio i800 (Canon Medical System)

Patient sample size: 103

Relevant outcomes:

  • diagnostic measures (sensitivity, specificity, PPV, NPV, reliability, agreement)

  • image quality

  • clinical utility and patient acceptance

Children undergoing ultrasound for abdominal pain in 2 hospitals in France (1 university and 1 regional hospital)

Mean age, months ± SD:

  • 125.0 ± 57.0 months

  • (range 0 to 212)

Sex, n (%):

  • Female: 39 (48%)

  • Male: 54 (52%)

Index test: Asynchronous 3D TUS interpretation

Teleultrasound interpreter: 1 senior radiologist with 7 years of experience and 1 senior resident located at a university hospital

Reference standard: Standard in-person ultrasound and interpretation

Traditional ultrasound interpreter: Senior pediatric radiologist

Alfageme et al. (2021)21

Country of publication: Spain

Funding source: None reported

Cross-sectional cohort diagnostic accuracy study

Type of ultrasound: Dermatological ultrasound

Patient sample size: 143

Relevant outcomes:

  • diagnostic measures (discordance, sensitivity, specificity, PPV, NPV)

  • image quality

Adult patients in Spain aged from 18 to 70 years with palpable nodular skin lesions

Mean age, years ± SD:

  • 47.0 ± 23.0

Sex, n (%):

  • Female: 93 (65%)

  • Male: 50 (35%)

Index test: Asynchronous TUS interpretation of images obtained from 6 primary care centres by family physicians with a minimum of 5 years of ultrasound experience

Teleultrasound interpreter: 4 dermatologists with at least 5 years of dermatological ultrasound experience located at a tertiary centre

Reference standard: Standard in-person ultrasound interpretation

Traditional ultrasound interpreter: Dermatologists at a tertiary centre

Marini et al. (2021)25

Country of publication: Peru

Funding source: Various

Cross-sectional cohort diagnostic accuracy study

Type of ultrasound: Generala (Mindray DP-10 ultrasound machine)

Patient sample size: 144

Relevant outcomes:

  • diagnostic agreement

  • image quality

  • visualization

Adult patients located at a health centre in Peru requiring ultrasound examination of the right upper abdominal quadrant

Mean age, years (range):

  • 43.9 (18 to 90)

Sex, n (%):

  • Male: 15 (10.4%)

  • Female: 129 (89.6%)

Index test: Asynchronous TUS interpretation of VSI abdominal exam clips obtained by 2 ultrasound-naive operators (nurse and care technician) in Peru

Teleultrasound interpreter: 2 board-certified abdominal fellowship-trained radiologists in the US

Reference standard: Standard in-person ultrasound and interpretation by radiologist

Traditional ultrasound interpreter: Radiologist in Peru with more than 10 years of experience

Marini et al. (2021)26

Country of publication: Peru

Funding source: Various

Cross-sectional cohort diagnostic accuracy study

Type of ultrasound: Generala (Mindray DP-10 ultrasound machine)

Patient sample size: 121

Relevant outcomes:

  • diagnostic agreement

  • image quality

  • visualization

Adult patients visiting a health centre in Peru for various reasons

Mean age, years ± SD:

  • 33.3 ± 15

Sex, n (%):

  • Female: 119 (98.3%)

  • Male: 2 (1.7%)

Index test: Asynchronous TUS interpretation of VSI thyroid exam clips obtained by 2 ultrasound-naive operators (nurse and care technician) in Peru

Teleultrasound interpreter: 2 board-certified fellowship-trained radiologists in the US with 7 and 40 years of experience, respectively

Reference standard: Standard in-person ultrasound and interpretation by radiologist

Traditional ultrasound interpreter: Experienced radiologist in Peru

Toscano et al. (2021)31

Country of publication: Peru

Funding source: Various

Cross-sectional cohort diagnostic accuracy study

Type of ultrasound: Generala (Mindray DP-10 ultrasound machine)

Patient sample size: 126

Relevant outcomes:

  • diagnostic agreement

  • image quality/visualization

Second or third trimester patients visiting a health centre in Peru for an obstetric ultrasound exam

Mean age, years:

  • 25.7

Trimester:

  • 68 second trimester

  • 58 third trimester

Index test: Asynchronous TUS interpretation of obstetric VSI exam clips obtained by 2 ultrasound-naive operators (nurse and care technician) in Peru

Teleultrasound interpreter: An experienced maternal-fetal medicine fellow in the US

Reference standard: Standard in-person ultrasound and interpretation by radiologist

Traditional ultrasound interpreter: Experienced radiologist in Peru

Lu et al. (2021)24

Country of publication: China

Funding source: Medical Research Council

Cross-sectional cohort diagnostic accuracy study (consensus reference standard)

Type of ultrasound: B-TTE

Patient sample size: 30

Relevant outcomes:

  • diagnostic agreement

  • reliability

  • misdiagnosis

Patients admitted and treated for COVID-19 at a health centre in China requiring B-TTE

Mean age, years ± SD (range):

  • 52 ± 15 (25 to 87)

Sex, n (%):

  • Female: 14 (46.7%)

  • Male: 16 (53.3%)

Index test: Asynchronous interpretation of B-TTE exams obtained by 5 frontline ECG physicians

Teleultrasound interpreter: 2 ECG remote consultants with specialized ultrasound training (associate chief physician and chief physician)

Reference standard: In-person B-TTE and interpretation (consensus reference standard)

Traditional ultrasound interpreter: Ultrasound-qualified physicians with more than 5 years of experience

Nascimento et al. (2021)29

Country of publication: Brazil

Funding source: Various

Cross-sectional cohort diagnostic accuracy study with partial verification

Type of ultrasound: Hand-held (GE-VSCAN)

Patient sample size: 56b of 1,112

Relevant outcomes:

  • diagnostic agreement

Pregnant patients undergoing screening ECG for heart disease at 22 primary care centres located in Brazil

Mean age, years ± SD:

  • 27 ± 8

Mean gestational age, weeks ± SD:

  • 22 ± 9

Index test: Asynchronous remote interpretation of ECG exams obtained by ultrasound-naive health care workers in Brazil

Teleultrasound interpreter: 3 experts (1 in Brazil and 2 in the US)

Reference standard: In-person ECG exam and interpretation by expert (in cases of positive significant screening results)

Traditional ultrasound interpreter: Imaging experts (physicians) located at primary care centres

Dougherty et al. (2020)22

Country of publication: US

Funding source: NR

Cross-sectional cohort diagnostic accuracy study

Type of ultrasound: Voluson E8 system (GE HealthCare) and GE LOGIQ

Patient sample size: 113

Relevant outcomes:

  • diagnostic agreement

  • fetal number

  • fetal presentation

Second trimester (14 to 16 weeks) outpatients at a university medical centre in the US requiring an obstetric ultrasound exam

Index test: Asynchronous TUS interpretation of obstetric VSI exam clips obtained by 2 ultrasound-naive operators with training (fourth-year medical students)

Teleultrasound interpreter: 3 physicians with specialized training in obstetric ultrasound with between 3 and 25 years of experience

Reference standard: Standard in-person ultrasound by trained sonographer

Traditional ultrasound interpreter: Maternal-fetal medicine specialist or radiologist

Hjorth-Hansen et al. (2020)23

Country of publication: Norway

Funding source: NR

Cross-sectional cohort diagnostic accuracy study

Type of ultrasound: Vivid 7 scanner (GE HealthCare)

Patient sample size: 50

Relevant outcomes:

  • diagnostic agreement

Adult patients located at an outpatient heart failure clinic in Norway requiring ECG

Mean age, years (range):

  • 79 (33 to 95)

Sex, n (%):

  • Female: 23 (46%)

  • Male: 27 (54%)

Index test: Asynchronous TUS interpretation of images obtained by 3 specialized nurses with 6 to 12 years of heart failure experience who were trained to use ECG

Teleultrasound interpreter: Out-of-hospital cardiologist

Reference standard: Standard in-person ultrasound immediately after teleultrasound scan

Traditional ultrasound interpreter: 3 cardiologists and 1 experienced resident

Nascimento et al. (2019)28

Country of publication: Brazil

Funding source: Various

Cohort diagnostic accuracy study with partial verification

Type of ultrasound: Hand-held (GE-VSCAN)

Patient sample size: 85c of 1,004

Relevant outcomes:

  • diagnostic agreement

Pregnant patients undergoing screening ECG for heart disease at 16 primary care centres located in Brazil

Index test: Asynchronous remote interpretation of ECG exams obtained by 20 ultrasound-naive health care workers in Brazil (12 technicians, 4 nurses, 4 physicians)

Teleultrasound interpreter: 3 experts (1 in Brazil and 2 in the US)

Reference standard: In-person portable ECG exam and interpretation by expert (within 60 days, in cases of positive significant screening results)

Traditional ultrasound interpreter: Imaging experts located at primary care centres

B-TTE = bedside transthoracic echocardiographic examination; ECG = echocardiography; NPV = negative predictive value; NR = not reported; PAS = placenta accreta spectrum; PPV = positive predictive value; SD = standard deviation; TUS = teleultrasonography; VSI = volume sweep imaging.

aGeneral ultrasound refers to the use of ultrasound in a broad range of clinical applications, including organ-specific exams.

bA total of 56 of 1,112 patients screened with the TUS protocol underwent a follow-up confirmatory in-person ultrasound.

cA total of 85 of 1,004 patients screened with the TUS protocol underwent a follow-up confirmatory in-person ultrasound.
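
For readers less familiar with the diagnostic measures listed as outcomes in Table 2 (sensitivity, specificity, PPV, and NPV), the following is a minimal Python sketch of how these measures are conventionally derived from a 2 × 2 table that compares an index test with a reference standard. The function names and the counts in the example are hypothetical and are not taken from any included study.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is 0."""
    return numerator / denominator if denominator else None


def diagnostic_measures(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV, and NPV from 2 x 2 counts.

    tp, fn: reference-positive cases the index test called positive, negative.
    fp, tn: reference-negative cases the index test called positive, negative.
    """
    return {
        "sensitivity": _ratio(tp, tp + fn),  # true positives / all reference-positive
        "specificity": _ratio(tn, tn + fp),  # true negatives / all reference-negative
        "ppv": _ratio(tp, tp + fp),          # true positives / all index-positive
        "npv": _ratio(tn, tn + fn),          # true negatives / all index-negative
    }


# Hypothetical counts, for illustration only.
measures = diagnostic_measures(tp=18, fp=2, fn=2, tn=78)
for name, value in measures.items():
    print(f"{name}: {value:.1%}")
```

Because PPV and NPV depend on how common the target condition is in the sample, values from studies with different prevalence are not directly comparable, even when sensitivity and specificity are similar.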

Appendix 3: Main Study Findings

Please note that this appendix has not been copy-edited.

Table 3: Summary of Findings

Primary study

TUS diagnostic accuracy and agreement

TUS image quality

Clinical experience

Nieto-Calvache et al. (2024)30

  • 71.7% (38/53) of teleconsultations aligned with topographical prognosis by local sonologist

  • 28.3% (15/53) of teleconsultations differed from local sonologist: lesion severity was overestimated in 9 cases and there was a failure to detect signs of PAS in 6 cases

  • Substantial agreement between the 10 teleconsultants (kappaa = 0.63)

  • Variable agreement between the teleconsultation and in-person ultrasound diagnosis, with near perfect agreement in 5 of 10 teleconsultants (kappa = 0 to 1)

  • 16.7% (2/12) of teleconsultants considered the quality of ultrasound images to be suboptimal

  • 84.6% (11/13) of teleconsultants considered TUS protocol useful for surgical planning

  • 75% (9/12) of teleconsultants considered evaluating ultrasound images remotely no more difficult than evaluating in person

  • 100% of teleconsultants reported that telemedicine can facilitate the prenatal assessment of patients at risk for PAS

Morel et al. (2022)27

  • Comparing the conclusions of the standard and the teleultrasound examinations:

  • Senior TUS radiologist: sensitivity = 86%, specificity = 95%, PPV and NPV = 92%

  • TUS resident: sensitivity = 84%, specificity = 92%, PPV = 86%, NPV = 91%

  • The agreement between the TUS consultations and standard ultrasound diagnoses was almost perfect (kappa = 0.82; range = 0.72 to 0.92) and substantial (kappa = 0.71; range = 0.58 to 0.82) for the senior radiologist and resident, respectively

  • Excellent interobserver agreement between the index test and reference standard for images for quantitative variables (i.e., appendix, ileitis, fallopian tube, pylorus muscle) in pathological cases (ICC = 0.99, 95% CI = 0.98 to 0.99)

  • The quality of the imagesb was considered good to excellent in 84% (radiologist) and 70% (resident) of cases

  • Images were obtained without objections from the children, their parents, or the operators in more than 95% of cases

Alfageme et al. (2021)21

  • In-person and TUS diagnoses were concordant in 95.7% of consultations

  • All malignant tumours were detected (sensitivity = 100%); 2 cases of benign lesions were telediagnosed as malignant (specificity = 97.8%)

  • The PPV and NPV were 90% and 100%, respectively

  • Image qualityc was insufficient to include for evaluation in 6.1% (9/147) of consultations

  • 4.2% (6/144) of TUS consultations were not valid due to low-quality images that required in-person ultrasound evaluation

Marini et al. (2021)25

  • Almost perfect agreement reported between the index test and reference standard for the following variables:

  • Liver echogenicity and pancreas abnormalities: 100% agreement (kappa = 1; 95% CI, 1 to 1)

  • Liver abnormalities: 98.9% agreement (kappa = 0.8; 95% CI, 0.41 to 1.2)

  • Normal or abnormal abdominal exam: 94.5% agreement (kappa = 0.84; 95% CI, 0.7 to 0.98)

  • Gallbladder abnormalities: 86.8% (kappa = 0.69; 95% CI, 0.55 to 0.83)

  • Right kidney abnormalities: 86.2% (kappa = 0.13; 95% CI, -0.11 to 0.37)

  • Among TUS exams rated as having “acceptable” or “excellent” image quality, the sensitivity for detecting cholelithiasis was 93% (95% CI, 68.1% to 99.8%), and the specificity was 97% (95% CI, 89.5% to 99.6%)

  • Among all exams, the sensitivity for detecting cholelithiasis was 84.2% (95% CI, 60.4% to 96.6%), and the specificity was 97.7% (95% CI, 91.9% to 99.7%)

Image qualityd ratings of exams (%):

  • poor = 36.8%

  • acceptable = 38.9%

  • excellent = 24.3%

Marini et al. (2021)26

  • Excellent agreement (98.3%) on the presence of a nodule between TUS and standard-of-care thyroid imaging protocols (kappa = 0.91; 95% CI, 0.78 to 1)

  • There was fair to moderate agreement on thyroid size between the index test and reference standard (ICC = 0.37 to 0.58) and no significant difference in nodule size (TUS: 9.8 mm ± 5.2 mm; standard: 10.1 mm ± 8 mm)

  • The Bland-Altman bias varied between 2.84 and 1.07, indicating that VSI tended to result in larger measurements compared to standard ultrasound

  • 88% of exams were rated as having excellent image qualitye

  • Visualization of thyroid structures was complete or near complete for most exams

Toscano et al. (2021)31

  • TUS diagnosis showed excellent agreement with standard-of-care ultrasound for identifying the number of fetuses (100% agreement), fetal presentation (95.8% agreement, kappa = 0.78), placental location (85.6% agreement, kappa = 0.74), and assessment of normal/abnormal amniotic fluid volume (99.2% agreement)

  • Sensitivity and specificity were > 95% for all variables

  • High agreement between VSI and standard ultrasound for most exams (95.2% agreement, kappa = 0.55)

  • 76.2% agreement between VSI and standard ultrasound on the status of the fetus (i.e., confirmation of a live fetus)

  • ICC between VSI and standard protocol was excellent for overall estimated gestational age (ICC = 0.95) and good or excellent for all fetal biometric measurements (ICC = 0.81 to 0.95)

  • 99% of imaging clips were considered of “acceptable” (38.1%) or “excellent” (61.1%) qualityf

Lu et al. (2021)24

  • Remote B-TTE showed superiority in diagnostic accuracy compared to the onsite B-TTE protocol:

  • Diagnostic reliability of onsite B-TTE was very weak to weak (kappa = 0.304 to 0.6; agreement = 63.3%)

  • Onsite B-TTE assessment resulted in a misdiagnosis rate of 36.7% for major abnormalities (11/30)

Nascimento et al. (2021)29

  • There was 80.4% (45/56) agreement between the TUS and the in-person follow-up ultrasound for identifying major heart disease

  • Agreement was 82.2% and 40% for mitral valve disease and aortic valve disease, respectively

Dougherty et al. (2020)22

The agreements among remote readers and the in-person readers varied according to placental location and fetal numbers:

  • Excellent agreement for the anterior and posterior variables (kappa = 0.81 to 0.88 and kappa = 0.77 to 0.9, respectively), with sensitivity and specificity values ≥ 0.85

  • Agreement was slight or fair (kappa = 0 to 0.39) for other placental locations where the placenta attaches to the uterus (left, right, fundal, low), and the sensitivity and specificity ranged widely

  • Excellent agreement for fetal number comparisons (kappa = 0.82 to 1, sensitivity = 0.83 to 1, and specificity = 0.99 to 1)

  • Moderate agreement for fetal presentation comparisons (kappa = 0.43 to 0.49)

Hjorth-Hansen et al. (2020)23

  • Agreement between the telemedical approach and reference measurements was reported as very high; agreement with the reference for classifying heart failure was substantial (weighted kappa = 0.73)

  • TUS approach for assessing and quantifying left atrial and left ventricular measurements and functional indices was feasible in 94% of cases

  • Feasibility was high for 86.7% of all measured indices (13/15), while significant differences between TUS and in-person methods were reported in 26.7% of indices (4/15)

  • Sensitivity and specificity to detect moderate mitral stenosis, mitral regurgitation, and tricuspid regurgitation were 100% and 95% or higher, respectively

  • Sensitivity and specificity to detect moderate aortic stenosis were 43% and 97%, respectively

Nascimento et al. (2019)28

  • There was 78.8% (67/85) agreement between TUS and the in-person follow-up ultrasound for identifying significant heart disease

B-TTE = bedside transthoracic echocardiographic examination; CI = confidence interval; ICC = intraclass correlation coefficient; NPV = negative predictive value; P = P value; PAS = placenta accreta spectrum; PPV = positive predictive value; TUS = teleultrasonography; VSI = volume sweep imaging.

aFleiss' Kappa coefficient: the strength of agreement was classified as “poor” (0.0), “slight” (0.01 to 0.2), “fair” (0.21 to 0.4), “moderate” (0.41 to 0.6), “substantial” (0.61 to 0.80), and “almost perfect” (0.81 to 1).

bThe quality of the 3D acquisitions was qualitatively assessed on a 4-point Likert scale (“technical issue,” “poor,” “good,” or “excellent”).

cThe technical quality of the images was classed as being of sufficient or insufficient quality for the teleultrasound evaluation.

dThe overall image quality was rated as “excellent,” “acceptable,” or “poor.” “Excellent” examinations showed complete or nearly complete visualization of the liver and gallbladder with appropriate imaging quality. “Acceptable” examinations showed nearly complete or partial visualization of the liver and gallbladder with appropriate imaging quality. “Poor” examinations showed only partial or inadequate visualization of the liver and gallbladder with image quality limiting evaluation.

eImage quality was rated as “excellent,” “acceptable,” or “poor.” Excellent examinations visualized the entire or nearly entire thyroid gland with good brightness and resolution. Acceptable examinations visualized > 80% of the thyroid or had slightly limited brightness/resolution but still allowed for adequate gland or nodule assessment. Poor examinations visualized less than 80% of the total thyroid or were nondiagnostic.

fImage quality was assigned using a 3-point Likert scale (1 = “excellent,” 2 = “acceptable,” or 3 = “poor”) as was reader confidence in imaging findings (1 = “confident,” 2 = “intermediate,” 3 = “not confident”).
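
As a rough illustration of the agreement statistics reported in Table 3, the sketch below computes percent agreement and Cohen's kappa for 2 raters judging a binary finding, then labels the result using the strength-of-agreement bands given in footnote a. This is an assumption-laden sketch: Cohen's kappa applies to 2 raters only (Fleiss' kappa, which is used for multiple raters, follows a different formula), the function names and counts are hypothetical, and the treatment of negative values and of values that fall between the published bands is an assumption.

```python
def cohen_kappa(both_pos, only_first_pos, only_second_pos, both_neg):
    """Return (observed agreement, Cohen's kappa) for 2 raters, binary finding."""
    n = both_pos + only_first_pos + only_second_pos + both_neg
    observed = (both_pos + both_neg) / n
    # Agreement expected by chance, from each rater's marginal totals.
    first_pos = both_pos + only_first_pos
    second_pos = both_pos + only_second_pos
    expected = (first_pos * second_pos + (n - first_pos) * (n - second_pos)) / (n * n)
    if expected == 1:
        return observed, None  # kappa is undefined when chance agreement is 100%
    return observed, (observed - expected) / (1 - expected)


def agreement_label(kappa):
    """Map kappa to the bands in footnote a (upper bounds treated as inclusive)."""
    if kappa <= 0:
        return "poor"  # assumption: negative values are grouped with 0.0
    for upper, label in [(0.2, "slight"), (0.4, "fair"), (0.6, "moderate"),
                         (0.8, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical counts, for illustration only.
observed, kappa = cohen_kappa(both_pos=20, only_first_pos=5,
                              only_second_pos=5, both_neg=70)
print(f"agreement = {observed:.1%}, kappa = {kappa:.2f} ({agreement_label(kappa)})")
```

Under these bands, a kappa of 0.82 maps to "almost perfect" and 0.71 to "substantial," which matches the labels used for the senior radiologist and resident in the Morel et al. row. The example also shows why percent agreement and kappa can diverge: kappa discounts the agreement expected by chance, so high percent agreement can accompany a modest kappa when one category dominates.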

Appendix 4: Critical Appraisal of Included Studies

Please note that this appendix has not been copy-edited.

Figure 2: Results of the QUADAS-2 Risk of Bias Assessment by Domain

Visual summary of the results of the QUADAS-2 risk of bias assessment for all 11 included studies. Each study was evaluated on 4 domains: 1) Patient selection, 2) Index test, 3) Reference standard, and 4) Flow and timing. Each domain was rated as having either a “low,” “unclear,” or “high” risk of bias. Across all included studies, the “patient selection” domain showed the lowest risk of bias, while the “flow and timing” domain had the highest risk of bias, and the “index test” domain had the most unclear risk of bias.

QUADAS-2 = Quality Assessment of Diagnostic Accuracy Studies 2.

1Marini et al. (2021)25

2Marini et al. (2021)26

Table 4: Risk of Bias Assessment of Included Studies

Study citation

Strengths

Limitations

Nieto-Calvache et al. (2024)30

  • Patients were randomly selected.

  • A case-control design was avoided and patients were not inappropriately excluded.

  • The study population, index test, and reference standard match those targeted by the review.

  • The reference standard results were interpreted without knowledge of the results of the index test.

  • The index test results were interpreted without knowledge of the reference standard.

  • All participants received the same reference standard.

  • Methods to conduct and interpret the index test were well described.

  • The authors declared that they have no conflicts of interest with respect to the research, authorship, and publication of the article.

  • Did not present patient characteristics.

  • A small number of patients were included.

  • It is unclear how the teleconsultants were sampled.

  • Included teleconsultants did not have formal training in, and had variable experience with, the methodological approach employed by the authors (topographical classification).

  • Not all participants were included in the analysis.

  • The interval between the reference standard and index test is unclear.

Morel et al. (2022)27

  • Patients were recruited using consecutive sampling.

  • A case-control design was avoided and patients were not inappropriately excluded.

  • Patient characteristics were presented.

  • The reference standard results were interpreted without knowledge of the results of the index test.

  • The study population, index test, and reference standard match those targeted by the review.

  • Methods to conduct and interpret the index test were well described.

  • The reference standard was likely to correctly classify the target condition.

  • The authors declared that they have no conflicts of interest with respect to the research, authorship, and publication of the article.

  • The index test results may have been interpreted with knowledge of the reference standard as the index test and reference standard were performed by the same clinician, although interpretation of the index test occurred 2 months after collecting the reference standard to minimize bias.

  • The funding source for this study was not clearly reported.

Alfageme et al. (2021)21

  • The index test and the reference standard were interpreted with an appropriate time frame (< 3 days, with an average of 2.18 days).

  • The study population, index test, and reference standard match those targeted by the review.

  • Methods to conduct and interpret the index test were well described.

  • The index test results were interpreted without knowledge of the results of the reference standard.

  • The reference standard results were interpreted without knowledge of the results of the index test.

  • The reference standard was likely to correctly classify the target condition.

  • The authors declared that they have no conflicts of interest with respect to the research, authorship, and publication of the article.

  • Not all participants were included in the analysis.

  • The index test and the reference standard were not collected at the same visit, and it is unclear whether they were performed within an appropriate time frame.

  • It is unclear whether all patients received the same index test due to variability in equipment across sites.

Marini et al. (2021)25

  • All patients were included in the analysis.

  • All patients received the same reference standard.

  • The study population, index test, and reference standard match those targeted by the review.

  • Methods to conduct and interpret the index test were well described.

  • The index test results were interpreted without knowledge of the results of the reference standard.

  • The reference standard results were interpreted without knowledge of the results of the index test.

  • The reference standard was likely to correctly classify the target condition.

  • The authors declared conflicts of interest with respect to the research, authorship, and publication of the article.

  • Patients were recruited using convenience sampling.

  • The exact interval between interpreting the index test and reference standard is unclear.

Marini et al. (2021)26

  • Patients were recruited using random sampling.

  • The study population, index test, and reference standard match those targeted by the review.

  • All patients were included in the analysis.

  • All patients received the same reference standard.

  • The reference standard results were interpreted without knowledge of the results of the index test.

  • Methods to conduct and interpret the index test were well described.

  • The index test results were interpreted without knowledge of the results of the reference standard.

  • The reference standard was likely to correctly classify the target condition.

  • The authors declared conflicts of interest with respect to the research, authorship, and publication of the article.

  • The index test was interpreted later than the reference standard, although the exact interval is unclear.

  • The reference standard and index test were completed using different scanner protocols.

Toscano et al. (2021)31

  • The study population, index test, and reference standard match those targeted by the review.

  • All patients received the same reference standard.

  • The reference standard results were interpreted without knowledge of the results of the index test.

  • Methods to conduct and interpret the index test were well described.

  • All patients were included in the analysis.

  • The index test results were interpreted without knowledge of the results of the reference standard.

  • The reference standard was likely to correctly classify the target condition.

  • The authors declared conflicts of interest with respect to the research, authorship, and publication of the article.

  • Patients were recruited using convenience sampling.

  • The index test was interpreted later than the reference standard and the exact interval is unclear.

  • The reference standard and index test were completed using different scanner protocols.

Lu et al. (2021)24

  • Patient characteristics were presented.

  • The study population, index test, and reference standard match those targeted by the review.

  • All patients received the same reference standard.

  • Methods to conduct and interpret the index test were well described.

  • All patients were included in the analysis.

  • The index test results were interpreted without knowledge of the results of the reference standard.

  • The authors declared no conflicts of interest with respect to the research, authorship, and publication of the article.

  • Patients were recruited using convenience sampling.

  • The reference standard was established through a consensus with knowledge of the results of the index test to produce the “gold-standard.”

  • The reference standard was not likely to correctly classify the target condition due to operator-specific reasons.

  • The index test was interpreted later than the reference standard and the exact interval is unclear.

Nascimento et al. (2021)29

  • Patients were recruited using consecutive sampling, and inappropriate exclusions were avoided.

  • Patient characteristics were presented.

  • The study population, index test, and reference standard match those targeted by the review.

  • Methods to conduct and interpret the index test were well described.

  • The index test results were interpreted without knowledge of the results of the reference standard.

  • The authors declared no conflicts of interest with respect to the research, authorship, and publication of the article.

  • The reference standard was conducted and interpreted with knowledge of index test results: Only patients who screened positive with the index test were provided the reference standard for confirmation.

  • The reference standard was conducted and interpreted later than the index test, although the exact interval is unclear.

  • Only a subset of patients were included in the analysis that compared the index test results to the reference standard.

  • A small number of patients were included in the comparison.

Dougherty et al. (2020)22

  • Patients were recruited using random sampling.

  • The study population, index test, and reference standard match those targeted by the review.

  • All patients received the same reference standard.

  • The reference standard results were interpreted without knowledge of the results of the index test.

  • Methods to conduct and interpret the index test were well described.

  • The index test results were interpreted without knowledge of the results of the reference standard.

  • The reference standard was likely to correctly classify the target condition.

  • Patient characteristics were not presented in detail.

  • The index test was interpreted later than the reference standard and the exact interval is unclear.

  • Not all participants were included in the analysis.

  • The authors did not report on conflicts of interest with respect to the research, authorship, and publication of the article.

Hjorth-Hansen et al. (2020)23

  • The study population, index test, and reference standard match those targeted by the review.

  • All patients received the same reference standard.

  • The reference standard results were interpreted without knowledge of the results of the index test.

  • Methods to conduct and interpret the index test were well described.

  • The index test results were interpreted without knowledge of the results of the reference standard.

  • The reference standard was likely to correctly classify the target condition.

  • The authors declared conflicts of interest with respect to the research, authorship, and publication of the article.

  • Patients were recruited using convenience sampling.

  • The reference cardiologist was not blinded to previous ECG recordings (i.e., scans that predate the study).

Nascimento et al. (2019)28

  • Patients were recruited using consecutive sampling and avoided inappropriate exclusions.

  • Patient characteristics were presented.

  • The study population, index test, and reference standard match those targeted by the review.

  • Methods to conduct and interpret the index test were well described.

  • The index test results were interpreted without knowledge of the results of the reference standard.

  • The authors declared no conflicts of interest with respect to the research, authorship, and publication of the article.

  • The reference standard was conducted and interpreted with knowledge of index test results: Only patients who screened positive with the index test were provided the reference standard for confirmation.

  • The reference standard was conducted and interpreted up to 60 days later than the index test, which may not be an appropriate time frame for the target condition.

  • Only a subset of patients were included in the analysis that compared the index test results to the reference standard.

  • A small number of patients were included in the comparison.

ECG = electrocardiogram.