Accuracy and validation for surgical navigation systems
By Anthony M. DiGioia III, MD; Andrew B. Mor, PhD;
Branislav Jaramaz, PhD; and Joel M. Bach
Computer-assisted orthopaedic surgery (CAOS) systems are becoming more popular because these devices can improve accuracy, reduce outliers and enable less invasive surgical techniques in many areas of orthopaedics. Their greatest potential lies in enabling true minimally invasive surgery (MIS) and in their use as MIS surgical trainers.
With all of these potential clinical benefits, however, there is no standard method for characterizing and reporting the performance of CAOS systems. This presents difficulties for surgeons and researchers attempting to evaluate and compare different systems.
Accuracy and validation
Most of the currently available commercial CAOS systems are “passive” tools, commonly called surgical navigation systems (see Figure 1). These systems do not act on the patient directly but do provide the surgeon with feedback on the position of a tool or implant relative to the patient’s anatomy. Navigation systems are real-time intraoperative information systems that can also measure and document surgical techniques, range of motion and stability both before and after an intervention.
Figure 1. Components of a typical CAOS system: A. surgical navigation station; B. foot pedal (input device); C. position tracker (localizer); D. localizer interface unit; E. tracking markers.
Typical navigation systems for hip or knee replacement will inform the surgeon of the position and orientation of the tools and implants with respect to the bone, to ensure that the surgeon places the implant in the desired position.
The need to establish standards and guidelines for measuring the accuracy and validation of these computer-assisted tools is very real. There are two different ways to characterize accuracy, validation and performance: technical and clinical.
Technical accuracy is based on how accurately the system reports information to the surgeon. It includes both the end-to-end accuracy of the complete system and the accuracy of the individual subsystems that make up the complete system.
Clinical accuracy is determined by how closely the final position of the implanted device matches the planned or ideal placement. Clinical validation is related to clinical accuracy because it looks at short-term and long-term clinical outcomes determined by more conventional clinical measures. Clinical accuracy is typically evaluated postoperatively, and different surgical procedures will have different minimum accuracy requirements. For example, the clinical accuracy requirements for spinal interventions such as pedicle screw placement are likely to be very different from those for total hip or total knee replacement.
Both technical and clinical accuracy are application-specific. Currently, they are evaluated differently by different manufacturers, with no standardization in reporting.
The best-case accuracy values for a given system are determined using end-to-end evaluation of the system’s accuracy in a bench-top environment, which uses controlled test conditions and standard phantom bone models. Surgeons are also a factor in the evaluation, because they must not only collect the appropriate anatomic information as input but also act on the information correctly.
End-to-end tests take a phantom model through all the steps used in a typical procedure—from model generation and anatomic data acquisition (computed tomography [CT] or fluoroscopy for most image-based systems, landmark acquisition for image-free systems) to device placement. Then, an independent, highly accurate tool, such as a coordinate measuring machine, is used to determine the final position of the device.
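The core of such an end-to-end comparison is a pose difference between the planned and achieved device placement. As an illustrative sketch (the function and variable names here are our own, not taken from any particular system), the translational and angular components of that error can be computed from two homogeneous transforms:

```python
import numpy as np

def pose_error(planned, measured):
    """Translational (same units as input) and angular (degrees) error
    between two 4x4 homogeneous poses of an implant or guide."""
    t_err = np.linalg.norm(planned[:3, 3] - measured[:3, 3])
    # Relative rotation taking the planned orientation to the measured one.
    r_rel = planned[:3, :3].T @ measured[:3, :3]
    # Rotation angle recovered from the trace of the relative rotation.
    cos_theta = np.clip((np.trace(r_rel) - 1.0) / 2.0, -1.0, 1.0)
    return t_err, np.degrees(np.arccos(cos_theta))
```

The "measured" pose would come from the independent instrument (for example, the coordinate measuring machine), and the "planned" pose from the preoperative plan.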
Subsystem testing of the different components of the CAOS system is used to characterize the contribution of each subsystem to the overall system’s performance. Different subsystems that might be identified include: the tracking system, a patient simulation subsystem to aid in surgical planning, any software routines used to build models of the patient’s anatomy, registration between a preoperative model and the patient’s anatomy, anatomic point collection strategies for non-image systems and the selection of an anatomic coordinate system. Table 1 shows different subsystems that might be tested for three types of navigation systems.
Subsystem testing results, when reported, can also be helpful in predicting expected accuracy when a surgeon or company wants to adopt a commercial system for a different use (such as adapting a spine system to help guide the treatment for avascular necrosis of the femoral head).
Clinical accuracy is, by its very nature, quite different from technical accuracy. Cadavers can be used in a manner similar to the end-to-end accuracy tests done on bone phantoms, but cadaver studies do not fully replicate the conditions of live surgery. Patients cannot be characterized to the degree that phantom models can be measured, and standard evaluation techniques, such as radiographs, have been shown to be inaccurate at a level equal to or greater than typical CAOS system errors.
CT-based validation of implant positioning would provide the best clinical data to measure implant positioning, but has the obvious drawbacks of radiation exposure and high cost. A validated, reliable, standard postoperative measurement method is needed that would reduce measurement errors and is accurate enough to be able to quantify the errors generated through the use of CAOS systems.
Short-term outcomes have been reported for most systems, although often measured with techniques less accurate than the CAOS systems themselves. Longer-term results have recently begun to be published, and the clinical outcomes look promising.
Sources of error
CAOS systems have to be used correctly to be accurate and of practical use in the operating room (OR). Just as any mechanical alignment guide must be positioned correctly on bone for proper alignment, CAOS systems have specific requirements that must be followed for optimal performance.
The internal computer model of the patient’s anatomy is the heart of all navigation systems. This is also the point on which most systems differ. Image-based systems rely on a visual model of the patient’s anatomy; CT-based systems on a three-dimensional surface model; and fluoroscopy-based systems on a set of two-dimensional images. CT-based systems can introduce error if there is undetected motion of the patient in the CT scanner, while fluoroscopy-based systems typically require sufficient angular separation between images to generate accurate locations of anatomical landmarks. CT-based systems also require registration, or alignment, between the surface model of the anatomy generated from the CT scan and the patient’s anatomy on the operating table.
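One common mathematical approach to the paired-point portion of this registration step is the Kabsch (SVD-based) least-squares method. The sketch below is a generic illustration of that technique, not the algorithm of any specific commercial system; it assumes the model and patient points are already matched into corresponding pairs:

```python
import numpy as np

def register_points(model_pts, patient_pts):
    """Least-squares rigid transform (R, t) mapping corresponding
    model points onto patient points (Kabsch/SVD method)."""
    m_c = model_pts.mean(axis=0)
    p_c = patient_pts.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (model_pts - m_c).T @ (patient_pts - p_c)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (det = -1) solution.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = p_c - R @ m_c
    return R, t
```

In practice, surface-based registration (for example, iterative closest point) builds on this same paired-point solution inside an iterative matching loop.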
Image-free systems, on the other hand, rely on anatomic information gathered intraoperatively by the surgeon. These systems depend on consistent collection of anatomical landmark locations or surface geometry with a point probe and on kinematic analysis of joint centers from tracking data. If the surgeon doesn’t properly locate the landmarks, the final placement of the guides and alignment will be inaccurate. Poor collection of surface geometry, or poor construction of a reference model from collected data, will also introduce errors into the final result.
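As an example of the kinematic analysis mentioned above, a hip center is commonly estimated by pivoting the femur and fitting a sphere to the tracked marker positions. The following is our own minimal sketch of an algebraic least-squares sphere fit, assuming clean, well-distributed input points:

```python
import numpy as np

def fit_sphere(points):
    """Algebraic least-squares sphere fit; returns (center, radius).
    'points' are tracked marker positions collected while pivoting."""
    # ||p - c||^2 = r^2 rearranges to the linear system
    # 2*(p . c) + (r^2 - c . c) = p . p, solved for c and r.
    A = np.hstack([2.0 * points, np.ones((len(points), 1))])
    b = (points ** 2).sum(axis=1)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    center = x[:3]
    radius = np.sqrt(x[3] + center @ center)
    return center, radius
```

The quality of this estimate depends directly on the range of motion swept during collection, which is one reason point collection strategy appears as its own subsystem in accuracy testing.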
The other main source of error for CAOS systems is tracking inaccuracy. Navigation systems rely on either an optical or electromagnetic (EM) system to locate the relative positions of the patient and the surgical tools and devices during the procedure. A tracker is attached to the relevant anatomy (pelvis, femur, tibia, etc.) that can be located relative to a fixed reference, typically a camera or electromagnetic transmitter. The surgeon’s tools are similarly tracked, enabling the system to calculate the location of the tool relative to the patient.
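This relative-pose calculation is a simple change of reference frame. Assuming both the bone tracker and the tool pose are reported as 4x4 homogeneous transforms in the localizer (camera) frame, a sketch of the computation is:

```python
import numpy as np

def tool_in_bone(cam_T_bone, cam_T_tool):
    """Pose of the tool expressed in the bone tracker's frame, given
    both poses as 4x4 transforms in the camera (localizer) frame."""
    return np.linalg.inv(cam_T_bone) @ cam_T_tool
```

Because the camera frame cancels out of this product, the patient and camera can move relative to each other during surgery without invalidating the navigation, provided each tracker stays rigidly fixed to its bone or tool.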
The most predictable error is that inherent to the tracking system itself, which would be determined during the subsystem testing described above. However, OR use can degrade tracking accuracy. Optical trackers can become soiled by patient fluids, line of sight between the camera and the trackers can be blocked, or the camera can be confused by other light sources in the room. EM systems can be affected by metallic objects in the surgical field and even by metal elsewhere in the OR. Both types of systems can also have trouble tracking moving objects, because individual trackers are often located sequentially rather than simultaneously.
A potentially dangerous inaccuracy can be introduced if a tracker is inadvertently moved during the procedure, changing the relative positions of the bone and the tracker. The typical system cannot detect this, and the unwary surgeon may not notice the inconsistency between results on the computer screen and what is happening with the patient. Minimally invasive procedures increase the likelihood that this type of error will be undetected because of the surgeon’s limited visual field.
One subsystem test that measures contributions to system errors is the characterization of an EM tracking system. Although EM tracking is increasing in popularity, the different EM tracking systems may not have been appropriately tested in realistic surgical environments.
Because EM system accuracy can be affected by nearby metal objects or magnetic fields, tests should address that issue as well as the basic accuracy of the system. To measure the system’s accuracy over a volume of space, the tracker is moved through the volume and the measured locations are compared to the known true locations. This test must be repeated with and without metal artifacts in the tracked volume to assess the effect of that disturbance. The second test should determine the accuracy of the system in measuring various positions and orientations of a well-characterized phantom, with and without metal in the vicinity. This provides a more application-specific measure of the system’s accuracy.
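The volumetric part of such a test reduces to comparing measured grid positions against ground truth. A minimal sketch of the summary statistic (our own illustrative code, not drawn from any published standard) is an RMS error over the sampled points:

```python
import numpy as np

def tracking_rms_error(true_pts, measured_pts):
    """RMS positional error (same units as input) of tracker readings
    taken at known locations throughout the test volume."""
    d = np.linalg.norm(measured_pts - true_pts, axis=1)
    return float(np.sqrt(np.mean(d ** 2)))
```

Reporting the same statistic with and without metal in the volume isolates the distortion attributable to the disturbance.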
A third possible test would look at the accuracy of the system in tracking moving objects, which would be important for many image-free systems and systems that track patient kinematics intraoperatively. Finally, repeating the second test in the OR environment would determine if the OR equipment causes any possible disturbances. The results from these tests also can be compared to optical tracking systems, which are the current standard for commercially available navigation systems.
ASTM International and standards for measuring accuracy
Standard methods for measuring and reporting accuracy for CAOS systems, as well as definitions for the types of systems and related terminology, will reduce the current difficulties in comparing competing systems, and enable doctors, administrators and regulators to make more knowledgeable decisions when evaluating these systems. The American Society for Testing and Materials International (ASTM International) is assisting the CAOS community in developing standards to achieve this goal.
ASTM International, one of the largest standards-developing organizations in the world, is a global forum for the development of consensus standards for materials, products, systems and services. In June 2004, a group of surgeons, academics, product manufacturers and regulators from several countries—including Canada, France, Germany, Italy, Japan, Switzerland and the United States—attended an organizational meeting with ASTM International staff members held in conjunction with the CAOS International annual meeting. They discussed the development of voluntary consensus standards for the metrology, validation and performance of CAOS systems. Three draft standards were planned:
• Standard practice for measurement of positional accuracy of CAOS systems
• Standard classification of CAOS imaging systems
• Standard terminology relating to CAOS systems
The first draft standard is currently being developed. Joel M. Bach, of the Colorado School of Mines, leads the task group. The terminology and classification standards will follow. Completion of the first standard and initial voting for acceptance will take place soon.
The CAOS promise
CAOS systems will reduce the number of outliers in surgical outcomes and lower the errors in implant placement for both novice and experienced surgeons. But until standards for assessing these systems are developed and implemented, there will be uncertainty about how well individual systems perform and how they compare to each other.
Although it may be some time before performance standards are developed, ensuring that different systems are tested in the same manner will increase confidence among the user community that CAOS systems will achieve their stated outcome goals. Comprehensive testing and reporting of accuracy numbers, both technical and clinical, will ensure such systems are suitable for routine use.
Anthony M. DiGioia III, MD, is a member of the AAOS Biomedical Engineering Committee and director of the Institute for Computer Assisted Orthopaedic Surgery (ICAOS) at The Western Pennsylvania Hospital. He can be reached at firstname.lastname@example.org
Branislav Jaramaz, PhD, is assistant and scientific director of ICAOS; Andrew B. Mor, PhD, is a research scientist at ICAOS.