In Conversation With… Karl Bilimoria, MD, MS
Editor's note: Dr. Bilimoria is the Director of the Surgical Outcomes and Quality Improvement Center of Northwestern University, which focuses on national, regional, and local quality improvement research and practical initiatives. He is also the Director of the Illinois Surgical Quality Improvement Collaborative and a Faculty Scholar at the American College of Surgeons. In the second part of a two-part interview (the earlier one concerned residency duty hours), we spoke with him about quality and safety in surgery.
Dr. Robert M. Wachter: Surgeons have always cared deeply about their quality and safety and their outcomes, but they have tended to embrace the traditional, "It's all about the individual performance and virtuosity" mantra. How has life changed over the last 10 or 15 years in terms of systems thinking, and how do you and surgeons think about improvement now?
Dr. Karl Bilimoria: It has evolved from the individual blame game to thinking about how to improve systems of care to achieve safe care and good outcomes. The M&M [morbidity and mortality] conferences across the country have changed from blaming the presenter to thinking about how do we prevent this from ever happening again. It has certainly moved to a nonpunitive environment. The best evidence for how our culture has changed in surgery has been that whenever we now put forth public reporting initiatives, there is almost no pushback. Surgeons are used to this. They realize that it's better when we do it ourselves rather than having it pushed upon us. We definitely have evolved in both the safety and quality areas and hopefully to some degree are perceived as leaders in the area.
RW: I think you are. As people accept the idea of measurement, do they buy that the methodology is good enough—both in the sense that you're measuring the right things and that you've captured the frequent perception that "my patients are older and sicker"?
KB: I think it depends on the data source. Our confidence in claims data continues to be very poor. They are fairly terrible for measuring postoperative complications. And most of today's claims data are focused on the inpatient stay, but length of stay has decreased a lot. For complex operations, we discharge people after a day or two, so most of their complications are occurring in the outpatient setting.
But when you move into registry data like the National Surgical Quality Improvement Program (NSQIP) data, we have much more confidence. They're collected in a standardized fashion across 600 hospitals in the country. They collect a number of demographics and comorbidities about the patient and what was done in the actual surgery, which allows us to faithfully risk adjust and level the playing field among hospitals and surgeons. So that is where our future is.
RW: What are the other megatrends, as EHR adoption has gone from 10% a decade ago to 90% now? How does the work of registry data collection and analysis change, and where is that going over the next 5 or 10 years as the systems get better?
KB: We have perfected manual data abstraction—it's as good as it's going to get for now, using standardized definitions, audits, and whatnot. But it's clearly not sustainable. It only gets us a sample of the cases at each hospital because we don't have the manpower to collect every case. At this point, we are not very good at pulling data out of the EHR in an accurate, reliable way. We can get the easy variables: the demographics and their lab values and even some of their comorbidities. But what really becomes challenging has been to identify postoperative complications from the EHR using standard definitions. We've had projects with various EHR vendors putting their best and brightest to work trying to figure this out and that really hasn't panned out. I have faith that we can do it. Especially as systems move to unified EHRs, we'll see more and more standardization. We clearly have to do it. Otherwise, we will not succeed in having high-quality data for all of our patients.
RW: If you had a family member who needed surgery in a distant town (and you were not a doctor who knew who to call), what would you do to figure out what the best place was for him or her to go and who the best surgeon would be?
KB: The publicly available data for patients are still fairly weak. Most of it is based on that flawed administrative data and nonstandard, unaudited data. So that is challenging. I don't think there are great data sources out there for assessing empirically where you should receive care. To be honest, the thing you would do is to talk to your referring doctors in the local market and talk to family and friends and hear about who they have had good experiences with to get some of that information.
The other place that seems to have good face validity because it collects that data for you through a somewhat systematic survey is the US News and World Report rankings. Through their reputation survey, they essentially do that work for you of asking people, "Where do you think I should go for this type of care?" A lot of us fault the reputation survey and dislike it for a number of reasons, but it is filling a void right now. It is probably reflecting things like the services and specialists available, the trials, the research, and a variety of things that are done at those specialty centers.
It's embarrassing to say as somebody in the measurement field: I wish we could provide better information, but we don't have that type of information available yet. We're starting to make progress in publishing registry data, but not all of the hospitals in the country participate in these registries. The Society of Thoracic Surgeons National Database is a great example of where most hospitals participate, but not all of them are willing to publicly report yet. That trend is starting to change. As we can provide better data to the patients, I think they'll use that data over the recommendation of family, friends, and their referring doctors.
RW: There is an interesting tension there because of NSQIP and other efforts. We've gotten better at internally knowing who's doing well and who's not and learning from it. But we collectively have decided not to open that data spigot for patients. Grappling with the ethics of that and also the practicality of that—would there be less engagement with these systems if you knew it was going to be public—are really complicated questions.
KB: Yes.
RW: Talk about the Birkmeyer trial from a few years ago and the evidence for differences in technical skills among surgeons as they relate to meaningful outcomes. What do you take from that in terms of what we need to do about training and certification of surgeons?
KB: John Birkmeyer's study was brilliant. The fact that technical skills are associated with outcomes comes as no surprise to any surgeon—and probably no surprise to anybody else. Certainly, the technical maneuvers in the operating room are important, particularly in a highly complex procedure like laparoscopic bariatric surgery. The skill set for that is pretty complex. Birkmeyer set us up well to think about how do we get surgeons to study their own skill when they are out in practice, and improve upon it. There are several initiatives looking at video coaching. I lead a state-wide collaborative of 56 hospitals here in Illinois. We have implemented a program where surgeons record their laparoscopic colectomy; they score each other so they get a score sheet with comments and standardized scores about their procedure. Then they get together, pair up, and review their peer's operation. I watch yours and you watch mine, and we talk about details of the case. It's fantastic to watch. They flip over a piece of paper, and they're drawing and trying to explain their point of how they would do this differently or some other trick they learned.
The feedback has been heartwarming. We have surgeons who have said that it has been a transformational, life-changing experience to finally have somebody to give them feedback about their technical skills once they're in practice. I mean nobody watches us operate except maybe the residents, or if you're out in the community, some nurses and maybe your partner. It then spills over into nontechnical issues, which is really interesting. Like how do you get set up for the operation? What do you make sure you have in the room? How do you talk about which experts you need on backup? How do you deal with your scrub nurse who you've never worked with before and make sure she can help you through this complicated operation? The goal now is to figure out how we move that forward. How is it scalable? It is complicated to get two busy surgeons together to watch videos. We're looking at trying to do this more broadly for many more procedures across the state of Illinois and working potentially with the American Board of Surgery to try to understand whether this could have potential for a more meaningful maintenance of certification approach.
RW: Back to the question before about transparency for patients versus creating a safe environment for improvement. When I read Birkmeyer's study, I thought: if I needed a surgeon for my dad I would want to know his or her video score. I imagine if that were made generally available from the work you're doing or Birkmeyer did, your participation rates would plummet. What's the balance between this work and the patient's right to know whether their surgeon has technical skills?
KB: It's a fair point. I don't think we're there yet. We're still pretty early in this journey of trying to evaluate technical performance. We have no risk adjustments for this sort of measurement right now. Right now, it's purely an improvement activity. The bigger thing to keep in mind is that this is one piece, and it shouldn't be overshadowed by other aspects of surgical care. I would argue that even more important, and more nuanced, than the technical maneuvers in the operating room are the decisions about whom to operate on. The aftercare is important, too. You could do a technically beautiful operation, but if you did not give them optimal VTE [venous thromboembolism] prophylaxis and surgical site infection prevention approaches, or if it's a cancer operation and you don't give them the right adjuvant therapy afterward or recommend that they go meet with the medical oncologist… There are so many other aspects beyond technical skills that may be more important. What I'd like to see is a bundled approach where we have valid, reliable measures in decision-making, best practice adherence, and technical scores. And patient experience measures added to that. Then, we have a more robust comprehensive package by which to evaluate surgeon performance.
RW: In terms of trying to make this work scalable, is anybody working on analyzing the videotape through machine learning? I went to golf camp a few years ago and my swing got compared to Tiger Woods and Ernie Els. I didn't do very well, but they've devised an automated way to look at the mechanics of my swing and overlay them against a gold standard. All to tell me how bad my swing is!
KB: I have not seen that. Some groups have started to put biosensors on surgeons to be able to track, for example, how much they use their nondominant hand. How do you get that nondominant hand more involved in the operation to make you more efficient and safe? But anything more sophisticated than that with machine learning, it may be happening but not that I know of.
RW: It's probably still true that when someone applies for a surgical residency, nothing in the process of interviewing and applying tests whether they have the aptitude to have the technical skills to be a surgeon. When you compare that to the way people interview for a job to fly planes or other complex technical skills, does that seem right?
KB: It's true. We still don't do any technical skills assessment. There is probably some technical skills assessment that happens as you're a medical student. If there really is some phenomenal failure of your potential to be a surgeon, it would come out in your letters. But beyond that, it's also likely that we can train most people to be a competent surgeon. Some surgeries require really gifted technical skills, and a large number of other surgeries are much less complicated. I think people self-select once they're in residency. If you don't have the skills, you're not going to do hepatobiliary surgery. You're going to go do something more straightforward.
RW: Sometimes when I speak, I'll show Birkmeyer's videos of the good example and the bad example. The question is always: Can that bad example person be trained to be at least good enough?
KB: Yes, again we are pretty early in this, but we believe that we can get those people to improve—to be competent enough to do those surgeries safely.
RW: What does the training of surgeons and the measurement of surgical quality look like 10 years from now?
KB: I hope that we can have a transformative moment where we are able to get great data out of the EHR in a standardized fashion across all hospitals in the country and be able to provide really high-quality data. And then to give it back to ourselves to be able to analyze performance and identify opportunities for improvement. Give it to the public so they can assess quality in a way that we all believe in and that has good face validity. I would love to see individual surgeon performance measures be available to the public as well as transparency around all aspects of hospital care. Patients want detailed data on every type of condition and every type of operation and want to know how their hospital and surgeon do with that specific condition. I would hope that in 10 years we've made considerable progress toward getting standardized data to be able to provide that. Right now, it's simply a data abstraction and an analysis issue. If we can get that data in a more systematic fashion out of the EHRs, we will be able to provide good information.