What's New In Surgical AI: 05/07 Edition
Vol 24: The Real Techbros of Silicon Valley, Dan finally gets his AI Intern
Welcome back! If you’re new to ctrl-alt-operate, we do the work of keeping up with AI, so you don’t have to. We’re grounded in our clinical-first context, so you can be a discerning consumer and developer. We’ll help you decide when you’re ready to bring A.I. into the clinic, hospital or O.R.
This week (like seemingly every week) was filled with news, but in a much more political, he-said-she-said kind of way. We get into the news, and how it brings us back to the clinic, even remotely. Oh, and we’ll bring you the newest Glaucomflecken video, which takes on AI!
Remember when we talked about how chatGPT could be your data intern? That time has come. Dan got access to chatGPT’s new code interpreter - this is an incredibly powerful tool that abstracts out exploratory data workflows, especially quantitative analysis. He’ll go through a first-person POV using code interpreter as his data intern in our deep dive.
Table of Contents
📰The News: A.I. Gossip, Big Bets, and Some Science
🤿Deep Dive: Putting our new data intern from chatGPT, Code Interpreter, to the test!
The News: A.I. Gossip, Big Bets, and Some Science
First, some humor. AI has gone mainstream and social-MD guru Dr. Glaucomflecken shares his take on Ortho Bros powered by AI. Enjoy:
If you haven't watched the latest episode of The Tech Bros of Silicon Valley, don’t worry, we’re loyal fans so can fill you in on all the billion dollar corporate gossip. Now, admittedly, some of this may not be directly applicable to clinical or surgical life. But, they involve the most powerful companies on the planet who dictate our A.I. futures, so we think it’s worth knowing.
A leaked memo from a Google researcher has been making the rounds, with the bold statement:
“We [Google] have no moat. But neither does OpenAI… while we’ve been squabbling, a third faction has been quietly eating our lunch. I’m talking, of course, about open source.”
The concept of a moat comes from Warren Buffett, who described it as a company’s durable competitive advantage, protecting it from competitors as the name implies. In the AI world, the moat for Google and OpenAI has thus far been that their models have been, flat out, the best. The open-source world is closing that gap, and fast.
The CEOs of OpenAI, Microsoft, Google and more met at the White House to discuss A.I. policy (including $140 Million in National Science Foundation funding!). Two key missing persons: Elon Musk and Mark Zuckerberg. While chatGPT and language models are all the rage, Tesla and Meta (formerly Facebook) have built a legacy of computer vision advancement (see prior issue of Ctrl-Alt-Operate).
Finally, the “Godfather of AI” Geoffrey Hinton, senior author of one of the seminal computer vision papers, has left Google so he can speak freely on the dangers of A.I. This tracks for Hinton: he was a lifelong academician who even left Carnegie Mellon because he was uncomfortable with taking Pentagon dollars. He then had a decade-long stint at Google, which only came about because he left tens of millions of dollars on the table to work there instead of at the Chinese tech giant Baidu (this is an insane story). He is worried that, because of the fierce competition and pressure to innovate, we could create dangerous A.I. systems at an alarmingly rapid pace and only realize it too late.
In totally unrelated news, Palantir announced their A.I. for Defense platform, where troop movements and analysis can be performed using natural language. PwC also announced a billion dollar investment in AI over the next three years, where it sees agents writing notes, performing billing duties, and more (sound familiar?).
Now, to return to some truer A.I. news: OpenAI continues to impress and dropped a “Text to 3-D Shape” paper. Input words, output a rotatable 3D shape. My favorite one is “Chair that looks like an avocado”, which also looks like a chair in every hotel lobby.
I think this allows us to double down on our thought process that multi-modal models (that is, models that incorporate text, speech, image analysis, image creation, etc.) will represent the next “wave” of A.I. technology. From a surgical standpoint, as we move towards more hyper-personalized implant technology, this could be a fantastic way to correlate plans with real prints, guided by natural language.
Finally, ElevenLabs released a voice-cloning model which can render your voice in a number of different languages. Their demo is scary good: take a native Spanish speaker, and watch their own voice start speaking English, with the appropriate accent and everything. What if we had these buddies (sync’d with Google Translate) in the hospital, where our voice (and the concern, empathy, lightheartedness…) could be heard in the language our patients most comfortably speak?
🤿Deep Dive: Code Interpreter, your new data intern from chatGPT.
Although most MDs don’t write code, anyone who publishes a paper has to analyze data using computational methods (raise your hand if you’re still using a slide rule and transparencies - nope, didn’t think so).
This section is for you - whether you’re a student or seasoned prof.
Let’s take the latest secret tool from OpenAI, Code Interpreter, for a quick spin and see where things stand in May 2023.
First, some background. Since their inception, chatGPT and similar tools built into coding environments (Copilot, Replit) have accelerated code writing. These tools started as simple auto-completes, but are now able to write functions and even entire programs, moving ideas into production-ready webapps in hours instead of days.
Here’s where I think Code Interpreter stacks up today.
Data Visualization: Can I use Code Interpreter to explore a dataset faster and more easily than writing R or Python scripts? (My Grade: B, limited but does well within its limitations)
Data Cleaning: How well does Code Interpreter handle messy datasets? (My Grade: B+, needs supervision but can do simple jobs)
Data Analysis: Can we build toy models, and understand them? (My Grade: F, the interface is too problematic to use effectively)
What is Code Interpreter: Code Interpreter is a large language model (GPT) interface to a Jupyter notebook that runs Python code in a sandbox environment to which you can upload files and data. You can find the full list of libraries here.
It allows you to create and run Python code using natural language.
Let’s see how this does on real data. We’re going to use our public dataset of surgical performance, called SOCAL, which you can download here. SOCAL was created from a nationwide training exercise where surgeons attempted to control bleeding from an extremely lifelike perfused cadaveric simulator, and supported a series of publications from 2017-2022 where we described our earliest steps into automating surgeon performance analysis.
First, we will upload two CSVs that contain the data about our surgeons and how they performed. Since Code Interpreter uses standard Python packages (pandas, numpy, scikit-learn, matplotlib, seaborn) and Jupyter notebooks, it will default to those methods.
Our data files are formatted strangely, since one file contains all of the data about the participants, including their trial #1 and trial #2 performance, but the other contains a list of trials and their outcomes. Let’s see if we can guide code interpreter through this process.
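For readers who want to see what that guidance amounts to in pandas, here is a minimal sketch of the reshape-and-merge step. The column names (participant_id, trial1_success, blood_loss_ml and so on) and every value below are made up for illustration; they are not the real SOCAL schema or data, and this is not necessarily the code that Code Interpreter wrote.

```python
import pandas as pd

# Hypothetical stand-ins for the two uploaded CSVs.
# Column names and values are invented; the real SOCAL files may differ.
participants = pd.DataFrame({
    "participant_id": [1, 2, 3],
    "years_experience": [2, 10, 25],
    "trial1_success": [0, 1, 1],
    "trial2_success": [1, 1, 1],
})
trials = pd.DataFrame({
    "participant_id": [1, 1, 2, 2, 3, 3],
    "trial": [1, 2, 1, 2, 1, 2],
    "blood_loss_ml": [900, 400, 350, 300, 200, 150],
})

# Reshape the wide participant file to one row per participant per trial.
long = participants.melt(
    id_vars=["participant_id", "years_experience"],
    value_vars=["trial1_success", "trial2_success"],
    var_name="trial",
    value_name="success",
)

# Turn "trial1_success" / "trial2_success" into the integers 1 / 2.
long["trial"] = long["trial"].str.extract(r"trial(\d+)_", expand=False).astype(int)

# Join on participant and trial; validate= raises if the keys are not unique.
merged = long.merge(
    trials,
    on=["participant_id", "trial"],
    how="inner",
    validate="one_to_one",
)

print(merged.sort_values(["participant_id", "trial"]))
```

The validate argument is worth keeping when you supervise an intern (human or otherwise): it fails loudly if a participant-trial pair shows up twice, instead of silently duplicating rows.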
Looks good. Now let’s make some nice figures using seaborn. Not too bad.
Now let’s give it a real task.
Now here’s the neat part: it went through a series of errors (just as I would have), troubleshot its own mistakes through several iterations, came up with a solution, failed, tried again, failed, and voila:
Not the most visually appealing, so let’s see what happens if we ask it to do better:
Whoops! Like any intern, it misinterpreted what I asked, and gave me a bunch of bar plots instead of nice looking tables, so we’ll try to get it to try again. Unfortunately, the current build of Code Interpreter doesn’t have the right Python package, or the permissions to install it, so this is the best we will get for now :-/ But it will write the correct code so that I could execute it in my own Python environment with a copy-paste.
See what happened? It picked the wrong variable to use in the logistic regression. We have to be careful about supervising our intern otherwise the analysis might not make any sense! After a few iterations of this, it generates a model that allegedly has some pretty good performance characteristics (but on closer inspection … needs some help)
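One way to supervise the intern here is to name the outcome and the predictors yourself and check the outcome before fitting. The sketch below uses scikit-learn on synthetic data: the column names, the check_binary_target helper and the numbers are all hypothetical, and the fitted coefficient says nothing about real surgeons.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Synthetic data for illustration only (not SOCAL): success is made
# more likely with more experience, by construction.
rng = np.random.default_rng(0)
n = 200
years = rng.integers(1, 30, size=n)
p_success = 1 / (1 + np.exp(-(years - 10) / 4))
df = pd.DataFrame({
    "participant_id": np.arange(n),
    "years_experience": years,
    "success": (rng.random(n) < p_success).astype(int),
})

# State the outcome and predictors explicitly rather than letting the tool guess.
TARGET = "success"
FEATURES = ["years_experience"]


def check_binary_target(frame: pd.DataFrame, target: str) -> None:
    """Hypothetical guard: refuse to fit if the outcome is not coded 0/1."""
    values = set(frame[target].dropna().unique())
    if not values <= {0, 1}:
        raise ValueError(f"{target!r} is not a 0/1 outcome: {sorted(values)[:5]}")


check_binary_target(df, TARGET)
model = LogisticRegression().fit(df[FEATURES], df[TARGET])
coef = model.coef_[0][0]
print(f"log-odds change per year of experience: {coef:.3f}")
```

Passing the wrong column, for example check_binary_target(df, "years_experience"), raises a ValueError before any model is fit, which is the kind of mistake described above.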
What about some other fun exercises? Well, it turns out that we monitored the heart rates of many of our surgeons. Let’s see if we can generate a model that predicts their baseline heart rate. Were more experienced surgeons more likely to have higher or lower heart rates? What about surgeons who had more or less aerobic training?
On our first attempt, the intern does something cheeky: it builds a model using all of the other heart rate data to predict the baseline data. Clearly good thinking, but not useful for us. Once we tell it not to do that, it isn’t nearly as successful.
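Here is a small sketch of why that first model looks so good and the second does not. Everything in it is synthetic and the column names are hypothetical: baseline heart rate is generated independently of experience and aerobic training, so the low score for the second model is a property of this toy data, not a finding about surgeons.

```python
import numpy as np
import pandas as pd

# Synthetic data for illustration only (not SOCAL).
rng = np.random.default_rng(0)
n = 300
baseline = rng.normal(70, 8, n)
df = pd.DataFrame({
    "years_experience": rng.integers(1, 30, n),
    "aerobic_hours_per_week": rng.uniform(0, 8, n),
    "baseline_hr": baseline,
    # The other heart rate columns are built from the baseline, so they leak it.
    "mean_trial_hr": baseline + 20 + rng.normal(0, 3, n),
    "max_trial_hr": baseline + 40 + rng.normal(0, 5, n),
})


def r_squared(frame: pd.DataFrame, features: list[str], target: str) -> float:
    """In-sample R^2 of an ordinary least squares fit with an intercept."""
    X = np.column_stack([np.ones(len(frame)), frame[features].to_numpy(dtype=float)])
    y = frame[target].to_numpy(dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    return 1 - residuals.var() / y.var()


TARGET = "baseline_hr"
all_features = [c for c in df.columns if c != TARGET]
# Drop every other heart rate column before modeling.
clean_features = [c for c in all_features if not c.endswith("_hr")]

r2_leaky = r_squared(df, all_features, TARGET)
r2_clean = r_squared(df, clean_features, TARGET)
print(f"with other heart rate columns: R^2 = {r2_leaky:.2f}")
print(f"without them:                  R^2 = {r2_clean:.2f}")
```

The first score is high only because the model is handed near-copies of the answer. Listing the allowed predictors up front is a simple way to keep that from happening.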
Oh, and here’s how it handles errors:
In summary: Code Interpreter is a limited playground tool that can automate the creation of Python scripts and run them against data in a sandbox. It is constrained by the restrictions placed on its permissions as well as by its computational environment. The less specification you give it, the more you need to police its work. It can troubleshoot some simple errors.
Overall, it didn’t speed up my work, and some areas of its output were problematic. Personally, I would prefer working in a more open, pair-programmer interface like Copilot, or even copy-pasting from chatGPT, which is a better experience than StackOverflow. I also didn’t like how obfuscated some of the design decisions and code samples were.
I know that there is more to come on this front, but for now, I don’t think Code Interpreter will replace me or my (imaginary, human) intern. With a few tweaks to the front end, and a better ability to interface with existing IDEs, the underlying concept will be baked into everything we use next year. So take it for a spin when you get access :)
🪦 Best of Twitter (On Hold)
Feeling inspired? Drop us a line and let us know what you liked.
Like all surgeons, we are always looking to get better. Send us your M&M style roastings or favorable Press Ganey ratings by email at ctrl.alt.operate@gmail.com