Over 300 million computed tomography (CT) scans are performed globally each year, 85 million of them in the United States alone. Radiologists are continually looking for ways to speed up their workflow and produce precise reports. To address this need, NVIDIA Research has developed a new foundation model, VISTA-3D, and packaged it as an optimized NVIDIA NIM microservice designed for scalable deployment, according to the NVIDIA Technical Blog.
VISTA-3D Model
The VISTA-3D (Versatile Imaging Segmentation and Annotation) model is trained on over 12,000 volumes, covering 127 types of human anatomical structures and various lesions, including lung nodules, liver tumors, and bone lesions. It offers accurate out-of-the-box segmentation and state-of-the-art zero-shot interactive segmentation, making it a versatile tool for medical imaging.
The model features three core workflows:
- Segment everything: Allows comprehensive body exploration, aiding in understanding complex diseases affecting multiple organs.
- Segment using class: Provides detailed views based on specific classes, essential for targeted disease analysis.
- Segment point prompts: Enhances segmentation precision through user-directed selection, accelerating the creation of accurate ground-truth data.
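As a rough illustration, the three workflows can be thought of as three shapes of request. The sketch below is not the documented NIM schema: the function name `build_request` and the field names `image`, `prompts`, `classes`, and `points` are assumptions chosen for illustration only.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Point = Tuple[int, int, int]  # voxel coordinate (x, y, z); illustrative only


def build_request(
    image_url: str,
    classes: Optional[Sequence[str]] = None,
    points: Optional[Dict[str, List[Point]]] = None,
) -> dict:
    """Build a hypothetical request body for one of the three workflows."""
    if classes and points:
        raise ValueError("choose either class prompts or point prompts, not both")

    payload: dict = {"image": image_url}
    if classes:
        # "Segment using class": restrict output to the named structures.
        payload["prompts"] = {"classes": list(classes)}
    elif points:
        # "Segment point prompts": user-selected voxels guide each structure.
        payload["prompts"] = {
            "points": {name: [list(p) for p in pts] for name, pts in points.items()}
        }
    # With no prompts at all, the request means "segment everything".
    return payload


print(build_request("https://example.invalid/scan.nii.gz", classes=["liver", "spleen"]))
```

Keeping the three cases in one function makes the difference between them explicit: only the prompt section changes, while the image reference stays the same.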
The VISTA-3D architecture consists of an encoder layer followed by two parallel decoder layers: one for automatic segmentation and one for point prompts. This structure is designed to deliver high accuracy and adaptability across diverse anatomical areas.
VISTA-3D NIM Microservice
Hosted on the NVIDIA API Catalog, the VISTA-3D NIM microservice allows users to test its capabilities with sample data. It can segment over 100 organs or specific classes of interest, providing views in axial, coronal, or sagittal planes.
Using NIM Microservices
Users can run VISTA-3D on their data by signing up for a personal API key from NVIDIA, which comes with 1,000 free credits for trying any NIM microservice. Detailed instructions on generating an API key and running the model are available, along with sample code in various programming languages.
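For readers who want a feel for what such sample code involves, here is a minimal sketch of building an authenticated request with Python's standard library. The endpoint URL is a placeholder, and the bearer-token header and the `image` field are assumptions; the real invoke URL and request schema should be taken from the model's page on the NVIDIA API Catalog.

```python
import json
import urllib.request

# Placeholder endpoint (assumption): replace with the invoke URL from the API Catalog.
INVOKE_URL = "https://example.invalid/vista-3d"


def make_request(api_key: str, image_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the API key."""
    body = json.dumps({"image": image_url}).encode("utf-8")  # hypothetical schema
    return urllib.request.Request(
        INVOKE_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


req = make_request("YOUR_API_KEY", "https://example.invalid/scan.nii.gz")
print(req.get_method(), req.full_url)
```

Sending the request would then be a matter of passing `req` to `urllib.request.urlopen`. Reading the key from an environment variable rather than hard-coding it keeps it out of source control.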
Users who want to run VISTA-3D on their own data need to set up an FTP server to serve the medical images. This approach accommodates the large size of medical images, which are typically too large to send directly in API payloads.
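A quick back-of-the-envelope calculation shows why. The scan dimensions below (512 x 512 voxels per slice, 300 slices, 16-bit intensities) are illustrative assumptions, not figures from NVIDIA, and compressed files on disk will be smaller.

```python
def volume_size_mib(shape, bytes_per_voxel):
    """Uncompressed size of a voxel grid in MiB."""
    voxels = 1
    for dim in shape:
        voxels *= dim
    return voxels * bytes_per_voxel / 2**20


# Assumed example scan: 512 x 512 x 300 voxels, 2 bytes per voxel.
size = volume_size_mib((512, 512, 300), 2)
print(f"{size:.0f} MiB uncompressed")  # prints "150 MiB uncompressed"
```

At that scale it is far more practical to pass the service a location it can fetch the file from than to embed the image in a JSON request body.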
Running NIM Microservices Locally
To run NIM microservices locally, users need to apply for NVIDIA NIM access. Upon approval, they will receive a Docker container to run the VISTA-3D NIM microservice on their preferred hardware. Prerequisites include having Docker, Docker Compose, and NVIDIA drivers installed.
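Before requesting the container, it can help to confirm that the prerequisites are in place. The sketch below only checks whether the relevant executables are on the `PATH`. Using `nvidia-smi` as a proxy for installed NVIDIA drivers is an assumption, and Docker Compose is left out because it may be installed as a Docker plugin rather than as a standalone binary.

```python
import shutil


def check_prerequisites(tools=("docker", "nvidia-smi")):
    """Map each executable name to its path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


def missing_tools(found):
    """Return the sorted names of tools that were not found."""
    return sorted(name for name, path in found.items() if path is None)


missing = missing_tools(check_prerequisites())
if missing:
    print("Missing prerequisites:", ", ".join(missing))
else:
    print("Docker and NVIDIA driver tools found.")
```

A check like this does not verify versions or GPU compatibility; it only catches the most common setup gaps early.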
Conclusion
NVIDIA’s VISTA-3D foundation model represents a significant advancement in medical imaging, offering precise segmentation of over 100 organs and various lesions in CT scans. The NVIDIA NIM microservice simplifies deploying and using this model, helping radiologists work faster and more accurately.