Self-teaching AI uses pathology images to find similar cases, diagnose rare diseases
New model acts as search engine for large databases of pathology images
“We show that our system can assist with the diagnosis of rare diseases and find cases with similar morphologic patterns without the need for manual annotations and large datasets for supervised training,” said senior author Faisal Mahmood, PhD, of the Brigham’s Department of Pathology. “This system has the potential to improve pathology training, disease subtyping, tumor identification, and rare morphology identification.”
Modern electronic databases can store an immense number of digital records and reference images, particularly in pathology, where they hold whole slide images (WSIs). However, the gigapixel size of each WSI and the ever-increasing number of images in large repositories mean that search and retrieval of WSIs can be slow and complicated. As a result, scalability remains a roadblock to efficient use.
To solve this issue, researchers at the Brigham developed SISH (Self-Supervised Image search for Histology), which teaches itself feature representations that can be used to find cases with analogous pathology features at a constant speed, regardless of the size of the database.
In their study, the researchers tested the speed of SISH and its ability to retrieve interpretable disease subtype information for common and rare cancers. The algorithm retrieved images quickly and accurately from a database of tens of thousands of whole slide images from over 22,000 patient cases, covering over 50 different disease types and over a dozen anatomical sites. SISH retrieved images faster than other methods in many scenarios, including disease subtype retrieval, particularly as the image database grew into the thousands of images. Even as the repositories expanded, SISH maintained a constant search speed.
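The idea of search time that does not grow with repository size can be illustrated with a toy sketch. The Python below is not the SISH implementation; it only assumes that each image can be reduced to a short discrete code, so that a lookup touches one hash bucket instead of scanning every stored image. The names `quantize`, `CodeIndex`, and `step` are hypothetical, and the fixed-grid quantizer stands in for whatever learned representation a real system would use.

```python
import math
from collections import defaultdict


def quantize(features, step=0.5):
    """Map a feature vector to a discrete code (toy stand-in for a learned encoder)."""
    return tuple(math.floor(x / step) for x in features)


class CodeIndex:
    """Hash-based index: lookup cost depends on bucket size, not on total image count."""

    def __init__(self, step=0.5):
        self.step = step
        self.buckets = defaultdict(list)

    def add(self, case_id, features):
        # Store the case under its discrete code.
        self.buckets[quantize(features, self.step)].append(case_id)

    def query(self, features):
        # One dictionary lookup, however many cases have been indexed.
        return list(self.buckets.get(quantize(features, self.step), []))


index = CodeIndex()
index.add("case_a", [0.1, 0.9])
index.add("case_b", [0.2, 0.8])
index.add("case_c", [2.0, 2.0])

similar = index.query([0.15, 0.85])
print(similar)
```

In this sketch, `case_a` and `case_b` fall into the same bucket as the query because their features quantize to the same code, while `case_c` is never examined. A practical system would also need to search neighboring codes and rank the candidates, which this example omits.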
The algorithm does have limitations, however, including a large memory requirement, limited context awareness within large tissue slides, and restriction to a single imaging modality.
Overall, the algorithm demonstrated the ability to efficiently retrieve images independent of repository size and in diverse datasets. It also demonstrated proficiency in diagnosis of rare disease types and the ability to serve as a search engine to recognize certain regions of images that may be relevant for diagnosis. This work may greatly inform future disease diagnosis, prognosis, and analysis.
“As the sizes of image databases continue to grow, we hope that SISH will be useful in making identification of diseases easier,” said Mahmood. “We believe one important future direction in this area is multimodal case retrieval which involves jointly using pathology, radiology, genomic and electronic medical record data to find similar patient cases.”