Development of a Pathology Database for an Internet Hospital Platform
An AI-powered virtual physician for healthcare delivery, built on a pathology database and NLP models
Overview
This project developed the AI backbone for a virtual physician system within an Internet Hospital Platform. The core challenge was building an AI that can understand medical questions posed in natural language, reason over pathology data, and provide useful health guidance, all while maintaining medical accuracy.
What We Built
The project centered on three interconnected components:
- pathBERT: a specialized BERT language model fine-tuned on pathology texts. General-purpose language models often struggle with medical terminology and reasoning, so pathBERT was trained specifically on the language of pathology reports, clinical descriptions, and medical literature.
- Named Entity Recognition (NER) and Relation Extraction (RE): models that identify medical entities (diseases, symptoms, medications, anatomical structures) in text and map the relationships between them. For example, they recognize that a symptom is associated with a specific condition, or that a medication treats a particular disease.
- Medical Knowledge Graph: a structured database of medical relationships extracted from pathology data. It lets the AI reason about connections between symptoms, diagnoses, and treatments rather than simply pattern-matching on text.
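To make the relationship between extracted relations and the knowledge graph concrete, here is a minimal sketch of how (entity, relation, entity) triples could be stored and queried. It is an illustration only, not the project's actual schema: the `Triple` and `KnowledgeGraph` names, the relation labels `symptom_of` and `treats`, and the three example triples are all assumptions chosen for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One extracted relation: head entity, relation label, tail entity."""
    head: str
    relation: str
    tail: str


class KnowledgeGraph:
    """Hypothetical in-memory triple store, indexed in both directions."""

    def __init__(self):
        self._out = defaultdict(set)  # (head, relation) -> set of tails
        self._in = defaultdict(set)   # (relation, tail) -> set of heads

    def add(self, triple):
        self._out[(triple.head, triple.relation)].add(triple.tail)
        self._in[(triple.relation, triple.tail)].add(triple.head)

    def tails(self, head, relation):
        """Entities that `head` points to via `relation`, sorted."""
        return sorted(self._out.get((head, relation), set()))

    def heads(self, relation, tail):
        """Entities that point to `tail` via `relation`, sorted."""
        return sorted(self._in.get((relation, tail), set()))


# Illustrative triples, as an RE model might emit them (not project data).
triples = [
    Triple("fever", "symptom_of", "influenza"),
    Triple("cough", "symptom_of", "influenza"),
    Triple("oseltamivir", "treats", "influenza"),
]

kg = KnowledgeGraph()
for t in triples:
    kg.add(t)

print(kg.heads("symptom_of", "influenza"))
print(kg.heads("treats", "influenza"))
```

Indexing each triple in both directions lets the same store answer "which conditions is this symptom associated with?" and "which medications treat this condition?" without scanning every triple.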
How It Works Together
When a user asks a medical question, the system uses pathBERT to understand the question, the NER/RE models to identify relevant medical concepts, and the knowledge graph to reason about possible answers. A question-answering module then generates responses grounded in the structured medical knowledge.
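The flow described above can be sketched as a three-stage pipeline. This is a toy sketch under stated assumptions, not the project's implementation: a dictionary lookup stands in for the pathBERT-based NER model, a short list of tuples stands in for the knowledge graph, and a string template stands in for the question-answering module. All names (`VOCAB`, `TRIPLES`, `extract_entities`, `candidate_conditions`, `answer`) are hypothetical.

```python
import re

# Hypothetical vocabulary standing in for the NER model's entity labels.
VOCAB = {"fever": "SYMPTOM", "cough": "SYMPTOM", "influenza": "DISEASE"}

# Hypothetical triples standing in for the medical knowledge graph.
TRIPLES = [
    ("fever", "symptom_of", "influenza"),
    ("cough", "symptom_of", "influenza"),
    ("oseltamivir", "treats", "influenza"),
]


def extract_entities(question):
    """Stage 1 (stand-in for NER): tag known terms found in the question."""
    tokens = re.findall(r"[a-z]+", question.lower())
    return [(tok, VOCAB[tok]) for tok in tokens if tok in VOCAB]


def candidate_conditions(symptoms):
    """Stage 2: conditions linked by `symptom_of` to every given symptom."""
    per_symptom = [
        {tail for head, rel, tail in TRIPLES
         if head == symptom and rel == "symptom_of"}
        for symptom in symptoms
    ]
    if not per_symptom:
        return []
    return sorted(set.intersection(*per_symptom))


def answer(question):
    """Stage 3 (stand-in for the QA module): a template-based reply."""
    entities = extract_entities(question)
    symptoms = [name for name, label in entities if label == "SYMPTOM"]
    conditions = candidate_conditions(symptoms)
    if not conditions:
        return "No matching condition found in the knowledge graph."
    return ("Conditions associated with " + ", ".join(symptoms)
            + ": " + ", ".join(conditions))


print(answer("I have a fever and a cough"))
```

Because the reply is assembled only from entities and relations found in the graph, the sketch reflects the idea of grounding responses in structured knowledge rather than free-form text generation.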
Significance
The project demonstrated that specialized NLP models combined with structured medical knowledge can provide a foundation for AI-assisted healthcare services, particularly in contexts where access to specialist physicians is limited.
Collaborators
- Cheju Halla University: Lead research institution
- Medical Partners: Domain expertise and data validation