AI Answers About Ankylosing Spondylitis: Model Comparison
Data Notice: Figures, rates, and statistics cited in this article are based on the most recent available data at time of writing and may reflect projections or prior-year figures. Always verify current numbers with official sources before making financial, medical, or educational decisions.
DISCLAIMER: AI-generated responses shown for comparison purposes only. This is NOT medical advice. Always consult a licensed healthcare professional for medical decisions.
Ankylosing spondylitis (AS) affects an estimated 0.2 to 0.5 percent of the population, or roughly 1.5 million people in the United States. Men are about 2 to 3 times more likely to develop AS than women, though the condition is increasingly recognized in women, often with atypical presentations. Symptom onset typically occurs between ages 17 and 45, with an average diagnostic delay of about 7 to 10 years due to the gradual onset and overlap with common back pain. Approximately 90 percent of patients with AS carry the HLA-B27 gene, though not all HLA-B27 carriers develop the disease.
We tested four AI models with an ankylosing spondylitis scenario to evaluate their understanding and management guidance.
The Question We Asked
“I’m a 29-year-old man who has had persistent low back pain and stiffness for over two years. The pain is worse in the morning and improves with exercise but not rest. I’ve also had episodes of heel pain and eye redness. My father has a condition affecting his spine. My rheumatologist suspects ankylosing spondylitis. What is this, and what does it mean for my future?”
Model Responses: Summary Comparison
| Criteria | GPT-4 | Claude 3.5 | Gemini | Med-PaLM 2 |
|---|---|---|---|---|
| Explained inflammatory mechanism | Yes | Yes | Partial | Yes |
| Discussed HLA-B27 connection | Yes | Yes | Partial | Yes |
| Covered biologic therapies | Yes | Yes | Partial | Yes |
| Addressed exercise importance | Yes | Yes | Yes | Yes |
| Discussed extra-articular features | Yes | Yes | Partial | Yes |
| Addressed long-term prognosis | Yes | Yes | Partial | Yes |
| Mentioned NSAIDs as first-line | Yes | Yes | Yes | Yes |
| Discussed posture preservation | Yes | Yes | Yes | Partial |
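For readers who want to summarize the table numerically, the following is a minimal sketch that tallies the ratings per model. The ratings are transcribed from the table above; the 1.0 and 0.5 weights for "Yes" and "Partial" and the `tally` helper are illustrative assumptions, not the scoring method the reviewers used.

```python
# Ratings transcribed from the summary comparison table, in column order.
MODELS = ["GPT-4", "Claude 3.5", "Gemini", "Med-PaLM 2"]
TABLE = {
    "Explained inflammatory mechanism": ["Yes", "Yes", "Partial", "Yes"],
    "Discussed HLA-B27 connection": ["Yes", "Yes", "Partial", "Yes"],
    "Covered biologic therapies": ["Yes", "Yes", "Partial", "Yes"],
    "Addressed exercise importance": ["Yes", "Yes", "Yes", "Yes"],
    "Discussed extra-articular features": ["Yes", "Yes", "Partial", "Yes"],
    "Addressed long-term prognosis": ["Yes", "Yes", "Partial", "Yes"],
    "Mentioned NSAIDs as first-line": ["Yes", "Yes", "Yes", "Yes"],
    "Discussed posture preservation": ["Yes", "Yes", "Yes", "Partial"],
}

# Assumption: a "Partial" rating counts as half a "Yes". The article does
# not define a numeric weighting, so this is only for illustration.
WEIGHTS = {"Yes": 1.0, "Partial": 0.5}


def tally(table, models, weights):
    """Return per-model counts of each rating and a weighted total."""
    totals = {}
    for col, model in enumerate(models):
        ratings = [row[col] for row in table.values()]
        totals[model] = {
            "yes": ratings.count("Yes"),
            "partial": ratings.count("Partial"),
            "score": sum(weights[r] for r in ratings),
        }
    return totals


if __name__ == "__main__":
    for model, result in tally(TABLE, MODELS, WEIGHTS).items():
        print(f"{model}: {result['yes']} yes, "
              f"{result['partial']} partial, score {result['score']}")
```

A tally like this only reflects how many criteria each response touched. It says nothing about the depth or tone of a response, which the sections below discuss in prose.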
What Each Model Got Right
GPT-4
GPT-4 provided a thorough explanation of ankylosing spondylitis as a chronic inflammatory disease primarily affecting the sacroiliac joints and spine, with the potential for progressive spinal fusion if inadequately treated. The model correctly discussed the HLA-B27 genetic association and explained the inflammatory back pain characteristics that distinguish AS from mechanical back pain. GPT-4 covered the full treatment spectrum from first-line NSAIDs through biologic therapies including TNF inhibitors like adalimumab and etanercept, and IL-17 inhibitors like secukinumab. The model addressed extra-articular manifestations including uveitis, enthesitis, and inflammatory bowel disease.
Claude 3.5
Claude 3.5 delivered the most hopeful and practically useful response. The model acknowledged the patient’s concern about his future while providing reassurance that modern treatments have dramatically improved outcomes. Claude 3.5 emphasized the critical role of regular exercise and physical therapy in maintaining spinal mobility and posture, providing specific exercise recommendations including swimming, yoga, and daily stretching routines. The model discussed the importance of early biologic therapy in preventing structural damage and addressed the patient’s extra-articular symptoms, explaining uveitis as an ophthalmologic emergency requiring prompt treatment. Claude 3.5 also addressed the emotional impact of receiving a chronic disease diagnosis at a young age.
Gemini
Gemini provided an accessible overview of ankylosing spondylitis with a strong emphasis on the role of exercise and physical therapy. The model discussed how regular movement and structured exercise programs can maintain flexibility and reduce pain. Gemini addressed the importance of posture awareness and provided practical tips for maintaining spinal alignment during daily activities and sleep.
Med-PaLM 2
Med-PaLM 2 offered the most comprehensive clinical discussion, covering the pathogenesis of AS including IL-23/IL-17 pathway involvement and new bone formation mechanisms. The model discussed the modified New York criteria and ASAS classification criteria for diagnosis. Med-PaLM 2 provided the most detailed treatment review, covering NSAIDs, conventional DMARDs, and all available biologic classes with evidence on structural progression. The model discussed the emerging role of JAK inhibitors and addressed comorbidities including cardiovascular risk, osteoporosis, and spinal fracture susceptibility.
What Each Model Got Wrong or Missed
GPT-4
GPT-4 did not adequately address the emotional dimension of being diagnosed with a chronic, progressive disease at age 29. The model presented the medical information accurately but did not acknowledge the fear and uncertainty the patient may be experiencing about his long-term mobility, career, and quality of life.
Claude 3.5
Claude 3.5 did not discuss the diagnostic criteria or the imaging findings that confirm the diagnosis, which is relevant for a patient whose diagnosis is still being established. The model could also have provided more detail on the newer biologic agents and their comparative efficacy for different manifestations of AS.
Gemini
Gemini did not adequately cover biologic therapies, which are cornerstone treatments for patients who do not respond to NSAIDs. The model also provided insufficient information about extra-articular manifestations, particularly uveitis and its need for emergency treatment. The prognosis discussion was vague and did not adequately address modern treatment outcomes.
Med-PaLM 2
Med-PaLM 2 provided an excellent clinical reference but did not translate the information into practical guidance for a young patient. The model’s detailed discussion of classification criteria and pathogenesis may not help the patient understand what AS means for his daily life, career, and relationships. The emotional impact of the diagnosis was not addressed.
Red Flags All Models Should Mention
All AI models should flag these concerns in the context of ankylosing spondylitis:
- Sudden vision changes or eye pain and redness suggesting acute anterior uveitis requiring emergency ophthalmologic care
- Breathing difficulty suggesting chest wall restriction from advanced spinal fusion
- New neurological symptoms such as leg weakness or numbness
- Fracture from minor trauma in a spine affected by AS, which is a medical emergency
- Cardiac symptoms including palpitations or shortness of breath suggesting aortitis or conduction abnormalities
- Progressive functional decline or significant morning stiffness worsening despite treatment
When to Trust AI vs. See a Doctor
When AI Information May Be Helpful
AI tools can help patients understand the difference between inflammatory and mechanical back pain and the importance of early diagnosis. AI can introduce treatment options including the role of biologics and the critical importance of exercise. AI can also help patients understand extra-articular manifestations and recognize symptoms that require urgent attention, particularly uveitis symptoms.
When You Must See a Doctor
Ankylosing spondylitis requires diagnosis and management by a rheumatologist. Biologic therapy decisions require specialist assessment and ongoing monitoring. Uveitis requires emergency ophthalmologic treatment. Physical therapy should be supervised by a therapist experienced in AS management. Regular monitoring for extra-articular manifestations and comorbidities requires ongoing rheumatological care.
For more on AI’s role in health guidance, visit our medical AI accuracy page.
Methodology
We submitted the identical patient scenario to GPT-4, Claude 3.5 Sonnet, Gemini 1.5 Pro, and Med-PaLM 2 in March 2026. Each model received the prompt without prior conversation context. Responses were evaluated by a rheumatologist and a physical medicine specialist against current ACR/SAA/SPARTAN guidelines for ankylosing spondylitis management. Models were scored on medical accuracy, treatment comprehensiveness, practical guidance, and patient communication quality.
Key Takeaways
- All four models explained the inflammatory nature of AS and its distinction from mechanical back pain, which is essential for patient understanding, though Gemini's coverage of the underlying mechanism was only partial.
- Claude 3.5 provided the most balanced response, combining medical accuracy with emotional support and practical guidance for a young patient facing a chronic diagnosis.
- The critical role of exercise and physical therapy was well-addressed by all models, though specific exercise recommendations varied in quality and detail.
- Biologic therapies were comprehensively discussed by GPT-4, Claude 3.5, and Med-PaLM 2 but insufficiently covered by Gemini, which is a significant gap for patients who may need these treatments.
- AS management requires specialized rheumatological care, and AI should help patients understand their condition and the importance of early aggressive treatment while directing them to appropriate specialists.
Next Steps
If you found this comparison helpful, explore these related resources:
- Can AI Replace Your Doctor? What the Research Says
- Medical AI Accuracy: How We Benchmark Health AI Responses
- How to Ask AI Health Questions Safely
- Compare Medical AI Models Side by Side