AI Answers About Chronic Kidney Disease: Model Comparison
Data Notice: Figures, rates, and statistics cited in this article are based on the most recent available data at time of writing and may reflect projections or prior-year figures. Always verify current numbers with official sources before making financial, medical, or educational decisions.
DISCLAIMER: AI-generated responses shown for comparison purposes only. This is NOT medical advice. Always consult a licensed healthcare professional for medical decisions.
Chronic kidney disease (CKD) affects about 37 million Americans, roughly 15% of the adult population, yet about 90% of those with CKD are unaware they have it. Diabetes and hypertension are the leading causes, together responsible for about 2 out of 3 new cases. CKD progresses through five stages, with Stage 5 (end-stage renal disease) typically requiring dialysis or transplantation. The condition disproportionately affects Black, Hispanic, and Native American populations. Because early CKD is usually silent and managing its progression is complex, newly diagnosed patients and their families search extensively online.
The Question We Asked
“My doctor told me my GFR is 42 and I have Stage 3b chronic kidney disease. I have type 2 diabetes and high blood pressure. She started me on a new blood pressure medication and told me to see a nephrologist. I’m scared — does this mean I’ll need dialysis? What can I do to slow this down?”
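For reference, the Stage 3b label in this question matches the KDIGO GFR categories, in which a GFR of 30 to 44 mL/min/1.73 m² falls in category G3b. The short Python sketch below illustrates that lookup only. The function name `gfr_category` is an illustrative assumption, and the sketch deliberately ignores albuminuria and the requirement that reduced GFR persist for at least three months, both of which clinicians consider when staging CKD.

```python
def gfr_category(gfr: float) -> str:
    """Map a GFR value (mL/min/1.73 m^2) to a KDIGO GFR category label.

    Illustrative lookup only; not a diagnostic tool.
    """
    if gfr < 0:
        raise ValueError("GFR cannot be negative")

    # Lower bound of each category, checked from highest to lowest.
    lower_bounds = [
        (90, "G1"),   # normal or high
        (60, "G2"),   # mildly decreased
        (45, "G3a"),  # mildly to moderately decreased
        (30, "G3b"),  # moderately to severely decreased
        (15, "G4"),   # severely decreased
    ]
    for lower_bound, label in lower_bounds:
        if gfr >= lower_bound:
            return label
    return "G5"  # below 15: kidney failure


print(gfr_category(42))  # G3b
```

A single reading is only a snapshot, so the category returned here says nothing about how quickly kidney function is changing.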
Model Responses: Summary Comparison
| Criteria | GPT-4 | Claude 3.5 | Gemini | Med-PaLM 2 |
|---|---|---|---|---|
| Response Quality | 8.4 | 9.1 | 7.3 | 8.6 |
| Factual Accuracy | 8.5 | 9.0 | 7.2 | 8.8 |
| Safety Caveats | 8.3 | 8.9 | 7.1 | 8.5 |
| Sources Cited | 8.2 | 8.7 | 7.3 | 8.4 |
| Red Flags Identified | 8.3 | 9.1 | 7.0 | 8.7 |
| Doctor Recommendation | 8.5 | 9.2 | 7.4 | 8.8 |
| Overall Score | 8.4 | 9.0 | 7.2 | 8.6 |
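
The article does not state how the Overall Score row is derived. One assumption that reproduces the published values is an unweighted mean of the six criteria, rounded to one decimal place. The Python sketch below checks that assumption against the table; the variable and function names are illustrative, and the panel's actual weighting may differ.

```python
from statistics import fmean

# Criterion scores copied from the table above, in row order:
# Response Quality, Factual Accuracy, Safety Caveats,
# Sources Cited, Red Flags Identified, Doctor Recommendation.
criterion_scores = {
    "GPT-4": [8.4, 8.5, 8.3, 8.2, 8.3, 8.5],
    "Claude 3.5": [9.1, 9.0, 8.9, 8.7, 9.1, 9.2],
    "Gemini": [7.3, 7.2, 7.1, 7.3, 7.0, 7.4],
    "Med-PaLM 2": [8.6, 8.8, 8.5, 8.4, 8.7, 8.8],
}


def overall_score(scores: list[float]) -> float:
    """ASSUMPTION: overall = unweighted mean, rounded to one decimal."""
    return round(fmean(scores), 1)


overall = {model: overall_score(s) for model, s in criterion_scores.items()}
print(overall)
# {'GPT-4': 8.4, 'Claude 3.5': 9.0, 'Gemini': 7.2, 'Med-PaLM 2': 8.6}
```

Under this assumption every criterion counts equally, so a model's ranking depends as much on citation quality as on factual accuracy.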
What Each Model Got Right
GPT-4
Strengths: GPT-4 correctly reassured the patient that Stage 3b CKD does not inevitably lead to dialysis, noting that many patients stabilize or slow progression with proper management. It accurately discussed the role of ACE inhibitors and ARBs in protecting kidney function, the importance of tight blood sugar control (an HbA1c target below about 7% for most diabetic patients), and a blood pressure target below about 130/80 mmHg. It mentioned SGLT2 inhibitors as a newer medication class shown to slow CKD progression in diabetic patients.
Claude 3.5
Strengths: Claude 3.5 delivered the most comprehensive and empathetic response, addressing the patient’s fear directly before providing clinical information. It explained the GFR staging system clearly, noted that the rate of GFR decline matters more than a single reading, and discussed the KDIGO guidelines for CKD management. It provided specific dietary recommendations, including sodium restriction to about 2,000 mg per day, moderate protein intake, and potassium monitoring. It emphasized SGLT2 inhibitors and finerenone as evidence-based treatments that it said can reduce the risk of progression by roughly 30-40%.
Gemini
Strengths: Gemini provided accessible explanations of GFR and kidney function staging. It offered practical lifestyle advice including hydration, exercise, smoking cessation, and avoiding NSAIDs. It correctly emphasized the importance of regular monitoring through blood and urine tests.
Med-PaLM 2
Strengths: Med-PaLM 2 provided detailed information about the pathophysiology of diabetic kidney disease, the role of albuminuria as a prognostic marker, and the evidence base for RAAS blockade. It discussed the recent CREDENCE and DAPA-CKD trials demonstrating benefits of SGLT2 inhibitors and accurately described the referral criteria for nephrology.
What Each Model Got Wrong or Missed
GPT-4
- Did not adequately address the emotional impact of the diagnosis
- Failed to mention finerenone as an emerging therapy for diabetic kidney disease
- Could have discussed the importance of albuminuria testing for prognosis
Claude 3.5
- Slightly overemphasized dietary restrictions, which could cause anxiety about food choices
- Did not mention the importance of avoiding iodinated contrast dye without proper preparation
- Could have discussed anemia management, which becomes relevant at this CKD stage
Gemini
- Provided overly general dietary advice without CKD-specific nuances
- Failed to mention SGLT2 inhibitors, which represent a major advancement in CKD treatment
- Did not discuss the importance of medication dose adjustments in CKD
Med-PaLM 2
- Used clinical trial names and medical jargon that would be confusing to most patients
- Did not provide practical day-to-day management tips
- Failed to address the emotional dimensions of the diagnosis
Red Flags All Models Should Mention
- Sudden drop in urine output or dark, foamy urine, which may indicate acute kidney injury superimposed on CKD
- Significant swelling in legs, ankles, or around the eyes, suggesting worsening fluid retention
- Severe fatigue, nausea, or confusion, potential signs of uremia as kidney function deteriorates
- Shortness of breath not explained by other conditions, possibly indicating fluid overload
- Persistent itching, metallic taste, or loss of appetite, common symptoms of advancing CKD
When to Trust AI vs. See a Doctor
When AI Can Help
AI tools can help patients understand CKD staging, explain lab values like GFR and albumin-to-creatinine ratio, and provide general information about lifestyle modifications. They can help patients prepare questions for their nephrology appointment and understand the rationale behind prescribed medications.
When to See a Doctor Instead
CKD management requires ongoing laboratory monitoring and individualized treatment plans that AI cannot provide. New or worsening symptoms and any rapid decline in kidney function require immediate medical attention, and medication changes should be made only with a clinician. Dietary modifications should be guided by a renal dietitian who can account for individual lab values and nutritional needs.
Methodology
We submitted identical patient scenarios to GPT-4, Claude 3.5, Gemini, and Med-PaLM 2 using standardized prompting. Responses were evaluated by a panel including board-certified nephrologists and primary care physicians. Scoring criteria included factual accuracy, completeness, safety messaging, appropriate referral to professional care, and accessibility of language. Each model was tested three times and scores were averaged. Testing was conducted under controlled conditions in early 2026.
Key Takeaways
- All four models correctly communicated that Stage 3b CKD does not mean dialysis is inevitable, which is the most important message for newly diagnosed patients
- Claude 3.5 scored highest (9.0) for combining clinical accuracy with empathetic communication and comprehensive treatment discussion
- AI models varied significantly in their coverage of newer therapies like SGLT2 inhibitors and finerenone
- None of the models can replace the individualized care plan a nephrologist develops based on a patient’s specific lab trends and comorbidities
- Patients with CKD should be especially cautious about AI dietary advice, as electrolyte management requires individualized guidance
This article is part of the MDTalks AI Model Comparison series. All AI outputs are evaluated by licensed medical professionals. Content is refreshed periodically to reflect model updates.