What is Cross-Care?

Start Here

April 30, 2023

Written by BittermanLab

Cross-Care

Cross-Care is a research initiative that studies large language models (LLMs), with a focus on how they are applied in healthcare.

The Importance of Benchmarks

Benchmarks play a crucial role in evaluating the performance, limitations, and robustness of LLMs. Well-known benchmarks like GLUE and SuperGLUE have been foundational in assessing language understanding and task performance. However, today's challenges extend beyond what these benchmarks measure: domain knowledge, safety, hallucinations, and biases all matter, especially in sensitive areas like healthcare. These issues are important because they can contribute to disparities in healthcare outcomes and affect the quality of care delivered.

Investigating Representational Biases

Our research specifically targets representational biases in LLMs concerning medical information. We analyze how biases in the data used to train these models can affect their outputs, particularly how diseases are associated with different demographic groups. By studying data from "The Pile," a large dataset used for training LLMs, we examine these biases and their impact on model behavior.
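As a rough illustration of what counting such associations can look like, the sketch below tallies how often a disease keyword appears within a fixed token window of a demographic keyword. It is a simplified sketch, not the pipeline used in this research: the keyword lists, the window size, the function names, and the toy documents are all assumptions made up for the example.

```python
import re
from collections import Counter

# Hypothetical keyword lists, chosen only for illustration.
DISEASE_TERMS = {"diabetes": {"diabetes", "diabetic"}, "asthma": {"asthma"}}
DEMOGRAPHIC_TERMS = {"women": {"women", "woman"}, "men": {"men", "man"}}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def positions(tokens, terms):
    """Return the indices of all tokens that match one of the terms."""
    return [i for i, tok in enumerate(tokens) if tok in terms]


def cooccurrence_counts(documents, window=10):
    """Count disease mentions that have a demographic mention nearby.

    A disease mention is counted once per demographic group if any keyword
    for that group occurs within `window` tokens of it.
    """
    counts = Counter()
    for doc in documents:
        tokens = tokenize(doc)
        for disease, d_terms in DISEASE_TERMS.items():
            d_pos = positions(tokens, d_terms)
            for group, g_terms in DEMOGRAPHIC_TERMS.items():
                g_pos = positions(tokens, g_terms)
                for i in d_pos:
                    if any(abs(i - j) <= window for j in g_pos):
                        counts[(disease, group)] += 1
    return counts


# Toy documents standing in for a training corpus (not real data).
toy_docs = [
    "Asthma is common in women who smoke.",
    "The man was diagnosed with diabetes last year.",
    "Diabetes care for women and men differs.",
]
toy_counts = cooccurrence_counts(toy_docs)
for pair, n in sorted(toy_counts.items()):
    print(pair, n)
```

On a real corpus the keyword lists would need to be much larger and carefully curated, and the choice of window size can change the counts substantially.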

Bridging the Gap Between Model Perceptions and Reality

We compare the likelihoods that models assign to diseases across demographic groups with the actual prevalence of those diseases in the same groups in the United States. This comparison helps us understand the discrepancies between how models perceive the world and what real epidemiological data show.
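One simple way to quantify this kind of comparison is a rank correlation between model scores and prevalence figures. The sketch below computes Kendall's tau (the tau-a variant, which assumes no tied values) in plain Python. It is an assumed illustration rather than the method used in this research, and the group names and numbers are placeholders invented for the example, not real model outputs or epidemiological data.

```python
def kendall_tau(x, y):
    """Kendall's tau-a between two equal-length sequences.

    Returns a value in [-1, 1]: 1 means the two rankings agree on every
    pair, -1 means they disagree on every pair.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Placeholder values for one hypothetical disease (NOT real data).
# model_scores: e.g. log-likelihoods a model assigns to each group.
# prevalence: e.g. fraction of each group with the disease.
model_scores = {"group_a": -2.1, "group_b": -3.0, "group_c": -2.5, "group_d": -4.0}
prevalence = {"group_a": 0.05, "group_b": 0.12, "group_c": 0.08, "group_d": 0.03}

groups = sorted(model_scores)
tau = kendall_tau(
    [model_scores[g] for g in groups],
    [prevalence[g] for g in groups],
)
print(f"Kendall tau between model ranking and prevalence ranking: {tau:.2f}")
```

A tau near 1 would indicate that the model ranks groups in the same order as the real-world data, while values near 0 or below indicate that the model's ranking is unrelated to, or at odds with, actual prevalence.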

Contributions and Tools for the Community

Our work contributes to the field by:

Analyzing the associations between demographic groups and disease keywords in training datasets.

Examining how these biases are represented across different models, regardless of their size or architecture.

Comparing model-derived perceptions to real-world data to spotlight the inconsistencies.

This website (crosscare.net) allows users to explore this data and download detailed findings for use in further research on model interpretability and robustness.

This research not only illuminates the biases present in LLMs but also equips researchers and practitioners with the necessary tools to develop more equitable and effective NLP systems for healthcare.

Continue reading about what we found in language models' training data...