Dataset
DatingProfiles-100K
100,000 de-identified dating-app profiles with free-text bios and structured attributes.
Profiles: 100,000
Avg bio length: 180 words
Languages: EN, ES, PT
Regions: NA, EU, LATAM (coarsened)
A de-identified corpus of 100K dating-app profiles for matching, personalization, and conversational AI work. Each record includes a free-text self-description, a structured attribute set (age band, location coarsened to region, interests, prompt responses), and engagement features (response rate, bucketed). All PII has been removed, and the consent and terms-of-service chain is documented.
Data sample
What a record looks like
Sample profile record
JSON (illustrative; full sample available under NDA)
{
  "profile_id": "prf_8a3f...",
  "age_band": "25-34",
  "region": "NA-Northeast",
  "bio": "Engineer who loves bouldering and bad puns. Looking for someone who picks the wine.",
  "interests": ["climbing", "travel", "cooking"],
  "prompts": [{"q": "Two truths and a lie", "a": "..."}],
  "engagement": {"response_rate": "medium"}
}
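As a rough illustration of how a record shaped like the sample above might be checked before use, here is a minimal Python sketch. The field names and types are read off the illustrative record only. The function name `validate_record` and the set of response-rate buckets (`low`, `medium`, `high`) are assumptions for the example, since the listing shows only `"medium"` and does not publish a formal schema.

```python
import json

# Field names and types inferred from the illustrative record; this is not
# an official schema for the dataset.
REQUIRED_FIELDS = {
    "profile_id": str,
    "age_band": str,
    "region": str,
    "bio": str,
    "interests": list,
    "prompts": list,
    "engagement": dict,
}

# Assumption: only "medium" appears in the sample, the other buckets are guesses.
ASSUMED_RATE_BUCKETS = {"low", "medium", "high"}


def validate_record(record: dict) -> list:
    """Return a list of problems found; an empty list means the record passed."""
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            problems.append(f"wrong type for {name}")

    # Each prompt entry is expected to carry a question ("q") and an answer ("a").
    prompts = record.get("prompts")
    if isinstance(prompts, list):
        for entry in prompts:
            if not isinstance(entry, dict) or not {"q", "a"} <= entry.keys():
                problems.append("prompt entry lacks 'q' or 'a'")

    # Check the response-rate bucket against the assumed set of labels.
    engagement = record.get("engagement")
    if isinstance(engagement, dict):
        rate = engagement.get("response_rate")
        if rate not in ASSUMED_RATE_BUCKETS:
            problems.append(f"unexpected response_rate: {rate!r}")
    return problems


# The illustrative record from the listing, with elided values left as-is.
SAMPLE = json.loads("""
{
  "profile_id": "prf_8a3f...",
  "age_band": "25-34",
  "region": "NA-Northeast",
  "bio": "Engineer who loves bouldering and bad puns. Looking for someone who picks the wine.",
  "interests": ["climbing", "travel", "cooking"],
  "prompts": [{"q": "Two truths and a lie", "a": "..."}],
  "engagement": {"response_rate": "medium"}
}
""")

print(validate_record(SAMPLE))
```

Returning a list of problems rather than raising on the first one makes it easier to audit a whole file in a single pass; the real bucket labels and any additional fields would need to be confirmed against the full sample.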