Dataset
MeetingHours-100K
100,000 hours of consented business meeting audio with verbatim transcripts and speaker labels.
Hours: 100,000
Meetings: ~180,000
Avg speakers: 4.2 per meeting
Languages: EN (primary), ES, FR, DE
Long-form business meeting audio recorded with explicit participant consent. Each meeting ships with verbatim transcripts, speaker diarization, role labels (host / presenter / participant), and meeting-type metadata (sales call, standup, customer success, board, etc.). Useful for meeting AI, summarization, action-item extraction, and ASR fine-tuning.
Data sample
What a record looks like
Sample meeting transcript segment
JSON (illustrative; full sample available under NDA)
{
  "meeting_id": "mtg_2024_q3_18204",
  "meeting_type": "sales_discovery",
  "duration_sec": 2415,
  "speakers": ["S1", "S2", "S3"],
  "segments": [
    {"t": 12.4, "speaker": "S1", "text": "Thanks for jumping on. Can you walk me through your current pipeline?"}
  ]
}
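As a rough sketch of how a record like this might be consumed, the Python below parses the illustrative sample, runs two sanity checks, and groups utterances by speaker. It relies only on the fields shown in the sample. The reading of `t` as a segment start offset in seconds is an assumption, and the helper names `load_record` and `text_by_speaker` are hypothetical, not part of any published loader. The full schema may differ from the illustrative one.

```python
import json
from collections import defaultdict

# Illustrative record copied from the sample above; the real schema may differ.
SAMPLE = """
{
  "meeting_id": "mtg_2024_q3_18204",
  "meeting_type": "sales_discovery",
  "duration_sec": 2415,
  "speakers": ["S1", "S2", "S3"],
  "segments": [
    {"t": 12.4, "speaker": "S1", "text": "Thanks for jumping on. Can you walk me through your current pipeline?"}
  ]
}
"""


def load_record(raw: str) -> dict:
    """Parse one record and check that its segments are internally consistent."""
    record = json.loads(raw)
    known_speakers = set(record["speakers"])
    for seg in record["segments"]:
        # Every segment should reference a speaker declared at the top level.
        if seg["speaker"] not in known_speakers:
            raise ValueError(f"unknown speaker {seg['speaker']!r}")
        # Assumption: "t" is an offset in seconds from the start of the meeting.
        if not 0 <= seg["t"] <= record["duration_sec"]:
            raise ValueError(f"segment time {seg['t']} outside meeting duration")
    return record


def text_by_speaker(record: dict) -> dict:
    """Group utterance text by speaker label, ordered by segment time."""
    grouped = defaultdict(list)
    for seg in sorted(record["segments"], key=lambda s: s["t"]):
        grouped[seg["speaker"]].append(seg["text"])
    return dict(grouped)


record = load_record(SAMPLE)
turns = text_by_speaker(record)
print(record["meeting_id"], record["meeting_type"], f"{record['duration_sec'] / 60:.2f} min")
for speaker, texts in turns.items():
    print(speaker, len(texts), "segment(s)")
```

Speakers listed in `speakers` who have no segments in the excerpt (here `S2` and `S3`) simply do not appear in the grouped output, which is expected for a truncated sample.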