Four channels — online, offline, documents, transcripts — and the one architectural choice that joins them into a single participant record instead of four disconnected exports.
By Unmesh Sheth · Sopact
§ 3.0 · Where this chapter sits
Where this chapter sits
From design to data flowing in.
Chapter 02 told you what to measure. This chapter is the mechanics of getting
it in — including the three channels traditional survey tools ignore.
Chapters in Beyond the Survey
00 · Introduction · 8 pages
01 · Workflow · 22 pages
02 · Data Design · 17 pages
03 · Data Collection · you are here
04 · Intelligent Suite · next chapter
05 · Actionable Insight · ~18 pages
The library
Book 01 · this book
Beyond the Survey
The foundational field guide — methodology for the AI era.
One unified intelligence layer across many programs.
CHAPTER · 03
Data Collection.
Surveys are one channel. Real collection has four — and pretending the
other three don't exist is why "impact data" is usually missing the most
important parts of what people actually said.
04 · One cohort, four channels, one record — end-to-end
Time to read
12 min
16 pages · 22 illustrations
§ 3.1 · Why "survey" is too small
Chapter 03 · §3.1
Most impact data isn't in the survey.
Application essays, exit interviews, financial documents, partner audits,
field photos, voice memos from rural visits. These are the most evidence-rich
parts of any program — and traditional survey tools can't accept any of them.
What a survey tool sees
Q1
Rate confidence 1–5
Q2
Select your demographic
Q3
Open text · 200 char max
A flat schema of typed cells. Anything else is "out of scope."
What's actually in the program
📝
Web form
scales + open-ended
📱
Mobile offline
photos + voice
📄
Application PDFs
essays · recs · transcripts
🎙
Exit interviews
transcripts + tags
📊
Quarterly metrics
structured CSVs
🗂
Social audits
3rd-party PDFs
Six input shapes, one record. Each format is data — not exhaust.
If your tool can only handle questions and answers, you're collecting
maybe 30% of what your program actually produces.
§ 3.2 · Four channels
Chapter 03 · §3.2
Four channels. One stakeholder ID.
The unlock isn't accepting more formats — it's keeping them all linked to the
same person. A unique stakeholder ID assigned at first contact survives across
every channel that comes after.
∞
The architectural choice: persistent ID from first contact.
Same person fills a web form in March, gets interviewed in June, submits a
PDF in September. All three land on the same record. No reconciliation,
no VLOOKUPs, no consultant gluing exports together.
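The join described above can be sketched in a few lines. This is a hypothetical Python illustration, not Sopact's implementation; `merge_channels` and the feed shapes are invented for the example:

```python
# Hypothetical sketch: channel feeds landing on one record, keyed by a
# persistent stakeholder_id assigned at first contact.
from collections import defaultdict

def merge_channels(*feeds):
    """Join rows from any number of channel feeds on stakeholder_id.

    Each feed is a list of dicts carrying a 'stakeholder_id' key;
    everything else (the channel payload) accumulates on that record.
    """
    records = defaultdict(list)
    for feed in feeds:
        for row in feed:
            records[row["stakeholder_id"]].append(row)
    return dict(records)

# March web form, June interview, September PDF — same person.
web_form   = [{"stakeholder_id": "p_a7f3", "channel": "online", "confidence": 4}]
interview  = [{"stakeholder_id": "p_a7f3", "channel": "transcript", "theme": "confidence"}]
pdf_fields = [{"stakeholder_id": "p_a7f3", "channel": "document", "net_zero_year": 2035}]

record = merge_channels(web_form, interview, pdf_fields)
# record["p_a7f3"] now holds all three rows: one person, zero VLOOKUPs.
```

The point is architectural, not algorithmic: because the ID travels with every row, the join is a dictionary lookup rather than a reconciliation project.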
§ 3.3 · Online
Channel 01 · Online
Web forms, but unique-link.
Online surveys are familiar; the catch is what most tools get wrong: one
generic URL for the whole cohort. A unique link per respondent is the
difference between "we got 200 responses" and "we know which 200 people they
were and what each of them said the last time too."
Generic URL · the old way
survey.example.com/q3-feedback
Identity collected inside the form (if at all). Email retypes. Duplicates
pile up. Pre/post linkage by hand.
Unique link · designed
sense.app/f/q3?id=p_a7f3
Identity in the URL. Form pre-fills what's known. Respondent can edit
later via the same link. Pre/post linkage is a calculation, not a project.
EMBED
Iframe into any LMS, website, or partner portal. Same unique-link logic.
SAVE-PROGRESS
Long applications resume where the respondent left off. Days later, on any device.
SUBMISSION ALERT
Email triggered on submit with full payload — route to staff or downstream system.
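A minimal sketch of the unique-link idea, assuming an HMAC-signed token so links can't be forged by guessing IDs. The URL scheme follows the example above; the signing scheme and function names are illustrative, not Sopact's actual API:

```python
# Hypothetical unique-link router: one opaque, signed token per respondent
# embedded in the URL, so identity never has to be retyped inside the form.
import hmac
import hashlib

SECRET = b"rotate-me"  # server-side signing key (illustrative)

def unique_link(form_slug: str, stakeholder_id: str) -> str:
    # Sign the ID so a tampered or guessed ID fails verification.
    sig = hmac.new(SECRET, stakeholder_id.encode(), hashlib.sha256).hexdigest()[:8]
    return f"https://sense.app/f/{form_slug}?id={stakeholder_id}&sig={sig}"

def verify(stakeholder_id: str, sig: str) -> bool:
    expected = hmac.new(SECRET, stakeholder_id.encode(), hashlib.sha256).hexdigest()[:8]
    return hmac.compare_digest(expected, sig)

link = unique_link("q3", "p_a7f3")
# link → "https://sense.app/f/q3?id=p_a7f3&sig=..." — same link pre-fills,
# resumes, and edits, because the server knows who it was issued to.
```

One design note: the same signed link doubles as the save-progress and edit-later mechanism, which is why pre/post linkage becomes a calculation rather than a matching project.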
§ 3.4 · Offline
Channel 02 · Offline
Capture now. Sync when connected.
The respondents you most need to hear from often have the worst connectivity:
rural farmers, refugee settlements, field staff on partner visits. Mobile
offline-first capture is the difference between hearing them and writing them
out of your data.
PHOTO
Camera-roll photo attached to a response. AI describes contents at sync time. Evidence, not exhibit.
VOICE
Press-hold to record a 30s voice memo in any language. Transcribed + tagged after sync.
GPS / TIMESTAMP
Optional location + timestamp on each response. Useful for field-monitor accountability.
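The capture-now, sync-later loop can be sketched as a local queue that survives failed uploads. This is a hypothetical illustration of the pattern; the class and field names are invented, and real clients add durable storage and retry backoff:

```python
# Minimal offline-first sketch: responses queue locally with a capture-time
# timestamp (and optional GPS fix), then flush when connectivity returns.
import time

class OfflineQueue:
    def __init__(self, send):
        self.pending = []   # responses captured while offline
        self.send = send    # upload callable; raises ConnectionError offline

    def capture(self, stakeholder_id, payload, gps=None):
        self.pending.append({
            "stakeholder_id": stakeholder_id,
            "payload": payload,
            "gps": gps,                  # optional location fix
            "captured_at": time.time(),  # stamped at capture, not at sync
        })

    def sync(self):
        """Try to upload everything; keep failures for the next attempt."""
        still_pending = []
        for item in self.pending:
            try:
                self.send(item)
            except ConnectionError:
                still_pending.append(item)
        self.pending = still_pending
        return len(still_pending)

# Simulate a field day: two pulses captured with no signal, then a sync
# on the commute once connectivity returns.
uploaded, online = [], {"up": False}

def send(item):
    if not online["up"]:
        raise ConnectionError("no signal")
    uploaded.append(item)

q = OfflineQueue(send)
q.capture("p_a7f3", {"confidence": 4})
q.capture("p_b912", {"confidence": 3}, gps=(12.97, 77.59))
q.sync()             # offline: both stay queued
online["up"] = True
q.sync()             # back online: both flush
```

Timestamping at capture rather than at sync is the detail that matters: a pulse recorded Tuesday in a no-signal village is Tuesday's data, even if it uploads Friday.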
§ 3.5 · Documents
Channel 03 · Documents
PDFs are data, not attachments.
Application essays, financial statements, social audits, grantee reports —
these arrive as documents. Treated as "attachments" they sit at the bottom of
the record unread. Treated as data, every page becomes searchable evidence
with a citation you can click.
PDF input
Sustainability Report 2025
… committed to net-zero emissions by 2035 through a 40% renewable energy mix by year-end 2026 …
… diversity on the board grew from 28% to 41% women-identifying members …
page 12 of 47
Structured output
net_zero_year → 2035 · p.12
renewable_pct_target → 40% · p.12
board_diversity_pct → 41% · p.12
prior_year_diversity → 28% · p.12
Every value clicks back to the page it came from. No "trust me" extracts.
Common extracts
Numbers (spend, runway, headcount) with units
Claims + commitments, tagged by framework
Demographics from rec letters or essays
Compliance items checked against checklists
What survives
Page-level citation per extracted value
Original source quote, in source language
Confidence score on each extraction
Human-override path when AI gets it wrong
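The "what survives" list above implies a shape for each extracted value. Here is one hypothetical way to model it in Python; the field names are illustrative, not Sopact's schema:

```python
# Hypothetical record for one extracted value: the number itself plus the
# provenance that must survive — page citation, source quote, confidence,
# and a human-override path.
from dataclasses import dataclass, replace
from typing import Optional

@dataclass(frozen=True)
class Extraction:
    field: str
    value: str
    page: int                       # page-level citation
    quote: str                      # original source quote, source language
    confidence: float               # 0.0–1.0 from the extractor
    override: Optional[str] = None  # human correction, if any

    @property
    def final(self) -> str:
        # The override, when present, always wins over the AI value.
        return self.override if self.override is not None else self.value

net_zero = Extraction("net_zero_year", "2035", page=12,
                      quote="committed to net-zero emissions by 2035",
                      confidence=0.97)
corrected = replace(net_zero, override="2036")  # AI got it wrong; human fixes
```

Keeping the original value, quote, and confidence alongside the override is what makes every number in a report clickable back to its page, and every correction auditable.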
§ 3.6 · Transcripts
Channel 04 · Transcripts
Interviews become queryable.
A 30-minute exit interview used to be "we'll listen to it later." Now it's
auto-transcribed with speaker labels and timestamps before the call ends —
and every line is joined to the same participant record as their survey.
Auto-transcript · timestamped
[00:02:14] Interviewer: Walk me through the moment you realized the program was working for you.
[00:02:24] Participant: Probably week 6. I was helping a peer debug an API call and I didn't have to look anything up.
[00:04:08] Interviewer: What changed for you outside of the technical skills?
[00:04:18] Participant: My partner finally stopped asking "did you fix anything today?" like it was a joke.
[00:08:42] Participant: Confidence in interviews is real now. Not faked.
AI distinguishes interviewer from interviewee. Multi-party calls are split per speaker.
TIMESTAMP JOIN
Every claim in a report clicks back to the second of the recording it came from.
SOURCE LANGUAGE
Interview in Swahili? Transcribe in Swahili, tag in English, report in Portuguese.
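A queryable transcript reduces to speaker-labeled, time-coded segments joined to a stakeholder_id. A minimal sketch, with an invented structure and sample data drawn from the excerpt above:

```python
# Hypothetical transcript model: each segment carries a second-level offset,
# a speaker label, and theme tags, so any quote clicks back to its moment.
from dataclasses import dataclass, field

@dataclass
class Segment:
    t: int               # offset into the recording, in seconds
    speaker: str         # "interviewer" | "participant"
    text: str
    tags: list = field(default_factory=list)

def quotes_by_theme(stakeholder_id, segments, theme):
    """All participant quotes carrying a theme tag, with timestamps."""
    return [(stakeholder_id, s.t, s.text)
            for s in segments
            if s.speaker == "participant" and theme in s.tags]

segments = [
    Segment(134, "interviewer", "Walk me through the moment..."),
    Segment(144, "participant", "Probably week 6...", tags=["confidence"]),
    Segment(522, "participant", "Confidence in interviews is real now.",
            tags=["confidence"]),
]
hits = quotes_by_theme("p_a7f3", segments, "confidence")
# hits → two time-coded quotes for one participant, joined to their record
```

This is the "timestamp join" in data terms: a theme query returns (person, second, quote) triples, so a claim in a report is one click from the audio that backs it.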
§ 3.7 · Worked example
Chapter 03 · §3.7
One cohort. Four channels. One participant record.
A coding bootcamp cohort, 60 learners, 14 weeks. Watch how each of the four
channels delivers different data — and how all four land on the same record
without anyone joining them by hand.
02
DOCUMENTS · INTAKE
PDFs extracted into structured fields · joined on stakeholder_id automatically
180 PDFs read
page-cited evidence
03
OFFLINE · WEEKS 1–14
Weekly mobile pulse-checks
14 quick check-ins per learner · sync on commute · early at-risk signals
~840 pulses
99% sync rate
04
TRANSCRIPTS · WEEK 14
Exit interviews · 30 min each
Auto-transcribed with speaker labels · themed in real time · joined on id
54 interviews
~1620 quote-tags
The result · one record per learner, four channels deep
Day-1 demographics from the form. Application evidence from the PDFs.
Weekly pulse data from the phone. Closing reflections from the interview.
Same stakeholder_id, four data shapes, zero reconciliation.
The cohort report writes itself the morning week 14 ends.
§ 3.8 · Collection patterns
Chapter 03 · §3.8
Five patterns, by program type.
Different programs lean on different channel mixes. Recognizing your pattern
shortcuts the architecture phase from weeks to hours.
📝
Workforce training
online + offline mix · pulse-heavy
📄
Application-driven
document-heavy · scholarships, accelerators
🎙
Participatory eval
transcript-heavy · book 05
📊
Impact-investor
document + structured · book 03
📱
Field / rural
offline-first · book 04
Two patterns in detail
Workforce training (next page) — online intake + weekly offline pulse + exit transcript.
Heaviest on volume of small data points across many weeks. Pulse data is the
differentiator versus old-school pre/post-only.
Application-driven (page after) — document-heavy intake (essays, recs, financials)
+ structured online review forms. Each applicant generates 4–6 documents, all
joined to one applicant_id.
The remaining three patterns get full treatment in their domain books.
§ 3.8.1 · Workforce training
Pattern 01 of 05
Workforce training.
Online intake at week 0, mobile pulses every week, document-light, transcript at the end.
Continuous signal — not just a two-wave snapshot.
Cohort size
30–80 learners
Cadence
weekly pulse
Primary channels
Online + Offline
01
Online intake
Unique-link form, week 0
Demographics + goals + prior skill
Accommodations + language preference
stakeholder_id assigned here
02
Mobile pulse
30-second weekly check-in
Confidence + blocker + 1 photo
Captured offline, syncs on commute
Voice memo optional, in source lang
03
Capstone artifact
Project PDF or repo link
Extracted: stack, complexity, themes
Linked to same stakeholder_id
Reviewer rubric joined on submit
04
Exit interview
30-min auto-transcribed call
Speaker-labeled, time-coded
Themed against pulse history
Joined to T0 record automatically
05
+6mo follow-up
Same unique link as week 0
Wage / placement / retention
Open: "what's changed since?"
Pre/post + longitudinal in one shot
The win
Pulse data surfaces at-risk learners by week 3. Capstone evidence is read,
not skimmed. Exit interviews are queryable by theme. +6mo response rate is
77% — because the same unique link still works.
§ 3.8.2 · Application-driven
The win
500 scholarship applications reviewed in two days instead of three weeks.
Every decision auditable, every score citation-backed. The selected cohort flows
straight into the pattern-01 workforce-training channel mix without re-entering data.
§ 3.9 · The accelerant
Chapter 03 · §3.9
How Sopact Sense handles all four channels.
Four channels could mean four tools. In Sopact Sense it's one — built around
Contacts, Forms, and Relationships, with Skills handling the channel-specific
work that traditional tools can't.
THE PLATFORM
Sopact Sense
Four channels, one platform. Contacts hold the unique IDs. Forms handle
the structured input. Relationships keep documents and transcripts joined
to the right person.
One stakeholder_id, four channel feeds, zero reconciliation.
THE ACCELERANT
Skills
Prepackaged playbooks for the channel-specific moves that take a lot of
configuration to get right the first time — and zero configuration on every
subsequent cohort.
{ } unique-link-router
Generates per-respondent URLs and pre-fills known fields.
{ } offline-sync-monitor
Tracks sync state across field devices; flags missing data.
{ } document-extractor
Pulls structured fields from PDFs with page-level citations.
{ } transcript-importer
Brings audio/video into the record with speaker labels and themed quotes.
These Skills run inside Sopact Sense. They aren't shipped as standalone files.
↑
Why this compounds
Cohort 1's transcripts teach Sense your theming vocabulary. Cohort 2 inherits
that vocabulary and adds nuance. By cohort 5 your team starts from the best
transcript pipeline it has ever had, not from configuring channel mechanics from scratch.
§ 3.10 · Recap + Up Next
Chapter 03 · §3.10
Five lessons to carry forward.
1
"Survey" is the smallest of four channels.
Online forms are one part. Documents, transcripts, and offline mobile
capture cover the other 70% of what your program actually produces.
2
Persistent ID is the architectural choice.
Unique stakeholder_id from first contact survives every channel that comes
after. Pre/post becomes a calculation, not a project.
3
Documents and transcripts are data, not attachments.
Every PDF becomes structured fields with page citations. Every interview
becomes themed quotes with timestamps. Both join on stakeholder_id.
4
Offline-first or you lose your hardest-to-reach.
Rural, field-staff, low-bandwidth participants are the ones funders most
want evidence on. Mobile capture + sync makes them part of your data,
not absent from it.
5
Pattern-match before you architect.
Five patterns cover most programs. Find yours, lift the channel mix, and
shortcut weeks of design work.
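Lesson 2's "calculation, not a project" claim is literal. With both waves keyed by the same ID, the pre/post delta is a set intersection and a subtraction — a hypothetical sketch with invented field names:

```python
# Hypothetical pre/post calculation: because every wave is keyed by the
# same stakeholder_id, the delta is a dict lookup, not a matching exercise.
def pre_post_delta(pre, post, metric):
    """Per-person change on `metric` for everyone present in both waves."""
    return {sid: post[sid][metric] - pre[sid][metric]
            for sid in pre.keys() & post.keys()}

pre  = {"p_a7f3": {"confidence": 2}, "p_b912": {"confidence": 3}}
post = {"p_a7f3": {"confidence": 5}}   # p_b912 hasn't answered yet

delta = pre_post_delta(pre, post, "confidence")
# delta → {'p_a7f3': 3}; p_b912 simply waits for their follow-up
```

People who answered only one wave fall out of the intersection automatically, which is exactly the attrition bookkeeping that otherwise eats consultant hours.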
UP NEXT
Chapter 04 · Intelligent Suite
Four channels of data arrive on one record. Now: the AI features that
analyze them — cell, row, column, grid — and the four canonical report
types they produce.
04
End of Chapter 03
END OF CHAPTER 03 · BOOK 01
Six books. One spine. Built for the AI era.
Collection done across all four channels. Transformation next — where the
Intelligent Suite turns this record into reports your funder will read.
BOOK 01
Beyond the Survey
You are here
BOOK 03
Grant Management
Industry guide
BOOK 04
Impact Investment
Industry guide
BOOK 05
Workforce Training
Industry guide
BOOK 05
Nonprofit Programs
Industry guide
BOOK 06
Application Management
Industry guide
"Four channels. One stakeholder ID. Same record growing across every form,
every document, every interview."