# Motorica Ops Kit -- Full Walkthrough

Everything in this repo, explained slide by slide. Written for the Motorica team.

Slide 1 -- What This Repo Is

motorica-ops-kit/
├── scripts/                  ← Python automation (the workhorses)
├── knowledge-graph/          ← Motorica offline brain (45 entities, queryable)
│   ├── sources/              ← 7 data files that build the brain
│   │   ├── nexus-export.jsonl          ← baseline facts (client, campaigns, patterns, KPIs)
│   │   ├── signal-landscape.jsonl      ← summary of all 5 signal feeds
│   │   ├── Motorica updated Signals - Developers from SteamDB.csv
│   │   ├── Motorica updated Signals - Future Game Releases.csv
│   │   ├── Motorica updated Signals - Studios that got funding.csv
│   │   ├── Motorica updated Signals - Signal from LinkedIn hirring jobs.csv
│   │   └── Motorica updated Signals - Hirring jobs from another source.csv
│   ├── scripts/
│   │   ├── kg_index.py       ← rebuild the brain from sources/
│   │   └── kg_query.py       ← ask the brain questions (offline, no network)
│   ├── entities.jsonl        ← built: all 45 facts (regenerated by kg_index.py)
│   ├── relationships.jsonl   ← built: how facts connect
│   ├── index.json            ← built: lookup table + counts
│   ├── taxonomy.yaml         ← what domains exist
│   └── ontology.yaml         ← what relationship types exist
├── .claude/commands/
│   ├── net-new.md            ← /net-new: find studios NOT in CRM
│   └── re-engage.md          ← /re-engage: find cold-but-engaged contacts
├── docs/
│   ├── hubspot-cli-guide.md  ← how to run every script (for non-CLI users)
│   ├── leadgrow-systems-onepager.md ← systems architecture
│   ├── linkedin-cookie-guide.md     ← how Jamie/Nathan get LinkedIn cookies
│   └── repo-walkthrough.md   ← this document
├── tests/                    ← 223 tests (business logic only, no API calls)
├── exclusions/               ← suppression CSVs (auto-generated, gitignored)
├── sends/                    ← post-send attribution CSVs (gitignored)
├── .env                      ← HUBSPOT_ACCESS_TOKEN (gitignored, never committed)
└── README.md                 ← setup instructions + weekly ops rhythm

The big idea: everything here runs locally on Motorica's machines. There are no LeadGrow servers and no network dependency, except for the HubSpot and Steam API calls made by scripts that need live data. The knowledge graph is fully offline.

Slide 2 -- The Knowledge Graph (Motorica's Brain)

🧠 Also available as a standalone deep-dive: Feynman explainer

What it is

Motorica's offline company brain — a searchable collection of 45 facts about your market, your campaigns, your ICP, and what actually works. Lives in plain files on your machine, zero cost, works without internet.

Think of it like a game's quest journal. Before you take a quest (run a campaign), you open the journal to check: Who's the target? What worked last time? What should I avoid? kg_query.py is the "open journal" button. kg_index.py is the journal updating when new intel drops.

How data flows

sources/*                kg_index.py          entities.jsonl        kg_query.py          your terminal
(7 files, you own)  ──►  REBUILD  ──►  (45 facts, auto-gen)  ──►  SEARCH / LIST / SHOW  ──►  "what converts?"

What is in it (45 entities)

Domain	Count	What it tells you	Query when...
`client`	1	ICP, pain points, winning angle, status	Starting any new campaign
`persona`	5	Who you target: Founder, Director, Manager, C-Level, IC	Segmenting a lead list
`campaign`	3	Live stats: Masters 2026 (1.12%), Post Launch (1.43%), Just Funded (8.2%)	Comparing what's working
`pattern`	1	Proven CTA: reference a specific game character/animation	Writing outreach copy
`kpi`	30	Metric definitions (values live in HubSpot/Bison)	Building reports
`signal_landscape`	5	Market snapshot: qualified studios, funding, hiring signals	Sizing pipeline
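The per-domain counts in the table can be checked against the built file. The following is a minimal sketch, assuming each line of entities.jsonl is a JSON object carrying a `domain` key. That key name is an assumption, so inspect one line of the real file before relying on it. The ID `campaign:just-funded` in the sample is illustrative.

```python
import json
from collections import Counter


def count_by_domain(lines, field="domain"):
    """Count entities per domain from an iterable of JSONL lines.

    `field` is an assumed key name, not confirmed against kg_index.py output.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        counts[json.loads(line).get(field, "unknown")] += 1
    return counts


# Illustrative in-memory data. With the real file you would pass
# open("knowledge-graph/entities.jsonl", encoding="utf-8") instead.
sample = [
    '{"id": "motorica", "domain": "client"}',
    '{"id": "campaign:masters-2026", "domain": "campaign"}',
    '{"id": "campaign:just-funded", "domain": "campaign"}',
]
counts = count_by_domain(sample)
print(dict(counts), "total:", sum(counts.values()))
```

Run against the real built file, the total should match the 45 reported by the indexer.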

How to use it

Three commands cover everything:

# SEARCH — find anything by keyword across all 45 facts
python knowledge-graph/scripts/kg_query.py search "reply rate"
python knowledge-graph/scripts/kg_query.py search "funding"
python knowledge-graph/scripts/kg_query.py search "CTA" --domain pattern

# LIST — show every fact in one category
python knowledge-graph/scripts/kg_query.py list campaign
python knowledge-graph/scripts/kg_query.py list signal_landscape

# SHOW — print one fact in full detail (use before writing any copy)
python knowledge-graph/scripts/kg_query.py show motorica
python knowledge-graph/scripts/kg_query.py show campaign:masters-2026

That's it. No database, no login, no internet. It reads the built files and prints answers.

How to rebuild (when new intel arrives)

python knowledge-graph/scripts/kg_index.py
# Output: wrote 45 entities, 44 rels -> entities.jsonl / relationships.jsonl / index.json

The rebuild is deterministic — same inputs always produce the same outputs. Zero network calls. Run it when new signal CSVs land or when you update any source file.

Think of it like a save file. You wouldn't keep playing from a three-week-old save after grinding new gear. The rebuild is your save point: it is not automatic, so run it whenever the intel changes.
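One way to check the determinism claim yourself is to hash the built files, rebuild, and hash them again. This sketch is not part of the repo; `digest` is a hypothetical helper, and the demo uses throwaway files so it runs anywhere.

```python
import hashlib
import tempfile
from pathlib import Path


def digest(paths):
    """Return one SHA-256 hex digest over the given files, in sorted path order."""
    h = hashlib.sha256()
    for path in sorted(Path(p) for p in paths):
        h.update(path.name.encode("utf-8"))
        h.update(path.read_bytes())
    return h.hexdigest()


# Self-contained demo: write a file, hash it, rewrite identical content, hash again.
with tempfile.TemporaryDirectory() as tmp:
    built = Path(tmp) / "entities.jsonl"
    built.write_text('{"id": "motorica"}\n', encoding="utf-8")
    before = digest([built])
    built.write_text('{"id": "motorica"}\n', encoding="utf-8")  # identical "rebuild"
    after = digest([built])
    print("unchanged:", before == after)
```

In the repo, you would call `digest` on entities.jsonl, relationships.jsonl, and index.json, run kg_index.py without touching sources/, and compare. A changed digest with unchanged sources would contradict the determinism claim.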

How to grow it

Drop a new knowledge-graph/sources/anything.jsonl file (one JSON object per line, tagged _type: "entity" or _type: "rel"), then re-run kg_index.py.
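A rough sketch of what such a source file could contain. Only the `_type` tag comes from this walkthrough. Every other key (`id`, `domain`, `name`, `from`, `to`, `rel`) and the record values are assumptions for illustration, so copy the real shape from an existing line in nexus-export.jsonl.

```python
import json

# Hypothetical records: only the `_type` tag is documented in this walkthrough.
records = [
    {"_type": "entity", "id": "pattern:example", "domain": "pattern", "name": "Example pattern"},
    {"_type": "rel", "from": "pattern:example", "to": "motorica", "rel": "applies_to"},
]


def to_jsonl(records):
    """Serialize records as JSONL: one JSON object per line."""
    for rec in records:
        if rec.get("_type") not in {"entity", "rel"}:
            raise ValueError(f"missing or unknown _type: {rec!r}")
    return "\n".join(json.dumps(rec, ensure_ascii=False) for rec in records) + "\n"


text = to_jsonl(records)
print(text, end="")
```

To use it, write `text` to a new file under knowledge-graph/sources/ and re-run kg_index.py.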

Slide 3 -- The Signal CSVs (Raw Ground Truth Data)

Five CSV files in knowledge-graph/sources/ contain the complete current market signal landscape. This is Motorica's raw prospecting data -- vetted, qualified, and ready to feed outreach.

CSV File	Rows	Key Numbers
Developers from SteamDB	1,791	579 qualified, 430 maybe, 600 rejected
Future Game Releases	1,904	95 strong/excellent fit, 135 moderate/qualified fit, 755 unique studios
Studios that got funding	150	22 strong fit (Wildlife $120M, Devilest $100M, etc.), 24 moderate fit
LinkedIn hiring signals	177	69 primary qualified, 166 unique companies. Top role: Senior Technical Animator
External hiring jobs	28	19 primary qualified, 24 unique companies

These feed into the KG via signal-landscape.jsonl, which captures the aggregate counts and breakdowns. When new CSV data arrives, update the JSONL summaries and rebuild.
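Recomputing the aggregate numbers for the JSONL summary might look like the sketch below. The column names `Studio` and `Status` are assumptions; the real CSV headers may differ, so pass the actual names in.

```python
import csv
import io
from collections import Counter


def summarize(csv_text, status_col, company_col):
    """Return row count, per-status counts, and unique company count for one CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    statuses = Counter()
    companies = set()
    rows = 0
    for row in reader:
        rows += 1
        statuses[row[status_col].strip().lower()] += 1
        companies.add(row[company_col].strip().lower())  # case-insensitive dedupe
    return {"rows": rows, "by_status": dict(statuses), "unique_companies": len(companies)}


# Illustrative data with hypothetical column names.
sample = """Studio,Status
Alpha Games,Qualified
Beta Forge,Rejected
alpha games,Qualified
"""
summary = summarize(sample, status_col="Status", company_col="Studio")
print(summary)
```

The resulting numbers are what you would copy into signal-landscape.jsonl before rebuilding.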

Slide 4 -- The Scripts (Weekly Rhythm)

Nine Python scripts. Every one supports --dry-run, so always dry-run first.

Weekly cadence

MONDAY MORNING (before any send)
  └─ python scripts/pull_exclusion_list.py
     └─ Upload exclusions/suppression-YYYY-MM-DD.csv to Bison + Heyreach

AFTER EACH BATCH DEPLOY (from Bison)
  └─ python scripts/update_campaign_date.py sends/<that-batch>.csv

Script reference

Script	What it does	When
`pull_exclusion_list.py`	Exports who NOT to email -- customers, active deals, sequenced, sales-touched, inside 60-day cooldown	Weekly, before sends
`update_campaign_date.py`	Stamps `last_campaign_contact_date` + first-touch attribution on HubSpot contacts	After every batch
`steam_signals.py`	Scans Steam's free APIs for locomotion-genre studios in pre/mid-production and cross-references the CRM, recording the result in an `in_crm` column	Weekly or on-demand
`find_reengagement_candidates.py`	Finds engaged-but-cold contacts (past cooldown, replied/connected before, not a customer) and writes them to a ranked CSV	Before re-engagement push
`linkedin_connections.py`	Exports Jamie's and Nathan's LinkedIn connections and marks the matching HubSpot contacts as `sales_team_touched`	Onboarding + quarterly
`setup_crm_properties.py`	Creates 4 custom HubSpot properties (idempotent)	Once, first
`backfill_taxonomy.py`	Normalizes priority, job title, journey stage, contact type on enriched contacts	Once, after setup
`backfill_contact_taxonomy.py`	Normalizes country, industry, company size from enriched CSV	Once, after setup
`outreach.py`	Cold lead resurfacing -- identifies cold leads, researches signals, generates angles, writes send CSV	On-demand
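The exclusion criteria listed for pull_exclusion_list.py can be read as a single decision function. This is a sketch of that logic, not the script's actual code. The `Contact` fields are illustrative rather than the real HubSpot schema, and treating day 60 itself as outside the cooldown is an assumption.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

COOLDOWN_DAYS = 60  # cooldown length stated in this walkthrough


@dataclass
class Contact:
    # Field names are illustrative, not the real HubSpot schema.
    email: str
    is_customer: bool = False
    has_active_deal: bool = False
    in_sequence: bool = False
    sales_team_touched: bool = False
    last_campaign_contact_date: Optional[date] = None


def exclusion_reason(c, today):
    """Return why a contact should be suppressed, or None if it may be emailed."""
    if c.is_customer:
        return "customer"
    if c.has_active_deal:
        return "active_deal"
    if c.in_sequence:
        return "sequenced"
    if c.sales_team_touched:
        return "sales_touched"
    last = c.last_campaign_contact_date
    # Assumption: a contact touched exactly 60 days ago is out of cooldown.
    if last is not None and today - last < timedelta(days=COOLDOWN_DAYS):
        return "cooldown"
    return None


today = date(2026, 3, 1)
recent = Contact("ana@example.com", last_campaign_contact_date=date(2026, 2, 1))
print(exclusion_reason(recent, today))
```

Returning the reason rather than a bare boolean makes a dry-run report easier to audit.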

HubSpot custom properties (what the scripts read/write)

Property	What it means	Drives
`last_campaign_contact_date`	Date last touched by a campaign	60-day cooldown
`lg_first_touch_campaign`	First campaign that ever touched them (write-once)	Attribution
`lg_first_touch_channel`	First channel: cold_email / linkedin_outreach / event / inbound	Attribution
`sales_team_touched`	Jamie/Nathan personally reached out	Suppression
`priority`	Outreach priority: High / Medium / Low	Targeting
`contact_type`	Animator / User / VC / Influencer / Interested party	Routing
`m_job_title`	Normalized persona title	Segmentation
`hs_journey_stage`	Funnel stage (derived from lifecycle + lead status)	Re-engagement
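The write-once behaviour of the first-touch properties can be sketched as follows. `build_update` is a hypothetical helper, not the code in update_campaign_date.py. It assumes the existing property values are available as a plain dict and that both first-touch fields are set together.

```python
def build_update(existing, campaign, channel, send_date):
    """Return the properties to write for one contact after a send.

    `existing` holds the contact's current property values (assumed dict shape).
    """
    # The contact date is always refreshed; it drives the 60-day cooldown.
    update = {"last_campaign_contact_date": send_date}
    # Write-once: first-touch fields are only set while still empty.
    if not existing.get("lg_first_touch_campaign"):
        update["lg_first_touch_campaign"] = campaign
        update["lg_first_touch_channel"] = channel
    return update


fresh = build_update({}, "Just Funded", "cold_email", "2026-03-01")
touched = build_update(
    {"lg_first_touch_campaign": "Masters 2026", "lg_first_touch_channel": "event"},
    "Just Funded",
    "cold_email",
    "2026-03-01",
)
print(fresh)
print(touched)
```

A contact that already has a first touch only gets its contact date refreshed, which keeps attribution stable across later campaigns.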

Slide 5 -- The Two Commands (/net-new and /re-engage)

Claude Code slash commands that wire the data primitives into actionable briefs.

/net-new -- Find studios NOT in the CRM

Goal: fresh studios building locomotion-relevant games, with a "why now" hook and angle.

Flow:

1. Run steam_signals.py --upcoming-only to produce a CSV of all Steam studios.
2. Filter to in_crm = False (net-new only).
3. Rank by coming_soon = True (pre/mid-production is the sweet spot), then by traction.
4. WebSearch each top studio for a hook (genre, character, recent announcement).
5. Pull the winning angle from the KG to tailor a one-line opener.
6. Write the brief to sends/netnew-brief-<date>.md.
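The filter and rank steps of this flow can be sketched in a few lines. The `in_crm` and `coming_soon` columns come from this walkthrough. The `studio` and `followers` columns are assumptions standing in for whatever name and traction fields the real CSV has.

```python
import csv
import io


def as_bool(value):
    """Interpret CSV text such as 'True' or 'False' as a boolean."""
    return str(value).strip().lower() == "true"


def rank_net_new(rows, traction_col="followers"):
    """Keep studios not in the CRM, ranked by coming_soon first, then traction."""
    fresh = [r for r in rows if not as_bool(r["in_crm"])]
    return sorted(
        fresh,
        key=lambda r: (as_bool(r["coming_soon"]), float(r.get(traction_col) or 0)),
        reverse=True,
    )


# Illustrative data; `studio` and `followers` are hypothetical column names.
sample = """studio,in_crm,coming_soon,followers
Alpha Games,False,True,1200
Beta Forge,True,True,9000
Gamma Works,False,False,5000
Delta Labs,False,True,3400
"""
rows = list(csv.DictReader(io.StringIO(sample)))
ranked = rank_net_new(rows)
print([r["studio"] for r in ranked])
```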

/re-engage -- Find cold contacts worth re-touching

Goal: contacts who engaged before but went cold, with a "why now" trigger.

Flow:

1. Run find_reengagement_candidates.py --min-days-cold 30 to produce a ranked CSV.
2. Take the top ~15 by score (replied + connected + high intent) and coldness.
3. For each studio, WebSearch for a trigger (new game, funding, hiring) and cross-reference Steam signals.
4. Pull the winning angle from the KG to suggest an opener.
5. Write the brief to sends/reengage-brief-<date>.md.
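The ranking step could look something like the sketch below. The weights and the dict keys (`replied`, `connected`, `high_intent`, `days_cold`) are assumptions chosen for illustration; the real scoring lives in find_reengagement_candidates.py and may differ.

```python
# Hypothetical weights; the real script's scoring is not documented here.
WEIGHTS = {"replied": 3, "connected": 2, "high_intent": 2}


def score(candidate):
    """Sum the weights of every engagement flag that is set."""
    return sum(w for key, w in WEIGHTS.items() if candidate.get(key))


def top_candidates(candidates, min_days_cold=30, limit=15):
    """Drop contacts that are not cold enough, then rank by score and coldness."""
    eligible = [c for c in candidates if c["days_cold"] >= min_days_cold]
    ranked = sorted(eligible, key=lambda c: (score(c), c["days_cold"]), reverse=True)
    return ranked[:limit]


candidates = [
    {"email": "a@example.com", "replied": True, "days_cold": 90},
    {"email": "b@example.com", "replied": True, "connected": True, "high_intent": True, "days_cold": 45},
    {"email": "c@example.com", "connected": True, "days_cold": 20},
    {"email": "d@example.com", "replied": True, "days_cold": 120},
]
top = top_candidates(candidates)
print([c["email"] for c in top])
```

Ties on score are broken by coldness, so the longest-neglected contact surfaces first among equals.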

Same signal feed, opposite sides: /net-new = studios not in CRM yet. /re-engage = studios already in CRM but cold.

Slide 6 -- Data Flow Diagram

                    ┌─────────────────────────────────────┐
                    │         knowledge-graph/sources/     │
                    │                                     │
                    │  nexus-export.jsonl (baseline)      │
                    │  signal-landscape.jsonl (summaries)  │
                    │  5x Motorica Signals CSV files       │
                    └──────────┬──────────────────────────┘
                               │ kg_index.py rebuilds
                               ▼
                    ┌─────────────────────────────────────┐
                    │  entities.jsonl (45 facts)           │
                    │  relationships.jsonl (44 links)      │
                    │  index.json (lookup)                 │
                    └──────────┬──────────────────────────┘
                               │ kg_query.py reads
                               ▼
               ┌───────────────────────────────┐
               │  "what is our ICP?"            │
               │  "which CTA converts?"        │
               │  "how many funded studios?"   │
               │  "what is the reply rate?"     │
               └───────────────────────────────┘


  HubSpot CRM                          Steam Public APIs
       │                                      │
       │  HUBSPOT_ACCESS_TOKEN                │  (free, no auth)
       ▼                                      ▼
  ┌──────────────┐                   ┌────────────────┐
  │ pull_excl.   │                   │ steam_signals  │
  │ update_camp. │                   │  .py           │
  │ find_reeng.  │                   │                │
  │ linkedin_    │                   │ in_crm column  │
  │ connections  │                   │ cross-refs CRM │
  └──────┬───────┘                   └───────┬────────┘
         │                                   │
         ▼                                   ▼
  ┌──────────────┐                   ┌────────────────┐
  │ exclusions/  │                   │ sends/steam-   │
  │ suppression  │                   │ signals-*.csv  │
  │ -YYYY-MM-DD  │                   │                │
  └──────┬───────┘                   └───────┬────────┘
         │                                   │
         │  upload to Bison                  │  feed /net-new
         │  + Heyreach                       │  + /re-engage
         ▼                                   ▼
  ┌──────────────────────────────────────────────────┐
  │              Bison / Heyreach                     │
  │         (campaign send platforms)                 │
  └──────────────────────────────────────────────────┘

Slide 7 -- Setup Checklist (New Machine)

1. Clone + install

git clone <this-repo>
cd motorica-ops-kit
pip install -r requirements.txt
npm install -g @hubspot/cli
npm install -g @bcharon/linkedincli    # Jamie/Nathan only

2. Create `.env`

HUBSPOT_ACCESS_TOKEN=pat-eu1-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
LINKEDIN_LI_AT=AQED...                 # Jamie/Nathan only
LINKEDIN_JSESSIONID=ajax:...           # Jamie/Nathan only
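For reference, reading a file in this format and logging a token safely can be sketched as below. This is an illustration only: the repo's scripts may load `.env` differently (for example through a library), and `parse_env` and `mask` are hypothetical helpers. The token value is a placeholder.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, ignoring blanks, comment lines, and trailing ' #' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.split(" #", 1)[0].strip()
    return values


def mask(token, keep=4):
    """Show only the first `keep` characters so the token never lands in logs."""
    if len(token) <= keep:
        return "*" * len(token)
    return token[:keep] + "*" * (len(token) - keep)


# Placeholder values, not real credentials.
sample = """HUBSPOT_ACCESS_TOKEN=pat-eu1-abcdef
LINKEDIN_LI_AT=placeholder-cookie     # Jamie/Nathan only
"""
env = parse_env(sample)
print("HubSpot token:", mask(env["HUBSPOT_ACCESS_TOKEN"]))
```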

3. First-time setup (run once)

python scripts/setup_crm_properties.py --dry-run
python scripts/setup_crm_properties.py

python scripts/backfill_taxonomy.py --dry-run
python scripts/backfill_taxonomy.py

4. Verify

python -m pytest                          # 223 tests should pass
python knowledge-graph/scripts/kg_query.py show motorica   # should print ICP
python scripts/pull_exclusion_list.py --dry-run            # should count contacts

Slide 8 -- Common Questions

Q: Where do send CSVs come from? A: From Bison. After a batch deploys, export the sent contacts as a CSV with columns email, send_date, sequence_name. Put it in sends/, then run update_campaign_date.py against it.

Q: How do I get a HubSpot token? A: HubSpot > Settings > Integrations > Private Apps > hubspot-cli-agent > Auth tab. Copy the token. If it does not exist, Chris needs to create the private app with the required scopes.

Q: The KG indexer says 45 entities -- is that right? A: Yes. 1 client + 5 personas + 3 campaigns + 1 pattern + 30 KPIs + 5 signal landscape summaries = 45.

Q: What if new signal CSVs arrive? A: Drop them in knowledge-graph/sources/, update signal-landscape.jsonl with new aggregate numbers, then run kg_index.py. The CSVs themselves are the raw data; the JSONL is the structured summary.

Q: Can I edit the knowledge graph? A: Yes, but edit the sources, not the built files. entities.jsonl, relationships.jsonl, and index.json are overwritten every time kg_index.py runs. To add facts permanently, add them to sources/nexus-export.jsonl or drop a new sources/*.jsonl file, then rebuild.

Q: What if a script errors? A: Check: (1) .env exists with a valid token, (2) pip install -r requirements.txt ran, (3) the HubSpot app has all required scopes (see docs/hubspot-cli-guide.md appendix). All scripts support --dry-run -- test there first.

Q: How do sequencers/enrollment work? A: Not from this repo. HubSpot sequences are managed in the UI. This repo handles the data layer: who to exclude, who to re-engage, what signal justifies contacting them now. The actual enrollment happens in Bison/Heyreach (cold email) or HubSpot workflows (sequences).
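Before running update_campaign_date.py on a Bison export (first question above), it can help to sanity-check the file. The sketch below validates the three documented columns and keeps the latest send per contact. It assumes send_date is ISO formatted (YYYY-MM-DD), which this walkthrough does not confirm, and `latest_send_per_email` is a hypothetical helper.

```python
import csv
import io

REQUIRED_COLUMNS = ("email", "send_date", "sequence_name")


def latest_send_per_email(csv_text):
    """Map each email to its most recent send row.

    Assumes ISO dates (YYYY-MM-DD), so string comparison orders them correctly.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"send CSV is missing columns: {missing}")
    latest = {}
    for row in reader:
        email = row["email"].strip().lower()
        send_date = row["send_date"].strip()
        if not email or not send_date:
            continue  # skip incomplete rows rather than guessing
        if email not in latest or send_date > latest[email]["send_date"]:
            latest[email] = {
                "send_date": send_date,
                "sequence_name": row["sequence_name"].strip(),
            }
    return latest


# Illustrative export with a duplicate contact.
sample = """email,send_date,sequence_name
ana@example.com,2026-02-03,Just Funded
Ana@example.com,2026-02-17,Just Funded
ben@example.com,2026-02-10,Post Launch
"""
sends = latest_send_per_email(sample)
print(sends)
```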

Slide 9 -- The Golden Rules

1. --dry-run first, always. No exceptions.
2. Pull exclusion list before every send. Upload to Bison + Heyreach.
3. Stamp campaign dates after every send. That is what powers the 60-day cooldown.
4. Query the KG before writing any copy. The ICP, winning angle, and proven CTA are there.
5. Never share the .env token. It is a password for the entire CRM.
6. Rebuild the KG when source data changes. kg_index.py is deterministic and pure -- same inputs, same outputs.
7. The HubSpot UI is for one record. Scripts are for bulk. Do not try to use the hubspot CLI for single-record lookups -- the web UI is faster.