1. The problem keyword alerts can’t solve
Suppose you sell a tool that automates Reddit lead discovery. A keyword alert for “Reddit leads” will fire on every post that mentions Reddit and leads, regardless of whether the author wants to buy something or sell something. The signal-to-noise ratio collapses fast.
A lexical match doesn’t know the difference between “I’m looking for a tool to find Reddit leads” (a buyer) and “5 ways our tool finds Reddit leads” (a competitor’s blog post). A semantic match does. That’s the entire reason SignalPipe exists.
2. Anchor sentences as semantic targets
Each product configures 5–10 anchor sentences: short, natural-language examples of the buying intent it wants to detect. They’re written in the buyer’s voice, not the seller’s. Examples:
- “I need a tool to monitor Reddit for sales leads”
- “Looking for an AI sales agent that can find prospects automatically”
- “My outbound process is too manual and I want to automate lead discovery”
Mantidae embeds each anchor with OpenAI’s text-embedding-3-small and caches the vectors at product-load time. Incoming RSS / Reddit / HN posts are embedded the same way. The cosine similarity to the closest anchor becomes the embedding component of the signal score.
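The closest-anchor lookup described above can be sketched in a few lines. This is a minimal illustration, not Mantidae's actual code: the function names are hypothetical, and the toy 3-dimensional vectors stand in for real text-embedding-3-small output, which would be fetched from the embeddings API and cached at product-load time.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def closest_anchor(post_vector, anchor_vectors):
    """Return (index, similarity) of the anchor most similar to the post."""
    scores = [cosine_similarity(post_vector, anchor) for anchor in anchor_vectors]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]

# Toy vectors (assumption): real embeddings have far more dimensions.
anchors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
post = [0.9, 0.1, 0.0]
index, similarity = closest_anchor(post, anchors)
```

Only the best match is kept, so adding a new anchor can raise a post's embedding component but never lower it.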
Why anchors instead of a single product description? Because real buyers don’t describe your features — they describe their pain. Several short anchors covering different framings of the same intent outperform a single long product blurb on every dataset we’ve tested.
3. Multi-factor geometric mean
Embedding similarity is necessary but not sufficient. A post can match an anchor semantically but still be irrelevant — a stale repost, a comment from someone with no audience, a sarcastic reply. Mantidae combines five components into one weighted geometric mean:
- Embedding match — cosine similarity to the closest anchor
- Keyword density — buy-signal keywords found in the post
- Freshness — exponential decay on age, faster on social platforms
- Engagement — comment / vote count where available
- Author reputation — follower / karma signal where available
We use a geometric mean instead of a weighted sum because it punishes weak components more aggressively — a post with a great embedding match but zero engagement and a stale timestamp shouldn’t score as high as a fresher, more-engaged post. The geometric mean enforces that all signals matter.
The output is a 0–100 content score — the truth signal before any post-processing.
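To make the difference between the two aggregation rules concrete, here is a rough sketch of a weighted geometric mean next to a weighted sum. The weights, the half-life parameter, and the EPSILON floor are all assumptions made for illustration; the article does not publish the real values. Without some floor, a single zero component would force the geometric mean to exactly zero.

```python
import math

# Hypothetical weights; the real ones are not published.
WEIGHTS = {
    "embedding": 0.40,
    "keywords": 0.15,
    "freshness": 0.20,
    "engagement": 0.15,
    "reputation": 0.10,
}
EPSILON = 0.01  # assumed floor so one zero component does not zero the whole score

def freshness(age_hours, half_life_hours):
    """Exponential decay on age; a shorter half-life decays faster."""
    return 0.5 ** (age_hours / half_life_hours)

def content_score(components, weights=WEIGHTS):
    """Weighted geometric mean of components in [0, 1], scaled to 0-100."""
    total = sum(weights.values())
    log_sum = sum(
        w * math.log(max(components[name], EPSILON)) for name, w in weights.items()
    )
    return 100.0 * math.exp(log_sum / total)

def weighted_sum_score(components, weights=WEIGHTS):
    """Weighted arithmetic mean of the same components, for comparison."""
    total = sum(weights.values())
    return 100.0 * sum(w * components[name] for name, w in weights.items()) / total

# Great embedding match, but stale and with no engagement.
stale = {"embedding": 0.9, "keywords": 0.6, "freshness": 0.05,
         "engagement": 0.0, "reputation": 0.5}
# Moderate on every component.
balanced = {name: 0.6 for name in WEIGHTS}
```

Under these assumed weights the stale post scores roughly 51 with the weighted sum but only about 23 with the geometric mean, while the balanced post scores 60 under both.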
4. The competitor-floor heuristic and its honesty cost
If a post mentions one of your configured competitor names, Mantidae enforces a minimum score of 75. Why: someone publicly evaluating your competitor is one of the highest-intent signals there is, and you don’t want a weak embedding match to bury it.
The cost: this rule fires for any mention of the competitor — including the competitor’s own marketers talking about their own product. To stay honest about it, Mantidae preserves the pre-floor content_score alongside the post-floor signal_score. The dashboard surfaces both numbers and visually flags missions where the gap between them is ≥ 30 points.
Crucially, the role assignment for the drafting swarm uses content_score, not signal_score. So a misclassified competitor mention surfaces in the queue (good — operator should still see it) but the draft stays value-first instead of getting upgraded to a closing pitch (good — we don’t want to send hard CTAs to the competitor’s own marketing team).
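A small sketch of how the two scores might be carried side by side. The 75-point floor and the 30-point gap flag come from the text above; the `ScoredPost` type, the function name, and the case-insensitive substring match on competitor names are assumptions for illustration.

```python
from dataclasses import dataclass

COMPETITOR_FLOOR = 75.0  # minimum signal_score on a competitor mention
GAP_FLAG_POINTS = 30.0   # dashboard flags gaps of at least this size

@dataclass
class ScoredPost:
    content_score: float  # pre-floor score; drives role assignment
    signal_score: float   # post-floor score; drives queue placement
    gap_flagged: bool

def apply_competitor_floor(content_score, text, competitors):
    """Raise signal_score to the floor on a competitor mention, keeping content_score."""
    lowered = text.lower()
    # Assumption: a plain substring match stands in for the real mention detector.
    mentioned = any(name.lower() in lowered for name in competitors)
    signal = max(content_score, COMPETITOR_FLOOR) if mentioned else content_score
    return ScoredPost(
        content_score=content_score,
        signal_score=signal,
        gap_flagged=(signal - content_score) >= GAP_FLAG_POINTS,
    )

# "Acme Leads" is a made-up competitor name.
post = apply_competitor_floor(38.0, "We just shipped Acme Leads 2.0!", ["Acme Leads"])
```

Here the post enters the queue at 75, is flagged because the gap is 37 points, and still carries the content_score of 38 that role assignment would read.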
5. Reinforcement-learning feedback loop
Each listening station (an RSS feed, a subreddit, an HN search) carries its own rl_weight (default 1.0, clamped 0.5–2.0). When a post comes in, Mantidae multiplies the raw signal score by the source station’s weight before applying the 50-point threshold. So the system can demote a single noisy feed without penalising the rest of the product’s sources.
The weight updates on every operator decision. Approvals are flat; rejections are reason-aware — the size of the penalty matches how unambiguous the failure mode is.
- Approve a mission → +0.05 (this station produces leads worth pursuing)
- Reject — spam → −0.04 (bot or promoted post — strongest signal that the feed is broken)
- Reject — not_relevant → −0.03 (wrong audience or topic for this product)
- Reject — no_reason / too_vague → −0.02 (default; weak signal)
- Reject — sarcasm / wrong_product → −0.01 (real signal, just not buyable — gentlest)
- Reject — already_customer → 0.00 (positive outcome misclassified as a lead — no penalty)
The approval bonus is larger than any single rejection penalty because false negatives (missing a real prospect) cost more than false positives (one extra mission to skim). Per-station weights make the RL loop behave like a feed quality score that distinguishes which sources are working, rather than a single global multiplier that drags every station for the product down together.
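The update rule above is simple enough to sketch directly. The deltas, clamp range, and 50-point threshold are taken from the text; the function names, the fallback to the no_reason penalty for an unrecognised reason, and the choice of an inclusive threshold are assumptions.

```python
# Per-decision weight deltas, as listed above.
DELTAS = {
    "approve": 0.05,
    "spam": -0.04,
    "not_relevant": -0.03,
    "no_reason": -0.02,
    "too_vague": -0.02,
    "sarcasm": -0.01,
    "wrong_product": -0.01,
    "already_customer": 0.0,
}
MIN_WEIGHT, MAX_WEIGHT = 0.5, 2.0
THRESHOLD = 50.0

def update_weight(weight, decision):
    """Apply one operator decision to a station's rl_weight, then clamp."""
    # Assumption: unknown reasons fall back to the default (no_reason) penalty.
    delta = DELTAS.get(decision, DELTAS["no_reason"])
    return min(MAX_WEIGHT, max(MIN_WEIGHT, weight + delta))

def passes_threshold(raw_signal_score, weight):
    """Weight the raw score by its station, then compare to the threshold."""
    # Assumption: the threshold is inclusive.
    return raw_signal_score * weight >= THRESHOLD

# A station that collects three spam rejections in a row.
weight = 1.0
for _ in range(3):
    weight = update_weight(weight, "spam")
```

After three spam rejections the weight sits near 0.88, so a post with a raw score of 55 from that station lands at about 48.4 and no longer clears the threshold, while the same post from an untouched station still would.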
6. The 9-prompt role-aware swarm
Drafting is handled by a 3-judge swarm: a Skeptic, an Analyst, and an Optimist, each running a different system prompt. They produce three drafts in parallel; we fuse them using the geometric mean of their self-scores and pick the highest-fused candidate.
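The text does not spell out exactly how the self-scores are fused, so the sketch below shows one plausible reading, stated as an assumption: every judge scores every candidate draft on a (0, 1] scale, and a draft's fused score is the geometric mean of those three scores. The function name and the sample scores are hypothetical.

```python
from statistics import geometric_mean

def pick_draft(candidates):
    """candidates maps a draft label to its per-judge scores; return the best label."""
    return max(candidates, key=lambda label: geometric_mean(candidates[label]))

# Hypothetical scores from the Skeptic, Analyst, and Optimist (all must be > 0).
candidates = {
    "draft_a": [0.95, 0.95, 0.30],  # two judges love it, one strongly objects
    "draft_b": [0.70, 0.70, 0.70],  # everyone finds it acceptable
}
winner = pick_draft(candidates)
```

With these numbers draft_a has the higher arithmetic mean (about 0.73 against 0.70) but the lower geometric mean (about 0.65 against 0.70), so draft_b wins: one strong objection outweighs two enthusiastic scores.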
The role-aware part: each lead’s content_score determines a role — closer (> 80), advisor (61–80), or educator (40–60) — and the role swaps in a different system prompt for each of the three judges. That’s a 3 × 3 = 9-prompt matrix. The voice differences:
- Closer — propose a specific concrete next step (demo, trial, link)
- Advisor — consultative, acknowledge the problem, introduce the product as a natural fit
- Educator — answer the question first, pitch only if it fits naturally
Character budgets are passed in too: 280 for Twitter replies, 500 for Reddit DMs, 300 for manual outreach. The swarm targets the budget at draft time, so operators don’t inherit a draft that needs to be cut in half.
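Role selection and the budget check can be sketched as follows. The score bands and character budgets come from the text; the identifiers, the treatment of fractional scores between 60 and 61 as advisor, and returning no role below 40 (a range the text does not cover) are assumptions.

```python
JUDGES = ("skeptic", "analyst", "optimist")
CHAR_BUDGETS = {"twitter_reply": 280, "reddit_dm": 500, "manual_outreach": 300}

def role_for(content_score):
    """Map a content_score to a drafting role using the bands above."""
    if content_score > 80:
        return "closer"
    if content_score > 60:  # assumption: 60 < score < 61 counts as advisor
        return "advisor"
    if content_score >= 40:
        return "educator"
    return None  # assumption: no role is defined below 40

def prompt_keys(content_score):
    """Select the 3 (judge, role) cells of the 3 x 3 prompt matrix for this lead."""
    role = role_for(content_score)
    if role is None:
        return []
    return [(judge, role) for judge in JUDGES]

def within_budget(draft, channel):
    """True if the draft fits the channel's character budget."""
    return len(draft) <= CHAR_BUDGETS[channel]
```

For a lead with a content_score of 72, `prompt_keys` returns the advisor variant for each of the three judges, which is one row of the nine-prompt matrix.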
7. Failure modes we know about
An honest list. Most are mitigated; one (bot accounts and karma farms) is still open.
- Sarcasm — embedding similarity can’t detect inverted intent. Mitigated by a sarcasm-detection unit-test pass on the sidecar before drafting.
- Stale reposts — same URL crossposted to 5 subreddits. Mitigated by the interactions ledger (unique on URL + product).
- Competitor-marketer false positives — discussed in §4. Surfaces in queue but doesn’t produce a hard pitch.
- Bot accounts and karma farms — open. Author-reputation component helps but doesn’t fully solve. Operator review is the backstop.
- Cross-product anchor leakage — a generic anchor like “I need a sales tool” will match too broadly. Best practice: write anchors as specific as the product warrants.
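The stale-repost mitigation in the list above relies on a ledger that is unique on URL + product. One way to get that behaviour is a database uniqueness constraint, sketched here with SQLite. The table name, column names, and helper function are hypothetical, and the real ledger presumably stores more than these two fields.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE interactions (
        url TEXT NOT NULL,
        product_id TEXT NOT NULL,
        UNIQUE (url, product_id)
    )
    """
)

def record_if_new(conn, url, product_id):
    """Insert the (url, product) pair; return False if it was already recorded."""
    cursor = conn.execute(
        "INSERT OR IGNORE INTO interactions (url, product_id) VALUES (?, ?)",
        (url, product_id),
    )
    # rowcount is 1 when the row was inserted and 0 when the insert was ignored.
    return cursor.rowcount == 1

first = record_if_new(conn, "https://example.com/post", "product-a")
repeat = record_if_new(conn, "https://example.com/post", "product-a")
other_product = record_if_new(conn, "https://example.com/post", "product-b")
```

The second sighting of the same URL for the same product is dropped, while the same URL is still recorded once for a different product.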
Frequently asked questions
Why use anchor sentences instead of keyword alerts?
Keyword alerts can’t distinguish a buyer from a competitor: a search for “Reddit leads” fires on both “I need a tool to find Reddit leads” and “5 ways our tool finds Reddit leads.” Anchor sentences are embedded once with text-embedding-3-small, and incoming posts are scored by cosine similarity to the closest anchor — so the system reacts to the intent expressed, not the surface tokens.
Why combine factors with a geometric mean instead of a weighted sum?
A geometric mean punishes weak components more aggressively than a sum. A post with a great embedding match but zero engagement and a stale timestamp should not score as high as a fresher, more-engaged post. The geometric mean enforces that all five components — embedding, keyword density, freshness, engagement, author reputation — actually have to be present.
What does the competitor floor do, and why preserve content_score?
When a post mentions a configured competitor, the signal_score is raised to a minimum of 75 so the lead isn’t buried by a weak embedding match. The pre-floor content_score is preserved alongside it, surfaced in the dashboard, and used (not signal_score) to assign the drafting role — so a misclassified competitor mention still appears in the queue but the draft stays value-first instead of being upgraded to a hard CTA.
How does the reinforcement-learning loop adjust scoring?
Each listening station (RSS feed, subreddit, HN search) carries its own rl_weight (default 1.0, clamped 0.5–2.0) that multiplies the raw signal score before the 50-point threshold is applied. Approvals bump the source station’s weight by +0.05; rejections nudge it down by a reason-specific amount (spam −0.04, not_relevant −0.03, no_reason / too_vague −0.02, sarcasm / wrong_product −0.01, already_customer 0.00). Per-station weighting means a single bad feed is demoted without penalising the rest of the product’s sources, and reason-aware penalties separate “this is bot noise” from “this is real signal, just not for this product.”
What is the 9-prompt swarm and how does role selection work?
Drafting runs three judges in parallel — Skeptic, Analyst, Optimist — each with a role-swapped system prompt determined by the lead’s content_score: closer (> 80), advisor (61–80), or educator (40–60). Three judges times three roles equals nine prompts. The drafts are fused using the geometric mean of their self-scores, and the highest-fused candidate wins. Char budgets (280 Twitter, 500 Reddit DM, 300 manual) are passed at draft time so operators never inherit a draft that needs cutting.
What failure modes does SignalPipe acknowledge?
Five known: sarcasm (embeddings can’t detect inverted intent — mitigated by a sarcasm-detection unit-test pass), stale reposts (mitigated by the interactions ledger, unique on URL+product), competitor-marketer false positives (surface in queue but don’t produce hard pitches), bot accounts and karma farms (open — author-reputation helps but operator review is the backstop), and cross-product anchor leakage (mitigated by writing anchors as specific as the product warrants).