Killian Brief
April 30, 2026 · Nightly Run · 6 Bets Shortlisted
Bets shortlisted: 6
Avg judge score: 68/100
Run cost: $3.64
Bet #1

Hallucination checker for AI legal briefs

judge 67/100 · edge 1.0/10 · ai native

Lisandro — a New York lawyer just got fined $145K because ChatGPT invented six case citations and he filed them. This is happening weekly now: Mata v. Avianca, Park v. Kim, the Michael Cohen filing. Sanctions are escalating from $5K slaps to six-figure career-enders, and every litigator using AI is one bad paste away from the next headline.

The market: ~50K US attorneys actively use AI for drafting. At 2% capture and a $99/mo solo-practitioner ARPU, that's ~$1.2M ARR — modest but real. The wedge is narrow but genuine: Westlaw's KeyCite and Lexis's Shepard's only validate cases that exist in their database. They literally cannot flag a fabricated citation — they just return 'not found,' which attorneys misread as a database miss. Nobody ships a paste-and-check hallucination detector with a free tier today.

Why now: court sanctions crossed the $100K threshold in 2024, and bar associations started issuing formal AI-use guidance in Q4. Fear is finally priced in.

Why us: honestly, weak. I have no legal-tech edge. This is a distribution-and-speed bet, not a moat bet.

The path: 14 days, ~$3K in API/database costs, embedded in 3-5 public defender Slack channels and bar association lists. Kill if <500 citations verified by day 14, <$500 MRR by day 60, or unit cost stays above $0.15 with sub-2% conversion. Reversible, cheap, fast signal.

The real risk is Thomson Reuters shipping this in a sprint. So we either win in 90 days or we sell them the user list. Let's run it.

The detail behind the pitch
Problem
Criminal defense lawyers and litigators are pasting AI-generated legal citations into briefs without verification, leading to $145K+ court fines when citations don't exist.
Proposed solution
A free/paid legal citation verification tool that checks every citation in pasted legal text and flags invalid ones with green/red/yellow indicators.
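Mechanically, the check splits into two steps: extract candidate citations from the pasted text, then resolve each one against a case-law database (e.g. CourtListener) and flag the ones that don't exist. A minimal sketch of that loop, assuming a deliberately simplified "X v. Y, 123 F.3d 456" federal-reporter pattern and a stubbed lookup — the function names and regex here are illustrative, not the product's actual implementation (real Bluebook citation grammar needs a proper parser such as eyecite):

```python
import re

# Simplified pattern covering a few federal reporters only (F., F. Supp., U.S., S. Ct.).
# Statutes, "id.", "supra", and parallel cites are out of scope for this sketch.
CITE_RE = re.compile(
    r"(?P<case>[A-Z][\w.&'-]*(?: [A-Z][\w.&'-]*)* v\. [A-Z][\w.&'-]*(?: [A-Z][\w.&'-]*)*)"
    r", (?P<volume>\d+) (?P<reporter>F\.(?: Supp\.)?(?: ?\dd)?|U\.S\.|S\. Ct\.) (?P<page>\d+)"
)

def extract_citations(text: str) -> list[dict]:
    """Pull candidate citations out of pasted brief text."""
    return [m.groupdict() for m in CITE_RE.finditer(text)]

def verify(citations: list[dict], lookup) -> list[dict]:
    """Tag each citation green/red via a case-law lookup callable.

    `lookup` stands in for the real database/API call; it returns True
    if the cite resolves to an existing case. (A yellow "ambiguous match"
    state would sit between these two in the real UI.)
    """
    return [{**c, "status": "green" if lookup(c) else "red"} for c in citations]
```

Usage: feed the whole pasted brief through `extract_citations`, then batch the results against the database — the red list is exactly what KeyCite and Shepard's never surface.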
Target market
Criminal defense attorneys, public defenders, litigators using AI tools (~50K+ active practitioners in US who use AI for briefs).
First test
Launch freemium version, embed in 3-5 public defender networks/bar association Slack channels, measure weekly active users and citations verified. Offer paid premium (batch processing, API, brief templates) after PMF signal.
Kill criteria
<500 citations verified total by day 14 OR <50 unique weekly active users by day 30 OR <$500 MRR by day 60 OR cost-per-verified-citation exceeds $0.15 with no paid conversion rate above 2% by day 45 → kill
Competitive landscape
Incumbents: Fastcase, Casetext (Thomson Reuters), Westlaw (Thomson Reuters), LexisNexis, Shepard's Citations (Lexis), KeyCite (Westlaw), CoCounsel (Casetext AI), Harvey AI, Lexis+ AI, Winston AI, Paxton AI
Pricing: $20-$650/seat/mo (Westlaw/Lexis enterprise); Casetext ~$100/mo; most citation validators bundled inside full legal research platforms
Saturation: medium
Wedge: A zero-subscription, paste-and-verify UX that specifically detects AI-hallucinated (non-existent) citations — not just deprecated ones — is a gap no incumbent fills today, especially for price-sensitive public defenders and solo practitioners.
User complaints:
- Existing tools require full platform subscriptions — no lightweight paste-and-check workflow
- Westlaw/Lexis KeyCite and Shepard's only validate real citations that already exist in their DB — they don't flag AI-hallucinated (non-existent) case names at all
- No freemium or low-friction entry point for solo/public defenders who can't afford $300-$650/mo Westlaw seats
- AI-native tools like CoCounsel are expensive and still generate hallucinations themselves — no independent verification layer
- Court sanctions ($5K–$145K+) are rising sharply but attorneys lack a quick pre-filing checklist tool
- Public defenders especially lack budget for enterprise legal research tools, creating an equity gap
Notes: The core citation validation layer (KeyCite, Shepard's) is mature but narrowly solves for overruled/deprecated cases, not fabricated ones. No incumbent has launched a standalone hallucination-specific citation checker with a free tier as of early 2025. The wedge is real but narrow: incumbents like Thomson Reuters and LexisNexis could ship this feature fast given their underlying data. A freemium tool must acquire users and convert to paid or a data/API licensing model before getting acqui-hired or outflanked.
The ~50K AI-using attorney TAM is credible but willingness-to-pay varies sharply — public defenders near $0, BigLaw already buying Westlaw/Lexis bundles.
Skeptic + judge rationale
Death modes:
- Public defenders (the easiest-to-reach free users) have $0 willingness to pay, and institutional procurement blocks prevent any upgrade path; BigLaw and paying litigators already have Westlaw/Lexis bundles and won't add a redundant tool — the free tier fills with non-converting public defenders while the paid conversion rate stays <1%, MRR never exceeds $500, and the business dies at month 3 when the founder burns through savings subsidizing free API calls to court databases
- Thomson Reuters or LexisNexis ships a 'hallucination detection' toggle inside CoCounsel or KeyCite within 60 days of this product gaining any press visibility — incumbents have the citation database, the existing attorney relationships, and the engineering capacity to copy the core feature in a single sprint, instantly neutralizing the wedge and making the standalone tool redundant before it reaches 1,000 users
- The core technical promise fails at scale: verifying whether a case citation is fabricated (vs. merely overruled) requires access to comprehensive case law databases (CourtListener, Westlaw, PACER), and API costs + rate limits mean each free verification costs $0.08–0.40 in data/LLM costs — at 500 citations/day the unit economics are underwater, forcing either a hard paywall that kills the freemium wedge or continued cash burn that exhausts runway before PMF is demonstrated
Judge rationale (score=67.0): Wins on low human intervention (pure software, paste-and-check UX), a reasonable market of 50K AI-using attorneys, and a real pain point with rising court sanctions. Loses heavily on defensibility — Thomson Reuters/LexisNexis can ship hallucination detection in a sprint with their existing case databases. Capital and unit economics are concerning: API/database costs of $0.08-0.40 per verification leave the freemium model underwater, and the core free-user segment (public defenders) has near-zero willingness to pay.
ARPU is capped because BigLaw already has Westlaw/Lexis bundles, leaving a narrow middle of solo practitioners as the real paying market.
Source: hn:show_hn
Reply "approve #1" on Telegram to ship this bet.
Bet #2

CSV-in, jurisdiction-out for license renewals

judge 70/100 · edge 1.5/10 · b2b saas

Every spring, finance teams at multi-location businesses spend 40-80 hours hand-mapping customer addresses to city and town boundaries for business license renewals. Zip codes lie, Google Maps returns coordinates not jurisdictions, and Avalara wants $25k/year to bundle this into a tax suite they don't need. So they stitch together Census TIGER files in Excel and pray.

There are roughly 10k US mid-market companies with this pain, willing to pay $500-2k/year. At 2% capture and $1k ARPU that's $200k ARR — modest. Honest read: this is a small market, and the skeptic's point is real — it's a seasonal purchase, not true SaaS. Realized revenue is closer to $80-200/customer/year unless we expand into ongoing address hygiene and nexus monitoring.

The wedge is narrow but genuine: nobody sells CSV-in/jurisdiction-out priced per batch. Avalara is overkill, SmartyStreets returns the wrong layer, and Census data needs a GIS engineer to operationalize. We win on workflow fit and price, not technology.

Why us: honestly, weak. This isn't manufacturing or aviation — no unfair edge here. That matters.

The path: 14 days, ~$2k for Mapbox credits and a Vercel deploy. Three paid pilots at $200/mo by day 60. If we can't convert 3 of 9 test users to paid, we kill it — no sunk-cost spiral.

I'd run the test, but only because it's cheap. If pilots convert, we expand scope. If not, we walk in 60 days with the lesson and the runway intact.

The detail behind the pitch
Problem
Finance/accounting staff manually map customer addresses to city/town boundaries for business license renewals, spending hours cross-referencing zip codes with wonky boundary lines.
Proposed solution
Automated geolocation service that classifies customer addresses into correct city/town jurisdictions using boundary APIs and returns bulk results.
Target market
Mid-market businesses with multi-location customers renewing licenses annually; 10k+ companies in US with this pain; each paying $500-2k/year.
First test
Build simple web form + Mapbox/Census Bureau boundary API integration. Have 3 SMBs upload their customer lists, verify jurisdiction classifications match manual effort, measure time saved.
Kill criteria
<3 paying customers (not free pilots) at ≥$200/mo by day 60, AND <$1,200 total MRR by day 90, AND average pilot-to-paid conversion rate <33% across first 9 test users → kill or hard pivot
Competitive landscape
Incumbents: Avalara, TaxJar, Vertex Inc., SmartyStreets (USPS/jurisdiction lookup), Google Maps Platform (geocoding API), HERE Maps, Regrid (parcel boundary data), Precisely (formerly Pitney Bowes) MapInfo
Pricing: $200-$1,500/mo for API tiers; Avalara enterprise deals $5k-$50k/yr; raw geocoding APIs $0.005/lookup
Saturation: low
Wedge: Purpose-built bulk address-to-municipal-jurisdiction classifier targeting the specific business license renewal workflow — priced per-batch, not bundled into a $50k tax suite.
User complaints:
- Avalara and Vertex are overkill for address-to-jurisdiction lookup alone — bundled into full tax compliance suites at enterprise price points
- Generic geocoding APIs (Google, HERE) return lat/lng but do not resolve municipal vs. county vs. special district boundaries reliably
- Zip codes frequently cross city/town boundaries, causing misclassification that neither USPS nor Google Maps corrects
- No off-the-shelf tool purpose-built for business license jurisdiction mapping — teams stitch together Census TIGER files + manual QA
- TIGER/PLACE boundary data from Census is free but requires GIS expertise to operationalize and has update lag
- Bulk batch processing of 10k+ addresses against municipal boundaries not offered cleanly by existing geocoders
Notes: The incumbents split into two camps, neither of which nails this use case: (1) full tax compliance platforms (Avalara, Vertex) that solve jurisdiction for tax remittance but are overpriced and over-scoped for license-only workflows; (2) raw geocoding APIs that return coordinates but leave the boundary-polygon lookup and municipal classification to the buyer. The Census TIGER/PLACE + point-in-polygon approach is the technical solution but requires GIS engineering overhead no mid-market finance team has. A lightweight, workflow-aware SaaS wrapping authoritative boundary data (Census, state GIS portals) with a clean CSV-in/jurisdiction-out interface has a genuine gap to fill.
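The point-in-polygon step those notes describe is mechanically simple once boundaries are loaded. A stdlib-only sketch under stated assumptions: polygons have already been extracted from TIGER/PLACE shapefiles into (lon, lat) vertex lists, and every row is already geocoded — function names here are illustrative, and a real deployment would use shapely/GeoPandas and handle multipolygons, holes, and boundary edge cases:

```python
def point_in_polygon(lon: float, lat: float, polygon: list[tuple[float, float]]) -> bool:
    """Ray-casting test: count edge crossings of a ray cast from the point toward +x."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does edge (x1,y1)-(x2,y2) straddle the horizontal line at `lat`?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > lon:  # crossing lies to the right of the point
                inside = not inside
    return inside

def classify(rows: list[dict], jurisdictions: dict) -> list[dict]:
    """CSV-in, jurisdiction-out: tag each geocoded row with its city/town.

    `rows` carry 'lon'/'lat'; `jurisdictions` maps place name -> boundary
    polygon. Unmatched rows get 'UNMATCHED' for manual QA.
    """
    out = []
    for row in rows:
        name = next(
            (j for j, poly in jurisdictions.items()
             if point_in_polygon(row["lon"], row["lat"], poly)),
            "UNMATCHED",
        )
        out.append({**row, "jurisdiction": name})
    return out
```

The design choice worth noting: a naive linear scan over all polygons is fine at 10k-address batch scale; only a true API product would need an R-tree spatial index.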
Key risk is ceiling: the TAM at $500-2k/yr per customer is real but modest, and Avalara could bundle this trivially if the segment grows.
Skeptic + judge rationale
Death modes:
- Annual renewal cycle kills MRR: 90% of target customers only need jurisdiction classification once per year (license renewal season), meaning they pay for 1-2 months and churn — actual realized revenue is $40-160/customer/year, not the assumed $500-2k/year, destroying unit economics before month 3
- SmartyStreets or Google Maps Platform ships a 'jurisdiction layer' add-on at $0.001/lookup within 60 days of any traction signal, undercutting the entire wedge — the technical moat is a thin API wrapper over public Census data that any incumbent can replicate in a sprint
- The 3 SMB test users validate accuracy but have no budget authority: the actual buyer is a finance director or compliance manager who routes any new SaaS through IT/procurement, extending the sales cycle to 4-6 months, while the founder exhausts runway at month 3 having closed $0 in paid contracts despite positive pilot feedback
Judge rationale (score=70.0): Strong on capital efficiency and operational simplicity — pure software wrapping public Census data on Vercel-class infra, low intervention once shipped. Loses heavily on recurring revenue (the skeptic's annual-renewal-cycle critique is devastating: this is a seasonal one-shot purchase masquerading as SaaS, killing the MRR math) and defensibility (a thin wrapper over public data that SmartyStreets or Google could replicate in a sprint). ARPU is decent if you can land the right tier, but the buyer-vs-user gap likely stretches sales cycles past Killian's runway. A real gap exists, but the wedge is structurally fragile — better as a feature than a company.
Reply "approve #2" on Telegram to ship this bet.
Bet #3

Cloud-synced chat UI for local Ollama users

judge 68/100 · edge 1.5/10 · consumer app

Ollama has 100k+ weekly active users running local models because they love the privacy and control — but the chat UI situation is a mess. Open WebUI feels like Jira, Chatbot UI got abandoned, LM Studio is glued to model management, and nobody has clean cloud-synced history with real search and markdown export. Power users are duct-taping Docker stacks to do something that should take 30 seconds.

The market is small but real: ~30k Ollama users who'd pay, maybe 3% capture at $10/mo ARPU = ~$108k ARR. Not a unicorn — a tight cash-flowing wedge. The gap is 'pay a little, get polish' — every incumbent is either free-and-bloated or self-host-only.

Here's where I have to be straight with you: this bet has two real knives at its throat. First, the CORS/localhost problem — users' Ollama runs on 127.0.0.1, our cloud frontend can't reach it without a tunnel agent. We have to solve that in onboarding or we're dead at hello. Second, the privacy paradox — the exact people running local models distrust cloud storage. We'd need an opt-in encrypted-sync model, not default cloud.

The test: 14 days, ~$400 (Vercel + a domain + Reddit/HN posts). Ship a minimal UI plus a 50-line local agent that handles the bridge. Kill if <20 paid by d30 or <10% DAU/signup at d14. I have zero operator edge here — it's pure execution speed against Open WebUI's next release.
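That "50-line local agent" is essentially a tiny reverse proxy: it listens on a local port, adds the CORS headers the cloud frontend needs, and forwards everything to Ollama's default endpoint on 127.0.0.1:11434. A stdlib-only sketch under those assumptions — the frontend origin is hypothetical, and a real agent would also need streamed responses and an auth token:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

OLLAMA = "http://127.0.0.1:11434"  # Ollama's default local endpoint
CORS = {
    # Hypothetical cloud-frontend origin; the real app would pin its own.
    "Access-Control-Allow-Origin": "https://app.example-chat.dev",
    "Access-Control-Allow-Methods": "GET, POST, OPTIONS",
    "Access-Control-Allow-Headers": "Content-Type",
}

def upstream_url(path: str) -> str:
    """Map an incoming request path onto the local Ollama endpoint."""
    return OLLAMA + path

class BridgeHandler(BaseHTTPRequestHandler):
    def do_OPTIONS(self):
        # Answer the browser's CORS preflight so the cloud UI may call us.
        self.send_response(204)
        for k, v in CORS.items():
            self.send_header(k, v)
        self.end_headers()

    def do_POST(self):
        # Forward e.g. /api/chat bodies to the local Ollama and relay the reply.
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        req = Request(upstream_url(self.path), data=body,
                      headers={"Content-Type": "application/json"})
        with urlopen(req) as resp:
            payload = resp.read()
        self.send_response(200)
        for k, v in CORS.items():
            self.send_header(k, v)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

# To run the agent locally (blocks until killed):
#   HTTPServer(("127.0.0.1", 8765), BridgeHandler).serve_forever()
```

The onboarding promise then becomes "run one binary, paste one URL" instead of asking users to configure ngrok or Docker themselves.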

Small, reversible, honest. Want to take the swing?

The detail behind the pitch
Problem
Users running local AI models via Ollama love the capability but lack a polished, persistent chat UI — they've tried the FOSS options and none satisfy them; they are willing to pay.
Proposed solution
Open-source or lightweight SaaS chat frontend (with storage/hosting) optimized for local + open models, emphasizing simplicity, search history, and export.
Target market
AI enthusiasts and developers using Ollama/local models (est. 10k–50k active); willingness to pay $5–20/month for hosted storage + UX.
First test
Build minimal chat UI (React/Vue) that connects to any local Ollama instance; host a free tier + $9.99/month paid tier with 100GB storage; measure: signups, active users, paid conversions in 14d.
Kill criteria
<5 users successfully complete end-to-end Ollama connection (send at least 1 message) within 7d of launch AND <$200 MRR (fewer than 20 paid conversions at $9.99) by d30 AND DAU/signup ratio <10% at d14 → kill or full pivot
Competitive landscape
Incumbents: Open WebUI (formerly Ollama WebUI), Chatbot UI (McKay Wrigley), AnythingLLM, LM Studio (built-in chat), Jan.ai, Msty, Lobe Chat
Pricing: Mostly free/open-source self-hosted; AnythingLLM Cloud ~$17/mo; Msty free desktop; Open WebUI self-host free or ~$15/mo managed; no dominant paid SaaS winner
Saturation: medium
Wedge: A dead-simple, hosted-storage chat frontend that works with any local Ollama endpoint out of the box — no Docker, persistent cross-device history, and first-class search/export — at a price point incumbents ignore ($5–10/mo).
User complaints:
- Open WebUI is feature-bloated and heavy — 'feels like enterprise software, not a chat tool'
- No persistent cloud sync — chat history lost on reinstall or across devices
- Self-hosting is a barrier for non-technical enthusiasts (Docker complexity)
- Chatbot UI v2 was abandoned/pivoted to a paid hosted product, leaving the community stranded
- Export options are minimal or buried — no clean markdown/PDF export
- Search within conversation history is weak or absent across most tools
- LM Studio chat is tightly coupled to model management, poor as a standalone UI
- Jan.ai is unstable on some hardware configs, with frequent crashes reported
Notes: The space has many FOSS entrants but no clear SaaS winner for the 'pay a little, get polish' segment. Open WebUI dominates mindshare but is increasingly complex and self-host-only. The real gap is cloud-synced history + search for users who run Ollama locally but want their chat data accessible anywhere — a thin backend-as-a-service model. Willingness to pay is real but modest; conversion rates will hinge on nailing onboarding (connect your Ollama URL in 30 seconds) and avoiding the feature creep that made incumbents unappealing.
Skeptic + judge rationale
Death modes:
- The CORS/network bridging problem kills onboarding: users' local Ollama instances run on localhost:11434 and are unreachable from a cloud-hosted frontend without a local proxy/tunnel (e.g., ngrok or a local agent), so the '30-second connect your Ollama URL' promise fails for 80%+ of users on day one — churn before a single message is sent, no paid conversions materialize
- The target persona (Ollama power users) self-selects against paying for hosted storage because they already distrust cloud sync of their private local-model conversations — the privacy paradox kills the core value prop: the exact users who run local models to avoid data leaving their machine won't pay $9.99/mo to store chat history on someone else's server, collapsing the conversion funnel to <1%
- Open WebUI ships a simplified 'Lite' mode or one-click cloud sync feature within 60 days (it's open-source, community PRs move fast), instantly neutralizing the wedge — since the target users are already installed on Open WebUI and the switching cost to stay is zero, the addressable 'dissatisfied but reachable' segment evaporates before paid MRR reaches $500
Judge rationale (score=68.0): Wins on low capital (pure software, Vercel-shippable), recurring SaaS model, and fast time-to-launch. Loses heavily on ARPU ($120/yr ceiling), small market (10-50k buyers, half of whom won't pay on principle due to the privacy paradox), and weak defensibility against Open WebUI, which can copy the wedge in a weekend. Human intervention is moderate — Lisandro will get sucked into Discord support for CORS/tunneling issues, since the localhost-to-cloud bridging problem is the #1 onboarding killer. A crowded FOSS field with no clear paid winner is a yellow flag, not green: it usually means the segment doesn't actually convert.
Source: hn:ask_hn
Reply "approve #3" on Telegram to ship this bet.
Bet #4

Hosted Fossil for tiny dev teams

judge 68/100 · edge 1.5/10 · infra tooling

Tiny engineering teams—5 to 20 people building internal tools or side projects—are paying GitHub for a Boeing 747 when they need a Cessna. They want a repo, a wiki, a ticket tracker, and a forum in one place, not seventeen tabs and a Dependabot survey. Fossil already does all of that in a single binary; nobody has built the hosted version that doesn't require a sysadmin.

The market is real but small: realistically 10–30k reachable teams globally, not 200k. At 1% capture and $25 ARPU, that's $30–90k ARR—a niche, not a rocket. The wedge is genuine: Chisel.sh is dead, no funded player targets this, and the 'repo as mini-intranet' story has no Git-native equivalent.

Honest risks: Fossil's mindshare is microscopic, and the same single-binary simplicity that's the wedge means anyone technical enough to want it can self-host on a $5 VPS in 20 minutes. That's the bet's throat.

Why now: nothing changed. This is a demand-validation sprint, not a timing play. Why us: no special edge here—this is a cheap, fast probe, not a conviction bet.

The path: 14 days, ~$200 in VPS and ads, post in indie-dev and Fossil communities, onboard 10–15 teams. Kill if <8 signups, if clone-out rate >60% by day 10, or if <2 teams say they'd pay $10+/mo. We'll know by day 14 whether there's a pulse.

Small check, fast answer, clean kill. Let's run it.

The detail behind the pitch
Problem
Hosted Fossil SCM founder is uncertain whether small teams (1–50 people) actually want a federated, self-contained repo alternative to Git+GitHub—lacks customer signal.
Proposed solution
Run a 14-day private beta with 10–15 small engineering teams; measure: signup rate, repo creation rate, whether they clone/export after 7 days, willingness to pay $10–50/month.
Target market
Small teams (5–50 people) building internal tools, open-source side projects, or migrating from GitHub ($50–500/month willingness); est. 50k–200k teams globally.
First test
Post on targeted communities (indie dev forums, Fossil users, small-team Slack groups); offer free beta access for 2 weeks; collect: sign-ups, repos created, days-to-first-clone, exit-survey intent-to-pay.
Kill criteria
<8 signups by day 14 OR <3 repos created by day 7 OR 0 non-Fossil-prior-user signups (all signups already knew Fossil) OR clone/export rate >60% of repos by day 10 (teams bailing) OR <2 teams express WTP ≥$10/mo in exit survey AND 0 unprompted teammate referrals by day 14 → kill
Competitive landscape
Incumbents: GitHub, GitLab, Bitbucket, Gitea (self-hosted), Sourcehut, Codeberg, Chisel (hosted Fossil, defunct/niche)
Pricing: $4–$29/seat/mo (GitHub Teams ~$4, GitLab Premium ~$29); Sourcehut $20–$100/yr flat; Gitea Cloud ~$3/user/mo
Saturation: low
Wedge: Fossil's uniquely self-contained architecture (single binary, built-in wiki+tickets+forum, append-only history) offers a simpler operational story than any Git host for teams that treat the repo as a mini-intranet, not just a code store.
User complaints:
- GitHub is overkill and survey-heavy for tiny internal-tool teams who just want a simple repo + wiki + ticket tracker in one place
- GitLab self-hosted is resource-hungry and operationally burdensome for sub-10-person teams
- Gitea/Forgejo require self-hosting expertise most small teams lack
- Sourcehut's email-based workflow has a steep learning curve and poor onboarding for developers used to PR-based flows
- No mainstream hosted option offers Fossil's built-in wiki, ticketing, forum, and chat under one binary with zero external dependencies
- Git's history rewriting and detached-HEAD confusion frustrate non-Git-native developers on small side-project teams
Notes: The hosted-Fossil niche is nearly unoccupied: no well-funded, actively marketed SaaS product targets it. Chisel.sh was an attempt but has minimal traction and appears unmaintained. The risk is not competition but demand — Fossil's mindshare is tiny and teams must be persuaded to leave Git workflows, making the 14-day beta's clone/export metric especially important as a retention signal. Pricing at $10–50/month is credible and sits below GitHub Teams for comparable seat counts, but the founder's core uncertainty (do small teams actually want this?) is the right question; saturation is low precisely because market size may also be low, so the beta should instrument 'referred a teammate' as an additional leading indicator of organic pull.
Skeptic + judge rationale
Death modes:
- Fossil's 0.1% mindshare means the 10–15 beta teams are all existing Fossil enthusiasts/hobbyists, not representative small engineering teams — signups hit 8 but zero are Git-migrating teams, making WTP data meaningless and demand validation invalid by day 14
- The clone/export metric fires immediately: teams sign up, poke around for 2–3 days, then clone their repos out and return to GitHub because their CI/CD pipelines (GitHub Actions, Dependabot, npm/PyPI integrations) have zero Fossil support — ecosystem lock-in kills retention before day 7, leaving 0 of 10 teams still active at day 14
- The $10–50/month price point collapses under self-hosting logic: the exact persona who wants Fossil's simplicity (indie dev, internal-tools team) discovers they can run the single Fossil binary on a $5 Hetzner VPS in 20 minutes, so every interested prospect self-hosts instead of paying — fewer than 2 of 15 beta teams express any WTP when the free alternative is trivially easy
Judge rationale (score=68.0): Wins on low capital (the beta is cheap to run on a VPS), recurring SaaS model, and fast time-to-signal (14 days). Loses heavily on ARPU ($10-50/mo is sub-$1k/yr per team), market size (Fossil mindshare is microscopic; the real buyer pool is likely <10k, not 200k), and defensibility (the same single-binary simplicity that's the wedge means any prospect can self-host on a $5 VPS, gutting WTP). Human intervention is moderate — the beta requires hand-holding 10-15 teams, onboarding, exit surveys, and likely Lisandro on community posts and support DMs during the 2 weeks. The skeptic's three death modes are all plausible and the kill criteria are well-instrumented, but the bet's ceiling is capped: even if it works, this is a $50k-300k ARR niche tool, not a zero-human compounding asset.
Source: hn:ask_hn
Reply "approve #4" on Telegram to ship this bet.
Bet #5

HIPAA-grade WeTransfer for small law firms

judge 69/100 · edge 1.5/10 · services

WeTransfer just moved password protection and encryption behind a $12/mo paywall and cranked the ads on free users. That's pissing off exactly the people who can't legally use it anyway: small law firms, solo accountants, and 5-doctor clinics who need to send a sealed deposition or a patient file without making the recipient create an account. They're Googling alternatives right now, and Tresorit and Kiteworks quote them $150/seat with a 6-week procurement dance.

There are roughly 450k US law firms under 20 attorneys plus ~200k small healthcare practices. At 0.3% capture and $40/seat/mo across 3 seats average, that's ~$2.8M ARR — real, not a TAM fantasy. The wedge is narrow on purpose: dead-simple zero-knowledge transfer with a BAA and an audit log, priced at $30-50/seat instead of $150.
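"Zero-knowledge" here means the server only ever stores ciphertext: encryption happens client-side and the key travels in the share link's URL fragment, which browsers never send to the server. A toy stdlib sketch of that flow — the SHA-256 counter keystream below is for illustration only and is NOT production crypto; a real product would use a vetted AEAD (AES-GCM, XChaCha20-Poly1305) from a reviewed library:

```python
import hashlib
import secrets

def keystream(key: bytes, n: int) -> bytes:
    """Toy counter-mode keystream from SHA-256 (illustrative, NOT production crypto)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def encrypt(plaintext: bytes, key: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(plaintext, keystream(key, len(plaintext))))

decrypt = encrypt  # XOR stream ciphers are their own inverse

# Client-side flow: the server stores only `blob`; `key` lives in the
# share link's URL fragment (a hypothetical https://files.example/f/abc#<key>)
# and is never transmitted, so the host cannot read the file — the property
# that makes a BAA plus audit log credible.
key = secrets.token_bytes(32)
blob = encrypt(b"sealed deposition contents", key)
```

The recipient's browser reads the key from the fragment and decrypts locally — no account, no signup, which is exactly the no-friction UX the incumbents break.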

Honest read: I don't yet know if the compliance angle converts or if the privacy angle does. That's the bet. Two landing pages, identical $500 spend, 14 days, measuring credit-card-intent — not email vanity. Kill the compliance side if CPL > $80 and <2 checkout intents. Kill consumer if signups are 3x but nobody will pay >$10.

Caveat I want to flag: $500/side is thin for B2B compliance CPCs. If LinkedIn eats the budget in noise, I'd rather extend to $1500/side than read a false negative and kill the higher-ARPU wedge. No special edge here for you — this is a clean validation sprint, not a moat play.

Give me $1.5k and 14 days to find out which door has paying customers behind it.

The detail behind the pitch
Problem
P2P encrypted file transfer founder can't differentiate from WeTransfer; unsure whether to rebrand toward compliance/B2B or privacy/consumer, and lacks data on which direction yields paying customers.
Proposed solution
Run two parallel 14-day landing page campaigns (compliance angle vs privacy angle) with identical traffic spend; measure conversion rates, email signups, and willingness to pay.
Target market
B2B compliance-focused: law firms, accountants, healthcare (high-margin, smaller TAM); Privacy-focused: tech-savvy consumers + journalists (large TAM, low willingness-to-pay).
First test
Build two distinct landing pages; spend $500 on ads to each audience; measure: signup rate, email confirmation rate, survey responses on pricing willingness in each group.
Kill criteria
<2 credit card intent signals (pricing page click-through OR mock checkout initiation) on the compliance variant by day 14 AND compliance-side CPL > $80 with <3% landing-page-to-email conversion → kill compliance wedge; OR consumer variant generates >3x signups but 0/10 survey respondents indicate WTP above $10/mo → kill consumer pivot; overall kill: zero paying or credit-card-captured leads from either variant by day 30 → kill experiment and reassess product differentiation before any further spend
Competitive landscape
Incumbents: Tresorit, FTAPI, MASV, Proton Drive, SFTP To Go, WeTransfer Pro, Kiteworks, Box
Pricing: $10-$25/seat/mo (consumer/prosumer); $50-$150+/seat/mo (B2B compliance/enterprise)
Saturation: medium
Wedge: The B2B compliance angle (law firms, healthcare, accountants) is the higher-conviction wedge — WeTransfer's 2024 ad-push and feature paywalling is actively pushing regulated-industry users to search for alternatives, and no incumbent cleanly owns the 'dead-simple HIPAA/GDPR-compliant P2P transfer with zero-knowledge encryption' positioning at SMB pricing.
User complaints:
- WeTransfer moved password protection and encryption behind a paywall ($12/mo), angering free users
- WeTransfer has shown more ads to free users since 2024, driving churn
- Compliance features (HIPAA, FedRAMP, audit trails) are absent from consumer-grade tools, but SMBs don't want enterprise pricing
- Most encrypted alternatives still require recipient sign-up, killing the frictionless UX
- No single tool cleanly covers both zero-knowledge E2E encryption AND compliance audit trails at an SMB price point
- Tresorit and Kiteworks are perceived as expensive and over-engineered for smaller law firms or accountants
Notes: The consumer privacy space (Proton Drive, Signal-style transfers) is crowded and WTP is low — journalists and tech consumers want free or near-free. The B2B compliance gap is real: enterprise tools (Kiteworks, Tresorit Enterprise) are overbuilt and expensive; WeTransfer Pro is underbuilt for regulated industries; FTAPI is EU-focused. The sweet spot is U.S./EU SMB regulated verticals at $20-$50/seat/mo with no-signup recipient UX. The 14-day A/B landing page experiment is well-designed to validate this — watch for willingness-to-pay signals (credit card entry, not just email), not just signup rates.
Skeptic + judge rationale
Death modes:
- The $500/side ad budget is insufficient to reach B2B compliance buyers (law firms, healthcare admins) who require 5-12 touchpoints and $50-200 CPCs on LinkedIn/Google compliance keywords — the experiment generates only 5-10 clicks per day to each page, producing statistically meaningless conversion data by day 14 and a false negative that kills the higher-conviction B2B wedge prematurely
- Landing page visitors in the compliance segment complete email signups (a vanity metric) but zero enter credit card details when shown a $99/seat pricing survey, because the actual decision-maker is a firm's IT admin or managing partner who won't commit spend without a BAA (Business Associate Agreement), SOC 2 documentation, or a procurement review — the founder misreads a weak WTP signal as market rejection rather than a sales-process mismatch
- The consumer privacy variant outperforms on raw signup rate (cheaper traffic, higher emotional resonance) and the founder pivots in that direction — burning the next 60 days building consumer features for an audience with <$15/mo WTP and 40%+ annual churn, while the B2B compliance window created by WeTransfer's 2024 paywalling closes as Tresorit and Proton for Business ramp SMB marketing spend
Judge rationale (score=69.0): Wins on ARPU ($600-1800/seat/yr B2B compliance), SaaS recurring revenue, and clean software ops. Loses on capital — $1k ad spend is on the light side, and the skeptic is right that B2B compliance CPCs eat the budget fast, producing noisy data. Days-to-paid is realistic at 30-60d given that B2B compliance sales cycles need BAA/SOC 2 conversations, not pure self-serve. Defensibility is weak — 8 incumbents listed, no moat beyond positioning/UX, and the founder hasn't yet committed to the wedge, so this scores as a validation experiment, not a shipped bet.
Reply "approve #5" on Telegram to ship this bet.

★ Killian's Wildcard

Off-Brief, Off-Hand

Tonight's instinct bet — synthesized from training, not pulled from sources. Same calibration, different lane.
The Wildcard

One-click server-side tracking for Shopify

judge 65/100 · edge 2.0/10

Shopify merchants spending $5k-$50k/mo on Meta and Google ads are flying blind: post-iOS14, they're losing 20-40% of conversion signal, and their CAC is bleeding because of it. The fix — server-side GTM with CAPI — exists, but Elevar and Stape both demand GTM expertise or an agency retainer. Most sub-$5M GMV merchants just don't do it. They eat the loss.

BuiltWith shows ~150k Shopify stores in the $2k-$50k/mo ad spend band. At $79/mo flat and 1% capture, that's $1.4M ARR. Elevar charges $50-500/mo on order volume (punishes growth); Stape gets one-star reviews for support and 5-8s GTM load delays. Our wedge: a Shopify app that deploys a hosted sGTM container with pre-wired CAPI/Google/TikTok tags in one click — no GCP, no dev, flat price.
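The napkin math above checks out in a few lines (a sketch; the ~150k store count and 1% capture rate are the brief's own assumptions, not verified figures):

```python
# Back-of-envelope ARR check for the wildcard's assumptions:
# ~150k Shopify stores in the $2k-$50k/mo ad-spend band, 1% capture, $79/mo flat.
stores_in_band = 150_000    # BuiltWith / Store Leads estimate (assumption from the brief)
capture_rate = 0.01         # 1% capture
price_per_month = 79        # flat pricing, no order-volume tiers

paying_stores = int(stores_in_band * capture_rate)  # 1,500 stores
arr = paying_stores * price_per_month * 12          # annual recurring revenue

print(f"{paying_stores} stores -> ${arr:,} ARR")    # 1500 stores -> $1,422,000 ARR
```

At $1,422,000 the "$1.4M ARR" figure is honest rounding; the sensitive input is the 1% capture assumption, not the price point.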

Why now: Consent Mode v2 just made the pain acute, and DTC operator Slacks are openly complaining about Stape and post-Elevar-acquisition confusion.

Why us: honestly, weak. This isn't in your manufacturing or LatAm wheelhouse. The edge here is speed and a tight wedge, not unfair access.

The test is small: $300 in Meta ads + 3 subreddits + 2 DTC Slacks, landing page and Loom demo, 14-day pilot with manual backend setup. Kill if <5 install requests or <3 live stores in 14 days, or >50% churn at 60 days. Total exposure: ~$1.5k and two weekends.

Let's spend $1.5k to find out if 150k merchants actually want this.

The detail behind the pitch
Problem
Shopify merchants running paid ads lose 20-40% of conversion tracking post-iOS14 / GA4 migration, and current server-side tagging tools (Stape, Elevar) require dev work most sub-$5M GMV merchants won't do.
Proposed solution
One-click Shopify app that deploys a server-side GTM container with pre-built CAPI/Google Ads/TikTok event mapping, billed flat $79/mo — no dev work, no GCP account required (we host).
Target market
Shopify merchants spending $2k-$50k/mo on Meta/Google ads; ~150k Shopify stores in this band per BuiltWith / Store Leads; current tools (Elevar $50-500/mo, Stape $20-200/mo) prove willingness-to-pay.
First test
Spin up landing page + Loom demo using a test store. Run $300 of Meta ads to 'Shopify owner' interest + post in 3 Shopify subreddits and 2 DTC Slack groups (DTC Newsletter, 2pm.co). Offer 14-day free pilot with manual setup on backend. Target: 10 install requests, 3 stores live.
Kill criteria
- <5 install requests AND <3 stores live after the 14-day ad + community push → kill immediately
- ≥10 installs but <25% convert to paid by day 30 of the free trial → kill paid acquisition, pivot to manual sales only
- First-cohort 60-day churn rate >50% (>3 of first 6 paying stores cancel) → kill; core product is broken
- Shopify app review requests a material architecture change to the proxy/pixel approach before the first 10 installs → kill and rebuild outside the App Store channel
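The kill rules above are mechanical enough to express as a check. A minimal sketch, with hypothetical field names standing in for the day-14/30/60 checkpoints (not product code):

```python
from dataclasses import dataclass

@dataclass
class PilotMetrics:
    # Hypothetical fields mirroring the brief's checkpoints.
    install_requests: int      # day 14
    stores_live: int           # day 14
    paid_conversion: float     # fraction of installs converted to paid by day 30
    churn_60d: float           # first-cohort 60-day churn rate
    app_review_blocked: bool   # Shopify review demands an architecture change

def evaluate(m: PilotMetrics) -> str:
    if m.install_requests < 5 and m.stores_live < 3:
        return "kill"                    # day 14: no demand signal
    if m.app_review_blocked:
        return "kill"                    # distribution channel closed; rebuild outside App Store
    if m.churn_60d > 0.50:
        return "kill"                    # core product is broken
    if m.install_requests >= 10 and m.paid_conversion < 0.25:
        return "kill paid acquisition"   # pivot to manual sales only
    return "continue"

print(evaluate(PilotMetrics(12, 5, 0.30, 0.20, False)))  # continue
```

Encoding the criteria this way makes the decision pre-committed and auditable after the fact, which is the point of kill criteria in the first place.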
Competitive landscape
Incumbents: Elevar, Stape, Littledata, Trackbee, SignalBridge, Tracklution, Analyzify, TripleWhale, Hyros
Pricing: Elevar $50-500/mo (order-volume-based); Stape $25-200/mo; SignalBridge $29/mo; ServerTrack.io $10/mo; Littledata ~$59-299/mo
Saturation: medium
Wedge: The only Shopify app that deploys a fully hosted server-side GTM container with pre-wired CAPI/Google Ads/TikTok tags in one click — zero GTM knowledge, zero GCP setup, zero agency needed — at a flat rate undercutting Elevar's variable pricing.
User complaints:
- Stape's Custom Loader injects GTM 5-8 seconds after page load, causing missed page_view/session_start events and lost gclid/utm parameters
- Stape support immediately rejects requests and is described as worst-in-class by multiple reviewers
- Elevar setup and maintenance overhead deters non-technical merchants
- Elevar was acquired and merged into a broader analytics tool, causing product confusion post-acquisition
- Both Elevar and Stape require GTM expertise or agency/dev involvement to configure correctly
- Consent Mode v2 signals arrive too late with Stape's Custom Pixel approach, breaking Google ad personalization attribution
- Stape causes tracking instability and data loss, reducing ROAS for some merchants
Notes: The space is crowded at the top (Elevar, Stape, Littledata), but none have fully cracked true no-code, one-click sGTM deployment for sub-$5M GMV merchants — most still require GTM configuration or developer time. Newer entrants like SignalBridge ($29/mo), Tracklution, and Trackbee are attacking the no-code angle but lack the GTM-native framing that agencies and slightly technical merchants trust. The $79/mo flat-rate positioning is defensible against Elevar's order-volume pricing (which punishes growth) but faces margin pressure from the $10-29/mo low end. Key risk: Shopify's own native conversion API integrations are improving and could commoditize parts of this stack within 12-24 months.
Skeptic + judge rationale
Death modes:
- Shopify App Store review rejects or sandboxes the app within 30 days because the server-side proxy approach (routing Shopify storefront traffic through a hosted sGTM container) violates Shopify's checkout extensibility or data residency policies — specifically, Shopify has been tightening rules around apps that intercept or re-route checkout/pixel events, and the 'one-click' GTM container deployment likely requires a Custom Pixel or ScriptTag that Shopify flags as non-compliant, killing distribution before organic installs can compound
- The 'one-click' claim breaks on store variant complexity within the first 60 days: merchants with custom themes, headless storefronts, Rebuy/CartHook post-purchase flows, or non-standard checkout scripts find that the pre-wired CAPI/Google Ads/TikTok event mapping misfires on add-to-cart or purchase events, producing duplicate or missing conversions — support tickets spike, the founder spends all their time on manual per-store fixes, churn hits 60%+ in month 2, and word-of-mouth in DTC Slack groups turns negative before MRR scales past $2k
- Meta/Google/TikTok CAPI authentication tokens expire or require reauthorization every 60-90 days per platform policy, and without an automated token-refresh flow built at launch, a wave of silent tracking failures hits the first cohort of live stores around day 45-75 — merchants see ROAS drop, blame the app, churn en masse, and leave one-star reviews citing 'stopped working after 2 months,' destroying the App Store rating before the founder can ship a fix

Judge rationale (score=65.0)
Strong on ARPU ($79/mo flat = ~$948/yr), pure SaaS recurring revenue, and a real ~150k buyer market with proven WTP. Loses heavily on human intervention: server-side tracking is notoriously per-store-fragile (custom themes, headless, post-purchase apps, CAPI token refresh cycles) — the skeptic's death modes are real and would likely force the founder into hands-on debugging per merchant, especially during a free pilot with 'manual setup on backend.' Complexity is mid: not just software, but live integration with Meta/Google/TikTok APIs plus Shopify App Store policy risk on pixel interception. Defensibility is weak — nine named incumbents, Shopify-native CAPI commoditizing the stack in 12-24mo, and no data/network moat.
Reply "approve wildcard" on Telegram to ship.