AI Skill Report Card

Evaluating Early-Stage Ventures

A- · 82 · Jan 16, 2026

When someone pitches "real-time AI voice translation with zero latency":

Don't code. Ask:

  • End-to-end latency budget humans tolerate? (~150-200ms)
  • Where does latency accumulate? (capture → encode → network → inference → decode → playback)
  • Which parts scale with model size vs hardware vs physics?

If they claim "zero latency" without edge compute or predictive buffering, this violates physics. Dead on arrival.
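The sniff test above is just addition against a budget. A minimal sketch, where the per-stage numbers are hypothetical placeholders (not measurements), shows why "zero latency" fails before anyone writes a line of product code:

```python
# Back-of-envelope latency budget for "real-time" voice translation.
# Stage values are illustrative assumptions; the point is that they sum.
BUDGET_MS = 200  # rough upper bound humans tolerate in conversation

stages_ms = {
    "capture": 20,
    "encode": 10,
    "network_round_trip": 60,
    "inference": 80,
    "decode": 10,
    "playback_buffer": 30,
}

total = sum(stages_ms.values())
print(f"total = {total} ms, budget = {BUDGET_MS} ms")
if total > BUDGET_MS:
    print(f"over budget by {total - BUDGET_MS} ms -- the claim is dead on arrival")
```

Even with generous per-stage guesses, the sum lands near or above the budget, which is why the pitch must explain edge compute or predictive buffering rather than assert the latency away.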

Recommendation
Add concrete input/output examples for each phase of the workflow: show actual founder responses and your specific analysis.

Phase 1: Technical Sniff Test (5 minutes)

  • Identify the fundamental constraint (physics, information theory, economics)
  • Map where complexity accumulates in their solution
  • Check if they're solving a constraint or working around it
  • Flag impossible claims vs. questionable assumptions

Phase 2: Problem vs Solution Framing (10 minutes)

  • Extract the organizational pain, not just the capability claim
  • Distinguish structural pain from preferences/complaints
  • Test: Does this solve a bottleneck or add a nice-to-have?
  • Verify: Can the problem compound if unsolved?

Phase 3: Second-Order Effects (15 minutes)

  • Map first-order benefits
  • Identify what changes when those benefits arrive
  • Find where moats might shift or erode
  • Test durability against competitive responses

Phase 4: Anti-Consensus Check (10 minutes)

  • Identify the prevailing wisdom
  • Find the artifact trail that contradicts it
  • Distinguish early/wrong from early/right
  • Assess timing vs. market readiness

Phase 5: Human Judgment Under Stress (20 minutes)

  • Test founder response to being wrong
  • Check artifact quality vs. presentation quality
  • Look for one ugly assumption they acknowledge
  • Verify execution evidence over charisma signals

Recommendation
Include a decision framework or scoring rubric for final go/no-go recommendations
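One possible shape for such a rubric is a weighted score over the five phases with go/no-go thresholds. The phase keys, weights, and cutoffs below are illustrative assumptions, not part of the framework being reviewed:

```python
# Hypothetical go/no-go rubric: score each phase 0-5, weight, and threshold.
# Weights and cutoffs are assumptions chosen for illustration only.
WEIGHTS = {
    "technical_sniff_test": 0.15,
    "problem_framing": 0.25,
    "second_order_effects": 0.20,
    "anti_consensus": 0.15,
    "founder_stress_response": 0.25,
}

def decide(scores: dict) -> str:
    """Return 'go', 'dig deeper', or 'no-go' from weighted phase scores."""
    weighted = sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)  # range 0.0-5.0
    if weighted >= 4.0:
        return "go"
    if weighted >= 2.5:
        return "dig deeper"
    return "no-go"

example = {
    "technical_sniff_test": 4,
    "problem_framing": 5,
    "second_order_effects": 3,
    "anti_consensus": 3,
    "founder_stress_response": 4,
}
print(decide(example))  # weighted score 3.9 -> "dig deeper"
```

Weighting problem framing and stress response highest mirrors the time allocation in the workflow (10 and 20 minutes); any real rubric would tune these against past decisions.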

Example 1: False Technical Confidence
Input: "We've solved the cold start problem for recommendation engines"
Output: "What's your precision@k for users with <5 interactions? How do you handle taste evolution? If you need collaborative filtering anyway, what exactly did you solve?"
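The precision@k question in Example 1 refers to a standard ranking metric; a minimal sketch of what the founder would need to report (the recommendation list and clicks below are made up):

```python
def precision_at_k(recommended: list, relevant: set, k: int) -> float:
    """Fraction of the top-k recommended items the user actually engaged with."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k

# A cold-start user: five recommendations, two engaged with.
recs = ["a", "b", "c", "d", "e"]
clicked = {"b", "e"}
print(precision_at_k(recs, clicked, k=5))  # 2 hits out of 5
```

If the founder cannot produce this number specifically for users with fewer than 5 interactions, the "solved cold start" claim is unsubstantiated.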

Example 2: Problem-Space Clarity
Input: "We autocomplete code better than GitHub Copilot"
Better: "Senior engineers are bottlenecks; juniors can't unblock themselves"
Analysis: The first sells capability (competitive). The second sells organizational pain (compounds).

Example 3: Second-Order Thinking
Input: "Cloud costs dropped, so our GPU-intensive product is now viable"
Question: "If costs dropped for you, they dropped for incumbents too. What prevents them from copying your features cheaply now?"

Recommendation
Expand the Quick Start with 2-3 more concrete scenarios beyond the voice translation example

Compress Technical Assessment

  • Don't need to code to understand fundamental limits
  • Focus on constraint analysis over implementation details
  • Trust artifact trails over demo performance

Follow the Pain, Not the Solution

  • Organizational bottlenecks > technical capabilities
  • Structural problems > loud complaints
  • Compounding issues > one-time fixes

Model the System, Not Just the Startup

  • How does success change the game?
  • What do competitors do when this works?
  • Where do moats migrate under pressure?

Test Stress Response

  • How do founders handle being wrong?
  • What's the ugliest part they'll admit?
  • Do they show work or just results?

Overconfidence in Pattern Matching

  • "I've seen this before" ≠ "this can't work"
  • Constraints can relax silently (regulation, hardware, cost)
  • Sometimes the impossible becomes possible when one assumption changes

Mistaking Noise for Signal

  • Twitter rage ≠ market demand
  • Loud complaints ≠ structural pain
  • Clean pitches are often suspicious (real opportunities have ugly parts)

Over-Modeling

  • Perfect causal chains that predict nothing
  • Second-order effects can reverse unexpectedly
  • At some point, bet before the model converges

Charisma Pattern Matching

  • Some founders rehearse the role without the substance
  • Artifacts > vibes
  • Scattered conversation + strong execution history = potential false negative

False Positive Traps

  • Fast growth driven by temporary regulatory loopholes
  • Product-market fit that depends on unsustainable unit economics
  • Technical elegance without market timing

Intervention Damage

  • Giving advice when founders need to discover their own signal
  • Contaminating their learning with your assumptions
  • Sometimes the highest-skill move is strategic silence
Grade: A-
AI Skill Framework Scorecard

Criteria Breakdown
  • Quick Start: 11/15
  • Workflow: 11/15
  • Examples: 15/20
  • Completeness: 15/20
  • Format: 11/15
  • Conciseness: 11/15