peopleanalyst

Portfolio · Mike West · Pittsburgh, PA

Many designs. One conviction. Feedback makes people, and AI, better. Data workflow is the spine.

A range of analytical products spanning coding, fantasy football, baby naming, figurative art, AI-augmented authorship, and enterprise people analytics, including human performance. They sound unrelated. They are all instances of the same loop (measurement in, reflection within, better decisions out) applied to whatever you care about.

Projects

DevPlane

Two products in one project. (1) A local cockpit for multi-tool software development — assignment registry, two-phase actor handoff, coordination-event log, MCP server, CLI, Chrome extension; the operator-side measurement layer AI coding tools' agent-side metrics miss. (2) The portfolio's shared engineering brain — pattern library, architecture maps, session handoffs, cross-project assignment registry, API registry, decision log, capabilities catalog, and the capability-architecture doctrine the whole portfolio is built against.

The problem

Two convergent problems. AI coding tools' productivity claims rest on agent-side measurements — lines produced, tasks completed, time-to-PR. If the Ironies of Automation are operative (operator vigilance falling as agent reliability rises), those measurements systematically overstate net effect. There is no operator-side cockpit catching the loss. And on the portfolio side, every repo accumulates its own patterns, architecture maps, handoffs, and decisions — most of which are unreachable from any other repo. The result is rebuilding the same primitive five times across five products and not noticing.

What I built

A local dashboard + MCP server + CLI + Chrome extension that orchestrates day-to-day work inside one target project (kanban, actions, dispatch, session reports). A multi-agent kanban with a completion-block protocol that tracks per-card execution across heterogeneous AI tools. Two-phase actor handoff (builder → reviewer) where the second transition requires an artifact only the reviewer can produce. Cross-tool sync via a hub SDK so an operator coordinates Cursor, Claude Code, Replit, and other agents through one board. The continuous production telemetry that runs the C1 risk-compensation field study — a pre-registered test of Bainbridge 1983 in real coding work, with hypotheses, analysis plan, and falsification criteria specified before data accumulates. The cross-project documentation hub: pattern library (~20 production-validated patterns), architecture maps per repo, session-handoff archive, assignment registry across repos, API registry, decision log, capabilities catalog, and the Portfolio Capability Platform Playbook that governs how every product folder is structured (src/capabilities/<name>/{contracts,core,adapters,ui,tests} with extraction maturity levels 0–3). Performix is the first reference implementation of the doctrine.
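The two-phase handoff can be sketched as a small state machine. This is a minimal illustration, not DevPlane's code: the `Card` class, the `Phase` names, and the tuple-based event log are hypothetical, and the "artifact only the reviewer can produce" is approximated by requiring a non-empty artifact from an actor who is the assigned reviewer and not the builder.

```python
from dataclasses import dataclass, field
from enum import Enum


class Phase(Enum):
    ASSIGNED = "assigned"
    BUILT = "built"        # builder has handed off
    REVIEWED = "reviewed"  # reviewer has signed off


@dataclass
class Card:
    # Hypothetical card record; field names are assumptions for illustration.
    card_id: str
    builder: str
    reviewer: str
    phase: Phase = Phase.ASSIGNED
    events: list = field(default_factory=list)  # stand-in coordination-event log

    def hand_off(self, actor: str) -> None:
        """First transition: only the assigned builder may hand off."""
        if self.phase is not Phase.ASSIGNED or actor != self.builder:
            raise PermissionError("only the builder can hand off an assigned card")
        self.phase = Phase.BUILT
        self.events.append(("hand_off", actor))

    def approve(self, actor: str, review_artifact: str) -> None:
        """Second transition: requires a non-empty artifact from the reviewer."""
        if self.phase is not Phase.BUILT:
            raise ValueError("card has not been handed off yet")
        if actor != self.reviewer or actor == self.builder:
            raise PermissionError("only a distinct reviewer can approve")
        if not review_artifact.strip():
            raise ValueError("approval requires a review artifact")
        self.phase = Phase.REVIEWED
        self.events.append(("approve", actor, review_artifact))
```

A fuller version would presumably verify the artifact itself (for example, a structured review report) rather than trust the actor name; the sketch only shows where that check would sit.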

What's novel
  • 01 Two-phase actor handoff (builder → reviewer) where the second transition requires an artifact only the reviewer can produce: enforces review without trusting it
  • 02 Coordination-event log as a research instrument, not just an audit trail: the apparatus for the C1 risk-compensation field study
  • 03 Completion blocks as a protocol: every assignment ends with structured machine-readable completion, not free-text close-out
  • 04 Hub-and-spoke sync between heterogeneous AI tools so an operator coordinates Cursor + Claude Code + Replit + custom agents through one board
  • 05 Cross-project documentation hub: patterns, maps, handoffs, assignments, APIs, decisions, and capabilities legible across every repo in the portfolio. Reduces the rebuild-it-five-times tax that solo-cadence multi-product work would otherwise pay.
  • 06 Capability Architecture Doctrine: the portfolio-wide playbook for structuring applications as compositions of extractable capabilities with contracts, adapters, and props-driven UI. Performix is the first reference implementation; subsequent products inherit the structure.
Outcome

Private. The operator-side coordination spine for the multi-app portfolio. Live locally, measuring, instrumented for the C1 field study. Productising as a multi-tenant SaaS per docs/DEVPLANE-ROADMAP.md. Portfolio consolidation plan in flight: 5 clusters + DevPlane (Fantasy Football absorbs mfl-command-center; namesake/ monorepo absorbs baby-namer family; etc.). Pattern library ships ~20 reusable patterns; capability doctrine is the architectural framework for everything built going forward.

DevPlane is two bets in one project. The first bet is operator-side: the productivity claims being made for AI coding tools are largely grounded in agent-side measurements, and those measurements systematically miss what an operator running multiple agents actually has to do. Build the cockpit, instrument it, run the pre-registered field study against the agents-on-tap-make-everyone-faster claim, and the data either validates the claim or qualifies it. Either way it is more honest than what the field has today. The second bet is shared-brain: every repo in this portfolio accumulates its own patterns, maps, handoffs, and decisions, and most of them stay locked inside the repo that produced them. DevPlane's cross-project documentation role pulls those into one legible surface so the next product can stand on the last one's shoulders rather than rebuilding from scratch. The two bets reinforce each other: the cockpit collects the telemetry that the shared-brain layer turns into pattern recommendations across products.

Fourth & Two

Gridiron Platform — a fantasy football platform built around four converging efforts: a GM Command Center (lineup, waivers, trades, draft, rankings), a Python analytics API (PRISM and CAMS frameworks ported to football), an Insight Card system (PRISM trend / CAMS alignment / market-signal cards composed across surfaces), and Strategy League / Football IQ (a coaching-strategy game layered on fantasy leagues with a simulation engine).

The problem

Fantasy products are either marketplaces with shallow projections or hardcore stat tools that do not make a case. The middle — readable intelligence with a point of view — is empty. And the strategy-game adjacency is even more under-served: there is no serious head-to-head coaching game where the simulation engine respects fantasy-league context and the analytics that drove the season inform the play.

What I built

A monorepo with four converging efforts: (1) **MFL Command Center** — GM workflow app at `apps/web` (lineup, waivers, trades, draft mode, rankings) with the MFL adapter live; (2) **Gridiron Analytics API** — Python / FastAPI service with PRISM (analytics-grade ranking and decision support) and CAMS (Capability / Alignment / Motivation / Support — the portfolio framework adapted to football roles and matchups); (3) **Insight Card system** — rich composable card components (PRISM trend, CAMS alignment, market signal, more) surfaced at `/insights` (browse) and in-context across Command Center / Strategy League (drill-down); (4) **Strategy League / Football IQ** — coaching-strategy game layered on fantasy leagues with a simulation engine. Migrations through 0023 (LLM enrichment tables); insight-card system has its own vision doc + testing plan + outstanding-work inventory.

What's novel
  • 01 Four converging efforts on one substrate: Command Center / Analytics API / Insight Cards / Strategy League. Each ships standalone value; together they compose into a cohesive product narrative.
  • 02 CAMS ported from people analytics to football: the same capability-alignment-motivation-support framework that drives Performix's diagnostic instrument, adapted to roles, matchups, and weekly performance. Cross-portfolio framework reuse without re-implementation.
  • 03 PRISM: analytics-grade ranking + decision support built into the Python/FastAPI service; not a generic projection engine. Same statistical-rigor posture the rest of the portfolio inherits.
  • 04 Insight Card system as the legibility surface: composable cards (PRISM trend, CAMS alignment, market signal) browsable at `/insights` and embeddable in context. Cards are first-class products; meta-tagging + context filtering govern when each one appears.
  • 05 Strategy League / Football IQ: coaching-strategy game with a simulation engine layered on fantasy leagues. Unusual adjacency for an analytics product; the strategy-game audience and the analytics audience are different shapes.
  • 06 MFL adapter pattern: formal adapter contract for fantasy provider swaps; the platform is provider-agnostic by design even though MFL is the first integration.
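
A provider-agnostic adapter contract of the kind the MFL adapter pattern describes might look like the sketch below. Everything named here is an assumption for illustration: `FantasyProvider`, `RosterSlot`, `fetch_roster`, and the in-memory stand-in are hypothetical and are not the platform's actual contract or the MFL API.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class RosterSlot:
    player_id: str
    position: str


class FantasyProvider(Protocol):
    """Hypothetical contract every provider adapter would satisfy."""

    def fetch_roster(self, league_id: str, team_id: str) -> list[RosterSlot]: ...


class InMemoryProvider:
    """Stand-in adapter used here in place of a real MFL client."""

    def __init__(self, rosters: dict[tuple[str, str], list[RosterSlot]]):
        self._rosters = rosters

    def fetch_roster(self, league_id: str, team_id: str) -> list[RosterSlot]:
        return list(self._rosters.get((league_id, team_id), []))


def positions_on_roster(provider: FantasyProvider, league_id: str, team_id: str) -> set[str]:
    """Platform code depends on the contract, never on a specific provider."""
    return {slot.position for slot in provider.fetch_roster(league_id, team_id)}
```

Because `positions_on_roster` only sees the protocol, swapping providers would mean writing one new adapter class rather than touching the calling code.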
Outcome

Private. Four efforts in active development; Insight Cards have their own vision doc + testing plan + outstanding-work inventory tracked separately. Database migrations through 0023 (LLM enrichment tables shipped). Card surfacing decision locked per ADR 0004 (cards ship in `apps/web` at `/insights` browse + in-context drill-down; `apps/analytics` is not the canonical card surface until a new route-to-market). Games shelf + Clerk auth + Strategy League game-plan persistence shipped in April 2026.

Fourth & Two is the bet that intelligence is more readable when it has a sensibility, but the deeper bet is structural: cross-portfolio framework reuse. CAMS was built for people analytics; it generalizes to football roles and matchups because the underlying capability-alignment-motivation-support model isn't HR-specific — it's about what produces realized performance given the conditions. PRISM is the analytics-grade decision-support layer the field is missing. The Insight Card system is the consumption-shape that makes the analytics legible without flattening them into projections. Strategy League / Football IQ is the longer bet on whether the simulation-engine adjacency lands as a distinct game audience or stays as a high-engagement layer on top of fantasy. The platform exists because the middle — analytics with a point of view, a coherent UI surface, and an unusual cross-portfolio framework lineage — was empty.

Namesake · namesake.baby

A name chosen, not stumbled upon. Every baby-naming product helps parents find names; Namesake is the only one that helps them choose one. Built on twenty years of weekly search data, a per-name composite score (SSA stats × LLM-enriched meaning × cultural-event attribution), a tournament-bracket decision UX with village voting, and a cultural-diffusion research apparatus underneath.

The problem

Most baby-naming products are SEO listicles dressed up as products. They cannot tell you whether a name is currently rising or fading; they cannot attribute the rise (which film, which royal birth, which character); they cannot help two parents converge on a shared decision; and they cannot tell you why one name carries narrative weight and another does not. The decision warrants more than a top-100 list. Nothing on the market treats it that way.

What I built

Live product at namesake.baby. $24 one-time tournament unlock — no subscription — that gates the builder, single-elimination bracket UX, village voting, shower mode, and birth reveal. A wizard flow walks parents through their story, taste, and constraints; the system produces a scored, swipable list (the *Namesake score* is a deterministic composite — uniqueness × 0.30 + trend × 0.25 + meaning × 0.20 + feasibility × 0.13 + search × 0.12, percentile-normalized into a 5–95 display range against the dataset). A cultural-intelligence content layer maps name spikes to specific films, shows, and events with attribution drawn from OMDb, TMDb, Wikipedia, and AI synthesis — the definitive record of how popular culture has shaped American baby naming. Visual identity is light-mode taste-grade (Libre Baskerville + Source Sans 3, terracotta and cream palette); profile illustrations are generated through a FLUX pipeline via FAL.
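
The composite can be written out directly from the weights above, which sum to 1.00. The sketch below is an illustration under assumptions, not the production rescore script: it assumes each component is already scaled to 0..1, and it uses a mid-rank percentile to map raw scores into the 5–95 display range, since the exact normalization method is not stated here. The function names are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Weights as stated in the text; they sum to 1.00.
WEIGHTS = {
    "uniqueness": 0.30,
    "trend": 0.25,
    "meaning": 0.20,
    "feasibility": 0.13,
    "search": 0.12,
}


def raw_score(components: dict[str, float]) -> float:
    """Weighted sum of component scores (each assumed to lie in 0..1)."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


def display_scores(raw: list[float], lo: float = 5.0, hi: float = 95.0) -> list[float]:
    """Map each raw score to its percentile rank, rescaled into [lo, hi]."""
    ordered = sorted(raw)
    n = len(ordered)
    out = []
    for value in raw:
        # Mid-rank percentile (an assumption) so tied names share one value.
        below = bisect_left(ordered, value)
        at_or_below = bisect_right(ordered, value)
        pct = (below + at_or_below) / (2 * n)
        out.append(round(lo + (hi - lo) * pct, 1))
    return out
```

Because both steps are pure functions of the dataset, rerunning them over the same inputs reproduces every display score, and no score can fall outside the 5–95 band.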

What's novel
  • 01 Decision-discipline product framing: the locked positioning is *a name chosen, not stumbled upon*. Every surface is built around helping parents converge on a single name, not browse infinitely.
  • 02 Tournament-bracket UX: single-elimination naming bracket, village voting, shower mode, and birth reveal as bundled features under a $24 one-time unlock. The bracket forces the choosing-not-browsing motion the positioning names.
  • 03 Twenty years of weekly search data plus systematic cultural-event attribution. Most naming products show annual SSA rankings; Namesake shows weekly trend velocity with named causal events (the film that landed, the character who became iconic, the royal birth) attached. No competitor has the dataset.
  • 04 Cultural-diffusion research apparatus: Bass diffusion, Hawkes self-exciting processes, Moran's I, Granger causality, Lieberson null model, phonetic-spillover graphs, variance decomposition. The research surface is mirrored to `/research/namesake/` and is structural substrate for the network-mediated-adoption book project: same mechanism on a cultural-name corpus that the book argues for on organizational adoption.
  • 05 Deterministic, percentile-normalized scoring with no LLM-on-render. The Namesake score is computed in a batch rescore script (`npm run recalculate-scores`); the LLM enrichment runs once per name, not per request. Every name's display score is reproducible and bounded; the substrate stays interpretable.
  • 06 Light-mode executive-tasteful aesthetic: Libre Baskerville serif, terracotta palette, FLUX-pipeline illustrations on the cooler illustration-surface token. The visual register is the audience's; the editorial discipline does not leak.
Outcome

Live at https://namesake.baby. Paid product at $24 per tournament unlock — Stripe live. Research surface mirrored to peopleanalyst.com/research/namesake (cultural-diffusion methodology, Bass / Hawkes / Moran reports, predictability-ceiling finding, null-model + phonetic-spillover phases). Cultural-intelligence content strategy laid out for `/culture/{films,shows,events}` SEO surface; route currently redirects to `/names` pending hub build.

Naming a child is high-stakes and irreversible, and most products treat it as entertainment. Namesake exists because the decision warrants the discipline. Underneath the product is a real research bet: twenty years of weekly search data, a spike-detection pipeline, and a cultural-attribution layer that maps name surges to specific cultural events. The dataset is the moat; no competitor is doing this work. The product layer — the wizard, the bracket, the village-voting feature, the bundled $24 unlock — is what turns the data into a decision. Two architectural relationships sit underneath: the research surface is mirrored to PA-site and contributes structural evidence to the network-mediated-adoption book (Bass + Hawkes + Moran + the predictability-ceiling finding work on cultural-name diffusion the same way Centola's lab work runs on organizational behavior); and the player-paradigm UX discipline — taste-grade visual register, paced reveal, decision-shaped surfaces — is the consumer-facing analog of the analytics-grade discipline running across the rest of the portfolio.

Vela · vela.study

1,300+ commits · solo

A contemplative platform that began with fine-art figurative work and has broadened into four things at once — an Image System (player + Reveal + composites), a five-layer Written System (analytical essays · Mosaic testimony · fiction · readings · submissions), a Narrative Intelligence Platform (three-domain editorial spine), and an Adaptive Authorship substrate the rest of the portfolio is built to sit on top of. Editorial Office, Penwright in /labs, Artist Studies arcs (Warhol shipped; Schiele / Klimt / Sargent staged), and a preregistered research program at /study running underneath.

The problem

Image platforms either flatten taste into engagement metrics or hide behind gatekeepers. Editorial platforms publish on calendars rather than to readers. Neither produces a reading rhythm. Neither learns from you. Vela is built on the bet that a single substrate can do both — taste-driven figurative discovery and longform editorial work — when adaptive measurement runs underneath.

What I built

385 active works pulled from museum APIs (ARTIC, Met, BnF, Smithsonian, Europeana) under full attribution and license discipline. The Reincarnation engine learns per-reader desire and pool composition across visual rhyme and emotional register; first production batch shipped. A five-layer Written System — analytical essays, Mosaic pieces (1,353-passage testimony corpus), fiction, audio readings, and submissions — paced per-user (each reader's magazine begins when they arrive, not on a calendar). The Narrative Intelligence Platform generalizes the Mosaic engine into a three-domain spine (lived / fiction / research) with five admin surfaces deployed (Dashboard · Sources · Excerpt Review · Quote Explorer · Insight Workbench · Guide Builder). Artist Studies as the flagship editorial arc pattern — Warhol shipped (10 ASNs, Virtual Docent extension, Pittsburgh walking tour); Schiele, Klimt, and Sargent staged. Penwright lives at /labs/penwright (F-03 Authorship Packet UI MVP shipped; F-19 Adaptive Authorship Control Kernel is the spine; *They Say / I Say* method registered as the third primitive 2026-05-13). Editorial Office: Writer's Desk for 1:1 with each writer; The Office for multi-writer convening with round-2 react. Derivative Pipeline (unlicensed → Vela-licensed, with transformation-distance scoring) live as a new push. Stripe membership in live mode. Preregistered research program at /study with consent-gated session, instrument-validation cohort, and dissertation-track scaffolds for RQ1–RQ4.

What's novel
  • 01 Reincarnation engine: per-user desire scoring with RID/SID adaptive measurement and visual-rhyme sequencing. The secret sauce, treated as scientific instrument under formal scrutiny (math spec, segmentation explorer, replay pipeline, instrument-validation cohort)
  • 02 Per-user magazine pacing: each reader's editorial schedule begins when they arrive; positioning wedge is "your magazine begins when you do"
  • 03 Five-layer Written System on a single substrate (analytical essays, Mosaic testimony, fiction, audio readings, and submissions) coexisting around three editorial axes (figurative response, emotion architecture, developmental theology); emotion is the core axis since the 2026-04-30 pivot
  • 04 Narrative Intelligence Platform: generalizes the Mosaic editorial engine into a three-domain spine (lived / fiction / research); five admin surfaces deployed (Dashboard, Sources, Excerpt Review, Quote Explorer, Insight Workbench, Guide Builder) for processing the 1,353-passage testimony corpus
  • 05 Artist Studies arcs as the flagship editorial pattern: research → essay → gallery → AI method studies → walking tour → retrospective; Warhol shipped (Virtual Docent + Pittsburgh tour); Schiele, Klimt, Sargent staged in the queue
  • 06 Editorial Office: Writer's Desk + multi-writer convening turns the writer roster from production tool into colleagues
  • 07 Adaptive Authorship substrate underneath (lib/platform/, CI-guarded against Vela-specific imports): Vela is property #1; siblings reuse the substrate
  • 08 Preregistered research program at /study: consent-gated session, arm-assignment log, instrument validation, dissertation scaffolds for RQ1–RQ4. The site is the exhibit; the engine is the science.
  • 09 Museum-grade attribution and license discipline as a first-class feature, not a footnote
Outcome

Live at vela.study. Stripe membership in live mode. Magazine publishing on per-reader pacing. Reincarnation first production batch shipped. Narrative Intelligence Platform: five admin surfaces deployed, 1,353-passage testimony corpus live, dual-write from Mosaic → Excerpts running. Artist Studies: Warhol shipped (ASN-306..315 + Virtual Docent extension); Schiele, Klimt, Sargent staged. Labs migration: seven experimental surfaces (Coincidences, Reveal, Three Rooms, Tournament/Flow Preview, Personal Composites, Constellation, Reveal Grid, Penwright) at /labs/* with explicit forward-looking framing. /study research program staged behind first-pass instrument validation. Penwright F-03 Authorship Packet UI MVP shipped; F-19 Adaptive Authorship Control Kernel is the active spine.

Vela began as a bet that taste compounds when given a substrate. The substrate is the asymmetry: AI holds the survived corpus, humans hold the unsurvived response. Vela is the place where those two meet — careful sourcing on one side, calibrated human signal on the other, and a magazine for the language in between. The bet has broadened: the same substrate now hosts Penwright (authorship system in /labs), the Editorial Office (writer collaboration), per-user magazine pacing, and three editorial axes that coexist without collapsing. It is also the reference implementation for an adaptive-authorship platform that future siblings will sit on top of.

Penwright

An AI-augmented authorship system: corpus control, packet-shaped composition, and a measurement framework that asks whether the writer is better with it than without it after six months.

The problem

Most AI writing tools optimize for output fluency. They make it easier to produce something faster — and that something is often shaped by the model rather than the writer. The longer-term cost (capability erosion, voice flattening, sycophancy spirals, source attribution buried) is barely measured because the field measures what is easy to measure. The result is a generation of tools that look like assistants and act like substitutes.

What I built

An authorship environment that inverts the prompt-then-edit pattern. Writers assemble Authorship Packets — intent · structure · key ideas · relevant passages · counterpositions — before the AI is invoked. Corpus selection is explicit: writers choose which sources influence the work rather than inheriting the model's training distribution. The Adaptive Authorship Control Kernel (F-19) is the spine — central registry of skill measurement, intervention, and genre-aware behavior (memoir / nonfiction / fiction never collapsed). The Penwright Measurement Framework — six skill dimensions, six derived indices, three measurement layers, five-step learning loop, and four non-negotiable failure modes — determines whether a session made the writer better. Lives inside Vela's repo (app/labs/penwright/) for now; graduates when the design stabilizes.
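
The packet-before-invocation rule can be sketched as a data structure plus a guard. This is a minimal illustration, assuming the five packet sections named above map to simple fields; the `AuthorshipPacket` class, `missing_fields`, and `build_request` are hypothetical names, not Penwright's schema.

```python
from dataclasses import dataclass, field


@dataclass
class AuthorshipPacket:
    # The five sections named in the text; the field types are assumptions.
    intent: str = ""
    structure: list[str] = field(default_factory=list)
    key_ideas: list[str] = field(default_factory=list)
    relevant_passages: list[str] = field(default_factory=list)  # writer-chosen corpus excerpts
    counterpositions: list[str] = field(default_factory=list)

    def missing_fields(self) -> list[str]:
        """Names of packet sections the writer has not filled in yet."""
        return [name for name, value in vars(self).items() if not value]


def build_request(packet: AuthorshipPacket) -> dict:
    """Refuse to produce a model request until the packet is complete."""
    missing = packet.missing_fields()
    if missing:
        raise ValueError("packet incomplete: " + ", ".join(missing))
    return {"packet": vars(packet).copy()}
```

Raising instead of filling gaps with defaults mirrors the inversion described above: the writer's structure comes first, and the model is only reachable through it.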

What's novel
  • 01 Authorship Packet Model: replaces freeform prompting with structured input units; the structure itself is data
  • 02 Corpus Control Layer: writer selects sources rather than inheriting the LLM's training distribution
  • 03 Adaptive Authorship Control Kernel (F-19): central registry of measurement and intervention; genre-aware behavior forks copy + schema enums + prompts + metrics rather than collapsing them
  • 04 Penwright Measurement Framework: first multi-dimensional measurement system for AI-augmented writing skill development; four non-negotiable failure modes (output-only optimization · over-automation · weak measurement · ignoring genre differences) act as veto
  • 05 Accumulating named writing-method primitives: Borrowed Architecture (v1.0, 2026-05-12), They Say / I Say (third primitive, v1.1, 2026-05-13). Each is a registered method with structural anatomy + worked examples + anti-patterns, available to the writer as a callable move inside the kernel rather than as advice in a style guide.
  • 06 Anti-invention constraint: when a structural rhetorical move requires biographical material the user has not supplied, the tool refuses to render rather than confabulating
  • 07 Writing-craft corpus as the method-reference substrate: Layer-1 writing-craft books (Carson, Bluets, the recent Cluster A ingestion) tagged for retrieval against method-development sessions, so the kernel cites primary sources when proposing a move, not just trained patterns
  • 08 Has its own published research program at peopleanalyst.com/research/ai-human-interaction (12-paper Penwright Research Program across three tiers)
Outcome

Early build inside Vela's repo (app/labs/penwright/). F-03 (Authorship Packet UI MVP) shipped. F-19 (Adaptive Authorship Control Kernel) is the architectural spine; it ships first or in parallel with the first feature. 19 features (F-01..F-19) sequenced across 6 implementation waves. Three named writing-method primitives registered so far (Authorship Packet structure, Borrowed Architecture, They Say / I Say); writing-craft corpus Layer-1 (Carson + Bluets + cohort) ingested as method-reference substrate. Research program at peopleanalyst.com/research/ai-human-interaction is the public-facing trajectory.

Penwright exists because the field of AI writing is being measured by output and not by capability. The longitudinal test (is the writer better with Penwright than without it after six months?) is unfashionable but load-bearing. The alternative bet, better outputs faster and optimization toward fluency, is the bet most of the field has already taken. Penwright is the bet on the other side: that writers can become more capable inside an AI-augmented environment, and that this can be measured rigorously enough to fail on its own terms. Seven non-negotiable rules in §7 of the vision doc act as the spine for every product decision (don't build generic AI writing features · don't collapse genre distinctions · don't hide source attribution · don't flatten emotional nuance · don't optimize for speed over authorship · don't make AI compliant · don't over-moralize).

Performix

Protected feedback and performance intelligence platform — surfaces what employees can observe but cannot safely say, scores team-level Capability / Alignment / Motivation / Support (CAMS), and renders one binding-constraint card per team. Precompute-and-playback architecture — the iTunes of analytics, not a dashboard. First external MCP consumer of the People Analytics Toolbox.

The problem

Organizations cannot improve what employees cannot safely say. Performance is not produced by capability alone; capability becomes realized performance only when alignment, motivation, and support are present. Most products miss this: they treat performance data as a transaction record (HRIS posture), a dashboard (BI posture), or a sentiment score (engagement-survey posture). None of those postures instruments the question that actually matters: what is blocking team performance right now, and what is one accountable action that would unblock it? The richer signal (protected, comparative, longitudinal, distribution-aware, small-N-honest, mechanism-grounded) gets averaged into uselessness or lost in privacy theater.

What I built

MVP 1 — Protected Team Performance Diagnostic — in active build. Manager picks a team, answers ~12 protected survey statements (four CAMS dimensions × three items), and the system renders one binding-constraint card showing which dimension is starving plus safe comment themes plus recommended action plus follow-up pulse plan. Seven capabilities staged: protected-feedback (min-N + redaction primitive every other capability passes through), survey-collector-adapter, segmentation-adapter, cams-diagnostic, performance-science-library (cited findings backing every evidence pill), insight-player (the Player-Paradigm-inherited UI; the user's home), action-loop. CAMS canonical spec, subconstruct boundaries, binding-constraint rule, and code contract all live in `docs/CAMS.md`. Architecture is precompute-and-playback — metrics are calculated upstream by segment and stored as Insights / InsightCharts / Smart Lists; the player retrieves, never re-computes at render. Safe AI Insight Interpreter sits at the interpretation layer, never the headline — summarizes patterns, drafts leadership communications, flags weak evidence, respects min-N and role-based visibility. MCP consumer of the People Analytics Toolbox via the Reincarnation adapter as of 2026-05-11 (PFX-30) — typed-contract dependency, no shared substrate coupling.
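
The binding-constraint idea (four dimensions, three items each, lowest dimension wins) can be sketched as follows. The canonical rule lives in `docs/CAMS.md` and is not reproduced here, so this is only an illustration under assumptions: item scores are plain numbers, each dimension is the mean of its items, and ties go to the dimension listed first. The function names are hypothetical.

```python
from statistics import mean

DIMENSIONS = ("capability", "alignment", "motivation", "support")


def dimension_scores(responses: dict[str, list[float]]) -> dict[str, float]:
    """Average the item scores for each CAMS dimension."""
    return {dim: mean(responses[dim]) for dim in DIMENSIONS}


def binding_constraint(responses: dict[str, list[float]]) -> str:
    """Return the lowest-scoring dimension; ties go to the earlier one in DIMENSIONS."""
    scores = dimension_scores(responses)
    return min(DIMENSIONS, key=lambda dim: scores[dim])
```

For a team whose motivation items average lowest, `binding_constraint` returns `"motivation"`, which is the single dimension the card would name.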

What's novel
  • 01 Protected feedback as a substrate primitive, not a privacy setting. Min-N enforcement, small-cell suppression, comment redaction, identity-risk scoring, role-based visibility, and safe aggregation all live in the gate every Insight passes through before it reaches storage. The player never has access to anything that has not been suppression-checked.
  • 02 CAMS as the binding-constraint diagnostic, not a dashboard of metrics. The output is one card per team naming the dimension that is currently starving the system. Capability / Alignment / Motivation / Support; whichever is lowest is the constraint; that constraint is what to act on. Capability alone never produces realized performance.
  • 03 Precompute-and-playback architecture, not BI. Insights are first-class records calculated upstream and stored; the player and the library retrieve. The framing is musical: lists, collections, and the now-playing experience. Where any surface starts computing metrics in response to a user request, that is a bug.
  • 04 Player Paradigm: the consumption discipline inherited from Vela, adapted to executive enterprise consumption (modern light mode, taste-grade typography, no dark-mode museum aesthetic). The player is the user's home; the library is the secondary discovery surface. Same paradigm as Vela; different aesthetic for the enterprise audience.
  • 05 Safe AI Insight Interpreter: AI is not the headline; AI is the interpretation layer that summarizes team patterns, drafts leadership communications, flags weak evidence, recommends follow-up. Hard constraints: never expose raw responses, never infer who said what, never generate individual employee scores from protected feedback, never support retaliation or unsafe targeting.
  • 06 MCP consumer of the People Analytics Toolbox. Performix vendors typed Zod contracts from the toolbox (Reincarnation for adaptive measurement; data-anonymizer for the suppression gate; segmentation-studio for cohort resolution; calculus for confidence intervals on small-N segments) instead of re-implementing them in-product. First-of-fleet test of the substrate-not-product positioning the toolbox is built around.
  • 07 Same product, three beachheads, three front-door messages. Sales-performance variance for CROs / VP Sales / RevOps; AI-transformation readiness for Chief AI Officers / COOs / CHROs; post-acquisition integration for Chief Integration Officers / PE op partners. The underlying instrument is identical; only the buying motion changes.
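
The combination of a min-N gate and precompute-and-playback might be sketched like this. It is a simplified illustration, not Performix's implementation: the `Insight` record, the `gate`, `precompute`, and `play` functions, and the threshold of 5 are all assumptions, and the real gate also covers redaction, identity-risk scoring, and role-based visibility, which are omitted here.

```python
from dataclasses import dataclass
from typing import Optional

MIN_N = 5  # illustrative threshold; the real value is not stated in the text


@dataclass(frozen=True)
class Insight:
    segment: str
    score: Optional[float]  # None when the cell is suppressed
    suppressed: bool


def gate(segment: str, values: list[float], min_n: int = MIN_N) -> Insight:
    """Aggregate one segment, withholding the score when the cell is too small."""
    if len(values) < min_n:
        return Insight(segment, None, True)
    return Insight(segment, sum(values) / len(values), False)


def precompute(raw: dict[str, list[float]]) -> dict[str, Insight]:
    """Upstream step: every Insight passes through the gate before storage."""
    return {segment: gate(segment, values) for segment, values in raw.items()}


def play(store: dict[str, Insight], segment: str) -> Insight:
    """Player step: retrieval only, with no access to raw responses."""
    return store[segment]
```

Keeping `play` as a pure lookup is the point of the design: the only code path that ever sees raw responses is the upstream one that also applies suppression.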
Outcome

Early build. MVP 1 scope locked at docs/VISION.md (canonical 2026-05-10; tiebreak source: PRD V2). MCP consumer of the toolbox via PFX-30 (2026-05-11). Three beachheads on the GTM roadmap (sales-performance variance first). Solo build with Alvan on system architecture and platform engineering; Mike owns performance science, CAMS, measurement constructs, and customer-use-case validation.

Performix exists because the question that actually matters for enterprise performance is not *who is rated what* but *what is blocking the team's performance right now, and what is one accountable action that would unblock it.* Capability alone never produces realized performance; alignment, motivation, and support are conjunctive conditions. CAMS is the model. The protected-feedback substrate is what makes the model legible — employees can observe the social, managerial, and structural conditions blocking performance better than any external instrument can measure them, and the only way to surface that signal is to make it safe to say out loud. The product is built on top of a precompute-and-playback architecture borrowed from music players — the user's home is the player, the library is the secondary catalog, and the discipline is to never recompute at render. Two architectural bets sit underneath: that the Player Paradigm ports from Vela's poetic register to enterprise-executive consumption with the right aesthetic discipline, and that a vendored-contracts dependency on the People Analytics Toolbox (rather than a shared substrate inside Vela) is the right shape for fleet-wide reuse. The first MCP-transport adoption (PFX-30, 2026-05-11) is the cleanness test for the second bet.

MetaFactory

Two-shell production-factory substrate for the portfolio — an engine (OLD, AI-controlled, no human UI) that ingests books and research at chapter-respecting fidelity and runs a roster of named factories (Persona Factory, Survey Factory, Competency Factory, Models Factory, Requirements Factory, Prompt Factory, Publishing Factory, Business Ideas Factory, Application Designs Factory) producing canonical outputs; and an API host (PROD, Vercel) that exposes a v1.2.0 REST + MCP contract plus a cross-portfolio library layer (Stream 7, ~944 records) the rest of the portfolio reads from.

The problem

Cross-cutting infrastructure (book ingestion at chapter-respecting fidelity, schema-conformant extraction of behavioral constructs, job / competency / persona generation, survey factories, the canonical-vocabulary substrate every consumer needs to compare measurements across products) gets re-implemented per consumer when there's no production-factory substrate. The cost isn't only engineering: each consumer ends up with its own slightly-drifted definition of competency, persona, engagement, and effect-size, so behavioral constructs that should be comparable across products become incomparable. The earlier 'Universal Information Factory' framing tried to do too much; the resulting analytics-vs-production drift left the repo half-renovated for a stretch. The fix was a deliberate split: production-factory engine in one shell, consumer-facing API in another, with a clean seam between.

What I built

Two-repo architecture. **OLD (`meta-factory`)** — AI-controlled engine, local-only on Mike's Mac, no human UI. Owns ingestion pipelines (collector → organizer → referee → classifier 14-stage book flow; research_agent + deep_research_agent for articles), the named factory roster (Persona Factory, Survey Factory, Competency Factory, Models Factory, Requirements Factory, Prompt Factory, Publishing Factory, Business Ideas Factory, Application Designs Factory, plus checkpoint-charlie and orchestrator infrastructure), the asset registry (5,013 entries as of 2026-05-11), and canonical_outputs production. Cryptographic provenance with SHA-256 tracking on every source file; safe-delete invariants require hash verification before any local delete. Dual-grade corpus ingestion migrated in from Vela — same database holds editorially-selected curator passages and bulk research chunks, distinguished by tag. **PROD (`meta-factory-prod`)** — pure API host on Vercel. Owns the v1.2.0 REST + MCP contract (`docs/API-CONTRACT.md` + `CONTRACT-CHANGELOG.md`), the cross-portfolio library layer (Stream 7, library snapshot of ~944 records consumed by every portfolio product), consumer-onboarding doc (`CONSUMERS.md`), auth boundary (`META_FACTORY_API_SECRET` shared-secret), and cloud content access (bundled snapshot + Supabase Storage). Same engine; different shells.
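
The safe-delete invariant can be sketched as follows. This is a minimal illustration, not the engine's code: `sha256_of` and `safe_delete` are hypothetical names, and the backup and durable-storage copies are represented as local paths for simplicity.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream the file through SHA-256 so large sources are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def safe_delete(local: Path, replicas: list[Path]) -> bool:
    """Delete `local` only if every replica exists and matches its hash.

    Returns True when the local file was deleted and False when the
    invariant was not satisfied (the file is then left untouched).
    """
    if not replicas:
        return False  # no verified copy elsewhere, so never delete
    expected = sha256_of(local)
    for replica in replicas:
        if not replica.is_file() or sha256_of(replica) != expected:
            return False
    local.unlink()
    return True
```

The design choice worth noting is that the function fails closed: a missing replica, a mismatched hash, or an empty replica list all leave the source file in place.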

What's novel
  • 01Split engine + API host architecture: OLD does the work, PROD makes it accessible. The seam is a snapshot + cloud-storage refresh, run by hand rather than on a schedule. The two repos can evolve independently without coupling consumer integration to engine internals.
  • 02Named factories as units of production — Persona Factory, Survey Factory, Competency Factory, Models Factory, Requirements Factory, Prompt Factory, Publishing Factory, Business Ideas Factory, Application Designs Factory. Each is a package that outputs a structured behavioral artifact; analytics offerings explicitly excluded. The roster grows; the substrate doesn't.
  • 03Cross-portfolio library layer (Stream 7) — a single read surface for ~944 corpus records that every product consumes, with a shared lifecycle and spec at `peopleanalyst-site/docs/library/SPEC.md`. The portfolio has one library, not seven slightly-different ones.
  • 04Dual-grade corpus ingestion — same database holds editorially-selected curator passages and bulk research chunks, distinguished by tag. Vela's pipeline lifted into the substrate.
  • 05Cryptographic provenance contract — SHA-256 tracked for every source file; safe-delete invariants require hash verification on backup and durable storage before any local delete. The substrate cannot lose source material to a careless deletion.
  • 06~$0.13 per research-run synthesis at 30K+ passage scale — most 'AI research' tools run 10–100× more expensive because they retrieve without pattern extraction. The discipline is statistical: extract patterns once, cite the evidence, don't re-retrieve.
  • 07Schema-extracted measurement vocabulary (@measurement/core) shared across consumers — constructs, items, instruments, effect-sizes defined once in canonical form, so behavioral measurement compares cleanly across Performix / Principia / PA Platform / the toolbox. Cross-product comparison becomes structurally possible rather than aspirational.
  • 08v1.2.0 REST + MCP contract — PROD exposes the same engine over HTTP for engineers and MCP for AI agents. Consumer integration is contract-versioned with a changelog; consumers pin to the version they were built against and migrate explicitly on major bumps.
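
The pin-and-migrate rule in the last item can be sketched as a compatibility check. This assumes conventional semantic versioning, where a served contract satisfies a consumer when the major version matches and the served version is not older than the pin. `parse_version` and `is_compatible` are hypothetical helpers, not part of the published contract.

```python
def parse_version(text: str) -> tuple[int, int, int]:
    """Parse 'v1.2.0' or '1.2.0' into (major, minor, patch)."""
    major, minor, patch = (int(part) for part in text.lstrip("v").split("."))
    return major, minor, patch


def is_compatible(pinned: str, served: str) -> bool:
    """True when the served contract can satisfy a consumer built against `pinned`.

    Same major version and served >= pinned: additive changes are fine,
    a major bump requires an explicit migration.
    """
    pinned_v = parse_version(pinned)
    served_v = parse_version(served)
    return served_v[0] == pinned_v[0] and served_v >= pinned_v
```

A consumer pinned to `1.1.0` would keep working against a host serving `v1.2.0`, while a host serving `2.0.0` would fail the check and prompt a deliberate migration.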
Outcome

Private. OLD/PROD split shipped; v1.2.0 API contract live on Vercel at `meta-factory-prod.vercel.app`. Asset registry at 5,013 entries; cross-portfolio library snapshot at ~944 records (Stream 7, 2026-05-11). All major portfolio consumers integrated (PA-site, vela, principia, Fourth & Two, Performix, DevPlane). Operator commands (`registry:build`, `registry:verify`, `snapshot:registry`, `upload:content`) are the engine-health surface for Mike or an AI agent. The half-renovated state is behind us; the substrate is legible in thirty seconds.

MetaFactory exists because cross-portfolio infrastructure can't be re-built per consumer and can't sit half-renovated forever. Two structural decisions defined the long-term shape. The first was the narrowing decision — production-factory artifacts (competencies, personas, instruments, job profiles) over open-ended analytics offerings; the cuts were executed and the roster sharpened. The second was the OLD/PROD split — the engine that does the work is too heavy to ship on Vercel and too dangerous to expose to consumers as an internal surface, and the API host that consumers actually integrate against is too thin to carry the ingestion pipelines. Two repos, one system, one seam. Underneath the systems argument is a measurement argument: every artifact a factory ships becomes an input to a human decision somewhere — a competency model becomes a development plan, a persona shapes a product call, a survey instrument runs at a client. The substrate is narrowed not just for engineering legibility, but because the artifacts go to people who need them defensibly grounded.

People Analytics Toolbox

Independently-versioned analytical microservices for people analytics — psychometric diagnostics, preference modeling, privacy primitives, segmentation, statistical enrichment, compensation logic, decision forecasting, metadata-grounded codegen — deployed as a single Next.js application and exposed over two transports: HTTP for engineers, MCP for AI agents. One Vercel project, one Supabase project. The behavioral and statistical substrate consumer apps compose against.

The problem

HR analytics products treat behavioral data as if it were transactional. Engagement gets reduced to a survey score; performance to a rating average; retention to a churn rate; compensation to a band. The richer questions — what actually drives engagement in this organization, what signal in performance distributions matters for decisions, what statistical posture handles small-N segmentations honestly, what value would more information have before another study runs — get lost. Cross-cutting concerns (anonymization, metric calculation, segmentation, survey delivery, decision support) also get re-implemented per product, brittle and fragmented. The combination — wrong analytical posture and fragmented infrastructure — is the field's default state.

What I built

Live spokes — reincarnation (adaptive psychometric diagnostic engine; IRT-weighted item selection; pool-based item lifecycle), preference-modeler (Likert/multi-choice/free-text plus MaxDiff, conjoint, penny allocation, paired comparison; BIBD-balanced task generation; MNL utility estimation via Newton-Raphson), data-anonymizer (PII detection, deterministic HMAC tokenization, k-anonymity min-N gate, substitution-strategy registry), segmentation-studio (HRIS canonical-field normalization with 35-field priority catalog; multi-membership cohort resolution; OneModel adapter; recipes; pack publishing), calculus (statistical enrichment, anomaly detection, time-series imputation, metric × segment × period combinatorial factory; auto-selects Wilson / t-interval / normal CI), anycomp (comp models, market band math, stateless evaluation, auditable cycle runs), forecasting (Monte Carlo simulation, EVPI, discrete EVSI on aligned-chance decision trees), and Conductor (metadata-grounded SQL/Python codegen — the model sees schema, field semantics, and canonical metric definitions, so the queries it produces are construct-honest by construction) — plus a reserved namespace job-family-agent whose canonical home is meta-factory-prod. Each spoke owns its own schema, contract, and audit trail. Per-route structured logs plus per-tool audit rows in mcp.mcp_audit. Consumer apps vendor the typed contracts; the algorithms live here.
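
As an illustration of the confidence-interval auto-selection described for calculus, here is a minimal sketch. The selection rule is an assumption rather than the spoke's actual logic: Wilson for proportions, t-interval for means below a hypothetical cutoff of 30 observations, normal otherwise. Only the Wilson interval is implemented, using the standard library.

```python
import math
from statistics import NormalDist


def choose_ci_method(kind: str, n: int, small_n: int = 30) -> str:
    """Pick an interval method from the data shape (illustrative rule)."""
    if kind == "proportion":
        return "wilson"
    if kind == "mean":
        return "t-interval" if n < small_n else "normal"
    raise ValueError(f"unknown metric kind: {kind!r}")


def wilson_interval(successes: int, n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point error at the boundaries.
    return max(0.0, center - half), min(1.0, center + half)
```

For a small segment such as 8 favorable responses out of 10, the Wilson interval stays inside the 0 to 1 range and is visibly wide, which is the honest answer for small-N data.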

What's novel
  • 01Substrate, not product. Consumer apps (Performix, vela, future analytical products) vendor only the typed Zod contracts; the algorithms stay in the toolbox. Adoption is spoke-by-spoke; no all-or-nothing migration.
  • 02AI-native by construction. Every spoke is callable from AI agents over MCP (Model Context Protocol) without bespoke integration. Per-consumer auth, scope-restricted keys, fire-and-forget audit log. Performix migrated to MCP transport 2026-05-11 as the first external consumer; devplane operates the wildcard key.
  • 03Behavioral science in the algorithms, not bolted on. Reincarnation runs IRT a-parameter-weighted adaptive item selection with Cronbach α tracking. Preference-modeler runs MNL utility estimation on real MaxDiff/conjoint designs. Calculus auto-selects the right confidence-interval method by data shape. These are textbook psychometrics and choice theory implemented as service APIs.
  • 04Conductor: metadata-grounded codegen — the model sees schema, field semantics, and canonical metric definitions, so the SQL/Python it produces is construct-honest by construction, not example-grounded. Bridges the gap between an AI writing queries and an AI writing queries that respect what the constructs are supposed to mean.
  • 05Privacy is a service, not a setting. Data-anonymizer is cross-cutting — every spoke that surfaces team-level rollups calls min-N-check before responding. Anonymity-gated aggregations return blocked status below the floor; tokenization is deterministic and cache-backed; substitution strategies (mask / pseudonymize / synthetic-realistic) live in a registry.
  • 06Systems × survey × behavioral-science join is first-class. Segmentation-studio normalizes HRIS canonical fields; data-anonymizer makes the join safe under k-anonymity; calculus enriches the joined records into MetricEnvelope objects. The same envelope shape carries data from a Workday extract, a survey response, or a derived rollup — consumers don't care which.
  • 07Explicit contract versioning. Every spoke ships CONTRACT_VERSION; every additive change is a semver bump; every breaking change is a major bump with affected-consumer notes. Consumers vendor a copy and re-vendor on major bumps. The deploy boundary is clean.
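
Two of the privacy primitives above, deterministic tokenization and the min-N gate, can be sketched with the standard library. This is a minimal illustration under stated assumptions: `tokenize` and `gated_mean` are hypothetical names, the floor of 5 is an arbitrary example value, and the response shape is not the spoke's actual contract.

```python
import hashlib
import hmac


def tokenize(value: str, secret: bytes) -> str:
    """Deterministic pseudonym: the same value and secret always give the same token."""
    return hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()


def gated_mean(values: list[float], min_n: int = 5) -> dict:
    """Return a team-level mean only when the group meets the anonymity floor."""
    if len(values) < min_n:
        # Blocked responses carry no statistic and no group size.
        return {"status": "blocked", "min_n": min_n}
    return {"status": "ok", "n": len(values), "mean": sum(values) / len(values)}
```

Because the token depends on a secret key rather than a plain hash, records can be joined across extracts by token without the token being reproducible by anyone who lacks the key.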
Outcome

Eight live spokes (reincarnation, preference-modeler, data-anonymizer, segmentation-studio, calculus, anycomp, forecasting, Conductor) plus one reserved namespace (job-family-agent). MCP gateway + HTTP routes shared across the spoke set; all health endpoints green; mcp.mcp_audit writing real rows; database migrations auto-run on Vercel production builds. First external consumer (Performix) migrated to MCP transport 2026-05-11. Solo build.

The toolbox exists because every HR analytics product I worked with kept re-implementing the same five things (anonymization, metric definitions, segmentation, surveys, decision support) and getting each one slightly wrong. Building them once, well, and letting verticals consume them is the architectural bet. Two things changed in the last twelve months that sharpened the bet. The first was the narrowing decision: an earlier roster of broader spokes was cut down to the eight that actually pull weight at production scale. The second was MCP. Once the algorithms can be called directly from AI agents (typed, scoped, audited), the toolbox is no longer a back-end someone else's UI sits on top of; it is the legible service substrate that both engineering teams and AI consumers compose against. Underneath all of that the original measurement bet is unchanged: HR analytics works only when behavioral science and statistical rigor are first-class. Constructs defined defensibly. Anonymity thresholds enforced in the contract, not in a settings page. Decisions getting value-of-information treatment rather than dashboard intuition. Small-N segmentations handled honestly rather than averaged into uselessness. The architecture is what makes one operator productive at the scale of a software company.
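
The value-of-information treatment mentioned above can be illustrated with the textbook definition of EVPI on a discrete decision: the expected payoff when the state is revealed before acting, minus the best expected payoff when committing to one action up front. This is a minimal sketch, not the forecasting spoke's implementation. The function names, actions, states, probabilities, and payoffs are all hypothetical.

```python
def expected_value(action_payoffs: dict[str, float], probs: dict[str, float]) -> float:
    """Expected payoff of one action across uncertain states."""
    return sum(probs[state] * action_payoffs[state] for state in probs)


def evpi(payoffs: dict[str, dict[str, float]], probs: dict[str, float]) -> float:
    """Expected value of perfect information for a discrete decision."""
    # Best result when committing to a single action before the state is known.
    best_without_info = max(expected_value(p, probs) for p in payoffs.values())
    # Best result when the state is revealed first and the action chosen after.
    best_with_info = sum(
        probs[state] * max(p[state] for p in payoffs.values()) for state in probs
    )
    return best_with_info - best_without_info


# Hypothetical decision: launch a program or hold, under uncertain uptake.
probs = {"high_uptake": 0.6, "low_uptake": 0.4}
payoffs = {
    "launch": {"high_uptake": 100.0, "low_uptake": -50.0},
    "hold": {"high_uptake": 0.0, "low_uptake": 0.0},
}
```

With these example numbers, launching is worth 40 in expectation and perfect information is worth 60, so `evpi(payoffs, probs)` comes to 20. A study that costs more than that cannot pay for itself on this decision.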