What is PeopleAnalyst?

PeopleAnalyst is the front door for people-analytics research: 205+ works indexed and profiled, 40+ citation-grade findings extracted, and peer-reviewed behavioral science translated from academic to actionable — the missing manual for the people analytics you always meant to do.

What is people analytics?

People analytics is not a dashboard. It is behavioral science and statistical inference applied to workforce decisions — a discipline with its own methodology, spanning measurement, organizational design, talent, leadership, and analytics craft.

Why does AI in HR need measurement science?

AI is being deployed in high-stakes people decisions — hiring, performance, attrition — without the measurement science to evaluate whether it works or whom it harms. Construct validity, effect sizes, and criterion validity are the vocabulary for asking an AI vendor the right questions.

How is the research made accessible?

The evidence is indexed and searchable: 205+ works, 40+ citation-grade insight cards, and 8 research arcs, so the right finding reaches the right decision at the right time.

What separates good people measurement from assertion?

Good measurement has a method: construct validity, reliability, and effect-size interpretation are not optional — they are what separates evidence from assertion.

library / lib819d3ddf552fcc6d

AI Agents and Applications (with LangChain, LangGraph, and MCP)

Roberto Infante · 2025

In a sentence

A hands-on developer guide that takes you from LLM prompt basics through advanced RAG, multi-tool agents, multi-agent systems, and the Model Context Protocol using LangChain, LangGraph, and LangSmith.

This book is a comprehensive, code-driven journey through the full spectrum of LLM-powered application development. Beginning with the fundamentals of prompt engineering and the OpenAI API, it progressively builds toward sophisticated architectures: summarization engines, Q&A chatbots grounded in private knowledge bases via Retrieval-Augmented Generation, and finally autonomous multi-tool AI agents and multi-agent systems. Each concept is illustrated with practical, runnable Python examples centered on a travel industry theme. The book covers LangChain's modular component model, LangGraph's stateful graph-based agent framework, LangSmith's observability tooling, and the emerging Model Context Protocol (MCP) standard. Readers learn not just how to wire components together but how to reason about trade-offs in chunk size, embedding strategy, query transformation, routing, and production concerns like memory and guardrails—making this an indispensable reference for any software developer building real-world AI applications.

The four lenses

Science
Statistics
Systems
Strategy

Tags

aisoftware-engineeringf1-systems

The model

A causal model describing how design levers in LLM application development—spanning prompt engineering, indexing strategy, query transformation, retrieval architecture, agent orchestration design, and observability—drive intermediate system states such as retrieval relevance and agent reasoning quality, which in turn determine outcome metrics including answer accuracy, hallucination rate, system reliability, and developer productivity.

Prompt Engineering Qualitydesign lever

The degree to which prompts are deliberately structured with clear persona, context, instructions, output format, tone, and in-context examples (zero-shot through few-shot and chain-of-thought) to guide the LLM toward accurate and appropriate outputs for a given task. High quality reflects well-crafted, task-specific prompt templates with appropriate examples and safety instructions such as instructing the model to admit ignorance rather than hallucinate.

Indexing Strategy Sophisticationdesign lever

The extent to which the content ingestion pipeline employs advanced techniques beyond naive fixed-size chunking, including appropriate splitting strategies (HTML header, markdown header, recursive character, sentence-based), multi-vector embedding schemes (child chunk embeddings, summary embeddings, hypothetical question embeddings), chunk expansion with adjacent context, and metadata enrichment. Higher sophistication means the vector store indexes each document in multiple semantically rich ways.

Query Transformation Sophisticationdesign lever

The degree to which the system applies pre-retrieval query reformulation techniques including rewrite-retrieve-read, multiple query generation, step-back abstraction, hypothetical document embeddings (HyDE), and single-step or multi-step decomposition of complex questions. Higher sophistication means the system does not pass raw user queries directly to the vector store but instead reformulates them to maximize semantic alignment with indexed content.

Retrieval Architecture Breadthdesign lever

The extent to which the application draws on multiple heterogeneous content stores beyond a single vector database, including relational SQL databases, graph databases, and metadata-filtered vector search, with appropriate routing logic to direct each query to the most relevant store and post-processing techniques such as reciprocal rank fusion to rerank results from multiple sources.

Agent Orchestration Design Qualitydesign lever

The quality of the agent architecture as measured by adoption of explicit typed state management (TypedDict state schemas), appropriate use of LangGraph's conditional edges and node functions, clear tool descriptions and system prompts that guide LLM tool selection, use of proven patterns such as ReAct and supervisor, and avoidance of brittle linear chains for tasks requiring dynamic decision-making. Encompasses both single-agent and multi-agent system design.

Observability and Tracing Practicedesign lever

The extent to which the development team instruments the LLM application with end-to-end tracing using tools such as LangSmith, enabling inspection of every LLM call, tool invocation, retrieved document, and message flow. High observability practice means traces are captured for all environments, examined during development, and used to diagnose issues in production rather than relying on surface-level outputs alone.

MCP Integration Adoptiondesign lever

The degree to which the agent application leverages the Model Context Protocol to discover and consume remote tools published by MCP servers, rather than maintaining bespoke per-service tool wrappers. Higher adoption means external capabilities are accessed through standardized MCP clients (FastMCP, LangChain MultiServerMCPClient) and the agent tool registry is populated dynamically from MCP server tool catalogs.

Retrieval Relevancepsychological state

The degree to which the document chunks returned by the retrieval stage of a RAG pipeline are semantically pertinent to the user's question and contain sufficient information for the LLM to synthesize a correct and complete answer. High retrieval relevance means the top-k results consistently contain the key facts needed, with minimal noise or off-topic content, and that context length is appropriate—neither too sparse to answer nor too verbose to confuse the model.

Agent Reasoning and Tool Selection Qualitypsychological state

The quality of the agent's intermediate reasoning steps, including its ability to correctly identify which tools to call, formulate accurate tool arguments, interpret tool outputs, know when to request additional tool calls versus when to synthesize a final answer, and avoid defaulting to internal model knowledge when tools should be used. High quality means the agent's chain of thought is coherent, grounded in retrieved evidence, and converges efficiently to a correct final answer.

Context Grounding Qualitypsychological state

The degree to which the LLM's generated responses are anchored to verified external context provided in the prompt (retrieved chunks, tool outputs, database records) rather than to the model's pre-trained parametric knowledge. High grounding means the model cites or refers to retrieved sources, stays within the boundaries of provided context, and explicitly acknowledges uncertainty rather than fabricating plausible-sounding but unverified information.

Answer Accuracyoutcome metric

The correctness and completeness of the final natural language response delivered to the user, measured against a ground truth reference. High accuracy means the answer correctly addresses all parts of the user's question, contains no factual errors, and does not omit critical information available in the knowledge base. Encompasses both factual precision and response completeness.

Hallucination Rateoutcome metric

The proportion of LLM responses that contain fabricated, unsupported, or factually incorrect information not present in the retrieved context or tool outputs. Lower is better. Hallucinations occur when the model fills gaps with plausible-sounding content drawn from parametric memory rather than grounded context, particularly when the retrieved context does not contain the answer.

System Reliability and Maintainabilityoutcome metric

The degree to which the LLM application or agent system behaves predictably, handles edge cases gracefully, can be debugged and extended without major refactoring, and remains stable as underlying models or data sources change. High reliability reflects modular component design, explicit state management, robust error handling, and the ability to swap LLMs or vector stores without rewriting application logic.

Developer Productivityoutcome metric

The speed and ease with which a software developer can design, implement, test, and iterate on LLM-powered applications and agent systems. High productivity reflects reduced boilerplate through framework abstractions (LangChain, LangGraph, pre-built agent components), faster debugging through observability tooling (LangSmith), and reuse of community resources (LangChain Hub, MCP ecosystem) rather than building every component from scratch.

How they connect

prompt engineering quality → influences context grounding quality
prompt engineering quality → influences agent reasoning quality
prompt engineering quality → influences answer accuracy
indexing strategy sophistication → predicts retrieval relevance
query transformation sophistication → predicts retrieval relevance
retrieval architecture breadth → influences retrieval relevance
retrieval relevance → predicts context grounding quality
retrieval relevance → predicts answer accuracy
context grounding quality → predicts answer accuracy
context grounding quality − predicts hallucination rate
agent orchestration design → predicts agent reasoning quality
agent reasoning quality → predicts answer accuracy
agent reasoning quality → influences system reliability
observability practice → influences agent reasoning quality
observability practice → influences system reliability
mcp integration → influences developer productivity
mcp integration → influences system reliability
indexing strategy sophistication → mediates context grounding quality
query transformation sophistication → mediates context grounding quality
prompt engineering quality − influences hallucination rate
agent orchestration design → predicts system reliability
answer accuracy − correlates hallucination rate

The process

The book provides an operational playbook for building increasingly sophisticated applications powered by Large Language Models (LLMs). The journey begins with foundational skills, such as executing prompts programmatically and performing text summarization on large or multiple documents. It then progresses to constructing a complete research summarization engine that automates web searching, scraping, and reporting, introducing the power of LangChain Expression Language (LCEL) for creating complex data pipelines. The core of the playbook is centered on two major architectural patterns. First, it details the end-to-end process of building a Retrieval-Augmented Generation (RAG) system for creating knowledge-based Q&A chatbots, covering everything from basic implementation to advanced techniques for indexing, query transformation, and routing. Second, it transitions from structured workflows to dynamic, autonomous systems by teaching how to build tool-using AI agents with LangGraph. This culminates in orchestrating multi-agent systems where specialized agents collaborate under the direction of a router or supervisor. Finally, the playbook extends these agents to the broader AI ecosystem by demonstrating how to build and consume standardized MCP servers, enabling seamless integration with remote tools.

Programmatic Prompt Execution

To interact with an LLM programmatically using its API directly or through the LangChain framework, forming the basic building block of any LLM application.

When to use: When you need to automate interactions with an LLM for tasks like text generation, classification, or simple Q&A.

Step 1Set up the development environment.
Entry: Python and a package manager (pip) are installed.
Exit: All required libraries are installed in an active virtual environment.
In: List of required packages (e.g., from requirements.txt) · Out: Activated virtual environment with installed dependencies
Step 2Obtain and configure API keys securely.
Entry: An account with the LLM provider is created.
Exit: The API key is accessible to the application code without being hardcoded.
In: LLM provider API key · Out: Configured API key in the application's environment
Step 3Instantiate the LLM client.
Entry: API key is configured.
Exit: An LLM client object is ready to use.
In: API key · Out: LLM client instance
Step 4Define and format the prompt.
Entry: The task for the LLM is clearly defined.
Exit: A complete prompt string or `PromptValue` object is created.
In: User input, Task instructions, Examples (for few-shot) · Out: Formatted prompt
Step 5Invoke the LLM and process the response.
Entry: An LLM client and a formatted prompt are available.
Exit: The LLM's generated text is extracted and ready for use.
In: Formatted prompt · Out: LLM response object, Extracted text content

Summarizing Large or Multiple Documents

To generate a concise summary from a single document that exceeds the LLM's context window, or from multiple separate documents.

When to use: When you need to condense long reports, articles, books, or a collection of related documents into a brief summary.

Step 1Load and split the document into chunks.
Entry: A large document file is available.
Exit: The document is split into a list of text chunks.
In: Large document · Out: List of text chunks
Step 2Create and execute a 'map' chain for individual summaries.
Entry: A list of text chunks is available.
Exit: A list of summaries, one for each chunk, is generated.
In: List of text chunks · Out: List of individual summaries
Step 3Create and execute a 'reduce' chain for the final summary.
Entry: A list of individual summaries is available.
Exit: A single, final summary of the entire document is generated.
In: List of individual summaries · Out: Final summary
Step 4Assemble and invoke the complete map-reduce chain.
Entry: All individual chains (split, map, reduce) are defined.
Exit: The final summary is returned.
In: Large document · Out: Final summary

Building a Research Summarization Engine

To create an automated system that takes a user query, performs web searches, scrapes the content, and generates a comprehensive summary report using LCEL.

When to use: When you need to generate a detailed report on a topic by gathering and summarizing information from the internet.

Step 1Set up the project and core functionality.
Entry: A clear research task is defined.
Exit: Utility functions for web search and scraping are implemented and tested.
Out: Web search function, Web scraping function
Step 2Create a query rewriting chain.
Entry: An LLM client is instantiated.
Exit: A chain that transforms a single question into a list of search queries is created.
In: User's research question · Out: List of specific web search queries
Step 3Perform web search and scraping for each query.
Entry: A list of web search queries is available.
Exit: A collection of scraped text content from multiple web pages is gathered.
In: List of web search queries · Out: List of scraped web page texts
Step 4Summarize individual web pages.
Entry: A list of scraped web page texts is available.
Exit: A list of summaries, one for each web page, is generated.
In: List of scraped web page texts · Out: List of individual summaries
Step 5Generate the final research report.
Entry: A list of individual summaries is available.
Exit: A final, formatted research report is generated.
In: List of individual summaries, Original user question · Out: Final research report
Step 6Assemble the master LCEL chain.
Entry: All sub-chains are defined and tested.
Exit: A single runnable chain for the entire research engine is created.
In: User's research question · Out: Final research report

Implementing a RAG System for Q&A

To build a question-answering system that grounds LLM responses in a private knowledge base, reducing hallucinations and improving the accuracy and trustworthiness of answers.

When to use: When you need to create an AI assistant that can accurately answer questions about your own data.

Step 1Load documents from sources.
Entry: A collection of source documents is available.
Exit: All source content is loaded into a list of `Document` objects.
In: Source files (PDF, TXT, etc.), Web URLs · Out: List of LangChain `Document` objects
Step 2Split documents into chunks.
Entry: A list of `Document` objects is available.
Exit: A list of smaller document chunks is created.
In: List of `Document` objects · Out: List of document chunks
Step 3Generate embeddings and store in a vector store.
Entry: A list of document chunks is available.
Exit: The vector store is populated with document chunks and their corresponding embeddings.
In: List of document chunks · Out: Populated vector store
Step 4Create a retriever from the vector store.
Entry: The vector store is populated.
Exit: A retriever object is ready to use.
In: Vector store instance · Out: Retriever object
Step 5Construct the RAG chain.
Entry: A retriever, prompt template, and LLM client are available.
Exit: A runnable RAG chain is created.
In: Retriever object, Prompt template, LLM client · Out: RAG chain
Step 6Incorporate conversational memory (optional).
Entry: A basic RAG chain is working.
Exit: The RAG chain can maintain context across multiple conversational turns.
In: User question, Previous chat history · Out: Grounded answer, Updated chat history
Step 7Invoke the chain and get an answer.
Entry: The RAG chain is created.
Exit: An answer is generated by the LLM based on retrieved documents.
In: User question · Out: Answer

Building a Tool-Based Agent with LangGraph

To create a dynamic agent that can reason, make decisions, and select from a set of available tools to accomplish a multi-step task.

When to use: When building applications that need to plan and execute a sequence of actions, such as a travel assistant that can search for flights and book hotels.

Step 1Define the agent's tools.
Entry: The agent's required capabilities are identified.
Exit: A list of tool-decorated Python functions is created.
In: Functional requirements for the agent · Out: List of tools
Step 2Define the agent's state.
Entry: The data that needs to be tracked across the agent's execution is identified.
Exit: An `AgentState` TypedDict is defined.
Out: AgentState class
Step 3Bind the tools to the LLM.
Entry: A list of tools and an LLM client are available.
Exit: An LLM instance capable of calling the specified tools is created.
In: List of tools, LLM client · Out: LLM with tools bound
Step 4Create the graph nodes.
Entry: The agent's state and the LLM with tools are defined.
Exit: Node functions for LLM interaction and tool execution are implemented.
In: AgentState · Out: Updated AgentState
Step 5Build and compile the agent graph.
Entry: Node functions are implemented.
Exit: A compiled, runnable LangGraph agent is created.
In: Node functions, AgentState class · Out: Compiled agent graph
Step 6Run the agent in a conversational loop.
Entry: The agent graph is compiled.
Exit: The user can interact with the agent conversationally.
In: User input · Out: Agent's response

Orchestrating a Multi-Agent System

To coordinate multiple specialized agents to handle complex user requests that span different domains, enabling collaboration and division of labor.

When to use: When building a comprehensive assistant that needs to combine capabilities like research, booking, and data analysis.

Step 1Build or define the specialist agents.
Entry: The distinct roles and capabilities required by the system are identified.
Exit: A list of runnable, specialized agent objects is available.
In: Agent roles and toolsets · Out: List of specialist agents
Step 2Instantiate and configure the supervisor agent.
Entry: A list of specialist agents is available.
Exit: A supervisor agent is instantiated and configured.
In: List of specialist agents, Supervisor system prompt · Out: Supervisor agent instance
Step 3Compile the multi-agent system.
Entry: The supervisor agent is configured.
Exit: A single, runnable multi-agent system is created.
In: Supervisor agent instance · Out: Compiled multi-agent system
Step 4Invoke the system with a complex user request.
Entry: The multi-agent system is compiled.
Exit: A comprehensive answer that leverages multiple agents is returned.
In: Complex user request · Out: Final answer

Exposing a Tool via an MCP Server

To make a tool or service available to AI agents through the standardized Model Context Protocol (MCP), promoting reusability and simplifying integration.

When to use: When you want to standardize how your service is accessed by AI agents, avoiding the need for every developer to write a custom wrapper.

Step 1Set up the MCP server environment.
Entry: The service or API to be exposed is identified.
Exit: The development environment is ready and dependencies are installed.
In: Service API key (if applicable) · Out: Configured project environment
Step 2Implement the tool logic.
Entry: The tool's functionality is clearly defined.
Exit: An async function that performs the desired action is implemented.
In: Tool parameters (e.g., location) · Out: Tool result (e.g., weather data)
Step 3Wrap the function as an MCP tool.
Entry: The tool logic is implemented in an async function.
Exit: The function is exposed as an MCP tool.
In: Tool function · Out: MCP-decorated tool
Step 4Run the MCP server.
Entry: The MCP application and tool are defined.
Exit: The MCP server is running and accessible at the specified address.
In: Server configuration (host, port, path) · Out: Running MCP server process
Step 5Test the server with an MCP client or inspector.
Entry: The MCP server is running.
Exit: The tool's functionality is successfully verified through the MCP server.
In: Test parameters · Out: Tool response

The story

The reader A software developer or technical practitioner who wants to build reliable, production-grade AI applications and agents using large language models but is overwhelmed by the rapidly evolving ecosystem and unsure how to move from toy demos to robust systems.

External problem

The developer needs to design and implement LLM-powered applications—summarization engines, Q&A chatbots, and autonomous agents—that work reliably on real data, but encounters brittle pipelines, poor retrieval quality, hallucinating models, and fragile integrations every time they try.

Internal problem

They feel uncertain and out of their depth in a field that changes daily, worried that the approach they are building will become obsolete or that they are missing critical techniques that separate toy demos from production systems.

Philosophical problem

Developers should not have to reinvent the same boilerplate infrastructure over and over, and AI systems should be grounded in verified knowledge rather than generating plausible-sounding falsehoods.

The plan

Master the fundamentals of prompt engineering and programmatic LLM interaction using the OpenAI API and LangChain.
Build summarization engines and research tools using LangChain chains, LCEL, map-reduce, and refine strategies.
Implement RAG from scratch to understand vector stores, embeddings, and retrieval before using LangChain abstractions.
Apply advanced indexing (parent-child chunks, multi-vector retrieval, hypothetical questions, summaries) and query transformations (rewrite, step-back, HyDE, decomposition) to achieve production-quality retrieval.
Extend RAG to heterogeneous data sources with metadata self-querying, SQL generation, knowledge graph querying, chain routing, and reciprocal rank fusion.
Transition from linear chains to stateful, conditional LangGraph workflows with explicit state management.
Build single-tool and multi-tool ReAct agents using LangGraph, understanding tool calling protocol, tool registration, and LLM guidance techniques.
Compose multi-agent systems using router and supervisor patterns, enabling specialized agents to collaborate on complex requests.
Build and consume MCP servers with FastMCP 2 and integrate them into agent applications alongside local tools.
Monitor, trace, and debug the entire system lifecycle with LangSmith.

Success

The reader confidently designs, builds, debugs, and maintains LLM-powered applications and multi-agent systems in production.
They understand the architectural trade-offs between engines, chatbots, and agents, and choose the right pattern for each use case.
Their RAG systems surface genuinely relevant context and rarely hallucinate, because they apply advanced indexing and query transformation techniques.
Their agents use tools reliably and adapt dynamically to user requests rather than relying on model memory.
They can integrate any external service into an agent via MCP without writing bespoke wrappers.
They have end-to-end observability through LangSmith and can diagnose issues quickly in production.

At stake

Without these techniques, developers continue building brittle, one-off pipelines that fail in production due to hallucinations, poor retrieval, and unmanageable orchestration code.
They waste time reinventing infrastructure that LangChain, LangGraph, and MCP already solve, slowing delivery and accumulating technical debt.
Their LLM applications erode user trust by confidently returning incorrect information instead of admitting uncertainty.
They miss the shift to agent-based architectures and MCP-standardized tooling, leaving them unable to participate in the rapidly growing ecosystem of AI-ready services.

Questions this book answers

What architectural patterns underpin LLM-based applications, chatbots, and agents?
How do you implement Retrieval-Augmented Generation (RAG) and what advanced techniques improve its accuracy?
How do you build, orchestrate, and debug tool-using AI agents with LangGraph?
How do you coordinate multiple specialized agents into a coherent multi-agent system?
What is the Model Context Protocol (MCP) and how do you build and consume MCP servers?

Related in the library

Tools these methods power