CodeWithSabir

Using the Claude API in Real Projects: A Practical Developer Guide

A hands-on guide to integrating the Claude API into real applications — covering streaming, tool use, prompt caching, system prompts, and production best practices.

Sabir Khaloufi · April 12, 2026 · 4 min read

Reading the docs and building a toy "ask a question, get an answer" demo takes about 20 minutes with the Claude API. Building something production-ready — with proper streaming, error handling, cost controls, and useful system prompts — takes a lot more thought.

This guide is about that second part. We'll cover the patterns that matter once you move past hello-world.

Setup

bash
npm install @anthropic-ai/sdk

typescript
// lib/claude.ts
import Anthropic from '@anthropic-ai/sdk'
 
export const claude = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY!,
})

Always initialize the client once and export it. Creating a new instance per request wastes resources and loses connection pooling benefits.
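
The non-null assertion (`!`) on the environment variable only silences the type checker; nothing is verified at runtime, so a missing key surfaces later as a failed API call. A minimal sketch of a fail-fast lookup, assuming a hypothetical helper named `requireEnv` (not part of the SDK) that takes the environment as a plain object so it is easy to test:

```typescript
// Hypothetical helper, not part of @anthropic-ai/sdk: read a required
// environment variable and fail at startup instead of at the first API call.
type Env = Record<string, string | undefined>

function requireEnv(env: Env, name: string): string {
  const value = env[name]
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`)
  }
  return value
}
```

In `lib/claude.ts` this could be wired in as `apiKey: requireEnv(process.env, 'ANTHROPIC_API_KEY')`.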

Streaming Responses

Nobody wants to wait 8 seconds for a response to appear all at once. Always stream for user-facing features:

typescript
// app/api/chat/route.ts (Next.js)
import { claude } from '@/lib/claude'
 
export async function POST(req: Request) {
  const { messages, systemPrompt } = await req.json()
 
  const stream = await claude.messages.create({
    model: 'claude-sonnet-4-6',
    max_tokens: 1024,
    system: systemPrompt,
    messages,
    stream: true,
  })
 
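  // Create the encoder once instead of once per chunk
  const encoder = new TextEncoder()

  // Return a ReadableStream to the client
  const readable = new ReadableStream({
    async start(controller) {
      try {
        for await (const event of stream) {
          if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
            controller.enqueue(encoder.encode(event.delta.text))
          }
        }
        // The upstream iterator is exhausted, so the response is complete
        controller.close()
      } catch (err) {
        // Surface upstream failures to the reader instead of leaving it hanging
        controller.error(err)
      }
    },
  })

  // No Transfer-Encoding header: the runtime handles chunking for streamed bodies
  return new Response(readable, {
    headers: {
      'Content-Type': 'text/plain; charset=utf-8',
    },
  })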
}

tsx
// components/ChatStream.tsx — consuming streamed response
'use client'
 
import { useState } from 'react'
 
export default function ChatStream() {
  const [response, setResponse] = useState('')
  const [loading, setLoading] = useState(false)
 
  async function sendMessage(message: string) {
    setLoading(true)
    setResponse('')
 
    const res = await fetch('/api/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        messages: [{ role: 'user', content: message }],
        systemPrompt: 'You are a helpful coding assistant.',
      }),
    })
 
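    // Stop early on HTTP errors or a missing body instead of crashing on res.body
    if (!res.ok || !res.body) {
      setResponse(`Request failed (${res.status})`)
      setLoading(false)
      return
    }

    const reader = res.body.getReader()
    const decoder = new TextDecoder()

    while (true) {
      const { done, value } = await reader.read()
      if (done) break
      // Decode outside the state updater (updaters can run more than once);
      // stream: true keeps multi-byte characters intact across chunk boundaries
      const chunk = decoder.decode(value, { stream: true })
      setResponse(prev => prev + chunk)
    }

    setLoading(false)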
  }
 
  return (
    <div>
      <button onClick={() => sendMessage('Explain async/await in JavaScript')}>
        Ask Claude
      </button>
      {loading && <p>Thinking...</p>}
      <div>{response}</div>
    </div>
  )
}
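
One detail that is easy to miss when decoding a byte stream chunk by chunk: a multi-byte UTF-8 character can be split across two chunks. A minimal sketch showing that passing `{ stream: true }` to `TextDecoder.decode` keeps such characters intact. The helper names `decodeChunks` and `splitBytes` are illustrative, not from any library:

```typescript
// Decode a sequence of byte chunks into one string. With { stream: true } the
// decoder buffers an incomplete multi-byte sequence until the next chunk arrives.
function decodeChunks(chunks: Uint8Array[]): string {
  const decoder = new TextDecoder('utf-8')
  let text = ''
  for (const chunk of chunks) {
    text += decoder.decode(chunk, { stream: true })
  }
  // A final call with no input flushes anything still buffered
  text += decoder.decode()
  return text
}

// Split the UTF-8 bytes of a string at an arbitrary byte offset,
// which may fall in the middle of a character.
function splitBytes(input: string, offset: number): Uint8Array[] {
  const bytes = new TextEncoder().encode(input)
  return [bytes.slice(0, offset), bytes.slice(offset)]
}
```

Decoding each chunk with a fresh, non-streaming decoder would instead emit replacement characters wherever a chunk boundary cuts a character in half.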

System Prompts: Your Most Powerful Lever

A weak system prompt produces generic responses. A strong one shapes every response in the conversation. Invest time here.

typescript
const CODE_REVIEW_SYSTEM_PROMPT = `You are a senior software engineer doing a thorough code review.
 
When reviewing code:
- Focus on correctness, security, and performance in that order
- Point out specific line numbers when referencing issues
- Explain WHY something is a problem, not just that it is one
- Suggest concrete improvements with example code
- Be direct but constructive — you're helping a colleague, not grading homework
 
Format your review as:
1. Summary (2-3 sentences)
2. Critical issues (must fix before merging)
3. Suggestions (nice to have)
4. What's done well
 
If there are no critical issues, say so clearly.`

typescript
const DOCUMENTATION_SYSTEM_PROMPT = `You are a technical writer who specializes in developer documentation.
 
Rules:
- Write for developers, not managers
- Use active voice
- Include runnable code examples for every concept
- Don't explain what something is — explain when and why to use it
- Keep paragraphs to 3 sentences maximum
- Use second person ("you") not third person ("the developer")`
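
Both prompts above share a shape: a role line, a list of rules, and sometimes a numbered output format. A minimal sketch of assembling prompts with that shape from data, assuming a hypothetical `buildSystemPrompt` helper and `PromptSpec` type (neither comes from the SDK):

```typescript
// Hypothetical helper: assemble a system prompt from a role line, a list of
// rules, and an optional numbered output format.
interface PromptSpec {
  role: string
  rules: string[]
  format?: string[]
}

function buildSystemPrompt(spec: PromptSpec): string {
  const sections = [spec.role.trim()]

  if (spec.rules.length > 0) {
    sections.push(['Rules:', ...spec.rules.map(rule => `- ${rule}`)].join('\n'))
  }

  if (spec.format && spec.format.length > 0) {
    const steps = spec.format.map((step, i) => `${i + 1}. ${step}`)
    sections.push(['Format your response as:', ...steps].join('\n'))
  }

  // Blank line between sections, mirroring the hand-written prompts
  return sections.join('\n\n')
}
```

Keeping prompts as data like this makes it easier to review changes to individual rules, though a hand-written template string works just as well for a single prompt.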

Tool Use: Giving Claude Capabilities

Tool use (function calling) lets Claude decide to call a function when it needs external data:

typescript
import Anthropic from '@anthropic-ai/sdk'
// Reuse the shared client from lib/claude.ts instead of creating a second instance
import { claude } from '@/lib/claude'
 
const tools: Anthropic.Tool[] = [
  {
    name: 'get_post_by_slug',
    description: 'Fetch a blog post by its URL slug. Use this when the user asks about a specific post.',
    input_schema: {
      type: 'object',
      properties: {
        slug: {
          type: 'string',
          description: 'The URL slug of the post, e.g. "nextjs-server-components"',
        },
      },
      required: ['slug'],
    },
  },
  {
    name: 'search_posts',
    description: 'Search blog posts by keyword. Use when the user asks to find posts about a topic.',
    input_schema: {
      type: 'object',
      properties: {
        query: { type: 'string', description: 'Search query' },
        limit: { type: 'number', description: 'Max results to return (default 5)' },
      },
      required: ['query'],
    },
  },
]
 
async function chatWithTools(userMessage: string) {
  const messages: Anthropic.MessageParam[] = [
    { role: 'user', content: userMessage },
  ]
 
  while (true) {
    const response = await claude.messages.create({
      model: 'claude-sonnet-4-6',
      max_tokens: 1024,
      tools,
      messages,
    })
 
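    // Any stop reason other than tool_use (end_turn, max_tokens, ...) ends the loop.
    // Checking only for end_turn would re-send the same request forever on max_tokens.
    if (response.stop_reason !== 'tool_use') {
      const textBlock = response.content.find(b => b.type === 'text')
      return textBlock?.type === 'text' ? textBlock.text : ''
    }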
 
    if (response.stop_reason === 'tool_use') {
      // Add Claude's response (with tool calls) to history
      messages.push({ role: 'assistant', content: response.content })
 
      // Execute each tool call
      const toolResults: Anthropic.ToolResultBlockParam[] = []
 
      for (const block of response.content) {
        if (block.type !== 'tool_use') continue
 
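        // getPostBySlug and searchPosts are your own data-access helpers (not shown here)
        let result: string
        if (block.name === 'get_post_by_slug') {
          const { slug } = block.input as { slug: string }
          const post = await getPostBySlug(slug)
          result = post ? JSON.stringify(post) : 'Post not found'
        } else if (block.name === 'search_posts') {
          // limit is declared in the schema; forward it if your helper accepts one
          const { query } = block.input as { query: string; limit?: number }
          const posts = await searchPosts(query)
          result = JSON.stringify(posts)
        } else {
          result = 'Unknown tool'
        }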
 
        toolResults.push({
          type: 'tool_result',
          tool_use_id: block.id,
          content: result,
        })
      }
 
      // Add tool results to history and continue
      messages.push({ role: 'user', content: toolResults })
    }
  }
}
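
As the number of tools grows, the if/else chain inside the loop gets hard to maintain, and a handler that throws will abort the whole conversation. A minimal sketch of a handler table that turns failures into tool results flagged with `is_error`, so the model can see what went wrong. The `runTool` function and the local `ToolUseBlock` / `ToolResult` types are stand-ins declared here so the sketch runs without the SDK; they only mirror the block shapes used above:

```typescript
// Local stand-ins for the tool_use / tool_result block shapes.
interface ToolUseBlock {
  type: 'tool_use'
  id: string
  name: string
  input: unknown
}

interface ToolResult {
  type: 'tool_result'
  tool_use_id: string
  content: string
  is_error?: boolean
}

type ToolHandler = (input: Record<string, unknown>) => Promise<unknown>

// Run one tool call against a handler table. Unknown tools and thrown errors
// come back as results flagged is_error instead of crashing the loop.
async function runTool(
  block: ToolUseBlock,
  handlers: Map<string, ToolHandler>,
): Promise<ToolResult> {
  const handler = handlers.get(block.name)
  if (!handler) {
    return {
      type: 'tool_result',
      tool_use_id: block.id,
      content: `Unknown tool: ${block.name}`,
      is_error: true,
    }
  }
  try {
    const input = (block.input ?? {}) as Record<string, unknown>
    const output = await handler(input)
    // JSON.stringify(undefined) is not a string, so fall back to null
    return { type: 'tool_result', tool_use_id: block.id, content: JSON.stringify(output ?? null) }
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err)
    return { type: 'tool_result', tool_use_id: block.id, content: message, is_error: true }
  }
}
```

Inside the loop, each `tool_use` block would then be passed through `runTool` and the results pushed as a single user message, as in the original code.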

Prompt Caching: Cut Costs on Long System Prompts

If your system prompt is long (documentation, a large context document), prompt caching can cut the cost of those repeated input tokens by up to 90%:

typescript
const response = await claude.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 1024,
  system: [
    {
      type: 'text',
      text: longSystemPrompt, // Your 2000-word system prompt
      cache_control: { type: 'ephemeral' }, // Cache this!
    },
  ],
  messages: [{ role: 'user', content: userMessage }],
})

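The first call processes the system prompt and writes it to the cache, which costs somewhat more than a normal uncached request. Subsequent calls within the cache TTL (5 minutes by default, refreshed on each cache hit) pay only the cache read price for the cached portion, which is about 10x cheaper than re-processing it. Output tokens and any uncached input are billed as usual, and prompts shorter than the model's minimum cacheable length are not cached at all.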
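
To see where the savings come from, here is a small worked calculation. It is a sketch under stated assumptions: the multipliers below (1.25x the base input price to write to the 5-minute cache, 0.1x to read from it) are assumptions to check against the current pricing page, and the function names are illustrative.

```typescript
// Assumed pricing multipliers relative to the base input-token price.
// Verify against current pricing before relying on these numbers.
const CACHE_WRITE_MULTIPLIER = 1.25
const CACHE_READ_MULTIPLIER = 0.1

// Cost of the system prompt alone over `calls` requests, in units of
// "one uncached pass over the prompt". Assumes every call after the first
// lands within the cache TTL.
function systemPromptCost(calls: number, cached: boolean): number {
  if (calls <= 0) return 0
  if (!cached) return calls
  return CACHE_WRITE_MULTIPLIER + (calls - 1) * CACHE_READ_MULTIPLIER
}

// Fraction saved by caching (negative when caching costs more).
function cachingSavings(calls: number): number {
  if (calls <= 0) return 0
  return 1 - systemPromptCost(calls, true) / systemPromptCost(calls, false)
}
```

Under these assumptions a single call costs 25% more with caching, two calls already come out ahead (1.35 units instead of 2), and the savings approach but never reach 90% as the number of cache hits grows.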

Error Handling in Production

typescript
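import Anthropic from '@anthropic-ai/sdk'
import { claude } from '@/lib/claude'

const sleep = (ms: number) => new Promise(resolve => setTimeout(resolve, ms))

// Note: the SDK client also retries some failures on its own (see its maxRetries option)
export async function callClaude(prompt: string, retries = 2, delayMs = 1000): Promise<string> {
  try {
    const response = await claude.messages.create({
      model: 'claude-sonnet-4-6',
      max_tokens: 1024,
      messages: [{ role: 'user', content: prompt }],
    })

    // Return the first text block rather than assuming it sits at index 0
    const textBlock = response.content.find(block => block.type === 'text')
    return textBlock?.type === 'text' ? textBlock.text : ''
  } catch (error) {
    // APIError is the SDK's base error class; status is undefined for connection failures
    if (error instanceof Anthropic.APIError) {
      const retryable = error.status === 429 || (error.status !== undefined && error.status >= 500)
      if (retryable && retries > 0) {
        await sleep(delayMs)
        // Double the delay on each attempt (exponential backoff)
        return callClaude(prompt, retries - 1, delayMs * 2)
      }
      throw new Error(`Claude API error: ${error.status} ${error.message}`)
    }

    throw error
  }
}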
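
When many requests fail at the same moment, retrying them all after the same delay sends them back in lockstep. A common remedy is exponential backoff with random jitter and an upper bound. A minimal sketch, assuming the illustrative helper names `isRetryableStatus` and `backoffDelayMs` and arbitrary default values (1 second base, 30 second cap):

```typescript
// Status codes worth retrying: 429 (rate limited) and any 5xx (server-side).
// A missing status is treated here as a connection failure and also retried.
function isRetryableStatus(status: number | undefined): boolean {
  if (status === undefined) return true
  return status === 429 || status >= 500
}

// Exponential backoff with "full jitter": pick a random delay between 0 and
// min(capMs, baseMs * 2^attempt). `attempt` starts at 0. The random source is
// injectable so the function can be tested deterministically.
function backoffDelayMs(
  attempt: number,
  baseMs = 1000,
  capMs = 30000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt)
  return Math.floor(random() * ceiling)
}
```

A retry loop would call `isRetryableStatus(error.status)` to decide whether to try again and sleep for `backoffDelayMs(attempt)` milliseconds before the next attempt.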

Common Mistakes

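1. Treating max_tokens as an afterthought. The Messages API requires it on every request, so the real risk is the value you pick: too low and responses get cut off (stop_reason comes back as max_tokens), too high and a runaway response burns through tokens.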

2. Putting conversation history in the system prompt. System prompts are for instructions, not history. Use the messages array for conversation turns.

3. Not handling stop_reason. Always check why the model stopped — end_turn vs max_tokens vs tool_use require different handling.

4. Building without observability. Log every API call in production: model, tokens used, latency, stop reason. You'll need this for debugging and cost tracking.

5. Exposing your API key client-side. Always proxy Claude calls through your backend. Never call the Anthropic API directly from the browser.
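
Mistakes 3 and 4 can be addressed together: decide what to do next from `stop_reason`, and record that decision alongside token usage and latency. A minimal sketch, assuming illustrative names (`nextStepFor`, `toLogRecord`, `ResponseSummary`) and a locally declared response shape that mirrors only the fields used here. The `stop_reason` values handled are a subset, so the default branch matters:

```typescript
type NextStep = 'done' | 'run_tools' | 'truncated' | 'unexpected'

// Map a stop_reason to the action the caller should take next.
function nextStepFor(stopReason: string | null): NextStep {
  switch (stopReason) {
    case 'end_turn':
    case 'stop_sequence':
      return 'done'
    case 'tool_use':
      return 'run_tools'
    case 'max_tokens':
      return 'truncated'
    default:
      // Unknown or missing reason: treat as unexpected rather than as success
      return 'unexpected'
  }
}

// Locally declared subset of a Messages API response.
interface ResponseSummary {
  model: string
  stop_reason: string | null
  usage: { input_tokens: number; output_tokens: number }
}

// Build one structured log record per API call.
function toLogRecord(response: ResponseSummary, startedAtMs: number, finishedAtMs: number) {
  return {
    model: response.model,
    stopReason: response.stop_reason,
    nextStep: nextStepFor(response.stop_reason),
    inputTokens: response.usage.input_tokens,
    outputTokens: response.usage.output_tokens,
    latencyMs: finishedAtMs - startedAtMs,
  }
}
```

Capturing `Date.now()` before and after the API call and passing both to `toLogRecord` gives a record that can be sent to whatever logger the project already uses.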

Key Takeaways

  • Always stream responses for user-facing features — it dramatically improves perceived performance
  • System prompts are your most powerful tool — invest time crafting them carefully
  • Tool use lets Claude pull real data instead of hallucinating it
  • Prompt caching cuts costs significantly for large, repeated system prompts
  • Handle rate limits and 5xx errors with retries and exponential backoff
  • Never expose your API key to the browser — always proxy through your backend

Written by

Sabir Khaloufi

Full-stack developer and tech blogger sharing in-depth tutorials on React, Next.js, AI, and modern web development.
