
    z-ai

    Browse models from z-ai

    6 models

Token counts shown below are tokens processed on OpenRouter.

• Z.AI: GLM 4.6V
      41M tokens

GLM-4.6V is a large multimodal model designed for high-fidelity visual understanding and long-context reasoning across images, documents, and mixed media. It supports up to 128K tokens, processes complex page layouts and charts directly as visual inputs, and integrates native multimodal function calling to connect perception with downstream tool execution. The model also enables interleaved image-text generation and UI reconstruction workflows, including screenshot-to-HTML synthesis and iterative visual editing (see the request sketch after this entry).

by z-ai · 131K context · $0.30/M input tokens · $0.90/M output tokens
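
Since GLM-4.6V accepts images directly, a screenshot-to-HTML request can pass the screenshot as an image part in an OpenAI-style chat message. Below is a minimal sketch using the OpenAI Python SDK pointed at OpenRouter's OpenAI-compatible endpoint; the model slug z-ai/glm-4.6v and the screenshot URL are assumptions for illustration, not confirmed values.

    from openai import OpenAI

    # OpenRouter exposes an OpenAI-compatible API; authenticate with an
    # OpenRouter API key. The model slug below is an assumption.
    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",
        api_key="YOUR_OPENROUTER_API_KEY",
    )

    # Send a screenshot plus an instruction. The image_url part may be a
    # public URL or a base64 data URI; this URL is a placeholder.
    completion = client.chat.completions.create(
        model="z-ai/glm-4.6v",  # assumed slug for Z.AI: GLM 4.6V
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Reconstruct this page as standalone HTML/CSS."},
                    {"type": "image_url", "image_url": {"url": "https://example.com/screenshot.png"}},
                ],
            }
        ],
    )
    print(completion.choices[0].message.content)

The same message shape covers the model's chart and document inputs; only the text instruction changes.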
• Z.AI: GLM 4.6
    1.06B tokens

Compared with GLM-4.5, this generation brings several key improvements:

- Longer context window: expanded from 128K to 200K tokens, enabling the model to handle more complex agentic tasks.
- Superior coding performance: higher scores on code benchmarks and better real-world performance in applications such as Claude Code, Cline, Roo Code, and Kilo Code, including improvements in generating visually polished front-end pages.
- Advanced reasoning: a clear improvement in reasoning performance, with support for tool use during inference (see the sketch after this entry), leading to stronger overall capability.
- More capable agents: stronger performance in tool-using and search-based agents, and more effective integration within agent frameworks.
- Refined writing: better alignment with human preferences in style and readability, and more natural behavior in role-playing scenarios.

by z-ai · 200K context · $0.40/M input tokens · $1.75/M output tokens
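
Tool use during inference follows the standard OpenAI-style tools parameter. A hedged sketch follows; the get_weather function, its schema, and the z-ai/glm-4.6 slug are illustrative assumptions rather than confirmed names.

    from openai import OpenAI

    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",
        api_key="YOUR_OPENROUTER_API_KEY",
    )

    # Declare one callable tool; the model decides whether to call it.
    # The function name and schema here are hypothetical.
    tools = [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ]

    completion = client.chat.completions.create(
        model="z-ai/glm-4.6",  # assumed slug for Z.AI: GLM 4.6
        messages=[{"role": "user", "content": "What's the weather in Berlin?"}],
        tools=tools,
    )

    # If the model chose to call the tool, the arguments arrive as a JSON string.
    msg = completion.choices[0].message
    if msg.tool_calls:
        print(msg.tool_calls[0].function.name, msg.tool_calls[0].function.arguments)

In a full agent loop, the tool result would be appended as a tool-role message and the conversation continued.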
• Z.AI: GLM 4.5V
    116M tokens

GLM-4.5V is a vision-language foundation model for multimodal agent applications. Built on a Mixture-of-Experts (MoE) architecture with 106B total parameters and 12B activated parameters, it achieves state-of-the-art results in video understanding, image Q&A, OCR, and document parsing, with strong gains in front-end web coding, grounding, and spatial reasoning. It offers a hybrid inference mode: a "thinking mode" for deep reasoning and a "non-thinking mode" for fast responses. Reasoning behavior can be toggled via the reasoning enabled boolean (see the sketch after this entry). Learn more in our docs.

by z-ai · 66K context · $0.48/M input tokens · $1.44/M output tokens
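
The thinking/non-thinking toggle maps to OpenRouter's reasoning request field. A minimal sketch, assuming the slug z-ai/glm-4.5v and that the reasoning object accepts an enabled boolean as the entry describes; it is passed via extra_body because the stock OpenAI SDK has no first-class parameter for it.

    from openai import OpenAI

    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",
        api_key="YOUR_OPENROUTER_API_KEY",
    )

    # Disable "thinking mode" for a fast answer; set enabled=True for deep
    # reasoning. The payload shape follows this entry's description of a
    # "reasoning enabled boolean" and is an assumption here.
    completion = client.chat.completions.create(
        model="z-ai/glm-4.5v",  # assumed slug for Z.AI: GLM 4.5V
        messages=[{"role": "user", "content": "In one line: what does OCR stand for?"}],
        extra_body={"reasoning": {"enabled": False}},
    )
    print(completion.choices[0].message.content)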
• Z.AI: GLM 4.5
    9.46B tokens

GLM-4.5 is our latest flagship foundation model, purpose-built for agent-based applications. It leverages a Mixture-of-Experts (MoE) architecture and supports a context length of up to 128K tokens. GLM-4.5 delivers significantly enhanced capabilities in reasoning, code generation, and agent alignment. It supports a hybrid inference mode with two options: a "thinking mode" designed for complex reasoning and tool use, and a "non-thinking mode" optimized for instant responses. Users can control the reasoning behaviour with the reasoning enabled boolean. Learn more in our docs.

by z-ai · 131K context · $0.35/M input tokens · $1.55/M output tokens
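
At the listed rates, the cost of a single request is straightforward to estimate from its token counts. A small sketch using GLM-4.5's prices above; the 8K-input/1K-output request is hypothetical.

    # Listed GLM-4.5 rates on OpenRouter, in USD per million tokens.
    INPUT_PER_M = 0.35
    OUTPUT_PER_M = 1.55

    def estimate_cost(input_tokens: int, output_tokens: int) -> float:
        """Estimated USD cost for one request at the listed rates."""
        return (input_tokens / 1_000_000) * INPUT_PER_M + (output_tokens / 1_000_000) * OUTPUT_PER_M

    # Hypothetical request: 8K prompt tokens, 1K completion tokens.
    print(f"${estimate_cost(8_000, 1_000):.6f}")  # ~$0.004350

The same arithmetic applies to any model in this list; only the two per-million rates change.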
• Z.AI: GLM 4.5 Air
    261M tokens

GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built for agent-centric applications. Like GLM-4.5, it adopts the Mixture-of-Experts (MoE) architecture but with a more compact parameter size. GLM-4.5-Air also supports hybrid inference modes, offering a "thinking mode" for advanced reasoning and tool use, and a "non-thinking mode" for real-time interaction. Users can control the reasoning behaviour with the reasoning enabled boolean. Learn more in our docs.

by z-ai · 131K context · $0.104/M input tokens · $0.68/M output tokens
• Z.AI: GLM 4 32B
    12.4M tokens

GLM 4 32B is a cost-effective foundation language model. It can efficiently perform complex tasks and has significantly enhanced capabilities in tool use, online search, and code-related intelligent tasks. It is made by the same lab behind the THUDM models.

by z-ai · 128K context · $0.10/M input tokens · $0.10/M output tokens