ModelScopeChatEndpoint
ModelScope (Home | GitHub) is built upon the notion of “Model-as-a-Service” (MaaS). It seeks to bring together most advanced machine learning models from the AI community, and streamlines the process of leveraging AI models in real-world applications. The core ModelScope library open-sourced in this repository provides the interfaces and implementations that allow developers to perform model inference, training and evaluation.
This will help you getting started with ModelScope Chat Endpoint.
Overview
Integration details
| Provider | Class | Package | Local | Serializable | Package downloads | Package latest | 
|---|---|---|---|---|---|---|
| ModelScope | ModelScopeChatEndpoint | langchain-modelscope-integration | ❌ | ❌ | 
Setup
To access ModelScope chat endpoint you'll need to create a ModelScope account, get an SDK token, and install the langchain-modelscope-integration integration package.
Credentials
Head to ModelScope to sign up to ModelScope and generate an SDK token. Once you've done this set the MODELSCOPE_SDK_TOKEN environment variable:
import getpass
import os
if not os.getenv("MODELSCOPE_SDK_TOKEN"):
    os.environ["MODELSCOPE_SDK_TOKEN"] = getpass.getpass(
        "Enter your ModelScope SDK token: "
    )
Installation
The LangChain ModelScope integration lives in the langchain-modelscope-integration package:
%pip install -qU langchain-modelscope-integration
Instantiation
Now we can instantiate our model object and generate chat completions:
from langchain_modelscope import ModelScopeChatEndpoint
llm = ModelScopeChatEndpoint(
    model="Qwen/Qwen2.5-Coder-32B-Instruct",
    temperature=0,
    max_tokens=1024,
    timeout=60,
    max_retries=2,
    # other params...
)
Invocation
messages = [
    (
        "system",
        "You are a helpful assistant that translates English to Chinese. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]
ai_msg = llm.invoke(messages)
ai_msg
AIMessage(content='我喜欢编程。', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 3, 'prompt_tokens': 33, 'total_tokens': 36, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_name': 'qwen2.5-coder-32b-instruct', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-60bb3461-60ae-4c0b-8997-ab55ef77fcd6-0', usage_metadata={'input_tokens': 33, 'output_tokens': 3, 'total_tokens': 36, 'input_token_details': {}, 'output_token_details': {}})
print(ai_msg.content)
我喜欢编程。
Chaining
We can chain our model with a prompt template like so:
from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate(
    [
        (
            "system",
            "You are a helpful assistant that translates {input_language} to {output_language}.",
        ),
        ("human", "{input}"),
    ]
)
chain = prompt | llm
chain.invoke(
    {
        "input_language": "English",
        "output_language": "Chinese",
        "input": "I love programming.",
    }
)
AIMessage(content='我喜欢编程。', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 3, 'prompt_tokens': 28, 'total_tokens': 31, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_name': 'qwen2.5-coder-32b-instruct', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-9f011a3a-9a11-4759-8d16-5b1843a78862-0', usage_metadata={'input_tokens': 28, 'output_tokens': 3, 'total_tokens': 31, 'input_token_details': {}, 'output_token_details': {}})
API reference
For detailed documentation of all ModelScopeChatEndpoint features and configurations head to the reference: https://modelscope.cn/docs/model-service/API-Inference/intro
Related
- Chat model conceptual guide
- Chat model how-to guides