This guide walks you through making your first API call to one of Simplismart’s pre-deployed models. Before starting, make sure you have signed up for a Simplismart account.

Prerequisites

  • A Simplismart account
  • Basic Python knowledge
  • Python 3.8+ installed on your system
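If you are unsure which interpreter your terminal picks up, a quick check along these lines can confirm the version requirement. This is only a sketch: the helper name `python_is_supported` is made up for illustration, and the minimum version is taken from the list above.

```python
import sys

# Minimum version taken from the prerequisites list above.
MIN_VERSION = (3, 8)


def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if python_is_supported():
        print("Python version OK")
    else:
        print("Python 3.8 or newer is required")
```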

Step-by-Step Guide

1

Access the Playground

  1. Log in to your Simplismart account
  2. From the left sidebar, click on Playground
  3. In the model dropdown, select Gemma 3 1B
  4. You’ll see an interactive chat interface where you can test the model directly
2

Get API Details

  1. In the Playground, click on Get API details in the left sidebar
  2. You’ll be redirected to a page with ready-to-use code snippets
  3. Note that both Python (OpenAI client) and cURL examples are provided
  4. Copy the provided code snippet, or use the one given below
The API is compatible with any OpenAI-compliant client library, not just the official Python SDK.
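To illustrate what "OpenAI-compliant" means in practice, the sketch below builds the same kind of request with only the Python standard library. It assumes the endpoint accepts the standard OpenAI conventions, that is, a `/chat/completions` path under the base URL and a Bearer token in the `Authorization` header; the helper name `build_chat_request` is hypothetical, and the request is built but not sent.

```python
import json
import urllib.request


def build_chat_request(endpoint_url, api_key, prompt, model="gemma-3"):
    """Build (but do not send) an OpenAI-style chat completion request.

    Assumption: the endpoint follows the OpenAI REST conventions for
    the path and the Authorization header.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,
    }
    return urllib.request.Request(
        url=endpoint_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; any HTTP client that can issue an equivalent POST request should work the same way.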
3

Create Your Python Script

Create a new file named inference.py with the following code:
# inference.py
from openai import OpenAI

# Replace with your actual API key from Settings > API Keys
simplismart_api_key = "YOUR_API_KEY"

# Replace with your endpoint for the Gemma-3-1B model from the Model details page 
endpoint_url = "YOUR_MODEL_ENDPOINT"

# Optional request identifier for tracking in logs
request_id = "e68343f3-7d0b-4b37-af9a-b0ad2c7eefc7"

try:
    # Initialize the OpenAI client with Simplismart endpoint
    client = OpenAI(
        api_key=simplismart_api_key,
        base_url=endpoint_url,        
        default_headers={"id": request_id}
    )
    
    # Define model and prompt
    MODEL_NAME = "gemma-3"
    PROMPT = "What is quantization in GenAI models?"    

    print(f"User: {PROMPT}\n")
    print("AI Assistant: ", end="", flush=True)

    # Create a streaming completion request
    stream = client.chat.completions.create(
        model=MODEL_NAME,
        messages=[
            {
                "role": "system",
                "content": "You are a helpful AI assistant."
            },
            {
                "role": "user", 
                "content": PROMPT
            }
        ],
        max_tokens=512,  # Response length limit
        stream=True,     # Enable streaming for faster first token
    )

    # Process the streamed response
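    for chunk in stream:
        # Skip chunks that carry no choices (some servers send these, e.g. for usage data)
        if not chunk.choices:
            continue
        text_delta = chunk.choices[0].delta.content
        if text_delta:
            print(text_delta, end="", flush=True)
    print()  # Add newline after response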

except Exception as e:    
    print(f"An unexpected error occurred: {e}")
Remember to replace YOUR_API_KEY and YOUR_MODEL_ENDPOINT with your actual API key (generated in the next step) and your model endpoint.
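If you would rather have the full reply as a single string than print it piece by piece, the streaming loop can be wrapped in a small helper. The sketch below is one possible way to do that; `collect_stream` and `fake_chunk` are illustrative names, and the fake chunks only mimic the `choices[0].delta.content` shape used in the script so the example runs without a network call.

```python
from types import SimpleNamespace


def collect_stream(stream):
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in stream:
        if not chunk.choices:  # ignore chunks that carry no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # the content can be None or empty on some chunks
            parts.append(delta)
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in object shaped like a streamed chunk (demo only)."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo_stream = [
    fake_chunk("Hello"),
    fake_chunk(None),
    SimpleNamespace(choices=[]),
    fake_chunk(", world"),
]
print(collect_stream(demo_stream))  # prints "Hello, world"
```

In the real script you would pass the `stream` object returned by `client.chat.completions.create(...)` instead of `demo_stream`.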
4

Generate an API Key

  1. Navigate to Settings > API Keys from the main sidebar
  2. Click Generate New Key
  3. Provide a descriptive name for your key (must be unique)
  4. Set an appropriate expiration date
  5. Copy the generated API key (you won’t be able to see it again)
Keep your API key secure and never expose it in client-side code or public repositories.
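One common way to keep the key out of your source files is to read it from an environment variable. The following is a minimal sketch, assuming you have exported the key yourself; the variable name `SIMPLISMART_API_KEY` and the helper `load_api_key` are chosen here for illustration and are not defined by Simplismart.

```python
import os


def load_api_key(env_var="SIMPLISMART_API_KEY"):
    """Read the API key from the environment instead of hardcoding it.

    The variable name is an assumption made for this example.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running the script."
        )
    return key
```

In inference.py, you could then write `simplismart_api_key = load_api_key()` in place of the hardcoded placeholder.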
5

Run Your Script

  1. Install the OpenAI Python client if you haven’t already:
pip install openai
  2. Run your script:
python inference.py
  3. You should see the model’s response to your query streaming in your terminal!
Congratulations! 🎉 You’ve successfully made your first API call to a Simplismart model.

Understanding Shared vs. Dedicated Endpoints

In this quickstart, you used a shared endpoint, a pre-deployed model that’s available to all Simplismart users. While convenient for testing and development, shared endpoints have some limitations:

Next Steps

Ready to take your AI implementation further? Try these next steps:
I