Use Cloud-Only Models
This guide explains how to configure and use FreeToken with cloud-only AI models, which have no local implementation. These models are useful when you need AI capabilities that exceed the limits of local hardware.
Overview
FreeToken supports both local and cloud-based AI models. While many models can run on-device, some advanced models are available only via cloud APIs. This guide focuses on setting up and using these cloud-only models effectively.
Completions with Cloud-Only Models
When you need to quickly run an advanced task against an AI model, a completion is often the best choice. Completions let you send a prompt to a cloud-based model and receive a response without creating chat threads or managing messages.
In this example, we'll summarize a large block of text using a cloud-only model. This is a common use case where local models may lack the context window or memory to handle large inputs effectively.
```swift
import FreeToken

// Assumes you have already set up your FreeToken configuration with your API key

// In this example, we'll assume this is a large block of text we want to summarize
let giantText = """
FreeToken is an open-source framework that enables developers
to build AI-powered applications that can run both on-device
and in the cloud. It provides a unified API for interacting
with various AI models, allowing seamless switching between
local and cloud-based processing. This flexibility ensures
optimal performance, privacy, and cost-efficiency for a wide
range of applications...
"""

// Create a completion request using a cloud-only model
let prompt = """
Summarize the following text into 2 paragraphs and make sure
to create a `key themes` section at the bottom:\n\n\(giantText)
"""

FreeToken.shared.generateCloudCompletion(
    prompt: prompt,
    modelCode: "llama_4_scout_cloud", // Find this code at https://console.freetoken.ai/ai_models
    success: { result in
        print("Summary: \(result.response)")
    },
    error: { err in
        print("Error generating completion: \(err.localizedDescription)")
    }
)
```
Chat with Cloud-Only Models
There are two primary ways to chat with cloud-only models:
1. Set your Agent to use a Cloud-Only Model
In the web console, you can create an agent and set its model to a cloud-only option. This way, all chat interactions with this agent will automatically use the specified cloud model.
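With the agent configured this way, a chat call doesn't need to name a model at all. The sketch below is a minimal example assuming that `runMessageThread` falls back to the agent's configured model when `modelCode` is omitted; check the API reference for the exact parameter defaults.

```swift
import FreeToken

// Assumes your FreeToken configuration is already set up, and the agent
// backing this thread was set to a cloud-only model in the web console.
// Assumption: omitting `modelCode` uses the agent's configured model.
FreeToken.shared.runMessageThread(
    threadId: "your-thread-id", // Replace with your actual thread ID
    success: { response in
        print("AI Response: \(response.text)")
    },
    error: { err in
        print("Error during chat: \(err.localizedDescription)")
    }
)
```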
2. Specify the Model Code in runMessageThread
This lets you dynamically choose a high-context cloud-only model for a specific chat thread (or even a single message) without creating separate agents.
```swift
import FreeToken

// Assumes you have already set up your FreeToken configuration
FreeToken.shared.runMessageThread(
    threadId: "your-thread-id", // Replace with your actual thread ID
    modelCode: "llama_4_scout_cloud", // Find this code at https://console.freetoken.ai/ai_models
    success: { response in
        print("AI Response: \(response.text)")
    },
    error: { err in
        print("Error during chat: \(err.localizedDescription)")
    }
)
```