Use Cloud-Only Models

This guide explains how to configure and use FreeToken with cloud-only AI models that don't have local implementations. These models are useful when a task calls for capabilities beyond what on-device hardware can provide.

Overview

FreeToken supports both local and cloud-based AI models. While many models can run locally on device, some advanced models are only available via cloud APIs. This guide focuses on how to set up and use these cloud-only models effectively.

Completions with Cloud-Only Models

When you need to run a one-off advanced task against an AI model, a completion is often the best choice. Completions let you send a prompt to a cloud-based model and receive a response without creating chat threads or managing messages.

In this example, we'll be summarizing a large block of text using a cloud-only model. This is a common use case where local models may not have the necessary context window or memory to handle large inputs effectively.

import FreeToken

// Assumes you have already set up your FreeToken configuration with your API key

// In this example, we'll assume this is a large block of text we want to summarize
let giantText = """
FreeToken is an open-source framework that enables developers 
to build AI-powered applications that can run both on-device 
and in the cloud. It provides a unified API for interacting 
with various AI models, allowing seamless switching between 
local and cloud-based processing. This flexibility ensures 
optimal performance, privacy, and cost-efficiency for a wide 
range of applications...
"""

// Create a completion request using a cloud-only model
let prompt = """
Summarize the following text into 2 paragraphs and make sure 
to create a `key themes` section at the bottom:\n\n\(giantText)
"""

FreeToken.shared.generateCloudCompletion(
    prompt: prompt,
    modelCode: "llama_4_scout_cloud", // Find this code at https://console.freetoken.ai/ai_models
    success: { result in
        print("Summary: \(result.response)")
    },
    error: { err in
        print("Error generating completion: \(err.localizedDescription)")
    }
)
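If your app uses Swift concurrency, the callback-based call above can be wrapped in async/await. This is a minimal sketch that assumes the `generateCloudCompletion` signature and `result.response` field shown in the previous example; adjust the types to match the SDK's actual definitions.

```swift
import FreeToken

// Hypothetical async wrapper around the callback-based API shown above.
// Assumes `generateCloudCompletion(prompt:modelCode:success:error:)` and
// a `response` string on the result exist exactly as in the prior example.
extension FreeToken {
    func generateCloudCompletion(prompt: String, modelCode: String) async throws -> String {
        try await withCheckedThrowingContinuation { continuation in
            FreeToken.shared.generateCloudCompletion(
                prompt: prompt,
                modelCode: modelCode,
                success: { result in
                    continuation.resume(returning: result.response)
                },
                error: { err in
                    continuation.resume(throwing: err)
                }
            )
        }
    }
}

// Usage:
// let summary = try await FreeToken.shared.generateCloudCompletion(
//     prompt: prompt,
//     modelCode: "llama_4_scout_cloud"
// )
```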

Chat with Cloud-Only Models

There are two primary ways to chat with cloud-only models:

1. Set your Agent to use a Cloud-Only Model

In the web console, you can create an agent and set its model to a cloud-only option. This way, all chat interactions with this agent will automatically use the specified cloud model.
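With the agent configured this way, client code doesn't need to name a model at all. The sketch below assumes that omitting the `modelCode` parameter makes `runMessageThread` fall back to the agent's configured model; verify that default behavior against the SDK reference.

```swift
import FreeToken

// Assumes the agent tied to this thread was set to a cloud-only model
// in the web console, and that `modelCode` can be omitted so the
// agent's configured model is used (an assumption -- check the SDK docs).
FreeToken.shared.runMessageThread(
    threadId: "your-thread-id", // Replace with your actual thread ID
    success: { response in
        print("AI Response: \(response.text)")
    },
    error: { err in
        print("Error during chat: \(err.localizedDescription)")
    }
)
```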

2. Specify the Model Code in runMessageThread

This lets you dynamically choose large-context cloud-only models for specific chat threads (or even individual messages) without creating separate agents.

import FreeToken

// Assumes you have already set up your FreeToken configuration

FreeToken.shared.runMessageThread(
    threadId: "your-thread-id", // Replace with your actual thread ID
    modelCode: "llama_4_scout_cloud", // Find this code at https://console.freetoken.ai/ai_models
    success: { response in
        print("AI Response: \(response.text)")
    },
    error: { err in
        print("Error during chat: \(err.localizedDescription)")
    }
)