LeapOpenAIClient / leap-openai-client (introduced in v0.10.0) is a small, dependency-light client for any OpenAI-compatible chat-completions endpoint: OpenAI itself, OpenRouter, vLLM, llama-server, or your own proxy. It ships in the same SDK release as LeapSDK, so you can route requests between an on-device LFM and a cloud model from a single app.
When to use it
- Hybrid on-device + cloud routing. Run small, fast models on-device with LeapSDK; fall back to a larger cloud model for hard prompts.
- Standardised cloud API. Talk to any OpenAI-compatible backend without pulling in a heavier OpenAI SDK.
- Streaming first. SSE streaming is the only mode; non-streaming requests aren't exposed.
streamChatCompletion(...) forces stream = true on the outgoing request regardless of the stream field on the ChatCompletionRequest you pass in.
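As a sketch, that override is equivalent to copying the request with stream = true just before serialization. The type below is a minimal local stand-in, not the SDK's internal code:

```kotlin
// Minimal stand-in for the SDK's request type, just to illustrate the override.
data class ChatCompletionRequest(
    val model: String,
    val stream: Boolean = true,
)

// streamChatCompletion(...) behaves as if it did this before sending:
fun forceStreaming(request: ChatCompletionRequest): ChatCompletionRequest =
    request.copy(stream = true)
```

So even `ChatCompletionRequest(model = "gpt-4o-mini", stream = false)` goes out with `"stream": true`.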
Add the dependency
iOS / macOS (SPM)
Android (Gradle)
Kotlin/Native (Gradle)
Add the LeapOpenAIClient product to your target. See the Quick Start for the full SPM setup.
dependencies: [
.package(url: "https://github.com/Liquid4All/leap-sdk.git", from: "0.10.6")
]
targets: [
.target(
name: "YourApp",
dependencies: [
.product(name: "LeapOpenAIClient", package: "leap-sdk"),
]
)
]
In Swift sources, import LeapOpenAIClient. The Darwin (URLSession) Ktor engine is bundled; no extra HTTP setup needed.
dependencies {
implementation("ai.liquid.leap:leap-sdk:0.10.6")
implementation("ai.liquid.leap:leap-openai-client:0.10.6")
}
Bundles an OkHttp-engine Ktor client. No extra HTTP setup needed.
dependencies {
implementation("ai.liquid.leap:leap-sdk:0.10.6")
implementation("ai.liquid.leap:leap-openai-client:0.10.6")
}
Targets linuxX64, linuxArm64, and mingwX64 (Windows native). leap-openai-client does not publish a JVM target; leap-sdk does, but the OpenAI client lives in Android + Apple + native source sets only. If you need OpenAI-compatible chat completions from a JVM-only host, use any standard OpenAI client (e.g. the official OpenAI Java/Kotlin SDK, or OkHttp + JSON); the LeapSDK on-device runner has no JVM-specific cloud dependency.
Basic usage
Swift (iOS / macOS)
Kotlin (all platforms)
The leap-sdk-openai-client Kotlin module does not apply the SKIE plugin (only leap-sdk, leap-sdk-model-downloader, and leap-ui do). That means Flow<ChatCompletionEvent> is not bridged to a Swift AsyncSequence, and the onEnum(of:) helper is not generated for ChatCompletionEvent. Swift consumers must collect the Kotlin Flow through its native collector and downcast each event with as?. For most Swift apps that just need cloud chat completions, an off-the-shelf OpenAI Swift client is more ergonomic; use LeapOpenAIClient from Swift only if you need to share Kotlin code with Android.
Manual collection pattern (the Flow<ChatCompletionEvent>.collect(...) shape varies by Kotlin/Native version; check the framework header in your Xcode build for the exact label):
import LeapOpenAIClient
let client = OpenAiClientKt.openAiClient(
config: OpenAiClientConfig(
apiKey: "sk-…",
baseUrl: "https://api.openai.com/v1"
)
)
let request = ChatCompletionRequest(
model: "gpt-4o-mini",
messages: [
ChatMessage.System(content: "You are a helpful assistant."),
ChatMessage.User(content: "What is the capital of Japan?")
],
temperature: 0.7
)
// Pseudocode: actual collector signature depends on your Kotlin/Native version
// and framework headers. Without SKIE, there is no `for try await` integration.
try await client.streamChatCompletion(request: request).collect(
collector: FlowCollector { event in
if let delta = event as? ChatCompletionEvent.Delta {
print(delta.content, terminator: "")
} else if let done = event as? ChatCompletionEvent.Done {
if let usage = done.usage { print("\nTokens: \(usage.totalTokens)") }
} else if let err = event as? ChatCompletionEvent.Error {
print("\nError: \(err.message)")
}
}
)
client.close() // closes the underlying URLSession-backed HttpClient
import ai.liquid.leap.openai.ChatCompletionEvent
import ai.liquid.leap.openai.ChatCompletionRequest
import ai.liquid.leap.openai.ChatMessage
import ai.liquid.leap.openai.OpenAiClient
import ai.liquid.leap.openai.OpenAiClientConfig
val client = OpenAiClient(
config = OpenAiClientConfig(
apiKey = "sk-…",
baseUrl = "https://api.openai.com/v1",
)
)
val request = ChatCompletionRequest(
model = "gpt-4o-mini",
messages = listOf(
ChatMessage.System("You are a helpful assistant."),
ChatMessage.User("What is the capital of Japan?"),
),
temperature = 0.7,
)
client.streamChatCompletion(request).collect { event ->
when (event) {
is ChatCompletionEvent.Delta -> print(event.content)
is ChatCompletionEvent.Done -> event.usage?.let { println("\nTokens: ${it.totalTokens}") }
is ChatCompletionEvent.Error -> println("\nError: ${event.message}")
}
}
client.close()
Configuration
OpenAiClientConfig is a Kotlin data class bridged identically on every platform.
data class OpenAiClientConfig(
val apiKey: String,
val baseUrl: String = "https://api.openai.com/v1",
val chatCompletionsPath: String = "/chat/completions",
val extraHeaders: Map<String, String> = emptyMap(),
)
| Field | Default | Notes |
|---|---|---|
| apiKey | (required) | Sent as Authorization: Bearer <apiKey>. |
| baseUrl | https://api.openai.com/v1 | Override for OpenRouter, a self-hosted backend, etc. |
| chatCompletionsPath | /chat/completions | Appended to baseUrl. |
| extraHeaders | {} | Merged into every request, e.g. OpenRouter's HTTP-Referer. |
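To make the combination concrete, here is a self-contained sketch that mirrors the documented behavior (the exact slash handling inside the real client is an assumption; this is not its internal code). The request URL is baseUrl plus chatCompletionsPath, and extraHeaders are merged on top of the Authorization header:

```kotlin
// Local re-declaration of the config shape, so the sketch stands alone.
data class OpenAiClientConfig(
    val apiKey: String,
    val baseUrl: String = "https://api.openai.com/v1",
    val chatCompletionsPath: String = "/chat/completions",
    val extraHeaders: Map<String, String> = emptyMap(),
)

// Assumed composition: path appended to the base URL.
fun chatCompletionsUrl(config: OpenAiClientConfig): String =
    config.baseUrl.trimEnd('/') + config.chatCompletionsPath

// Assumed composition: extraHeaders merged after the auth header.
fun requestHeaders(config: OpenAiClientConfig): Map<String, String> =
    mapOf("Authorization" to "Bearer ${config.apiKey}") + config.extraHeaders
```

With the defaults, chatCompletionsUrl yields https://api.openai.com/v1/chat/completions.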
OpenRouter
Swift (iOS / macOS)
Kotlin (all platforms)
let client = OpenAiClient(
config: OpenAiClientConfig(
apiKey: "sk-or-…",
baseUrl: "https://openrouter.ai/api/v1",
extraHeaders: [
"HTTP-Referer": "https://yourapp.example.com",
"X-Title": "Your App"
]
)
)
val client = OpenAiClient(
OpenAiClientConfig(
apiKey = "sk-or-…",
baseUrl = "https://openrouter.ai/api/v1",
extraHeaders = mapOf(
"HTTP-Referer" to "https://yourapp.example.com",
"X-Title" to "Your App",
),
)
)
Self-hosted vLLM / llama-server
Swift (iOS / macOS)
Kotlin (all platforms)
let client = OpenAiClient(
config: OpenAiClientConfig(
apiKey: "anything", // Required by config but typically unused
baseUrl: "http://10.0.0.42:8000/v1"
)
)
val client = OpenAiClient(
OpenAiClientConfig(
apiKey = "anything",
baseUrl = "http://10.0.0.42:8000/v1",
)
)
Request shape
ChatCompletionRequest covers standard OpenAI fields plus a few OpenRouter-specific extensions. OpenRouter-only fields are silently ignored by stock OpenAI-compatible APIs.
data class ChatCompletionRequest(
val model: String,
val messages: List<ChatMessage>,
val temperature: Double? = null,
val topP: Double? = null,
val maxCompletionTokens: Int? = null, // Preferred for newer OpenAI versions
val maxTokens: Int? = null, // Legacy alias; some custom backends still require it
val frequencyPenalty: Double? = null,
val presencePenalty: Double? = null,
val stop: List<String>? = null,
val stream: Boolean = true,
// OpenRouter extensions:
val topK: Int? = null,
val repetitionPenalty: Double? = null,
val minP: Double? = null,
val topA: Double? = null,
val transforms: List<String>? = null,
val models: List<String>? = null,
val route: String? = null,
val provider: ProviderPreferences? = null,
)
ChatMessage (the OpenAI-client one, distinct from LeapSDK.ChatMessage) is a sealed type with three cases: System, User, Assistant.
Response shape
streamChatCompletion(request) returns a Flow<ChatCompletionEvent> on every platform (in Swift it surfaces as the raw Kotlin Flow, not an AsyncSequence, since the module is not SKIE-processed):
| Variant | Meaning |
|---|---|
| Delta(content: String) | Text chunk from the model. May be empty for role-only deltas. |
| Done(usage: Usage?) | Stream finished. usage is non-null when the API includes token counts. |
| Error(message: String) | HTTP error or stream parsing failure. |
data class Usage(val promptTokens: Int, val completionTokens: Int, val totalTokens: Int)
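A common pattern is folding the events into the full reply text. The sketch below re-declares the event and usage types locally so it stands alone; with the real client you would run the same when expression inside collect { }:

```kotlin
// Local stand-ins for the SDK's event and usage types.
data class Usage(val promptTokens: Int, val completionTokens: Int, val totalTokens: Int)

sealed interface ChatCompletionEvent {
    data class Delta(val content: String) : ChatCompletionEvent
    data class Done(val usage: Usage?) : ChatCompletionEvent
    data class Error(val message: String) : ChatCompletionEvent
}

// Fold a finished stream of events into the complete reply text.
fun accumulate(events: List<ChatCompletionEvent>): String = buildString {
    for (event in events) when (event) {
        is ChatCompletionEvent.Delta -> append(event.content)  // text chunk
        is ChatCompletionEvent.Done -> {}                      // stream finished
        is ChatCompletionEvent.Error -> error(event.message)   // surface failures
    }
}
```

For example, Delta("To"), Delta("kyo"), Done(null) accumulates to "Tokyo".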
Hybrid routing example
Route simple prompts to a small on-device LFM; escalate harder prompts to a cloud model.
Swift (iOS / macOS)
Kotlin (Android)
Kotlin (JVM / native)
import LeapModelDownloader
import LeapOpenAIClient
@MainActor
final class HybridChatViewModel: ObservableObject {
private let onDevice: Conversation
private let cloud: OpenAiClient
init(onDevice: Conversation, cloud: OpenAiClient) {
self.onDevice = onDevice
self.cloud = cloud
}
func send(_ text: String, useCloud: Bool) async throws {
if useCloud {
// Cloud path: leap-sdk-openai-client has no SKIE; collect the Kotlin
// Flow manually and downcast each event with `as?`.
let request = ChatCompletionRequest(
model: "gpt-4o-mini",
messages: [ChatMessage.User(content: text)]
)
try await cloud.streamChatCompletion(request: request).collect(
collector: FlowCollector { event in
if let delta = event as? ChatCompletionEvent.Delta {
appendChunk(delta.content)
}
}
)
} else {
// On-device path: leap-sdk has SKIE; `for try await` + `onEnum(of:)`
// work as written.
let userMessage = ChatMessage(role: .user, textContent: text)
for try await response in onDevice.generateResponse(message: userMessage) {
if case let .chunk(c) = onEnum(of: response) { appendChunk(c.text) }
}
}
}
private func appendChunk(_ text: String) { /* β¦ */ }
deinit { cloud.close() }
}
import ai.liquid.leap.Conversation
import ai.liquid.leap.message.MessageResponse
import ai.liquid.leap.openai.ChatCompletionEvent
import ai.liquid.leap.openai.ChatCompletionRequest
import ai.liquid.leap.openai.ChatMessage as CloudChatMessage
import ai.liquid.leap.openai.OpenAiClient
import ai.liquid.leap.message.ChatMessage
import ai.liquid.leap.message.ChatMessageContent
import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import kotlinx.coroutines.launch
class HybridChatViewModel(
private val onDevice: Conversation,
private val cloud: OpenAiClient,
) : ViewModel() {
fun send(text: String, useCloud: Boolean) = viewModelScope.launch {
if (useCloud) {
val request = ChatCompletionRequest(
model = "gpt-4o-mini",
messages = listOf(CloudChatMessage.User(text)),
)
cloud.streamChatCompletion(request).collect { event ->
if (event is ChatCompletionEvent.Delta) appendChunk(event.content)
}
} else {
val message = ChatMessage(
role = ChatMessage.Role.USER,
content = listOf(ChatMessageContent.Text(text)),
)
onDevice.generateResponse(message).collect { resp ->
if (resp is MessageResponse.Chunk) appendChunk(resp.text)
}
}
}
private fun appendChunk(text: String) { /* β¦ */ }
override fun onCleared() {
super.onCleared()
cloud.close()
}
}
suspend fun hybridSend(
onDevice: Conversation,
cloud: OpenAiClient,
text: String,
useCloud: Boolean,
) {
if (useCloud) {
val request = ChatCompletionRequest(
model = "gpt-4o-mini",
messages = listOf(CloudChatMessage.User(text)),
)
cloud.streamChatCompletion(request).collect { event ->
if (event is ChatCompletionEvent.Delta) print(event.content)
}
} else {
onDevice.generateResponse(text).collect { resp ->
if (resp is MessageResponse.Chunk) print(resp.text)
}
}
}
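The useCloud flag has to come from somewhere. One crude, illustrative heuristic (the character threshold and trigger phrases below are assumptions for the sketch, not SDK behavior) is to escalate long prompts, or prompts asking for work a small on-device model handles poorly:

```kotlin
// Illustrative router: escalate to cloud for long prompts or ones that ask
// for capabilities a small on-device model typically lacks. Tune per app.
fun shouldUseCloud(
    prompt: String,
    maxOnDeviceChars: Int = 400,
    cloudTriggers: List<String> = listOf("translate", "summarize this document", "write code"),
): Boolean {
    if (prompt.length > maxOnDeviceChars) return true
    val lower = prompt.lowercase()
    return cloudTriggers.any { it in lower }
}
```

Production apps often do better with a small classifier or the model's own self-assessment, but a deterministic rule like this is a reasonable starting point.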
See Cloud AI Comparison for a side-by-side feature breakdown.
Lifecycle
The platform OpenAiClient(config:) factory creates an HttpClient internally and ties it to the returned client; call close() when you're done.
Swift (iOS / macOS)
Kotlin (all platforms)
deinit { client.close() }
The lower-level constructor that accepts an externally-managed HttpClient is part of the Kotlin/Ktor surface and isn't a useful entry point from Swift; the Ktor engine machinery isn't bridged into the public Swift API. Use OpenAiClient(config:) and let the SDK own the session. If multiple consumers share a client, share the OpenAiClient instance and close() once at teardown.
override fun onCleared() {
super.onCleared()
client.close()
}
If you need to share an HttpClient across multiple clients (e.g., you already manage one for other Ktor-based code), use the lower-level constructor that takes a pre-built HttpClient; you then own its lifetime and shouldn't call close() on the OpenAiClient:
val shared = HttpClient(OkHttp) // your own instance
val client = OpenAiClient(config = config, httpClient = shared)
// Don't call client.close(); you own `shared` and decide when it dies