Getting chunked responses in a resolver with the new Forge runtime

Hi! I’m calling the OpenAI API from a resolver; the API supports streaming, achieved with chunked transfer encoding and server-sent events. The problem I run into is that I don’t get the response body as soon as it’s available, but only once the server closes the connection. It seems that something is buffering the response data. A minimal sketch of the call pattern is below. Has anyone experienced this issue?
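For context, here’s roughly what the resolver looks like. This is a sketch, not my exact code: the resolver key `streamCompletion`, the payload shape, and reading the body as a node-fetch-style readable stream are all illustrative assumptions.

      import Resolver from "@forge/resolver";
      import { fetch } from "@forge/api";

      const resolver = new Resolver();

      // "streamCompletion" and the payload shape are illustrative, not a real API.
      resolver.define("streamCompletion", async ({ payload }) => {
        const response = await fetch("https://api.openai.com/v1/chat/completions", {
          method: "POST",
          headers: {
            "Content-Type": "application/json",
            Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
          },
          body: JSON.stringify({
            model: "gpt-4",
            messages: [{ role: "user", content: payload.prompt }],
            stream: true, // server responds with chunked server-sent events
          }),
        });

        const chunks = [];
        // Assuming the body is a node-fetch-style readable stream: each SSE
        // chunk should arrive here as the server sends it, but in the new
        // runtime nothing is yielded until the server closes the connection.
        for await (const chunk of response.body) {
          chunks.push(chunk.toString());
        }
        return chunks.join("");
      });

      export const handler = resolver.getDefinitions();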


Same here. When I ran the code below, it took a long time before streaming started.

      import OpenAI from "openai";

      // openai-node client; in a real app the key should come from secure
      // storage rather than a string literal
      const openai = new OpenAI({
        apiKey: "XXXXXXX"
      });
      // Request a streamed chat completion; the result is an async iterable
      const stream = await openai.chat.completions.create({
        model: "gpt-4",
        messages: [{ role: "user", content: "Write a poem in 250 words" }],
        stream: true,
      });
      console.log("Start streaming: ");
      // Each chunk carries an incremental delta of the completion text
      for await (const part of stream) {
        process.stdout.write(part.choices[0]?.delta?.content || "");
      }

Same issue here. The stream only becomes available in the same timeframe as (or even later than) a normal non-streaming completion. With a stream, the chunks are printed separately, but with only about a millisecond between chunks, so it’s a completed answer being replayed, not the regular stream from OpenAI. :frowning:
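For anyone who wants to reproduce that measurement, here is a quick sketch that logs the gap between chunks, reusing the `stream` from the snippet above (the timing claims in the comment reflect what I observed, not guaranteed values):

      // Log the time between chunks. On a real OpenAI stream the gaps are
      // typically tens of milliseconds; a fully buffered answer being
      // replayed shows ~1 ms gaps instead.
      let last = Date.now();
      for await (const part of stream) {
        const now = Date.now();
        process.stdout.write(`\n+${now - last}ms ${part.choices[0]?.delta?.content || ""}`);
        last = now;
      }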