Streaming responses using ring

This response is too big to wait until it is all generated before sending it back. The response time for the client is bad. Can we start sending some of it earlier?

I needed to send a large list of files to a browser and have it show a directory tree. Generating the entire list before responding was simply too slow. The user would click in their browser and wait... and wait... It was not a good user experience. Luckily, I could use streaming responses to start sending data as soon as a small portion was ready.

Ring streaming at the lowest level, InputStreams and OutputStreams

The lowest-level way is to send a ring response with a :body of type java.io.InputStream. An InputStream is designed to have data read from it, so putting data into one requires a bit of bookkeeping. However, ring provides ring.util.io/piped-input-stream, which handles this bookkeeping and hands you an OutputStream to write to. An OutputStream is designed to have data written into it, and works with functions like clojure.core/spit. It is one of the lower-level io abstractions in Java, and works at the byte level.
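To make that bookkeeping concrete, here is a minimal sketch of the kind of plumbing such a helper has to do, using only JDK classes. It is not ring's actual implementation; the name piped-sketch is made up for illustration, and it assumes the writing function is happy to run on its own thread.

```clojure
(import '[java.io OutputStream PipedInputStream PipedOutputStream])

(defn piped-sketch
  "Hypothetical helper: returns an InputStream whose contents are produced
  by calling (write-fn out) with a connected OutputStream on another thread."
  [write-fn]
  (let [in     (PipedInputStream.)
        out    (PipedOutputStream. in)   ; whatever is written to out can be read from in
        worker (Thread. (fn []
                          (try
                            (write-fn out)
                            ;; Closing the pipe is what signals end-of-stream to the reader.
                            (finally (.close out)))))]
    (.start worker)
    in))

(def result
  (slurp (piped-sketch
          (fn [^OutputStream out]
            (.write out (.getBytes "hello" "UTF-8"))))))
;;=> result is "hello"
```

The writer runs on a separate thread because a pipe only has a small fixed-size buffer: the writer blocks once it is full, so something else has to be reading at the same time.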

However, many libraries will want something that understands more than bytes. For example, clojure.data.xml's clojure.data.xml/emit and cheshire's cheshire.core/generate-stream both expect a java.io.Writer. A Writer is able to work at the Character and String level. Clojure provides a function, clojure.java.io/writer, to convert an OutputStream into a Writer.

With piped-input-stream and writer we can start putting together a streaming response. Imagine we have a function directory-list which returns an XML structure, such as the one produced by clojure.data.xml/parse. We'd prefer not to hold both the entire parsed structure and a string of the whole response in memory, because of space and time constraints.

Attempt 1: Sending a large XML response

(require '[clojure.data.xml :as xml])
(require '[clojure.java.io :as io])
(require '[ring.util.io :as ring-io])

(defn directory-list []
  ;; A small xml for demonstration.  Imagine something much larger
  (xml/parse (java.io.StringReader.
              "<?xml version=\"1.0\" encoding=\"UTF-8\"?><foo><bar><baz>The baz value</baz></bar></foo>")))

(defn directory-stream []
  (ring-io/piped-input-stream
    #(xml/emit (directory-list)
               (io/make-writer % {}))))

This looks like it should work, but if we test it out at the repl we can see a problem.

(slurp (directory-stream))
;;=>  ""

When we try reading from the stream it is empty! This happens because of the intermediate Writer we created. It buffers data so that it can send it efficiently. piped-input-stream will close the underlying OutputStream when it is done with it, but nothing tells the Writer that it is done, so anything still sitting in its buffer is lost. To fix this, we need to call java.io.Writer#flush after the emit.

(defn directory-stream []
  (ring-io/piped-input-stream
    #(let [writer (io/make-writer % {})]
       (xml/emit (directory-list) writer)
       (.flush writer))))

Then when testing it at a repl, we can get the entire response.

(slurp (directory-stream))
;;=>  "<?xml version=\"1.0\" encoding=\"UTF-8\"?><foo><bar><baz>The baz value</baz></bar></foo>"
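The buffering can also be seen in isolation, without ring or XML involved. The following is a small sketch that wraps a ByteArrayOutputStream in a Writer and checks how many bytes have reached it before and after the flush; the name flush-demo is just for illustration.

```clojure
(require '[clojure.java.io :as io])
(import '[java.io ByteArrayOutputStream Writer])

(def flush-demo
  (let [sink      (ByteArrayOutputStream.)
        ^Writer w (io/writer sink)]      ; buffered Writer on top of the byte stream
    (.write w "hello")
    (let [before (.size sink)]           ; bytes that have reached the stream so far
      (.flush w)
      {:before-flush before
       :after-flush  (.size sink)})))
;;=> flush-demo is {:before-flush 0, :after-flush 5}
```

Closing the Writer (for example with with-open) also flushes it, so that is another way to make sure nothing is left behind in the buffer.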

Complete XML streaming example

(ns z.core
  (:require [ring.adapter.jetty :as jetty]
            [clojure.data.xml :as xml]
            [clojure.java.io :as io]
            [ring.util.io :as ring-io]
            [ring.util.response :as response])
  (:import java.net.URL))

(defn data
  "Download a 5MB file and parse it"
  []
  (-> "http://www.cs.washington.edu/research/xmldatasets/data/tpc-h/orders.xml"
      URL.
      .openStream
      xml/parse))

(defn send-xml [request]
  (if (= (:uri request) "/stream")
    (response/response
     (ring-io/piped-input-stream
      #(->> (io/make-writer % {})
            (xml/emit (data))
            .flush)))
    (response/response
     (xml/emit-str (data)))))

(comment
  (def server (jetty/run-jetty #'send-xml {:port 8080 :join? false}))
  (.stop server))

If we start a server as shown in the comment and use curl to check it, we can see that http://localhost:8080/stream begins streaming "quickly", whereas http://localhost:8080 waits until the entire response is generated.