Parallelism in GraphQL-Ruby

It’s possible to get IO operations running in parallel with the graphql gem.

I haven’t tried this extensively, but I had to satisfy my curiosity!

Setup: Long-Running IO

Let’s say we have a GraphQL schema which has long-running IO- or system-bound tasks. Here’s a silly example where the long-running task is sleep:

require "graphql"

QueryType = GraphQL::ObjectType.define do
  name "Query"
  field :sleep, !types.Int, "Sleep for the specified number of seconds" do
    argument :for, !types.Int
    resolve ->(o, a, c) {
      sleep(a["for"])
      a["for"]
    }
  end
end

Schema = GraphQL::Schema.define do
  query(QueryType)
end

Let’s consider a query like this one:

query_str = <<-GRAPHQL
{
  s1: sleep(for: 3)
  s2: sleep(for: 3)
  s3: sleep(for: 3)
}
GRAPHQL

require "benchmark"

puts query_str

puts Benchmark.measure {
  Schema.execute(query_str)
}

How long will it take?

$ ruby graphql_parallel.rb
{
  s1: sleep(for: 3)
  s2: sleep(for: 3)
  s3: sleep(for: 3)
}
  0.000000   0.000000   0.000000 (  9.009428)

About 9 seconds: three sleep(3) calls in a row.

Working in Another Thread

The concurrent-ruby gem includes Concurrent::Future, which runs a block in another thread:

require "concurrent"

future = Concurrent::Future.execute do
  # This will be run in another thread
end


future.value
# => waits for the return value of the block
#    and returns it
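
For intuition, the same start-now, wait-later pattern can be sketched with only Ruby’s standard library. This is an analogy rather than a description of how concurrent-ruby works internally: Thread.new starts the block immediately, and Thread#value waits for the thread and returns the block’s result. The short sleep is an assumed stand-in for slow IO.

```ruby
# Start the work right away in a background thread.
worker = Thread.new do
  sleep(0.1) # stand-in for a slow IO call
  :done
end

# The main thread is free to do other work in the meantime.
main_result = 1 + 1

# Thread#value blocks until the thread has finished,
# then returns the value of its block.
worker_result = worker.value
```
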

We can use it to put our sleep(3) calls in different threads. There are two steps.

First, use a Concurrent::Future in the resolve function:

- sleep(a["for"])
- a["for"]
+ Concurrent::Future.execute {
+  sleep(a["for"])
+  a["for"]
+ }

Then, tell the Schema to handle Concurrent::Futures by calling #value on them:

 Schema = GraphQL::Schema.define do
   query(QueryType)
+  lazy_resolve(Concurrent::Future, :value)
 end
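
To see why the two steps work together, here is a rough sketch of the idea using plain threads: first call every resolver and collect the lazy objects they return, then call the registered method on each one. It is a simplified illustration, not graphql-ruby’s actual execution code. The names slow_field, start_fields, and resolve_lazies are hypothetical, and Thread stands in for Concurrent::Future.

```ruby
require "benchmark"

# Hypothetical resolver: returns a lazy object (here, a Thread) right away.
def slow_field(seconds)
  Thread.new do
    sleep(seconds)
    seconds
  end
end

# Phase 1: call every resolver without waiting on any of them.
def start_fields(durations)
  durations.transform_values { |seconds| slow_field(seconds) }
end

# Phase 2: call the registered method on each lazy object, in order.
def resolve_lazies(pending, method_name)
  pending.transform_values { |lazy| lazy.public_send(method_name) }
end

result = nil
elapsed = Benchmark.realtime do
  pending = start_fields({ "s1" => 0.2, "s2" => 0.2, "s3" => 0.2 })
  result = resolve_lazies(pending, :value)
end

puts result.inspect
puts format("%.2f seconds", elapsed)
```

Because all three fields are started before any of them is waited on, the elapsed time should be close to the longest single wait instead of the sum of all three.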

Finally, run the same query again:

$ ruby graphql_parallel.rb
{
  s1: sleep(for: 3)
  s2: sleep(for: 3)
  s3: sleep(for: 3)
}
  0.000000   0.000000   0.010000 (  3.011735)

🎉 Three seconds! Since the sleep(3) calls were in different threads, they were executed in parallel.

Real Uses

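Ruby can run IO operations in parallel: although MRI’s global interpreter lock lets only one thread run Ruby code at a time, a thread releases the lock while it waits on blocking IO. This includes filesystem operations and socket reads (e.g., HTTP requests and database operations).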

So, you could make external requests inside a Concurrent::Future, for example:

require "open-uri"

Concurrent::Future.execute {
  URI.open("http://wikipedia.org")
}

Or, make a long-running database call inside a Concurrent::Future:

Concurrent::Future.execute {
  DB.exec(long_running_sql_query)
}

Caveats

Switching threads incurs some overhead, so multithreading won’t be worth it for very fast IO operations.
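
A quick way to get a feel for that overhead is to time trivial work done inline against the same work done with one thread per call. This is only a rough sketch: the multiplication is an assumed stand-in for a "very fast" operation, the numbers will vary by machine, and concurrent-ruby’s own costs may differ from a raw Thread.new per call.

```ruby
require "benchmark"

N = 2_000
work = ->(i) { i * 2 } # stand-in for an operation that returns almost instantly

# Run the work directly on the main thread.
inline_result = nil
inline_time = Benchmark.realtime do
  inline_result = N.times.map { |i| work.call(i) }
end

# Run each call in its own thread, then collect the results in order.
threaded_result = nil
threaded_time = Benchmark.realtime do
  threads = N.times.map { |i| Thread.new { work.call(i) } }
  threaded_result = threads.map(&:value)
end

puts format("inline: %.4fs, threaded: %.4fs", inline_time, threaded_time)
```
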

GraphQL doesn’t know which resolvers will finish first. Instead, it starts each one, then waits on them in query order, blocking on the first field until it is finished. This means that subsequent long-running fields may have to wait longer than they “really” need to. For example, consider this query:

{
  sleep(for: 5)
  nestedSleep(for: 2) {
    sleep(for: 2)
  }
}

Even with multithreading, this would take about 7 seconds to execute. First, GraphQL would wait for sleep(for: 5), then it would get to nestedSleep(for: 2), which would have already finished, then it would execute sleep(for: 2).
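
The timeline can be simulated with plain threads. This sketch assumes the ordering described above and is not graphql-ruby’s real scheduler. SCALE and scaled_sleep are hypothetical helpers that shrink each "second" so the example finishes quickly.

```ruby
require "benchmark"

SCALE = 0.02 # one "second" in the query is 20 ms here

def scaled_sleep(seconds)
  sleep(seconds * SCALE)
  seconds
end

elapsed = Benchmark.realtime do
  # Both top-level fields are started up front.
  top_sleep    = Thread.new { scaled_sleep(5) }
  nested_sleep = Thread.new { scaled_sleep(2) }

  # Waiting happens in query order, so this blocks for the full 5 units...
  top_sleep.value
  # ...and by then nestedSleep finished long ago, so this returns at once.
  nested_sleep.value

  # Only now does the child field start, and it needs 2 more units.
  Thread.new { scaled_sleep(2) }.value
end

units = elapsed / SCALE
puts format("about %.1f units", units)
```

If the child field could start as soon as nestedSleep finished, the whole query would need only about 5 units: the 5-unit field would overlap with the 2 + 2 units of the nested branch.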

Conclusion

If your GraphQL schema is wrapping pre-existing HTTP APIs, using a technique like this could reduce your GraphQL response time.