
Architecture

Overview

http://bit.ly/2iJuFky

Puma is a threaded web server that processes requests over a TCP or UNIX socket.

Workers accept connections from the socket, and a thread in the worker's thread pool processes each client's request.
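The hand-off from accepted connections to a thread pool can be modelled in a few lines of plain Ruby. This is only an illustrative sketch, not Puma's implementation: the `serve` method, `POOL_SIZE`, and the use of strings in place of real sockets are all assumptions made for the example.

```ruby
# Illustrative model only: plain Ruby threads and a Queue, not Puma's internals.
# Strings stand in for accepted client connections.

POOL_SIZE = 3 # hypothetical pool size

# Runs a pool of threads that pop "connections" off a shared queue and
# turn each one into a response string. Returns the responses, sorted.
def serve(connections, pool_size: POOL_SIZE)
  todo = Queue.new      # accepted connections waiting for a free thread
  responses = Queue.new # thread-safe collection of results

  pool = Array.new(pool_size) do
    Thread.new do
      # Queue#pop returns nil once the queue is closed and empty.
      while (conn = todo.pop)
        responses << "handled #{conn}"
      end
    end
  end

  # Stand-in for the accept loop: hand every connection to the pool.
  connections.each { |conn| todo << conn }
  todo.close # no more connections; lets idle threads exit
  pool.each(&:join)

  # Sort because threads may finish in any order.
  Array.new(responses.size) { responses.pop }.sort
end

puts serve(%w[client-a client-b client-c client-d])
```

The shared queue is what decouples accepting from processing: whichever thread is free next picks up the next connection, so the completion order is not guaranteed.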

The diagrams and discussion here cover clustered mode. Single mode is analogous to running a single worker process.

Connection pipeline

http://bit.ly/2zwzhEK

Disabling queue_requests

http://bit.ly/2zxCJ1Z

The queue_requests option is true by default, enabling the separate thread used to buffer requests as described above.

If set to false, connections are not buffered while Puma waits for the request to arrive. Instead, an accepted connection is added to the "todo" queue immediately, and a worker thread synchronously does any waiting needed to read the HTTP request from the socket.
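As a configuration sketch, the option can be turned off in a Puma config file. The file location and the worker and thread counts below are assumptions chosen for illustration; `workers`, `threads`, and `queue_requests` are methods of Puma's configuration DSL.

```ruby
# config/puma.rb (assumed location; values below are illustrative)
workers 2     # clustered mode with two worker processes
threads 1, 5  # each worker's pool: minimum 1, maximum 5 threads

# Skip the request-buffering thread: accepted connections go straight
# to the "todo" queue, and a pool thread blocks while reading the request.
queue_requests false
```

Because a pool thread then waits on the socket itself, a slow client can hold that thread for as long as its request takes to arrive.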