NOTE: The following documentation belongs in a spec. But for now, it’s our best attempt to document the design and protocol implemented here for integrating proxies with our RPC system. –nickm
Roughly speaking:
§Key concepts
A data stream is “RPC-visible” if, when it is created via a proxy connection, the RPC system is told about it.
Every RPC-visible stream is associated with a given RPC object when it is created.
(Since the RPC object is being specified in the proxy protocol,
it must be one with an externally visible Object ID.
Such Object IDs are cryptographically unguessable and unforgeable,
and are qualified with a unique identifier for their associated RPC session.)
Call this RPC Object the “target” object for now.
This target RPC object must implement
the ConnectWithPrefs special method.
Right now, there are two general kinds of objects that implement this method: client-like objects, and one-shot clients.
A client-like object is either a TorClient or an RPC Session.
It knows about, and is capable of opening, multiple data streams.
Using it as the target object for a proxy connection tells Arti
that the resulting data stream (if any)
should be built by it, and associated with its RPC session.
An application gets a TorClient by asking the session for one, or by asking a TorClient to give it a new variant clone of itself.
A one-shot client is an arti_rpcserver::stream::OneshotClient.
It is created from a client-like object, but can only be used for a single data stream.
When created, it is not yet connected or trying to connect to anywhere:
the act of using it as the target Object for a proxy connection causes
it to begin connecting.
An application gets a OneshotClient by calling arti:new_oneshot_client
on any client-like object.
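The difference between the two kinds of target can be modelled roughly as follows. This is a toy sketch, not Arti's code: the `Target` enum, its `connect` method, and `ConnectError` are hypothetical names, and treating a second use of a one-shot client as an error is an assumption (the text above only says it can be used for a single data stream). The real objects are driven through the ConnectWithPrefs special method.

```rust
/// Toy model of the two kinds of proxy target described above.
/// Every name here is hypothetical; this is not Arti's implementation.
#[derive(Debug)]
enum Target {
    /// A client-like object: may open any number of data streams.
    ClientLike { streams_opened: usize },
    /// A one-shot client: may be used for a single data stream only.
    Oneshot { used: bool },
}

#[derive(Debug, PartialEq, Eq)]
enum ConnectError {
    /// The one-shot client was already used as a proxy target (assumed behavior).
    AlreadyUsed,
}

impl Target {
    /// Stand-in for what happens when the object is named as a proxy target.
    /// No network activity takes place; only the bookkeeping is modelled.
    fn connect(&mut self, _host: &str, _port: u16) -> Result<(), ConnectError> {
        match self {
            Target::ClientLike { streams_opened } => {
                *streams_opened += 1;
                Ok(())
            }
            Target::Oneshot { used } => {
                if *used {
                    return Err(ConnectError::AlreadyUsed);
                }
                // The act of using it as a target is what starts the connection.
                *used = true;
                Ok(())
            }
        }
    }
}
```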
§The Proxy protocol
See the specification for SOCKS extended authentication for full details on integrating RPC with SOCKS. For HTTP integration, see the relevant section of prop365.
§Further restrictions on Object IDs and isolation
In some cases, the RPC Object ID may denote an object that already includes information about its intended stream isolation. In such cases, the stream isolation MUST be blank. Implementations MUST reject non-blank stream isolation in such cases.
In some cases, the RPC object ID may denote an object
that already includes information
about its intended destination address and port.
In such cases, the destination address MUST be 0.0.0.0 or ::
(encoded either as an IPv4 address, an IPv6 address, or a hostname)
and the destination port MUST be 0.
Implementations MUST reject other addresses in such cases.
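The two rejection rules above could be checked along these lines. This is a minimal sketch under stated assumptions: `TargetFixes`, `Rejection`, and `check_request` are hypothetical names, "blank" is taken to mean the empty string, and a hostname counts as the wildcard only if it parses as the literal `0.0.0.0` or `::`.

```rust
use std::net::IpAddr;

/// Which parts of the request the target object already determines.
/// (Hypothetical modelling type, not part of Arti.)
#[derive(Clone, Copy, Debug, Default)]
struct TargetFixes {
    isolation: bool,
    destination: bool,
}

#[derive(Debug, PartialEq, Eq)]
enum Rejection {
    NonBlankIsolation,
    NonWildcardAddress,
    NonZeroPort,
}

/// True for 0.0.0.0 or ::, whether it arrived as an address or as a hostname string.
fn is_unspecified(host: &str) -> bool {
    host.parse::<IpAddr>()
        .map(|ip| ip.is_unspecified())
        .unwrap_or(false)
}

/// Apply the restrictions on isolation and destination described above.
fn check_request(
    fixes: TargetFixes,
    isolation: &str,
    host: &str,
    port: u16,
) -> Result<(), Rejection> {
    if fixes.isolation && !isolation.is_empty() {
        return Err(Rejection::NonBlankIsolation);
    }
    if fixes.destination {
        if !is_unspecified(host) {
            return Err(Rejection::NonWildcardAddress);
        }
        if port != 0 {
            return Err(Rejection::NonZeroPort);
        }
    }
    Ok(())
}
```

A target that fixes neither property passes any isolation string and destination through unchanged.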
§Another proposed change
We could add a new method to clients, with a name like “open_stream” or “connect_stream”. This method would include all target and isolation information in its parameters. It would actually create a DataStream immediately, tell it to begin connecting, and return an externally visible object ID. The RPC protocol could be used to watch the DataStream object, to see when it was connected.
The resulting DataStream object could also be used as the target of a proxy connection. We would require in such a case that no isolation be provided in the proxy handshake, and that the target address was (e.g.) INADDR_ANY.
§Intended use cases (examples)
(These examples assume that the application already knows the proxy port it should use. I’m leaving out the isolation strings as orthogonal.)
These are NOT the only possible use cases; they’re just the two that help understand this system best (I hope).
§Case 1: Using a client-like object directly.
Here the application has authenticated to RPC
and gotten the session ID SESSION-1.
(In reality, this would be a longer ID, and full of crypto).
The application wants to open a new stream to www.example.com. They don’t particularly care about isolation, but they do want their stream to use their RPC session. They don’t want an Object ID for the stream.
To do this, they make a SOCKS connection to arti,
with target address www.example.com.
They set the username to <torS0X>0SESSION-1,
and the password to the empty string.
(Alternatively, they could use HTTP CONNECT, setting Tor-Rpc-Target to SESSION-1.)
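On the wire, the SOCKS5 username and password travel in the RFC 1929 sub-negotiation message (version byte 0x01, then length-prefixed username and password). The sketch below builds that message for the values used in this case. The function names are hypothetical, and the format-type character is passed in rather than hard-coded: the caller copies it from the example username here, and the SOCKS extended authentication specification remains the authority on its meaning. RFC 1929 nominally gives the password a length of 1 to 255; this sketch permits an empty one because the example above uses it.

```rust
/// Marker copied from the example usernames in this document.
const EXT_AUTH_PREFIX: &str = "<torS0X>";

/// Build the SOCKS username naming an RPC object.
/// `format_type` is the single character that follows the marker.
fn rpc_username(format_type: char, object_id: &str) -> String {
    format!("{}{}{}", EXT_AUTH_PREFIX, format_type, object_id)
}

/// Encode an RFC 1929 username/password message:
/// VER (0x01), ULEN, UNAME, PLEN, PASSWD.
/// Returns None if either field does not fit in a one-byte length.
fn socks5_auth_message(username: &str, password: &str) -> Option<Vec<u8>> {
    if username.len() > 255 || password.len() > 255 {
        return None;
    }
    let mut msg = Vec::with_capacity(3 + username.len() + password.len());
    msg.push(0x01); // sub-negotiation version
    msg.push(username.len() as u8);
    msg.extend_from_slice(username.as_bytes());
    msg.push(password.len() as u8);
    msg.extend_from_slice(password.as_bytes());
    Some(msg)
}
```

For Case 1 this would be called as `socks5_auth_message(&rpc_username('0', "SESSION-1"), "")`.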
Arti looks up the Session object via the SESSION-1 object ID
and tells it (via the ConnectWithPrefs special method)
to connect to www.example.com.
The session creates a new DataStream using its internal TorClient,
but does not register the stream with an RPC Object ID.
Arti proxies the application’s connection through this DataStream.
§Case 2: Creating an identifiable stream.
Here the application wants to be able to refer to its DataStream
after the stream is created.
As before, we assume that it’s on an RPC session
where the Session ID is SESSION-1.
The application sends an RPC request of the form:
{"id": 123, "obj": "SESSION-1", "method": "arti:new_oneshot_client", "params": {}}
It receives a reply like:
{"id": 123, "result": {"id": "STREAM-1"} }
(In reality, STREAM-1 would also be longer and full of crypto.)
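An application might assemble the request shown above with a small helper like this one. It is a sketch using only the standard library: `json_escape` and `new_oneshot_client_request` are hypothetical names, and how the message is framed and sent on the RPC connection is outside its scope. A real client would more likely use a JSON library.

```rust
/// Escape a string for use inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            // Control characters must be written as \uXXXX escapes.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build the arti:new_oneshot_client request for the given session object.
fn new_oneshot_client_request(request_id: u64, session_id: &str) -> String {
    format!(
        "{{\"id\": {}, \"obj\": \"{}\", \"method\": \"arti:new_oneshot_client\", \"params\": {{}}}}",
        request_id,
        json_escape(session_id)
    )
}
```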
Now the application has an object called STREAM-1 that is not yet a connected
stream, but which may become one.
This time, it wants to set its isolation string to “xyzzy”.
The application opens a SOCKS connection as before.
For the username it sends <torS0X>0STREAM-1,
and for the password it sends xyzzy.
(Alternatively, it could use HTTP CONNECT, setting Tor-Isolation to xyzzy, and Tor-Rpc-Target to STREAM-1.)
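For the HTTP CONNECT alternative, the request might be assembled as in the sketch below. The header names Tor-Rpc-Target and Tor-Isolation come from the text above; the function name, the choice of port, and the inclusion of a Host header are assumptions, and the relevant section of prop365 is the authority on the exact format. The helper performs no validation of its inputs (for example, an IPv6 literal would need brackets).

```rust
/// Build an HTTP CONNECT request carrying the RPC target and, optionally,
/// an isolation string. Hypothetical helper; inputs are not validated.
fn http_connect_request(
    host: &str,
    port: u16,
    rpc_target: &str,
    isolation: Option<&str>,
) -> String {
    let mut req = format!(
        "CONNECT {0}:{1} HTTP/1.1\r\nHost: {0}:{1}\r\n",
        host, port
    );
    req.push_str(&format!("Tor-Rpc-Target: {}\r\n", rpc_target));
    if let Some(iso) = isolation {
        req.push_str(&format!("Tor-Isolation: {}\r\n", iso));
    }
    // A blank line ends the header section.
    req.push_str("\r\n");
    req
}
```

For this case the call would be `http_connect_request("www.example.com", 443, "STREAM-1", Some("xyzzy"))`, where port 443 is chosen only for illustration.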
Now Arti looks up the OneshotClient object via STREAM-1,
and tells it (via the ConnectWithPrefs special method)
to connect to www.example.com.
This causes the OneshotClient internally to create a new DataStream,
and to store that DataStream in itself.
The OneshotClient with Object ID STREAM-1
is now an alias for the newly created DataStream.
Arti proxies the application’s connection through that DataStream.