RSSCloud Callbacks to a Different IP

2009-10-19

A bit of context, in case you haven’t been keeping up with the real-time web craze:

RSSCloud is an… idea^* for getting updates on RSS feeds to clients faster, while decreasing network load. In traditional RSS models, subscribers make an HTTP request every 10 minutes or so to a publisher to check for updates. In RSSCloud, a cloud server aggregates several feeds from authors. When a feed changes, its author sends an HTTP request to the cloud server notifying it of the update. The cloud server then contacts each subscriber of that feed, sending a notice that the feed has changed. The subscribers then request the feed from the author. Everyone gets their updates faster, and with fewer requests across the network.

The Problem

When you subscribe to an RSSCloud server, you tell it several things about how to notify you of changes:

A SOAP/XML-RPC notify procedure (required, but useless for REST).
The port to call back on.
The path to make the request to.
The protocol you accept (XML-RPC, SOAP, or HTTP POST).
The URLs of the feeds to subscribe to.
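
To make that list concrete, here is a rough sketch of the form body for a REST-style subscription. The parameter names (notifyProcedure, port, path, protocol, url1) follow my reading of the walkthrough and should be treated as assumptions; the callback path and feed URL are made up for illustration.

```ruby
require 'uri'

# Hypothetical values, for illustration only.
subscribe_params = {
  'notifyProcedure' => '',                 # required, but unused for REST
  'port'            => '80',               # port to call back on
  'path'            => '/rsscloud/notify', # path to make the request to
  'protocol'        => 'http-post',        # the protocol we accept
  'url1'            => 'http://example.com/feed.xml' # feed to subscribe to
}

# Encode as an application/x-www-form-urlencoded body.
subscribe_body = URI.encode_www_form(subscribe_params)
puts subscribe_body
```

Note what the body does not contain: any field naming the host that should receive the callback.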

There’s something missing! The RSSCloud walkthrough says:

“Notifications are sent to the IP address the request came from. You can not request notification on behalf of another server.”

That’s great unless your originating IP address can’t receive HTTP traffic. That rules out users behind a NAT or behind a firewall (without forwarded ports). That’s most home users with routers, users on typical corporate networks, etc. It won’t work on the iPhone. And, to a lesser degree, it rules out the cloud itself.

One of the common aspects of cloud computing is that compute nodes (and their IP addresses) may come and go as needed. For example, Vodpod.com is served by several different servers which (through a combination of heartbeat-failover, IP routing, and HTTP proxying) may enter and leave the cluster at any time without service interruption. So, if one of those servers subscribes to a feed, it might not be online to receive pings later. You’d have to subscribe to each feed from every host to guarantee that you’d continue to receive responses. The problem only becomes worse when you start looking at cloud services like EC2.

The RSSCloud mailing list has been tossing around the obvious solution for several weeks now: just include a “domain” parameter which says what FQDN or IP address to connect to. On Friday, Dave Winer included it in his walkthrough. Even so, most of the cloud servers out there (WordPress, for example) don’t support it yet.
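
On the server side, honoring such a parameter could be as small as the sketch below. The helper name and the hash-of-strings input are hypothetical, and this is not taken from any real cloud server: it just prefers “domain” when present and otherwise falls back to the requesting IP.

```ruby
# Hypothetical helper: decide where a cloud server should send its ping.
# `params` is the parsed subscription form; `remote_ip` is the address the
# subscription request came from.
def callback_url(params, remote_ip)
  host = params['domain'].to_s.strip
  host = remote_ip if host.empty? # no domain given: use the source IP
  "http://#{host}:#{params['port']}#{params['path']}"
end

without_domain = callback_url({ 'port' => '80', 'path' => '/notify' }, '10.0.0.5')
with_domain    = callback_url({ 'domain' => 'hooks.example.com',
                                'port' => '80', 'path' => '/notify' }, '10.0.0.5')
puts without_domain
puts with_domain
```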

A Partial Solution

What can you do to get around this?

One solution is to use PubSubHubbub, which uses a full callback URL. Additionally, Superfeedr will even use RSSCloud to offer real-time updates through PuSH, effectively bridging the two schemes.
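
For comparison, a PubSubHubbub subscription carries the whole callback URL in the request itself. A minimal sketch of the form body, assuming the hub.* parameter names from the PubSubHubbub spec as I understand it, with made-up topic and callback URLs:

```ruby
require 'uri'

# Hypothetical URLs, for illustration only.
push_params = {
  'hub.mode'     => 'subscribe',
  'hub.topic'    => 'http://example.com/feed.xml',            # feed to follow
  'hub.callback' => 'http://hooks.example.com/push/callback', # full callback URL
  'hub.verify'   => 'sync'
}

push_body = URI.encode_www_form(push_params)

# The hub learns the callback host from the request body, not from the
# source IP of the connection.
callback_host = URI.parse(push_params['hub.callback']).host
puts callback_host
```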

Alternatively, you can lie (sort of) about your address. This is what we’ve done at Vodpod to get WordPress to call us back correctly. When we subscribe, we bind the outgoing TCP socket to a publicly accessible IP. That IP is guaranteed to go somewhere in the cluster which can accept the RSSCloud update ping. Here’s a truly evil hack to do just that, by replacing Net::HTTP’s TCP socket with our own.

<cr:code lang="ruby">
require 'net/http'
require 'socket'

# Assumes `uri` (the RSSCloud server's URI) and `req` (the subscription
# request) are already defined.
res = Net::HTTP.new(uri.host, uri.port).start do |http|
  # Replace the socket with one that we bind to the interface we want to use.

  # The local IP address we'd like RSSCloud to call back.
  local_addr = Socket.pack_sockaddr_in 0, '208.101.30.10'
  # The RSSCloud server IP address
  remote_addr = Socket.pack_sockaddr_in uri.port, uri.host

  # Create a new socket
  s = Socket.new Socket::AF_INET, Socket::SOCK_STREAM, 0
  # Bind it to the local address
  s.bind local_addr

  # Wrap for Net::HTTP and connect
  socket = Net::BufferedIO.new(s)
  s.connect remote_addr

  # Replace the HTTP client's connection
  http.instance_variable_set('@socket', socket)

  # And make the request
  http.request(req)
end
</cr:code>
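
A less invasive variant, assuming a Ruby recent enough that Net::HTTP has a local_host accessor (as far as I know, 2.0 and later), is to let Net::HTTP do the binding itself. The sketch below is self-contained: a throwaway listener on loopback stands in for the RSSCloud server and records the source address it sees. The 127.0.0.1 address and the /pleaseNotify path are placeholders; in production you would set local_host to the public IP you want called back.

```ruby
require 'net/http'
require 'socket'

# Stand-in for the RSSCloud server: a one-shot listener on loopback that
# records which address the client connected from.
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]
seen = nil
listener = Thread.new do
  client = server.accept
  seen = client.peeraddr[3] # source IP as seen by the server
  # Consume the request headers (the request body is empty).
  while (line = client.gets) && line != "\r\n"; end
  client.write "HTTP/1.1 200 OK\r\nContent-Length: 2\r\nConnection: close\r\n\r\nok"
  client.close
end

http = Net::HTTP.new('127.0.0.1', port)
http.local_host = '127.0.0.1' # the local address to bind the outgoing socket to

req = Net::HTTP::Post.new('/pleaseNotify') # placeholder path
req.body = ''
res = http.start { |h| h.request(req) }

listener.join
server.close
puts "#{res.code} from #{seen}"
```

The trade-off is the same as with the hack: the address you bind to must actually be configured on the machine making the request.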

^*Dave says it's not a standard, or a spec. As far as I can tell, RSSCloud consists of a mailing list, a walkthrough of how implementations can handle the pings/cloud tag in RSS feeds, and a bunch of loosely federated implementations with varying degrees of compatibility. Some speak XML-RPC, some speak SOAP, some speak plain-old REST, etc...
