Friday 10 August 2007

Summary of XEP33 addresses limits

This post summarizes and updates all what I have said in the past weeks in those posts: The limit of addresses in XEP33 must be fixed, XEP33: types of limits and default values, XEP33: Tell limits in disco#info response using XEP128, and Updates to XEP33 limits proposal.


Introduce the problematic

Let's suppose that limiting the number of destination addresses in a XEP33 stanza really serves a purpose, for example, to prevent or reduce abuse of the multicast service. To count how many 'addresses' are there in a stanza, only TO, CC and BCC addresses are considered, since those are the ones that will generate traffic consumption.

XEP33 says that a server should have a limit for the maximum number of addresses allowed on a single packet: the limit SHOULD be more than 20 and less than 100.

That limit is easy to implement on the receiving party. But what happens with the sender? How many addresses can a sender put on each packet? If it puts too many, the packet will be rejected. If it puts too few, it is not profiting of XEP33 as much as it could do.

On the current version of XEP33, remote servers allow as few as 20 and as much as 100 addresses. This means that a sender have to reduce to the minimum common in order to not get rejects: it can only send as much as 20 addresses on each packet.

If we already know that 20 is the maximum limit in practice, then why bother telling some admins that they can put 30, 40 or more on their servers? Nobody will send more than 20 addresses on each packet!


Proposed solution: configurable limits, and method to inform

Allow configurable limits on the protocol for each different condition. Define default values in the protocol. And describe a method for senders to know which limits are applied on each destination server.

Another possible limitations to reduce abuse of a multicast service are # of messages per minute, # of addresses per minute, # of total bytes sent... But I don't expect them to be interesting for inclusion in XEP33.


Types of limits

Several limits can be defined, depending in the different characteristics of a XEP33 stanza:

  • sender is: local or remote
  • the stanza type is: message or presence. Note that iq stanzas don't directly include XEP33 addresses.
There is no way to know if a XEP33 was sent by a user or a server/service. So, that categorization is not possible.

Those categories do not allow to differentiate the stanzas sent by a trusted local service (like MUC or Pub/Sub components) from the rest of possible senders. Obviously, the trusted local services operated by the same administrator that installed the multicast service should have unrestricted access to the multicast service. This possibility is an implementation specific issue which will not be covered by XEP33.

The mentioned stanza characteristics allow to define 4 different limits:
  • local message
  • local presence
  • remote message
  • remote presence
The allowed values for the limits are:
  • Positive integers, including zero: 0, 1, 2, ...
  • the key word 'infinite', which means that the limit is not applied at all.

Method to inform

This method uses XEP-0128: Service Discovery Extensions, as proposed by Ralphm in a comment.

How does this work? Currently, when an entity wants to send a XEP33 stanza, it first checks if there is a XEP33-enabled service available. To check that, it queries disco#info to the service, and looks for in the response.

If there are limits to inform about, the disco#info response does not only announce XEP33 support, but also announce which are the exact limits in effect in the service.

When a multicast service announces limits in a disco#info response, it SHOULD only report limits which are configured to a different value than the one defined as default in XEP33. So, if XEP33 says that a given limit is 20; but the limit in effect in a server is 30, then the server must tell the limit. If the limit in effect is the default value, then it SHOULD NOT be specified at all in disco#info to save bandwidth.

Similarly, when a multicast service announces limits in a disco#info response, it SHOULD only report limits which are going to be applied to the entity that performs the request. The reason is that users of the local server and users/servers/services which are remote will have different limits, and it's a waste of bandwidth to announce limits to an entity that will never be affected by them.

The entity that requested this info must cache those limits for posterior reference.

Let's see an example. The Jabber server capulet.com wants to send a stanza with XEP33 addresses to the Jabber server shakespeare.lit. The response announces XEP33 support, and also provides information of several limits:

<iq type='get'
from='capulet.com'
to='shakespeare.lit'
id='disco1'>
<query xmlns='http://jabber.org/protocol/disco#info'/>
</iq>

<iq type='result'
from='shakespeare.lit'
to='capulet.com'
id='disco1'>
<query xmlns='http://jabber.org/protocol/disco#info'>
<identity
category='server'
type='im'
name='shakespeare.lit jabber server'/>
...
<feature var='http://jabber.org/protocol/address'/>
<x xmlns='jabber:x:data' type='result'>
<field var='FORM_TYPE' type='hidden'>
<value>http://jabber.org/protocol/address</value>
</field>
<field var='message'>
<value>20</value>
</field>
<field var='presence'>
<value>infinite</value>
</field>
</x>
...
</query>
</iq>


Apply limits to incoming stanzas

When a stanza is received by a XEP33-enabled entity to be routed to other destinations, the number of destination addresses is compared to the limit which is in effect for that kind of stanza. If the stanza has more addresses of type TO, CC and BCC than the allowed, an error message is returned to the original sender.


Take into account limits when sending stanzas

When any Jabber entity is about to send a XEP33 stanza, it MUST make sure the number of destination addresses is not greater than the limit reported by the destination entity. In this case, the destinations can be split in several groups (or batches).

No comments: