
Tuesday, 2 October 2007

Results of mod_multicast execution time with timer:tc

I previously presented timer:tc and multicast_test. Now I'll show some results I obtained with them.

I ran 6 experiments, combining normal routing, trusted multicast and untrusted multicast with both single and multiple servers. Each experiment was run for several numbers of destination addresses.

Times are expressed in microseconds (1 second = 1,000,000 microseconds). Note that the times shown here include not only mod_multicast, but also packet building and ejabberd_router.
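
As a reminder of how each data point is obtained: timer:tc/3 runs a function and returns the elapsed real time in microseconds together with the function's result. A minimal sketch of a single measurement, reusing the multicast_test functions presented earlier (the exact call shown is illustrative):

{Time, _Result} = timer:tc(multicast_test, testt, [single, 301]),
io:format("301 destinations: ~B microseconds, ~.2f per destination~n",
          [Time, Time / 301]).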


Results

Normal - Single

Destinations    Exec time (µs)    Time per dest. (µs)
           1               161                 161.00
         101              6074                  60.14
         201             22163                 110.26
         301             40950                 136.05
         401             74586                 186.00
         501            100445                 200.49
         601            107012                 178.06
         701            131163                 187.11
         801            146473                 182.86
         901            153147                 169.97

Normal - Multiple
Destinations    Exec time (µs)    Time per dest. (µs)
           1               196                 196.00
         101             11206                 110.95
         201             20918                 104.07
         301             30811                 102.36
         401             43627                 108.80
         501             52742                 105.27
         601             59650                  99.25
         701             75291                 107.41
         801             81077                 101.22
         901             80192                  89.00

Trusted - Single
Destinations    Exec time (µs)    Time per dest. (µs)
           1               226                 226.00
         101              8719                  86.33
         201             39938                 198.70
         301             77040                 255.95
         401            101675                 253.55
         501            122156                 243.82
         601            144422                 240.30
         701            158917                 226.70
         801            186553                 232.90
         901            197387                 219.08

Trusted - Multiple
Destinations    Exec time (µs)    Time per dest. (µs)
           1               283                 283.00
         101             21751                 215.36
         201             41077                 204.36
         301             66047                 219.43
         401            102542                 255.72
         501            315313                 629.37
         601            140158                 233.21
         701            164836                 235.14
         801            171249                 213.79
         901            189712                 210.56

Untrusted - Single
Destinations    Exec time (µs)    Time per dest. (µs)
           1               317                 317.00
         101             25528                 252.75
         201             79402                 395.03
         301            155075                 515.20
         401            284532                 709.56
         501            329484                 657.65
         601            357879                 595.47
         701            388755                 554.57
         801            414622                 517.63
         901            506557                 562.22

Untrusted - Multiple
Destinations    Exec time (µs)    Time per dest. (µs)
           1              1121                1121.00
         101            115736                1157.36
         201            331379                1656.89
         301            601325                2004.42
         400           2478072                6195.18
...


Analysis

What does this show? The computation time per destination address remains roughly constant as the number of destinations grows. The fluctuations are only due to my computer: it was being used for other tasks besides running these experiments [1].

For example, processing a packet with 300 destination addresses (few rosters and chatrooms have more than this number of items) costs around 77 milliseconds if the source is trusted (a local MUC service or session manager). The same packet costs 155 milliseconds if the source is not trusted and the packet must be carefully inspected.

On the less bright side of the results, the Multiple set of experiments again performs quite badly compared to Single. Note that part of this is due to the function add_addresses, as explained in the Fprof article. Another part of the problem is not in the server but in the stressing tool: the packet-building function I implemented in multicast_test uses non-existent servernames when run with the 'multiple' option: "5.localhost", "7.localhost"...
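
For reference, the builder's behaviour can be approximated like this (make_dests is a hypothetical name, not the actual function in multicast_test):

make_dests(single, Num) ->
    [lists:concat(["user", N, "@localhost"]) || N <- lists:seq(1, Num)];
make_dests(multiple, Num) ->
    %% Produces addresses like "user5@5.localhost": those servers don't exist,
    %% which makes the router work harder than in a real deployment.
    [lists:concat(["user", N, "@", N, ".localhost"]) || N <- lists:seq(1, Num)].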


Conclusions

In summary, the results I obtained using timer:tc are compatible with the previous results I obtained with Fprof and Jabsimul. All of them indicate that mod_multicast consumes, on average, approximately as much time as the ejabberd routing functions do.

So, using multicast increases CPU consumption, as expected. This cost should be acceptable once the benefits of multicasting are taken into account.

---
[1] Statistically speaking, it would be preferable to run a batch of experiments and show only the average and confidence interval, but I guess this is not really required for now.

Results of code profiling mod_multicast with Fprof

I previously presented Fprof and multicast_test. Now I'll show some results I obtained with them.

I tried normal routing, trusted multicast and untrusted multicast, with both single and multiple servers, and a fixed number of 300 destination addresses.

All those experiments generate a lot of information, so I summarize here only the important results. Times are in milliseconds, and measure the full processing time on the system where the experiments were run.


Results

Normal - Single

Total: 320
build_packet: 30
ejabberd_router: 275

Normal - Multiple

Total: 630
build_packet: 30
ejabberd_router: 593

Trusted - Single

Total: 410
build_packet: 30
ejabberd_router: 266
mod_multicast: 114
of which jlib:string_to_jid alone: 52

Trusted - Multiple

Total: 2020
build_packet: 30
ejabberd_router: 700
mod_multicast: 1290
of which add_addresses alone (disproportionately expensive): 1067

Untrusted - Single

Total: 500
build_packet: 30
ejabberd_router: 270
mod_multicast: 200
of which jlib:string_to_jid alone: 50

Untrusted - Multiple

Total: 2060
build_packet: 30
ejabberd_router: 640
mod_multicast: 1390
of which add_addresses alone (disproportionately expensive): 1070


Analysis

In these examples, using mod_multicast does not reduce the time consumed by ejabberd_router because the multicast packet is meant to be sent to local users, so the routing process is always called 300 times. If the destinations were remote and several of them shared a server, ejabberd_router would be called less frequently, so the benefit of mod_multicast would be noticeable in that respect too.

Trusted multicast requires only half the time consumed by ejabberd_router itself. In the Untrusted case, the additional checks make mod_multicast as costly as ejabberd_router.

In the case of Trusted multicast, the function in mod_multicast that consumes the most processing time is jlib:string_to_jid. In Untrusted, the most problematic function is add_addresses. Maybe the code of that function can be improved.

Summarizing, I consider the mod_multicast code fairly efficient when compared to other parts of ejabberd, especially ejabberd_router.

multicast_test: analyze mod_multicast performance

To measure the computational cost of mod_multicast, I developed a small Erlang module: multicast_test.

This module includes functions to create a message packet with a XEP-33 'addresses' element. The number of destinations is configurable, and the destination servers can be 'single' (all destinations on the same server) or 'multiple' (each destination on a different server). The packet can be sent to route_trusted or route_untrusted. It is also possible to send individual packets to ejabberd_router.
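
For reference, a generated packet looks roughly like this in ejabberd's internal XML representation (the JIDs and body are illustrative; the 'addresses' namespace is the one defined by XEP-33):

Packet = {xmlelement, "message",
          [{"to", "multicast.localhost"}, {"type", "chat"}],
          [{xmlelement, "addresses",
            [{"xmlns", "http://jabber.org/protocol/address"}],
            [{xmlelement, "address",
              [{"type", "bcc"}, {"jid", "user1@localhost"}], []},
             {xmlelement, "address",
              [{"type", "bcc"}, {"jid", "user2@5.localhost"}], []}]},
           {xmlelement, "body", [],
            [{xmlcdata, "Hello"}]}]}.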

Code profiling

It is possible to run those functions with Fprof to profile the time consumed by each function in mod_multicast:

fprof:apply(multicast_test, ROUTING, [SERVERS, NUM_DESTS]).
fprof:profile().
fprof:analyse([{dest, []}]).
Where:
  • ROUTING: testn for normal routing, testt for trusted sender, and testu for untrusted sender.
  • SERVERS: single for just a single server, multiple for a different server for each destination address.
  • NUM_DESTS: number of destination addresses.
For example, execute this in the Erlang shell of the ejabberd node:
fprof:apply(multicast_test, testu, [single, 300]).
fprof:profile().
fprof:analyse([{dest, []}]).
And you will get a file fprof.analysis with very detailed information.

Execution time

It's also possible to measure the execution time of those functions with a varying number of destinations. This shows whether the performance of mod_multicast depends on the number of destinations or on the number of destination servers.

The functions are:
multicast_test:ROUTING(SERVERS, INI, END, INC).
Where:
  • ROUTING: normal, trusted, untrusted.
  • SERVERS: single or multiple.
  • INI, END and INC: the initial number of destinations, the ending value and the increment.
For example:
multicast_test:untrusted(single, 1, 1000, 100).
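
For the record, such a sweep can be written in a few lines; a sketch under the assumption that it reuses the testu function shown above (the real multicast_test code may differ):

untrusted(Servers, Ini, End, Inc) ->
    lists:foreach(
      fun(NumDests) ->
              {Time, _} = timer:tc(multicast_test, testu, [Servers, NumDests]),
              %% Print: destinations, total microseconds, microseconds per destination
              io:format("~B ~B ~.2f~n", [NumDests, Time, Time / NumDests])
      end,
      lists:seq(Ini, End, Inc)).
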
I will later post some results I obtained using those functions.

Results of stressing ejabberd+mod_multicast with Jabsimul

I tested mod_multicast on a live server, stressing it with synthetically generated load using Jabsimul.

The setup was: create 300 accounts with Testsuite's userreg; create a Shared Roster Group with @all@; configure Jabsimul to log in to all 300 accounts and change presence every 60 seconds.

Note that each user has 299 contacts online, and consequently each presence change generates a presence packet with 299 destination addresses.

I ran Jabsimul against an unaltered ejabberd trunk and against a mod_multicast-enabled version. CPU consumption in both cases varied between roughly 20% and 30%; I couldn't find a clear difference between enabling and disabling mod_multicast. Virtual memory was around 125 MB, and resident memory around 105 MB.

It seems the computational resources required by mod_multicast are only a small part of all the processing that takes place in ejabberd. So, this test indicates that enabling XEP-33 in an ejabberd server probably does not have an appreciable impact on the server's CPU or RAM consumption.

The computer used in the tests:

  • AMD Athlon(tm) 64 Processor 3000+, 4,000 BogoMIPS
  • 1 GB RAM
  • Debian sid
  • Linux 2.6.22-2-686 (Debian package)
  • Erlang R11B-5 (Debian package)
  • ejabberd SVN r952
  • mod_multicast SVN r394

If you run your own tests and reach a different conclusion, please tell me.

PS: some instructions to get Jabber Test Suite and Jabsimul: Benchmarking Jabber/XMPP Servers with Jabsimul

Monday, 1 October 2007

Major performance optimizations in mod_multicast

When I finished my Google Summer of Code 2007 project about implementing XEP-0033: Extended Stanza Addressing in ejabberd, I ran some benchmarks and found that my multicast module performed really badly.

This was explained in ejabberd gets XEP-0033: Extended Stanza Addressing. For example, CPU consumption tripled when users sent XEP-33 packets to around 40 destinations.

This result was not surprising, since my focus during the GSoC period was to implement XEP-33 correctly, not efficiently. I added a post-GSoC task: profile the code, find bottlenecks and deficiencies in mod_multicast, and improve it accordingly.

Used tools

During September I learned about several Erlang/OTP tools for analyzing Erlang source code. After experimenting with them in several of my ejabberd contributions, this weekend I decided it was time to come back to mod_multicast.

The tools I used for code profiling are:

  • Debugger: graphical tool for debugging and testing Erlang programs.
  • Dialyzer: static analysis tool that identifies software discrepancies such as type errors, unreachable code and unnecessary tests.
  • Cover: coverage analysis tool for Erlang. It shows how many times each line of a source file is executed.
  • Fprof: profiling tool that shows how much processing time different functions consume, and in which processes.
For benchmarking, I used these tools:
  • timer:tc: to measure the elapsed real time while executing a function.
  • Testsuite: to create a lot of Jabber accounts.
  • ejabberd's mod_shared_roster: to populate the rosters with a lot of contacts.
  • Jabsimul: to stress the server by sending constant presence changes.
  • top: to view ejabberd's CPU and RAM consumption.
  • etop: similar to top, but for Erlang processes.

Performance improvements

The part of mod_multicast packet processing that consumed the most time was the traversal and formatting of the list of destination addresses. One task was especially time consuming in my early code: conversion of strings to JIDs. And to make things worse, each destination address was converted from string to JID several times, for no good reason.
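
The fix for that particular deficiency is conceptually simple. A minimal sketch of the convert-once idea (jlib:string_to_jid is ejabberd's real parser; the wrapper around it is illustrative):

%% Parse each destination exactly once, up front, and pass the
%% resulting jid records around instead of re-parsing the strings.
parse_dests(Strings) ->
    [jlib:string_to_jid(S) || S <- Strings].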

Yesterday I rewrote and reorganized a lot of the code that handles multicast packets. I'll now describe the important changes.

* Software engineering 101

Replace the do_route* control-passing functions with a main function, route_untrusted, that retains control and calls worker functions.

* Erlang/OTP Programming with Style 101

Use throw and catch.

* Route_untrusted

The function route_untrusted is used for any multicast packet sent from an untrusted source (local user, remote user, remote server, remote service). Such a packet is fully checked: access permissions, packet format, limit on the number of destinations, and packet relay.
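
Combined with the throw and catch style mentioned above, the control flow can be sketched like this (the helper names are hypothetical, not the module's actual API):

route_untrusted(From, To, Packet) ->
    case catch do_checks_and_route(From, To, Packet) of
        {error, Reason} ->
            %% One of the checks threw {error, Reason}: bounce the packet.
            bounce_error(From, To, Packet, Reason);
        _ ->
            ok
    end.

do_checks_and_route(From, To, Packet) ->
    check_access(From),            %% may throw {error, forbidden}
    Dests = check_format(Packet),  %% may throw {error, bad_addresses}
    check_limit(length(Dests)),    %% may throw {error, too_many_destinations}
    check_relay(From, To),         %% may throw {error, relay_denied}
    route_common(From, To, Packet, Dests).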

* Route_trusted

The function route_trusted is used by local services like MUC and the session manager. Since the source of the packet is trusted by the multicast service, the packet is not checked.

* Route_common

The function route_common performs the main processing tasks: find a multicast service for each remote server (either from the cache, or by starting the query process), and send the packets.
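
A sketch of the per-server service lookup (the table and function names are illustrative):

find_service(ServerHost) ->
    case ets:lookup(multicast_cache, ServerHost) of
        [{ServerHost, ServiceJid}] ->
            {cached, ServiceJid};
        [] ->
            %% Not cached yet: launch the Service Discovery query against
            %% the remote server; the answer will populate the cache.
            start_query_process(ServerHost),
            pending
    end.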

* Packet prebuilding

There are two important improvements in the packet-building task. First, the set of 'address' elements that represents each group of destinations is built once at the start, not for every packet sent. This is where the 'delivered=true' attribute is added.

Second, each group of destinations gets a prebuilt packet that includes all the other addresses (with the 'delivered=true' attribute already present).

Finally, the list of groups is traversed, and for each one the only remaining work is to build the final packet (by simply concatenating the lists of addresses) and route it to the destination.

This implementation has complexity of order N, while the old implementation had complexity of order N^2.
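
To make the contrast concrete, here is a toy version of both strategies, reduced to the element-building cost (build_address_el stands in for the real builder):

build_address_el(Jid) ->
    {xmlelement, "address",
     [{"type", "bcc"}, {"jid", Jid}, {"delivered", "true"}], []}.

%% Old strategy: rebuild all N 'address' elements for each of the N packets.
old_build(Dests) ->
    [{Dest, [build_address_el(D) || D <- Dests]} || Dest <- Dests].

%% New strategy: build each 'address' element once, then reuse the list.
new_build(Dests) ->
    Els = [build_address_el(D) || D <- Dests],
    [{Dest, Els} || Dest <- Dests].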


Conclusions

I've described the nature of the performance improvements in mod_multicast. I'll soon describe how I ran the benchmarking tools, and the observed results.

Sunday, 19 August 2007

ejabberd gets XEP-0033: Extended Stanza Addressing

I consider my GSoC task of implementing XEP-0033: Extended Stanza Addressing in ejabberd finished.

The implementation is divided into several parts. The largest part of the code is in the multicast service (mod_multicast).

Tomorrow is the GSoC 'pencils down' deadline, which means I will be evaluated only on the code I wrote up to today.

I expect all my code to eventually be included in ejabberd trunk. However, I'll propose that mod_multicast be disabled by default in the example configuration, at least in the first ejabberd release that includes the module.


Benchmark

I made some benchmarks using Jabsimul. The only performance indexes I could evaluate were the CPU percentage and MB of RAM consumed by the ejabberd program. I created 900 accounts and populated each one with around 40 roster items of type 'both'. Then, using Jabsimul, each logged-in user changed its presence every few seconds.

With the patches and the multicast service enabled, small rosters and small chatrooms (fewer than 5 contacts or participants) show a small increase in CPU consumption. With medium-size rosters (40 roster items), CPU consumption triples with respect to the stock ejabberd trunk version.

Obviously, I don't consider it acceptable that CPU consumption triples just because all the packets use XEP-33 with 40 destinations. The bottleneck is mod_multicast.

However, this result does not surprise me at all. During my GSoC coding I only cared about optimization in the patches that will be committed to ejabberd trunk: ejabberd_c2s, mod_muc_room and ejabberd_router_multicast. I didn't care about code optimizations in mod_multicast; functional correctness was far more important to me. Now that mod_multicast works correctly, I can concentrate on improving it without breaking its correctness.

This planning allowed me to complete everything I had planned for my GSoC project and finish the summer with correct, working code. During the last week of August I plan to profile, reorganize and improve mod_multicast to reduce its computational cost as much as possible.


Unexpected improvement

The funny thing is that my patches to the ejabberd core, with the multicast service disabled, slightly reduce CPU consumption compared to the stock ejabberd trunk version. This means there is a possible optimization in ejabberd that does not deal with XEP-33 at all. If properly investigated, this improvement could be included in ejabberd trunk and benefit all ejabberd deployments, not only those with multicast enabled.

Saturday, 18 August 2007

ejabberd gets XEP-0133: Service Administration

One of the minor tasks in my Google Summer of Code project was to implement in ejabberd as many as possible of the 31 commands described in XEP-0133: Service Administration.

Aleksey Shchepin implemented many commands in ejabberd more than 4 years ago, and a year and a half ago Magnus Henoch updated them to use XEP-0050: Ad-Hoc Commands. So, I just had to update them a little to make them XEP-0133 compliant:

  • 23. Send Announcement to Online Users
  • 24. Set Message of the Day
  • 25. Edit Message of the Day
  • 26. Delete Message of the Day
The commands that I implemented from scratch are:
  • 1. Add User
  • 2. Delete User
  • 5. End User Session
  • 6. Get User Password
  • 7. Change User Password
  • 9. Get User Last Login Time
  • 10. Get User Statistics
  • 13. Get Number of Registered Users
  • 15. Get Number of Online Users
  • 30. Restart Service
  • 31. Shut Down Service
Other commands are not implemented; I didn't add them because I consider that ejabberd already provides more suitable alternatives:
  • 8. Get User Roster
  • 18. Get List of Registered Users
  • 20. Get List of Online Users
  • 27. Set Welcome Message
  • 28. Delete Welcome Message
  • 29. Edit Admin List
And finally, I didn't implement these commands because they use features not available in ejabberd:
  • 3. Disable User
  • 4. Re-Enable User
  • 11. Edit Blacklist
  • 12. Edit Whitelist
  • 14. Get Number of Disabled Users
  • 16. Get Number of Active Users
  • 17. Get Number of Idle Users
  • 19. Get List of Disabled Users
  • 21. Get List of Active Users
  • 22. Get List of Idle Users
During this task, I found some typos in the XEP and reported them to its author (Peter Saint-Andre).

Finally, I tested most commands with Tkabber SVN, Psi SVN and Gajim SVN. Sergei Golovan quickly fixed a small bug in Tkabber, and now all three clients work perfectly :)

I'm quite happy with the result, so I took this screenshot showing the impressive list of commands that allow an administrator to configure ejabberd with just a Jabber client:


Note that the commands are nested in Service Discovery so the admin can find them more easily.

The patch is available here. I hope it is of sufficient quality to enter ejabberd trunk easily, so that it is published in the next major ejabberd release.