This writeup is intended as a reference for the MUSCLE protocol node. It serves to explain in depth many of the issues mentioned in my writeup. Copyright 2002 Jeremy Friesner. Reproduced with permission.
MUSCLE Overview and Beginner's Guide
Introduction
The MUSCLE system is a robust, somewhat scalable, cross-platform client-server solution for dynamic distributed applications for BeOS, Linux, AtheOS, FreeBSD, and other operating systems. It allows (n) client programs (each of which may be running on a separate computer and/or under a different OS) to communicate with each other in a many-to-many message-passing style. It employs a central server to which client programs may connect or disconnect at any time (This design is similar to other client-server systems such as Quake servers, IRC servers, and Napster servers, but more general in application). In addition to the client-server system, MUSCLE contains classes to support peer-to-peer message streaming connections, as well as some handy miscellaneous utility classes. As distributed, the server side of the software is ready to compile and run, but to do much with it you'll want to write your own client software. Example client software can be found in the "test" subdirectory.
This document assumes you are familiar with C++ programming and BeOS programming. However it should be understandable even if you aren't.
Feature List
- Powerful: Provides a centralized "message crossbar server" for up to (n) simultaneous client programs to connect to. (n is limited by the server computer's FD_SET size--generally at least 32; currently 1024 under BeOS 5.0 and Red Hat 6.1).
- Easy: All communication is done over TCP, by sending flattened Message objects (which are very similar to BMessages, except portable) through MessageIOGateways. For AtheOS or BeOS-specific code, it's even easier--see item 8.
- Efficient: Messages sent to the server may be broadcasted to all connected clients, or multicasted intelligently using pattern matching logic.
- Portable: All code (except for some platform-specific convenience classes in the support folders) uses only standard C++ and BSD socket calls, and should compile and run under any modern OS with minimal changes. All code has been compiled and tested on BeOS, Red Hat Linux, Yellow Dog Linux, AtheOS, FreeBSD, NetBSD, and Windows XP.
- Flexible: Clients may store data (in the form of Messages) in the server's RAM, using a filesystem-like node hierarchy. Other clients may "subscribe" to this server-side data, and the server will then automatically send them updates to the data as it is changed. Subscriptions are also specified via wildcarding, for maximum flexibility.
- Open: All source code is included and freely distributable and usable for any purpose. The source code contains many useful classes, including platform-neutral analogs to Be's BMessage, BDataIO, BFlattenable, and BString APIs. In addition, the archive also includes handy double-ended-queue, Hashtable, Reference-counting, and "I/O gateway" classes. (Yes, I know the C++ STL contains similar functionality. But I think the STL is icky)
- Customizable: All server-side session handlers are implemented by subclassing a standard interface (AbstractReflectSession) so that they can be easily augmented or replaced with custom logic. Message serialization and low-level I/O is handled in a similar fashion, making it easy to replace the byte-stream format or transport mechanism with your own.
- Convenient: For AtheOS and BeOS programs, special utility classes are provided to translate Messages to native messages (and vice versa), and to hide the synchronous TCP messaging interface behind an asynchronous send-and-receive-messages API that's easier to deal with. There are also customized threaded client APIs for the Java and Qt platforms.
Is this software appropriate for use in my project?
The space of possible networking applications is large, and MUSCLE may or may not be appropriate for any given networking application. Here are several questions you should ask yourself when deciding whether or not to use the MUSCLE system or APIs.
-
Must my application be compatible with pre-existing Internet RFCs or other programs or data formats?
If yes, MUSCLE may not be for you. MUSCLE defines its own byte-stream formats and messaging protocol, and is not generally compatible with other software protocols (such as IRC or FTP). If your pre-existing protocol follows a "message-stream-over-TCP-stream" design pattern, you can customize MUSCLE (by defining your own subclass of AbstractMessageIOGateway) to make it use your protocol; if not, you're probably better off coding to lower level networking APIs.
-
Is TCP stream communication fast enough for my app? Do I need to use lower level protocols such as UDP or ICMP?
MUSCLE does all of its data transfer by serializing Messages over TCP streams. If your application is a particularly high-performance one (such as video streaming), MUSCLE may not be able to provide you with the efficiency you need. In this case, you might use MUSCLE TCP streams for your control data only, and hand-code separate routines for your high-bandwidth/low-latency packets. I've used this pattern (TCP + UDP) in audio-over-Internet programs before and it works well.
In addition, you should be aware of the CPU and memory overhead added by MUSCLE to your communications. While MUSCLE has been designed for efficiency, and will not make unreasonable demands on systems that run it, it is necessarily somewhat less efficient that straight byte-stream TCP programming. Specifically:
- Its use of Messages means that there will be several dynamic allocations and deallocations, and an extra data copy, for each message sent and received. (note: ObjectPools are used to minimize the former)
- It uses arbitrary-length message queues to avoid ever having to "block" your application's threads. While this will keep your application reliably responsive to the user, it can potentially use a lot of memory if you are producing messages faster than the network connection can send them.
- If you use the MessageTransceiverThread class (highly recommended for AtheOS and BeOS clients) there will be one extra thread used for each MUSCLE TCP connection. In addition, MessageTransceiverThread (and the muscled) uses select() to arbitrate data flows, which can be inefficient under the currently released (R5 and earlier) BeOS networking stacks. (it's more efficient under other operating systems, and should be fine under BONE as well)
Should my application use the muscled server, or just point-to-point messaging connections?
There are two common ways to use the MUSCLE package: you can have each client connect to a muscled server running on a central server system, and use it to communicate with each other indirectly... or you can have clients connect to each other directly, without using a central server. Each style of communication is useful in the right context, but it is important to choose the one that best fits what your app is going to do. Using the muscled in a client/server pattern is great because it solves several problems for you: it provides a way of communicating with other client computers without first needing to know their host addresses (etc), it gives you intelligent "broadcast" and "multicast" capabilities, and it provides a centralized area to maintain "shared state information" amongst all clients. On the down side, because all data must travel first to the central server, and from there on to the other client(s), message passing through the server is only half as fast (on average) as a direct connection to another client. Of course, to get the best of both worlds, you can use a hybrid system: each client connects to the server, and uses the server to find out the host addresses of the other clients; after that, it can connect to the other clients directly whenever it wants.
-
Which parts of MUSCLE should I use? Which parts should I extend? Which parts should I ignore?
The MUSCLE package consists of tens of classes, most of which are needed by the MUSCLE server, some of which are needed by MUSCLE clients, and some of which may be useful to you in their own right, as generic utility classes. For most applications, the standard MUSCLE server will be adequate: you can just compile it and run it, and concentrate solely on the client side of your app. For some specialized apps, you may want to make your own "custom" server--you can do this easily by creating your own subclass of AbstractReflectSession. Of course, if you do this you won't be able to use any of the "general purpose" muscled servers that may be available...
The Multi-Threaded messaging API
MUSCLE supports a multi-threaded messaging model as well as the single-threaded model (described elsewhere in this document). In the multi-threaded model, a separate MUSCLE thread is started up to handle networking chores for you. The advantage of doing it this way is that your GUI will never lock up due to networking activity, since all networking activity happens asynchronously. Your GUI merely sends commands to the MUSCLE networking thread, and receives Messages back which contain information received from the network.
The multi-threaded messaging API for MUSCLE is represented mainly by the MessageTransceiverThread class. This class manages the interaction between you and its internal thread, which does all the networking operations. If you are using Qt, BeOS, or AtheOS, you are in luck--there are system-specific subclasses of MessageTransceiverThread included with MUSCLE to make things easier for you. If not, you can still use MessageTransceiverThread, but you will need to do a little extra work to integrate it with your native threading model (how to do this is not covered here).
There are five things that you'll need to do in a typical client: Set up the thread, connect to the server, send messages, receive messages, and disconnect. Here's how to do them:
-
Setting up the MessageTransceiverThread object.
First thing you will need to do is create a (AMessageTransceiverThread/BMessageTransceiverThread/QMessageTransceiverThread) object (either on the stack or the heap). The AMessageTransceiverThread and BMessageTransceiverThread class constructors take a Messenger object; pass in a Messenger that points to the Looper object that will be handling network interactions. For Qt, it's even easier... just connect() the various signals of the QMessageTransceiverThread object to the various slots of your control object.
-
Connecting to the server.
Once you have your MessageTransceiverThread object, you can tell it that you want to connect out to the server by calling AddNewConnectSession() on it with the server's hostname and port number. Then you call StartInternalThread() on it to start the networking thread going. Both AddNewConnectSession() and StartInternalThread() will return immediately, but when the background TCP thread connects to the server (or fails to do so) it will send an event-message to your target Messenger to notify you (in Qt, it will just emit the appropriate signal).
-
Sending messages.
To send a message to the server, just call the MessageTransceiverThread's SendMessageToSessions() method. This method will return immediately, but the message you specify will be placed in an outbound-message queue for sending as soon as possible. Messages are passed in using a MessageRef reference object, to avoid needless data-copying. For example:
MessageRef newMsg = GetMessageFromPool('HELO');
if (newMsg())
{
newMsg()->AddString("testing", "please");
if (myTransceiver.SendMessageToSessions(newMsg) != B_NO_ERROR) printf("Couldn't send message!\n");
}
else printf("Couldn't allocate a message to send... out of memory!\n");
-
Receiving messages
If you're using Qt, this is easy--whenever a Message arrives from the server, the MessageReceived() signal will be emitted and your connected object can act on it. For AtheOS and BeOS, the process is slightly more involved: Whenever a new Message arrives from the server, a MUSCLE_THREAD_SIGNAL BMessage will be sent to you via the Messenger you specified in the MessageTransceiverThread constructor. When you receive such a message, your Looper should do something like this:
MessageRef msg;
uint32 code;
while(myTransceiver.GetNextEventFromInternalThread(code, &msg) >= 0)
{
switch(code)
{
case MTT_EVENT_INCOMING_MESSAGE:
{
Message * pMsg = msg(); // Get access to the reference's held Message object.
HandleMessage(pMsg); // Do whatever you gotta do
/* do NOT delete (pMsg). It will be deleted for you. */
}
break;
case MTT_EVENT_SESSION_CONNECTED:
printf("Connection to server was successful!\n");
break;
case MTT_EVENT_SESSION_DISCONNECTED:
printf("Disconnected from server, or connection failed!\n");
break;
}
};
-
Disconnecting
When you've had enough of chatting with the server, you can end your session by calling ShutdownInternalThread() on the MessageTransceiverThread object, and then deleting it. Or if you wish to reuse the MessageTransceiverThread again, call Reset() on it, and it's ready to use again, as if you had just created it.
The Single-Threaded messaging API (available on all platforms)
For code that needs to run on platforms other than Qt/BeOS/AtheOS (or even for Qt/BeOS/AtheOS code where you don't want to spawn an extra thread), you can use the single-threaded messaging API, as defined by the DataIO and MessageIOGateway classes. These classes allow you to decouple your TCP data transfer calls from your message processing calls, and yet still keep the same general message-queue semantics that we know and love.
To create a connection to the MUSCLE server, you would first make a TCP connection using standard BSD sockets calls (see portablereflectclient.cpp for an example of this). Once you have a connected socket, you would use it to create a TCPSocketDataIO, which you would use to create a MessageIOGateway object:
MessageIOGateway gw; // create the gateway
gw.SetDataIO(DataIORef(new TCPSocketDataIO(mysocketfd, false), NULL)); /* tell the gateway to use our TCP socket */
This gateway allows you to enqueue outgoing Message or dequeue incoming Messages at any time by calling AddOutgoingMessage() or GetNextIncomingMessage(), respectively. These methods are guaranteed never to block. Like the MessageTransceiverThread, the MessageIOGateway uses MessageRef objects to handle the freeing of Messages when they are no longer in use.
To actually send and receive TCP data, you need to call DoOutput() and DoInput(), respectively. These methods will send/receive as many bytes of TCP data as they can (without blocking), and then return B_NO_ERROR (unless the connection has been cut, in which case they will return B_ERROR). Because these methods never block (unless your TCPSocketDataIO is set to blocking I/O mode, which in general it shouldn't be), you will need to employ select() or some other method to keep your event loop from using 100% CPU time while waiting to send or receive data. Here is an example event loop that does this:
int mysocketfd = Connect("servername.serverdomain.com", 2960); // get a fresh TCP socket connection
MessageIOGateway gw;
gw.SetDataIO(DataIORef(new TCPSocketDataIO(mysocketfd, false), NULL));
bool keepGoing = true;
struct fd_set readSet, writeSet;
while(keepGoing)
{
FD_ZERO(&readSet);
FD_ZERO(&writeSet);
FD_SET(mysocketfd, &readSet);
if (gw.HasBytesToOutput()) FD_SET(mysocketfd, &writeSet);
if (select(mysocketfd+1, &readSet, &writeSet, NULL, timeout) < 0)
{
perror("select() failed");
keepGoing = false;
}
bool readyToWrite = FD_ISSET(mysocketfd, &writeSet);
bool readyToRead = FD_ISSET(mysocketfd, &readSet);
/* Do as much TCP I/O as possible without blocking */
bool writeError = ((readyToWrite)&&(gw.DoOutput() < 0));
bool readError = ((readyToRead)&&(gw.DoInput() < 0));
if ((readError)||(writeError)) keepGoing = false;
/* handle any received messages */
MessageRef msg;
while(gw.GetNextIncomingMessage(msg) == B_NO_ERROR)
{
printf("Received incoming TCP Message:\n");
if (msg()) msg()->PrintToStream(); // handle message here
}
}
/* note: don't call CloseSocket(mysocketfd), as the TCPSocketDataIO destructor will do it for you */
printf("Connection was closed!\n");
Alternatively, you can set the blocking-I/O parameter in the TCPSocketDataIO object to true, and use blocking I/O instead. If you do that, then you don't have to deal with the complexities of select()... but then it becomes difficult to coordinate sending and receiving at the same time (i.e. how do you call DoOutput() if you are blocked waiting for data in DoInput()?)
Message semantics for client/server connections
Regardless of whether you are sending and receiving messages with a MessageTransceiverThread with direct calls to a IOGateway, the result looks the same to the program at the other end of the TCP connection: It always sees a just a sequence of Message objects. How that program acts on those messages is of course up to it. However, the servers included in this archive do have some minimal standard semantics that govern how they handle the messages they receive. The following sections describe those semantics.
DumbReflectSession semantics
If you are connected to a MUSCLE server that was compiled to use the DumbReflectSession class to handle its connections, then the semantics are extremely simple: Any Message you send to the server will be sent on, verbatim, to every other connected client. (Sort of a high-level version of Ethernet broadcast packets). This may be useful in some situations, but for applications where bandwidth is an issue you'll probably want to use the "regular" server with StorageReflectSession semantics.
StorageReflectSession semantics
The StorageReflectSession-based server (a.k.a. "muscled") is much more powerful than the DumbReflectSession server, for two reasons: First, it makes intelligent decisions about how to route client messages, so that your messages only go to the clients you specify. The second reason is because this server allows you to store messages (semi-permanently; they are retained for as long as you remain connected) in the server's RAM, where other clients can access them without having to communicate with you directly. If you imagine a situation where the server is running on 100Mbps Ethernet, and the clients are connecting through 28.8 modems, then you can see how this can be useful.
The StorageReflectSession server maintains a single tree data structure very much like the filesystem of your average desktop computer. Although this data structure exists only in memory (nothing is ever written to the server's disk), it shares many things in common with a multi-user file system. Each node in the tree has an ASCII label that uniquely identifies it from its siblings, and also contains a single Message object, which client machines may get or set (with certain restrictions). The root node of the tree contains no data, and is always present. Nodes underneath the root, on the other hand, may appear and dissappear as clients connect and disconnect. The first level of nodes beneath the root are automatically created whenever a client connects to the server, and are named after the host IP address of the client machine that connected. (For example, "192.168.0.150"). The second level of nodes are also automatically created, and these nodes are given unique names that the server makes up arbitrarily. (This second level is necessary to disambiguate multiple connections coming from the same host machine) The number of level 2 nodes in the tree is always the same as the number of currently active connections ("sessions") on the server.
___________'/'_________ (level 0 -- "root")
| |
192.168.0.150 132.239.50.13 (level 1 -- host IP addresses)
| | |
3217617 3217618 1829023 (level 2 -- unique session IDs)
| |
SomeData MoreData (level 3 -- user data nodes)
| |
RedFish BlueFish (level 4)
Levels 1 and 2 of the tree reflect two simultaneous sessions connected from 192.168.0.150, and one connection from 132.239.50.13. In levels 3 and 4, we can see that the sessions have created some nodes of their own. These "user-created" nodes can be named anything you want, although no two siblings can have the same name. Each client may create data nodes only underneath its own "home directory" node in level 2--you aren't allowed to write into the "home directories" of other sessions. However, any client may read the contents of any node in the system.
As in any good filesystem (e.g. UNIX's), nodes can be identified uniquely by a node-path. A node-path is simply the concatenation of all node names from the root to the node, separated by '/' characters. So, the tree in the above example contains the following node-paths:
/
/192.168.0.150
/192.168.0.150/3217617
/192.168.0.150/3217617/SomeData
/192.168.0.150/3217618
/132.239.50.13
/132.239.50.13/1829023
/132.239.50.13/1829023/MoreData
/132.239.50.13/1829023/MoreData/RedFish
/132.239.50.13/1829023/MoreData/BlueFish
Creating or modifying nodes in your subtree (a.k.a. uploading data)
One thing most clients will want to do is create one or more new nodes in their subtree on the server. Since each node contains a Message, creating a node is the same thing as uploading data to the server. To do this, you send the server a PR_COMMAND_SETDATA message. A single PR_COMMAND_SETDATA message can set any number of new nodes. For each node you wish to set, simply AddMessage() the value you wish to set it to, with a field name equal to the path of the node relative to your "home directory". For example, here's what the client from 132.239.50.13 could have done to create the MoreData, RedFish, and BlueFish nodes under his home directory:
Message redFishMessage('RedF'); // these messages could contain data
Message blueFishMessage('BluF'); // you wish to upload to the server
MessageRef msg = GetMessageFromPool(PR_COMMAND_SETDATA);
if (msg())
{
msg()->AddMessage("MoreData/RedFish", redFishMessage);
msg()->AddMessage("MoreData/BlueFish", blueFishMessage);
myMessageTranceiver->SendMessageToSessions(msg);
}
else printf("Out of memory?!\n");
Note that the "MoreData" node did not need to be explicitely created in this message; the server will see that it doesn't exist and create it before adding RedFish and BlueFish to the tree. (Nodes created in this way have empty Messages associated with them). If 132.239.50.13 later wants to change the data in any of these nodes, he can just send another PR_COMMAND_SETDATA message with the same field names, but different messages.
Accessing nodes in the tree (a.k.a. downloading data)
If you want to find out the current state of one or more nodes on the server, you should send a PR_COMMAND_GETDATA message. In this PR_COMMAND_GETDATA message, you should add one or more strings to the PR_NAME_KEYS field. Each of these strings may specify the full path-name of a node in the tree that you are interested in. For example:
MessageRef msg = GetMessageFromPool(PR_COMMAND_GETDATA);
if (msg())
{
msg()->AddString(PR_NAME_KEYS, "/192.168.0.150/3217617/SomeData");
msg()->AddString(PR_NAME_KEYS, "/132.239.50.13/1829023/MoreData/RedFish");
msg()->AddString(PR_NAME_KEYS, "/132.239.50.13/1829023");
myMessageTranceiver->SendMessageToSessions(msg);
}
Soon after you sent this message, the server would respond with a PR_RESULT_DATAITEMS message. This message would contain the values you asked for. Each value is stored in a separate message field, with the field's name being the full node-path of the node, and the field's value being the Message that was stored with that node on the server. So for the above request, the result would be:
Message: what = PR_RESULT_DATAITEMS numFields = 3
field 0: name = "/192.168.0.150/3217617/SomeData" value = (a Message)
field 1: name = "/132.239.50.13/1829023/MoreData/RedFish" value = (a Message)
field 2: name = "/132.239.50.13/1829023" value = (an empty Message)
Pattern matching in node paths
Of course, not all the nodes you specified may actually exist on the server; if the server cannot find a node that you requested it simply won't add it to the PR_RESULT_DATAITEMS message it sends you. Thus it's possible to get back an empty PR_RESULT_DATAITEMS message if you're unlucky.
The above method of retrieving data is okay as far as it goes, but it only works if you know in advance the node-path(s) of the data you want. But in the real world, you won't usually know e.g. the host addresses of other connected clients. Fortunately, the MUSCLE server understands wildcard patterns in the node-paths you send it. Wildcarding allows you to specify a pattern to watch for rather than a particular unique string. A detailed discussion of pattern matching is outside the scope of this document, but if you've used any UNIX-style shell much at all you probably have a good idea how they work. For example, say we wanted to know the host address of every machine connected to the server:
MessageRef msg = GetMessageFromPool(PR_COMMAND_GETDATA);
if (msg())
{
msg()->AddString(PR_NAME_KEYS, "/*");
myMessageTranceiver->SendMessageToSessions(msg);
}
The "/*" pattern in the PR_NAME_KEYS field above matches both "/192.168.0.150" and "/132.239.50.13" in the tree, so we would get back the following:
Message: what = PR_RESULT_DATAITEMS numFields = 2
field 0: name = "/192.168.0.150" value = (an empty Message)
field 1: name = "/132.239.50.13" value = (an empty Message)
Or, perhaps we want to know about every node in every session's home directory that starts with the letters "Som". Then we could do:
msg()->AddString(PR_NAME_KEYS, "/*/*/Som*");
And so on. And of course, you are still able to add multiple PR_NAME_KEYS values to a single PR_COMMAND_GETDATA message; the PR_RESULT_DATAITEMS message you get back will contain data for any node that matches at least one of your wildcard patterns.
One more detail: Since patterns that start with "/*/*" turn out to be used a lot, they can be made implicit in your path requests. Specifically, any PR_NAME_KEYS value that does not start with a leading '/' character is taken to have an implicit '/*/*/' prefix. So doing
msg()->AddString(PR_NAME_KEYS, "Gopher");
is semantically equivalent to doing
msg()->AddString(PR_NAME_KEYS, "/*/*/Gopher");
Deleting nodes in your subtree
To remove nodes in your subtree, send a PR_COMMAND_REMOVEDATA message to the server. Add to this message one or more node-paths (relative your session directory) indicating the node(s) to remove. These paths may have wildcards in them. For example, if 132.239.50.13 wanted to remove all nodes from his subtree, he could do this:
MessageRef msg = GetMessageFromPool(PR_COMMAND_REMOVEDATA);
if (msg())
{
msg()->AddString(PR_NAME_KEYS, "MoreData/RedFish");
msg()->AddString(PR_NAME_KEYS, "MoreData/BlueFish");
msg()->AddString(PR_NAME_KEYS, "MoreData");
myMessageTranceiver->SendMessageToSessions(msg);
}
or this:
msg()->AddString(PR_NAME_KEYS, "MoreData"); /* Removing a node implicitely removes its children */
or even just this:
msg()->AddString(PR_NAME_KEYS, "*"); /* wildcarding */
You can only remove nodes within your own subtree. You can add as many PR_NAME_KEYS strings to your PR_COMMAND_REMOVEDATA message as you wish.
Sending messages to other clients
Any message you send to the server whose 'what' value is not one of the PR_COMMAND_* constants is considered by the server to be a message meant to be forwarded to the other clients. But which ones? Again, the issue is decided by using pattern matching on node-paths. The server will examine your message for a PR_NAME_KEYS string field. If it finds one (or more) strings in this field, it will use these strings as node-paths; any other client whose has one or more nodes that match your node-path expressions will receive a copy of your message. For example:
MessageRef msg = GetMessageFromPool('HELO');
if (msg())
{
msg()->AddString(PR_NAME_KEYS, "/192.168.0.150/*");
myMessageTranceiver->SendMessageToSessions(msg);
}
would cause your 'HELO' message to be sent to all sessions connecting from 192.168.0.150. Or, more interestingly:
msg()->AddString(PR_NAME_KEYS, "/*/*/Gopher");
Would cause your message to be sent to all sessions who have a node named "Gopher" in their home directory. This is very handy because it allows sessions to "advertise" for which types of message they want to receive: In the above example, everyone who was interested in your 'HELO' messages could signify that by putting a node named "Gopher" in their directory.
Other examples of ways to address your messages:
msg()->AddString(PR_NAME_KEYS, "/*/*/J*")
Will send your message to all clients who have a node in their home directory whose name begins with the letter 'J'.
msg()->AddString(PR_NAME_KEYS, "/*/*/J*/a*/F*")
This (contrived) example would send your message only to clients who have something like "Jeremy/allen/Friesner" present...
msg()->AddString(PR_NAME_KEYS, "Gopher");
This is equivalent to the "/*/*/Gopher" example used above; if no leading slash is present, then the "/*/*/" prefix is considered to be implied.
msg()->AddString(PR_NAME_KEYS, "Gopher");
msg()->AddString(PR_NAME_KEYS, "Bunny");
This message will go to clients who have node named either "Gopher" or "Bunny" in their home directory. Clients who have both "Gopher" AND "Bunny" will still only get one copy of this message.
If your message does not have a PR_NAME_KEYS field, the server will check your client's parameter set for a string parameter named PR_NAME_KEYS. If this parameter is found, it will be used as a "default" setting for PR_NAME_KEYS. If a PR_NAME_KEYS parameter setting does not exist either, then the server will resort to its "dumb" behavior: broadcasting your message to all connected clients.
Subscriptions (a.k.a. automatic change notification triggers)
Theoretically, getting and setting data nodes is all that is necessary for meaningful client-to-client data transfer to take place. Realistically, though, it would suck. After all, what good is it to download the data from a bunch of nodes if that data might be changed 50 milliseconds after it was sent to you? You'd end up having to issue another PR_COMMAND_GETDATA message every few seconds just to make sure you had the latest data. Now imagine fifty simultaneously connected clients doing that. No, that would never do.
To deal with this problem, the StorageReflection Server allows your client to set "subscriptions". Each subscription is nothing more than the node-path of one or more nodes that your client is interested in. The path format and semantics of a subscription request are exactly the same as those in a PR_COMMAND_GETDATA message, but the way you compose them is quite different. Here is an example:
MessageRef msg = GetMessageFromPool(PR_COMMAND_SETPARAMETERS);
if (msg())
{
msg()->AddBool("SUBSCRIBE:/*/*", true);
myMessageTranceiver->SendMessageToSessions(msg);
}
The above is a request to be notified whenever the state of a node whose path matches "/*/*" changes (which is actually the same as being notified whenever another session connects or disconnects--very handy for some applications). Note that the subscription path is part of the field's name, not the field's value. Note also that the field has been added as a boolean. That actually doesn't matter; you can add your subscribe request as any type of data you wish--the value won't even be looked at, it's only the field's name that is important.
As soon as your PR_COMMAND_SETPARAMETERS message is received by the server, it will send back a PR_RESULT_DATAITEMS message containing values for all the nodes that matched your subscription path(s). In this respect, your subscription acts similarly to a PR_COMMAND_GETDATA message. But the difference is that the server keeps your subscription strings "on file", and afterwards, every time a node is created, changed, or deleted--and its node-path matches at least one of your subscription paths, the server will automatically send you another PR_COMMAND_DATAITEMS message containing the message(s) that have changed, and their newest values. Note that each PR_COMMAND_DATAITEMS message may have more than one changed node in it at a time (i.e. if someone else changes several nodes at a time).
When the server wishes to notify you that a node matching one of your subscription paths has been deleted, it will do so by adding the node-path of the deceased node to the PR_NAME_REMOVED_DATAITEMS field of the PR_RESULT_DATAITEMS message it sends you. Again, there may be more than one PR_NAME_REMOVED_DATAITEMS value in a single PR_RESULT_DATAITEMS message.
Cancelling Subscriptions
Cancelling a subscription is just the same as removing any other parameter from your parameter set; indeed subscriptions are just parameters whose names start with the magic prefix "SUBSCRIBE:". As such, see the next section on setting and removing parameters for how to do this. (One caveat: since parameters to remove are specified with pattern matching, you may need to escape any wildcard characters in your SUBSCRIBE: string to avoid removing additional parameters that you didn't intend to. There is a utility function in StringMatcher.h that you can use to do this if you like)
Setting parameters
In addition to data-node storage, each client holds a set of name-value pairs called its parameter set. These parameters are used by the session to control certain session-specific policies. The names in this set may be any ASCII string (although only certain names are actually paid attention to by the server), and the values may be of any type that is allowed as a field in a Message. To set or modify a parameter for your session, just send a PR_COMMAND_SETPARAMETERS message to the server with the names and values included. For example:
MessageRef msg = GetMessageFromPool(PR_COMMAND_SETPARAMETERS);
if (msg())
{
msg()->AddBool(PR_NAME_REFLECT_TO_SELF, true);
// enable wildcard matching on my own subdirectory
msg()->AddString(PR_NAME_KEYS, "/*/*/Gopher");
// set default message forwarding pattern
msg()->AddBool("SUBSCRIBE:/132.239.50.*", true);
// add a subscription to nodes matching "/132.239.50.*"
msg()->AddBool("SUBSCRIBE:*", true);
// add a subscription to nodes matching "/*/*/*"
msg()->AddInt32("Glorp", 666);
// other parameters like this will be ignored
myMessageTranceiver->SendMessageToSessions(msg);
}
The fields included in your message will replace any like-named fields already existing in the parameter set. Any fields in the existing parameter set that aren't specified in your message will be left unaltered.
Getting the current parameter set
If you wish to know the exact parameter set that your session is currently operating under, send a PR_COMMAND_GETPARAMETERS message to the server. This message need not contain any fields at all; only the 'what' code is looked at. When the server receives a PR_COMMAND_GETPARAMETERS message, it will respond by sending you back a PR_RESULT_PARAMETERS message that contains all the name->value pairs in your parameter set as fields.
Removing parameters
To remove one or more parameters, send a PR_COMMAND_REMOVEPARAMETERS message. This message should contain a PR_NAME_KEYS field containing one or more strings that indicate which parameters to remove. For example:
MessageRef msg = GetMessageFromPool(PR_COMMAND_REMOVEPARAMETERS);
if (msg())
{
msg()->AddString(PR_NAME_KEYS, PR_NAME_REFLECT_TO_SELF);
// disable wildcard matching on my own subdirectory
msg()->AddString(PR_NAME_KEYS, "SUBSCRIBE:*");
// removes ALL subscriptions (compare with "SUBSCRIBE:\*" which would only remove one)
myMessageTranceiver->SendMessageToSessions(msg);
}
Recognized parameter names
Currently there are only a few parameters that whose values are acted upon by the StorageReflect Server. These are:
- PR_NAME_KEYS
If present and set to a string value, this value is used as the default pattern to match for determining which sessions user messages (i.e. messages whose 'what' code is not one of the PR_COMMAND_* constants) should be routed to. See the section on "Sending Messages to Other Clients" for details. Defaults to unset (which is equivalent to "/*/*").
- PR_NAME_REFLECT_TO_SELF
If present, all wild-card pattern matches will include your own session's nodes in their result sets. If not present (the default), only nodes from other sessions will be returned to your client. This field may be of any type or value; only its existence/non-existence is looked at.
- "SUBSCRIBE:..."
Any string parameter whose name starts with the prefix "SUBSCRIBE:" is treated as a subscription request. See the section on Subscriptions for details.
- PR_NAME_SUBSCRIBE_QUIETLY
This isn't actually a parameter itself, but if added to the same PR_COMMAND_SETPARAMETERS message as one or more "SUBSCRIBE:" entries, it will suppress the initial notification of the new subscription data. That is to say, if you add an item with this name to your message, you won't receive the values of the nodes you subscribed to until the next time they are changed. (remember that by default you would receive the "current' values of the nodes immediately)
(Optional) Indexed Nodes and Orderly Children
New for v1.40 of muscled is support for "indexed" nodes. An indexed node is the same as any other node, except that it maintains an ordered index of some or all of its children. In most cases, this index is unecessary, so by default no indexing is done. There are some cases where the ordering of child nodes is important, however--for example, if you are advertising to other clients a list of instructions that must be executed in order, and the list too large to upload an entire new list every time the list changes.
To enable indexing for a node, simply add child nodes to it using PR_COMMAND_INSERTORDEREDDATA instead of the usual PR_COMMAND_SETDATA messages. In a PR_COMMAND_INSERTORDEREDDATA message, you specify the parent node that the new child node is to be added to, and the data/Message the new child node will contain--but muscled will be the one to assign the new child node an (algorithmically generated) name. Generated names are guaranteed to start with a capital 'I'. Muscled will add the child node in front of a previously added indexed child whose name you specify, or will add it to the end of the index if you specify sibling name that is not found in the index.
Here is an example PR_COMMAND_INSERTORDEREDDATA message that will add an indexed child node to the node "myNode":
Message imsg(PR_COMMAND_INSERTDATA);
imsg.AddString(PR_NAME_KEYS, "myNode"); // specify node(s) to add insert the child node(s) under
Message childData('Chld'); // any data you want to store can be placed in this message....
imsg.AddMessage("I4", childData); // add the new node before I4, if I4 exists. Else append to the end.
If myNode already contains an indexed child with the name I4, the new node will be inserted into the index just before I4. If I4 is not found in the index, the new node will be appended to the end of the index. If you want to be sure that your new child is always added to the end of the index, you can just AddMessage() using a field name that doesn't start with a capital I.
You are allowed to specify more than one parent node in PR_NAME_KEYS (either via wildcarding, or via multiple PR_NAME_KEYS values)--this will cause the same child nodes to be added to all matching nodes. You are also allowed to specify multiple child messages to add in a single INSERTORDEREDDATA message (either by adding sub-messages under several different field names, or by adding multiple sub-messages under a single field name).
When a node contains an index (i.e. when it has at least one child under it that was added via PR_COMMAND_INSERTORDERREDDATA) any clients that are subscribed to that node will receive PR_RESULT_INDEXUPDATED messages when the index changes. These messages allow the subscribed clients to update their local copy of the index incrementally. Each PR_RESULT_INDEXUPDATED message will contain one or more string fields. Each string field's name will be the fully qualified path of the indexed node whose index has been changed. Each string/value in a given string field represents a single operation on the index. An example message might look like this:
Message: this=0x800c32f8, what='!Pr4' (558920244/0x21507234), entryCount=1, flatSize=79
Entry: Name=[/spork/0/hello], CountItems()=4, TypeCode()='CSTR' (1129534546) flatSize=40
0. [c]
1. [i0:I0]
2. [i1:I1]
3. [r1:I1]
The first letter of each string is an opcode, one of the INDEX_OP_* constants defined in StorageReflectConstants.h. Here we see that the first instruction has a 'c', or INDEX_OP_CLEARED, indicating that the index was cleared. The next instruction, i0:I0 starts with a INDEX_OP_ENTRYINSERTED, and indicates that a child node named I0 was inserted at index 0. After that, a child node named I1 was inserted at index 1. Lastly, the INDEX_OP_ENTRYREMOVED opcode ('r') indicates that the node at index 1 (I1) was then removed from the list. By parsing these instructions, the client can update its own local index to follow that of the server.
Note that the index only contains node names and ordering information; the actual node data is kept in the child nodes, in the normal fashion. So most clients will want to subscribe to both the indexed parent node, and its children, in order to display the data that the index refers to.
An indexed node will also send the contents of its index (in the form of a PR_RESULT_INDEXUPDATED message with a INDEX_OP_CLEARED op code, followed by one or more INDEX_OP_ENTRYINSERTED opcodes) to any client that requests it via the PR_COMMAND_GETDATA command. This message is sent in addition to the regular PR_RESULT_DATAITEMS message.
To remove a node entry from the index, simply remove the delete the child node in the normal fashion, using a PR_COMMAND_REMOVEDATA message. You can, of course, update a child node's data using a PR_COMMAND_SETDATA message, without affecting its place in the index.
One last note is that index data is always sent to all clients that ask for it; it is even sent to the client who created/owns the indexed node. That is to say, the PR_NAME_REFLECT_TO_SELF attribute may be considered always set as far as index data is concerned. This is because the index is created on the server side, and so not even the client-side initiator of the index creation can be exactly sure of the index's state. It's best for clients not to make assumptions about the contents of the index, and update their local indices based solely on the PR_RESULT_INDEXUPDATED messages they receive from the server.