Okay, this is a little rough but here goes. There are two main hubs of data flow in this system. The first is local to the embedded system and is implemented by the ipc_server daemon (source located in the commhub/ directory). The second is a WAN-based synchronization service responsible for striking the best practical balance between always-ready operation and eventual consistency between the database state stored on the bus (the client) and the master database state maintained by the garage (the server).

---------------------------------- Interprocess Communication Hub -----------

This module provides majordomo-like, subscription-based interprocess communication between local modules using a UNIX domain socket in SOCK_SEQPACKET mode. The server functions as a smart router/switch, allowing each module to address any or all other modules with any of the following addressing modes:

1) Mailbox Name (delivers to every process that has subscribed to the named mailbox).
2) "_BROADCAST" (delivers to all connected clients).
3) ">nnn" (delivers the message to the client with PID == nnn).
4) ":module" (delivers to all clients registered with the supplied module name).

The server also manages a number of "special" mailboxes that execute specific communication flow management functions. These are #define'd string constants that resolve to the actual special mailbox names:

1) MAILBOX_SUBSCRIBE (subscribes the sending client to the mailbox named in the payload).
2) MAILBOX_UNSUBSCRIBE (unsubscribes the sending client from the specified mailbox).
3) MAILBOX_BROADCAST (sends the message to all connected clients).
4) MAILBOX_WIRETAP (payload == "ON" to select wiretap mode, anything else to deselect). When a client selects wiretap mode it will receive an unmodified copy of each and every message from that point forward (until wiretap mode is deselected).
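The four addressing modes above can be distinguished purely from the shape of the destination string. Here is a minimal sketch of how that classification might look; the enum names and classify_destination() are invented for illustration and are not taken from the actual commhub source:

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical illustration of the four addressing modes described above.
   The type and function names here are assumptions for this sketch. */
typedef enum {
    ADDR_BROADCAST,  /* "_BROADCAST": deliver to all connected clients     */
    ADDR_PID,        /* ">nnn":       deliver to the client with PID nnn   */
    ADDR_MODULE,     /* ":module":    deliver to clients with that module  */
    ADDR_MAILBOX     /* anything else: deliver to subscribers of a mailbox */
} addr_mode_t;

static addr_mode_t classify_destination(const char *dest)
{
    if (strcmp(dest, "_BROADCAST") == 0)
        return ADDR_BROADCAST;
    if (dest[0] == '>' && isdigit((unsigned char)dest[1]))
        return ADDR_PID;
    if (dest[0] == ':')
        return ADDR_MODULE;
    return ADDR_MAILBOX;
}
```

The real server would presumably also handle the special management mailboxes as a separate case before this dispatch.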
This includes all special management messages and PID- or module-addressed messages. This allows for easy message flow debugging (including debugging of the communication hub itself) and for robust, on-the-fly selectable logging in the field without any need to recompile or even interrupt the system. We can attach to a running system in any state (including failure states) and transparently observe data flow between modules.

The server listens on a socket located at /tmp/commhub. Clients connect using the connect_to_message_server() function, which returns a file descriptor representing the connection to the IPC server. This function makes the socket connection and registers the client process with the server by both PID and module name. Once a client is registered, messages can be sent to the server by defining a message structure, populating it with the prepare_message() function, and then calling the send_message() function to dispatch the message to the server. Incoming messages can be retrieved with the get_message() function after a call to poll() indicates POLLIN on the fd for the connection to the server. If there are no other file descriptors to poll, or a quick non-blocking check is desirable, there is a provided poll wrapper called message_socket_status() which can be passed the fd associated with the connection and a mask of MSG_SEND | MSG_RECV; it reports whether a message can be sent and/or whether there is a message waiting to be received on that connection at the present moment.

An important feature of this architecture is that it allows the modules communicating through the hub to be started, stopped, and replaced independently and transparently. If the modules are designed with sensible opaque interfaces, this allows for a remarkable amount of flexibility in implementation and testing, as modules can be fashioned to function as test jigs for other individual modules or for complex subsystems of modules.
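The choice of SOCK_SEQPACKET matters here: unlike SOCK_STREAM, it preserves record boundaries, so every recv() on the connection yields exactly one whole message and no framing protocol is needed on top. A standalone demonstration of that property (using socketpair() instead of the real /tmp/commhub socket; seqpacket_roundtrip() is just a name for this sketch):

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* SOCK_SEQPACKET preserves record boundaries, which is why the hub can
   treat every read as exactly one whole message. This demonstration
   uses socketpair(); the actual server listens on /tmp/commhub. */
static ssize_t seqpacket_roundtrip(char *out, size_t outlen)
{
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, fds) != 0)
        return -1;

    /* Two separate sends become two separate records... */
    send(fds[0], "hello", 5, 0);
    send(fds[0], "world", 5, 0);

    /* ...so a recv() with a large buffer still returns only the first
       record, never a concatenation of the two. */
    ssize_t n = recv(fds[1], out, outlen, 0);

    close(fds[0]);
    close(fds[1]);
    return n;
}
```

With SOCK_STREAM the same recv() could legally return all ten bytes at once, which is exactly the framing headache SOCK_SEQPACKET avoids.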
Test jigs can even be built to emulate fault conditions to test system recovery and failure workarounds. Modules can also be built to hide hardware-servicing routines. For instance, there could be a module responsible for communicating with every peripheral that lives in the passenger interface unit (RFID, Magstripe, Cash Vault, and Passenger-Facing Display). This module can simply multiplex and demultiplex the serial communications with those peripherals, create mailboxes to receive any data that needs to be transmitted to those peripherals, and send any input received from those peripherals to interface-specified mailboxes (one per peripheral and/or one per message type).

The beauty of this abstraction layer is two-fold. First, it allows things like rebooting the PIU to happen without the need to interrupt or reload any other process (minimizing collateral damage from unnecessary dependencies). The second place it pays off is the case where a hardware change causes a peripheral to migrate from one subsystem to another. In that case, only the modules that directly service I/O to those peripherals need to be aware of the change; all other modules will continue to use the agreed-upon message passing interface, and the fact that (for instance) the cash-vault relay has been moved from the Passenger Interface Unit to the Driver Interface Unit will not matter to any of the other modules that need to make use of that service. This design was heavily influenced by the elegance of the Erlang ERTS framework.
My goal here is to make a very lightweight and portable C-based system that provides a similar type of modularization and abstraction, giving us maximum ongoing implementation flexibility and robust testing similar to that provided by the ERTS framework, without incurring the overhead (and hassle) of maintaining our own cross-compiled ARM port of ERTS, and without compelling all the other developers to learn Erlang (which, with such a tight development schedule, would be madness).

-------- Client (bus) to Server (garage) Synchronization System ---------

This module actually consists of two components: a server component in the garage, responsible for maintaining a master database that is considered the 'absolute truth' and always contains the canonical system state, and a second component which lives on the bus (the client) and is intermittently connected to the garage (as limited by network availability). The goal of these two modules is to maintain synchronization with as great a degree of accuracy as is practical while taking the following constraints into account:

1) Connectivity is spotty. A client may lose touch with the server at any time, and therefore must ALWAYS be in a state where it can function autonomously with a reasonable degree of predictability (it must not confuse or thwart the drivers or riders). This means there is a possibility of accepting a fare (for instance) on a pass that may have been used up during the communication outage, or other such conditions. We must always be sure to err in favor of permitting 'normal' system functions. It is acceptable to occasionally give out what turns out to be a free ride on a recently expired pass; it is not acceptable to EVER reject a fare on a valid pass.
2) Due to the multitude of busses and the intermittent nature of the connectivity, each client bus can only make relative declarations ("Rider X has used 1 ride, decrement their pass"), and the burden of aggregating and calculating the resulting system state lies on the server. Each bus may individually update its own state, but it must always allow an update from the server to override any local changes. An example scenario:

2a) Rider-X boards Bus-Y with an n-Ride pass with 7 rides left.
2b) Bus-Y decrements its LOCAL count for that pass to 6 rides.
2c) Bus-Y transmits a message to the server: "Rider-X's pass has used 1 ride."
2d) Rider-X' (Rider-X's evil twin) gets on Bus-Z with another copy of the same pass.
2e) Bus-Z decrements its LOCAL count for that pass to 6.
2f) Bus-Z transmits a message to the server: "Rider-X's pass has used 1 ride."
2g) The server receives both decrement messages and transmits to ALL busses the message "Rider-X's pass now has 5 rides left."
2h) Bus-Y overwrites its LOCAL copy of Rider-X's pass count with the new one from the server (5).
2i) Bus-Z overwrites its LOCAL copy of Rider-X's pass count with the new one from the server (5).

3) For this method to work, the server must serialize all incoming events that have been transmitted from the individual busses and apply each change as a transaction in the master database. This is a related but distinct process from the next step.

4) An incrementing serial number must be kept for each change in the master database. Each time a client bus checks in with the server, it should supply the serial number of the last successfully integrated transaction (a transaction is counted as successfully integrated by the client once the local database has been updated and the changes have been successfully sync'd to secondary storage). The server must then supply that client with a batch of all changes with a serial number greater than the supplied key.
The client will then integrate those changes, synchronize to secondary storage, and then advance its own key. In this manner, a client can go an arbitrary amount of time without checking in and then receive a batch of all the messages that accumulated in the meantime.

5) Either end (server or client) always has the right to request a full synchronization, where the client database is wiped and replaced wholesale by the server copy, and the serial number is updated to the newest one in the transmitted 'full' copy. This can be invoked if a client hasn't received incremental updates in some time, or it can be used as a troubleshooting measure if there is a suspected transmission or storage error whereby the client appears to have an incomplete or incorrect database.
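The evil-twin scenario in item 2 can be sketched as a toy simulation. The struct and function names below are invented for illustration; the real system stores passes in a database and carries these updates as messages through the sync layer:

```c
#include <assert.h>

/* Toy model of the scenario in item 2: one canonical count at the
   garage, plus each bus's speculative local copy. */
typedef struct {
    int server_rides;  /* canonical count held by the garage */
    int bus_y_rides;   /* Bus-Y's local copy                 */
    int bus_z_rides;   /* Bus-Z's local copy                 */
} pass_state_t;

/* A bus reports a RELATIVE change ("used 1 ride"); the server applies
   each report as a serialized transaction against the master count. */
static void server_apply_ride_used(pass_state_t *p)
{
    p->server_rides -= 1;
}

/* The server then pushes the ABSOLUTE count to all busses, and each
   bus overwrites its local copy, discarding its own speculation. */
static void server_broadcast(pass_state_t *p)
{
    p->bus_y_rides = p->server_rides;
    p->bus_z_rides = p->server_rides;
}

static pass_state_t run_scenario(void)
{
    pass_state_t p = { 7, 7, 7 };  /* 2a: pass has 7 rides left        */
    p.bus_y_rides -= 1;            /* 2b: Bus-Y decrements locally (6) */
    server_apply_ride_used(&p);    /* 2c: "used 1 ride" reaches server */
    p.bus_z_rides -= 1;            /* 2e: Bus-Z decrements locally (6) */
    server_apply_ride_used(&p);    /* 2f: second report reaches server */
    server_broadcast(&p);          /* 2g-2i: server pushes 5 to all    */
    return p;
}
```

Note that neither bus ever computes "5" on its own; the converged count only exists because the server aggregates the two relative reports and its broadcast overrides both local copies.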
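The serial-number catch-up from item 4 amounts to a filter over the ordered change log. A minimal sketch, assuming an in-memory log for illustration (change_t and fetch_changes_after() are invented names; the real server would pull these rows from the master database):

```c
#include <assert.h>
#include <stddef.h>

/* One entry in the master database's ordered change log. */
typedef struct {
    long serial;       /* monotonically increasing per transaction  */
    const char *desc;  /* placeholder for the actual change payload */
} change_t;

/* Return, via 'out', every change with a serial number strictly
   greater than the client's last successfully integrated serial.
   'out' must have room for at least n pointers. */
static size_t fetch_changes_after(const change_t *log, size_t n,
                                  long last_integrated,
                                  const change_t **out)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (log[i].serial > last_integrated)
            out[count++] = &log[i];
    return count;
}
```

Because the comparison is strictly greater-than, a client that crashes after integrating but before checking in simply receives an empty batch on its next check-in rather than reapplying old changes.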