NestFoo
From NESTFE Wiki
NestFoo Software Architecture
Game
One-Flag Capture
- A referee randomly chooses
- A flag position
- An evader starting point
- An evader exit point
- A pursuer starting point
- Only the evaders are informed of the flag position and exit point and may not use the sensor network
- Only the pursuers are informed of their starting point and may use the sensor network
- The goal: the pursuers must catch the evaders before the evaders exit with the flag
Permutations
- Central intelligence
- Pursuers only use a remote base station
- Walkie talkies to people in the field
- Special forces
- Pursuers have PDAs
- Since we have line of sight to most of the large playing field, assume a foggy day
Scenario: Objects are tracked
Centralized multi-object tracking from gathered detections. The base station receives detection events from the network and uses PC-side tools to unpack the messages, process the detections into events, and visualize the events.
- The multiple object tracking algorithm requires detection position and detection time
- The detection messages could look something like this
struct detection_event {
  uint16_t source_node_address;
  uint32_t event_time;
  uint16_t event_strength;
  int32_t event_x_position; // 24.8 fixed point
  int32_t event_y_position; // 24.8 fixed point
};
- What types should be used for x-position and y-position? Fixed-point 24.8 in an int32_t seems to be sufficient. With units of meters, that allows > 16000 km end-to-end range (> 1/3 the circumference of the Earth) at a precision of < 4 mm.
- event_strength is included because, even if otherwise unused, it could be interesting for the visualization
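The 24.8 encoding discussed above can be sketched as follows. This is only an illustration: it assumes positions are expressed in meters (the unit is not stated here, but the quoted range and precision are consistent with it), and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* 24.8 fixed point: 8 fractional bits, so one step is 1/256 m (about 3.9 mm).
 * Assumption: positions are in meters. Helper names are illustrative only. */
#define POS_FRAC_BITS 8
#define POS_ONE (1 << POS_FRAC_BITS)

/* Convert meters to 24.8 fixed point, rounding to the nearest step. */
int32_t pos_from_meters(double meters)
{
    double scaled = meters * POS_ONE;
    /* The value must fit in an int32_t, i.e. roughly +/- 8388 km. */
    assert(scaled >= (double)INT32_MIN && scaled <= (double)INT32_MAX - 1.0);
    return (int32_t)(scaled >= 0 ? scaled + 0.5 : scaled - 0.5);
}

/* Convert 24.8 fixed point back to meters. */
double pos_to_meters(int32_t fixed)
{
    return (double)fixed / POS_ONE;
}
```

For example, a position of 1.5 m is stored as 384, and any two positions closer than 1/256 m may map to the same value.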
Scenario: Gateways collect data
- One to many gateways are positioned throughout the network to collect and disseminate data
- Bridge from 802.11 to the sensor network
- Provide a SerialForwarder interface over 802.11
- One aggregating base station connects to all gateways and merges the SerialForwarders
- Visualization base stations connect to the aggregating base station to acquire/log data
Gateway hardware
Details at Tier 2.
Scenario: Nodes report detections
Report to one gateway
This describes the basic problem and the existing solution. We need to extend it as described in "Report to one of many gateways".
"Report to one gateway" is easily achieved by using one spanning tree built using Drain.
We must use Drain (instead of, say, MultiHopLQI) because we need nested AMs to send and receive messages independently of the routing protocol. Drip and Drain already implement the nested AM architecture.
Drain exposes a combination of both the Send and SendMsg interfaces.
- Send for getting a pointer to the data you're allowed to write to.
- SendMsg for sending the message to an address.
- ... this will become the de facto proposed interface for all routing layers, because we need the abstraction between routing layer and application message.
- ... is Send + SendMsg the abstraction we really want?
We may need a NestRouting configuration (or something with a better name) that brings in the standard set of supported routing protocols, and wires in the "select routing protocol by address" filter.
Report to one of many gateways
Even though there are many gateways, it is assumed there is only one base station.
- Only one destination: "route to the base station through your gateway"
- Many restricted spanning trees, one from each gateway.
- Nodes route to the nearest gateway (shortest path).
- Gateways send periodic maintenance messages.
- Non-roots always immediately pick a shorter tree
- Non-roots drop their gateway if it times out (and begin the selection process again with each maintenance message)
- Are there better recovery algorithms?
- It's not necessary to uniquely identify each gateway, but it might help
- Can be implemented with slightly more logic in DrainM
- May want to fork it to XXXDrainM.nc (NestDrainM.nc?)
For instance, consider a large grid of nodes with gateways at each corner. A node in quadrant 3 would probably pick the bottom-left gateway. Each gateway sends tree maintenance messages with some period. A node has a timeout of some N periods for a parent. A node always picks a better (shorter path) parent regardless of timeout. If the node doesn't receive a maintenance message from its own tree within the timeout, it drops from the tree, to be picked up by the maintenance messages from other gateways.
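The selection rule in the example above could be sketched like this. It is not the DrainM implementation: the state layout, function names, and the use of hop count as the path metric are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NO_GATEWAY 0xffffu

/* Hypothetical per-node routing state; not the actual DrainM fields. */
typedef struct {
    uint16_t gateway;      /* root of the tree we belong to, or NO_GATEWAY */
    uint16_t parent;       /* next hop toward that gateway */
    uint16_t hops;         /* our distance to the gateway */
    uint8_t  periods_left; /* maintenance periods until the tree times out */
} tree_state_t;

/* Handle a maintenance message for tree `gateway`, heard from `sender`,
 * which is `sender_hops` from the root. Returns true if we switched parent. */
bool on_maintenance(tree_state_t *s, uint16_t gateway, uint16_t sender,
                    uint16_t sender_hops, uint8_t timeout_periods)
{
    uint16_t hops = (uint16_t)(sender_hops + 1u);
    assert(s != NULL);
    if (s->gateway == NO_GATEWAY || hops < s->hops) {
        /* No tree yet, or a strictly shorter path: switch immediately. */
        s->gateway = gateway;
        s->parent = sender;
        s->hops = hops;
        s->periods_left = timeout_periods;
        return true;
    }
    if (gateway == s->gateway) {
        /* Message from our own tree: just refresh the timeout. */
        s->periods_left = timeout_periods;
    }
    return false;
}

/* Call once per maintenance period; drops the tree after N silent periods. */
void on_period_elapsed(tree_state_t *s)
{
    assert(s != NULL);
    if (s->gateway == NO_GATEWAY) return;
    if (s->periods_left > 0) s->periods_left--;
    if (s->periods_left == 0) s->gateway = NO_GATEWAY;
}
```

Once a node has dropped its tree, the next maintenance message it hears from any gateway re-attaches it, which matches the "picked up by other gateways" behavior described above.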
Report to a mobile agent
- Landmark routing with one landmark.
- Must be able to support multiple mobile destinations.
- Use a distinct Drain tree, different from the gateway trees
- e.g. it will need a distinct "Drain AM"
- Crumb trails are easy to implement.
- Maintenance of the tree may be harder
Addressing
- Overload the destination addresses in Drain, with the Landmark routing library tied in below.
- Support 256 destinations in the mobile address space
- Mobile destinations are 0xfeXX where 0xfeff means "mobile agent broadcast" and other 0xfeXX addresses route to a single mobile address.
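A minimal sketch of how the 0xfeXX address space above might be decoded. The macro and function names are hypothetical. Note that with 0xfeff reserved for broadcast, 255 of the 256 values remain for individual mobile destinations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MOBILE_PREFIX    0xfe00u /* high byte 0xfe marks a mobile address */
#define MOBILE_BROADCAST 0xfeffu /* "mobile agent broadcast" */

/* True for any address in the 0xfeXX mobile address space. */
bool is_mobile_address(uint16_t addr)
{
    return (addr & 0xff00u) == MOBILE_PREFIX;
}

/* True only for the mobile agent broadcast address. */
bool is_mobile_broadcast(uint16_t addr)
{
    return addr == MOBILE_BROADCAST;
}

/* Extract the 8-bit mobile destination (the XX in 0xfeXX). */
uint8_t mobile_destination(uint16_t addr)
{
    assert(is_mobile_address(addr));
    return (uint8_t)(addr & 0x00ffu);
}
```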
Crumb trails
- Reverse routes.
- Must track: uint8_t mobile_destination, uint16_t crumb_seqno, uint16_t return_address
- A "build crumb trail" message records src_addr as it goes up the tree
- Each crumb trail has a seqno that invalidates the previous trails
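The crumb state listed above could be kept in a small per-node table, sketched below. The table size, the function names, and the wraparound-tolerant sequence number comparison are assumptions, not part of the design as written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CRUMBS 4 /* hypothetical table size */

typedef struct {
    bool     in_use;
    uint8_t  mobile_destination;
    uint16_t crumb_seqno;
    uint16_t return_address; /* src_addr of the "build crumb trail" message */
} crumb_t;

/* True if a is newer than b, tolerating 16-bit wraparound (assumption). */
bool seqno_newer(uint16_t a, uint16_t b)
{
    uint16_t diff = (uint16_t)(a - b);
    return diff != 0 && diff < 0x8000u;
}

/* Record a "build crumb trail" message. A newer seqno invalidates the
 * previous trail for that destination. Returns true if the table changed. */
bool crumb_update(crumb_t *table, uint8_t dest, uint16_t seqno,
                  uint16_t src_addr)
{
    crumb_t *slot = NULL;
    assert(table != NULL);
    for (size_t i = 0; i < MAX_CRUMBS; i++) {
        if (table[i].in_use && table[i].mobile_destination == dest) {
            if (!seqno_newer(seqno, table[i].crumb_seqno)) return false;
            slot = &table[i]; /* overwrite the stale trail */
            break;
        }
        if (!table[i].in_use && slot == NULL) slot = &table[i];
    }
    if (slot == NULL) return false; /* table full */
    slot->in_use = true;
    slot->mobile_destination = dest;
    slot->crumb_seqno = seqno;
    slot->return_address = src_addr;
    return true;
}
```

A message for a mobile destination would then be forwarded to the stored return_address, retracing the trail toward the agent.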
Landmark behavior
- Terminates crumb trails
- Forwards messages
- Sends out periodic tree maintenance messages
- Broadcast to mobile agents can be handled naively
- Send distinct messages down each crumb trail
Mobile agent behavior
- Must initiate the crumb trails
- either from passive eavesdropping or explicit local area pings
- May actively maintain its link to its crumb terminal
- Once the link degenerates below a threshold, it can look for a new crumb terminal
- Reseeds crumb trail if a mobile routing tree-build message is overheard
Robust landmark routing
Open questions.
- The landmark sends out periodic maintenance messages. What if the landmark fails?
- A new node taking over and sending out a new Drain tree build seems to behave correctly
- The network must properly detect the failure
- What if there's congestion at the landmark?
- How to spread out the bandwidth?
Scenario: Nodes detect
- To generate detection reports, nodes put their filtered sensor readings into a Hood neighborhood.
- Use a similar aggregation technique to two years ago, the leader ...
- ... silently elects itself if it has the highest sensor readings
- ... runs a detection algorithm on the readings
- ... sends a report to the base station and/or the mobile agents, as configured in the Registry (below)
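The silent self-election above might look like the following sketch, where each node compares its own filtered reading against the cached neighbor readings. The struct and function names are hypothetical, and breaking ties by lower node ID is an added assumption so that two nodes with equal readings do not both report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Cached reading from one neighbor (hypothetical layout). */
typedef struct {
    uint16_t node_id;
    uint16_t reading;
} neighbor_reading_t;

/* A node elects itself leader if no neighbor has a higher reading.
 * Ties go to the lower node ID (assumption). No messages are exchanged. */
bool is_leader(uint16_t my_id, uint16_t my_reading,
               const neighbor_reading_t *nbrs, size_t n)
{
    assert(nbrs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (nbrs[i].reading > my_reading) return false;
        if (nbrs[i].reading == my_reading && nbrs[i].node_id < my_id)
            return false;
    }
    return true;
}
```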
Hood
An application defines a set of attributes. A neighborhood is a set of reflections (cached values from neighbors) of a compile-time-specified subset of those attributes. Attributes are pushed and quietly cached by those that overhear. Multiple, distinct neighborhoods can exist per application. The essence of the neighborhood is defined by the neighborhood manager that determines who gets in. A node doesn't know what neighborhoods it's a member of.
Attributes and reflections
uses interface Attribute<uint16_t> as PirAdc @registry("PirAdc");
uses interface Reflection<uint16_t> as PirAdcRefl @reflection("PirHood","PirAdc");
uses interface Reflection<location_t> as LocationRefl @reflection("PirHood","Location");
Events indicate when attributes or member reflections are updated
event void PirAdc.updated(uint16_t val);
event void PirAdcRefl.updated(uint16_t nodeID, uint16_t val);
event void LocationRefl.updated(uint16_t nodeID, location_t val);
The neighborhood PirHood is created by the build system and informs of neighbors added and removed
event void PirHood.addedNeighbor(uint16_t nodeID);
event void PirHood.removedNeighbor(uint16_t nodeID);
A PirHoodManager uses the above and HoodManager to define the neighborhood
uses interface HoodManager @hood("PirHood", 8, "PirAdc", "Location");
And HoodManager indicates when there is a new candidate to evaluate
event void HoodManager.newCandidate(uint16_t nodeID);
And the client code can choose to accept the node if there's room, and to possibly eject some other node in favor of the new one. Stale members can be removed, as well.
call HoodManager.acceptCandidate( nodeID );
call HoodManager.replaceNeighborWithCandidate( worstNode, nodeID );
call HoodManager.removeNeighbor(neighbor);
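One possible admission policy for a neighborhood manager is sketched below in plain C: accept while there is room, otherwise replace the worst member if the candidate is better. The quality metric, types, and function name are hypothetical. The three outcomes correspond to calling acceptCandidate, calling replaceNeighborWithCandidate, or doing nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { HOOD_REJECT, HOOD_ACCEPT, HOOD_REPLACE } hood_decision_t;

/* Hypothetical bookkeeping for one neighborhood member. */
typedef struct {
    uint16_t node_id;
    uint16_t quality; /* higher is better; the metric itself is an assumption */
} hood_member_t;

/* Decide what to do with a new candidate. On HOOD_REPLACE, *worst_index
 * identifies the member to eject in favor of the candidate. */
hood_decision_t hood_evaluate(const hood_member_t *members, size_t count,
                              size_t capacity, uint16_t candidate_quality,
                              size_t *worst_index)
{
    size_t worst = 0;
    assert(worst_index != NULL);
    assert(members != NULL || count == 0);
    if (count < capacity) return HOOD_ACCEPT; /* there is room */
    if (count == 0) return HOOD_REJECT;       /* zero-capacity neighborhood */
    for (size_t i = 1; i < count; i++) {
        if (members[i].quality < members[worst].quality) worst = i;
    }
    if (candidate_quality > members[worst].quality) {
        *worst_index = worst;
        return HOOD_REPLACE;
    }
    return HOOD_REJECT;
}
```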
Hood user
Once the Hood exists, a separate module can periodically iterate over the members and send a report.
call Hood.numNeighbors();
neighbor = call Hood.getNeighborID(n);
// neighbor value can be used as a parameter to reflections to get cached data
Scenario: Nodes sense
- Specific sensors are used to gather data, be it through a generic ADC interface or a specialized sensor interface
- Sensor readings are filtered and interpreted as necessary
- Processed readings are put into a Registry attribute
- This allows a Hood to transparently manage the readings from there on out
- Jaein has written initial drivers. We don't need a sensor abstraction (HIL is fine), because it's not like these are or need to be cross-platform pieces by a long shot.
Sleep
The sensing component seems to demand the most node on-time. This is the logical place to start when/if we want to incorporate duty-cycled operation. We have no plan in this respect as yet.
Scenario: Log data to flash
The Nucleus event logger is unfinished. It may be the most convenient option when/if it is finished, as it allows printf-style logging statements.
In the meantime, we can use the LogStorage or BlockStorage STM25P components. In that case, log entries can/should be similar to messaging to maximize versatility and so we can use MIG to help us decode the log:
struct LogMsg {
  uint8_t length;       // total entry length in bytes, including the 10-byte header
  uint8_t type;
  uint32_t local_time;
  uint32_t global_time;
  uint8_t data[];       // payload, length - 10 bytes
};
Time is so important that we make it part of the standard header. Local time is included because it is guaranteed to have the least noise, though it may be uncalibrated with the rest of the network. Global time is included because the logger can do the transformation directly on local_time, to guarantee that the two time values are precisely correlated.
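As a rough sketch of writing one such entry into a buffer before handing it to the flash component, assuming the 10-byte header described above and little-endian byte order (the byte order and the function names are assumptions, not something specified here):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LOG_HEADER_LEN 10u /* length + type + local_time + global_time */

/* Store a 32-bit value least-significant byte first (assumed byte order). */
static void put_u32_le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xffu);
    p[1] = (uint8_t)((v >> 8) & 0xffu);
    p[2] = (uint8_t)((v >> 16) & 0xffu);
    p[3] = (uint8_t)((v >> 24) & 0xffu);
}

/* Encode one log entry into out. Returns the number of bytes written,
 * or 0 if the entry does not fit in out or in the 8-bit length field. */
size_t log_encode(uint8_t *out, size_t out_size, uint8_t type,
                  uint32_t local_time, uint32_t global_time,
                  const uint8_t *data, size_t data_len)
{
    size_t total;
    assert(out != NULL);
    assert(data != NULL || data_len == 0);
    if (data_len > 255u - LOG_HEADER_LEN) return 0;
    total = LOG_HEADER_LEN + data_len;
    if (total > out_size) return 0;
    out[0] = (uint8_t)total; /* length covers header and payload */
    out[1] = type;
    put_u32_le(out + 2, local_time);
    put_u32_le(out + 6, global_time);
    if (data_len > 0) memcpy(out + LOG_HEADER_LEN, data, data_len);
    return total;
}
```

Because the length field is 8 bits and includes the header, a single entry can carry at most 245 bytes of payload under these assumptions.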
Scenario: Global time
Logged data prefers to be associated with a global time. Even so, global time may (will) have accuracy and jitter issues, so local time is always logged, as well.
VU timesync compiles and mostly works for TelosB, but has accuracy and jitter problems. We will include it now as a stub. Debugging it is left as an exercise for this base system.
Scenario: Bulk download logs
Sukun will work to port the GGB bulk data aggregation code to TelosB. It will download the raw binary log data to be decoded on the PC using MIG. GGB should take advantage of the multiple gateways to minimize download time.
- Basically, maintain one bulk download per restricted tree.
Scenario: Reconfigure the network
Configuration is exposed through Nucleus attributes, which may map into the Registry, which in turn may map into the RegistryStore. New values are disseminated through Drip.
Nucleus Attributes
Nucleus attributes expose to the base station a value already managed by a component
provides interface Attr<uint16_t> as RoutingParent @nucleusAttr("RoutingParent");
the Nucleus Attr interface is
interface Attr<t>
{
  command result_t get(t* buf);
  event result_t getDone(t* buf);
  event result_t changed(t* buf);
}
and a component that provides the interface must handle the get command, like this
command result_t RoutingParent.get(uint16_t* buf) {
  memcpy(buf, &currentParent, sizeof(uint16_t));
  signal RoutingParent.getDone(buf);
  return SUCCESS;
}
Registry Attributes
Registry attributes are cached values managed by the system. They are exposed in a common namespace, namely RegistryC.
interface Attribute<uint16_t> as Light @registry("Light");
interface Attribute<location_t> as Location @registry("Location");
and they must be wired up in their configuration, perhaps like this
TestRegistryM.Light -> RegistryC.Light;
TestRegistryM.Location -> RegistryC.Location;
The Registry Attribute interface is
interface Attribute<t>
{
  command t get();
  command result_t set(t val);
  command bool update();
  command bool isValid();
  event void updated(t val);
}
Some registry attributes can be marked to be saved to and restored from flash, with a syntax mechanism to be determined. There will be a RegistryStore interface to commit() and restore() the registry to/from flash.
Registry attributes can be reflected in Hoods.
Scenario: Invoke behaviors on the network
At a minimum we would like to start and stop services, put the node to sleep, wake it up, and reboot it. We choose to address this with a general RPC tool that makes it easy to make interfaces RPC'able and makes it easy to write a common PC tool for all RPC's.
Currently provided interfaces and commands can be made RPC'able like this, as long as the interfaces have no events
provides interface SystemSleep @rpc();
provides command void changePeriod( uint32_t millis ) @rpc();
The marshalling module is automatically generated, as are XML scripts that define the RPC's. All together, this means that for instance in Python, an RPC could be invoked like this
base.rpc.SomeModule.changePeriod(2000)
We have prototypes of the various pieces, but it still needs to be completed and integrated.
Scenario: Debug the network
Nucleus attributes can be gathered over Drain. Nucleus provides MemGet (peek) and MemSet (poke) for inspecting and assigning arbitrary memory locations. StdControl interfaces can be made RPC'able to easily start and shutdown services.
Single components can be written that multiplex to a canonical registry item from a larger set of registry items, and a small RPC can be inserted that knows how to switch between them.
Scenario: Change applications
We expect that Deluge will provide space for 4 images in total: 1 for GoldenImage and 3 for applications.
Deluge to all
One or many injection points are already supported. Can optionally use Nucleus to query state of reprogramming before issuing reboot.
Deluge to some
Need a sandbox to 1) share the network among experiments and to 2) deploy new images on a small subset of the network to sanity check.
Partitions
- Partition -- a piece of the network on a different group id and/or frequency than the rest of the network.
We need a partition when we need to ensure a piece of the network will not and cannot interact with the rest (like Deluging onto a subset).
- Because a partition is disconnected from the network by construction, must instruct nodes to switch only after a timeout or at a rendezvous time
Defining a partition might be tricky
- Send individual messages. Possible but seems a little cumbersome and potentially error-prone
- Maybe sets ...
Sets
- Set -- each node can be a member of multiple sets (an internal bitvector), messages can be addressed to a particular set, very much (exactly) like the "one" case for current Drip addressing
- Carving up the address space for group destinations, like 0xfdXX.
Nucleus already provides the Grouper component. Just need to integrate it into Drip.
- Nucleus currently allocates each node membership in 4 groups in a 16-bit namespace
- Alternative: bitvector -- each node can have membership in 128 groups in a 7-bit namespace
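The bitvector alternative could be sketched as follows: 128 groups need 128 bits, i.e. 16 bytes of RAM per node, and a 7-bit set ID indexes one bit. The type and function names are hypothetical, and this is not the Nucleus Grouper API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_SETS 128 /* 7-bit namespace */

/* One bit per set: 16 bytes of state per node. */
typedef struct {
    uint8_t bits[NUM_SETS / 8];
} set_membership_t;

/* Join a set; IDs outside the 7-bit namespace are ignored. */
void set_join(set_membership_t *m, uint8_t set_id)
{
    assert(m != NULL);
    if (set_id >= NUM_SETS) return;
    m->bits[set_id >> 3] |= (uint8_t)(1u << (set_id & 7u));
}

/* Leave a set; IDs outside the 7-bit namespace are ignored. */
void set_leave(set_membership_t *m, uint8_t set_id)
{
    assert(m != NULL);
    if (set_id >= NUM_SETS) return;
    m->bits[set_id >> 3] &= (uint8_t)~(1u << (set_id & 7u));
}

/* True if this node should accept a message addressed to set_id. */
bool set_is_member(const set_membership_t *m, uint8_t set_id)
{
    assert(m != NULL);
    if (set_id >= NUM_SETS) return false;
    return (m->bits[set_id >> 3] & (1u << (set_id & 7u))) != 0;
}
```

Compared with storing 4 group IDs of 16 bits each (8 bytes), this doubles the per-node state but lets a node belong to every group at once.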
TOSBase partition support
Also because of partitions, TOSBase should be augmented to allow runtime configuration of Group ID and Radio Channel.
Scenario: Nodes self-localize
Localization is a separate application that lives on top of this described software architecture. We consider the case that we do not have enough ROM for localization and the other applications. RegistryStore is a set of registry keys that are common across many applications. Inside RegistryStore is the position derived from the Localization application, produced by application A and available to application B.
Appendix: Archived Discussion
NestFoo is the core NEST application on which all demos will be built. Its design and specification includes the base TinyOS application and basic command-line and plotting tools.
NestFoo Requirements Discussion
Commands
- sleep, wake, microprocessor reboot, grenade timer reboot
- start/stop of different application services
Registry
- variables that are changed by the system (health statistics)
- deluge stats, network usage stats, energy stats, etc...
- variables that are changed by the user (configuration parameters)
- localization params, algorithm params, names, etc...
- both should be get-able and set-able remotely
- ability to save and load the configuration parameters to/from local flash
Debugging
- remote query and set of RAM contents
System Support
- point data collection (Drain)
- point data dissemination (Drip)
- bulk data collection (RBR, "Reliable Bulk Read" from GGB)
- bulk data dissemination (Deluge)
Mote System Architecture
- multiple static base stations
- extra: multiple mobile base stations, possibly with landmark
Host-Level Tools
- interactive command line for control, query, and visualization
- using Python? UNIX command line? MATLAB?
NESTFoo Components
Mote-side
| Name | Location | ~Code Size | Description |
|---|---|---|---|
| Deluge | beta/Deluge/ | ? | Network reprogramming |
| NestFoo | contrib/nestfe/ | ? | Core application |
| Nucleus | contrib/nucleus/ | ? | System management |
| STM25P | beta/STM25P/ | ? | External flash driver |

Still sorting out Timesync.
PC-side
