eHive
Category Cross-Omics>Workflow Knowledge Bases/Systems/Tools
Abstract eHive is an Artificial Intelligence (AI) workflow system for genomic analysis.
It consists of a fault-tolerant distributed processing system initially designed to support comparative genomic analysis, based on blackboard systems, network distributed autonomous agents, dataflow graphs and block-branch diagrams.
In the eHive system a MySQL database serves as the central blackboard and the autonomous agent, a Perl script, queries the system and runs jobs as required. The system allows you to define dataflow and branching rules to suit all your production pipelines (workflows).
eHive allows you to produce computationally demanding results in a reliable and efficient way with minimal supervision and high throughput.
eHive basic goals -- The basic goals of the system are:
1) Reduce the overhead of individual job processing;
2) Increase the maximum number of jobs that can be in-flight at any one time;
3) Provide fault tolerance; and
4) Allow concurrent run-time processing capable of implementing complex algorithms with branching and looping as part of the workflow.
eHive AI system design --
The design of the eHive system is specifically inspired by the descriptions of artificial life and incorporates several specific AI and workflow graph theories. In artificial life systems such as eHive, intelligent agents are created with a basic set of behavior rules that depend in part on the behavior of the agents closest to them.
This type of programming creates cooperativity among the agents: any given agent has the greatest behavioral effect on the agents closest to it in the system and little or no effect on an agent far away.
In these models, the system often exhibits global characteristics resulting from agents working together locally without explicit need for communication with all of the other agents working simultaneously in the system.
These global characteristics, which are not explicitly programmed into the system, are termed “emergent behavior” and are a critical advantage of the eHive over other job scheduling methods. The rules for the eHive (and, indeed, the name of the system) are modeled on a living, active “honeybee” hive.
The manufacturers summarize the rules for each agent (i.e. “worker bee”) as follows:
1) Identify an available and appropriate work object;
2) Ensure any work objects arising from completion of the current work object can be identified by follow-on agents;
3) Work optimally throughout the lifespan; and
4) Die gracefully.
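The four rules above can be sketched as a single worker loop. This is a minimal illustration only, not eHive's actual implementation (the real agent is a Perl script working against a MySQL database): it uses an in-memory SQLite database as a stand-in blackboard, and the table layout, the status values and the names `make_blackboard`, `claim_job` and `run_worker` are all assumptions made for the example.

```python
import sqlite3


def make_blackboard():
    """Create an in-memory stand-in for the central blackboard (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE job ("
        "id INTEGER PRIMARY KEY, kind TEXT, payload INTEGER, "
        "status TEXT DEFAULT 'READY')"
    )
    return db


def claim_job(db, kind):
    """Rule 1: identify an available and appropriate work object and claim it.

    A real multi-agent system would need an atomic claim; this sketch has a
    single worker, so a plain SELECT followed by an UPDATE is enough.
    """
    row = db.execute(
        "SELECT id, payload FROM job "
        "WHERE kind = ? AND status = 'READY' ORDER BY id LIMIT 1",
        (kind,),
    ).fetchone()
    if row is None:
        return None
    db.execute("UPDATE job SET status = 'CLAIMED' WHERE id = ?", (row[0],))
    return row


def run_worker(db, kind, next_kind, lifespan):
    """Run one worker until no work is left or its lifespan (in jobs) is used up."""
    done = 0
    while done < lifespan:  # Rule 3: keep working throughout the lifespan
        job = claim_job(db, kind)
        if job is None:
            break
        job_id, payload = job
        result = payload * 2  # placeholder for the real analysis step
        if next_kind is not None:
            # Rule 2: record follow-on work where later agents can find it
            db.execute(
                "INSERT INTO job (kind, payload) VALUES (?, ?)",
                (next_kind, result),
            )
        db.execute("UPDATE job SET status = 'DONE' WHERE id = ?", (job_id,))
        done += 1
    db.commit()  # Rule 4: die gracefully, leaving the blackboard consistent
    return done
```

For example, after inserting three jobs of kind "A", a call to `run_worker(db, "A", "B", lifespan=2)` processes two of them, leaves one "A" job ready for another worker, and writes two follow-on jobs of kind "B" to the blackboard.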
eHive Problem solving model -- The key concept to enable the eHive system is the blackboard model of problem solving.
Briefly, a blackboard model provides a means for knowledge sources (i.e. “workers”) to generate and store partial solutions to a problem without direct communication between the workers. Instead, the workers read and write information to a central globally accessible database (the “blackboard”).
In a blackboard model, the choice of worker type is based on the solution state at any given time. For example, if no tasks of type B can begin until at least 25% of tasks of type A have been completed, no type B workers will be created until the blackboard shows the completion of at least 25% of the type A tasks.
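The 25% example can be written as a small check against the blackboard state. The sketch below is illustrative only: the blackboard is represented as a plain list of `(kind, status)` pairs, and the function names and the "DONE" status value are assumptions, not part of eHive.

```python
def completion_fraction(jobs, kind):
    """Return the fraction of jobs of the given kind that are marked DONE."""
    statuses = [status for job_kind, status in jobs if job_kind == kind]
    if not statuses:
        return 0.0
    return statuses.count("DONE") / len(statuses)


def may_create_workers(jobs, prerequisite, threshold):
    """Allow dependent workers only once enough prerequisite jobs are complete."""
    return completion_fraction(jobs, prerequisite) >= threshold


# Hypothetical blackboard snapshot: one of four type A tasks is complete.
blackboard = [("A", "DONE"), ("A", "READY"), ("A", "READY"), ("A", "READY")]
type_b_allowed = may_create_workers(blackboard, prerequisite="A", threshold=0.25)
```

With one of four type A tasks complete, the fraction is exactly 0.25, so `type_b_allowed` is true; with none complete it would be false. The decision depends only on the current state of the blackboard, not on any message passed between workers.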
The strength of the blackboard model is that each part of the problem is solved by a specialist application in an incremental and opportunistic way which responds to the ongoing changes in the solution state.
In combination with the local model of behavior, a global pattern of system control emerges without the explicit need for a central controller to analyze the entire blackboard and make detailed decisions.
eHive Network distributed autonomous agents -- Within the eHive, a collection of agents does the work. These agents can be classified as both collaborative agents and reactive agents. Collaborative agents generally work autonomously and may be required to cooperate directly with other agents in the system.
In the manufacturer’s application, the agents are distributed across the compute cluster and communicate only with the blackboard.
Reactive agents respond to the state of the environment (in this case the state of the blackboard) and thus may change their behavior based on the progression of the partial solution recorded on the blackboard. When their interactions are viewed globally, complex behavior patterns emerge from systems with reactive agents.
eHive Control structures -- The two (2) major control structures used for algorithm development within the eHive system are dataflow graphs and block-branch diagrams. Dataflow graphs represent data dependencies among a number of operations required to solve a problem.
The block-branch diagram (BBD) is a common aspect of object-oriented software design. For each function, the BBD represents the control structure, input and output parameters, required data and dependent functions.
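To make the idea of a dataflow graph concrete, the sketch below groups operations into "waves" that can run in parallel once their data dependencies are met. It uses Python's standard `graphlib` module (Python 3.9 or later) and is not how eHive stores or evaluates its rules; the step names in the example are hypothetical.

```python
from graphlib import TopologicalSorter


def execution_waves(dependencies):
    """Group operations into waves; each wave depends only on earlier waves.

    `dependencies` maps each operation to the set of operations whose
    output it needs.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        waves.append(ready)
        sorter.done(*ready)  # mark the wave complete to unlock dependents
    return waves


# Hypothetical pipeline: two analyses fan out from one load step, then merge.
pipeline = {
    "align": {"load"},
    "filter": {"load"},
    "merge": {"align", "filter"},
}
waves = execution_waves(pipeline)
# waves == [["load"], ["align", "filter"], ["merge"]]
```

Operations within one wave have no data dependencies on each other, which is what allows them to be handed to separate workers.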
eHive Results -- Scheduling systems for large compute clusters are generally based on the idea of a central job queue and a centralized job manager. Cluster nodes are “dumb” and need to be given explicit instructions for each and every job they will need to execute. This creates a bottleneck in the central controller.
The basic function of the eHive system is to move the job scheduling function away from the center, but still to retain the ability to monitor and track jobs. This is accomplished using the AI systems described above to create a behavior model based on an analogy of a honeybee queen in a hive and her worker bees.
In the eHive, the jobs are no longer “scheduled” by a central authority; instead, each autonomous worker created by the queen employs an algorithm of “job selection and creation” based on observing the central blackboard with different levels of granularity.
The result is a hive behavior system with a very small (3 msec) job overhead, fault tolerance and an ability to efficiently utilize Central Processing Unit (CPU) clusters with more than 1,000 processors.
Additionally, and unlike a central job scheduler, the eHive system is a full algorithm development platform.
It implements the concept of dataflow as well as branching and looping. It allows any algorithm that can be described in block-branch notation, whether serial or parallel, to be implemented as eHive processes.
The eHive system can run in a fully automated manner and re-runs failed jobs up to a reasonable number of times. Strictly speaking, the operators only need to make sure that the beekeeper is running and fix any data issue if a job fails consistently.
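The re-run behavior described above amounts to a bounded retry loop. The following is a minimal sketch under stated assumptions: the function name, the `max_retries` parameter and the returned status strings are hypothetical and do not reflect eHive's actual interface.

```python
def run_with_retries(job, max_retries):
    """Run a job, re-running it after a failure up to max_retries times.

    Returns (status, result, failures). A job that keeps failing ends as
    "FAILED" so that an operator can inspect the underlying data issue.
    """
    failures = 0
    while True:
        try:
            return ("DONE", job(), failures)
        except Exception:
            failures += 1
            if failures > max_retries:
                return ("FAILED", None, failures)
```

A job that fails twice and then succeeds returns `("DONE", result, 2)` when `max_retries` is at least 2, while a job that always fails with `max_retries=2` is attempted three times and returns `("FAILED", None, 3)`.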
Another important feature is the ability of a worker to run more than one job, avoiding the overhead involved in submitting and scheduling a job in the queuing system. This is especially relevant when many short jobs must be run but you still want to control the process at the job level.
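A rough cost model shows why reusing a worker for several jobs matters when jobs are short. Only the 3 msec per-job overhead comes from the description above; the one-second cost of submitting a worker to the queuing system, the job counts and the function name are illustrative assumptions, not measurements.

```python
import math


def total_overhead_seconds(n_jobs, jobs_per_worker, submit_cost_s, per_job_cost_s=0.003):
    """Estimate scheduling overhead when each worker runs several jobs.

    Each worker is submitted to the queuing system once (submit_cost_s);
    every job it runs afterwards only pays the small per-job overhead.
    """
    n_workers = math.ceil(n_jobs / jobs_per_worker)
    return n_workers * submit_cost_s + n_jobs * per_job_cost_s


# Assumed figures: 10,000 short jobs and 1 second per queue submission.
one_job_per_worker = total_overhead_seconds(10_000, 1, submit_cost_s=1.0)
hundred_jobs_per_worker = total_overhead_seconds(10_000, 100, submit_cost_s=1.0)
```

Under these assumed figures, submitting one worker per job costs roughly 10,030 seconds of overhead, while letting each worker run 100 jobs costs roughly 130 seconds, and each job is still tracked individually.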
In conclusion, the manufacturers have developed the eHive system to take full advantage of their compute farm when running their pipelines (workflows).
The system is flexible enough to run virtually any pipeline, from the simplest case running as a batch job throttling manager to scenarios where several eHive systems are interconnected and use different compute farms.
eHive can easily be used for other purposes and adapted to other compute farms with little or no effort.
System Requirements
Contact manufacturer.
Manufacturer
- European Bioinformatics Institute
- Wellcome Trust Genome Campus
- Hinxton, Cambridge, CB10 1SD, UK
Manufacturer Web Site eHive
Price Contact manufacturer.
G6G Abstract Number 20750
G6G Manufacturer Number 104154