This page is a Work in progress.
This is just a little primer for novices.
Before I get flamed or accused by experts of misrepresenting dataflow, I would like to say that I make no claim whatsoever to academic exactitude in the content of this page.
If you have comments about this page, or suggestions for making it better, simpler, or more precise, PLEASE feel free to correct me and I will be happy to alter the document.
1. So just what is dataflow programming?
In one line: it is a software model which is concerned first and foremost with making explicit all the relationships within your software.
This then allows the processing itself to be implicit, because changing a value in one system will cause an automatic reaction everywhere the change needs to be reflected.
Here is a good reference if you wish to know more about the general idea of dataflow: http://en.wikipedia.org/wiki/Dataflow.
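As a concrete (if minimal) illustration of that one-line definition, here is a sketch in Python: you declare the relationships between values once, and any change propagates automatically. The `Cell` and `derive` names are my own invention, not part of any particular dataflow library.

```python
# Minimal dataflow sketch: explicit relationships, implicit processing.
# Cell and derive are invented names for illustration only.

class Cell:
    def __init__(self, value=None):
        self.value = value
        self._listeners = []          # cells that depend on this one

    def set(self, value):
        self.value = value
        for update in self._listeners:  # implicit processing: push the change
            update()

def derive(fn, *inputs):
    """Create a cell whose value is fn(*inputs) and stays in sync."""
    out = Cell()
    def update():
        out.set(fn(*(c.value for c in inputs)))
    for c in inputs:
        c._listeners.append(update)
    update()                          # compute the initial value
    return out

# Declare the relationship once...
a, b = Cell(2), Cell(3)
total = derive(lambda x, y: x + y, a, b)
print(total.value)   # 5

# ...and changing an input updates everything downstream automatically.
a.set(10)
print(total.value)   # 13
```

Note that nothing here ever calls a "recompute total" function by hand: the relationship itself carries the update.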
2. Why should I add or use dataflow within my projects?
2.1 Robustness (Fewer bugs)
Generally, it is very hard to break a properly implemented dataflow-driven system.
The reason is simple: because the processing is implicit, you do not have to explicitly remember to update all the dependencies a change may have throughout the system.
Basically, that means fewer maintenance bugs.
2.2 Scalability (Long-term development)
Using dataflow allows you to break up your architecture into many independent tasks, which all communicate without any effort on your part.
Because each piece of data is only responsible for managing itself, you can spend much more time working on the architecture and algorithms than fiddling around trying to find bugs related to updates or new features.
Adding new parts to the system does not imply adding complexity to it, because no part is responsible for understanding the others. In traditional development, each new feature tends to cause exponential growth in maintenance complexity; using dataflow, this is reduced to a tendency towards linear growth.
2.3 Adds dynamism to any system
By definition, dataflow makes any system come alive. The relationships between your systems are what drive the processing, and with more advanced engines (like liquid) this processing occurs only when it is really needed.
Depending on your problem, dataflow may allow you to set up a very high number of possibilities and only ever calculate the shortest path to any given result. No time is spent updating things which are not viewed or used.
And because you can dynamically relink and build upon parts of your processes, you can allow your system to grow, adapt, and improve in real time with no real effort.
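The "only when it is really needed" behaviour can be pictured as a demand-driven ("pull") engine: a change merely marks its dependents as dirty, and the actual recomputation happens when a value is read. This is a hedged sketch of the idea; the `LazyCell` class and its API are invented for illustration and are not the interface of any particular engine (including liquid).

```python
# Demand-driven ("pull") dataflow sketch: changes only mark dependents
# dirty; computation runs when a value is actually read.
# LazyCell and its API are invented names for illustration only.

class LazyCell:
    def __init__(self, fn=None, inputs=()):
        self._fn = fn
        self._inputs = inputs
        self._value = None
        self._dirty = True
        self._dependents = []
        for c in inputs:
            c._dependents.append(self)

    def set(self, value):                 # source cell: store and invalidate
        self._value = value
        self._dirty = False
        self._invalidate_dependents()

    def _invalidate_dependents(self):
        for d in self._dependents:
            if not d._dirty:              # stop early if already marked
                d._dirty = True
                d._invalidate_dependents()

    def get(self):
        if self._dirty and self._fn is not None:
            # recompute only now, when the value is actually demanded
            self._value = self._fn(*(c.get() for c in self._inputs))
            self._dirty = False
        return self._value

calls = []
price = LazyCell(); price.set(100)
tax = LazyCell(lambda p: calls.append("tax") or p * 0.2, (price,))

price.set(120)
price.set(150)          # two changes in a row, but...
print(tax.get())        # ...only one recomputation happens here: 30.0
print(calls)            # ['tax']
```

Things that are never read are never computed, which is exactly the "no time spent updating things which are not viewed" property described above.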
2.4 More easily fault-tolerant
Using dataflow techniques with message passing, it is easier to describe the data flowing within your system.
This then allows us to capture errors and propagate them without panicking. Our relationships, where able, can simply recover from or propagate these errors themselves. This error management can be part of your system's implementation, or be hidden completely within the message kernel, which can simply prevent your system from operating until the error conditions are removed or fixed.
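One way to picture this error propagation is to treat errors as ordinary values that flow through the relationships. This is an assumption-laden sketch, not any particular message kernel; the `Err` and `lift` names are invented for illustration.

```python
# Errors as flowing data: an operation that receives an error simply
# passes it downstream, and downstream consumers decide what to do.
# Err and lift are invented names for illustration only.

class Err:
    """A value standing in for a failed computation, carrying its cause."""
    def __init__(self, cause):
        self.cause = cause
    def __repr__(self):
        return f"Err({self.cause!r})"

def lift(fn):
    """Turn an ordinary function into one that propagates Err values."""
    def wrapped(*args):
        for a in args:
            if isinstance(a, Err):
                return a              # propagate, don't panic
        try:
            return fn(*args)
        except Exception as e:
            return Err(e)             # capture the failure as data
    return wrapped

safe_div = lift(lambda a, b: a / b)
safe_add = lift(lambda a, b: a + b)

ok = safe_add(safe_div(10, 2), 1)     # 6.0
bad = safe_add(safe_div(10, 0), 1)    # the ZeroDivisionError flows through
print(ok)                             # 6.0
print(bad)
```

Nothing downstream of the failed division crashes; the error simply arrives wherever the result would have, and a relationship that knows how to recover can substitute a fallback value at that point.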
2.5 Allows parallel computing
Without going into details: using dataflow it is very easy to split up your processes amongst independent processing units, because the processing itself is encapsulated into specific, organised tasks or "operations".
Because you are able to determine whether any of your data sources are valid or not, you can easily wait for your computation, or switch to another process. This can be done on several processors or machines in parallel, and whenever a process realises its dependencies are resolved, that part of the processing can be sent to any independent processing unit.
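The dispatch idea above, sending a piece of processing to a free processing unit as soon as its dependencies are resolved, can be sketched with a thread pool. The `run_graph` helper and the graph encoding are my own invention for illustration, not a real scheduler.

```python
# Sketch of dataflow scheduling: each operation runs as soon as all of
# its inputs exist, and independent operations run in parallel.
# run_graph and the graph encoding are invented for illustration only.
from concurrent.futures import ThreadPoolExecutor

def run_graph(ops, sources):
    """ops: {name: (fn, [input names])}; sources: {name: value}."""
    results = dict(sources)
    pending = dict(ops)
    with ThreadPoolExecutor() as pool:
        while pending:
            # every op whose dependencies are resolved is ready to dispatch
            ready = [n for n, (_, deps) in pending.items()
                     if all(d in results for d in deps)]
            if not ready:
                raise ValueError("cycle or missing input in graph")
            futures = {n: pool.submit(pending[n][0],
                                      *(results[d] for d in pending[n][1]))
                       for n in ready}
            for n, fut in futures.items():
                results[n] = fut.result()
                del pending[n]
    return results

# 'double' and 'square' are independent, so they can run concurrently;
# 'sum' waits until both of its inputs exist.
ops = {
    "double": (lambda x: 2 * x, ["x"]),
    "square": (lambda x: x * x, ["x"]),
    "sum":    (lambda a, b: a + b, ["double", "square"]),
}
print(run_graph(ops, {"x": 4}))   # {'x': 4, 'double': 8, 'square': 16, 'sum': 24}
```

The same dispatch loop could hand `ready` operations to separate machines instead of threads; the graph itself does not care where each operation runs.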
3. Is dataflow programming complicated or hard?
At its core, dataflow programming is as easy as using functions or objects. In fact, it is a concept symmetric to functions and methods.
But it is a different model, so it has its own specific challenges. It is also much closer to how we as living beings think: we build upon knowledge by relating it to past experience and changing it.
One challenge, which is somewhat the opposite of functions and objects, is making sure your data updates do not cycle into updating themselves infinitely.
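One simple guard against such cycles, sketched here under the assumption that the dependency graph is known explicitly, is to reject any new link that would make a node (transitively) depend on itself. The function and graph names are invented for illustration.

```python
# Cycle guard sketch: before adding "dst depends on src", check whether
# src already (transitively) depends on dst. Invented for illustration.

def would_create_cycle(deps, src, dst):
    """deps maps node -> set of nodes it depends on.
    Adding the edge 'dst depends on src' is cyclic
    exactly when src already depends on dst."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(deps.get(node, ()))
    return False

deps = {"total": {"price", "tax"}, "tax": {"price"}}

# 'price depends on total' would loop: price -> total -> price
print(would_create_cycle(deps, "total", "price"))   # True
# 'tax depends on price' again is redundant but harmless
print(would_create_cycle(deps, "price", "tax"))     # False
```

Real engines use variants of this (or detect re-entry during propagation), but the principle is the same: the update graph must stay acyclic, or an update must at least refuse to revisit a node it has already touched.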
Once you have had a little practice, your tools will benefit from it, or your perspective will change a little.
Many graphical tools use dataflow because it is an obvious way to represent independent but related operations.
4. Where is dataflow best used?
It can theoretically be used in any software, just like functions or object modeling can, but like those two other models, it is sometimes harder to apply in some circumstances.
If functions are good at executing and objects are good at storing, I'd say dataflow is best at processing and at being dynamically reconfigured.
One good use of dataflow is in systems where the processing operations change rarely but compute many things over and over as input values change. You basically set up your processing network and feed it with data.
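That first use, a fixed network fed with changing data, can be pictured with a trivial sketch; the `pipeline` helper here is invented for illustration, not a real library.

```python
# "Set the network up once, then feed it" sketch: the chain of
# operations is fixed, only the input values change.
# pipeline is an invented name for illustration only.

def pipeline(*stages):
    def run(value):
        for stage in stages:
            value = stage(value)
        return value
    return run

# Build the network once...
normalize = pipeline(str.strip, str.lower)

# ...then feed it many inputs without touching its structure.
for raw in ["  Hello ", "WORLD", " DataFlow  "]:
    print(normalize(raw))
```

The network's structure never changes between inputs, which is precisely what makes this style cheap to run over and over.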
Another good use of dataflow is in systems whose processing can only be defined at run time, because it depends on the source data. In this case you will be creating, allocating, linking and deleting many operations on the fly, based on a controlling system which detects patterns or problems (or is driven to solve them) as they arrive.
Many GUI event engines provide some sort of limited dataflow, allowing you to interconnect controls (widgets) and execute callbacks. But these are usually tied into an event-based processing model, which is more often than not push-based, meaning a click of a button can cause an entire spanning tree of things to update, even when not relevant.
page last updated: 17-Oct-2006