
Data Flow Diagrams – Are They Worth It?

A picture is worth a thousand words. Whoever coined that saying captured a lot in it. I am a Business Analyst and I write business requirements, so where does this fit in? Well, it fits in the context of all the diagrams that a BA can use to put forth business requirements in a concise manner. In this article, let’s explore the world of data flow diagrams.

A neat and clear data flow diagram (or DFD) can depict a good amount of the system requirements graphically. Here, the system can be manual, automated, or a combination of both. A DFD shows the inputs, the outputs, and how the input data is converted to output data (by a process or calculation). The focus is always on the data and how it gets transformed. A DFD cannot represent the timing of a data flow, i.e. whether it occurs constantly in real time, once per week, or once per month. It also gives no indication of when a system would run.

To depict the movement and transformation of data, a DFD uses standardized notation. There are two commonly used sets of symbols, one developed by Chris Gane and Trish Sarson and the other by Tom DeMarco and Ed Yourdon. The two sets differ only in how the symbols are drawn; their meaning is the same. Most people in Western cultures read from left to right, so BAs usually lay out their diagrams from left to right as well.

The DeMarco–Yourdon set is as follows:

  • Circle (Process) – Transforms data; cannot create data from nothing or consume data without producing an output; its name starts with a verb
  • Arrows (Data Flow) – Data moving into or out of a process, external entity, or data store
  • Rectangles (External Entity) – Entities external to the system, which may or may not be part of the organization; a data source or data receiver
  • Parallel Lines (Data Store) – Data at rest; forms the starting point of the data model; data can be moved only by a process
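
The rules in this list can be checked mechanically. The following is a minimal sketch, not a standard tool: it assumes a DFD is written down as a dictionary of named nodes plus a list of labelled flows, and the names `Flow`, `check_dfd`, and the order-handling example are all made up for illustration.

```python
from dataclasses import dataclass

# Node kinds, taken from the DeMarco-Yourdon list above.
PROCESS, ENTITY, STORE = "process", "external entity", "data store"


@dataclass(frozen=True)
class Flow:
    """One arrow on the diagram (hypothetical representation)."""
    source: str  # name of the node the data leaves
    target: str  # name of the node the data enters
    data: str    # label written on the arrow


def check_dfd(nodes, flows):
    """Return a list of rule violations; an empty list means none were found.

    nodes: dict mapping node name -> kind (PROCESS, ENTITY or STORE)
    flows: iterable of Flow
    """
    flows = list(flows)
    problems = []

    # Data can be moved only by a process, so every arrow needs a
    # process on at least one of its two ends.
    for f in flows:
        if PROCESS not in (nodes[f.source], nodes[f.target]):
            problems.append(
                f"'{f.data}' moves from {f.source} to {f.target} without a process"
            )

    # A process cannot create data from nothing (no input) or
    # consume data without producing anything (no output).
    for name, kind in nodes.items():
        if kind != PROCESS:
            continue
        if not any(f.target == name for f in flows):
            problems.append(f"process '{name}' has no input")
        if not any(f.source == name for f in flows):
            problems.append(f"process '{name}' has no output")

    return problems


# Illustrative diagram: the last flow is deliberately wrong.
nodes = {"Customer": ENTITY, "Validate Order": PROCESS, "Orders": STORE}
flows = [
    Flow("Customer", "Validate Order", "order details"),
    Flow("Validate Order", "Orders", "valid order"),
    Flow("Customer", "Orders", "order copy"),  # entity writes straight to a store
]
problems = check_dfd(nodes, flows)
for p in problems:
    print(p)
```

A check like this only catches structural slips; whether the process names and data labels make business sense still needs a review with the stakeholders.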

Most business processes are too complex to be captured in a single DFD, so a set of DFDs is used instead. The first DFD captures the summarized process or scope, and further DFDs decompose the processes in detail. The first DFD is popularly known as the “Context Diagram”.

The context diagram defines the “context of the system”, i.e. it defines how the business process or computer system interacts with its environment, primarily external entities. This diagram is best used in the early stages of requirements elicitation and can aid in further probing of the requirements without losing focus. It can also be referred to during system integration testing to verify and validate those interactions.

Since the detail is added in further levels of the DFD, one may wonder how much is enough. It is recommended to limit further detailing of a context diagram to three levels only. Hence, Context Diagram > Level 0 DFD > Level 1 DFD > Level 2 DFD and then stop. If you feel that the entire system is not being captured in three levels, go back to the context diagram and check whether it should be more summarized than it currently is.

One more concept to keep in mind while decomposing a context diagram is the conservation principle, or balancing. This principle states that the inputs, outputs, and data flows of a higher-level DFD must be conserved, or balanced, in the detailed DFD. For example, if a context diagram has input A and output B, then the level 0 DFD must also have the same input A and the same output B. Following this principle keeps faulty diagrams from creeping in as more process detail is added in further levels of the DFD.
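
Balancing can be sketched as a comparison of the flows that cross the diagram boundary at each level. This is only an illustration under assumed conventions: flows are written as `(source, target, data)` tuples, the caller lists which node names sit inside the diagram, and the function names and the order-handling example are hypothetical.

```python
def external_flows(flows, inside):
    """Return (inputs, outputs): the data labels crossing the diagram boundary.

    flows:  iterable of (source, target, data) tuples
    inside: set of node names that belong to the diagram being checked
            (its processes and internal data stores)
    """
    flows = list(flows)
    inputs = {data for src, dst, data in flows if src not in inside and dst in inside}
    outputs = {data for src, dst, data in flows if src in inside and dst not in inside}
    return inputs, outputs


def is_balanced(parent_flows, parent_inside, child_flows, child_inside):
    """True if the detailed DFD keeps exactly the parent's inputs and outputs."""
    return external_flows(parent_flows, parent_inside) == external_flows(
        child_flows, child_inside
    )


# Context diagram: a single process with input "order" and output "picking list".
context = [
    ("Customer", "Order System", "order"),
    ("Order System", "Warehouse", "picking list"),
]

# Level 0 DFD: the same system broken into two processes and one data store.
level0 = [
    ("Customer", "Validate Order", "order"),
    ("Validate Order", "Orders", "valid order"),
    ("Orders", "Schedule Picking", "valid order"),
    ("Schedule Picking", "Warehouse", "picking list"),
]
level0_inside = {"Validate Order", "Schedule Picking", "Orders"}

balanced = is_balanced(context, {"Order System"}, level0, level0_inside)
print(balanced)
```

Flows between internal processes and data stores, such as "valid order" here, are ignored on purpose: they are new detail, not inputs or outputs of the system as a whole.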

Also, keep in mind that as the requirements elicitation stage progresses, refinements to the DFD will occur. You cannot create the perfect DFD in one go. Refinements happen iteratively as your understanding of the requirements builds.

There are many CASE tools available in the market that help you create readable, clutter-free diagrams. Tools can help you align notations, use straight lines and 90-degree lines, connect the various data flows, and apply balancing and process naming conventions. All this enhances the visual appeal of the data flow diagram and makes it easy to browse. One more important concept is grouping related items together, the way swim lanes do in flow charts. Swim lanes partition the diagram into horizontal or vertical lanes that usually represent the entity that does the work of a process. These lanes help depict related processes in an organized and clear manner. For example, lanes can depict how interactions happen between the different departments participating in a DFD.

For some readers of this article, these notes may be a recap of what they already know. But the point is, how many of us still believe in using this simple tool in our requirements gathering activity? Most business analysts take data flow diagrams for granted and think they are best left to university education, but that is definitely not the case. If you probe appropriately, you will find that the project team uses this structured analysis tool to draw its inferences from the use cases. DFDs can be used to check for completeness and, therefore, to find missing requirements. They are also great in business process re-engineering projects since they provide a sneak peek of the processes in a simplified manner. Unfortunately, the DFD is often an underutilized tool in the Business Analyst toolkit.

The most remarkable feature of data flow diagrams is that they are simple, effective, and easily understood, even by people from non-technical backgrounds who may not like overwhelming process depictions. Do use data flow diagrams liberally whenever you can on your next project or assignment!


Aditi Sharma

Aditi Sharma, CBAP and PMP, has over 13 years of experience in Business Analysis and Project Management. She has experience working with stakeholders from different geographies in the insurance domain, be it Life, General, or Statutory. Currently, she is working in Sydney, Australia, and enjoying every bit of it.