
Author: Dan Tasker

Well-defined Data Part 5 – A Business Entity Identifier Attribute

Having explored the concept of business entity in Parts 2, 3, and 4 of this series, the objective of Part 5 is to examine one particular kind of attribute — the business entity identifier.

Its purpose is to uniquely identify an instance of a business entity. Users of an IT-based system are expected to have knowledge of, or access to, this value. The value is used to start down, or stay on, the ‘happy path’ of any business process that deals with a specific business entity instance.

The concept of a business entity identifier is similar to, but not exactly the same as, the concept of a primary key. Every table in a relational database is expected to have a primary key. A primary key can, however, combine more than one column to achieve uniqueness. It also may not be exposed to business users.

IDs and Numbers

Business entity identifiers are all around us. Our wallets and mobile phones are filled with them — values that have been exposed by the organizations we deal with. If you see an attribute name that ends in “ID” or “Number” there’s a good chance that its values are intended to identify a specific instance either of that entity, or some instance somewhere. The assumption is that we will have that value available when interacting with the organization that produced that value. These values are embossed on our credit and debit cards. They are printed on our membership cards, our discount cards, our driver’s license. Every phone number, email address, and app-specific contact we record is a business entity identifier value. We use these values when we want to ‘reach’ a specific person or interact with the issuing organization.

From the perspective of an IT-based business system, there is no doubt that given sufficient attributes, such as name, address, phone, etc., we can uniquely identify entity instances representing a person or organization. Similarly, there are attributes of a product, a sale, or a location that, taken in combination, will lead us to the instance we are seeking. The point of a business entity identifier is that it’s a ‘one stop shop’ — a single value that, if known, will get a user to the specific instance they are looking for at a given point in time. 

Multi-fact Business Entity Identifiers

The simplest form of business entity identifier is one based on a numeric sequence (e.g. the last assigned number plus one). This simple form is often used where values only need to be unique within the organization, and there is no need for the identifier to be meaningful in any way. Entities such as Purchase Order, Store, and Asset fall into this category within many organizations.
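The 'last assigned number plus one' approach can be sketched in a few lines. This is a minimal illustration only: the class name `SequenceIdentifier` and the starting value are hypothetical, and a production system would more likely rely on a database sequence than on in-memory state.

```python
class SequenceIdentifier:
    """Hands out identifier values using 'last assigned number plus one'."""

    def __init__(self, last_assigned: int = 0) -> None:
        # The most recently issued value (hypothetical starting point).
        self.last_assigned = last_assigned

    def next_value(self) -> int:
        # Each new value is simply the previous one plus one.
        self.last_assigned += 1
        return self.last_assigned


# Example: Purchase Order numbers that only need to be unique in-house.
purchase_order_ids = SequenceIdentifier(last_assigned=1000)
first_id = purchase_order_ids.next_value()
second_id = purchase_order_ids.next_value()
print(first_id, second_id)
```

The values carry no meaning beyond uniqueness, which is exactly what this simple form of identifier is intended to provide.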

An example of an identifier that contains at least one fact is Credit Card Number. It appears to cardholders to be just a number. But because that number needs to be unique across all organizations that issue similar cards, the first six digits of each value identify the issuing organization. The digits that follow those six can be generated any way a given organization chooses. Typically, six or more of these are assigned based on the ‘last assigned value plus one’ algorithm.

The attribute Stock Keeping Unit Number (SKU) is an example of an identifier that utilizes multiple facts to make up unique values of the business entity Inventory Item. A retail clothing business might create its unique SKUs based on the combination of an item’s brand, clothing type, style, size and color. For example, the SKU Number ‘LEV-JN-ST-34-BL’ represents the inventory item Levi jeans, straight leg, waist size 34, in blue.

When a business entity identifier contains one or more facts, those facts should also be defined as separate attributes in the same or some other entity. This eliminates the need for business users to ‘pick apart’ an identifier to find a given fact. E.g. Obtain a list of all Jeans with a 34-inch waist.
NOTE: Any fact included in a business entity identifier should ideally involve values that will not change over time. Having ‘exposed’ the identifier to the business, dealing with communicating a change is not productive for anyone. If you have ever changed your phone number or email address, and needed to notify family and friends, you will have some appreciation of the resulting effort and inconvenience.
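As a rough sketch of keeping the facts as separate attributes, the example below derives the SKU from the attributes rather than the other way around. The `InventoryItem` class, its field names, and the codes other than those in 'LEV-JN-ST-34-BL' are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InventoryItem:
    # Each fact is held as its own attribute (field names are hypothetical).
    brand: str
    clothing_type: str
    style: str
    size: int
    color: str

    @property
    def sku(self) -> str:
        # The identifier is assembled from the facts, in a fixed order.
        return "-".join(
            [self.brand, self.clothing_type, self.style, str(self.size), self.color]
        )


items = [
    InventoryItem("LEV", "JN", "ST", 34, "BL"),
    InventoryItem("LEV", "JN", "ST", 32, "BL"),
    InventoryItem("LEV", "SH", "SL", 34, "WH"),  # hypothetical shirt codes
]

# 'All Jeans with a 34-inch waist' is answered from the attributes,
# without picking the SKU apart.
jeans_34 = [i.sku for i in items if i.clothing_type == "JN" and i.size == 34]
print(jeans_34)
```

Because the dataclass is frozen, the facts (and therefore the SKU) cannot be changed after an item is created, which is in keeping with the note about identifier facts staying stable over time.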


Optimizing User Experience Accessing a Business Entity Instance

Keeping the following four things in mind will help optimize the user experience when a business process needs to involve a single business entity instance:

  • Maximize exposure to users
  • Minimize manual entry
  • Avoid finding a wrong instance
  • Offer backup ‘find’ options

Maximize exposure to users — When a business entity includes a business entity identifier attribute, that value should be ‘exposed’ where it will best serve the business processes that deal with single instances. An Employee Number can be displayed on the employee ID badge, and printed on employee payslips. An Asset ID can be printed on a label attached to the asset. A retail product that has a unique Barcode value registered by the manufacturer should have that barcode and its numeric value printed on each instance of that product.

Minimize manual entry — More and more these days, business information systems are being ‘front-ended’ with data capture technology, web portals, and apps. Input devices can read a barcode, a magnetic stripe, an embedded chip, or a value broadcast via radio frequency (an RFID). The trend is also towards self-serve, where customers use web portals or apps to connect to a business information system. A user Logon ID is an example of a business entity identifier — one that requires minimum data entry effort, thanks to web browsers or mobile devices offering to remember the user’s value for a given site or app.

Avoid finding a wrong instance — In situations where technology is not able to provide the value needed and the user needs to resort to manually entering the value to access a business entity, one of the most common data entry errors is transposing one or more digits. This can result in an entity instance being found, but not the one wanted. E.g. wanting the instance with identifier value 12345 but accidentally entering the value 12354. A commonly used technique for avoiding this type of error is the inclusion of a check-digit when generating the identifier.
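The article does not name a particular check-digit scheme. One widely used option is the Luhn algorithm, which is applied to payment card numbers, and the sketch below assumes it purely for illustration. The function names are hypothetical.

```python
def luhn_check_digit(payload: str) -> str:
    """Return the Luhn check digit for a string of digits (check digit excluded)."""
    total = 0
    # Walk right to left; the rightmost payload digit is doubled first.
    for position, char in enumerate(reversed(payload)):
        digit = int(char)
        if position % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return str((10 - total % 10) % 10)


def is_valid(identifier: str) -> bool:
    """Check that the last digit matches the Luhn check digit of the rest."""
    if not identifier.isdigit() or len(identifier) < 2:
        return False
    return luhn_check_digit(identifier[:-1]) == identifier[-1]


# The value 12345 is issued with its check digit appended.
issued = "12345" + luhn_check_digit("12345")
# The user transposes the last two digits of the base value (12354).
mistyped = "12354" + issued[-1]
print(issued, is_valid(issued), is_valid(mistyped))
```

With a check digit in place, the mistyped value is rejected at entry time instead of silently retrieving a different instance. Luhn catches most adjacent transpositions, though not swapping 09 with 90, so it reduces rather than eliminates this class of error.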

Offer backup ‘find’ options — For every happy path there are any number of alternate or exception paths. At least one of these paths should support the user finding the desired entity instance when the correct business entity identifier value is not available. The values of one or more other attributes (or relationships) need to be able to be used to find the desired instance.

NOTE: Just as the business entity identifier is not the same thing as a database table’s primary key, ‘other attributes used to find an instance’ is not the same thing as the database concept alternate key. An alternate key value, like the primary key, is intended to find exactly one instance of a table. The business capability of searching using ‘other attributes’ is intended to reduce the candidate instances down to a shortlist from which the desired instance can be determined. For example, where Passenger “Fred Jones” has lost their ticket, and therefore does not know their Passenger ID, searching based on the Flight Number and Flight Date can return a list of passengers that may contain only one passenger named “Jones, Fred”.

So, while Flight Number plus Name would not be a valid alternate key in a database (because it doesn’t guarantee uniqueness), using those two values can turn out to be ‘close enough’ to get us back onto the happy path in a business search scenario. Typically search capabilities for business entities offer multiple filter criteria and present several attributes useful for identifying the instance being searched for.
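The lost-ticket scenario can be sketched as a filter that narrows the candidates to a shortlist rather than looking up exactly one row. The `Passenger` fields, the flight number 'XX123', and the sample data are all invented for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Passenger:
    passenger_id: str
    family_name: str
    given_name: str
    flight_number: str
    flight_date: date


def shortlist(
    passengers: List[Passenger],
    flight_number: str,
    flight_date: date,
    family_name: Optional[str] = None,
) -> List[Passenger]:
    """Return candidate passengers; the result may hold zero, one, or many."""
    matches = [
        p for p in passengers
        if p.flight_number == flight_number and p.flight_date == flight_date
    ]
    if family_name is not None:
        matches = [p for p in matches if p.family_name.lower() == family_name.lower()]
    return matches


passengers = [
    Passenger("P001", "Jones", "Fred", "XX123", date(2024, 5, 1)),
    Passenger("P002", "Smith", "Ann", "XX123", date(2024, 5, 1)),
    Passenger("P003", "Jones", "Fred", "XX123", date(2024, 5, 2)),
]

candidates = shortlist(passengers, "XX123", date(2024, 5, 1), family_name="Jones")
print([p.passenger_id for p in candidates])
```

The function returns a list on purpose: nothing guarantees a single match, so the user interface is expected to present the shortlist and let the user pick the right instance.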

Well-defined Business Entity Identifier Attributes

From a well-defined data perspective, an essential part of defining a business entity is identifying an attribute that can act as its business entity identifier. A business entity identifier needs a business-friendly, meaningful name. For the sake of the IT-based system responsible for creating new instances of the identifier, any ‘facts’ within the identifier need to be described and, at some point, specified in detail with regard to position, length, and valid values and/or value ranges.
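One possible way to record the position, length, and valid values of each fact is a small declarative specification that the system can validate against. The sketch below reuses the SKU example from earlier in this article. The `Segment` class, the zero-based offsets, and the value lists are assumptions, not a prescribed format.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Segment:
    name: str
    position: int  # zero-based start offset within the identifier
    length: int
    valid_values: Optional[FrozenSet[str]] = None  # None: any value accepted


# Hypothetical specification for SKUs shaped like 'LEV-JN-ST-34-BL'.
SKU_LENGTH = 15
SKU_SEGMENTS = [
    Segment("brand", 0, 3, frozenset({"LEV"})),
    Segment("clothing_type", 4, 2, frozenset({"JN", "SH"})),
    Segment("style", 7, 2, frozenset({"ST", "SL"})),
    Segment("size", 10, 2),
    Segment("color", 13, 2, frozenset({"BL", "WH"})),
]


def validate(identifier: str, segments: List[Segment], total_length: int) -> List[str]:
    """Return a list of problems found; an empty list means the value conforms."""
    if len(identifier) != total_length:
        return [f"expected length {total_length}, got {len(identifier)}"]
    errors = []
    for seg in segments:
        value = identifier[seg.position:seg.position + seg.length]
        if seg.valid_values is not None and value not in seg.valid_values:
            errors.append(f"{seg.name}: unexpected value {value!r}")
    return errors


print(validate("LEV-JN-ST-34-BL", SKU_SEGMENTS, SKU_LENGTH))
print(validate("LEV-XX-ST-34-BL", SKU_SEGMENTS, SKU_LENGTH))
```

Keeping the specification as data rather than burying it in code means the same definition can serve both as documentation for the business and as the rule set the system enforces.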

Click here for Part 6 — Attributes that Name

Well-defined Data Part 4 – Products Customers Sales and Locations

This article discusses entities supporting the concepts product, customer, sale, and location.

The names given to these entities vary depending on the line(s) of business an organization is in and, in particular, the organization’s sales processes.

Product-Related Entities

The concept of product within the context of this series was defined in Part 3 as covering both goods and services, where ‘goods’ are things a customer can take ownership of and ‘services’ are resources a customer uses for a period of time. Goods have inventory levels (i.e. one or more instances available for purchase). Services have resource availability, at designated times (i.e. trained or qualified individuals to perform the service, and/or equipment appropriate to fulfil the service).

An entity representing an organization’s products will do so as a type, a batch, or as an instance.

Product Type – The products offered for sale by many organizations are not unique. For example, clothes, furniture, electronics, and pre-packaged food found in retail stores. A product type entity is used to record the name, size, price, etc., where values for these attributes apply to all instances of a given product.

Product Batch – Products that are mass-produced can have one or more things vary within their production process (e.g. involve batches of components or ingredients from different suppliers). When it’s critical to an organization to track these variations, a product batch entity is needed. Its purpose is to maintain details about critical variable elements. Each product instance produced as part of that batch is associated with its batch instance.

Product Instance – Some goods products are, by their nature, unique. Examples include property (i.e. real estate) and works of art. Other goods start life as similar, mass-produced instances, but can take on individual characteristics over time as the result of normal use, instance-specific modifications, or damage.

When an organization sells used, modified, or damaged goods, an entity representing each product instance is needed. That entity may make reference to the instance’s product type, if available, to represent its time-independent attributes. The product instance entity will need attributes similar to those of the product type entity, where a value that was common when the product was new can change over time. Plus additional attributes will be needed in the product instance entity to classify and describe its current condition.

In the case of service products, a product instance entity is not for the purpose of representing changes over time. Its objective is to represent specific ‘offerings’ of the service. Examples include appointment time-slots, rental availability periods, a specific scheduled flight on a route, the designated time and venue for the screening of a film, a live performance, or sporting event.

Well-defined data about products involves recognizing which of the above levels are needed by the organization, and naming those entities appropriately. When multiple levels are involved, product details (attributes and relationships) are included at the appropriate level, including the linking between levels.
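The three levels and the links between them might be modelled along the following lines. This is a sketch under stated assumptions: the class and attribute names are hypothetical, and the rule that an instance-level price overrides the type-level price is one possible reading of 'a value that was common when the product was new can change over time'.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProductType:
    product_type_id: str
    name: str
    price: float  # applies to every instance unless overridden


@dataclass
class ProductBatch:
    batch_id: str
    product_type: ProductType
    component_supplier: str  # the variable element being tracked


@dataclass
class ProductInstance:
    instance_id: str
    product_type: ProductType
    batch: Optional[ProductBatch] = None
    condition: str = "new"  # describes the instance's current state
    price_override: Optional[float] = None

    def current_price(self) -> float:
        # Fall back to the type-level price when nothing instance-specific is set.
        if self.price_override is None:
            return self.product_type.price
        return self.price_override


sofa_type = ProductType("PT-1", "Three-seat sofa", 900.0)
batch = ProductBatch("B-7", sofa_type, component_supplier="Supplier A")
new_sofa = ProductInstance("I-1", sofa_type, batch)
damaged_sofa = ProductInstance("I-2", sofa_type, batch, condition="damaged", price_override=600.0)
print(new_sofa.current_price(), damaged_sofa.current_price())
```

An organization that sells only new, mass-produced goods could stop at `ProductType`. The batch and instance levels are added only where tracking at that level is needed.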

Customer-Related Entities

An organization supports processes to maintain details about its customers within an IT-based system for one or both of the following reasons:

  1. The organization has an ongoing relationship with, or obligation to, its customers in relation to the sale of one or more of its products (e.g. supply of electricity, a loan, an insurance policy).
  2. The organization wants to be able to communicate with its customers to sell them additional products (e.g. the newest generation of a consumer product, a one-off service that a customer can benefit from periodically).

There are many organizations, such as those involved in retail sales, which sell their products to unknown customers. One mechanism commonly used by organizations to know who their customers are is a customer loyalty scheme. The customer provides their name and contact details in exchange for discounts or other forms of rewards. Another mechanism is warranty registration, where a manufacturer records the end-consumer of its products, even though the product was not sold directly to that individual.

Some organizations offer some or all of their products only to qualified customers. Banks only provide loans to credit-worthy customers, but offer savings accounts to any customer. Some educational organizations want only the ‘best possible’ students to fill their limited student numbers. Training organizations that have more availability than students will typically accept anyone that applies. Certain healthcare procedures may require a patient to be in an appropriate state of fitness. Specific government services may only be provided to persons of a given status.

Depending on the products an organization offers, and its sales processes, the types of customers it maintains in its IT-based system may be one or more of the following:

An Individual – a person able to provide enough information that they can be distinguished from other individuals within the IT-based system. E.g. name, address, phone number, government-issued ID number.

Multiple Individuals – two or more people who are jointly involved in the sale of a product. E.g. a joint bank account, a family mobile phone plan.

An Association – a named group of individuals associated through a common factor (e.g. a profession, sport, or hobby). The association itself may be the customer, or individual members of the association may be.

A Registered Business – an individual or organization that has officially registered to operate as a legal entity. E.g. a sole trader, a corporation.

A Reseller – an individual or organization involved in the sale of an organization’s products, either with the intent of reselling the products to their own customers, distributing the products further down a supply chain, or acting on behalf of individuals or organizations (e.g. a travel agent). A reseller may be franchised to use the branding of the organization.

A Governmental Branch or Department – an organization unit within a governing body with authority to procure goods or services from external sources. E.g. The Army, The Department of Education, a state-owned enterprise.

A Registered Internet User – a person that has established a unique logon ID with an organization’s internet site. The organization operating the site may or may not have a business process that associates the user to an existing customer. In cases where it does, the user is not an additional customer, but an individual with access to one or more self-service capabilities on behalf of that customer.

Well-defined data about customers recognizes the line-of-business-specific name the organization uses to reference them, such as Client, Agent, Patient, or Resident. Attributes maintained for customers include name, contact details, and information applicable to repeatable sales events (e.g. account balance, available credit, medical history).


Sale-Related Entities

The concept of sale within the context of this series was defined in Part 3 as any type of event where a customer commits to consuming one or more of the organization’s products. Depending on the product(s) involved, and the organization’s end-to-end sale process, there may be pre-sale or post-sale events triggering activities within the end-to-end process. These activities can involve additional sale-related entities.

Pre-Sale — A pre-sale process may be triggered by a customer or the organization. Where a customer is seeking a product offered by a particular organization, the customer triggers the process. Examples of pre-sale activities within the sales process include the customer filling out an application or order that identifies the desired product(s), or requesting an appointment, quote, or reservation.

Where the organization proactively seeks a sale by contacting a customer, each contact event is supported by an activity that can result in an offer being made to the customer.

Having initiated a pre-sale process, there can be additional events and their associated activities that take place prior to formal commitment to the sale. Examples include the customer making a refundable deposit, the organization providing a quote, submitting a bid, drafting a statement of work, or the customer and organization negotiating a contract.

Sale Commitment — Within an organization’s sale process there will be at least one activity representing the customer and/or the organization formally committing to the sale. The customer can be said to place an order, sign a contract, or pay for a booking.

Post-Sale Events – Lastly, there may be post-sale events related to the product sold that trigger activities within the end-to-end sale process and involve post-sale-related entities. From the customer side these may involve subsequent purchases or usage of the product within the terms of a contract, either incurring an additional charge or utilizing pre-paid credit. They may also involve the return of a rented item.

From the organization side, a post-sale event may involve charging a periodic service fee. Or a product may be delivered subsequent to the sale (e.g. the operation of a scheduled flight or an entertainment event taking place that was ticketed in advance). Payment may be due, or overdue for a product that was sold on credit. A service may involve periodic reporting, such as the production of a statement of account.

Well-defined data about a sale includes all entities involved in the end-to-end sales process, naming them in accordance with the line of business and organizational terminology. Attributes maintained for these entities can include the event-related dates, quantities of usage, and amounts charged to or paid by the customer.

Location-Related Entities

The concept of location within the context of this series was defined in Part 3 as a place managed by the organization for the purpose of the selling or the consuming of its products. Examples include retail stores, hotels, library branches, and properties provisioned for usage of utility services such as water or electricity.

As with products, customers, and sales, the organization’s location types relate to a given line of business. Locations have a positional aspect and an operational aspect.

The position of an organization’s locations can be:

Area-based — one or more named geopolitically-defined areas (e.g. suburbs, cities, states/provinces, countries), or map-drawn and named by the organization (e.g. sales districts, service coverage areas).

— Two or more named points defining start, end, and any intermediate stopping points. E.g. passenger air, rail or bus routes, goods transportation routes.

— A place identifiable by address and/or map-based reference. This includes locations the organization provisions with sales-related staff, such as a retail store, hotel, or bank branch. It also includes residential and commercial properties the organization has provisioned with network-based access to its product (e.g. water, gas, fibre).

— An identifiable point or area within a larger location. E.g. a designated floor within a building, section or seat within a sports/entertainment venue, a specific storage location within a warehouse.

Where goods are involved, an organization cares about its locations from both an operational and inventory perspective. Operational in the sense of business hours and sales staff on hand during those hours.

Where services are involved there is additional need for service-appropriate resources to support advanced or ‘walk-in’ sales. E.g. flight crew, medical staff, a rental vehicle, an aircraft or ship to provide seating or cargo space.

Well-defined data about locations involves line-of-business-specific entity names. Attributes within those entities are needed to represent each location’s position, operation, inventory, and other required resources.

NOTE: Depending on organizational context, a location can be an area or a point. E.g. from the perspective of a city authority, the city is a bounded area, but from the perspective of an airline it is a point.

Click here for Part 5 — A Business Entity Identifier Attribute

Well-defined Data Part 3 – Line-Of-Business-Specific Entities

This series is about well-defined data within the context of IT-based business information systems. In this article and the next, entities specific to an organization’s line(s) of business are discussed.

These entities represent an organization’s products, customers, sales, and sales-related locations. They will be viewed within the context of five line-of-business functions. These functions, taken in sequence, represent the business processes that support any product as it goes through its lifecycle.

For the purpose of the discussion of line-of-business-specific entities, the terms product, customer, sale, and location are defined as follows:

Product – This term is intended to represent either goods or services offered by an organization to its customers. Goods are things the customer can take ownership of, subject to available inventory. Services are resources (types of or specific persons or things) used by the customer for a period of time, subject to availability at a given point in time.
Customer – This term is used to represent an individual, group, or organization that consumes the organization’s products. Common industry-specific synonyms for the term include account holder, passenger, patient, and resident.
Sale – This term is used to represent any type of event where a customer commits to consuming one or more of the organization’s products. A sale can be immediate, such as a grocery purchase. A sale of a long-running product, such as water or electricity, involves ongoing consumption. A sale can be a commitment to consume a product in the future, such as a ticket for a scheduled airline flight or concert.
Location – Within the context of line-of-business entities, a location is a place managed by the organization for the purpose of selling or consuming its products. Examples include retail stores, hotels, library branches, and properties provisioned for access to utility services such as water or electricity.

Lines of Business

A line of business centres on the products an organization offers for sale to customers. Unlike generic business entities such as Staff Member or General Ledger Account, which have entity names, attributes, and relationships common across all organizations, the entity names, attributes, and relationships for entities representing the line-of-business concepts product, customer, sale, and location differ widely from business to business.

For example, in the commercial airline line-of-business, the entity representing the concept customer is referred to as a Passenger. The sales process can start with a Reservation for space within a specified seating class on a scheduled Flight. The good news is that while line-of-business-specific entities vary widely in this way, they tend to be similar among organizations in a given line of business.

Every organization will have at least one product, and therefore one line of business. An example of an organization that has multiple lines of business is the Walt Disney Company. Its products include films, film-related merchandise, and its Disney-themed amusement parks, hotel accommodations, and cruises.

Line-of-Business Functions

Entities representing customers, products, sales, and locations differ between different lines of business. However, there are five business functions that, taken in sequence, represent the business processes that support any product as it goes through its lifecycle. These functions are Marketing, Product Development, Sales, Customer Care, and Product Decommissioning.

Figure 1 – Line of Business Functions Supporting a Product Through Its Lifecycle

While at the highest level the line-of-business functions are common for any product, the business processes within each function will vary due to the differences in the types of products, customers, sales, and locations involved — for example, booking films with movie theatre operators involves very different processes than renting hotel accommodations to travellers. But while processes within each of these functions will differ based on the specific line of business, the overall objective of each function is common for any type of product.


Marketing – The objective of marketing is to identify and maintain products that strike a balance between the organization’s goals and objectives, and the affordable needs or wants of customers (or potential customers). Product-related decisions include what new products should be offered, what changes, if any, should be made to existing products, and when to cease offering a product. These decisions may be supported by a business case that presents projected revenues, costs, and risks for one or more options.

When the organization decides to move ahead with an option, the product development phase of the product’s lifecycle begins.

Product Development – The objective of the product development function is to ‘bring the product to market’. Having decided to add, change, or decommission a product, things need to happen within the organization. The decision may be as simple as changing the price of an existing product, or as complex as adding a whole new line of business. Existing business processes may require changing, or whole new processes may need to be put into place.

If new or existing business processes are to be supported by an IT-based system, that system needs to be put in place, modified, or data within it updated (e.g. pricing changes). If an existing IT system is being replaced by a new one, migration of data may need to be carried out. Where staff are affected by changes to processes and/or an IT system, training needs to be conducted for those affected.

Only after all the necessary additions or changes have been successfully ‘put into production’ is the organization ready for the sales phase for the product.

Sales – The objective of the sales function is to get customers through the product’s sales process (ideally along the process’s happy path). Where a process involves an IT-based system, the product development phase has done (and tested) what’s required to support the process. The products and locations have been set up in the system and are ready to be referenced during the sales process. The process may include adding a new customer. It will certainly involve recording details relevant to the sale. For long-running products, there will be processes for recording consumption over time.

NOTE: Where customers are encouraged or allowed to self-serve, they must be able to gain access to the appropriate user interfaces. Given access, well-defined data includes their being able to understand what information they need to provide to successfully carry out a given self-service process, such as purchasing products, or maintaining account details.

When a customer has a problem or issue after a sale, or during the operation of a long-running product, the processes within the product’s customer care lifecycle phase come into play.

Customer Care – The objective of the customer care function is to deal with customers in situations where something has gone wrong, or appears to have gone wrong, with the product they have been sold. For example, it’s broken, or is believed to be broken. Or a customer’s long-running product is not operating correctly, or appears not to be operating correctly. Or a customer wants to return the product, or cease consuming a long-running product.

To resolve these types of customer-related issues, customer care staff need access to line-of-business-specific details about the organization’s products, customers, sales, and locations. Ideally these have been recorded in an IT system during the product development, sales, and customer care phases of the product lifecycle. Where customers are provided with self-serve support options (such as frequently asked questions), those product-specific details should be accessible.

Product Decommissioning – The objective of the product decommissioning function is to end access to an existing product within the sales process. From an IT-based system perspective, it’s about preventing new instances of the sale event from occurring. For long-running products, where possible, existing customers would be switched to an alternate product that has been added as a successor product, or some other similar, available product.

Additional business processes within this function would include such things as notifying sales staff of the product being decommissioned, removing it from any advertising campaigns, and (at a designated point) curtailing customer care for it.

Having looked at the objectives of line-of-business functions that support any type of product going through its lifecycle, the next article will take a closer look at the four concepts representing line-of-business-specific entities.

Click here for Part 4 — Product, Customer, Sale, and Location Entities

Well-defined Data Part 2 – Generic Business Entities

We begin our journey down the path of well-defined data by looking at generic business entities.

These are entities such as General Ledger Account, Staff Member, and Asset. They are well-understood within any organization large enough to warrant one or more IT-based business systems supporting functions such as accounting, human resources, or asset management. These functions and business entities are well supported today by commercial off-the-shelf (COTS) packages. So well supported, it’s difficult to imagine any organization justifying a decision to ‘make’ in-house rather than buy.

In my previous series, Requirements in Context, a generic high-level business model was discussed. The model represents business functions applicable to any organization, regardless of it being in the public or private sector, for profit or not for profit. The business functions are categorized as management, line of business, or support.

The management and support functions are generic in that their processes and entities are common to any organization. The line-of-business functions are generic in that they describe the life cycle of any product or service an organization deals in. But the processes within those functions and the business entities supporting those processes vary by industry segment and even by organization within a given segment.

This article focuses on the generic business entities associated with management and support functions. Parts 3 and 4 of this series look at the line-of-business functions and their line-of-business-specific entities.

Generic Business Processes and Entities

Within each management and support function are a number of business processes. For example, human resources processes include organizational management, recruitment, and payroll. Each of these generic processes is described briefly below along with the generic business entities they create or reference.

NOTE: As you read these process descriptions, consider how each can apply to any organization in any industry sector. For example, a term such as ‘staff member’ is used in relation to a position within the organization. When a staff member is on an employment contract, they are part of the payroll process and eligible for benefits such as paid leave. If they are on a service contract, they are paid based on an invoice for their time and are not eligible for paid leave. Again, this is generic to any organization regardless of the line(s) of business the organization operates.

Organizational management is about maintaining the generic business entity Position and its relationship to other positions in the organization’s management structure.
Recruitment is about finding and engaging a suitable person to fill an existing, budgeted, vacant Position. When the process involves hiring from within the organization, existing instances of Staff Member will be referenced (those on employment contracts). If there are no suitable/available internal candidates the process will create new instances of Applicant as external people apply for the position.
The successful candidate will be an existing or new instance of Staff Member. If the position is filled by a person who has signed an Employment Contract, the contract will have details of the agreed salary (or wage) and negotiated leave benefits. If the position has been filled based on a Service Contract, that contract will have been signed for an agreed rate, either with the individual or with an agent representing that individual.
Payroll involves referencing instances of Staff Member on a salary or wage. For a person on a salary, the process needs to adjust the periodic salary paid within the pay period based on any unpaid Leave. For a person on a wage, it needs to calculate the total pay based on Hours Worked, factoring in overtime rates (where applicable) plus payment for any accumulated Leave taken.
The resulting Employee Payment will also need to be reduced by any withheld taxes and other deductions. NOTE: Payments against a service invoice are not managed by a payroll process.
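The payroll rules described above can be sketched in a few lines of code. This is an illustration only, not a description of how any particular payroll package works. The function names, the pro-rata treatment of unpaid Leave, the 40-hour overtime threshold, and the 1.5 overtime multiplier are all assumptions made for the example.

```python
from decimal import Decimal

CENT = Decimal("0.01")

def salaried_gross(period_salary, working_days, unpaid_leave_days):
    """Salary for the pay period, reduced pro rata for unpaid Leave (assumed rule)."""
    paid_days = working_days - unpaid_leave_days
    return (period_salary * paid_days / working_days).quantize(CENT)

def wage_gross(hourly_rate, hours_worked, paid_leave_hours,
               overtime_threshold=Decimal("40"),
               overtime_multiplier=Decimal("1.5")):
    """Wage for the pay period from Hours Worked, overtime, and paid Leave taken.

    The threshold and multiplier defaults are hypothetical values.
    """
    ordinary = min(hours_worked, overtime_threshold)
    overtime = max(hours_worked - overtime_threshold, Decimal("0"))
    paid_hours = ordinary + overtime * overtime_multiplier + paid_leave_hours
    return (hourly_rate * paid_hours).quantize(CENT)

def employee_payment(gross, withheld_tax, other_deductions=Decimal("0")):
    """Employee Payment: gross pay less withheld taxes and other deductions."""
    return gross - withheld_tax - other_deductions
```

Note that each input to these functions corresponds to an attribute of one of the entities named in the process description: Leave, Hours Worked, and the salary or wage recorded on the Employment Contract.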


The above three processes within the human resources function involve one or more of the following business entities:

  • Position
  • Staff Member
  • Applicant
  • Employment Contract
  • Service Contract
  • Leave
  • Hours Worked
  • Employee Payment

These entities are generic in that none of them are specific to a particular line of business.

Well-defined Generic Business Entities

The primary objective of a definition is to ensure that the thing being defined will be recognized and understood by those that need to deal with it. A classic example is the “Duck Test” (i.e. how to recognize a duck). The test goes, “If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck.” From a data perspective the thing (a duck) is defined by a description attribute (an image of a duck), its relationship to an observed event (swimming), and a classification (the type of sound it makes).

From a business analysis perspective, the test for recognizing a Staff Member, and therefore its business definition, can be stated as: “A person engaged by the organization under an employment or services contract to fulfill the duties and responsibilities of a designated position.”

In addition to a name and definition for each entity, the following properties are commonly included for entities in a data dictionary template:

  • ID — an identifier unique within the context of the data dictionary
  • Alias(es) — other business terms that users of the data dictionary may know this concept as
  • Owner — typically the organizational position responsible for sourcing and/or maintaining instances of the entity (e.g. Head of Recruitment in the case of Staff Member).
  • Data Sources — operational and data migration sources (e.g. for Staff Member, details captured during the recruitment process such as name, contact details, skills).
  • Current Volume — order-of-magnitude number of instances of the entity. Of particular interest when numbers are low (under 100) or very high (above 10,000).
  • Expected Growth — order-of-magnitude additional instances (per day, week, month, or year). Of particular interest when growth will increase the total by one or more orders of magnitude.
  • Peak Volumes — order-of-magnitude number of entity instance references and/or additions. Of interest when the number is an order of magnitude higher (or more) than the Current Volume value. Applies more to event-based line-of-business entities involved in sales-related processes.
  • Business Identifiers — [To be covered in Part 5 in this series]
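For readers who think in code, the entity properties listed above could be recorded in a structure along the following lines. This is a minimal sketch and not the data dictionary template used in this series. The class name, field names, and the wording of the notes are assumptions. The thresholds (under 100, above 10,000, and an order of magnitude taken as a factor of ten) follow the property descriptions in the list.

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class EntityEntry:
    """One entity in a data dictionary (hypothetical structure)."""
    entry_id: str
    name: str
    definition: str
    aliases: list[str] = field(default_factory=list)
    owner: str = ""
    data_sources: list[str] = field(default_factory=list)
    current_volume: int = 0
    expected_growth_per_year: int = 0
    peak_volume: int = 0

    def volume_notes(self) -> list[str]:
        """Flag the volume figures the property descriptions call 'of interest'."""
        notes = []
        if self.current_volume < 100:
            notes.append("low current volume")
        elif self.current_volume > 10_000:
            notes.append("very high current volume")
        if self.current_volume > 0:
            grown = self.current_volume + self.expected_growth_per_year
            if grown >= 10 * self.current_volume:
                notes.append("growth adds an order of magnitude")
            if self.peak_volume >= 10 * self.current_volume:
                notes.append("peak is an order of magnitude above current")
        return notes

# Example entry; the ID and volume figures are made up for illustration.
staff_member = EntityEntry(
    entry_id="E-001",
    name="Staff Member",
    definition=("A person engaged by the organization under an employment or "
                "services contract to fulfill the duties and responsibilities "
                "of a designated position."),
    owner="Head of Recruitment",
    data_sources=["Recruitment process"],
    current_volume=850,
)
```

Business Identifiers are deliberately left out of the sketch, as they are covered later in the series.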

COTS Generic Business Information Systems

There are numerous commercially-available business information systems covering one or more processes within a management or support function — for example, Accounting systems, HR systems, Asset Management systems. The advantage of acquiring multiple modules from the same vendor is that the data is integrated (i.e. entered once, referenced multiple times). The alternative is to either have interfaces between systems, copying data from one to the other, or re-entering data in each system as required.

ERP systems (Enterprise Resource Planning) offer an integrated set of modules that span multiple management or support functions.

Click here for Part 3 — Line-of-business-specific Entities

Well-defined Data Part 1 – Series Introduction

The objective of this series is to take an in-depth look at data required for an IT-based business information system.

Techniques and concepts for business analysts thinking about and documenting entities, attributes, and relationships will be presented. This introduction to the series defines what is meant by well-defined data and the rationale for it.

What Is Well-defined Data?

Data, to be well-defined, should be both well organized and well specified. Well-organized data follows the old proverb, “A place for everything and everything in its place.” In business analysis terms this translates to “An entity for everything and every attribute in its appropriate entity.”
Well-defined data begins with determining the best business name for an entity or attribute. From that point the defining continues, capturing a definition plus details needed to get that data up and running in an IT-based system. Ideally some form of data dictionary would be used to record these details. Throughout this series one example of a data dictionary template will be used. It will be seen to include entity properties such as current volumes and growth rate, and attribute properties such as data formats and precision.

Who Needs Well-defined Data? 

Outdated IT System Replacement — Any organization that wants to build or acquire a new system to replace an outdated one needs the best possible data definition for data in the current system. There will be fields that turned out not to be of any business use. There will be fields originally intended to be used for one thing that ended up being used for something else. And there will be data needed by the business that the current system does not support. Some or all of this unsupported data may be managed by the business outside the current system, in spreadsheets and the like. The best possible definition of all this data is needed to support designing or acquiring a replacement system, migrating data to it, and training current users on where to find the data they need in the new one.

IT System Vendor — Any vendor of commercial off-the-shelf (COTS) software needs the data underlying its software well defined. This information is used to convince prospective customers of the software’s capabilities, and to respond accurately to requests for quotes (RFQs). When a sale of the software is made, well-defined data is needed to support system configuration, data migration, training, and development of any bespoke reports or interfaces required.

Requirements Documenter — A business analyst responsible for producing requirements documents should include well-defined data in those documents. A high-level requirements document (Stakeholder requirements in IIBA BABOK® terminology) typically will have a glossary rather than a fully-detailed data dictionary. The glossary names and definitions will be useful as input to the data dictionary developed later in the project. A detailed requirements document (Solution and Transition requirements in IIBA BABOK® terminology) ideally would include a full data dictionary as a central point of definition for entities and attributes referenced in detailed specifications for screens, reports, interfaces, and batch processes.

Other Waterfall SDLC Team Members — Any member of a team involved in waterfall development based on signed-off requirements needs well-defined data. This includes:

  • Designers
  • Developers
  • Testers involved in integration, end-to-end, and user acceptance testing
  • Data migration team members
  • Trainers — of end-users or train-the-trainers
  • Technical writers of user manuals


People in all of these roles look to requirements documents to support their deliverables. A central place where data is defined, either in each document or centrally for the project, would be of great benefit. NOTE: If available, an organization-wide data dictionary should be referenced for existing business data definitions, adding to that resource any additional project-defined terms and their definitions.

Agile Scrum Team Members — As user stories are written and refined they will reference entities and attributes that need to be delivered. Maintaining these in a shared data dictionary would mean consistent delivery of the data component across different epics or features.

What’s Ahead in this Series

Entities — The next three articles focus on Entities. The first will discuss generic business entities. These have business names and definitions common within the IT systems that support functions common to all organizations, such as accounting or human resources. For example, the entities General Ledger Account and Journal Entry within accounting, and Staff Member and Position within human resources.

The following two articles focus on line-of-business-specific entities. An organization’s line(s) of business influence the entity names applicable to its products, customers, sale-related business events, and locations. For example, an airline sells a Ticket to a Passenger on a specific Flight. A public library acquires a Book Copy, allowing a registered Patron to Borrow it from a Branch.

The first of these two articles discusses five generic line-of-business functions — Marketing, Product Development, Sales, Customer Care, and Product Decommissioning. These are seen to represent a product lifecycle common to all organizations. The following article focuses on each of the four primary business entity concepts a given line of business deals with — products, customers, sales, and locations.

Attributes — An attribute’s properties vary based on its intended purpose within its entity. Articles will be dedicated to discussing each of the following purposes:

  • Being the entity’s Primary Business Identifier — E.g. Customer Number, Employee Number, Account Number. 
  • Naming — E.g. codes, abbreviations, people, products, buildings.
  • Quantifying — E.g. Currency amounts, product dimensions.
  • Point-In-Time Happening — E.g. Date of Birth, Purchase Date/Time.
  • Describing — E.g. in sentences, photographically, graphically.
  • Classifying — selecting from a pre-defined list. E.g. Customer Type, Skill, Gender.
  • Identifying an external entity instance — E.g. customer’s driver’s licence or credit card details.

Attribute History — Two kinds of attribute history will be discussed — business-meaningful history and audit history. Business-meaningful history means that users need visibility of changes to an attribute’s value over time, as part of normal business processes. E.g. Account Status, Discount Rate. Audit history is only needed in exception cases, where an attribute’s value is not correct, and the business wants to know the source of the incorrect value and what the previous value was.
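The difference between the two kinds of history can be sketched as two different record shapes. This is a sketch under assumptions rather than a prescribed design: the class names, field names, and the effective-dating approach are mine, chosen only to show that business-meaningful history is queried as part of normal processing (what was the Account Status on a given date?), while audit history simply records who changed what and when.

```python
from dataclasses import dataclass
from datetime import date, datetime

@dataclass
class EffectiveValue:
    """Business-meaningful history: a value and the date it took effect."""
    value: str
    effective_from: date

def value_as_at(history, as_at):
    """Return the value in effect on the given date, or None if none had started."""
    current = None
    for entry in sorted(history, key=lambda e: e.effective_from):
        if entry.effective_from <= as_at:
            current = entry.value
    return current

@dataclass
class AuditRecord:
    """Audit history: consulted only when a value turns out to be wrong."""
    attribute: str
    previous_value: str
    new_value: str
    changed_by: str
    changed_at: datetime

# Hypothetical Account Status history for one account.
account_status = [
    EffectiveValue("Active", date(2023, 1, 1)),
    EffectiveValue("Suspended", date(2023, 6, 1)),
]
```

With the sample data above, asking for the status as at 15 March 2023 would return "Active", and asking as at 1 July 2023 would return "Suspended".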

Relationships — The three classic relationship types — one to many, many to many, and one to one — will be discussed. The use of screen mock-ups as a mechanism for defining these will be compared to how they are normally defined using entity/relationship diagrams.

Derivable Data — Different levels of complexity of data derivation will be discussed, including simple totalling, point-in-time summations (such as year to date figures), and complex rule-based derivations (such as discount amounts based on historical purchases).
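The three levels of derivation mentioned above can be illustrated with a short sketch. The function names, the use of the calendar year for the year-to-date figure, and the discount tiers are assumptions made purely for the example. A real rule set would come from the business.

```python
from datetime import date
from decimal import Decimal

def order_total(line_amounts):
    """Simple totalling: the sum of an order's line amounts."""
    return sum(line_amounts, Decimal("0"))

def year_to_date(purchases, as_at):
    """Point-in-time summation over (purchase_date, amount) pairs.

    Assumes a calendar year; an organization may use a financial year instead.
    """
    return sum(
        (amount for purchased_on, amount in purchases
         if purchased_on.year == as_at.year and purchased_on <= as_at),
        Decimal("0"),
    )

def discount_rate(ytd_spend):
    """Rule-based derivation using hypothetical tiers on historical purchases."""
    if ytd_spend >= Decimal("5000"):
        return Decimal("0.10")
    if ytd_spend >= Decimal("1000"):
        return Decimal("0.05")
    return Decimal("0")

# Made-up purchase history for one customer.
purchases = [
    (date(2023, 12, 20), Decimal("300")),
    (date(2024, 1, 15), Decimal("600")),
    (date(2024, 3, 2), Decimal("500")),
    (date(2024, 9, 1), Decimal("200")),
]
```

In each case the derived value is not stored as an attribute in its own right; it is calculated from attributes that are.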

Additional Topics — As this series evolves there may be additional topics that prove worthy of presenting. One that I know is lurking in the background is mandatory data (attributes or relationships). Stay tuned.

Click here for Part 2 — Generic Business Entities