2.2. Methodology: Data Architecture (TERMINOLOGY MANAGEMENT)

=== Methodology Section 2 - Data Architecture ===

2.  Semantic/Metadata/Data Architecture.

Expensive, time-wasting miscommunication and disagreement are common in "wild vocabulary" (non-semantic) efforts to model the:

(I) Solution (Patterns and then Systems) Architecture (with its inventory of capabilities and their composition and distribution across the business),

(II) Technology Architecture (with its inventory of technology standards, products and services that support those standards, and the mapping of the product and standard composition to each other),

(III) Semantics/Metadata/Data Architecture (e.g., the metadata and data of structured Conceptual Data Models (i.e., Concept Maps, DoDAF OV2), Logical Data Models (ERD from Concept Map), and Physical Data Models (DB Schema or RDF/OWL Metamodel for the storage mechanism)), and

(IV) Business Architecture (with its interdependent strategic management efforts, its assignment of responsibility, the maturation of its guidance, and its formulation of its internal, boundary, and external security requirements and constraints).

The more tangible and narrowly focused (I) Solution Architecture and (II) Technology Architecture may seem to be the most practical and useful EA parts to build first, while the higher-order, more abstract (III) Semantics/Metadata/Data Architecture and (IV) Business Architecture seem least practical and useful, and are thus often built last, if at all.

Unfortunately for most "practical" efforts to implement an EA or information system, the inverse order is easier, more consistent, and better for the enterprise as a whole.  When considered from a broader, unitary enterprise perspective (e.g., from the CEO, COO, Board/Legislature, Owner/Investor/Citizen viewpoints), it is more economical, effective, and faster to build the EA sub-architectures in the order: (III) Semantics/Metadata/Data, (IV) Business, (I) Solution, and (II) Technology.  Without a controlled vocabulary at the beginning, along with its standard references, master metadata, and master data for the enterprise activities, all subsequent efforts that require communication, understanding, and agreement on terms and symbols used within and around the enterprise are typically a slow, inefficient, and ineffective Babel.

To quantify this, I submit that 80% of the cost of any annual operation or a one-time capability development effort is caused by uncontrolled, or wild, vocabularies (also called "uncontrolled Terminology") across the diverse individuals and groups participating in those efforts.  Reduce the Babel (i.e., uncontrolled terminology) in these efforts and you reduce the effort’s cost and time requirements, while increasing the effort’s resulting effectiveness and quality.  So no Executive, Manager, Supervisor, Worker, or Contractor should operate a single day without a controlled vocabulary for their work scope and broader external environment.

Building the (III) Semantic/Metadata/Data Architecture first, to generate a "controlled vocabulary" (e.g., an ontology, taxonomy, and thesaurus) of the enterprise, enables all subsequent EA and enterprise operation efforts to be performed with lower "communication and agreement" costs.

To simplify and harmonize the overall EA effort, in parallel with the Bottom-up and Top-down EA activities, I therefore recommend that EA efforts go through the following (III) Semantic/Metadata/Data Architecture procedure first, before your (IV) Business, (I) Solution, and (II) Technology Architecture efforts, to reduce the inevitable costly and stultifying friction and rework that happen if every participant’s terms and symbols are not in harmony.

The Enterprise Architect will need to work closely with the organization to build the (III) Semantic/Metadata/Data Architecture independently of the Business or Solution architectures, using a Terminology Management process to develop what I have called an organization’s "intelligence inventory", which has at its most advanced levels a "controlled vocabulary".  A terminology management process builds intelligence capabilities, including the progression of semantic products such as lexicons, glossaries, dictionaries, concept models (e.g., concept maps, conceptual/logical data models, process models, ontologies, and axiologies), taxonomies, and thesauri.  The ontologies, axiologies, taxonomies, and thesauri are the "controlled vocabularies" of an endeavor.  In direct relation to the speed and capacity of the communication media of the organization, this terminology management process provides the means for "lowest-detail to highest-detail" integration, starting with "human interoperability".

A terminology management process can be used to automatically add an individual’s personal intelligence into group intelligence, group intelligence into organization intelligence, organization intelligence into national intelligence, and national intelligence into the global intelligence, while simultaneously providing the mechanism for identifying/controlling appropriate access and sharing of that intelligence.

The procedure for this Semantic/Metadata/Data Architecture (intelligence inventory) is below.

==== GEM Terminology (Intelligence Inventory, Fusion, and Refinement Procedure) ====

Subject: How to organize and share your information, awareness and learning, and to cooperate.

(A Procedure for Intelligence Fusion and Refinement)

People with different vantage points and experiences naturally tend to use ambiguous and divergent terms where their vocabularies overlap.  An individual’s evolving vocabulary (i.e., the list of terms forming their "lexicon") is the basis for their joining and forming groups, communities of interest, communities of practice, etc.  That is, people come together because of shared interests and a shared purpose, around a mutually understandable vocabulary: their group’s "glossary".  They control their glossary, and the definitions of the glossary terms, to maintain group cohesion, coherence, boundaries, and security.  This is the "semantic analysis" part of their enterprise management architecture (EMA), within a broader general endeavor management (GEM) approach.

Below is a simple, and relatively easy, path to achieve shared group semantics - an understandable, and controlled group vocabulary – regardless of the size of the group.  I submit that this procedure, while now using new technology and standards, has been applied by people throughout the history of our societies.

            1. For each member seeking to join a group, have them collect a list of the words they typically use in the group or group-related activities, such as from their resume, their favorite books, and their local computer file system or favorite Internet web site.

            2. Multiple individuals' lexicons, when merged into a group’s glossary, often have no definitions for a term, or many; the members then communicate based on non-validated, and thus often false, assumptions of understood meaning.  So, for individuals joining or forming a group, a dictionary must be built for each individual’s lexicon.  I recommend an online, and thus broadly accessible and sharable, dictionary as the primary source of dictionary definitions, senses, etc.

            3. These lexical dictionaries are then merged to build the group’s dictionary from its glossary (probably with multiple meanings per term).

            4. Then, systematic efforts are pursued to build a minimized subject-tree (i.e., a taxonomy) of broader and narrower meanings (based on the definitions) about subjects of interest to the group.

            5. Then a thesaurus is built to show the group’s preferred terms, and any subgroups' and individuals' alternate terms, along with abbreviations, acronyms, aliases, and alternate spellings for the terms.

            6. Then subsequent "concept maps" are built from the terms to show how the subjects (i.e., same as SQL entities, XML Elements, or nouns) relate to each other (through verbs and verb phrases) from a given user’s, subgroup’s, and whole-group’s viewpoints.  These concept maps can also be used as "concept of operation" (CONOPS) models (such as for DoDAF OV1) and Conceptual Data Models (CDM) (as in DoDAF2, Volume 1).  Note that concept maps are one of the more prominent methods used to create "knowledge models".

            7. Then data models (e.g., Object Role Models - ORM, Entity/Attribute Relationship Models - ERM, Logical Data Models - LDM as in DoDAF2, Volume 2) are built from the concept maps to show the attributes/behaviors/methods of a given defined subject in the concept map and of the relationships in the concept map.

            8. Then process models (e.g., using the standard Business Process Model and Notation - BPMN) representing the "sequence" relationships contained in the concept maps are extracted and organized to show the names of sequential activities arrayed across process-model "pools and lanes" representing the functional software applications, the application-owning organization units (OUs), and their parent organization, with each "pool or lane" also containing data on the physical/postal/facility/geospatial or virtual/network/telephonic/radio-frequency location of the application, OU, or organization.  Note that BPMN process models are now required for DoDAF2 OV6c models, and we recommend them as the standard "Assessable Unit" modeling for the US Federal Management Internal Control (MIC) program's process modeling requirements.

            9. Then, the now contextually rich and content-rich process model and data model are diagrammed to provide a resultant dynamic model of knowledge (e.g., using the Web Ontology Language - OWL) of the evolving viewpoints of the individuals, subgroups, and the full group.

            10. Then, the ontology is populated with the data from individual, group, and organizational instances of the subject types and relationships between specific subjects, to provide an evolving/adaptive sharable knowledge-base for the group to enable shared situational awareness and co-operation.

            11. Repeat with each new group member, asking the new individual to use the group’s shared-knowledge-base as their online primary source of definitions, supplementing the group’s definitions as needed to evolve the group’s vocabulary using steps 3-10.

            12. Repeat steps 3-10 when merging groups and establishing value-chains (i.e., axiologies) across individuals, groups, and organizations.
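
As a rough illustration of steps 1-5, the following Python sketch merges two members' dictionaries into a group dictionary, flags terms whose definitions diverge, walks a small taxonomy, and resolves an alternate term through a thesaurus.  All names and sample data here (alice, bob, merge_dictionaries, ambiguous_terms, broader_chain, preferred_term) are hypothetical and chosen only for this example; a real effort would draw its definitions from the shared online dictionary recommended in step 2.

```python
from collections import defaultdict

# Steps 1-2: hypothetical member lexicons, each term paired with the
# definition that member assumes (illustrative data, not from any real group).
alice = {
    "asset": "a resource owned by the enterprise",
    "SLA": "a contract on service quality",
}
bob = {
    "asset": "a file checked into the repository",
    "service level agreement": "a contract on service quality",
}


def merge_dictionaries(*member_dictionaries):
    """Step 3: merge member dictionaries, keeping each distinct definition."""
    group = defaultdict(list)
    for member in member_dictionaries:
        for term, definition in member.items():
            if definition not in group[term]:
                group[term].append(definition)
    return dict(group)


def ambiguous_terms(group_dictionary):
    """Terms with several definitions, which the group still has to reconcile."""
    return sorted(t for t, defs in group_dictionary.items() if len(defs) > 1)


def broader_chain(term, broader):
    """Step 4: walk a taxonomy from a term up through its broader terms."""
    chain = []
    # The second condition stops the walk if the taxonomy contains a cycle.
    while term in broader and broader[term] not in chain:
        term = broader[term]
        chain.append(term)
    return chain


def preferred_term(term, thesaurus):
    """Step 5: map an alternate term or acronym to the group's preferred term."""
    key = term.strip().lower()
    return thesaurus.get(key, key)


group_dictionary = merge_dictionaries(alice, bob)
broader = {"laptop": "hardware asset", "hardware asset": "asset"}
thesaurus = {"sla": "service level agreement"}

print(ambiguous_terms(group_dictionary))
print(broader_chain("laptop", broader))
print(preferred_term("SLA", thesaurus))
```

In this sample, "asset" is the only term flagged as ambiguous, because the two members define it differently; that is the kind of hidden disagreement the procedure is meant to surface before the taxonomy and thesaurus are fixed.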

If you read the sequence above from an enterprise architecture (EA) perspective, you'll see I'm talking about building a shared architecture for a group and its endeavors, based on building a controlled vocabulary for the group.  The US DoD Enterprise Architecture authorities have recently begun referring to this as a "Federated EA", which is currently described in the DoD Information EA (DoDIEA) publication.  In the general endeavor management (GEM) approach, a full enterprise architecture is the above "knowledge-base" of the "enterprise", and the above "model of knowledge" (ontology) of the group is the "Metamodel" of the enterprise’s architecture.  The above steps then provide a "methodology" for building both the Metamodel and the shared knowledge-base called an enterprise architecture.
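
To make the distinction between the Metamodel and the knowledge-base concrete, here is a minimal Python sketch, assuming the concept map is stored as (subject type, verb phrase, object type) triples and the knowledge-base as instance-level triples of the same shape.  The type names, instance names, and the conforms and populate helpers are all hypothetical; a production effort would more likely express this in OWL inside an ontology tool, which this sketch does not attempt to reproduce.

```python
# Metamodel: a type-level concept map (step 6) of
# (subject type, verb phrase, object type) triples. Illustrative only.
metamodel = {
    ("Organization Unit", "owns", "Application"),
    ("Application", "supports", "Process"),
}

# Each named instance is tagged with its subject type (hypothetical data).
instance_types = {
    "Finance Office": "Organization Unit",
    "Ledger App": "Application",
    "Month-End Close": "Process",
}


def conforms(fact, instance_types, metamodel):
    """True if the fact's type-level pattern appears in the metamodel."""
    subject, verb, obj = fact
    pattern = (instance_types.get(subject), verb, instance_types.get(obj))
    return pattern in metamodel


def populate(facts, instance_types, metamodel):
    """Step 10: split candidate facts into accepted and rejected lists."""
    accepted, rejected = [], []
    for fact in facts:
        if conforms(fact, instance_types, metamodel):
            accepted.append(fact)
        else:
            rejected.append(fact)
    return accepted, rejected


candidate_facts = [
    ("Finance Office", "owns", "Ledger App"),
    ("Ledger App", "supports", "Month-End Close"),
    ("Ledger App", "owns", "Finance Office"),  # reversed, so it should be rejected
]

knowledge_base, needs_review = populate(candidate_facts, instance_types, metamodel)
print(knowledge_base)
print(needs_review)
```

The design choice in this sketch is that the knowledge-base only accepts facts whose pattern the group has already agreed on in the Metamodel; rejected facts go back to the group for review, which is one way the vocabulary could evolve under steps 11 and 12.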

As for the technologies to use, the above steps can be performed with simple web pages, web tables, and a "lookup" mechanism for the tables' values, with a lot of procedural discipline and governance of the content.  Alternately, you could use a "semantic wiki" tool like the one at [http://www.knoodl.com], in conjunction with concept mapping tools such as the CMap tool from [http://cmap.ihmc.us].  For a more automated and maintainable approach, you could use one of the extended metadata repository (XMDR) EA tools that have ontology management, knowledge-base (i.e., master data), and dynamic/semantic-application functionality, such as the EVA NetModeler from [http://www.pro-mis.com].  As a sequence, start with Knoodl and CMap as low-cost modeling tools for all users, then migrate the results into the XMDR or ontology metadata repository as the foundation for automated semantic applications (i.e., applications built on an appropriately and securely shared knowledge-base).