Data Warehousing, Spatial Data Warehousing, and Data Marts
- Deriving or extracting aggregate data or views (e.g.,
database content slices) from primary enterprise transaction databases for
online analytical processing (OLAP)
- Employs "snowflake schema" as warehouse database
data model (Example)
- Also sometimes referred to as "multidimensional
databases"
- Spatial data warehousing used to manage geographic,
terrain, or other large databases designed for data visualization and visual
navigation (e.g., fly-overs)
- Case Study: Microsoft's TerraServer for spatial data warehousing
- Data marts are smaller-scale data warehouses
- Employed by small-medium size firms that seek the
potential benefits of data mining from existing transaction databases
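A minimal sketch of the first point above (deriving an aggregate view from transaction data), using Python's standard-library sqlite3 module. The table names (region, store, sales), the view name sales_by_region, and all sample rows are hypothetical, chosen only to illustrate a small snowflake-style layout in which one dimension table (store) refers to another (region). A production warehouse would normally use a dedicated OLAP engine rather than SQLite.

```python
import sqlite3


def build_warehouse():
    """Create an in-memory database with a tiny, hypothetical snowflake-style schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Dimension tables: store is normalized further into region.
        CREATE TABLE region (region_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE store (store_id INTEGER PRIMARY KEY, city TEXT,
                            region_id INTEGER REFERENCES region(region_id));
        -- Fact table: one row per transaction.
        CREATE TABLE sales (sale_id INTEGER PRIMARY KEY,
                            store_id INTEGER REFERENCES store(store_id),
                            amount REAL);
        INSERT INTO region VALUES (1, 'West'), (2, 'East');
        INSERT INTO store VALUES (10, 'Irvine', 1), (11, 'Seattle', 1),
                                 (20, 'Boston', 2);
        INSERT INTO sales VALUES (1, 10, 100.0), (2, 10, 50.0),
                                 (3, 11, 25.0), (4, 20, 40.0);
        -- Aggregate view: a "slice" of the transaction data by region.
        CREATE VIEW sales_by_region AS
            SELECT r.name AS region, COUNT(*) AS n_sales, SUM(s.amount) AS total
            FROM sales s
            JOIN store st ON st.store_id = s.store_id
            JOIN region r ON r.region_id = st.region_id
            GROUP BY r.name;
    """)
    return conn


def sales_by_region(conn):
    """Return (region, number of sales, total amount) rows, sorted by region."""
    query = "SELECT region, n_sales, total FROM sales_by_region ORDER BY region"
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in sales_by_region(build_warehouse()):
        print(row)
```

The view is computed from the transaction rows on demand; a warehouse would typically materialize such aggregates so that analysis queries do not compete with transaction processing.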
Data or Information Discovery
- Where is the data or information I need?
- Search Engines (e.g., altavista.com) and Index Servers
(e.g., yahoo.com)
- Information Discovery: Internet, LAN, PC
- Data Mining a Data Warehouse, Data Mart,
or Database
- Building and extracting views characterizing underlying
data values
- Online Analytical Processing (OLAP)
- Query mechanisms designed to build data warehouses,
or to drive decision support applications and computations (e.g., clustering,
concept formation)
- Report Generation/Visualization from Data Warehouses,
Data Marts, Web
- Knowledge Discovery and Deductive Information
Retrieval
- Deriving data relationships "hidden" in transaction
database instance values
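One simple way to surface relationships "hidden" in transaction values is to count which items occur together and how often one item implies another. The sketch below is only an illustration of that idea, not a full data mining algorithm; the function names and the sample transactions are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data: each inner list is one transaction's items.
TRANSACTIONS = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "milk"],
]


def pair_support(transactions):
    """Count how many transactions contain each unordered pair of items."""
    counts = Counter()
    for items in transactions:
        # sorted(set(...)) removes duplicates and gives each pair one canonical order.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return counts


def confidence(transactions, antecedent, consequent):
    """Fraction of transactions containing `antecedent` that also contain `consequent`."""
    with_antecedent = [t for t in transactions if antecedent in t]
    if not with_antecedent:
        return 0.0
    hits = sum(1 for t in with_antecedent if consequent in t)
    return hits / len(with_antecedent)


if __name__ == "__main__":
    print(pair_support(TRANSACTIONS).most_common())
    print(confidence(TRANSACTIONS, "bread", "milk"))
```

Pairs with high support and high confidence are candidates for further analysis; on real data the counting would run against the warehouse or data mart rather than an in-memory list.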
Knowledge Management
- Knowledge (possessed -- "know-what") vs. Knowing
(knowledge practiced -- "know-how": "been there, done that")
- Formal knowledge representation schemes
- Semantic network data models (example)
- Shared ontologies (database with metadata content)
and domain-specific terminology/notation
- Hierarchical Object-Relational Classification
Schemes and Taxonomies
- Biological Classification Schemes (Kingdoms, Phyla,
Classes, Orders, Families, Genera, and Species after Linnaeus)
- Common Taxonomies
- Family Trees (Parent-Child): Child inherits attributes
of Parent
- Abstraction (Whole-Part): Part belongs to and fits
within Whole, sharing its interfaces and methods
- Genericity (General-to-specific): Industry, enterprise,
division, department, group, individual
- Faceted or Hierarchical Naming Schemes
- Global telephone numbers -- 01-1-949-824-4130
- Internet Domain Names -- mobile01.gsm.uci.edu
- G. C. Bowker and S. L. Star,
Sorting Things Out: Classification and Its Consequences,
MIT Press, 1999.
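The parent-child taxonomy and hierarchical naming ideas above can be sketched in a few lines of Python. The Category class, its method names, and the attribute keys are hypothetical and serve only as an illustration: a child node inherits any attribute it does not define itself from its nearest ancestor, and a domain name is read as a path from the most general label to the most specific.

```python
class Category:
    """A node in a hypothetical parent-child taxonomy."""

    def __init__(self, name, parent=None, **attributes):
        self.name = name
        self.parent = parent
        self.attributes = attributes

    def lookup(self, key):
        """Return the attribute from this node or the nearest ancestor defining it."""
        node = self
        while node is not None:
            if key in node.attributes:
                return node.attributes[key]
            node = node.parent
        raise KeyError(key)

    def path(self):
        """Return category names from the root down to this node."""
        names = []
        node = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


def domain_hierarchy(domain):
    """Split a domain name into labels ordered from most general to most specific."""
    return list(reversed(domain.split(".")))


if __name__ == "__main__":
    animalia = Category("Animalia", multicellular=True)
    chordata = Category("Chordata", parent=animalia, has_notochord=True)
    mammalia = Category("Mammalia", parent=chordata, warm_blooded=True)
    print(mammalia.path())                    # root-to-leaf classification path
    print(mammalia.lookup("multicellular"))   # inherited from Animalia
    print(domain_hierarchy("mobile01.gsm.uci.edu"))
```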
- Stories: Narrative (or multi-media) descriptions
of knowledge/knowing
- "Case studies"
- "Lessons Learned"
- "Best Practices" (example)
- Training materials with user annotations
- Scenarios and Simulations
- Work Practices -- ad hoc problem solving
work and information flow
- Elicitation, capture, analysis, and situated improvement
- Source for best practices
- Intelligence Gathering (also surveillance
and reconnaissance)
- Shared repositories (cf. Lotus Notes)
- Electronic documents (e.g., scan and OCR paper documents,
save in e-doc repository or DB)
- Web site search and retrieval
- Email correspondence analysis
- cf. Brint.com
- Content Analysis (cf. Autonomy.com, Semio.com)
- Filtering and Clustering
- Keyword and Concept Recognition
- Content Management Systems
- Web Portals
- CRM
- Example CMS: Zope (Flash demo)
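The filtering and keyword recognition steps listed under Content Analysis can be illustrated with a small term-frequency sketch. This is a simplified stand-in, not how the commercial products named above work; the stopword list, function names, and sample documents are all hypothetical.

```python
import re
from collections import Counter

# A deliberately tiny, hypothetical stopword list.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "for", "is", "on"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def keywords(text, top_n=3):
    """Return up to `top_n` most frequent non-stopword terms in `text`."""
    counts = Counter(w for w in tokenize(text) if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]


def filter_documents(documents, keyword):
    """Return the names of documents that contain `keyword` as a whole word."""
    target = keyword.lower()
    return [name for name, text in documents.items() if target in tokenize(text)]


if __name__ == "__main__":
    docs = {
        "memo": "The warehouse report on warehouse sales",
        "note": "Lessons learned in the portal project",
    }
    print(keywords(docs["memo"], top_n=1))
    print(filter_documents(docs, "portal"))
```

Concept recognition goes beyond literal keywords (for example by mapping terms onto a shared ontology), which this sketch does not attempt.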