AutoMap

Description of Projects

AutoMap:
Extract, Analyze and Represent Individual Mental Models

Jana Diesner
Advisor: Prof. Kathleen M. Carley

1. General Information
2. Product
3. Key Terms
4. Coding Choices: Filtering and Windowing
5. Papers on AutoMap

1. General Information

AutoMap extracts, analyzes and represents cognitive maps of texts as representations of individuals' mental models. Mental models are conceptual networks of relations between concepts. Texts contain a portion of the author's mental model.
Differences in the distribution of concepts and of the relationships among concepts across texts provide insight into the similarities and differences in the content and structure of those texts.
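
As a rough illustration of such a comparison, the following Python sketch treats each coded map as a set of concept pairs and reports what two maps share and where they differ. The function names and the example maps are hypothetical and are not taken from AutoMap itself.

```python
def concepts_of(coded_map):
    """Collect every concept that appears in at least one concept pair."""
    return {concept for pair in coded_map for concept in pair}


def compare_maps(map_a, map_b):
    """Compare two maps given as sets of (concept, concept) pairs."""
    return {
        "shared_statements": map_a & map_b,
        "only_in_a": map_a - map_b,
        "only_in_b": map_b - map_a,
        "shared_concepts": concepts_of(map_a) & concepts_of(map_b),
    }


# Hypothetical maps coded from two different texts.
map_a = {("model", "team"), ("model", "software")}
map_b = {("model", "team"), ("award", "entrepreneur")}
comparison = compare_maps(map_a, map_b)
```

Here the two maps share one pair and two concepts. The pairs that appear in only one map point to content that is specific to that text.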

The employed algorithm is based on Carley’s approach to coding texts as cognitive maps and Danowski’s approach for proximity analysis. In the map analysis method both approaches are combined. AutoMap contains content analytic techniques and map analytic techniques to code and analyze texts and a toolkit for data reduction. The content analytic techniques are focused on collecting and analyzing quantitative information on the concepts. The map analytic techniques extract conceptual networks of texts.
AutoMap is not restricted to any language.

2. Product

AutoMap is a program written in Java that runs under the Linux, UNIX and Windows operating systems. The release that has been made available as AutoMap.exe runs only under Windows.

As an input AutoMap takes one or several texts. The user can pre-process the text and determine the window size and further settings according to the research question. Every step of data reduction is visualized and can be stored for further analysis. As an output AutoMap generates the pre-processed text, a coded map, a visualization of the map and a statistical overview.
There are no scientific standards for defining information as irrelevant or for creating a delete list or a thesaurus. The user has to determine the most appropriate level of generalization considering the research question. AutoMap supports this decision-making process.

More detailed information about the program and its functionalities is provided in the General Information file and the AutoMap Help.

Download AutoMap.

3. Key Terms

1.    A concept is a single idea or ideational kernel represented by a single word or phrase.
2.    A relation is a connection between two concepts.
3.    A statement is a set of two concepts and the relation between them.
4.    A map is a network of concepts formed from statements. Two statements are linked if they share one concept.
5.    A thesaurus is a set of key concepts and their synonyms.
6.    A key concept is a concept that other concepts will be translated into.
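
The first four terms can be sketched as simple data structures. The following Python sketch is only an illustration of the definitions above; the class name, the default relation label and the example concepts are assumptions and do not reflect AutoMap's internal design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """Two concepts and the relation between them."""
    source: str
    target: str
    relation: str = "related_to"  # hypothetical default label


def share_concept(a, b):
    """Two statements are linked if they share one concept."""
    return bool({a.source, a.target} & {b.source, b.target})


def build_map(statements):
    """Form a map: each concept points to the concepts it is linked with."""
    network = {}
    for s in statements:
        network.setdefault(s.source, set()).add(s.target)
        network.setdefault(s.target, set()).add(s.source)
    return network


s1 = Statement("team", "model")
s2 = Statement("model", "software")
s3 = Statement("award", "entrepreneur")
concept_map = build_map([s1, s2, s3])
```

In this example s1 and s2 are linked through the shared concept "model", while s3 forms a separate part of the map.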

4. Coding Choices: Filtering and Windowing

Coding a map is a two-step process that requires the user to make decisions about Filtering a text and Windowing (proximity analysis).
The coding choices may change the analysis results significantly.

Filtering a text means reducing the data to a minimal set of the most relevant, content-bearing terms. Pre-processing is a semi-automated, iterative process that allows the user to stay close to the data and to move beyond explicitly articulated ideas to implied ideas. The size of both text and map is decreased significantly, and therefore meaningful comparisons across texts become possible. A simplified text is generated that can be visually inspected. There are no scientific standards for defining information as irrelevant. The user has to determine the most appropriate level of data reduction considering the research question.
AutoMap allows the researcher to use a three-step process for data reduction: punctuation, deletion and generalization.
AutoMap helps the user to make these decisions and to carry out the generalization.
By determining the punctuation, the user decides whether statements within sentences, paragraphs or the entire text are considered in the analysis.
On a more detailed level, deletion and generalization are applied to filter the text.

Deletion removes words from the text which do not help answer the research question such as proper names, pronouns, conjunctions, articles, prepositions and notations. AutoMap has two delete lists available – an extensive one and a limited one – and the researcher can modify these or design a unique one.
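
A minimal Python sketch of deletion is given below. The delete list shown is a small hypothetical example and is not one of the two lists shipped with AutoMap; the tokenization (lowercasing and splitting on whitespace) is likewise a simplifying assumption.

```python
# Hypothetical delete list: articles, conjunctions, prepositions, pronouns.
DELETE_LIST = {"the", "a", "an", "and", "of", "in", "to", "he", "she", "it"}


def apply_delete_list(text, delete_list):
    """Drop every token of the text that appears in the delete list."""
    tokens = text.lower().split()  # simplification: no punctuation handling
    return [token for token in tokens if token not in delete_list]


kept = apply_delete_list("The team shares a model of the software", DELETE_LIST)
```

Only the content-bearing terms "team", "shares", "model" and "software" remain after deletion.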

Generalization involves the application of a thesaurus, which is typically designed specifically for a dataset. AutoMap uses the entries in the thesaurus to search the text and “translate” specific words and phrases into more basic concepts specified by the researcher.
Once the user has defined a thesaurus, AutoMap offers two ways to apply it: when the words and phrases that are included in the thesaurus are replaced by their corresponding key concepts, the rest of the text can either be maintained or be dropped. The difference between the two methods is the resulting data reduction, which is much higher if all concepts that are not included in the thesaurus are dropped.
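
The two ways of applying a thesaurus can be sketched as follows. This Python example is an illustration only: the thesaurus entries, the function name and the keep_rest parameter are hypothetical, and the longest-phrase-first matching is an assumption about how overlapping entries might be resolved.

```python
# Hypothetical thesaurus: word or phrase (as a tuple of tokens) -> key concept.
THESAURUS = {
    ("software", "engineers"): "developer",
    ("programmers",): "developer",
    ("cognitive", "map"): "map",
}


def apply_thesaurus(tokens, thesaurus, keep_rest=True):
    """Translate thesaurus entries into key concepts.

    keep_rest=True maintains all other tokens; keep_rest=False drops them.
    """
    max_len = max(len(phrase) for phrase in thesaurus)
    result, i = [], 0
    while i < len(tokens):
        for n in range(max_len, 0, -1):  # try the longest phrase first
            phrase = tuple(tokens[i:i + n])
            if phrase in thesaurus:
                result.append(thesaurus[phrase])
                i += len(phrase)
                break
        else:  # no thesaurus entry starts at this token
            if keep_rest:
                result.append(tokens[i])
            i += 1
    return result


tokens = "software engineers and programmers draw a cognitive map".split()
maintained = apply_thesaurus(tokens, THESAURUS, keep_rest=True)
thesaurus_only = apply_thesaurus(tokens, THESAURUS, keep_rest=False)
```

With the rest of the text maintained, six concepts remain; with only thesaurus concepts kept, three remain, which shows the higher data reduction of the second method.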

When the pre-processed text is analyzed in order to code a map, statements are placed between the concepts within every single window.
If the user did not apply a delete list or a thesaurus, statements are placed between all contiguous concepts.
If data reduction was applied, AutoMap offers two methods to place statements between concepts (e.g. direct or rhetorical adjacency).
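
The text above names the two adjacency methods without defining them. The Python sketch below assumes a common reading: under direct adjacency, deleted words are removed so that the remaining concepts become neighbors, while under rhetorical adjacency each deleted word is replaced by a placeholder so that the original distances are preserved. The placeholder string and all function names are assumptions.

```python
PLACEHOLDER = "xxx"  # assumed marker for a deleted word


def delete_direct(tokens, delete_list):
    """Remove deleted words; the surviving concepts close ranks."""
    return [t for t in tokens if t not in delete_list]


def delete_rhetorical(tokens, delete_list):
    """Replace deleted words with a placeholder; distances stay intact."""
    return [PLACEHOLDER if t in delete_list else t for t in tokens]


def adjacent_pairs(tokens):
    """Pairs of neighboring concepts, never involving a placeholder."""
    return [(a, b) for a, b in zip(tokens, tokens[1:])
            if PLACEHOLDER not in (a, b)]


tokens = ["team", "of", "the", "engineers"]
delete_list = {"of", "the"}
direct = delete_direct(tokens, delete_list)
rhetorical = delete_rhetorical(tokens, delete_list)
```

Under these assumptions, direct adjacency yields the pair ("team", "engineers"), whereas rhetorical adjacency yields no pair of neighbors because two placeholders still separate the concepts.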

Windowing is a method that codes the (filtered) text as a map by placing relationships between pairs of concepts that occur within a window.
A window is a set of contiguous concepts.
By determining the window size, the user defines how far apart concepts can be from each other and still have a relationship.
AutoMap offers Windowing as a completely automated process. The user can select any window size between 2 and 100.
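
Windowing can be sketched as a sliding window over the list of concepts. The Python example below is a minimal illustration, not AutoMap's implementation: it assumes that each pair of positions is counted once, that pairs keep their order of occurrence, and that a concept is not linked to itself.

```python
from collections import Counter


def window_statements(concepts, window_size):
    """Count concept pairs that co-occur within a window of the given size."""
    if not 2 <= window_size <= 100:
        raise ValueError("window size must be between 2 and 100")
    counts = Counter()
    for i, first in enumerate(concepts):
        # Every later concept that fits in a window starting at position i.
        for second in concepts[i + 1:i + window_size]:
            if first != second:
                counts[(first, second)] += 1
    return counts


concepts = ["team", "shares", "model", "software"]
narrow = window_statements(concepts, 2)
wide = window_statements(concepts, 3)
```

A window size of 2 links only directly neighboring concepts, giving three pairs here. A window size of 3 adds ("team", "model") and ("shares", "software"), but "team" and "software" remain unlinked because they are too far apart.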

5. Papers on AutoMap

Kathleen M. Carley, 1997, "Extracting Team Mental Models Through Textual Analysis." Journal of Organizational Behavior, 18: 533-538.

Abstract:
An approach, called map analysis, for extracting, analyzing and combining representations of individuals' mental models as cognitive maps is presented. This textual analysis technique allows the researcher to extract cognitive maps, locate similarities across maps, and combine maps to generate a team map. Using map analysis the researcher can address questions about the nature of team mental models and the extent to which sharing is necessary for effective teamwork. This technique is illustrated using data drawn from a study of software engineering teams. The impact of critical coding choices on the resultant findings is examined. It is shown that various coding choices have systematic effects on the complexity of the coded maps and their similarity. Consequently, a thorough analysis requires analyzing the data several times under different coding choices. For example, re-analysis under different coding scenarios revealed that although members of successful teams tend to have more elaborate, more widely shared maps than members of non-successful teams, this difference is significant only when the data is unfiltered. Thus a better interpretation of this result is that all teams have comparable models, but successful teams are able to describe their models in more ways than are non-successful teams.

Eleanor T. Lewis & Jana Diesner & Kathleen Carley, 2001, "Using Automated Text Analysis to Study Self-Presentation Strategies"
Presented at the Computational Analysis of Social and Organizational Systems (CASOS) conference, Pittsburgh, Pennsylvania, July 2001. Available through the CASOS working paper series.

Abstract:
Extracting and representing the networks of ties between concepts in a set of texts creates a “map” of each text. Map analysis allows a researcher to compare the networks of ties between concepts in these texts by systematically reducing their content. The goals of this research paper are to answer both a methodological and a substantive question. First, how do the choices a researcher makes about how to generate maps using an automated text program alter the results, and how do these results compare to the results of hand-coding? Second, how can we interpret the results of map analysis to better understand the strategies authors use to manage their self-presentation, a central purpose of many texts? The texts we use are a subsample of a dataset of applications by entrepreneurs for an “Entrepreneur of the Year” award. Applicants value uniqueness in their application’s content because it sets them apart and demonstrates their worthiness for the award, but the value placed on uniqueness in the structure of their accounts is not as clear. Our analysis allows us to extract four general self-presentation strategies: the prepared entrepreneur, the driven entrepreneur, the creative niche entrepreneur, and the humble entrepreneur (a single entrepreneur may employ multiple strategies).

For further information about textual analysis, see:
http://www.hss.cmu.edu/departments/sds/faculty/carley/publications.htm