Description of Projects
AutoMap:
Extract, Analyze and Represent Individual Mental Models
Jana Diesner
Advisor: Prof. Kathleen M. Carley
1. General Information
2. Product
3. Key Terms
4. Coding Choices: Filtering and Windowing
5. Papers on AutoMap
1. General Information
AutoMap extracts, analyzes and represents cognitive
maps of texts as representations of individuals' mental models. Mental models
are conceptual networks of relations between concepts. Texts contain a portion
of the author's mental model.
Differences in the distribution of concepts, and of the relationships among them,
across texts provide insight into the similarities and differences in the
content and structure of those texts.
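A minimal sketch of such a comparison, assuming each map is available as a
collection of statements written as concept pairs. The function name
compare_maps and the overlap measure (shared statements divided by all
statements) are illustrative assumptions, not AutoMap's own output.

```python
def compare_maps(statements_a, statements_b):
    """Compare two maps given as collections of (concept, concept) pairs."""
    a, b = set(statements_a), set(statements_b)
    union = a | b
    return {
        "shared": a & b,    # statements present in both maps
        "only_a": a - b,    # statements unique to the first map
        "only_b": b - a,    # statements unique to the second map
        # Share of all statements that both maps have in common (assumed measure).
        "overlap": len(a & b) / len(union) if union else 0.0,
    }


map_a = {("model", "team"), ("task", "team")}
map_b = {("model", "team"), ("award", "team")}
result = compare_maps(map_a, map_b)
print(len(result["shared"]), len(result["only_a"]), len(result["only_b"]))
```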
The algorithm is based on Carley’s approach to coding texts as cognitive
maps and Danowski’s approach to proximity analysis. The map analysis method
combines both approaches.
AutoMap contains content analytic techniques and map analytic techniques
to code and analyze texts and a toolkit for data reduction. The content analytic
techniques are focused on collecting and analyzing quantitative information
on the concepts. The map analytic techniques extract conceptual networks
of texts.
AutoMap is not restricted to any language.
2. Product
AutoMap is a program written in Java that runs under the Linux, UNIX and
Windows operating systems. The release that has been made available as AutoMap.exe
runs only under Windows.
As input, AutoMap takes one or more texts. The user can pre-process
the text and set the window size and further options according to
the research question. Every step of data reduction is visualized and can
be stored for further analysis. As output, AutoMap generates the
pre-processed text, a coded map, a visualization of the map and a statistical
overview.
There are no scientific standards for defining information as irrelevant
or for creating a delete list or a thesaurus. The user has to determine
the most appropriate level of generalization for the research question.
AutoMap supports this decision-making process.
More detailed information about the program and its functionalities is provided
in the General Information file and the AutoMap Help.
Download AutoMap.
3. Key Terms
1. A concept is a single idea or ideational
kernel represented by a single word or phrase.
2. A relation is a connection between two concepts.
3. A statement is a set of two concepts and the
relation between them.
4. A map is a network of concepts formed from statements.
Two statements are linked if they share one concept.
5. A thesaurus is a set of key concepts and their
synonyms.
6. A key concept is a concept that other concepts
will be translated into.
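The key terms can be sketched as simple data structures. This is a minimal
illustration under assumptions: the class name Statement, the default relation
label "related", and the helpers build_map and linked are hypothetical and do
not reflect AutoMap's internal representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """Two concepts and the relation between them."""
    source: str
    target: str
    relation: str = "related"  # placeholder label, an assumption


def build_map(statements):
    """Form a map: each concept points to the concepts it shares a statement with."""
    network = {}
    for s in statements:
        network.setdefault(s.source, set()).add(s.target)
        network.setdefault(s.target, set()).add(s.source)
    return network


def linked(a, b):
    """Two statements are linked if they share a concept."""
    return bool({a.source, a.target} & {b.source, b.target})


s1 = Statement("team", "model")
s2 = Statement("model", "text")
print(linked(s1, s2))  # True: both statements contain "model"
```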
4. Coding Choices: Filtering and Windowing
Coding a map is a two-step process that requires the
user to make decisions about Filtering a text and Windowing
(proximity analysis).
The coding choices may change the analysis results significantly.
Filtering a text means reducing the data to a minimal set of the
most relevant, content-bearing terms. Pre-processing is a semi-automated,
iterative process that allows the user to stay close to the data and to go beyond
explicitly articulated ideas to implied ideas. The size of both text and
map is decreased significantly, and therefore meaningful comparisons across
texts become possible. A simplified text is generated that can be visually
inspected. There are no scientific standards for defining information as
irrelevant. The user has to determine the most appropriate level of data
reduction for the research question.
AutoMap allows the researcher to use a three-step process for data reduction:
punctuation, deletion and generalization.
AutoMap supports the user in making these decisions and in carrying out the
generalization.
By determining the punctuation the user decides whether statements within
sentences, paragraphs or the entire text are considered in the analysis.
On a more detailed level deletion and generalization are applied to filter
the text.
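The punctuation choice can be illustrated by splitting a text into units of
analysis before any statements are coded. This is a rough sketch: the function
name split_units and the regular expressions used to detect sentence and
paragraph boundaries are assumptions, not AutoMap's rules.

```python
import re


def split_units(text, unit="sentence"):
    """Split a text into the units within which statements may be placed."""
    if unit == "text":
        parts = [text]                              # the entire text is one unit
    elif unit == "paragraph":
        parts = re.split(r"\n\s*\n", text)          # blank line ends a paragraph
    elif unit == "sentence":
        parts = re.split(r"(?<=[.!?])\s+", text)    # ., ! or ? ends a sentence
    else:
        raise ValueError("unit must be 'sentence', 'paragraph' or 'text'")
    return [p.strip() for p in parts if p.strip()]


sample = "Teams share models. Models guide work.\n\nTexts reveal models."
print(len(split_units(sample, "sentence")))   # 3
print(len(split_units(sample, "paragraph")))  # 2
print(len(split_units(sample, "text")))       # 1
```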
Deletion removes words from the text that do not help answer the
research question, such as proper names, pronouns, conjunctions, articles,
prepositions and notations. AutoMap has two delete lists available (an extensive
one and a limited one), and the researcher can modify these or design a unique
one.
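Deletion can be sketched as filtering a token list against a delete list. The
tokenizer and the three-word delete list below are hypothetical stand-ins
chosen for illustration; they are not the lists shipped with AutoMap.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def apply_delete_list(tokens, delete_list):
    """Drop every token that appears in the delete list."""
    return [t for t in tokens if t not in delete_list]


delete_list = {"the", "a", "of"}  # hypothetical, researcher-defined
tokens = tokenize("The team shares a model of the task.")
print(apply_delete_list(tokens, delete_list))
# ['team', 'shares', 'model', 'task']
```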
Generalization involves the application of a thesaurus, which is
typically designed specifically for a dataset. AutoMap uses the entries in
the thesaurus to search the text and “translate” specific words and phrases
into more basic concepts specified by the researcher.
Once the user has defined a thesaurus, AutoMap offers two ways to apply
it: when the words and phrases included in the thesaurus are replaced
by their corresponding key concepts, the rest of the text can either be kept
or be dropped. The two methods differ in the resulting data reduction, which
is much greater if all concepts that are not included in the thesaurus are
dropped.
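The two ways of applying a thesaurus can be sketched as follows. This is a
minimal illustration under assumptions: the thesaurus is modeled as a dict from
a word or phrase (a tuple of tokens) to its key concept, longer phrases are
matched first, and the keep_rest flag is a hypothetical name for the choice
between keeping and dropping the rest of the text.

```python
def apply_thesaurus(tokens, thesaurus, keep_rest=True):
    """Replace words and phrases by their key concepts.

    With keep_rest=False, tokens not covered by the thesaurus are dropped.
    """
    max_len = max((len(key) for key in thesaurus), default=1)
    out, i = [], 0
    while i < len(tokens):
        # Try the longest possible phrase first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in thesaurus:
                out.append(thesaurus[key])
                i += n
                break
        else:
            # No thesaurus entry starts at this position.
            if keep_rest:
                out.append(tokens[i])
            i += 1
    return out


thesaurus = {("software", "engineers"): "developer", ("code",): "software"}
tokens = ["software", "engineers", "write", "code", "in", "teams"]
print(apply_thesaurus(tokens, thesaurus, keep_rest=True))
# ['developer', 'write', 'software', 'in', 'teams']
print(apply_thesaurus(tokens, thesaurus, keep_rest=False))
# ['developer', 'software']
```

Dropping the rest of the text shrinks the example from five concepts to two,
which is the stronger data reduction described above.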
When the pre-processed text is analyzed in order to code a map, statements
are placed between the concepts within every single window.
If the user did not apply a delete list or a thesaurus, statements are placed
between all contiguous concepts.
If data reduction was applied, AutoMap offers two methods for placing statements
between concepts: direct adjacency and rhetorical adjacency.
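The text above does not define the two adjacency methods. The sketch below
assumes one common reading: under direct adjacency, deleted concepts are removed
so that the remaining concepts move next to each other; under rhetorical
adjacency, each deleted concept leaves a placeholder so that the original
distances are preserved. The function names and the use of None as a
placeholder are illustrative assumptions.

```python
def reduce_tokens(tokens, delete_list, mode="direct"):
    """Remove deleted concepts (direct) or replace them by placeholders (rhetorical)."""
    if mode == "direct":
        return [t for t in tokens if t not in delete_list]
    if mode == "rhetorical":
        return [None if t in delete_list else t for t in tokens]
    raise ValueError("mode must be 'direct' or 'rhetorical'")


def distance(tokens, a, b):
    """Number of positions between the first occurrences of two concepts."""
    return abs(tokens.index(a) - tokens.index(b))


tokens = ["team", "of", "the", "engineers"]
delete_list = {"of", "the"}  # hypothetical delete list
print(distance(reduce_tokens(tokens, delete_list, "direct"), "team", "engineers"))      # 1
print(distance(reduce_tokens(tokens, delete_list, "rhetorical"), "team", "engineers"))  # 3
```

Under this reading, a window of size 2 would connect "team" and "engineers"
only with direct adjacency.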
Windowing is a method that codes the (filtered) text as a map by placing
relationships between pairs of concepts that occur within a window.
A window is a set of contiguous concepts.
By setting the window size, the user defines how far apart concepts
can be from each other and still have a relationship.
AutoMap offers Windowing as a completely automated process. The user can
select any window size between 2 and 100.
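Windowing can be sketched as sliding a window over the list of concepts and
counting every pair that falls inside it. This is a simplified illustration
under assumptions: statements are treated as undirected pairs, repeated pairs
are counted, and None marks a placeholder left by deletion. The function name
code_map is hypothetical and the procedure is not claimed to match AutoMap's
implementation.

```python
from collections import Counter


def code_map(concepts, window_size=2):
    """Count concept pairs that occur within window_size contiguous positions."""
    if not 2 <= window_size <= 100:
        raise ValueError("window size must be between 2 and 100")
    counts = Counter()
    for i, left in enumerate(concepts):
        # Pair the concept at position i with the rest of its window.
        for right in concepts[i + 1:i + window_size]:
            if left is None or right is None or left == right:
                continue  # skip placeholders and self-pairs
            counts[tuple(sorted((left, right)))] += 1
    return counts


concepts = ["team", "model", "task", "team"]
print(code_map(concepts, window_size=2)[("model", "team")])  # 1
print(code_map(concepts, window_size=3)[("model", "team")])  # 2
```

A larger window connects concepts that are further apart, so the same text
yields a denser map.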
5. Papers on AutoMap
Kathleen M. Carley, 1997, "Extracting Team Mental Models Through Textual
Analysis." Journal of Organizational Behavior, 18: 533-538.
Abstract:
An approach, called map analysis, for extracting, analyzing and combining
representations of individuals' mental models as cognitive maps is presented.
This textual analysis technique allows the researcher to extract cognitive
maps, locate similarities across maps, and combine maps to generate a team
map. Using map analysis the researcher can address questions about the nature
of team mental models and the extent to which sharing is necessary for effective
teamwork. This technique is illustrated using data drawn from a study of
software engineering teams. The impact of critical coding choices on the
resultant findings is examined. It is shown that various coding choices have
systematic effects on the complexity of the coded maps and their similarity.
Consequently, a thorough analysis requires analyzing the data several times
under different coding choices. For example, re-analysis under different
coding scenarios revealed that although members of successful teams tend
to have more elaborate, more widely shared maps than members of non-successful
teams, this difference is significant only when the data is unfiltered. Thus
a better interpretation of this result is that all teams have comparable
models, but successful teams are able to describe their models in more ways
than are non-successful teams.
Eleanor T. Lewis, Jana Diesner & Kathleen M. Carley, 2001, "Using
Automated Text Analysis to Study Self-Presentation Strategies."
Presented at the Computational Analysis of Social and Organizational Systems
(CASOS) conference, Pittsburgh, Pennsylvania, July 2001. Available through
the CASOS working paper series.
Abstract:
Extracting and representing the networks of ties between concepts in a set
of texts creates a “map” of each text. Map analysis allows a researcher to
compare the networks of ties between concepts in these texts by systematically
reducing their content. The goals of this research paper are to answer both
a methodological and a substantive question. First, how do the choices a researcher
makes about how to generate maps using an automated text program alter the
results, and how do these results compare to the results of hand-coding? Second,
how can we interpret the results of map analysis to better understand the
strategies authors use to manage their self-presentation, a central purpose
of many texts? The texts we use are a subsample of a dataset of applications
by entrepreneurs for an “Entrepreneur of the Year” award. Applicants value
uniqueness in their application’s content because it sets them apart and demonstrates
their worthiness for the award, but the value placed on uniqueness in the
structure of their accounts is not as clear. Our analysis allows us to extract
four general self-presentation strategies: the prepared entrepreneur, the
driven entrepreneur, the creative niche entrepreneur, and the humble entrepreneur
(a single entrepreneur may employ multiple strategies).
For further information about textual analysis, see:
http://www.hss.cmu.edu/departments/sds/faculty/carley/publications.htm