WO1999064945A2 - Computer systems and computer-implemented processes for knowledge management using relevancy to search for, acquire and organize information for multiple users

Info

Publication number: WO1999064945A2
Application number: PCT/US1999/013124
Authority: WO
Inventors: Leon Chernyak; Greg Margolin; Arkady Berenstein; Vladimir Dorman
Original assignee: Newsphere, Inc.
Priority date: 1998-06-10
Filing date: 1999-06-10
Publication date: 1999-12-16
Also published as: WO1999064945A3; WO1999064945A9; AU4679099A

Abstract

Knowledge about a task performed by at least one user may be represented using a data structure including information descriptive of relationships among a plurality of nodes. The plurality of nodes include a plurality of contextual nodes corresponding to tasks performed by the at least one user, and a plurality of textual nodes corresponding to units of information and separately from the information itself various contexts in which the information may be relevant. The data structure may, for example, be represented as a graph including a plurality of contextual vertices corresponding to the plurality of contextual nodes, a plurality of textual vertices corresponding to the plurality of textual nodes, and a plurality of edges corresponding to the relations among the plurality of nodes.

Description

COMPUTER SYSTEMS AND COMPUTER-IMPLEMENTED PROCESSES FOR KNOWLEDGE MANAGEMENT USING RELEVANCY TO SEARCH FOR, ACQUIRE AND ORGANIZE INFORMATION FOR MULTIPLE USERS

This application claims priority from a U.S. Provisional Patent Application entitled "Computer Systems and Computer-Implemented Processes for Knowledge Management Using Relevancy to Search for, Acquire and Organize Information for Multiple Users," Serial No. 60/088,821, filed on June 10, 1998, which is hereby incorporated by reference.

Background

A computer system generally permits an individual or a group of individuals to create, store, search for, and retrieve information in the computer system in a number of ways. One problem in such a computer system is permitting an individual to store information such that it is easily locatable and retrievable at a later time by the same individual or by another individual using the computer system.

This problem generally is solved by providing a user with some mechanism that permits the user to define characteristics of the information so that the user then can search on those characteristics. For example, a user may define folder names, file names, and file types using a file system. These names and other characteristics may be used to search for information in the file system. As another example, a database may be used that permits a user to associate other information, such as a type of document, a set of key words, or a hierarchical categorization of information, with the information being stored. Such other information may be used in search queries to locate and retrieve the information. In general, the mechanism provided by a computer system is predefined and structured or is defined by each individual. Searching for information is generally limited to a search query that uses key words determined by the user, which is applied to the set of information using some form of logic, such as boolean logic or fuzzy logic, to determine whether information matches the search query. A user generally has little guidance from the computer system in selection of key words to use. When search results are retrieved, the items of information generally are prioritized in an order that reflects a measure of how well each item matches the search query.

For an individual using the computer system, these limited mechanisms for classification, searching, and prioritizing information make it difficult to keep track of, search for and retrieve relevant information. These problems are compounded when multiple users create, store, search for, and retrieve information in a distributed computer network. In particular, what is relevant to one individual may not be relevant to another individual, or may be relevant in a different way. A way that a user decides to organize information may make sense to that user at that time, but may not make sense to another user or even to that same user at a later point in time. As a result, users often lose documents even in a well-organized file system.

Summary

A knowledge management system is provided which represents knowledge as a knowledge base including relationships among information and the contexts in which the information may be relevant. Information may be represented as an information object, which may be stored in any computer readable format, such as a text or graphics document or file, email object, audio or multimedia file, executable model object, database record, transaction object, and combinations and variations thereof. Information and contexts may be represented separately in the knowledge base. In one embodiment, for example, an information object is represented in the knowledge base as a textual node which may, for example, be a link to the information represented by the information object. In one embodiment, a "local context" corresponds to knowledge, developed by one or more users, about a particular activity or task within a knowledge domain. In one embodiment, a local context is represented in the knowledge base as a contextual node. A local context may include links to one or more textual nodes, one or more contextual nodes, and relationships among these textual nodes and contextual nodes. If a local context includes a link to a textual or contextual node, the local context "contains" that textual or contextual node.

The knowledge base may include one or more "global contexts," each of which corresponds to knowledge developed by a community of users in a particular knowledge domain. For example, a global context may correspond to a broad area of knowledge such as automobiles, organizational structures, or patent law. A global context is an aggregate of multiple contexts. For example, a global context corresponding to knowledge relevant to automobiles might be aggregated from a local context corresponding to knowledge relevant to the task of purchasing an automobile and a local context corresponding to knowledge relevant to the task of selling an automobile.

The separation of information from context draws, in part, from the realization that users typically do not search for, retrieve, organize, and manipulate information in a vacuum. Rather, users typically interact with information when engaging in a particular activity or task. The information that is relevant to a particular user, therefore, depends on the task being performed by the user. For example, consider a user whose task is to purchase an automobile. In the course of performing this task, the user likely searches for, obtains, and organizes information related to the purchase of automobiles. Using a conventional system, such a user might attempt to obtain information related to this task by searching a database for information related to automobile engines using a database query including the phrase "automobile engines." Such a search is typically performed by scanning documents in the database, and any associated meta-data, for the phrase "automobile engines." Because the search is based purely on information (i.e., the phrase "automobile engines") and does not incorporate any knowledge about the context in which the user is searching for the information (i.e., to obtain information relevant to the task of purchasing an automobile), the results of the search might include many documents, such as engineering specifications of automobile engines, that are not relevant to the user's task.

In contrast, this knowledge management system is task-centered, rather than information-centered. Because the system is task-centered, searching, acquisition, and organization of information are based on the user's task. For example, a user with a task to be performed engages with the system to interactively develop a formulation of the task. The task formulation that is developed thereby is used to identify an existing local context corresponding to the task or to generate a new local context corresponding to the task. The local context thus identified or generated is referred to as the user's workspace. When a local context is shared by multiple users, a user can browse the workspace to examine the knowledge that has been developed within it by another user of the system. The workspace may be considered as a repository of knowledge about a particular task developed by a community of users. As described in more detail below, the user browsing the workspace can both examine the knowledge developed in the workspace by other users and modify the contents of the workspace, thus further developing the knowledge it contains.

As described above, a local context may include links to one or more textual nodes, one or more contextual nodes, and relationships among the textual nodes and contextual nodes. These relationships may include, for example, relationships between textual nodes, relationships between contextual nodes, and relationships between textual nodes and contextual nodes. Although one embodiment of a data structure representing a local context is described below, local contexts may be represented in any way. For example, relationships between nodes contained in a local context may be represented separately from links from the local context to the textual nodes and contextual nodes that it contains.
In one embodiment, for example, the knowledge base is represented as a data structure including (1) a collection of nodes comprising pointers to information objects and contexts, and (2) a collection of links among the nodes. The presence of a textual node or contextual node within a local context may, for example, indicate that the textual node or contextual node is relevant to the task corresponding to the local context. For example, if another user has previously used the system to assist in the task of purchasing an automobile in Massachusetts, the local context corresponding to this task (the "car purchasing local context") may include links to information (e.g., documents accessible via the World Wide Web) related to the purchase of automobiles in Massachusetts, such as the web sites of car dealerships. The car purchasing local context may also include links to other local contexts (e.g., a local context corresponding to the task of purchasing trucks in Massachusetts) that are related to the purchase of automobiles. Furthermore, the car purchasing local context may, for example, include a relationship between a web site of a car dealership and the web site of the manufacturer of the cars sold by the car dealership. Similarly, the car purchasing local context may include a relationship between a web site of a car dealership and a local context corresponding to the task of obtaining financing for an automobile purchase. Similarly, the car purchasing local context may include a relationship between two local contexts, such as between a local context corresponding to the task of purchasing trucks in Massachusetts and a local context corresponding to the task of obtaining financing for an automobile purchase.

Links contained within a local context may be generated interactively by users of the system. The user may, for example, directly indicate to the system which links should be generated, such as by indicating to the system that a textual node or contextual node should be added to a local context, or by indicating that a relationship should be created between two nodes within a local context. Furthermore, the user may instruct the system to analyze a set of textual and contextual nodes and to create or propose links within a local context as a result of the analysis. If the system proposes links, the user may accept the links, reject the links, or modify the links prior to incorporating them into the local context. The system may determine the relevance of one textual node to another based on a comparison of geometric representations of the information objects corresponding to the textual nodes. For example, the system may generate a diagram of a textual node representing terms within the information object corresponding to the textual node and relationships among the terms. The system may generate or propose links between two textual nodes based on the determined relevance of the information objects corresponding to the two textual nodes.
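The node-and-link structure described above can be illustrated with a minimal Python sketch. It is only one possible reading, not the embodiment itself: the class names `TextualNode`, `ContextualNode`, and `LocalContext`, the `relate` helper, and the example URLs are all hypothetical, and relationships are modeled as unordered pairs purely for brevity.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TextualNode:
    """Pointer to an information object (here a URL); the object itself lives elsewhere."""
    ref: str


@dataclass(frozen=True)
class ContextualNode:
    """Stands for a task, i.e. for a local context."""
    task: str


@dataclass
class LocalContext:
    """A local context 'contains' nodes and records relationships among them."""
    node: ContextualNode
    contained: set = field(default_factory=set)
    relationships: set = field(default_factory=set)  # unordered pairs (assumption)

    def add(self, n):
        self.contained.add(n)

    def relate(self, a, b):
        # Relating two nodes also places both of them in this context.
        self.add(a)
        self.add(b)
        self.relationships.add(frozenset((a, b)))


# Hypothetical "car purchasing local context" from the example above.
car = LocalContext(ContextualNode("purchase a car in Massachusetts"))
dealer = TextualNode("https://dealer.example/")
maker = TextualNode("https://manufacturer.example/")
financing = ContextualNode("obtain financing for a car purchase")
car.relate(dealer, maker)      # textual node to textual node
car.relate(dealer, financing)  # textual node to contextual node
```

Because the textual nodes hold only references, the same information object can appear in any number of local contexts without being copied.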

Links within a local context can include qualitative and/or quantitative information. For example, a link may include information descriptive of: (1) a degree of relevancy of nodes connected by the link, (2) a direction of the link, (3) a type of the link, (4) a source of the link (e.g., an author of the link), (5) an age or creation date of the link, and (6) a local context or local contexts that contain the link. A local context may contain more than one link between the same pair of nodes. The number of links between a pair of nodes may, for example, indicate a degree of relevancy of the two nodes to each other within the local context that contains the nodes.
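As a hedged illustration of the link attributes enumerated above, the following Python sketch stores the six kinds of descriptive information on each link and counts parallel links between a pair of nodes. The `Link` class, its field names, and the sample values are assumptions made for the example; the text does not prescribe a particular layout.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Tuple


@dataclass(frozen=True)
class Link:
    a: str
    b: str
    relevancy: float            # (1) degree of relevancy
    directed: bool              # (2) direction: True means a -> b
    kind: str                   # (3) type of the link
    author: str                 # (4) source of the link
    created: date               # (5) creation date
    contexts: Tuple[str, ...]   # (6) local contexts containing the link


def links_per_pair(links):
    """Count links per unordered node pair; more links may suggest higher relevancy."""
    return Counter(frozenset((l.a, l.b)) for l in links)


links = [
    Link("dealer", "maker", 0.8, True, "sells-for", "ann", date(1999, 1, 5), ("car purchase",)),
    Link("dealer", "maker", 0.6, False, "mentions", "bob", date(1999, 2, 1), ("car purchase",)),
    Link("dealer", "financing", 0.5, False, "related", "ann", date(1999, 3, 9), ("car purchase",)),
]
counts = links_per_pair(links)
```

Here the dealer and maker nodes are joined by two links and the dealer and financing nodes by one, so a count-based measure would rank the first pair as more closely related within this context.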

As described above, the user's workspace is associated with a formulation of the user's task or with a number of previously formulated user's tasks. In one embodiment, the workspace may include the task formulation to which it corresponds. In one embodiment, for example, the workspace is represented by a data structure that includes information descriptive of the task formulation. The task formulation may, for example, be a textual, linguistic, or algebraic articulation of the task. The data structure representing the workspace may further include, for example, information descriptive of the task or tasks associated with the workspace, such as information descriptive of (1) notes taken by a user while working on the task or tasks, (2) semantic clouds and/or search queries formulated while working on the task, (3) project management information related to the task (e.g., a due date of the task, the name of a task leader, names of members of a team working on the task), (4) whether the task is related to another task, and (5) behavior preferences, such as whether documents retrieved from a search while performing the task are to be pre-processed.
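A workspace record of the kind described above might be sketched in Python as follows. The field names are hypothetical and simply mirror the five categories of descriptive information listed; a real embodiment could store them quite differently.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Set


@dataclass
class Workspace:
    task_formulation: List[str]                                    # textual articulation of the task
    notes: List[str] = field(default_factory=list)                 # (1) user notes
    semantic_clouds: List[Set[str]] = field(default_factory=list)  # (2) generated queries
    due_date: Optional[date] = None                                # (3) project management
    task_leader: Optional[str] = None
    team: List[str] = field(default_factory=list)
    related_tasks: List[str] = field(default_factory=list)         # (4) related tasks
    preprocess_retrieved_documents: bool = False                   # (5) behavior preference


ws = Workspace(
    task_formulation=["buy", "car", "Massachusetts"],
    task_leader="Ann",
    due_date=date(1999, 6, 10),
)
ws.notes.append("Compare dealer prices first.")
```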

Textual and/or contextual nodes may be represented as objects having one or more methods that may be performed by the objects. For example, a textual node corresponding to a document created by a word processing application program may include an editing method that, when executed, invokes the word processing application program to edit the document.

Similarly, a document corresponding to a transaction (e.g., a purchase order) may, for example, include a method which, when executed, performs the transaction to which the document corresponds.
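The idea of nodes carrying executable methods can be sketched as below. This is an illustrative assumption rather than the described implementation: `TextualNode`, `invoke`, and the two handlers are hypothetical names, and the handlers only return a description of the action instead of launching a real application or executing a real transaction.

```python
class TextualNode:
    """A node that references an information object and exposes named methods."""

    def __init__(self, ref, methods=None):
        self.ref = ref
        self.methods = dict(methods or {})

    def invoke(self, name):
        # Look up the method by name and apply it to the referenced object.
        return self.methods[name](self.ref)


def edit_in_word_processor(ref):
    # Stand-in for launching an editor; returns the command that would be run.
    return ["wordprocessor", ref]


def perform_transaction(ref):
    # Stand-in for executing the transaction the document describes.
    return "submitted " + ref


report = TextualNode("report.doc", {"edit": edit_in_word_processor})
order = TextualNode("purchase-order-17", {"perform": perform_transaction})
```

Calling `report.invoke("edit")` yields the editor command for that document, and `order.invoke("perform")` triggers the stand-in transaction handler.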

As described above, in one aspect, the knowledge management system provides a knowledge base representing knowledge. In another aspect, the knowledge management system provides methods for generating and manipulating the knowledge base. For example, the knowledge management system provides methods for generating the knowledge base described above, in which information may be represented in the knowledge base separately from contexts. The knowledge management system also provides methods for identifying and generating workspaces based on task formulations, as described above. For example, in response to submission of a task formulation by the user, the knowledge management system may search a global context to identify workspaces (i.e., local contexts) within the global context that correspond to tasks that are similar to the task formulation submitted by the user. The system may then, for example, incorporate links from the identified workspaces into the user's workspace, or, based on the identified workspaces, the system may create a new workspace.

The knowledge management system also has means for interactively formulating a user's task articulation or a search query. In one embodiment, a task formulation is a set of questions, phrases, or terms associated with a workspace. When a user submits a task formulation, the system may, for example, respond to the formulation with related terms from task formulations corresponding to existing workspaces. The system may, for example, suggest that the user add terms to the user's task formulation from related task formulations. When the user subsequently explores the user's workspace and documents retrieved from searches of external information sources, the system may suggest further modifications to the user's task formulation based on such exploration. For example, the system may suggest that the user add terms that the user highlights in documents retrieved from an external search to the user's task formulation. The user may accept or reject the system's recommendations, and the user may further modify the task formulation in any way at any time.
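The following Python sketch shows one way the matching of a task formulation against existing workspaces, and the suggestion of additional terms, could work. The text does not specify a similarity measure; Jaccard overlap of term sets and the 0.25 threshold are assumptions chosen for illustration, as are the function names and sample data.

```python
def jaccard(a, b):
    """Overlap of two term sets: |a & b| / |a | b| (an assumed similarity measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_workspaces(formulation, global_context, threshold=0.25):
    """Return workspace names whose formulations resemble the user's, best first."""
    scored = [(jaccard(formulation, terms), name) for name, terms in global_context.items()]
    return [name for score, name in sorted(scored, reverse=True) if score >= threshold]


def suggest_terms(formulation, global_context, threshold=0.25):
    """Propose terms from similar workspaces that the user has not yet used."""
    suggestions = set()
    for name in similar_workspaces(formulation, global_context, threshold):
        suggestions |= set(global_context[name]) - set(formulation)
    return sorted(suggestions)


# Hypothetical global context: workspace name -> task formulation terms.
global_context = {
    "car purchase": {"buy", "car", "dealer", "massachusetts"},
    "truck purchase": {"buy", "truck", "massachusetts"},
    "patent filing": {"patent", "claims"},
}
formulation = {"buy", "car", "massachusetts"}
matches = similar_workspaces(formulation, global_context)
proposed = suggest_terms(formulation, global_context)
```

The proposed terms are only suggestions; consistent with the passage above, the user would remain free to accept, reject, or edit them.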

The knowledge management system also may refine a query on a database of information objects. In doing this, the system may use a search language vocabulary. The search language vocabulary may be received from outside the system and/or actively formed by the system to reflect the variety of tasks performed by the users of the system. The search language vocabulary and its structure may be updated and refined continuously and automatically. The system uses the search language vocabulary for automatic elaboration of search queries (called "semantic clouds") based on the user's task formulation and on the pattern of the current user's activity within the system. The search language vocabulary may be structured as a union of possibly overlapping sub-sets of words (called "semantic groups"). A semantic cloud (i.e., an automatically generated search query) may originally be formed as an intersection of a semantic group and of a collection of words picked up by the user during his current activity within a given workspace. The system refines semantic clouds for the user based on results of searches. Based on the efficiency of the refined semantic clouds, the system updates the search language vocabulary and refines the structure of the search language.

Information objects returned as the result of a search of an information source may be incorporated into the user's workspace or be incorporated into a new permanent or temporary workspace. The changes in the workspace that have been approved by the user may be automatically incorporated into the global context so that the knowledge gained by the user can be shared with other users. Other users may either accept or reject changes to their own workspaces caused by changes in the global context.

The knowledge management system enables context-based searching by retrieving information from the global context and from external sources based on the user's articulation of his or her task. Such searching is likely to retrieve information that is most relevant to the user's task and to accomplish such retrieval more quickly than conventional systems by directly identifying the most relevant information. Relevancy of information is defined interactively with the user and organized by the knowledge management system based on relevancy to users' tasks. In such a system, access to external information influences the structure and content of the knowledge base, and the structure and content of the knowledge base influences the process of information acquisition. The system thus provides a computing environment that automates the search for internal and external information and the direction of communication in an organization in a particular knowledge domain based on relevance of information to the users. An individual user may use the entire system for creating his own collaborative environment in which different forms of his activity are presented as different and mutually coordinated task-specific local contexts (i.e., the user's workspaces).
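The initial formation of a semantic cloud, described above as the intersection of a semantic group with the words the user has picked up, can be sketched in a few lines of Python. The vocabulary contents, the function name `semantic_clouds`, and the lower-casing step are assumptions for the example; the later refinement of clouds from search results is not modeled.

```python
def semantic_clouds(vocabulary, picked_words):
    """Intersect each semantic group with the user's picked words.

    vocabulary: mapping of group name -> set of words (groups may overlap).
    Returns only the groups that share at least one word with the user.
    """
    picked = {w.lower() for w in picked_words}
    clouds = {}
    for group, words in vocabulary.items():
        common = words & picked
        if common:
            clouds[group] = common
    return clouds


# Hypothetical search language vocabulary with two overlapping semantic groups.
vocabulary = {
    "purchasing": {"price", "dealer", "warranty", "financing"},
    "engineering": {"engine", "torque", "piston", "warranty"},
}
clouds = semantic_clouds(vocabulary, ["Dealer", "warranty", "mileage"])
```

In this example "warranty" belongs to both groups, so it appears in both clouds, while "mileage" is in no group and contributes to neither.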

Providing users with the ability to update global contexts as described above allows collaboration among many users working within the same knowledge domain. By incorporating the results of users' activities into the global context, the system may be used to allow users to quickly learn from and collaborate with each other. For example, changes made to a local context by one user may be viewed by other users and influence other users' task formulations and searches. Previous versions of local contexts may be viewed by users to, for example, examine the ways in which tasks were formulated, developed, and changed over time. By creating shared local contexts for tasks performed by a team and integrating them into the global context, for example, team members may be kept constantly up to date on evolving group knowledge. Furthermore, this collaborative aspect of the knowledge management system encourages organizational learning. An organization learns to the extent that it can remember and re-use what individual members of that organization have learned both individually and in conjunction with each other. Learning in an organization includes continuous testing of experience, and the transformation of that experience into knowledge that is accessible to the whole organization and that is relevant to the organization's purposes. Since the system captures and aggregates learning, the knowledge base developed by the system represents what has been learned by an organization.

The knowledge management system also has means for browsing the knowledge base. The system may, for example, present the user with a visual representation of the knowledge corresponding to a context, which the user may then browse. The visual representation may, for example, be a graph, a list, a table, or a "file and folder"-type display. When the visual representation is a graph, nodes may be represented as vertices of the graph and links between nodes may be represented as edges connecting the corresponding vertices. The system may visually indicate kinds and degrees of relevance in any of a number of ways. For example, when the visual representation is a graph, the length or thickness of an edge connecting two vertices may represent a degree of relevance of the information objects represented by the two vertices. Similarly, the color of the edge may represent a kind of relevance between the two information objects represented by the two vertices. When the system displays a visual representation of a context, the system may visually indicate a recommended order in which to browse the nodes of the context. The system may, for example, generate a recommended browsing order geometrically, e.g., as a path in the graph.
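A recommended browsing order expressed as a path in the graph could be produced in many ways. The Python sketch below uses a simple greedy walk that always moves to the most relevant unvisited neighbor; this strategy, the function name, and the edge weights are assumptions for illustration and are not stated in the text.

```python
def browsing_order(edges, start):
    """Greedy path: from each vertex, step to the most relevant unvisited neighbor.

    edges: mapping of (vertex_a, vertex_b) -> degree of relevance (undirected).
    The walk stops when the current vertex has no unvisited neighbors,
    so it may not reach every vertex.
    """
    neighbors = {}
    for (a, b), relevance in edges.items():
        neighbors.setdefault(a, {})[b] = relevance
        neighbors.setdefault(b, {})[a] = relevance

    order, visited, current = [start], {start}, start
    while True:
        candidates = {
            n: r for n, r in neighbors.get(current, {}).items() if n not in visited
        }
        if not candidates:
            return order
        current = max(candidates, key=candidates.get)
        visited.add(current)
        order.append(current)


# Hypothetical context graph with relevance weights on the edges.
edges = {
    ("task", "dealer"): 0.9,
    ("task", "maker"): 0.4,
    ("dealer", "maker"): 0.7,
    ("dealer", "financing"): 0.8,
    ("financing", "maker"): 0.3,
}
path = browsing_order(edges, "task")
```

The same weights could drive the display as well, for instance by mapping higher relevance to thicker or shorter edges.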

More than one information object may be associated with a textual node. For example, a textual node may be associated with multiple closely related information objects retrieved from external sources. In this case, the system may treat the collection as a single document associated with a single textual node. When the user creates a text associated with a textual node, the system may associate all the versions of the document with the textual node. Multiple information objects corresponding to a textual node may be represented in the visual representation of the textual node as, for example, a thumbnail in which subsections of the textual node are represented either as a reduced-sized view of the document or as a set of rows and columns, in which the presence or absence of a selected term in a subsection is indicated by a mark in the corresponding column or row. A subsection of a textual node may include, for example, a line, a sentence, a paragraph, or a chapter. Subsections of a textual node may also be represented in the visual representation of the textual node as, for example, a highlighted document, in which multiple, disconnected segments of a document can be highlighted. The kind of highlighting used may, for example, indicate an action to be taken by the system with respect to the highlighted portion of the textual node.
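The row-and-column thumbnail described above, in which a mark indicates whether a selected term occurs in a subsection, can be approximated with the Python sketch below. It assumes paragraphs (separated by blank lines) as the subsections and case-insensitive substring matching; both choices, and the function names, are illustrative only.

```python
def term_presence(text, terms):
    """One row per paragraph, one column per term; True where the term occurs."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    return [[term.lower() in p.lower() for term in terms] for p in paragraphs]


def render(matrix):
    """Draw the matrix as text: 'x' marks presence, '.' marks absence."""
    return "\n".join("".join("x" if hit else "." for hit in row) for row in matrix)


document = (
    "The dealer offers a warranty.\n\n"
    "Engine torque is high.\n\n"
    "Financing through the dealer."
)
matrix = term_presence(document, ["dealer", "warranty", "engine"])
thumbnail = render(matrix)
```

Each row of the rendered thumbnail corresponds to one paragraph, so a reader can see at a glance which parts of the document mention which terms.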

In another aspect, the knowledge management system has means for measuring knowledge. Measurements may, for example, include measurements of (1) a general volume of knowledge, (2) the density of knowledge, (3) informational value of a given textual node, (4) mutual relevance of textual nodes within a given local context, (5) relevance of textual to contextual nodes within a given local context, (6) efficiency of collaborative relationships and information sharing between different local contexts, and (7) a value of a user's contribution to a local context and to the overall body of knowledge. Such measures of relevancy and efficiency may be used, for example, to measure the amount of progress that has been made in the achievement of a task and a relative degree of contribution of different users to a task.

The knowledge management system may be implemented as a layer on top of existing information systems, such as databases, file systems, and intranets. Because contexts may contain references to information (e.g., URLs to documents on an intranet), rather than the information itself, the system may be used in conjunction with any pre-existing system without importing information from such a system into the knowledge management system. This layering of the knowledge management system on top of existing information systems minimizes the amount of storage for maintaining the knowledge management system and allows the knowledge management system to make use of existing tools for viewing, editing, and storing information.
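Of the knowledge measurements listed above, the relative contribution of users is the easiest to illustrate. The Python sketch below assumes, purely for the example, that a user's contribution is the share of links in a local context that the user authored; the text does not define the measure, and the function name is hypothetical.

```python
from collections import Counter


def contribution_shares(link_authors):
    """Fraction of links authored by each user (an assumed contribution measure)."""
    counts = Counter(link_authors)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {author: n / total for author, n in counts.items()}


# Hypothetical authorship of four links in one local context.
shares = contribution_shares(["ann", "ann", "bob", "ann"])
```

A fuller measure would presumably weight links by relevancy or age rather than counting them equally.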

The knowledge management system may be implemented, for example, according to a client-server architecture in which a knowledge management server maintains a knowledge base and a client requests and receives knowledge from the knowledge management server. For example, the client may be implemented as a standalone application or a web browser applet written in a language such as Java for allowing a user or group of users to access a knowledge base via the World Wide Web. The server may be an application program that accumulates and shares knowledge in a data structure such as that described above.

Although the system described above is described as a knowledge management system, application of the techniques described herein is not limited to knowledge management. Rather, aspects of the system described herein may be directed to other applications, including, but not limited to, the following: personal information management, file system browsers for a personal computer, Internet web browsers, e-mail clients, search engines, groupware, medical records management, customer support systems, project planning systems, testing and problem management systems, electronic book indices, patent writing and reviewing, sales force automation and sales relationship management, engineering document management, collaborative document development and project work for trusted communities (e.g., a team within an enterprise) and for non-trusted communities, Internet portals, personalization systems, and business process re-engineering. Relationships within a global context and the local contexts from which it is aggregated are developed over time by a community of users working within a knowledge domain rather than being pre-defined by a system administrator. Rather than relying on rigid categories or rules, the knowledge management system learns emerging relationships among information and local contexts over time by keeping track of what information and local contexts users find relevant to tasks they perform. Such tracking of the use of the system allows the knowledge management system to recognize and support new and evolving categories. Furthermore, this tracking allows the knowledge management system to both be responsive to the tasks that users perform and to help users better formulate and execute their tasks. The knowledge management system may accommodate multiple — sometimes contradictory — organizing frameworks, recognizing that there is no single "correct" way to organize information that is useful to all users in all situations.

Because the knowledge management system automatically tracks the ways in which users use knowledge to perform their tasks and the ways in which users indicate that information and local contexts are relevant to their tasks, there is no need for any a priori taxonomy to categorize the user's activity. The system automatically elaborates the categorized activities (i.e., workspaces) and their taxonomy (presented in a "domain vocabulary", as described below).

In one aspect, a first data structure, tangibly stored on a computer-readable medium, is provided. The first data structure represents knowledge about a task performed by at least one user. The first data structure includes information descriptive of relationships among a plurality of nodes. The plurality of nodes include a plurality of contextual nodes corresponding to tasks performed by the at least one user, and a plurality of textual nodes corresponding to units of information. The textual nodes may, for example, be information objects. The information objects may, for example, be information objects stored in a knowledge base of a knowledge management system, data objects stored in a database system, files stored in a computer file system, information objects accessible via an intranet, or information objects accessible via an internet. The textual nodes may include references (e.g., URLs) to information objects, in which case the information objects may, for example, be documents stored in files on a computer-readable medium.

The relationships among the nodes may, for example, include pairwise relationships (e.g., orderings or degrees of relevancy) between pairs of nodes. The relationships may indicate degrees of relevancy between pairs of nodes.

The first data structure may further include, for example, information descriptive of (1) sources of the relationships among the plurality of nodes, (2) ages of the relationships, or (3) the task performed by the at least one user. The relationships may include at least two relationships between a first one of the plurality of nodes and a second one of the plurality of nodes. The information descriptive of sources of the relationships may, for example, identify authors of the relationships, and the method may further include a step of generating a measure of a degree of contribution to the data structure of a select one of the authors. The method may further include a step of, for example, generating a measure of the quantity and/or quality of knowledge represented by the data structure. In another aspect, a second data structure, tangibly stored on a computer-readable medium, is provided that represents knowledge in a domain. The second data structure includes information descriptive of (1) a plurality of first data structures as described above, and (2) relationships among the plurality of first data structures. In another aspect, a third data structure, tangibly stored on a computer-readable medium, is provided that represents knowledge in a plurality of domains. The third data structure includes information descriptive of (1) a plurality of second data structures as described above, and (2) relationships among the plurality of second data structures. In another aspect, a method for generating a data structure representing knowledge about a task performed by at least one user is provided. The method includes a step of: generating information descriptive of relationships among a plurality of nodes, the plurality of nodes including: (1) a plurality of contextual nodes corresponding to tasks performed by at least one user; and (2) a plurality of textual nodes corresponding to units of information. The textual nodes may, for example, include information objects. 
The information objects may, for example, be (1) information objects stored in a knowledge base of a knowledge management system, (2) data objects stored in a database system, (3) files stored in a computer file system, (4) information objects accessible via an intranet, or (5) information objects accessible via an internet. The textual nodes may include references (e.g., URLs) to information objects.

The relationships among the nodes may, for example, be pairwise relationships between pairs of nodes (e.g., degrees of relevancy between pairs of nodes, orderings of pairs of nodes). The information descriptive of the relationships among the plurality of nodes may, for example, be information descriptive of sources of the relationships or information descriptive of ages of the relationships. The relationships may include information descriptive of the task performed by the at least one user. The relationships may include at least two relationships between a first one of the plurality of nodes and a second one of the plurality of nodes.
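By way of illustration only, the data structure described above might be sketched as follows. The sketch assumes a simple in-memory representation; the names Node, Relationship, ContextGraph, between, and contribution, the numeric relevancy scale, and the use of a share-of-relationships count as the measure of an author's contribution are all hypothetical choices made for this example and are not prescribed by the description.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    node_id: str
    kind: str      # "contextual" (a task) or "textual" (a unit of information)
    ref: str = ""  # optional reference to an information object, e.g. a URL


@dataclass(frozen=True)
class Relationship:
    first: str        # id of the first node of the pair
    second: str       # id of the second node of the pair
    relevancy: float  # degree of relevancy (the 0..1 scale is an assumption)
    author: str       # source of the relationship
    created: float    # creation time, from which an age can be derived


@dataclass
class ContextGraph:
    nodes: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def relate(self, first, second, relevancy, author, created):
        if first not in self.nodes or second not in self.nodes:
            raise KeyError("both nodes must exist before they are related")
        # A list (not a dict keyed by pair) permits two or more
        # relationships between the same pair of nodes.
        self.relationships.append(
            Relationship(first, second, relevancy, author, created))

    def between(self, first, second):
        pair = {first, second}
        return [r for r in self.relationships if {r.first, r.second} == pair]

    def contribution(self, author):
        # Hypothetical measure: the author's share of all relationships.
        if not self.relationships:
            return 0.0
        counts = Counter(r.author for r in self.relationships)
        return counts[author] / len(self.relationships)
```

Storing the relationships in a list rather than in a mapping keyed by node pair is what allows different authors to record different degrees of relevancy for the same pair of nodes.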

The method may further include a step of displaying a visual representation of the data structure on an output device. The visual representation may include visual representations of a plurality of document containers corresponding to the plurality of contextual nodes, and visual representations of a plurality of documents corresponding to the plurality of textual nodes. The document containers may include directories on a storage device. The visual representation may, for example, include a list including elements corresponding to the contextual nodes and elements corresponding to the textual nodes. The visual representation may, for example, be a graph including a plurality of contextual vertices corresponding to the plurality of contextual nodes; a plurality of textual vertices corresponding to the plurality of textual nodes; and a plurality of edges corresponding to the relations among the plurality of nodes. The visual representation may include graphical elements corresponding to the plurality of nodes, and the method may further include a step of indicating an order of navigation of the graphical elements based on relevancies of the corresponding textual nodes to the task. The relevancies may be determined based upon a geometric comparison of the documents corresponding to the textual nodes. The information objects may, for example, be documents, and the visual representation may include document icons corresponding to the documents. The document icons may, for example, indicate the presence of terms in the documents that are relevant to the task. The method may further include steps of: indicating a selected portion of the information object; and persistently storing information descriptive of the indication of the selected portion of the document. The method may further include steps of: indicating a selected portion of the information object; and indicating that the selected portion of the information object is relevant to the task performed by the user.
The method may further include steps of: indicating a selected portion of the information object; and modifying a formulation of the task performed by the user based on the selected portion of the information object. The method may further include steps of: indicating a selected portion of the information object; and modifying a search query of information sources based on the selected portion of the information object. The method may further include a step of: generating a measure of a relevance of a select one of the plurality of nodes to the task performed by the user.

In another aspect, a method for generating a data structure representing knowledge in a domain is provided. The method includes steps of: generating information descriptive of a plurality of first data structures as described above, and generating information descriptive of relationships among the plurality of first data structures. In one embodiment, the method further includes steps of identifying a task performed by a user, and identifying a select one of the plurality of data structures representing knowledge about the identified task. In a further embodiment, the method further includes steps of: defining a user-task graph including (1) a plurality of user vertices corresponding to a plurality of users; (2) a plurality of task vertices corresponding to tasks performed by the plurality of users; and (3) a plurality of edges connecting each user vertex in the plurality of user vertices to task vertices corresponding to tasks performed by the user to which the user vertex corresponds; and selecting a task vertex connected to a select one of the plurality of user vertices, the select one of the plurality of user vertices corresponding to the user.

In another aspect, a method for generating a data structure representing knowledge in a plurality of domains is provided. The method includes steps of: generating information descriptive of a plurality of first data structures as described above, and generating information descriptive of relationships among the plurality of first data structures. The generating step may be performed in response to user input indicating the relationships. The generating step may include a step of analyzing information descriptive of the nodes to determine the information descriptive of relationships among the plurality of nodes. The method may further include a step of receiving input from a user indicating the relations among the plurality of nodes. The generating step may include generating information descriptive of relationships among a plurality of nodes based on the input from the user, and the method may further include steps of analyzing information descriptive of the plurality of nodes, generating information descriptive of proposed relationships among the plurality of nodes based on the analysis; presenting the information descriptive of the proposed relationships to a user; and receiving input from the user with respect to the information descriptive of the proposed relationships. The receiving step may include, for example, a step of receiving input from the user indicating (1) an addition of a relationship between select ones of the plurality of nodes, (2) a deletion of a relationship between select ones of the plurality of nodes, or (3) whether the user approves of the proposed relationships among the plurality of nodes. The receiving step may include a step of receiving input from the user indicating relevancy relationships among the plurality of nodes.

In another aspect, a method for generating a formulation of a task performed by a user is provided. The method includes steps of: generating an initial formulation of the task; generating proposed modifications to the initial task formulation based on a plurality of formulations of tasks performed by at least one user; receiving user input with respect to the proposed modifications; and modifying the initial formulation of the task based on the proposed modifications and the user input. The step of generating an initial formulation of a task may include, for example, a step of receiving the initial formulation of the task from the user. The step of generating proposed modifications to the initial task formulation may include steps of: comparing the initial formulation of the task to the plurality of formulations of tasks to determine degrees of similarity between the initial formulation of the task and the plurality of task formulations; and generating proposed modifications including elements selected from select ones of the plurality of formulations whose degree of similarity to the initial formulation of the task is within a predetermined limit. The step of receiving user input may include a step of receiving terms selected by the user from documents, and the step of generating proposed modifications to the initial task formulation may include a step of generating proposed modifications including the terms selected by the user.

In another aspect, a method for generating a formulation of a search query related to a task performed by a user is provided. The method includes steps of: generating an initial formulation of the search query; retrieving information from an information source based on the initial formulation of the search query; and generating the formulation of the search query based on the initial formulation of the search query and the retrieved information.

In another aspect, a method for generating a formulation of a search query related to a task performed by a user is provided. The method includes steps of: receiving information from the user related to the task; and generating the formulation of the search query based on the information received from the user and information derived from at least one formulation of a search query generated by at least one user who performed the task. The information received from the user may include terms selected by the user from documents related to the task. The information derived from at least one formulation of a search query generated by at least one user who performed the task may include a search language vocabulary including a plurality of sets of terms derived from the at least one formulation of a search query. The generating step may include a step of generating at least one semantic cloud including terms that are common to the terms selected by the user and terms in a select one of the plurality of sets of terms in the search language vocabulary. The method may further include steps of selecting a select one of the at least one semantic cloud; and performing a search of an information source based on the selected semantic cloud. The units of information may include units of information produced as a result of performing the search.
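As a minimal sketch of the generation of semantic clouds just described, each cloud may be computed as the intersection of the terms selected by the user with one set of terms in the search language vocabulary. The function name semantic_clouds, the representation of the vocabulary as a mapping from a label to a set of terms, and the case-insensitive comparison are assumptions made for illustration.

```python
def semantic_clouds(selected_terms, vocabulary):
    """Return one cloud per vocabulary set that shares terms with the user.

    selected_terms: iterable of terms the user selected from documents.
    vocabulary: mapping of a (hypothetical) label to a set of terms derived
                from previously generated search query formulations.
    """
    selected = {term.lower() for term in selected_terms}
    clouds = {}
    for label, terms in vocabulary.items():
        common = selected & {term.lower() for term in terms}
        if common:  # keep only clouds that contain at least one common term
            clouds[label] = common
    return clouds
```

A cloud selected by the user could then be joined into a query string, or passed in whatever form the searched information source expects.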

In another aspect, a knowledge management system is provided. The knowledge management system includes an information source and a knowledge management interface configured to provide an interface between the information source and a plurality of users, the knowledge management interface presenting information in the information source based on its relevance to a task performed by at least one of the plurality of users.

Brief Description of the Drawings

FIGS. 1A-1C are data flow diagrams of embodiments of computer systems that manage knowledge using global and local context graphs;

FIG. 2 is a data flow diagram of an aspect of the system of FIG. 1A relating to searching, browsing, and editing documents;

FIG. 3 is a diagram of a graph of a global context;

FIG. 4 is a flow chart of a process for assisting a user in the formulation of a task;

FIG. 5 is a two-dimensional representation of a user-task graph;

FIGS. 6A-D are tables descriptive of information contained within data structures used in the systems of FIGS. 1A-C;

FIGS. 6E-G are diagrams of data structures used in the systems of FIGS. 1A-C;

FIG. 7 is a two-dimensional representation of a local context graph;

FIG. 8 is a flow chart of a method for generating a recommended path of navigation through a two-dimensional representation of a local context graph;

FIG. 9 is a flow chart of a method for transforming a selected terms document into a proto-task document;

FIG. 10 is a graphical representation of semantic clouds;

FIG. 11 is a flow chart of a method for performing search and enrichment;

FIG. 12 is a diagram of a window displaying a task document, a two-dimensional representation of a local context graph, and found documents;

FIG. 13 is a diagram of an updated local context graph; and

FIG. 14 is a diagram of a history graph.

Detailed Description

The following detailed description should be read in conjunction with the attached drawings in which similar reference numbers indicate similar structures. All references cited herein are hereby expressly incorporated by reference.

Referring to FIG. 1A, in one embodiment, a community of users uses a knowledge management system 10a to develop a domain-specific, dynamic knowledge base and to assist in the performance of domain-specific tasks. A user with a task to perform begins interacting with the system 10a by interactively developing a formulation of the task to be performed. The user generates input 130, representing an initial formulation of the task, to a task formulator 128. The task formulator 128 may suggest modifications to the user's initial task formulation based on information contained in a task repository 132 which contains information about tasks previously performed by the same user or other users. When the user considers the formulation of the task complete, the user instructs the task formulator 128 to generate a final task formulation 126. In this way, the system 10a uses knowledge developed over time about various tasks and the information that is relevant to such tasks to aid the user in the formulation of his task.

A localization process 102a uses the task formulation 126 to extract from a global context 100a local context 104a corresponding to the task represented by the task formulation 126. The extracted local context 104a includes knowledge developed by the user or by other users who have previously used the system to perform the task. As a result, the extracted local context 104a includes information that is likely to be relevant to the user's task and that is organized in a way that is related to the user's task. The local context 104a is presented to the user, such as by displaying a visual representation of the local context 104a on a computer monitor. The visual representation of the local context 104a may, for example, display both visual representations of the information objects and local contexts that are referenced by the local context 104a, as well as the relationships within the local context (e.g., relationships between information objects). The user may examine the knowledge in the local context 104a and in related local contexts and extract from the local context 104a knowledge that the user determines to be relevant to the user's task using a browsing, editing, and searching module 105, which is described in more detail below.

If the user is satisfied that the knowledge extracted from the local context 104a is sufficient to complete the user's task, the user may terminate his interaction with the system 10a. If the user desires to obtain additional knowledge from sources external to the knowledge management system 10a, the user may use the browsing, editing, and searching module 105 to interactively formulate a query that can be used to search external information sources. The browsing, editing, and searching module 105 may assist the user in the formulation of the query by suggesting search terms drawn from a domain-specific search language vocabulary that may, for example, be developed over time by the user and other users in the process of performing a similar task. The knowledge management system 10a may also suggest a grouping and prioritizing of search terms drawn from the knowledge extracted by the user from the local context and from other information descriptive of the user's task.

When the system 10a interactively completes formulation of the external query, the browsing, editing, and searching module 105 searches external information sources using the query. The results of the query are pre-processed by the browsing, editing, and searching module 105 to determine their relevance to the user's task so that the results may be presented to the user in a way that indicates the relevance of the search results to the user's task. The system may, for example, prioritize the results of the search based on their similarity to the user's task formulation and to information previously stored in the local context. The system may, for example, recommend an order in which the search results should be examined by the user. The results of the query may then be presented to the user in a manner that indicates the priorities assigned to them.
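The prioritization of search results described above might, purely as an illustrative sketch, be implemented by scoring each result against the task formulation and against documents already stored in the local context. The description does not specify a similarity measure; the sketch below assumes Jaccard similarity over word sets with equal weighting of the two scores, and the names terms, jaccard, and prioritize are hypothetical.

```python
import re


def terms(text):
    """Lower-cased set of alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def prioritize(results, task_formulation, local_context_docs=()):
    """Order results from most to least similar to the task and context."""
    task_terms = terms(task_formulation)
    context_terms = set()
    for doc in local_context_docs:
        context_terms |= terms(doc)

    def score(result_text):
        words = terms(result_text)
        # Equal weighting of the two similarities is an assumption.
        return jaccard(words, task_terms) + jaccard(words, context_terms)

    return sorted(results, key=score, reverse=True)
```

The resulting order could serve both as the assigned priorities and as a recommended order of examination.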

The user may further use the browsing, editing, and searching module 105 to explore the results of the search and to extract information from the results that the user determines to be relevant to his task. Furthermore, the user may use the browsing, editing, and searching module 105 to update the local context 104a based on the search results to incorporate into the local context 104a knowledge that the user has gained about the relevance of information and other local contexts to the user's task. The system 10a assists the user in enrichment of the workspace by suggesting places for the search results in the workspace as new textual nodes. The user may modify the local context 104a in other ways, such as by modifying relationships among different units of information within the local context.

A globalization process 106 incorporates the updated local context 104a back into the global context 100a to reflect any changes that have been made to the local context 104a. Changes to the local context 104a that potentially impact other users' local contexts may be rejected by the other users. Any users who rejected the modified local context 104a continue to work with the unmodified version of the local context 104a. Such incorporation of the updated local context 104a represents the user's contribution to the domain-specific knowledge represented by the global context 100.

The knowledge management system 10a described above may be implemented as a layer on top of existing information systems, such as databases, file systems, and intranets. Because local contexts may contain references to information objects (e.g., Uniform Resource Locators (URLs) to documents on an intranet), rather than containing the information objects themselves, the system may be used in conjunction with any pre-existing system without importing information objects from such a system into the knowledge management system 10a. Such implementation of the knowledge management system 10a as a layer on top of existing information systems both minimizes the amount of storage used to maintain the knowledge management system 10a and allows the knowledge management system 10a to make use of existing tools for viewing, editing, and storing information.

For example, the knowledge management system 10a may be implemented as a layer on top of a personal computer operating system, such as Microsoft Windows, to provide an interface between the user and the computer's file system. In such an implementation, the information objects referenced by local contexts include files stored on the file system, and the local contexts include references to files and relationships among such files. The knowledge management system 10a assists the user in the formulation of a task, as described above, using a graphical user interface. After formulating a task, the user is presented with a visual representation of the local context corresponding to the user's task. Such a visual representation may, for example, be a graph, a list, a table, or a "file and folder"-type display. The user may navigate through this visual representation, as described above, and then perform a search of files on the file system if desired. The user may then modify the local context and incorporate any changes back into the global context. Using the knowledge management system as an interface to the file system of an operating system such as Windows 98 allows the user to quickly find and organize a variety of information, when compared with the conventional method of storing files and shortcuts (links) to files in a fixed, hierarchical format.

The knowledge management system 10a has many other potential applications, such as the following. The knowledge management system 10a may be used as an interface to an electronic book, allowing multiple users to indicate which parts of the book are relevant to their tasks. The local contexts generated by users' activities may serve as electronic indices or tables of contents to be used by other users performing similar tasks.

The knowledge management system 10a may be used in conjunction with a personal information management system, such as a contact management system for a corporate sales force. In such an application, the knowledge management system 10a may be used to manage local contexts corresponding to particular sales tasks and to identify how different contacts are relevant to each other in the context of those tasks.

The knowledge management system 10a may be used to organize a customer support database in a computer technical support organization, to help customer support representatives store and make use of the knowledge they collectively develop about how different pieces of information are relevant to solving different technical problems.

The knowledge management system 10a may be implemented, for example, according to a client-server architecture in which a knowledge management server maintains a knowledge base and a client requests and receives knowledge from the knowledge management server. For example, the client may be implemented as a web browser applet written in a language such as Java for allowing a user or group of users to access a knowledge base via the World Wide Web. In general, the knowledge management system 10a may be used as a front end to any database or information store to develop domain-specific stores of knowledge. Alternatively, the knowledge management system 10a may be implemented as a client-side portal which serves as a door between the user and the information within and beyond the user's local computer.

As an example, consider a user whose task is to purchase an automobile. Assume for the purposes of illustration that the global context 100a corresponds to knowledge about automobiles that has been developed over time by a community of users interacting with the knowledge management system 10a. Examples of ways in which such knowledge may be developed are described in more detail below.

In one embodiment, the local context 104a may include: (1) textual nodes (e.g., documents accessible via an intranet or a database), (2) contextual nodes, and (3) relationships between the textual nodes and contextual nodes included within the local context. Such relationships may include relationships between textual nodes, relationships between contextual nodes, and relationships between textual nodes and contextual nodes.

Referring to FIG. 1B, the local context 104a may, for example, be represented as a local context graph 104b, including textual vertices corresponding to textual nodes and contextual vertices corresponding to contextual nodes. Edges between vertices in the local context graph 104b indicate relationships between the textual and/or contextual nodes corresponding to the vertices, as described below. The global context 100a may, for example, be represented as a global context graph 100b including contextual vertices corresponding to local contexts, in which edges between contextual vertices correspond to relationships between the corresponding local contexts.

Referring to FIG. 4A, a process 150, carried out by the task formulator 128, assists the user in generating the task formulation 126. The user begins his interaction with the system 10b, in pursuit of his task, by logging in to the system 10b. The user may, for example, enter a user name and password (step 150), thus allowing the knowledge management system 10b to identify the user and to identify any tasks that the user has previously performed using the system 10b. The knowledge management system 10b can use such information to help the user better formulate his task and, as a result, to help the user to obtain the knowledge that is most relevant to his task. For example, the knowledge management system 10b may store information about tasks previously performed by users in a data structure representing a user-task graph (described below with respect to FIG. 5) which includes a user vertex for each user who has previously used the system, a task vertex for each type of task previously performed using the system, and an edge between a user vertex and a task vertex if the corresponding user has performed the corresponding task. After the user enters his user name and password, the task formulator 128 may search the user-task graph for a user vertex corresponding to the user's name (step 154). If there is no user vertex in the user-task graph corresponding to the user's name (decision step 156), a user vertex corresponding to the user's name is added to the user-task graph (step 158) and the task formulator 128 requests that the user input at least one phrase descriptive of the task the user wishes to perform (step 160). The phrase may, for example, be in the form of a plain English statement or question, such as "I would like to purchase an automobile."

If there is a user vertex in the user-task graph corresponding to the user's name (decision step 156), the task formulator 128 presents the user with a list of workspaces in which the user previously performed tasks (step 162). The user may then submit an initial task formulation either by selecting one of the workspaces from the presented list or by entering a phrase corresponding to a new task to be performed (step 164).
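Steps 154 through 162 can be sketched as follows, assuming the user-task graph is held in memory as a mapping from each user vertex to the set of task vertices connected to it. The class and function names (UserTaskGraph, begin_session) are hypothetical, and the sketch treats each previously performed task as the label of a workspace, which is an assumption.

```python
class UserTaskGraph:
    """Bipartite graph of user vertices and task vertices (illustrative)."""

    def __init__(self):
        self._tasks_by_user = {}

    def has_user(self, user):
        return user in self._tasks_by_user

    def add_user(self, user):
        self._tasks_by_user.setdefault(user, set())

    def add_edge(self, user, task):
        # An edge records that the user has performed the task.
        self._tasks_by_user.setdefault(user, set()).add(task)

    def tasks_of(self, user):
        return sorted(self._tasks_by_user.get(user, ()))


def begin_session(graph, user):
    """Return the workspaces to present, or [] when a phrase must be requested."""
    if not graph.has_user(user):      # decision step 156
        graph.add_user(user)          # step 158
        return []                     # step 160: caller prompts for a phrase
    return graph.tasks_of(user)       # step 162: list previous workspaces
```

A returning user with no recorded tasks also receives an empty list in this sketch, so the caller would prompt that user for a phrase as well.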

The task formulator 128 generates an initial task formulation (step 166). The task formulation 126 may, for example, include nouns extracted from the phrase or phrases inputted or selected by the user in step 160 or 164. Nouns may be extracted by, for example, comparing words in the phrase or phrases with a standard dictionary of nouns.
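Noun extraction by comparison with a dictionary of nouns might look like the following minimal sketch. The tiny NOUNS set stands in for a standard dictionary of nouns and is purely illustrative, as is the function name extract_nouns.

```python
import re

# Toy stand-in for a standard dictionary of nouns (an assumption).
NOUNS = frozenset({"automobile", "car", "dealership", "massachusetts", "price"})


def extract_nouns(phrase, nouns=NOUNS):
    """Return the distinct nouns in the phrase, in order of first appearance."""
    found = []
    for word in re.findall(r"[A-Za-z]+", phrase.lower()):
        if word in nouns and word not in found:
            found.append(word)
    return found
```

With this toy dictionary, the phrase "I would like to purchase an automobile." yields the single noun "automobile".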

After generation of an initial task formulation, the knowledge management system 10b interacts with the user to further refine and develop the task formulation. For example, the task formulator 128 may suggest changes to the initial task formulation based on task formulations previously used by the user or by other users. The task formulator 128 may, for example, search the global context graph 100b for contextual vertices corresponding to the task formulation (step 170). If a contextual vertex sufficiently matching the task formulation cannot be found (step 172), the task formulator 128 may suggest modifications to the task formulation (step 174). For example, the task formulator 128 may suggest that particular words be added to the task formulation. If the user's initial task formulation is "I would like to purchase an automobile," the task formulator 128 may, for example, suggest that the user add the words "in Massachusetts" to the task formulation if a previous user used the system to assist in purchasing an automobile in Massachusetts. Such interactive formulation of the task can help the user to think more clearly about the task to be performed and help the user to take advantage of knowledge gained by other users that has been incorporated into the system.
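One way the suggestion of step 174 could work is sketched below. The sketch assumes that each task formulation is reduced to a set of terms, that similarity is measured with the Jaccard coefficient, and that a prior formulation qualifies when its similarity meets a threshold; the function name suggest_terms and the default threshold of 0.3 are hypothetical.

```python
def suggest_terms(initial_terms, prior_formulations, min_similarity=0.3):
    """Suggest terms found in sufficiently similar prior task formulations.

    initial_terms: terms of the user's initial task formulation.
    prior_formulations: iterable of term collections from earlier tasks.
    min_similarity: assumed threshold on the Jaccard similarity.
    """
    initial = set(initial_terms)
    suggestions = set()
    for prior in prior_formulations:
        prior = set(prior)
        union = initial | prior
        similarity = len(initial & prior) / len(union) if union else 0.0
        if similarity >= min_similarity:
            # Propose only the terms the user has not already included.
            suggestions |= prior - initial
    return suggestions
```

For an initial formulation containing only "automobile" and a prior formulation containing "automobile" and "massachusetts", the sketch proposes adding "massachusetts".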

The user may modify the task formulation according to the suggested modifications, according to the user's own desired changes, or a combination of both (step 176). If the user modifies the task formulation, the task formulation is again matched against the contextual vertices in the global context graph (step 170). The task formulation may thereby be interactively modified and improved.

Alternatively, if a contextual vertex sufficiently matching the task formulation cannot be found (step 172), the task formulator 128 may add a new contextual vertex to the global context graph 100 corresponding to the set of nouns contained within the task formulation. The task formulator 128 may suggest a location for such a new vertex in the global context graph 100. For example, the task formulator 128 may suggest that the new vertex be placed in proximity to contextual vertices corresponding to tasks which most closely match the task formulation. Contextual vertices added in this way may be permanently added to the global context graph 100 if the user's task completes successfully and if the local context graph 104b is subsequently reincorporated back into the global context graph 100b by the globalization process 106, as described in detail below.

If a contextual vertex in the global context graph 100 matching the task formulation is found (decision step 172), the found contextual vertex is selected as the "primary contextual vertex" (step 178). If the user declines to modify the task formulation after step 174 (step 180), then a contextual vertex corresponding to the task formulation is created and placed in the global context graph 100b (step 182). The contextual vertex created in step 182 is selected as the primary contextual vertex (step 183). Formulation of the user's task is now complete (step 184). The localization process 102b generates the local context graph 104b by selecting a subgraph of the global context graph 100b centered on the primary contextual vertex and extending outward from the primary contextual vertex by a predetermined number of edges (step 186). The predetermined number may, for example, be 2, in which case the local context graph 104b may include all vertices and edges within two edges of the primary contextual vertex. The local context graph 104b represents the part of the global context graph 100b that is most relevant to the user's task.
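The subgraph selection of step 186 can be sketched as a breadth-first search bounded by the predetermined number of edges. The sketch assumes the global context graph is available as an undirected adjacency mapping and that the local graph keeps every edge whose two endpoints both fall within the radius; the function name localize is hypothetical.

```python
from collections import deque


def localize(adjacency, primary, radius=2):
    """Return (vertices, edges) within `radius` edges of the primary vertex.

    adjacency: mapping of vertex -> iterable of neighboring vertices.
    Edges are returned as frozensets because the graph is treated as undirected.
    """
    distance = {primary: 0}
    queue = deque([primary])
    while queue:
        vertex = queue.popleft()
        if distance[vertex] == radius:
            continue  # do not expand beyond the predetermined radius
        for neighbor in adjacency.get(vertex, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[vertex] + 1
                queue.append(neighbor)
    vertices = set(distance)
    edges = {
        frozenset((u, v))
        for u in vertices
        for v in adjacency.get(u, ())
        if v in vertices
    }
    return vertices, edges
```

With a radius of 2, a vertex three edges from the primary contextual vertex is excluded, together with every edge that touches it.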

Assume for purposes of example that the user has settled on "I would like to purchase an automobile in Massachusetts" as a task formulation. If the knowledge management system has previously been used (either by the user or by other users) to perform the user's task, then the extracted local context corresponding to this task formulation may include knowledge gained by the system during such previous uses. For example, the relationship described above between a web site of a car dealership and a web site of the manufacturer of the cars sold by the car dealership may have been generated by a previous user who, while using the system to assist in the task of purchasing a car, browsed the two web sites and determined that they were relevant to each other within the context of purchasing a car. Knowledge gained by previous users about the relevancy of information and context to a particular task is thereby incorporated into the knowledge management system for future use. This interactive growth of knowledge enables the development of domain-specific and activity-specific knowledge by a community of users over time.

The localization process 102b generates a visual representation 200 (FIG. 7) of the local context graph 104b (step 188). Referring to FIG. 2, a browser 120 generates display information 124 corresponding to the two-dimensional representation of the local context graph 200, and presents the display information 124 to the user on a display device such as a computer monitor. Such a visual representation allows the user to quickly and intuitively grasp the nature of the knowledge that has been developed within the local context. Visual representations other than a graph, such as a list, table, or "file and folder"-type display, may be used to represent the local context graph.

FIG. 7 shows an exemplary two-dimensional representation of the local context graph 200 generated from the point of view of a primary contextual vertex C3. As shown in FIG. 7, the primary contextual vertex is not shown in the two-dimensional representation of the local context graph 200, because the two-dimensional representation of the local context graph 200 corresponds to a projection of the local context graph 104b onto the plane of the screen taken from the point of view of the primary contextual vertex. As further shown in FIG. 7, only those vertices in the global context graph 100b which are within the predetermined radius (i.e., two edges) of the primary vertex C3 are present in the two-dimensional representation of the local context graph 200. The user may explore the local context 104b to further the completion of the user's task.

For example, the user may submit navigation commands 122 to the browser 120 to navigate through the two-dimensional representation of the local context graph 200. For example, selecting a textual vertex, such as the textual vertex T2, causes the browser 120 to modify the display information 124 so that the document represented by the selected textual vertex is displayed on the screen. Selecting a contextual vertex causes the browser 120 to modify the display information 124 so that a two-dimensional representation of the local context graph corresponding to the selected contextual vertex is displayed on the screen. In other words, selecting a contextual vertex within the browser 120 allows the user to browse two-dimensional representations of local context graphs other than the two-dimensional representation of the local context graph corresponding to the primary contextual vertex.

When the two-dimensional representation of the local context graph 200 is first displayed on the screen, the browser 120 indicates a recommended path of navigation through the two-dimensional representation of the local context graph 200 from one textual vertex to another by, for example, highlighting the path. The user may choose to navigate the documents in this order or in any other order.

Referring to FIG. 8, a recommended path of navigation may be generated as follows. A counter variable I is initialized with a value of one, and a variable V representing a vertex is initialized to the primary contextual vertex (step 202). Textual vertices connected to vertex V in the local context graph and corresponding to documents containing the I-th word of the task formulation are identified (step 203). One of the found textual vertices is selected (step 204). A directed edge from vertex V to the selected textual vertex is generated, indicating that the preferred navigation path includes navigating from vertex V to the selected textual vertex (step 205). If I is not equal to the number of words in the task formulation (decision step 206), then the variable V is assigned to the selected textual vertex and the variable I is incremented (step 207), and control returns to step 203. Otherwise, generating of the recommended navigation path is complete (step 208), and the recommended navigation path may be displayed to the user, as described above, using the special directed edges just described.
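
The steps of FIG. 8 may be sketched in code as follows. This is only an illustrative sketch, not the implementation of the described system: the function name `recommended_path`, the dictionary-based graph representation, and the tie-breaking rule in step 204 (taking the first candidate in sorted order) are assumptions introduced here, as is the choice to stop early when no connected textual vertex contains the current word, a case the description does not address.

```python
from typing import Dict, List, Set, Tuple


def recommended_path(
    primary: str,
    task_words: List[str],
    neighbors: Dict[str, Set[str]],
    doc_words: Dict[str, Set[str]],
) -> List[Tuple[str, str]]:
    """Return the directed edges of a recommended navigation path.

    neighbors maps each vertex to the vertices it is connected to in the
    local context graph; doc_words maps each textual vertex to the words
    of its document. Both representations are hypothetical.
    """
    path: List[Tuple[str, str]] = []
    v = primary                                   # step 202: V starts at the primary vertex
    for word in task_words:                       # the counter I walks the task formulation
        # Step 203: textual vertices connected to V whose document has the I-th word.
        candidates = sorted(
            t for t in neighbors.get(v, set())
            if t in doc_words and word in doc_words[t]
        )
        if not candidates:
            break                                 # assumption: stop when nothing matches
        selected = candidates[0]                  # step 204 (assumed tie-break)
        path.append((v, selected))                # step 205: directed edge V -> selected
        v = selected                              # step 207: advance V
    return path                                   # step 208
```

Because the sketch advances V to the selected textual vertex at each iteration, the resulting edges always form a single chain that begins at the primary contextual vertex.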

As previously mentioned, a user may cause the browser 120 to generate display information 124 corresponding to a document by selecting a textual vertex in the two-dimensional representation of the local context graph 200 corresponding to the document. While a document is displayed on the screen, the user may select words within the document using, for example, a keyboard or a mouse. When the user selects words in the document, the browser 120 places the selected words within the selected terms document 134. As the user browses documents using user navigation commands 122, the browser 120 adds words selected by the user in any of the documents to the selected terms document 134. The browser 120 may also add additional information to the selected terms document when the user selects words from a document, such as (1) an author of the document, (2) a source of the document, (3) the date on which the document was created, (4) the date on which the words were selected, and (5) any additional information related to the selected terms or the document from which the selected terms were selected, such as notes taken by the user with respect to the document or a title supplied by the user to associate with the document or with notes taken by the user. When the user finishes browsing documents corresponding to textual vertices of the two-dimensional representation of the local context graph 200, the user indicates to the browser 120 that browsing is completed.

Referring to FIG. 9, the search language pre-processor 136 pre-processes the selected terms document 134 to generate the proto-task document 138. To describe this process, the "search language vocabulary" is first described. The search language vocabulary is a set S of words selected by users during browsing of the two-dimensional representation of the local context graph 200, words selected by users from found texts 112, and words selected by the system 10 from documents it has pre-processed. The words in the search language vocabulary are organized into n possibly intersecting sub-sets S(0), S(1), S(2), ..., S(n), referred to as "semantic groups." Semantic groups consist of elements which may be either single words or several words if the several words form a single term. The words in the search language vocabulary may, for example, be limited to nouns (including proper names and addresses), in which case the semantic group S(0) may be reserved for addresses (including addresses of external search engines).

It is possible that the user completes his task merely by exploring the local context corresponding to the task. In such a case, the user indicates that his task is completed, and he logs out of the system. He may then use the task document 138 to further pursue his task. However, in some cases the user desires to obtain additional knowledge beyond that which is incorporated within the local context corresponding to the user's task. In such a case, the user may use the knowledge management system to obtain information from external information sources (e.g., web sites, search engines, the user's hard disk drive), examine such information, and incorporate such information into the task document. Furthermore, the user may incorporate information obtained from external information sources into the local context corresponding to the user's task. The local context thus gains knowledge from the user's activity, and such knowledge may be used by the same user or by other users in subsequent sessions.

For example, referring to FIG. 9, words selected by the user during browsing of the two-dimensional representation of the local context graph 200 (i.e., the words in the selected terms document), words selected by the user from found texts 112, and words selected by the system 10 from documents it has pre-processed are placed into a temporary semantic group, which is temporarily added to the set of existing semantic groups (step 210). This temporary semantic group is used, in future sessions, to form semantic clouds in the same way as permanent semantic groups are used to form semantic clouds, as described in more detail below. In this way, the temporary semantic group is used to update and restructure the search language vocabulary.

The search and enrichment module 108 selects a semantic group which contains a word of the element of the domain vocabulary 132 corresponding to the task formulation 126 (step 212). Semantic groups selected in this way are referred to as "special semantic groups." The search language pre-processor 136 also selects additional semantic groups which have any non-empty intersection with a special semantic group (step 214). These additionally selected semantic groups are referred to as "semi-special semantic groups." The search language pre-processor 136 takes the intersection of the selected terms document with the special semantic groups and the semi-special semantic groups (step 216). The groups of words that result from these intersections are referred to as semantic clouds. The proto-task document is then generated (step 218). The proto-task document includes (1) the user's user ID, (2) the task formulation 126, and (3) the semantic clouds generated as described above. The proto-task document may also, for example, include other information from the selected terms document, such as the author, source, and date information described above. Referring to FIG. 10, the semantic clouds are presented graphically to the user using a graphical display 220. Semantic clouds having common terms may be displayed as intersecting circles or clouds in such a way that the common terms are located within the area of intersection.
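
The formation of semantic clouds in steps 212 through 216 reduces to a few set operations, which may be sketched as follows. The function name `semantic_clouds`, the use of Python sets keyed by group name, and the decision to omit empty intersections from the result are assumptions made for illustration; the description does not prescribe any of them.

```python
from typing import Dict, Set


def semantic_clouds(
    selected_terms: Set[str],
    semantic_groups: Dict[str, Set[str]],
    task_element_words: Set[str],
) -> Dict[str, Set[str]]:
    """Intersect the selected terms with special and semi-special groups."""
    # Step 212: special groups contain a word of the domain vocabulary
    # element that corresponds to the task formulation.
    special = {
        name for name, group in semantic_groups.items()
        if group & task_element_words
    }
    # Step 214: semi-special groups intersect at least one special group.
    semi_special = {
        name for name, group in semantic_groups.items()
        if name not in special
        and any(group & semantic_groups[s] for s in special)
    }
    # Step 216: each cloud is the intersection of the selected terms
    # document with one special or semi-special group.
    clouds: Dict[str, Set[str]] = {}
    for name in sorted(special | semi_special):
        cloud = selected_terms & semantic_groups[name]
        if cloud:                     # assumption: empty clouds are dropped
            clouds[name] = cloud
    return clouds
```

Two clouds produced this way share a term exactly when the term belongs to both underlying semantic groups, which is the overlap that the intersecting circles of FIG. 10 are described as depicting.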

The user may select a graphical representation of one of the semantic clouds using an input device such as a mouse, causing the search and enrichment module 108 to launch a search of the information sources 110 based on the words contained within the selected semantic cloud. The result of the search is a set of found texts 112.

The knowledge management system pre-processes the found texts 112. The pre-processing performs two functions: (1) to prioritize the found texts 112 based on their relevance to the user's task, and (2) to identify those parts of the found texts 112 that are most relevant to the user's task. For example, referring to FIG. 11, the search and enrichment module 108 pre-processes and prioritizes the found texts 112 as follows. The search and enrichment module 108 selects the semantic cloud which was used to launch the search which produced the found texts 112 (step 230). For each of the found texts 112, the search and enrichment module 108 identifies the locations of words from the selected semantic cloud which appear in the found text (step 232). The editing system 114 displays a window on the screen for each of the found texts 112 (step 234). The window for a found text displays segments of the found text surrounding the words in the found text which are in the selected semantic cloud. For example, the window may display three lines of text above and below each matching word. If a word which does not belong to the search language vocabulary appears frequently (e.g., at least twice) within the found text, the word is added to the temporary semantic group. If a word which belongs to a special semantic group appears frequently in the found text, the word is added to the temporary semantic group.

Words within a found text which occur within a predetermined distance (measured in, e.g., number of sentences) of words in the selected semantic cloud with at least a predetermined frequency are added to the temporary semantic group. For example, the predetermined distance may be two sentences and the predetermined frequency may be three, in which case any word within a found text which occurs at least three times within two sentences of a word in the selected semantic cloud is added to the temporary semantic group.
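
The proximity rule just described may be sketched as follows. The sketch assumes a naive sentence splitter and tokenizer, measures distance as the difference between sentence positions, and performs no noun filtering, so in practice common function words would also need to be excluded. The function names are hypothetical and are not part of the described system.

```python
import re
from collections import Counter
from typing import List, Set


def split_sentences(text: str) -> List[List[str]]:
    """Naive splitter: sentences end at '.', '!' or '?' (an assumption)."""
    raw = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [re.findall(r"[a-z]+", s.lower()) for s in raw]


def proximity_candidates(
    text: str,
    cloud: Set[str],
    max_distance: int = 2,
    min_frequency: int = 3,
) -> Set[str]:
    """Words occurring at least min_frequency times near cloud words."""
    sentences = split_sentences(text)
    # Positions of sentences that contain at least one cloud word.
    cloud_positions = [i for i, words in enumerate(sentences) if cloud & set(words)]
    counts: Counter = Counter()
    for i, words in enumerate(sentences):
        # Count only occurrences within max_distance sentences of a cloud word.
        if any(abs(i - j) <= max_distance for j in cloud_positions):
            counts.update(w for w in words if w not in cloud)
    return {w for w, n in counts.items() if n >= min_frequency}
```

With the example values from the text (a distance of two sentences and a frequency of three), a word that is frequent in the found text but never appears near a cloud word is not proposed for the temporary semantic group.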

The user may then browse and edit the results of the search. For example, the user may use the editing system 114 to edit the found texts 112 and the proto-task document and to generate additional created texts 118 (step 236). The proto-task document may be amended, for example, with additional text or other information representing proposed solutions to the user's task. Referring to FIG. 12, the editing system 114 displays a window 250 which displays the task document 138, the two-dimensional representation of the local context graph 200, and documents 252a-c from which the user previously selected words while reading the found texts 112. The window 250 includes graphical links 254a-c from words in the proto-task document 138 to the documents 252a-c from which the words were selected. The window 250 also includes graphical links 254d and 254f from documents 252a and 252c to textual vertices T7 and T1 in the two-dimensional representation of the local context graph 200 corresponding to the documents. The two-dimensional representation of the local context graph 200 also includes a "virtual" textual node T, corresponding to a document retrieved from the external information sources 110, but which has not yet been permanently added to the two-dimensional representation of the local context graph 200. A link 254e connects document 252b, which corresponds to the virtual textual node T, to the virtual textual node T. When the user is finished using the editing system 114 and decides that the task has been accomplished, revision of the proto-task document is complete, and the proto-task document is subsequently referred to as the "task document" (step 238).

The search and enrichment module 108 generates a diagram representing the words identified in step 232 and the distances between them (step 240). More specifically, each vertex in the diagram corresponds to an occurrence of an identified word in the found text. The distance between two vertices representing a first word and a second word, respectively, corresponds to the number of sentences separating the first word and the second word in the found text. The distance may be zero. An edge connecting two vertices is generated if the distance between the words corresponding to the vertices is less than or equal to two.
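
The diagram construction of step 240 may be illustrated with the following sketch. It assumes that a found text has already been split into sentences of tokens, and that the "number of sentences separating" two occurrences is the absolute difference of their sentence positions (so that two words in the same sentence are at distance zero). The name `build_diagram` and the tuple-based vertex representation are hypothetical.

```python
from itertools import combinations
from typing import List, Set, Tuple

Vertex = Tuple[str, int]          # (word, index of the sentence containing it)


def build_diagram(
    sentences: List[List[str]],
    cloud: Set[str],
    max_distance: int = 2,
) -> Tuple[List[Vertex], Set[Tuple[int, int]]]:
    """Build a diagram: one vertex per occurrence of a cloud word."""
    vertices: List[Vertex] = [
        (word, i)
        for i, words in enumerate(sentences)
        for word in words
        if word in cloud
    ]
    # An edge joins two occurrences separated by at most max_distance sentences.
    edges = {
        (a, b)
        for a, b in combinations(range(len(vertices)), 2)
        if abs(vertices[a][1] - vertices[b][1]) <= max_distance
    }
    return vertices, edges
```

Edges are stored as pairs of vertex positions rather than pairs of words, because the same word may occur several times and each occurrence is a separate vertex.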

The search and enrichment module 108 pre-processes the proto-task document as soon as the document contains a predetermined number of coherent sentences. The proto-task document may contain words from one or more searches of external information sources 110 resulting from the user's selection of one or more semantic clouds. For each such semantic cloud, the search and enrichment module 108 identifies the locations of words from the semantic cloud which appear in the proto-task document (step 242) and generates a diagram of the words in the same way as described above with respect to step 240 (step 244). As a result, one or more diagrams of the proto-task document are created, each such diagram corresponding to words generated from a search based on a particular semantic cloud. The diagrams generated as described above are used to prioritize found documents 112 in order of the diagrams' similarities to diagrams of the proto-task document (step 246). The degree of similarity between two diagrams may be determined on the basis of, for example, similarity of the diagrams' shapes, similarity of the number of vertices in the diagrams, similarities of the words to which vertices in the diagrams correspond, and combinations thereof. For example, the search and enrichment module 108 may first determine whether two diagrams have the same geometric shape (e.g., square or triangle). If the two diagrams have the same geometric shape, the search and enrichment module 108 compares the words corresponding to the vertices of the diagrams. If the words corresponding to the vertices of the diagrams have more than a predetermined degree of similarity, then the two diagrams are considered similar to each other.
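
One possible reading of the two-stage comparison is sketched below. The description does not define "geometric shape" or the word-similarity measure precisely, so both are stand-ins chosen here for illustration: shape is approximated by the sorted degree sequence of the diagram (a triangle and a square differ under this proxy, but it is not a full isomorphism test), and word similarity is the Jaccard index of the two vertex-word sets. All names and the default threshold are hypothetical.

```python
from typing import List, Set, Tuple

Diagram = Tuple[List[Tuple[str, int]], Set[Tuple[int, int]]]


def shape(diagram: Diagram) -> Tuple[int, ...]:
    """Proxy for geometric shape: the sorted degree sequence (an assumption)."""
    vertices, edges = diagram
    degree = [0] * len(vertices)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return tuple(sorted(degree))


def similar(d1: Diagram, d2: Diagram, threshold: float = 0.5) -> bool:
    """Compare shapes first, then the words attached to the vertices."""
    if shape(d1) != shape(d2):
        return False
    words1 = {word for word, _ in d1[0]}
    words2 = {word for word, _ in d2[0]}
    union = words1 | words2
    if not union:
        return True                   # two empty diagrams are trivially alike
    # "More than a predetermined degree of similarity": strict inequality.
    return len(words1 & words2) / len(union) > threshold
```

A ranking of the found texts could then order them by how many of their diagrams are similar to a diagram of the proto-task document, although the description leaves the exact ordering rule open.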

The user may modify the local context to incorporate the knowledge gained by the user. The user may, for example, add references in the local context to information that the user considers relevant to the user's task. For example, if the local context is represented as a graph, the user may add to the local context a textual vertex corresponding to a document that the user considers relevant to the user's task. Similarly, the user may add a contextual vertex to the local context corresponding to another local context that the user considers to be relevant to the user's task. The user may also add, delete, or modify relations (which, in the case of a graph, are represented by edges) in the local context graph. For example, the user may add an edge between two textual vertices in the local context graph to indicate that the two documents represented by the two textual vertices are related to each other within the context of the user's task. For example, in one embodiment, for each of the found texts 112 and created texts 118, the search and enrichment module 108 determines a recommended location in the local context graph 104a at which to place a textual vertex corresponding to the text (step 248). The system 10b may, for example, recommend that a textual vertex corresponding to a found text be placed near textual vertices corresponding to other documents whose diagrams are similar to that of the found text. The search and enrichment module 108 also recommends placement of edges connecting new textual vertices to each other or to existing vertices in the local context graph 104b. For example, the search and enrichment module may suggest that edges be created connecting textual vertices corresponding to the found texts to the primary contextual vertex. The system similarly recommends a location for a textual vertex corresponding to the task document (step 250).

Alternatively, the search and enrichment module 108 may suggest a recommended location in the local context graph 104b at which to place a textual vertex corresponding to a document retrieved as a result of a search based on a semantic cloud as follows. As previously mentioned, the user navigates the two-dimensional representation of the local context graph 200 using the browser 120. The textual vertices visited by the user while browsing the two-dimensional representation of the local context graph 200 and the edges connecting those textual vertices form a trajectory. The search and enrichment module 108 may suggest that a textual vertex corresponding to the document be connected to a "semantic segment" of the trajectory navigated by the user. A semantic segment may initially be selected as a sub-trajectory (of the trajectory navigated by the user) which contains all of the textual loci from which terms were selected by the user and which led to the generation of the semantic cloud. Vertices and edges may be removed from the semantic segment until the semantic segment contains only one vertex corresponding to each term in the semantic cloud.

If the vertices of the diagram of the document to be added to the local context graph 104a are in a one-to-one correspondence with the vertices of the semantic segment, and the edges of the diagram of the document to be added to the local context graph 104a are in a one-to-one correspondence with the edges of the semantic segment, then the diagram has a complete geometric affinity with the semantic segment. If the diagram has a complete geometric affinity with the semantic segment, and the vertices of the diagram correspond to the words selected from the documents corresponding to the vertices of the semantic segment, selection of the semantic segment is complete. If the diagram does not have a complete geometric affinity with the semantic segment, the semantic segment may be modified or expanded until the diagram has a complete geometric affinity with the semantic segment. For example, the semantic segment may be expanded to include all vertices and edges in the global context graph 100 that are within one edge of the existing vertices in the semantic segment. If the semantic segment may be modified or expanded to have a complete geometric affinity with the diagram, then selection of the semantic segment is complete.

The search and enrichment module's recommendations are displayed to the user through the graphical user interface of the two-dimensional representation of the local context graph 200. For example, referring to FIG. 13, the search and enrichment module 108 recommends that a textual vertex T8 corresponding to one of the found texts 112 and textual vertex T9 corresponding to one of the created texts 118 be placed at the indicated locations and connected to each other by an edge. Furthermore, the search and enrichment module 108 recommends that a textual vertex TD corresponding to the task document be placed at the indicated location. If the search and enrichment module 108 generates a recommended placement by selecting a semantic segment, the semantic segment is graphically presented to the user by, for example, highlighting the vertices and edges of the semantic segment in the graphical display of the two-dimensional representation of the local context graph 200. The search and enrichment module 108 suggests to the user that a textual vertex corresponding to the document be placed near the semantic segment and that the textual vertex be connected by an edge to one or more of the vertices of the semantic segment.

The user may accept the recommended locations of the textual vertices and edges or place the textual vertices and any desired edges manually through the browser 120 using user navigation commands 122 (step 252). The user may connect textual vertices to other textual vertices or to contextual vertices using the browser 120.

The local context is incorporated back into the global context to reflect any changes that have been made to the local context. Returning to the example of the car purchasing local context, any changes that are made to the car purchasing local context may be viewed by other users, and may be used by the knowledge management system to help other users perform such functions as formulating a task and performing a search of external information sources. Furthermore, changes to a local context may be rejected by other users. Any users who reject a modified local context continue to work with the unmodified version of the local context. For example, in one embodiment, when the user has completed updating the local context graph 104a using the graphical user interface of the browser 120, the globalization process 106 merges the updated local context graph 104a into the global context graph 100. The user's session is now complete. Other users are informed of changes caused by modification of the global context graph which potentially affect those users' current contexts. These other users may accept or reject such potential changes to their current contexts.

The search language vocabulary is updated using the temporary semantic group as follows. If, in the course of subsequent sessions, a search based on the semantic cloud formed from the temporary semantic group is fully successful, i.e., the search results in modification of the local context graph 104a and in subsequent modification of the global context graph 100, then the temporary semantic group becomes a permanent semantic group. If, in the course of subsequent sessions, the semantic cloud formed from the temporary semantic group does not lead to a search that results in modification of the global context graph 100, then the temporary semantic group is deleted. If, in the course of subsequent sessions, the semantic cloud formed from the temporary semantic group leads only to partially successful searches, the elements of the temporary semantic group are distributed among permanent semantic groups that overlap with the temporary semantic group. A search may be considered partially successful if, for example, the search leads to the retrieval of documents which are used in the creation of the task document (and, as a result, links in the task document to such documents are created and preserved), but no new textual vertices corresponding to the documents are added to the local context graph 104a.
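
The three outcomes for a temporary semantic group may be summarized in a small decision function. The sketch below assumes that the results of subsequent sessions have already been reduced to three boolean flags; the flag names, the `Outcome` enumeration, and the order in which the conditions are tested are assumptions made for illustration.

```python
from enum import Enum


class Outcome(Enum):
    PROMOTE = "becomes a permanent semantic group"
    DISTRIBUTE = "elements distributed among overlapping permanent groups"
    DELETE = "temporary semantic group deleted"


def resolve_temporary_group(
    modified_local_graph: bool,
    modified_global_graph: bool,
    used_in_task_document: bool,
) -> Outcome:
    """Decide the fate of a temporary semantic group after later sessions."""
    # Fully successful: the search changed the local graph and, in turn,
    # the global graph.
    if modified_local_graph and modified_global_graph:
        return Outcome.PROMOTE
    # Partially successful: retrieved documents fed the task document,
    # but no new textual vertices were added.
    if used_in_task_document:
        return Outcome.DISTRIBUTE
    # Otherwise the group led to no lasting change.
    return Outcome.DELETE
```

Testing for partial success before deletion matters, because a partially successful search also leaves the global context graph unmodified.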

The elements of the temporary semantic group may be distributed among permanent semantic groups that overlap with the temporary semantic group as follows. The "distance" from a word W in a document to a subset S of words in the document may be defined as the minimum of the distances from (i.e., the number of sentences between) the word W to each word in the subset S. The "distance" from a word in a document to a semantic group which does not contain the word may be defined as the distance between the word and the intersection of the document with the semantic group. The "relative distance" between a word and a semantic group with respect to a semantic cloud may be defined as the average of the distances between the word and the semantic group in all of the documents which were retrieved by a search based on the semantic cloud and which contain the word. The "absolute distance" between a word and a semantic group may be defined as the minimum of relative distances between the word and the semantic group over all semantic clouds used to generate searches in a particular session. In the case of a partially successful search, all of the above-defined distances are calculated, and each word in the temporary semantic group which is not contained within the search language vocabulary is inserted into the permanent semantic group which has the least absolute distance from the word. Data structures used in the system shown in FIGS. 1A-C are now described.
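
Before turning to the data structures, the distance definitions used to redistribute a temporary semantic group may be made concrete with the following sketch. It assumes that each retrieved document is available as a list of tokenized sentences, that the distance between two occurrences is the absolute difference of their sentence positions, and that documents in which the word or the group does not occur are skipped when averaging. The function names are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Optional, Set

Document = List[List[str]]        # a document as sentences of tokens


def _positions(doc: Document, words: Set[str]) -> List[int]:
    return [i for i, sentence in enumerate(doc) if words & set(sentence)]


def distance_to_group(doc: Document, word: str, group: Set[str]) -> Optional[int]:
    """Minimum sentence distance from the word to any group word in doc."""
    word_pos = _positions(doc, {word})
    group_pos = _positions(doc, group - {word})
    if not word_pos or not group_pos:
        return None
    return min(abs(i - j) for i in word_pos for j in group_pos)


def relative_distance(docs: List[Document], word: str, group: Set[str]) -> Optional[float]:
    """Average distance over the documents retrieved for one semantic cloud."""
    found = [d for d in (distance_to_group(doc, word, group) for doc in docs) if d is not None]
    return mean(found) if found else None


def absolute_distance(by_cloud: Dict[str, List[Document]], word: str, group: Set[str]) -> Optional[float]:
    """Minimum relative distance over all clouds used in the session."""
    found = [r for r in (relative_distance(docs, word, group) for docs in by_cloud.values()) if r is not None]
    return min(found) if found else None


def nearest_group(by_cloud: Dict[str, List[Document]], word: str, groups: Dict[str, Set[str]]) -> Optional[str]:
    """Name of the permanent group with the least absolute distance."""
    scored = {name: absolute_distance(by_cloud, word, g) for name, g in groups.items()}
    scored = {name: d for name, d in scored.items() if d is not None}
    return min(scored, key=lambda name: scored[name]) if scored else None
```

A word that never co-occurs with any permanent group in the retrieved documents yields no assignment in this sketch; the description does not say how such a word should be handled.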

For example, referring to FIG. 5, an embodiment of the user-task graph is now described in more detail. The user-task graph contains two kinds of vertices: (1) user vertices, which correspond to the names of users who have previously used the system 10b, and (2) task vertices corresponding to tasks which have previously been performed by users. If a user has previously performed a particular task, then the user's user vertex in the user-task graph is connected by an edge to the task vertex corresponding to the task. For example, FIG. 5 shows a two-dimensional representation 190 of a user-task graph. The user-task graph includes four user vertices (U₁, U₂, U₃, and U₄) corresponding to four users who have used the system 10a, and four task vertices (T₁, T₂, T₃, and T₄) corresponding to the four tasks that have been performed by the users. Edges connect each user vertex representing a user to the task vertices representing the tasks previously performed by the user. For example, the user vertex U₁ is connected by edges to task vertices T₁, T₃, and T₄, indicating that the user corresponding to user vertex U₁ has previously performed the tasks corresponding to task vertices T₁, T₃, and T₄.

FIG. 6A shows a data structure 600 that may be used to implement the global context 100a. Although the data structure 600 is shown and described herein as an example, any data structure that includes information descriptive of the nodes contained within the global context 100a and the relationships among those nodes may be used to implement the global context 100a. As shown in FIG. 6A, the data structure 600 is represented as a table. The information stored in the entries of the table indicates, for each contextual node in the global context represented by the data structure: (1) which contextual and/or textual nodes are contained within the contextual node, and (2) what relationships exist among the nodes contained within the contextual node. For example, consider the first entry in the data structure 600. The "Contextual Node" field of the first entry contains a value of C1, indicating that the first entry refers to a contextual node C1. The value stored in the "Contextual Node" field may take any form that uniquely identifies a contextual node. The "Node 1" and "Node 2" fields of the first entry contain values of C2 and T1, respectively, indicating that the contextual node C1 includes a relationship between contextual node C2 and textual node T1. Similarly, the second entry indicates that contextual node C1 includes a relationship between contextual node C3 and contextual node C2. More generally, a relationship within a contextual node between a first node and a second node is represented in the data structure 600 by an entry having a "Contextual Node" field identifying the contextual node, a "Node 1" field identifying the first node, and a "Node 2" field identifying the second node. The first and second nodes may each be either a textual node or a contextual node.

As described above, a contextual node can include a link from itself to a node that does not have any relationship to any other node within the contextual node. Such a link is represented within the data structure 600 as shown in the third entry of the data structure 600. The contextual node C1 includes a link to contextual node C4, and contextual node C4 does not have a relationship to any other node within the contextual node C1. This is represented in the data structure 600 by an entry having a "Contextual Node" field with a value of C1, a "Node 1" field with a value of C4, and a "Node 2" field having a null value.
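
The three entries discussed above may be written out as a small table in code. The sketch below is one possible encoding only: the `Entry` tuple, the use of `None` for the null "Node 2" value, and the two helper functions are assumptions introduced here, and the description notes that any structure capturing the same nodes and relationships may be used.

```python
from typing import List, NamedTuple, Optional, Set, Tuple


class Entry(NamedTuple):
    contextual_node: str          # "Contextual Node" field
    node1: str                    # "Node 1" field
    node2: Optional[str]          # "Node 2" field; None models the null value


# The first three entries of the data structure 600 as described for FIG. 6A.
TABLE: List[Entry] = [
    Entry("C1", "C2", "T1"),      # C1 relates contextual node C2 and textual node T1
    Entry("C1", "C3", "C2"),      # C1 relates contextual nodes C3 and C2
    Entry("C1", "C4", None),      # C1 links to C4, which relates to no other node
]


def nodes_in(table: List[Entry], contextual_node: str) -> Set[str]:
    """All nodes contained within the given contextual node."""
    found: Set[str] = set()
    for entry in table:
        if entry.contextual_node == contextual_node:
            found.add(entry.node1)
            if entry.node2 is not None:
                found.add(entry.node2)
    return found


def relationships_in(table: List[Entry], contextual_node: str) -> List[Tuple[str, str]]:
    """Pairs of nodes related to each other within the contextual node."""
    return [
        (e.node1, e.node2)
        for e in table
        if e.contextual_node == contextual_node and e.node2 is not None
    ]
```

Keeping the contextual node in every entry means the same pair of nodes can be related in one context and unrelated in another, which is how the table separates information from the contexts in which it is relevant.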

Identifiers of contextual and textual nodes (e.g., C1, T1) in the data structure 600 may take the form of any unique identifier. For example, references in the data structure 600 to a node may be a pointer to a data structure representing the node or may be an integer index into a table of nodes. The data structure 600 may also include additional fields. For example, the data structure 600 may include fields indicating: (1) a degree of relevance indicated by a relationship between two nodes, (2) a date or time on which the relationship was created, (3) an author of the relationship, (4) a strength of the relationship, or (5) a direction of the relationship (e.g., whether the relationship is bi-directional).

FIG. 6B shows a data structure which may be used to implement the user-task graph. The data structure is an instance of a class named UserTaskGraph. The UserTaskGraph instance includes one object of the class Edge for each edge in the user-task graph. Each Edge object includes the following fields. An integer IndexEdge field contains the index of the Edge object, i.e., the position of the Edge object within the set of Edges in the UserTaskGraph object. An integer IndexNameUser stores an index into a list of user names maintained by the system 10a, where the list may be defined as an array. The user name at index IndexNameUser of the list of user names is the user name corresponding to one of the vertices of the edge represented by the Edge object. Similarly, an integer IndexElementDomVoc stores an index into the domain vocabulary 132. The domain vocabulary element at index IndexElementDomVoc is the domain vocabulary element corresponding to the other vertex of the edge represented by the Edge object. The user "enters" the system 10a through an edge of the user-task graph if the first and second vertex of the edge correspond to the name of the user and the task selected by the user, respectively.
Various statistics may optionally be maintained by the system 10a relating to user entrances into the system. Such statistics may be used, for example, to evaluate the ways in which the system 10a is being used by different users and to thereby evaluate the relevancy of information contained in the system 10a to different users. For example, an integer array field NumbOfEntr may be used to store the number of entrances to the system 10a as of the end of sequentially numbered user sessions. For example, assume that the user begins his first session by entering the system 10a through an edge A. At the end of this first session, NumbOfEntr[0] = 1 for the Edge object corresponding to edge A, and NumbOfEntr[0] = 0 for all other edges in the user-task graph. If the user enters the system 10a through an edge B to begin his second session, then NumbOfEntr[1] = 1 for the Edge object corresponding to Edge A, NumbOfEntr[1] = 1 for the Edge object corresponding to Edge B, and NumbOfEntr[1] = 0 for all other edges in the user-task graph. If the user enters the system 10a through an edge A to begin his third session, then NumbOfEntr[2] = 2 for the Edge object corresponding to Edge A, NumbOfEntr[2] = 1 for the Edge object corresponding to Edge B, and NumbOfEntr[2] = 0 for all other edges in the user-task graph. In other words, the value of NumbOfEntr[n] for an Edge object is equal to the number of times the user has entered the system 10a through the edge corresponding to the Edge object as of the end of session number n (where the first session is session number zero). An integer field NumbOfEntr stores the total number of entrances to the system through the Edge. A date field DateOfLastEntr stores the date of the last entrance through the Edge, and a date field DateOfFirstEntr stores the date of the first entrance through the Edge. A floating point Weight field stores a "frequency" of entrances through the Edge, and is equal to NumbOfEntr/(DateOfLastEntr - DateOfFirstEntr).
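
The Edge fields and the entrance bookkeeping described above may be sketched as follows. The snake_case attribute names mirror the fields named in the text, but the `close_session` method, the measurement of the date difference in days, and the choice to return no weight when the first and last entrance fall on the same day (where the stated formula would divide by zero) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Edge:
    index_edge: int                       # IndexEdge
    index_name_user: int                  # IndexNameUser
    index_element_dom_voc: int            # IndexElementDomVoc
    numb_of_entr: List[int] = field(default_factory=list)   # cumulative count per session
    date_of_first_entr: Optional[date] = None
    date_of_last_entr: Optional[date] = None

    def close_session(self, entered: bool, day: date) -> None:
        """Record the cumulative entrance count at the end of a session."""
        previous = self.numb_of_entr[-1] if self.numb_of_entr else 0
        self.numb_of_entr.append(previous + (1 if entered else 0))
        if entered:
            if self.date_of_first_entr is None:
                self.date_of_first_entr = day
            self.date_of_last_entr = day

    @property
    def total_entrances(self) -> int:
        return self.numb_of_entr[-1] if self.numb_of_entr else 0

    def weight(self) -> Optional[float]:
        """Entrances divided by the span between first and last entrance."""
        if self.date_of_first_entr is None or self.date_of_last_entr is None:
            return None
        days = (self.date_of_last_entr - self.date_of_first_entr).days
        if days == 0:
            return None                   # assumption: formula undefined here
        return self.total_entrances / days
```

Replaying the three sessions of the example (edge A, then edge B, then edge A again) against two such objects reproduces the NumbOfEntr values listed in the text.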
The UserTaskGraph object also includes an instance of a Statistics object, which may be used to generate statistics such as measures of knowledge and innovation, as described above. The Statistics object contains the following fields. A floating point array field WeightName stores an "average frequency" of the UserTaskGraph. This is computed as the arithmetic mean of the frequencies of Edges in the UserTaskGraph having user vertices corresponding to the same user name:

$$\mathrm{WeightName}[i] = \frac{1}{k_{\max}} \sum_{k=1}^{k_{\max}} \mathrm{Weight}[i,k]$$

where i is an index corresponding to the i-th user's name, k is an index corresponding to the k-th task vertex in the UserTaskGraph, and k_max is the number of task vertices connected by edges to the i-th user vertex (i.e., the number of task vertices through which the i-th user has entered the system). A date array field EfficientDateName has one element for each user name, where EfficientDateName[i], for i corresponding to the i-th user, stores the date of the i-th user's most recent entrance through an edge whose frequency is greater than or equal to the average frequency.
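The average frequency and the EfficientDateName value may, for example, be computed as in the following sketch. The input layout (a list of frequency and last-entrance-date pairs for one user) and the function names are assumptions made for illustration only.

```python
from datetime import date
from statistics import mean
from typing import List, Optional, Tuple

# Hypothetical input: one (Weight, DateOfLastEntr) pair per task vertex
# through which a given user has entered the system.
EdgeStat = Tuple[float, date]


def weight_name(edges: List[EdgeStat]) -> float:
    """Arithmetic mean of the edge frequencies for one user (WeightName[i])."""
    return mean(w for w, _ in edges)


def efficient_date_name(edges: List[EdgeStat]) -> Optional[date]:
    """Most recent entrance through an edge at or above the average frequency."""
    avg = weight_name(edges)
    dates = [d for w, d in edges if w >= avg]
    return max(dates) if dates else None


alice = [
    (0.5, date(1998, 6, 5)),
    (0.125, date(1998, 6, 9)),   # most recent overall, but below the average
    (0.75, date(1998, 6, 2)),
]
print(weight_name(alice), efficient_date_name(alice))
```

In this sketch the entrance of June 9 is ignored because its edge frequency is below the mean, so the date selected is June 5.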

FIG. 6C illustrates a data structure which may be used to implement the proto-initiating document and the initiating document. The data structure is an instance of a class named InitiatingDocument. A string array field Text, which is initially empty, contains the text of the initiating document. A sub-object, of the class NounList, contains a list of nouns corresponding to the user's task. Each element of the list in the NounList object contains a string field Noun and an integer field Index. The Noun field contains the text of the noun and the Index field contains the index of the noun into the element of the domain vocabulary 132 corresponding to the initiating document. The InitiatingDocument object also contains a date object named Date containing the date on which the initiating document was created. A string UserName field contains the name of the user who wrote the initiating document. An integer IndexElementDomVoc contains the index into the domain vocabulary 132 of the domain vocabulary element corresponding to the initiating document. An integer array field SetNounMatchDomVoc contains indices into the TextID field of the nouns in the initiating document that match the corresponding element of the domain vocabulary.
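One possible in-memory layout for the InitiatingDocument object is sketched below. The helper build_noun_list, the use of -1 for a noun that is absent from the domain vocabulary element, and the simplified way in which the matching indices are derived are assumptions of this example; the description does not specify them.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class NounEntry:
    noun: str    # Noun: text of the noun
    index: int   # Index: position of the noun in the domain vocabulary element


@dataclass
class InitiatingDocument:
    user_name: str                                             # UserName
    created: date                                              # Date
    index_element_dom_voc: int                                 # IndexElementDomVoc
    text: List[str] = field(default_factory=list)              # Text, initially empty
    noun_list: List[NounEntry] = field(default_factory=list)   # NounList
    set_noun_match_dom_voc: List[int] = field(default_factory=list)


def build_noun_list(nouns: List[str], element: List[str]) -> List[NounEntry]:
    """Pair each task noun with its index in the element (-1 if absent)."""
    return [NounEntry(n, element.index(n) if n in element else -1) for n in nouns]


element = ["engine", "patent", "search"]   # hypothetical vocabulary element
doc = InitiatingDocument("user1", date(1998, 6, 10), index_element_dom_voc=0)
doc.noun_list = build_noun_list(["patent", "graph"], element)
# Simplification: record positions in the noun list of nouns found in the element.
doc.set_noun_match_dom_voc = [i for i, e in enumerate(doc.noun_list) if e.index >= 0]
print(doc.noun_list, doc.set_noun_match_dom_voc)
```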

FIG. 6D shows a data structure which may be used to represent the search language vocabulary. As shown in FIG. 6D, the data structure representing the search language vocabulary includes a sub-object of the class WordList for storing the list of words in the search language vocabulary and related information. The WordSLVoc field of the WordList sub-object is an array of character strings, each string corresponding to a word in the search language vocabulary. The order in which strings are stored in the WordSLVoc array does not necessarily correspond to the order of the corresponding words in the search language vocabulary. Rather, an array of integral indices IndexWordSLVoc stores indices into WordSLVoc of the words in the search language vocabulary; i.e., word i in the search language vocabulary can be found at WordSLVoc[IndexWordSLVoc[i]]. An integer field NumberWordSLVoc stores the total number of words in the search language vocabulary. An integer field NumberSemanticGroupsSLVoc stores the total number of semantic groups in the search language vocabulary.
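The indirection between the vocabulary order and the storage order may be illustrated as follows. The add method, which appends a word to storage and splices only the index array, is an assumed policy shown to suggest why such an indirection may be convenient; it is not described above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WordList:
    word_sl_voc: List[str]        # WordSLVoc: words in storage order
    index_word_sl_voc: List[int]  # IndexWordSLVoc: vocabulary order -> storage slot

    @property
    def number_word_sl_voc(self) -> int:
        """NumberWordSLVoc: total number of words in the vocabulary."""
        return len(self.index_word_sl_voc)

    def word(self, i: int) -> str:
        """Word i of the vocabulary: WordSLVoc[IndexWordSLVoc[i]]."""
        return self.word_sl_voc[self.index_word_sl_voc[i]]

    def add(self, word: str, position: int) -> None:
        """Append to storage, then splice only the index array (assumed policy)."""
        self.word_sl_voc.append(word)
        self.index_word_sl_voc.insert(position, len(self.word_sl_voc) - 1)


words = WordList(["search", "graph", "task"], [1, 0, 2])
words.add("node", 1)
print([words.word(i) for i in range(words.number_word_sl_voc)])
```

In this sketch, inserting a word into the middle of the vocabulary order leaves the stored strings where they are; only the integer index array changes.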

The data structure representing the search language vocabulary includes one or more sub-objects of the class SemanticGroup. Each SemanticGroup sub-object includes the following fields. An integer field IndexSemanticGroupSLVoc stores the index number of the SemanticGroup, i.e., a unique integer value indicating the position of the SemanticGroup object within the set of SemanticGroup objects. An integer array IndexWordSemanticGroupSLVoc contains indices into WordSLVoc (the list of words in the search language vocabulary) for each word in the semantic group, i.e., WordSLVoc[IndexWordSemanticGroupSLVoc[n]] is the n-th word in the semantic group. An integer field NumberWordSemanticGroupSLVoc stores the total number of words in the semantic group.

Referring to FIG. 6E, a textual node may be represented by a textual node data structure 610. The textual node data structure 610 shown in FIG. 6E is shown merely for purposes of example; textual nodes may be implemented in any way using a variety of data structures. As shown in FIG. 6E, the textual node data structure maintains a reference to an information object 612. The information object 612 may be any kind of information object, as described in more detail above. The textual node 610 also maintains a reference to any cached versions 614 of the information object 612 and a reference to any pre-processing information 616 generated by the browsing, editing, and searching module 105 as a result of pre-processing the information object 612, as described above. The textual node 610 includes a reference to highlighting information 618 that indicates which portion or portions, if any, of the information object 612 have been highlighted by the user. Furthermore, as described above, the textual node 610 includes methods 620 for performing operations on the information object 612 and notes 624 taken by the user about the information object 612. The textual node data structure 610 need not include references to all of the components 612-624; rather, the textual node data structure 610 may be implemented using any combination of the components 612-624. Furthermore, the textual node data structure 610 may be persistently stored along with any or all of the components 612-624.
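A minimal sketch of a textual node in which every component is optional is given below. The Python types chosen for each component, and in particular the representation of highlighting as character offsets, are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class TextualNode:
    # Every component defaults to "absent", reflecting that a textual node
    # may be implemented using any combination of the components.
    information_object: Optional[str] = None                      # 612, e.g. a URL or path
    cached_versions: List[bytes] = field(default_factory=list)    # 614
    preprocessing_info: Optional[dict] = None                     # 616
    highlighting: List[Tuple[int, int]] = field(default_factory=list)  # 618
    methods: Dict[str, Callable] = field(default_factory=dict)    # 620
    notes: List[str] = field(default_factory=list)                # 624

    def highlight(self, start: int, end: int) -> None:
        """Record a highlighted span as (start, end) offsets (assumed form)."""
        if start >= end:
            raise ValueError("a highlighted span must be non-empty")
        self.highlighting.append((start, end))


node = TextualNode(information_object="file:///docs/report.txt")
node.highlight(10, 42)
node.notes.append("Section 2 is relevant to the indexing task.")
print(node.highlighting, node.notes)
```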

Referring to FIG. 6F, a contextual node may be represented by a contextual node data structure 630. The contextual node data structure 630 shown in FIG. 6F is shown merely for purposes of example; contextual nodes may be implemented in any way using a variety of data structures. As shown in FIG. 6F, the contextual node data structure maintains a reference to a task articulation 632 which may, for example, represent the task formulation 126. The contextual node data structure 630 also maintains a reference to any semantic clouds 634 that were generated while performing the task associated with the task articulation 632. The contextual node data structure 630 maintains a reference to a type 636, which may, for example, be used to distinguish different types of contextual nodes from each other. For example, in one embodiment, nodes contained within a contextual node can be organized within the contextual node, such as by organizing the nodes into a hierarchy of contextual nodes containing other nodes. In this embodiment, the contextual nodes in the hierarchy that are used only for organizational purposes may be marked in the contextual node data structure 630 as being of a different type than other contextual nodes.

The contextual node data structure 630 maintains a reference to a task document 642 (e.g., the task document 138) associated with the task articulation 632. As described above, the contextual node data structure 630 maintains a reference to methods 640 that may be performed on the contextual node data structure 630 and a reference to project management information 638 related to the task described by the task articulation 632.
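The contextual node data structure, including the type used to mark nodes that serve only organizational purposes, may be sketched as follows. The enumeration values, the children field, and the traversal helper are assumptions made for this example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class NodeType(Enum):
    TASK = "task"                      # an ordinary contextual node
    ORGANIZATIONAL = "organizational"  # used only to group other nodes


@dataclass
class ContextualNode:
    task_articulation: str                                           # 632
    node_type: NodeType = NodeType.TASK                              # 636
    semantic_clouds: List[List[str]] = field(default_factory=list)   # 634
    project_info: dict = field(default_factory=dict)                 # 638
    task_document: Optional[str] = None                              # 642
    children: List["ContextualNode"] = field(default_factory=list)

    def task_nodes(self) -> List["ContextualNode"]:
        """Depth-first list of nodes that represent tasks, skipping grouping nodes."""
        found = [self] if self.node_type is NodeType.TASK else []
        for child in self.children:
            found.extend(child.task_nodes())
        return found


survey = ContextualNode("Survey indexing methods")
folder = ContextualNode("Background reading", NodeType.ORGANIZATIONAL, children=[survey])
root = ContextualNode("Design search module", children=[folder])
print([n.task_articulation for n in root.task_nodes()])
```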

Referring to FIG. 6G, the knowledge management system 10a may also store relationship information 644 representing a relationship between two nodes. Although as shown in FIG. 6G the relationship information 644 indicates a relationship between a textual node represented by the textual node data structure 610 and a contextual node represented by the contextual node data structure 630, the relationship information may represent a relationship between any combination of textual and/or contextual nodes. The relationship information 644 may, for example, be represented by the data structure 600 shown in FIG. 6A. Referring again to FIG. 6G, the task document 642 associated with the contextual node data structure 630 may maintain a link 646 to the notes 624 associated with the textual node data structure 610. The task document 642 may include multiple links to notes associated with multiple textual node data structures 610.

Aspects of the task repository 132 may be implemented using a "domain vocabulary," which represents a domain-specific vocabulary that has been built up over time as a result of different users performing different tasks. The domain vocabulary may be represented as a set, each element of which is a set of words. Each element of the domain vocabulary corresponds to a task formulation previously generated by the task formulator 128. Each element of the domain vocabulary is related to one and only one contextual vertex of the global context graph 100b. Two different elements of the domain vocabulary may, however, correspond to the same contextual vertex in the global context graph 100. The function of the domain vocabulary is to act as a repository of typologized tasks.

The domain vocabulary may, for example, be represented by a data structure including a set, elements of which each include a collection of at most ten nouns and a number between zero and one. Each element in the set corresponds to an element of the domain vocabulary. The collection of at most ten nouns in an element corresponds to the words in the corresponding element of the domain vocabulary. The decimal number in an element indicates a usefulness of the element and may be used to prioritize elements.
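An element of the domain vocabulary as just described may, for example, be modeled as in the sketch below. The class name, the validation performed on construction, and the choice to order elements from most useful to least useful are assumptions of this example.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

MAX_NOUNS = 10  # "at most ten nouns" per element


@dataclass(frozen=True)
class DomVocElement:
    nouns: FrozenSet[str]
    usefulness: float  # a number between zero and one

    def __post_init__(self) -> None:
        if len(self.nouns) > MAX_NOUNS:
            raise ValueError("an element holds at most ten nouns")
        if not 0.0 <= self.usefulness <= 1.0:
            raise ValueError("usefulness must lie between zero and one")


def prioritize(elements: List[DomVocElement]) -> List[DomVocElement]:
    """Most useful elements first (descending order is an assumption)."""
    return sorted(elements, key=lambda e: e.usefulness, reverse=True)


vocab = [
    DomVocElement(frozenset({"patent", "search"}), 0.4),
    DomVocElement(frozenset({"graph", "vertex"}), 0.9),
]
print([sorted(e.nouns) for e in prioritize(vocab)])
```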

For example, when the task formulator 128 searches the global context graph 100b in step 170 (FIG. 4A), the task formulator 128 may search the global context graph 100b for contextual vertices corresponding to elements of the domain vocabulary which sufficiently match the task formulation. Whether an element of the domain vocabulary sufficiently matches a task formulation may be determined, for example, by determining whether the words contained within the element of the domain vocabulary match at least a predetermined percentage of words in the task formulation 126. Two words may be considered to match each other if, for example, the two words are the same or if one of the words is the plural form of the other word. The domain vocabulary may, for example, only contain singular forms of words. The predetermined percentage may be, for example, ninety percent. Words that the task formulator 128 suggests to add to the task formulation 126 may be selected, for example, from elements of the domain vocabulary which most closely match the task formulation 126. When a contextual vertex sufficiently matching the task formulation 126 cannot be found (step 172), the task formulator 128 may add the set of nouns contained in the task formulation 126 as a new element of the domain vocabulary.

According to the embodiment of the system 10b shown in FIG. 1C, a graph of local context graphs 104c contains vertices which are in a 1-to-1 correspondence with contextual vertices in the global context graph 100c. When the localization process 102c generates a local context graph, a vertex corresponding to the local context graph is added to the graph of local context graphs 104c. Vertices in the history graph 260 may be connected by edges. If two local context graphs overlap within the global context graph 100c, and the primary contextual vertices of those local context graphs lie within the area of the overlap, then the corresponding vertices in the graph of local context graphs 104b are connected by an edge.
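The test described above for whether an element of the domain vocabulary sufficiently matches a task formulation may be sketched as follows. The plural check shown here (appending "s" or "es") is a deliberately naive assumption standing in for whatever morphological comparison an implementation would use, and the function names are hypothetical.

```python
from typing import Iterable, Set

THRESHOLD = 0.9  # the "ninety percent" example value


def words_match(a: str, b: str) -> bool:
    """Same word, or one is a naive English plural of the other (assumption)."""
    a, b = a.lower(), b.lower()
    return a == b or a + "s" == b or b + "s" == a or a + "es" == b or b + "es" == a


def sufficiently_matches(element: Set[str], formulation: Iterable[str],
                         threshold: float = THRESHOLD) -> bool:
    """True when enough words of the task formulation are matched by the element."""
    words = list(formulation)
    if not words:
        return False
    hits = sum(any(words_match(w, e) for e in element) for w in words)
    return hits / len(words) >= threshold


element = {"patent", "search", "engine"}   # singular forms only
print(sufficiently_matches(element, ["patents", "search", "engines"]))
print(sufficiently_matches(element, ["patents", "search", "graph"]))
```

In this sketch the first formulation matches on all three words, while the second matches on only two of three and therefore falls below the ninety percent threshold.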

Referring to FIG. 14, and according to the embodiment of the system 10d shown in FIG. 1D, a chronological record of local context graphs may be stored in a history graph 104d containing context history vertices. There is a 1-to-1 correspondence between contextual vertices in the global context graph 100d and context history vertices in the history graph 104d. Each context history vertex in the history graph 104d contains a stack of local context graphs, each of which is centered around the contextual vertex in the global context graph 100d to which the context history vertex corresponds. After the end of a user's session, the local context graph that was navigated by the user during the session is added to the stack of local context graphs contained within the context history vertex corresponding to the task vertex of the edge in the user-task graph through which the user entered the system 10d. The local context graphs contained within each stack are stored in chronological order; the most recent local context graph is referred to as being on the "top" of the stack. In this way, each vertex of the history graph 104d contains a discrete history of the local context graphs centered around the corresponding contextual vertex in the global context graph 100d. For example, context history vertex CH1 contains a stack of local context graphs corresponding to previous versions of contextual vertex C1 in the global context graph 100d, with the most recent local context graph on top. For any two local context graphs in two stacks in the history graph 104d, if the two local context graphs overlap within the global context graph 100d, and the primary contextual vertices of those local context graphs lie within the area of the overlap, then the corresponding local context graphs in the history graph 104d are connected by an edge.
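The stacks of local context graphs and the overlap rule for connecting them may be illustrated by the following sketch. Representing a local context graph as a primary vertex plus a set of covered vertices, and keying each stack by the primary vertex, are simplifying assumptions of this example.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class LocalContextGraph:
    primary: str               # primary contextual vertex
    vertices: FrozenSet[str]   # vertices of the global context graph it covers


@dataclass
class HistoryGraph:
    # One stack per context history vertex, oldest first, most recent last.
    stacks: Dict[str, List[LocalContextGraph]] = field(default_factory=dict)

    def push(self, graph: LocalContextGraph) -> None:
        """Place the session's graph on top of the stack for its primary vertex."""
        self.stacks.setdefault(graph.primary, []).append(graph)

    def top(self, vertex: str) -> LocalContextGraph:
        """Most recent local context graph for the given vertex."""
        return self.stacks[vertex][-1]


def connected(g1: LocalContextGraph, g2: LocalContextGraph) -> bool:
    """Edge rule: both primary vertices lie inside the overlap of the two graphs."""
    overlap = g1.vertices & g2.vertices
    return g1.primary in overlap and g2.primary in overlap


g1 = LocalContextGraph("C1", frozenset({"C1", "C2", "T1"}))
g2 = LocalContextGraph("C2", frozenset({"C1", "C2", "T2"}))
g3 = LocalContextGraph("C3", frozenset({"C2", "C3"}))
history = HistoryGraph()
for g in (g1, g2, g3):
    history.push(g)
newer = LocalContextGraph("C1", frozenset({"C1", "T1"}))
history.push(newer)
print(connected(g1, g2), connected(g2, g3), history.top("C1") is newer)
```

In this sketch g2 and g3 overlap at C2, but the primary vertex of g3 lies outside the overlap, so no edge is created between them.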

The system 10d graphically informs the user about the difference between the local context graph currently being viewed and previous versions of the local context graph. The system 10d may, for example, graphically indicate (using, e.g., a distinctive color) those vertices and edges which are new, i.e., which did not exist in the previous version of the local context graph, or which were present in the previous version of the local context graph but which do not exist in the present version of the local context graph.
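The comparison between two versions of a local context graph may, for example, reduce to set differences, as in the sketch below. The representation of a version as sets of vertex names and of ordered vertex pairs is an assumption made for illustration; how the resulting sets are rendered (e.g., in a distinctive color) is left to the display code.

```python
from typing import Dict, FrozenSet, NamedTuple, Tuple

Edge = Tuple[str, str]


class GraphVersion(NamedTuple):
    vertices: FrozenSet[str]
    edges: FrozenSet[Edge]


def diff_versions(previous: GraphVersion, current: GraphVersion) -> Dict[str, frozenset]:
    """Elements to mark as new or as removed relative to the previous version."""
    return {
        "new_vertices": current.vertices - previous.vertices,
        "removed_vertices": previous.vertices - current.vertices,
        "new_edges": current.edges - previous.edges,
        "removed_edges": previous.edges - current.edges,
    }


old = GraphVersion(frozenset({"C1", "T1", "T2"}),
                   frozenset({("C1", "T1"), ("C1", "T2")}))
new = GraphVersion(frozenset({"C1", "T1", "T3"}),
                   frozenset({("C1", "T1"), ("C1", "T3")}))
changes = diff_versions(old, new)
print(sorted(changes["new_vertices"]), sorted(changes["removed_vertices"]))
```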

When a local context graph is displayed on the screen, the browser 120 provides the user with the ability to navigate to previous and subsequent versions of the context graph. The browser visually indicates to the user which version of the local context graph the user is currently browsing. The user uses user navigation commands 122 to display previous and/or subsequent versions of the local context graph.

A computer system for implementing the system of FIGS. 1A-1D and FIG. 2 as a computer program typically includes a main unit connected to both an output device which displays information to a user and an input device which receives input from a user. The main unit generally includes a processor connected to a memory system via an interconnection mechanism. The input device and output device also are connected to the processor and memory system via the interconnection mechanism.

It should be understood that one or more output devices may be connected to the computer system. Example output devices include a cathode ray tube (CRT) display, liquid crystal displays (LCD), printers, communication devices such as a modem, and audio output. It should also be understood that one or more input devices may be connected to the computer system. Example input devices include a keyboard, keypad, track ball, mouse, pen and tablet, communication device, and data input devices such as sensors. It should be understood the invention is not limited to the particular input or output devices used in combination with the computer system or to those described herein.

The computer system may be a general purpose computer system which is programmable using a computer programming language, such as C++, Java, or other language, such as a scripting language or assembly language. The computer system may also include specially programmed, special purpose hardware. In a general purpose computer system, the processor is typically a commercially available processor, of which the series x86 and Pentium processors, available from Intel, and similar devices from AMD and Cyrix, the 680X0 series microprocessors available from Motorola, the PowerPC microprocessor from IBM and the Alpha-series processors from Digital Equipment Corporation, are examples. Many other processors are available. Such a microprocessor executes a program called an operating system, of which WindowsNT, UNIX, DOS, VMS and OS8 are examples, which controls the execution of other computer programs and provides scheduling, debugging, input/output control, accounting, compilation, storage assignment, data management and memory management, and communication control and related services. The processor and operating system define a computer platform for which application programs in high-level programming languages are written.

A memory system typically includes a computer readable and writeable nonvolatile recording medium, of which a magnetic disk, a flash memory and tape are examples. The disk may be removable, known as a floppy disk, or permanent, known as a hard drive. A disk has a number of tracks in which signals are stored, typically in binary form, i.e., a form interpreted as a sequence of ones and zeros. Such signals may define an application program to be executed by the microprocessor, or information stored on the disk to be processed by the application program. Typically, in operation, the processor causes data to be read from the nonvolatile recording medium into an integrated circuit memory element, which is typically a volatile, random access memory such as a dynamic random access memory (DRAM) or static memory (SRAM). The integrated circuit memory element allows for faster access to the information by the processor than does the disk. The processor generally manipulates the data within the integrated circuit memory and then copies the data to the disk when processing is completed. A variety of mechanisms are known for managing data movement between the disk and the integrated circuit memory element, and the invention is not limited thereto. It should also be understood that the invention is not limited to a particular memory system.

It should be understood the invention is not limited to a particular computer platform, particular processor, or particular high-level programming language. Additionally, the computer system may be a multiprocessor computer system or may include multiple computers connected over a computer network. It should be understood that each module (e.g. 100, 102a, 104a, 106, and 105) in FIG. 1A may be separate modules of a computer program, or may be separate computer programs. Such modules may be operable on separate computers. Data (e.g. 100 and 104a) may be stored in a memory system or transmitted between computer systems. The invention is not limited to any particular implementation using software or hardware or firmware, or any combination thereof. The various elements of the system, either individually or in combination, may be implemented as a computer program product tangibly embodied in a machine-readable storage device for execution by a computer processor. Various steps of the process may be performed by a computer processor executing a program tangibly embodied on a computer-readable medium to perform functions by operating on input and generating output. Computer programming languages suitable for implementing such a system include procedural programming languages, object-oriented programming languages, and combinations of the two.

Having now described a few embodiments, it should be apparent to those skilled in the art that the foregoing is merely illustrative and not limiting, having been presented by way of example only. Numerous modifications and other embodiments are within the scope of one of ordinary skill in the art and are contemplated as falling within the scope of the invention.

Claims

1. A data structure, tangibly stored on a computer-readable medium, representing knowledge about a task performed by at least one user, the data structure comprising information descriptive of: relationships among a plurality of nodes, the plurality of nodes comprising: a plurality of contextual nodes corresponding to tasks performed by the at least one user; and a plurality of textual nodes corresponding to units of information.

2. The data structure of claim 1, wherein the textual nodes comprise information objects.

3. The data structure of claim 1, wherein information objects comprise information objects stored in a knowledge base of a knowledge management system.

4. The data structure of claim 1, wherein information objects comprise data objects stored in a database system.

5. The data structure of claim 1, wherein information objects comprise files stored in a computer file system.

6. The data structure of claim 1, wherein information objects comprise information objects accessible via an intranet.

7. The data structure of claim 1, wherein information objects comprise information objects accessible via an internet.

8. The data structure of claim 1, wherein the textual nodes comprise references to information objects.

9. The data structure of claim 8, wherein references comprise uniform resource locators.

10. The data structure of claim 8, wherein information objects comprise documents stored in files on a computer-readable medium.

11. The data structure of claim 1, wherein the relationships among the nodes comprise pairwise relationships between pairs of nodes.

12. The data structure of claim 11, wherein the relationships comprise degrees of relevancy between pairs of nodes.

13. The data structure of claim 11, wherein the relationships comprise orderings of pairs of nodes.

14. The data structure of claim 1, further comprising information descriptive of sources of the relationships.

15. The data structure of claim 1, further comprising information descriptive of ages of the relationships.

16. The data structure of claim 1, wherein the relationships include information descriptive of the task.

17. The data structure of claim 1, wherein the relationships include at least two relationships between a first one of the plurality of nodes and a second one of the plurality of nodes.

18. The data structure of claim 1, wherein the contextual nodes include task information descriptive of the tasks to which the contextual nodes correspond.

19. The data structure of claim 18, wherein the task information comprises information descriptive of notes taken by at least one user while working on at least one of the tasks.

20. The data structure of claim 18, wherein the task information comprises information descriptive of a search query performed by the at least one user while working on at least one of the tasks.

21. The data structure of claim 18, wherein the task information comprises project management information related to at least one of the tasks.

22. A data structure, tangibly stored on a computer-readable medium, representing knowledge in a domain, the data structure comprising information descriptive of: a plurality of data structures according to claim 1; and relationships among the plurality of data structures.

23. A data structure, tangibly stored on a computer-readable medium, representing knowledge in a plurality of domains, the data structure comprising information descriptive of: a plurality of data structures according to claim 22; and relationships among the plurality of data structures according to claim 22.

24. A method for generating a data structure representing knowledge about a task performed by at least one user, comprising a step of: (A) generating information descriptive of relationships among a plurality of nodes, the plurality of nodes comprising: a plurality of contextual nodes corresponding to tasks performed by at least one user; and a plurality of textual nodes corresponding to units of information.

25. The method of claim 20, wherein the textual nodes comprise information objects.

26. The method of claim 20, wherein information objects comprise information objects stored in a knowledge base of a knowledge management system.

27. The method of claim 20, wherein information objects comprise data objects stored in a database system.

28. The method of claim 20, wherein information objects comprise files stored in a computer file system.

29. The method of claim 20, wherein information objects comprise information objects accessible via an intranet.

30. The method of claim 20, wherein information objects comprise information objects accessible via an internet.

31. The method of claim 20, wherein the textual nodes comprise references to information objects.

32. The method of claim 31, wherein references comprise uniform resource locators.

33. The method of claim 31, wherein information objects comprise documents stored in files on a computer-readable medium.

34. The method of claim 20, wherein the relationships among the nodes comprise pairwise relationships between pairs of nodes.

35. The method of claim 34, wherein the relationships comprise degrees of relevancy between pairs of nodes.

36. The method of claim 34, wherein the relationships comprise orderings of pairs of nodes.

37. The method of claim 34, further comprising a step of generating information descriptive of sources of the relationships.

38. The method of claim 20, further comprising a step of generating information descriptive of ages of the relationships.

39. The method of claim 20, wherein the relationships include information descriptive of the task.

40. The method of claim 20, wherein the relationships include at least two relationships between a first one of the plurality of nodes and a second one of the plurality of nodes.

41. A method for generating a data structure representing knowledge in a domain, the method comprising steps of: generating information descriptive of a plurality of data structures according to claim 1; and generating information descriptive of relationships among the plurality of data structures.

42. The method of claim 41, further comprising: identifying a task performed by a user; and identifying a select one of the plurality of data structures representing knowledge about the identified task.

43. The method of claim 42, further comprising: defining a user-task graph comprising: a plurality of user vertices corresponding to a plurality of users; a plurality of task vertices corresponding to tasks performed by the plurality of users; and a plurality of edges connecting each user vertex in the plurality of user vertices to task vertices corresponding to tasks performed by the user to which the user vertex corresponds; and selecting a task vertex connected to a select one of the plurality of user vertices, the select one of the plurality of user vertices corresponding to the user.

44. A method for generating a data structure representing knowledge in a plurality of domains, the method comprising steps of: generating information descriptive of a plurality of data structures according to claim 41; and generating information descriptive of relationships among the plurality of data structures according to claim 41.

45. The method of claim 41, wherein the generating step is performed in response to user input indicating the relationships.

46. The method of claim 44, wherein the generating step includes analyzing information descriptive of the nodes to determine the information descriptive of relationships among the plurality of nodes.

47. The method of claim 44, further comprising: receiving input from a user indicating the relations among the plurality of nodes.

48. The method of claim 44, further comprising: analyzing information descriptive of the plurality of nodes; generating information descriptive of proposed relationships among the plurality of nodes based on the analysis; presenting the information descriptive of the proposed relationships to a user; receiving input from the user with respect to the information descriptive of the proposed relationships; and wherein step (A) comprises generating information descriptive of relationships among a plurality of nodes based on the input from the user.

49. The method of claim 48, wherein the receiving step comprises receiving input from the user indicating an addition of a relationship between select ones of the plurality of nodes.

50. The method of claim 48, wherein the receiving step comprises receiving input from the user indicating a deletion of a relationship between select ones of the plurality of nodes.

51. The method of claim 47, wherein the receiving step comprises receiving input from the user indicating whether the user approves of the proposed relationships among the plurality of nodes.

52. The method of claim 47, wherein the receiving step comprises receiving input from the user indicating relevancy relationships among the plurality of nodes.

53. A method for generating a formulation of a task performed by a user, the method comprising steps of: generating an initial formulation of the task; generating proposed modifications to the initial task formulation based on a plurality of formulations of tasks performed by at least one user; receiving user input with respect to the proposed modifications; and modifying the initial formulation of the task based on the proposed modifications and the user input.

54. The method of claim 53, wherein generating an initial formulation of a task comprises receiving the initial formulation of the task from the user.

55. The method of claim 53, wherein the step of generating proposed modifications to the initial task formulation comprises: comparing the initial formulation of the task to the plurality of formulations of tasks to determine degrees of similarity between the initial formulation of the task and the plurality of task formulations; and generating proposed modifications including elements selected from select ones of the plurality of formulations whose degree of similarity to the initial formulation of the task is within a predetermined limit.

56. The method of claim 53, wherein the step of receiving user input comprises receiving terms selected by the user from documents, and wherein the step of generating proposed modifications to the initial task formulation comprises generating proposed modifications including the terms selected by the user.

57. A method for generating a formulation of a search query related to a task performed by a user, the method comprising steps of: generating an initial formulation of the search query; retrieving information from an information source based on the source query; and generating the formulation of the search query based on the initial formulation of the search query and the retrieved information.

58. A method for generating a formulation of a search query related to a task performed by a user, the method comprising steps of: receiving information from the user related to the task; and generating the formulation of the search query based on the information received from the user and information derived from at least one formulation of a search query generated by at least one user who performed the task.

59. The method of claim 58, wherein the information received from the user comprises terms selected by the user from documents related to the task.

60. The method of claim 59, wherein the information derived from at least one formulation of a search query generated by at least one user who performed the task comprises a search language vocabulary including a plurality of sets of terms derived from the at least one formulation of a search query.

61. The method of claim 60, wherein the generating step comprises: generating at least one semantic cloud including terms that are common to the terms selected by the user and terms in a select one of the plurality of sets of terms in the search language vocabulary.

62. The method of claim 61, further comprising: selecting a select one of the at least one semantic cloud; and performing a search of an information source based on the selected semantic cloud.

63. The method of claim 62, wherein the units of information include units of information produced as a result of performing the search.

64. A data structure generated by the method of claim 20.

65. The method of claim 64, further comprising a step of: displaying a visual representation of the data structure on an output device.

66. The method of claim 65, wherein the visual representation comprises: visual representations of a plurality of document containers corresponding to the plurality of contextual nodes; and visual representations of a plurality of documents corresponding to the plurality of textual nodes.

67. The method of claim 65, wherein the document containers comprise directories on a storage device.

68. The method of claim 64, wherein the visual representation comprises: a list including elements corresponding to the contextual nodes and elements corresponding to the textual nodes.

69. The method of claim 64, wherein the visual representation comprises: a graph, comprising: a plurality of contextual vertices corresponding to the plurality of contextual nodes; a plurality of textual vertices corresponding to the plurality of textual nodes; and a plurality of edges corresponding to the relations among the plurality of nodes.
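
The graph recited in claim 69 can be sketched as a small Python data structure. This is only an illustration under stated assumptions: vertices are identified by strings, edges are directed pairs, and the class name `KnowledgeGraph` and its methods are hypothetical rather than taken from the specification.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class KnowledgeGraph:
    contextual: Set[str] = field(default_factory=set)  # vertices for tasks
    textual: Set[str] = field(default_factory=set)     # vertices for units of information
    edges: Set[Tuple[str, str]] = field(default_factory=set)  # relations among nodes

    def add_contextual(self, name: str) -> None:
        self.contextual.add(name)

    def add_textual(self, name: str) -> None:
        self.textual.add(name)

    def relate(self, source: str, target: str) -> None:
        """Add an edge; both endpoints must already be vertices."""
        vertices = self.contextual | self.textual
        if source not in vertices or target not in vertices:
            raise KeyError("both endpoints must be existing vertices")
        self.edges.add((source, target))

    def documents_in(self, context: str) -> List[str]:
        """Textual vertices directly related to a contextual vertex, sorted by name."""
        return sorted(t for s, t in self.edges if s == context and t in self.textual)


# Hypothetical example: one task with a sub-task and two documents.
graph = KnowledgeGraph()
graph.add_contextual("prior-art search")
graph.add_contextual("claim drafting")
graph.add_textual("doc-A")
graph.add_textual("doc-B")
graph.relate("prior-art search", "claim drafting")
graph.relate("prior-art search", "doc-B")
graph.relate("prior-art search", "doc-A")
```

Keeping contextual and textual vertices in separate sets mirrors the distinction the claims draw between tasks and units of information, while the edge set records relations without storing the information itself.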

70. The method of claim 64, wherein the visual representation includes graphical elements corresponding to the plurality of nodes, and further comprising a step of: indicating an order of navigation of the graphical elements based on relevancies of the corresponding textual nodes to the task.

71. The method of claim 70, wherein the relevancies are determined based upon a geometric comparison of the documents corresponding to the textual nodes.
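
The ordering in claims 70 and 71 can be sketched as follows. The claims do not define the geometric comparison, so this sketch assumes cosine similarity between term-frequency vectors purely for illustration; the function names `cosine` and `navigation_order` and the whitespace tokenization are likewise hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Iterable, List


def cosine(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-frequency vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def navigation_order(task_terms: Iterable[str],
                     documents: Dict[str, str]) -> List[str]:
    """Order document names from most to least relevant to the task.

    Assumption: relevance is the cosine similarity between the task's terms
    and each document's whitespace-separated, lower-cased words.
    Ties are broken alphabetically by document name.
    """
    task_vec = Counter(t.lower() for t in task_terms)
    scored = [
        (cosine(task_vec, Counter(text.lower().split())), name)
        for name, text in documents.items()
    ]
    return [name for score, name in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Hypothetical example documents attached to textual nodes.
order = navigation_order(
    ["search", "query"],
    {
        "a": "patent search query",
        "b": "cooking recipes",
        "c": "search query search",
    },
)
```

The resulting list could drive the order in which graphical elements are highlighted or visited in the visual representation.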

72. The method of claim 64, wherein the information objects comprise documents, and wherein the visual representation comprises: document icons corresponding to the documents.

73. The method of claim 72, wherein the document icons indicate the presence of terms in the documents that are relevant to the task.

74. The method of claim 21, further comprising: indicating a selected portion of the information object; and persistently storing information descriptive of the indication of the selected portion of the document.

75. The method of claim 21, further comprising: indicating a selected portion of the information object; and indicating that the selected portion of the information object is relevant to the task performed by the user.

76. The method of claim 21, further comprising: indicating a selected portion of the information object; and modifying a formulation of the task performed by the user based on the selected portion of the information object.

77. The method of claim 21, further comprising: indicating a selected portion of the information object; and modifying a search query of information sources based on the selected portion of the information object.

78. The method of claim 20, further comprising: generating a measure of a relevance of a select one of the plurality of nodes to the task performed by the user.

79. The method of claim 37, wherein the information descriptive of sources of the relationships identifies authors of the relationships, and further comprising: generating a measure of a degree of contribution to the data structure of a select one of the authors.

80. The method of claim 20, further comprising: generating a measure of the quantity of knowledge represented by the data structure.

81. The method of claim 20, further comprising: generating a measure of the quality of knowledge represented by the data structure.

82. A knowledge management system, comprising: an information source; and a knowledge management interface configured to provide an interface between the information source and a plurality of users, the knowledge management interface presenting information in the information source based on its relevance to a task performed by at least one of the plurality of users.