CN101044472A - Methods and systems for semantic identification in data systems - Google Patents

Methods and systems for semantic identification in data systems

Info

Publication number
CN101044472A
Authority
CN
China
Prior art keywords
data
project
identifier
semantic
semantic identifier
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2005800290342A
Other languages
Chinese (zh)
Inventor
拉塞尔·G.·安德森
穆哈梅德·伯兹亚内
文森特·A.·马斯特罗
罗伯特·C.·韦伯三世
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Publication of CN101044472A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

Methods and systems are provided that relate to a semantic identifier, a translation engine, and the abstraction-level property of a hub or database. The semantic identifier facilitates uniquely identifying an item according to its relationships with other items, without the need to store other data. The translation engine can convert data, metadata, and semantic identifiers from one format, language, and/or data model to another. The abstraction-level property of the hub or database facilitates distinguishing among multiple instances or forms of an item.

Description

Methods and systems for semantic identification in data systems
Related application
This application claims priority to U.S. Provisional Application No. 60/606,407, entitled "Methods and Systems for Semantic Identification in Data Systems," filed August 31, 2004.
Technical field
The present invention relates to the field of information technology and, more particularly, to the field of data integration systems.
Background
The advent of computer applications has made many business processes faster and more efficient. However, the proliferation of different computer applications that use different data structures, communication protocols, languages, and platforms has made the information technology infrastructure of a typical business enterprise extremely complex. Different business processes within a typical enterprise may use entirely different computer applications, each developed and optimized for a particular business process rather than for the enterprise as a whole. For example, an enterprise may have one computer application for tracking accounts payable and an entirely different computer application for recording customer contacts. In fact, even the same business process may use more than one computer application, for example when an enterprise maintains a centralized customer contact database while employees also keep their own contact information in personal information managers.
While specialized computer applications offer the advantages of customized solutions, their proliferation can cause inefficiencies, such as entering and processing the same data multiple times across the enterprise, or failing to make data associated with one process available to another process that could benefit from it. For example, if the accounts payable process is separated from the supply chain and ordering processes, the enterprise may accept and fill orders from a customer whose credit history would have caused the enterprise to decline the order. There are many other examples in which an enterprise would benefit from consistent access to all of its data across its various computer applications.
Many companies have recognized and addressed the need to share data across the different applications in a business enterprise. Thus, enterprise application integration, or EAI, has emerged as a message-based strategy for handling data from disparate sources. As computer applications increase in complexity and number, EAI encounters many challenges, ranging from the need to handle different protocols, to the need to address ever-increasing volumes of data and transactions, to an ever-increasing appetite for faster integration of data. Various approaches to EAI have been taken, including least-common-denominator approaches, atomic approaches, and bridge-type approaches. However, EAI is based on communication between individual applications. An obvious drawback is that the complexity of an EAI solution grows geometrically as the number of platforms and applications grows linearly.
While data integration systems provide useful tools for addressing the needs of an enterprise, such systems are typically deployed as custom solutions. They have lengthy deployment cycles and require advanced technical training to adapt to changes in business structure and information requirements. A need exists for data integration systems that allow features to be used, reused, and modified in a constantly changing enterprise environment. One such tool is a semantic identifier, which facilitates uniquely identifying an item according to its relationships with other items, without the need to store other data. A translation engine is another such tool; it can convert data, metadata, semantic identifiers, and other items from one format, language, and/or data model to another. Finally, the abstraction-level property of a hub or database facilitates distinguishing among multiple instances or forms of an item.
Summary of the invention
An item may have a semantic identifier. The item may be an object, a data item, a datum, a column, a row, a table, a database, an instance, an attribute, metadata, a concept, a topic, a subject, a semantic identifier, another identifier, an RFID tag, a vendor, a supplier, a customer, a person, a team, an organization, a user, a network, a system, a device, a family, a store, a product, a product line, a product feature, a product specification, a product attribute, a price, a cost, a bill of materials, shipping data, tax data, a course, an educational program, a location, a map, a division, an organization, an organism, a process, a rule, a law, a rating system, a good, a service, a service offering, or any other item or concept. An item may be associated with a data integration job and/or a data integration platform. The semantic identifier may identify the item based on the relationship of the item to one or more other items. A relationship may be the absence of a relationship. A relationship may be semantic. A relationship may involve the position of the item in a hierarchy of relationships.
A semantic identifier may be a unique identifier for an item. A unique semantic identifier for an item may take into account fewer than all of the item's relationships with other items. It is advantageous to create a semantic identifier based on the minimum number of relationships that ensures uniqueness. The number of relationships required to create a unique semantic identifier for an item may vary with context. A semantic identifier may be context-dependent. A semantic identifier may be dynamic.
A semantic identifier may be stored, maintained, recorded, processed, and/or interpreted using a syntax, string structure, or format. The syntax and/or string structure or format may be parseable. The syntax and/or string structure or format may be truncated, modified, shortened, parsed, or reordered. It is possible to truncate, modify, shorten, or reorder the syntax and/or string and still retain a unique identifier.
A semantic identifier may be associated with a semantic context, such as a step in a business method, data in a database, data in a row or column, a row or column in a table, a row or column in a database, data in a table, a table in a database, metadata in a database, an item in a hub or repository, an item in a database, an item in a table, an item in a column, an item in a row, a person in an organization, a sender or recipient of a communication, a user on a network, a system on a network, a device on a network, a member of a family, an item in a store, a dish on a menu, a product in a product line, a product in a product offering, a course or step in an education or training program, a location on a map, a location of an item, a division of an organization, a person in a group, a rule in a system of rules, a service in a set of services, an entity in the organizational hierarchy of an enterprise, an entity in a supply chain, a consumer in a market, a buyer in a purchasing decision, a price of a good or service, a cost of a good or service, a component of a product or system, a step in a method, and/or a member of a group.
In one embodiment, a database may have a table that contains a column. A unique semantic identifier for the column may be "the column name of the table name of the database name." The unique semantic identifier may be stored, maintained, recorded, processed, and/or interpreted using the following syntax: column name::table name::database name. The syntax and/or any associated string may be parsed, and unnecessary elements may be removed. For example, if only one database exists, the following syntax still yields a unique identifier for the column: column name::table name. The database relationship is not needed to create a unique semantic identifier. In another example, the database may have only one table, so the following syntax is a unique identifier for the column: column name::database name. The table relationship is not needed to create a unique identifier. Using shorter syntax and/or strings may reduce processing time and increase efficiency.
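The shortening described in this embodiment can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `shortest_unique_identifier`, the tuple layout `(column, table, database)`, and the order in which shorter forms are tried are all assumptions made for illustration. Only the `::` separator and the column-first ordering come from the text.

```python
from typing import List, Tuple

SEP = "::"  # separator used in the syntax quoted in the text

# Hypothetical layout for one column's relationships: (column, table, database).
Column = Tuple[str, str, str]


def shortest_unique_identifier(item: Column, all_items: List[Column]) -> str:
    """Return the shortest column-first identifier that still singles out `item`.

    The column name is always kept; the table and database relationships are
    added only when they are needed for uniqueness. `item` is assumed to
    appear in `all_items`.
    """
    column, table, database = item
    # Candidate forms, fewest relationships first (an assumed ordering).
    candidates = [
        ((0,), [column]),
        ((0, 1), [column, table]),
        ((0, 2), [column, database]),
        ((0, 1, 2), [column, table, database]),
    ]
    for positions, parts in candidates:
        matches = [
            other for other in all_items
            if all(other[p] == item[p] for p in positions)
        ]
        if len(matches) == 1:
            return SEP.join(parts)
    raise ValueError("item is not unique even when all relationships are used")
```

With a single database, `("id", "employee", "hr")` shortens to `id::employee` when another table also has an `id` column, which mirrors the first example in the text. Note that this sketch does not record which relationship a two-part string refers to; a real syntax would need some way to tell `column::table` from `column::database`.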
A translation engine may perform a translation operation on one or more semantic identifiers, a database, a database that includes semantic identifiers, an information system, an information system that includes semantic identifiers, or other items. The translation operation may convert or otherwise modify the format, language, and/or data model of a semantic identifier. The translation operation may involve translation or mapping from one or more data tools, languages, formats, and/or data models to at least one other data tool, language, format, and/or data model, or from at least one other data tool, language, format, and/or data model to one or more data tools, languages, formats, and/or data models. The translation operation may involve translation or mapping to and from DataStage 7, QualityStage, BusinessObject, IBM-DB2 Cube Views, UML 1.1, UML 1.3, ERStudio, ProfileStage, PowerDesigner (with added support for Packages and ExtendedAttributes), and/or MicroStrategy. The translation engine and/or translation operation may be included in a metabroker. The translation engine, the translation operation, or a mapping of the translation operation may track the data being translated as translation operations move it back and forth between the original semantic context and the translated semantic context. Translation operations may be performed, executed, and/or implemented in batch, in real time, and/or continuously. Translation may be provided or made available as a service, for example as part of a service-oriented architecture.
Once a translation operation exists for a semantic identifier, a database, a database that includes one or more semantic identifiers, an information system, an information system that includes one or more semantic identifiers, or another item, that item may be translated to or from, mapped to, linked to, used with, or associated with any other semantic identifier, database, database that includes one or more semantic identifiers, information system, information system that includes one or more semantic identifiers, or other item that shares at least one translation operation.
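One way to read the idea that items sharing a translation operation can be converted into one another is as translation through a shared intermediate form. The Python sketch below illustrates that reading only; it is an assumption, not the patent's design. The format names `double_colon` and `dotted`, the `PARSERS` and `WRITERS` registries, and the `translate` function are hypothetical.

```python
from typing import Callable, Dict, Tuple

# Shared intermediate form (an assumption): (database, table, column).
Parts = Tuple[str, ...]

# Hypothetical identifier formats; neither name comes from the source text.
PARSERS: Dict[str, Callable[[str], Parts]] = {
    # "column::table::database" -> (database, table, column)
    "double_colon": lambda s: tuple(reversed(s.split("::"))),
    # "database.table.column" -> (database, table, column)
    "dotted": lambda s: tuple(s.split(".")),
}
WRITERS: Dict[str, Callable[[Parts], str]] = {
    "double_colon": lambda p: "::".join(reversed(p)),
    "dotted": lambda p: ".".join(p),
}


def translate(identifier: str, source: str, target: str) -> str:
    """Convert an identifier between two registered formats via the shared form."""
    parts = PARSERS[source](identifier)
    if len(parts) != 3:
        raise ValueError(f"expected three elements, got {len(parts)}")
    return WRITERS[target](parts)
```

Under these assumptions, adding a third format requires only one parser and one writer, after which it can be translated to and from every format already registered.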
An item may exist in multiple forms or instances, such as those arising from a physical modeling activity and/or a logical modeling activity. An item, including any associated data or metadata, may exist in multiple forms or instances in a database and/or hub. To distinguish among the various forms or instances of an item, any distinguishing characteristic may be used, such as abstraction level, position in a hierarchy, relationship to another item, one or more distinguishing attributes of the item, the context in which the item is found, the physical location in which the item is found, and so on.
In one embodiment, a table named "Employee" may be brought into a hub. The hub may then hold instances of "Employee" in two forms: one corresponding to a physical modeling activity, and another corresponding to a logical modeling activity. The abstraction-level property of the hub's data collection facilitates distinguishing the physical-model and logical-model instances or forms.
When a translation operation is performed in response to a query, the translation engine may fetch, load, or obtain all items from the hub or database. It may then filter, select, store, translate, modify, or otherwise act on the items according to a distinguishing characteristic, such as abstraction level, position in a hierarchy, relationship to another item, an attribute of the item, physical location, and so on. In an alternative embodiment, when a translation operation is performed in response to a query, the translation engine may filter, select, store, translate, modify, or otherwise act on the items, including any data and/or metadata, at the hub or database, so that only items at the relevant abstraction level, or with the relevant attribute, position, relationship, location, and so on, are fetched, loaded, or obtained. The filtering, selecting, storing, translating, modifying, or other operations may be performed at run time or design time, and may be performed in batch, in real time, or continuously. In embodiments, the filtering, selecting, storing, translating, modifying, or other operations may be based on information or input obtained by the translation engine and/or system at development time, design time, or run time, such as a data model, a mapping of a data model, a distinguishing characteristic, the syntax of an identifier, and so on. The information may be updated dynamically in real time. Thus, in a preferred embodiment, the system may modify a select command used to select data from a database, based on a known mapping of the database, so that logical items are selected and physical items are ignored, or vice versa.
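The two strategies in this passage, fetching every item and filtering afterward versus filtering at the hub so that only relevant items are fetched, can be contrasted with a small Python sketch. The `Item` and `Hub` classes, the `"physical"` and `"logical"` labels, and the transfer counter are hypothetical stand-ins chosen for illustration; the patent does not specify any of them.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Item:
    name: str
    abstraction_level: str  # assumed labels: "physical" or "logical"


class Hub:
    """Toy stand-in for a hub or database that holds items."""

    def __init__(self, items: List[Item]) -> None:
        self._items = list(items)
        self.items_transferred = 0  # how many items were handed to callers

    def fetch_all(self) -> List[Item]:
        self.items_transferred += len(self._items)
        return list(self._items)

    def fetch_level(self, level: str) -> List[Item]:
        selected = [i for i in self._items if i.abstraction_level == level]
        self.items_transferred += len(selected)
        return selected


def fetch_then_filter(hub: Hub, level: str) -> List[Item]:
    """First strategy: obtain every item, then filter in the engine."""
    return [i for i in hub.fetch_all() if i.abstraction_level == level]


def filter_at_hub(hub: Hub, level: str) -> List[Item]:
    """Alternative strategy: filter at the hub, fetch only what is relevant."""
    return hub.fetch_level(level)
```

Both functions return the same items for a given level; the difference in this sketch is only in how many items leave the hub, which is what `items_transferred` records.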
In some cases, the closer to the hub or database the filtering, selecting, or other operation takes place in the overall process, the more efficient and faster the operation. The translation engine may perform a translation operation on the query itself, producing a modified query or select command that can be sent directly to the hub or database. The modified query or select command may take a form that is directly compatible with the hub or database.
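Translating the query itself can be sketched with Python's standard `sqlite3` module. This is a minimal illustration under stated assumptions, not the patent's mechanism: the `items` table, its `abstraction_level` column, and the functions `add_level_filter` and `run_level_query` are hypothetical, and wrapping the original statement in a subquery is just one simple way to produce a modified select command.

```python
import sqlite3
from typing import List, Tuple


def add_level_filter(select_sql: str) -> str:
    """Wrap a SELECT so the database returns rows for one abstraction level only.

    Assumes the wrapped query exposes a column named `abstraction_level`.
    """
    inner = select_sql.strip().rstrip(";")
    return f"SELECT * FROM ({inner}) AS q WHERE q.abstraction_level = ?"


def run_level_query(level: str) -> List[Tuple[str, str]]:
    """Run the rewritten query against a small in-memory example database."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE items (name TEXT, abstraction_level TEXT)")
        conn.executemany(
            "INSERT INTO items VALUES (?, ?)",
            [
                ("Employee", "physical"),
                ("Employee", "logical"),
                ("Department", "logical"),
            ],
        )
        original = "SELECT name, abstraction_level FROM items;"
        # The level is passed as a bound parameter, not spliced into the SQL.
        return conn.execute(add_level_filter(original), (level,)).fetchall()
    finally:
        conn.close()
```

Because the filter is part of the statement sent to the database, the physical-level rows never reach the caller, which is the efficiency argument the paragraph makes.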
In other aspects, a computer program product may include a computer-usable medium containing computer-readable program code, wherein the computer-readable program code, when executed on one or more computers, causes the one or more computers to perform any one or more of the methods described above.
"International Business Machines" or "IBM" as used herein refers to International Business Machines Corporation of Armonk, New York.
"Data source" or "data target" as used herein is intended to have the broadest possible meaning consistent with these terms, and includes a database, a plurality of databases, a repository information manager, a queue, a message service, a repository, a data facility, a data storage facility, a data provider, a website, a server, a computer, a computer storage facility, a CD, a DVD, a flash memory device, a central storage facility, a hard disk, multiple coordinating data storage facilities, RAM, ROM, flash memory, a memory card, a temporary memory facility, a permanent memory facility, magnetic tape, a locally connected computing facility, a remotely connected computing facility, a wireless facility, a wired facility, a mobile facility, a central facility, a web browser, a client, a laptop computer, a personal digital assistant ("PDA"), a telephone, a cellular phone, a mobile phone, an information platform, an analysis facility, a processing facility, a business enterprise system, or another facility that processes data or that stores data or other information; as well as structured or unstructured data used in any of the foregoing, or files or file types of data from any streamed, message-based, event-driven, or other source; and any combination of the foregoing, unless a specific meaning is otherwise indicated or the context of the phrase requires otherwise. A storage mechanism is any logical or physical device, resource, or facility capable of acting as a data source or data target.
"Enterprise Java Bean (EJB)" includes the server-side component architecture for the J2EE platform. EJB supports rapid and simplified development of distributed, transactional, secure, and portable Java applications. EJB supports a container architecture that allows concurrent consumption of messages and provides support for distributed transactions, so that database updates, message processing, and connections to enterprise systems using the J2EE architecture can participate in the same transaction context.
"JMS" means the Java Message Service, an enterprise messaging service for the Java-based J2EE enterprise architecture. "JCA" means the J2EE Connector Architecture of the J2EE platform, described in more detail below. While EJB, JMS, and JCA are popular software tools in contemporary distributed transaction environments, it should be appreciated that any platform, system, or architecture providing similar functionality may be employed with the data integration systems described herein.
"Real time" as used herein includes periods of time that approximate the duration of a business transaction or business process, and includes processes or services that occur during a business operation or business process, as opposed to processes that occur off-line, such as in a nightly batch operation. Depending on the duration of the business process, real time may include seconds, fractions of seconds, minutes, hours, or even days.
"Business process," "business logic," and "business transaction" as used herein include any method, service, operation, process, or transaction that can be performed by a business, including but not limited to sales, marketing, fulfillment, inventory management, pricing, product design, professional services, financial services, administration, finance, insurance, analysis, contracting, information technology services, data storage, data mining, delivery of information, delivery of goods, scheduling, communications, investments, transactions, supply, promotions, advertising, bidding, engineering, manufacturing, supply chain management, human resources management, data processing, data integration, workflow administration, software production, hardware production, development of new products, research, development, policing functions, quality control and assurance, packaging, logistics, customer relationship management, handling of rebates and returns, customer support, product maintenance, telemarketing, corporate communications, investor relations, and many others.
"Service-oriented architecture" (SOA) as used herein includes services that form part of the infrastructure of a business enterprise. In an SOA, services become building blocks for application development and deployment, facilitating rapid application development and avoiding redundant code. Each service may embody a set of business logic or business rules that can be bound to the surrounding environment, such as the source of the data inputs for the service or the targets for the data outputs of the service. Various examples of SOA are provided in the description that follows.
"Metadata" as used herein includes data that brings context to the data being processed, data about the data being processed, information pertaining to the context of related information, information pertaining to the origin of data, information pertaining to the location of data, information pertaining to the meaning of data, information pertaining to the age of data, information pertaining to the heading of data, information pertaining to the units of data, information pertaining to the field of data, and/or any other information relating to the context of the data.
"WSDL" or "Web Services Description Language" as used herein includes a language for describing network services (often web services) as a set of endpoints operating on messages that contain either document-oriented or procedure-oriented information. The operations and messages are described abstractly and then bound to a concrete network protocol and message format to define an endpoint. Related concrete endpoints are combined into abstract endpoints (services). WSDL is extensible, allowing description of endpoints and their messages regardless of which message formats or network protocols are used to communicate.
"Metabroker" as used herein includes systems or methods that call a translation engine or other facility to perform translation operations or other operations on data or metadata. A translation operation or other operation may involve the translation of data or metadata from one or more formats, languages, and/or data models to one or more formats, languages, and/or data models.
Description of drawings
Fig. 1 is a schematic diagram of a business enterprise with a plurality of business processes, each of which may include a plurality of different computer applications and data sources.
Fig. 2 is a schematic diagram depicting data integration across a plurality of business processes of a business enterprise.
Fig. 3 is a schematic diagram depicting an architecture for providing data integration for a plurality of data sources of a business enterprise.
Fig. 4 depicts an item related to other items.
Fig. 5 depicts an item related to other items.
Fig. 6A depicts an item in a certain context.
Fig. 6B depicts an item in a certain context.
Fig. 7 depicts certain strings.
Fig. 8 depicts an item and a corresponding string.
Fig. 9 depicts a string and certain variations of it.
Fig. 10 depicts a translation engine acting on certain strings.
Fig. 11 depicts an item that may exist in multiple forms or instances.
Fig. 12 depicts an item that may exist in multiple forms or instances in a hub or database.
Fig. 13 depicts items in a hub at different abstraction levels.
Fig. 14 depicts a translation process in which all items are fetched from a database or hub.
Fig. 15 depicts a translation process in which items are filtered at the database or hub.
Fig. 16 depicts a translation process in which the query is translated.
Detailed description
In the following description, like reference numerals refer to like elements unless expressly stated otherwise.
The invention disclosed herein can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment containing both hardware and software elements. In a preferred embodiment, the invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, and the like.
Furthermore, the invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer-readable medium can be any apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
The medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. Examples of a computer-readable medium include a semiconductor or solid-state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk, and an optical disk. Current examples of optical disks include compact disk read-only memory (CD-ROM), compact disk read/write (CD-R/W), and DVD.
A data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories that provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
Input/output, or I/O, devices (including but not limited to keyboards, displays, pointing devices, etc.) can be coupled to the data processing system either directly or through intervening I/O controllers.
Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or to remote printers or storage devices through intervening private or public networks. Modems, cable modems, and Ethernet cards are just a few of the currently available types of network adapters.
Fig. 1 represents to be convenient to the integrated platform 100 of the various data of commercial enterprise.This platform comprises a plurality of business processes, and each business process can comprise a plurality of different computer applied algorithms and data source.This platform can comprise several data sources 102, and data source 102 can be aforesaid those data sources.These data sources can comprise the various data types from various physical locations.For example, data source can comprise the system from provider such as Sybase, Microsoft, Informix, Oracle, Inlomover, EMC, Trillium, First Logic, Siebel, PeopleSoft, IBM, Apache or Netscape.Data source 102 can comprise uses data product or standard, such as the system of IMS, DB2, ADABAS, VSAM, MD series, UDB, XML, composite plane file or ftp file.Data source 102 can comprise the file of being created or being used by application program such as Microsoft Outlook, Microsoft Word, Microsoft Excel, MicrosoftAccess, and such as ASCII, CSV, GIF, TIF, PNG, PDP the file of standard format.Data source 102 can come from different positions, and perhaps they can be positioned at the center.The data of supplying with from data source 102 can be different form arrive, and have the compatible different-format that also may be mutually incompatible of possibility.
Datum target illustrates in the back.In general, these datum targets can be that above-mentioned arbitrary data couples 102.This difference of name aspect generally is illustrated in the data integration process, and data system provides data and still receives data.But, will be appreciated that this difference is not intended to pass on the difference of ability between data source and the datum target (unless spelling out in addition), because in the data integrated system of routine, data source can receive data, datum target can provide data.
The platform of graphic extension also comprises data integrated system 104 among Fig. 1.Data integrated system can be simplified the result who receives inquiry or retrieval command as data integrated system, from the data aggregation of data source 102.Data integrated system 104 can send order to one or more data sources 102, so that data source provides data to data integrated system 104.Because the data that receive can be the various forms that comprise the metadata of variation, so the data of the reconfigurable reception of data integrated system, so that the data that receive can be combined subsequently so that focus on.Explanation can be by the function of data integrated system 104 execution in more detail below.
Platform 100 also comprises several searching systems 108.Searching system 108 can comprise database or the processing platform that is used for further handling from the data of data integrated system 104.For example, data integrated system 104 can purify, combination, conversion or otherwise handle data that its receives from data source 102, so that searching system 108 can use the data of processing to produce the report 110 useful to enterprise.Report 110 can be used for the report data association, answer complex query, answer simple queries or form enterprise or other useful report of user, and can comprise raw data, form, chart, figure or from any other performances of the data of searching system 108.
The platform 100 can also include a database or database management system (DBMS) 112. The database 112 can be used to store data temporarily, permanently, or long-term. For example, the data integration system 104 can collect data from one or more data sources 102 and transform the data into compatible forms, or into forms suitable for combination with one another. Once the data is transformed, the data integration system 104 can store it in the database 112 in decomposed form, combined form, or another form for later retrieval.
Fig. 2 is a schematic diagram of data integration across multiple entities and business processes of a business enterprise. In the illustrated embodiment, the data integration system 104 facilitates the flow of information between user interface systems 202 and data sources 102. The data integration system 104 can receive a query from an interface system 202, where the query requires the extraction, and possibly the transformation, of data residing in one or more data sources 102. The interface system 202 can include any device or program that communicates with the data integration system 104, such as a web browser running on a laptop or desktop computer, a cellular telephone, a personal digital assistant ("PDA"), a networked platform and the devices attached to it, or any other device or system that can interface with the data integration system 104.
For example, a user may operate a PDA and send an information request to the data integration system through a WiFi or Wireless Application Protocol/Wireless Markup Language ("WAP/WML") interface. The data integration system 104 can receive the request and generate any required queries in order to access information from a website or other data source 102, such as an FTP file site. Data from the data source 102 can be extracted and transformed into a format compatible with the requesting interface system (in this example, the PDA), and then transmitted to the interface system 202 for the user to view and manipulate. In another embodiment, the data may have been previously extracted from the data source and stored in a separate database 112, which can be a data warehouse or other data facility used by the data integration system 104. The data may be stored in the database 112 in a transformed state or in its original state. For example, the data can be stored in a transformed state so that data from many data sources 102 can be combined in another transformation process. For example, a query from the PDA can be transmitted to the data integration system 104, which can extract information from the database 112. After the extraction, the data integration system 104 can transform the data into a packaged format compatible with the PDA and then send it to the PDA.
Fig. 3 is a schematic diagram of an architecture that provides a business enterprise with data integration from multiple data sources 102. An embodiment of the data integration system 104 can include a discover data stage 302 that extracts data from data sources, analyzes the column values and table structures of the source data, and performs other processes. The discover data stage 302 can also produce recommendations about table structures, relationships, and keys for data targets. More advanced profiling and auditing functions can include data range validation, checks on the accuracy of computations, if-then evaluations, and so on. The discover data stage 302 can normalize data, for example by eliminating redundant dependencies and other anomalies in the source data. The discover data stage 302 can provide other functions, such as drilling down into exceptions in the data sources 102 for further analysis, or enabling direct profiling of host data. A non-limiting example of a commercial embodiment of the discover data stage 302 can be found in IBM's WebSphere ProfileStage product.
The data integration system 104 can also include a data preparation stage 304, in which data is prepared, standardized, matched, or otherwise processed to produce quality data that will be transformed later. The data preparation stage 304 can perform generic data quality functions, such as reconciling inconsistencies or checking for correct matches in the data (including one-to-one matching, one-to-many matching, and removal of duplicate data). The data preparation stage 304 can also provide specialized data enhancement functions. For example, it can ensure that addresses conform to multinational postal references for improved international communication. It can make location data conform to multinational geocoding standards for spatial information management. It can correct or augment addresses to ensure that address information qualifies for United States Postal Service mail rate discounts under government-certified U.S. address correction. Similar analysis and data correction can be provided for the Canadian and Australian postal systems, which offer discounted mail rates for correctly addressed mail. A non-limiting example of a commercial embodiment of the data preparation stage 304 can be found in IBM's WebSphere QualityStage product.
The data integration system can also include a data transformation stage 308 that transforms and enriches data and delivers the transformed data. The data transformation stage 308 can perform transformation services, such as reorganizing and reformatting data, and can perform calculations according to the business rules and algorithms of the system user. The data transformation stage 308 can also organize target data into subsets known as data marts or cubes, so that the data can be processed in a more finely tuned manner in certain analytical contexts. The data transformation stage 308 can employ bridges, translators, or other interfaces (outlined below) to span the various software and hardware architectures of the data sources and data targets used by the data integration system 104. The data transformation stage 308 can include a graphical user interface, a command line interface, or some combination of these interfaces for designing data integration jobs across the platform 100. A non-limiting example of a commercial embodiment of the data transformation stage 308 can be found in IBM's WebSphere DataStage product.
The stages 302, 304, 308 of the data integration system 104 can be executed using a parallel execution system 310, or in a serial or combined manner, to optimize the performance of the system 104.
The data integration system 104 can also include a metadata management system 312 that manages metadata related to the data sources 102. In general, the metadata management system 312 can provide exchange, integration, management, and analysis of metadata across all the tools in a data integration environment. For example, the metadata management system 312 can provide a common, universally accessible view of data in different sources, such as IBM's WebSphere ODBC MetaBroker, CA ERwin, IBM's WebSphere ProfileStage, IBM's WebSphere DataStage, IBM's WebSphere QualityStage, IBM DB2 Cube Views, and Cognos Profilestage. The metadata management system 312 can also provide analysis tools for analyzing data lineage and the impact of changes to data structures. The metadata management system 312 can also be used to prepare a business data glossary of the data definitions, algorithms, and business contexts for data in the data integration system, and the glossary can be published for use throughout the enterprise. A non-limiting example of a commercial embodiment of the metadata management system 312 can be found in IBM's WebSphere MetaStage product.
Referring to Fig. 4, a project can be described with respect to various contexts and hierarchies relevant to an enterprise, so as to capture the semantic context of the project. Fig. 4 thus depicts a semantic identifier of a project. A project can be an object, class, attribute, data item, data model, metadata model, model, definition, identity, structure, language, mapping, relationship, instance, or other item or concept, including another semantic identifier. A semantic identifier can identify a project according to attributes of the project, the physical location of the project, relationships between the project and one or more other projects (for example, relationships in a hierarchy), and so on. In some cases, a relationship can be defined as the absence of a particular relationship. A relationship can relate to the position of a project in a relationship hierarchy. For example, in Fig. 4, project 1 5202 can be identified according to its relationships to the other projects associated with it. Project 1 5202 can be identified as directly related to project 2 5204, project 3 5204, and project 4 5210, indirectly related to project 5 5212, and indirectly related to project 6 5214 through project 5 5212 and project 4 5210. Project 1 can also be identified simply as directly related to project 2 5204, project 3 5204, and project 4 5210. In embodiments, the indirect relationships between project 1 5202 and project 5 5212 and project 6 5214 can be captured in the relationship of project 1 5202 to project 4 5210. In addition to static identifiers, this chained or recursive identification also allows for dynamic identifiers. For example, if the relationship between project 4 5210 and project 6 5214 changes, the semantic identifier of project 1 5202, which comprises project 2 5204, project 3 5204, and project 4 5210, can reflect the change through its incorporation of project 4 5210. It does not need to be updated to account for the change affecting project 6 5214, as it would if project 6 5214 were included directly in the semantic identifier.
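The chained identification described for Fig. 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout, the project names, and the function `resolve_identifier` are hypothetical and do not come from the disclosure. The sketch stores only direct relationships and resolves indirect ones by recursion, so a change further down the chain is picked up without touching what is stored for project 1.

```python
# Hypothetical relation graph loosely modelled on Fig. 4.
# Only DIRECT relationships are stored for each project.
direct_relations = {
    "project_1": ["project_2", "project_3", "project_4"],
    "project_4": ["project_5"],
    "project_5": ["project_6"],
}

def resolve_identifier(project, relations, _path=None):
    """Expand a project's identifier by following direct relationships.

    Indirect relationships are never stored on the project itself; they
    are discovered by recursing into each directly related project.
    """
    path = [project] if _path is None else _path
    resolved = []
    for related in relations.get(project, []):
        if related in path:
            continue  # guard against cycles in the relation graph
        branch = path + [related]
        resolved.append(" -> ".join(branch[1:]))
        resolved.extend(resolve_identifier(related, relations, branch))
    return resolved

for line in resolve_identifier("project_1", direct_relations):
    print(line)
```

If the entry for `project_5` is later changed, the next call to `resolve_identifier("project_1", ...)` reflects the change, while the stored entry for `project_1` still lists only projects 2, 3, and 4.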
Fig. 5 shows a more specific example of a semantic identifier. Jim can be identified as the Jim who lives at 111 Anyroad, Anytown, Anystate USA, whose telephone number is 555-555-5555 and whose social security number is 013-65-8067. Alternatively, Jim can be identified according to his relationships to other people. As shown in Fig. 5, Jim can be identified as the son of Betty, the brother of Larry and Jeff, the father of Jessica, and the nephew of Frank.
A semantic identifier can be a unique identifier of a project. In the example of Fig. 5, if there is only one Jim in the world who is the son of Betty, the brother of Larry and Jeff, the father of Jessica, and the nephew of Frank, then this semantic identifier is a unique identifier of Jim. A unique semantic identifier of a project may take into account fewer than all of the relationships between that project and other projects. In the example of Fig. 5, if there is only one Jim in the world who is the son of Betty, the brother of Larry, and the father of Jessica, then these relationships alone are enough to produce a unique semantic identifier. The relationships between Jim and Jeff and between Jim and Frank need not be considered. It is advantageous to create a semantic identifier based on the minimum number of relationships that guarantees uniqueness. For example, if the semantic identifier is to be stored in the database 112 or processed by the data integration system 104, a less complex semantic identifier needs less space and can be processed more quickly.
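One way to picture the search for a minimal unique identifier is a brute-force check over subsets of relationships, smallest first. The sketch below is an assumption made for illustration only: the population of other people named Jim, the relation labels, and the function `minimal_unique_identifier` are hypothetical, and an exhaustive search of this kind is practical only for small relationship sets.

```python
from itertools import combinations

# Hypothetical population. "Jim A" carries the relationships of Fig. 5;
# the other entries are invented so that no single relationship or pair
# of relationships is enough to tell Jim A apart.
people = {
    "Jim A": {("son_of", "Betty"), ("brother_of", "Larry"), ("brother_of", "Jeff"),
              ("father_of", "Jessica"), ("nephew_of", "Frank")},
    "Jim B": {("son_of", "Betty"), ("brother_of", "Larry"), ("brother_of", "Jeff"),
              ("nephew_of", "Frank")},
    "Jim C": {("son_of", "Betty"), ("father_of", "Jessica"), ("brother_of", "Jeff"),
              ("nephew_of", "Frank")},
    "Jim D": {("brother_of", "Larry"), ("father_of", "Jessica"), ("brother_of", "Jeff"),
              ("nephew_of", "Frank")},
}

def minimal_unique_identifier(target, population):
    """Return a smallest set of the target's relationships that no other
    member of the population shares in full, or None if none exists."""
    own = sorted(population[target])
    others = [rels for name, rels in population.items() if name != target]
    for size in range(1, len(own) + 1):
        for subset in combinations(own, size):
            if not any(set(subset) <= rels for rels in others):
                return set(subset)
    return None

print(minimal_unique_identifier("Jim A", people))
```

With this invented population, the three relationships to Betty, Larry, and Jessica are sufficient, and the relationships to Jeff and Frank are not needed, which mirrors the reasoning in the text.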
The number of relationships required to create a unique semantic identifier for a project may vary with context. Fig. 6A depicts two projects of interest: project 1 5402 and project 7 5404. In context A 5408, project 1 5402 can be distinguished from project 7 5404 according to the relationships of project 1 5402 to project 5 5410 and project 6 5412. That is, in context A, the unique semantic identifier of project 1 5402 can be a semantic identifier that is directly related to projects 2, 3, and 4, indirectly related to project 5 5410 through project 4, and indirectly related to project 6 5412 through project 5 5410 and project 4. In context A, the unique identifier of project 7 5404 can be a semantic identifier that is directly related only to projects 2 and 3. Fig. 6B shows project 1 5402 in a different context, context B 5414. To identify project 1 5402 uniquely in context B 5414, any one or more of the following can be considered: the direct relationship of project 1 5402 to project 4, the absence of a direct relationship to project 6, or the indirect relationship to project 5. In context B 5414, project 1 5402 can be uniquely identified semantically as directly related to projects 2 and 3 but not directly related to project 6. The unique identifier of project 1 therefore differs between context A 5408 and context B 5414. Thus, in embodiments of the data integration methods and systems described herein, the semantic identifier of a project, such as a project related to a data integration job or a data integration platform, can be a context-dependent identifier of that project. In embodiments, such context-dependent identifiers can be stored in an atomic format in a data repository.
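The context dependence of an identifier, including the use of an absent relationship, can be illustrated as follows. Everything in this sketch is hypothetical: the relation labels, the contents of the two contexts (in particular `project_8`, which is invented), and the function `identifies_uniquely`. An identifier is modelled as a mapping from a relationship to whether it must be present or absent.

```python
# Hypothetical contexts loosely modelled on Figs. 6A and 6B.
context_a = {
    "project_1": {"direct:2", "direct:3", "direct:4", "indirect:5", "indirect:6"},
    "project_7": {"direct:2", "direct:3"},
}
context_b = {
    "project_1": {"direct:2", "direct:3", "direct:4", "indirect:5"},
    "project_8": {"direct:2", "direct:3", "direct:6"},  # invented for illustration
}

def identifies_uniquely(identifier, target, context):
    """Check whether the identifier matches the target and nothing else.

    identifier maps a relationship to True (must be present) or
    False (must be absent).
    """
    def matches(relations):
        return all((rel in relations) == required
                   for rel, required in identifier.items())
    return [name for name, rels in context.items() if matches(rels)] == [target]

# "Directly related to projects 2 and 3, but not directly related to project 6."
identifier_b = {"direct:2": True, "direct:3": True, "direct:6": False}

print(identifies_uniquely(identifier_b, "project_1", context_b))  # unique in context B
print(identifies_uniquely(identifier_b, "project_1", context_a))  # also matches project_7 in context A
```

The same identifier that is unique in context B is ambiguous in context A, so a different set of relationships (for example, one that includes the relationship to project 4) would be needed there.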
In other embodiments, contexts A 5408 and B 5414 can be any two different imports, mappings, run versions, models, metabroker models, instances, tools, views, objects, classes, projects, relationships, attributes, or any combination of these. A matching or comparison facility can compare the syntax of a project's identity across different imports, run versions, models, metabroker models, instances, tools, and/or projects, and based on the comparison determine, or help determine, what action to take or to avoid. For example, a matching engine can compare the model used by import instance A with the model used by metabroker B. Based on the comparison, it may be determined that metabroker B can access import instance A without transforming or modifying data and metadata, and the comparison facility can instruct metabroker B to proceed. In another example, tool A 5408 can be compared with tool B 5414, and it may be determined that a cross-tool object merge can be performed, in which each tool can access and use the other tool's objects. In embodiments, the comparison facility can trigger a translation facility to assist the cross-tool object merge, for example by establishing a bridge, metabroker, hub, or the like to transform any objects that need transformation. Such a transformation can be based on the different syntaxes that each tool uses to handle the identity of a particular project, or on other differences between the tools identified by the comparison.
In embodiments, a semantic identifier can be stored, maintained, recorded, processed, and/or interpreted using a syntax that is itself stored, maintained, recorded, processed, and/or interpreted in a string structure or format. Fig. 7 depicts a syntax and an example of a corresponding string composed with that syntax. The syntax 5502 can be column name::table name::database name. This syntax can be associated with an identifier that identifies a column of a table in a database. The string 5504 composed with this syntax can be age::employee::employee database. This string can be associated with a semantic identifier that identifies the age of an employee in a particular employee database. In the example of Fig. 6B, the string corresponding to the semantic identifier of project 1 5402 in context B 5414 can be: directly related to project 2::directly related to project 3::indirectly related to project 4. The semantic identifier and the corresponding string can also reflect the absence of a direct relationship between project 1 5402 and project 6.
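The syntax and string of Fig. 7 can be illustrated with a small build-and-parse sketch. The field names, the constant `SEPARATOR`, and the two functions are hypothetical choices made for this example; the text only specifies the column name::table name::database name layout.

```python
SEPARATOR = "::"
# Syntax 5502 expressed as an ordered tuple of field names (assumed spelling).
SYNTAX = ("column_name", "table_name", "database_name")

def build_identifier(syntax, **parts):
    """Compose an identifier string by joining the parts in syntax order."""
    return SEPARATOR.join(parts[field] for field in syntax)

def parse_identifier(syntax, text):
    """Split an identifier string back into named parts."""
    values = text.split(SEPARATOR)
    if len(values) != len(syntax):
        raise ValueError(f"expected {len(syntax)} elements, got {len(values)}")
    return dict(zip(syntax, values))

identifier = build_identifier(
    SYNTAX,
    column_name="age",
    table_name="employee",
    database_name="employee database",
)
print(identifier)
print(parse_identifier(SYNTAX, identifier))
```

Keeping the syntax separate from the string means the same string can be reinterpreted, truncated, or reordered by changing only the syntax tuple.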
In Fig. 8, the semantic identifier of project 9 5602 in string format can be: directly to project 2::directly to project 3::directly to project 4::indirectly to project 5 (string 5604). The string can be parsed. The syntax and/or string can be truncated or modified, and/or the elements of the syntax and/or string can be reordered. In Fig. 9, string 5702 is a truncation of string 5604, string 5704 is a truncation and modification and/or reordering of string 5604, and string 5708 is a modification and/or reordering of string 5606. The truncation, modification, and/or reordering can be performed by a transform engine. Truncating the syntax and/or string is useful when not all of the relationships included in the syntax and/or string are needed for the semantic identifier to be unique. Suppose that, in the given context of string 5604, all projects are directly related to project 3; for example, project 3 is the database in which all projects are stored. String 5604 can then be truncated to produce string 5702, which omits the relationship involving project 3 while remaining a unique identifier. Truncating the syntax and/or string can reduce storage requirements and improve processing efficiency. Changing the order of the relationships in the syntax and/or string can also help reduce the processing time of a data integration process. If less common relationships are processed first, the system will likely need to access and process fewer of the relationships associated with a project in order to identify it. For example, if few projects are related to project 3, even fewer are related to project 4, and many are related to project 2, then, depending on the context, string 5708 may allow project 9 to be identified in less time than string 5604. Uniquely identifying project 9 in this context may require only the first two elements of string 5708, compared with the first three elements of string 5604.
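The benefit of processing rarer relationships first can be sketched as below. The index contents, the relation labels, and both function names are assumptions made for the example; the counts are chosen so that many projects relate to project 2, few to project 3, and fewer still to project 4, as in the text.

```python
# Hypothetical index: relationship -> set of projects that have it.
relation_index = {
    "direct:2": {"project_9", "a", "b", "c", "d"},   # many projects
    "direct:3": {"project_9", "a", "b"},             # few projects
    "direct:4": {"project_9", "c"},                  # fewer still
}

def reorder_by_rarity(elements, index):
    """Order identifier elements so the least common relationship comes first."""
    return sorted(elements, key=lambda rel: len(index[rel]))

def elements_needed(elements, target, index):
    """Count how many leading elements must be processed before the
    candidate set narrows to the target alone (None if it never does)."""
    candidates = None
    for count, rel in enumerate(elements, start=1):
        holders = index[rel]
        candidates = set(holders) if candidates is None else candidates & holders
        if candidates == {target}:
            return count
    return None

original = ["direct:2", "direct:3", "direct:4"]
reordered = reorder_by_rarity(original, relation_index)

print(elements_needed(original, "project_9", relation_index))
print(elements_needed(reordered, "project_9", relation_index))
```

With this invented index, the reordered string identifies project 9 after two elements, whereas the original order needs all three.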
A transform engine can perform transformation operations on one or more semantic identifiers, a database 112, a database 112 containing semantic identifiers, an information system, an information system containing semantic identifiers, or other projects. Figure 10 depicts a transform engine 5802 acting on a semantic identifier embodied as a string 5804 and on a semantic identifier embodied as a string located in a database 5808. A transformation operation can convert or otherwise modify the format, language, and/or data model of a semantic identifier. A transformation operation can involve conversion or mapping from one or more data tools, languages, formats, and/or data models to at least one other data tool, language, format, and/or data model, or from at least one other data tool, language, format, and/or data model to one or more data tools, languages, formats, and/or data models. For example, a transformation operation can involve conversion or mapping to, from, or between known data integration tools, such as IBM's WebSphere DataStage 7, IBM's WebSphere QualityStage, Business Objects tools, IBM DB2 Cube Views, UML 1.1, UML 1.3, ERStudio, IBM's WebSphere ProfileStage, PowerDesigner (with additional support for Packages and Extended Attributes), and/or MicroStrategy tools. The transform engine and/or transformation operations can optionally be included in a metabroker. Transformation operations can be executed, performed, and/or carried out in batch, in real time, and/or continuously. They can be provided or made available as a service, for example as part of a service-oriented architecture ("SOA").
An SOA can be part of the infrastructure of the enterprise computing system of a business enterprise. In an SOA, services become the building blocks of application development and deployment, allowing rapid application development and avoiding redundant code. Each service embodies a set of business logic or business rules that is invisible to the surrounding environment, such as the source of the data the service takes as input or the target of the data the service outputs. A service can therefore be reused with a variety of applications, as long as the correct inputs and outputs are established between the service and the application. A service-oriented architecture allows services to be shielded from environmental change, so that the architecture continues to work correctly even if the surrounding computing environment changes. Services therefore do not need to be recoded as a result of infrastructure changes, which can save time and effort. An SOA can be used for web services and can involve three roles: a service provider, a service requester, and a service registry. The registry can be a public registry or a private registry. A service requester can search the registry for a suitable service. Once a suitable service is found, the service requester can receive the code needed to invoke the service, such as Web Services Description Language ("WSDL") code. WSDL is a language commonly used to describe web services. The service requester then connects to the service provider to invoke the service, for example through a message in an appropriate format (such as the Simple Object Access Protocol ("SOAP") format for web service messages). SOAP is a preferred protocol for transferring data in web services. The SOAP protocol defines the format for exchanging messages between a web service client and a web service server. SOAP uses an Extensible Markup Language ("XML") schema; XML is a common language standard for tagging data in web services, although other markup languages can also be used.
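To make the message format concrete, the following sketch builds a minimal SOAP 1.1 request that a service requester might send to a transformation service. The SOAP envelope namespace is the standard SOAP 1.1 one; the service namespace `urn:example:transform`, the operation name `TransformIdentifier`, and its two child elements are purely hypothetical, since the text does not define a service interface.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:transform"                   # hypothetical service namespace

def build_transform_request(identifier, target_format):
    """Return a SOAP envelope (as a string) asking a hypothetical service
    to transform a semantic identifier into another format."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    operation = ET.SubElement(body, f"{{{SERVICE_NS}}}TransformIdentifier")
    ET.SubElement(operation, f"{{{SERVICE_NS}}}identifier").text = identifier
    ET.SubElement(operation, f"{{{SERVICE_NS}}}targetFormat").text = target_format
    return ET.tostring(envelope, encoding="unicode")

print(build_transform_request("age::employee::employee database", "native"))
```

In a real deployment the element names and types would be taken from the WSDL description obtained through the registry rather than hard-coded as they are here.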
Once a transformation operation exists for a semantic identifier, a database 112, a database 112 containing one or more semantic identifiers, an information system, an information system containing one or more semantic identifiers, or another project, it can be transformed into or from, mapped to, linked to, used together with, or associated with any other semantic identifier, database 112, database 112 containing one or more semantic identifiers, information system, information system containing one or more semantic identifiers, or other project that shares at least one transformation operation. In embodiments, for example with an atomic data repository serving as the hub for transformation operations, the mapping of transformation operations tracks, among other things, data that is transformed back and forth between an initial semantic context and a transformed semantic context. Depending on the context, the appropriate identifier for the data can change. For example, when the semantic context changes, the syntax and/or string can be changed or truncated to enable more efficient storage or faster processing, or the relationships used to form the unique identifier can be changed. Thus, across the various contexts in which a data item is used, dynamic identifiers can combine the advantage of traceable transformation with the advantages of fast processing, efficient data handling, and efficient operation.
A given project, such as a project that has an identity in a model, can exist in multiple forms or instances, for example a physical instance and a logical modeling instance. Figure 11 depicts one project, an employee information table 5902. The concept or entity "employee," however, can exist in an enterprise in many different forms. For example, the employee table 5902 can exist in a physical data store in the form of a physical table that holds values related to employees. Alternatively, the employee entity can be represented as a logical instance, such as an icon or text representing the employee in a logical modeling activity 5908, or in various other forms or instances. That is, the same project (including any related data or metadata) can exist in multiple forms or instances in databases, data repositories, models, hubs, and so on, across views, models, structures, or data integration environments. Figure 12 depicts the employee table 5902 existing as one form or single instance in a database 6002, and/or as more than one form or instance in a database 6004 or hub 6008.
To distinguish among the various forms or instances of a project, any distinguishing feature can be used, such as the abstraction level, a physical property of the project, the position of the project in a hierarchy, the location of the project in a database, the context in which the project is found, the syntax of the project, the relationships of the project to other projects, the attributes of the project, the class of the project, or other characteristics. For example, referring to Fig. 5, the projects (in this case, individuals) can be distinguished by age, gender, hair color, IQ, political affiliation, and/or number of doctor visits in the past three months. If age is chosen as the distinguishing feature, Jessica is the only one younger than 10, Betty is the only one between 57 and 67, and Jim is the only one who is 37. In another example, different forms or instances of a project can exist at different abstraction levels or in different contexts. For example, an employee table can exist in a hub 6102 in multiple forms or instances, such as a physical employee table 5904 (for example, one used to store values related to employee data in a database) and a logical employee model 5908 (for use in a view of an employee-related process).
Distinguishing among different instances of a given project can enable various other methods and processes. For example, in one embodiment a project, such as a table called "employee," can be brought into a hub. The hub can then contain instances of "employee" in two forms: one corresponding to a physical database instance and another corresponding to a logical modeling activity. A distinguishing feature, such as a characteristic attached to the project within the hub, can be used to distinguish the physical instance from the logical model instance or form. In embodiments, the distinguishing feature can be called the abstraction level, so as to distinguish a logical abstraction level from a physical abstraction level. In other cases, other features can be associated with the project in the hub, such as identifiers of different forms, relationships, classes, attributes, physical locations, logical locations, models, and so on.
As shown in Figure 14, when performing an operation, such as selecting data to load into a database, transforming data, or generating a query, a system such as a transform engine 6204 can fetch, load, or retrieve all projects from a hub 6208 or database 6210. It can then select or filter 6204 the projects according to any distinguishing feature. For example, it can select or filter out those instances or forms that have a physical abstraction level, have a particular relationship to other projects, have a logical abstraction level, were created before a specified date and time, or have any other distinguishing feature. The methods and systems described herein thus selectively process instances of the same project or entity according to any distinguishing feature.
As shown in Figure 15, when performing a data integration operation, such as a transformation operation, in response to a query 6202, the transform engine 6204 can filter or select projects, including any data and/or metadata, at the hub 6208 or database 6210, and fetch, load, or retrieve only those projects at the relevant abstraction level. For example, it can filter out those instances or forms that have a logical abstraction level and retain only those that have a physical abstraction level. The filtering or selection can be performed at run time or at design time, and can be performed in batch, in real time, or continuously. In embodiments, such a filtering or selection method can be provided in the form of an RTI service in a service-oriented architecture.
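Selection by distinguishing feature can be sketched with a small in-memory hub. The `HubItem` class, its two fields, and the `select` function are hypothetical stand-ins; the disclosure does not prescribe how a hub represents its contents.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HubItem:
    """Hypothetical record for one form or instance of a project in a hub."""
    name: str
    abstraction_level: str  # "physical" or "logical"

def select(items, **features):
    """Keep only the items whose attributes equal every requested feature."""
    return [item for item in items
            if all(getattr(item, key) == value for key, value in features.items())]

hub = [
    HubItem("employee", "physical"),    # e.g. a physical database table
    HubItem("employee", "logical"),     # e.g. an entity in a logical model
    HubItem("department", "physical"),
]

# Retain only the physical form of "employee", filtering out the logical one.
print(select(hub, name="employee", abstraction_level="physical"))
```

Because `select` accepts arbitrary keyword arguments, the same function covers any distinguishing feature that is stored as an attribute, not only the abstraction level.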
Filtering or selection can be based on information obtained by the transform engine and/or system at development time, design time, or run time, such as a mapping of a data model, a mapping of a metadata model, distinguishing features, the relationships of a project to other projects, the attributes of a project, or the syntax of an identifier. In embodiments, this information can be dynamically updated in real time.
Throughout the process, the closer the filtering or selection is performed to the hub or database, the more efficient and faster the operation. As shown in Figure 16, the transform engine 6204 can perform a transformation operation on the query 6202 itself, producing a modified query 6402 that can be sent directly to the hub 6208 or database 6210 for further processing. For example, the modified query 6402 can be presented in a form directly compatible with the native format of the hub 6208 or database 6210. By presenting the query in the native format of the database 6210, the system can improve the efficiency with which the query is processed. Similarly, the query 6402 can be filtered, or a command such as a select command can be generated, to retain logical modeling entities rather than physical entities, in which case the query 6402 can be presented in a form suited to logical modeling activities (for example, a graphical user interface) rather than a form suited to a database. Of course, not only queries but also other messages and operations can be filtered according to abstraction level, making it possible to track the same entity across the data integration platform and to process it according to the appropriate operating environment for a particular data integration activity.
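Pushing the filter down to the data store can be illustrated with SQLite from the Python standard library. The sketch assumes, purely for illustration, that the hub is backed by a relational table named `hub_items` with `name` and `abstraction_level` columns; the function names are likewise hypothetical. The request is rewritten into native SQL so that the database, not the caller, discards the unwanted forms.

```python
import sqlite3

ALLOWED_TABLES = {"hub_items"}  # hypothetical table backing the hub

def rewrite_query(table, abstraction_level):
    """Rewrite a generic request as native SQL plus bound parameters."""
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unknown table: {table}")
    sql = f"SELECT name FROM {table} WHERE abstraction_level = ? ORDER BY name"
    return sql, (abstraction_level,)

def make_demo_store():
    """Create an in-memory database with a few example rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hub_items (name TEXT, abstraction_level TEXT)")
    conn.executemany(
        "INSERT INTO hub_items VALUES (?, ?)",
        [("employee", "physical"), ("employee", "logical"), ("department", "physical")],
    )
    return conn

def fetch_names(conn, table, abstraction_level):
    """Run the rewritten query so that filtering happens inside the database."""
    sql, params = rewrite_query(table, abstraction_level)
    return [row[0] for row in conn.execute(sql, params)]

conn = make_demo_store()
print(fetch_names(conn, "hub_items", "physical"))
```

The alternative, fetching every row and filtering in the transform engine, gives the same result but moves more data, which is the inefficiency the text describes.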
The methods and systems described herein can be used to capture semantic context and to handle data integration tasks for a variety of projects relevant to an enterprise, such as objects, data items, data, columns, rows, tables, databases, instances, attributes, metadata, concepts, topics, subjects, semantic identifiers, other identifiers, RFID tags, vendors, suppliers, customers, individuals, teams, organizations, users, networks, systems, devices, families, stores, products, product lines, product features, product specifications, product attributes, prices, costs, bills of materials, shipping data, tax data, courses, educational programs, locations, maps, departments, organizations, organisms, processes, rules, laws, fee systems, goods, services, and/or service offerings.
The methods and systems described herein can be used in a variety of semantic contexts, such as a step in an enterprise method, data in a database, data in a row or column, a row or column in a table, a row or column in a database, data in a table, a table in a database, metadata in a database, a project in a hub or repository, a project in a database, a project in a table, a project in a column, a project in a row, a person in an organization, the sender or recipient of a communication, a user on a network, a system on a network, a device on a network, a member of a family, an article in a store, a dish on a menu, a product in a product line, a product in a product offering, a course or step in an education or training program, a location on a map, the location of an article, a department of an organization, an individual in a group, a rule in a system of rules, a service in a set of services, an entity in the organizational hierarchy of an enterprise, an entity in a supply chain, a consumer in a market, the buyer in a purchase decision, the price of a good or service, the cost of a good or service, a component of a product or system, a step of a method, a member of a group, and many others.
Although the invention has been described in connection with certain preferred embodiments, it should be understood that those skilled in the art will recognize other embodiments, and such other embodiments are within the scope of the present disclosure.

Claims (35)

1. A data integration method, comprising:
providing a semantic identifier for identifying an item based on the item's relationship to another item;
obtaining a mapping of a data model, so that the semantic identifier of an item in the data model can be determined; and
associating the mapping with a data integration function, wherein execution of the data integration function is based on at least one of the mapping and the semantic identifier.
2. The method of claim 1, wherein the item comprises one or more of an object, data item, data, column, row, table, database, instance, attribute, metadata, concept, topic, subject, identifier, semantic identifier, RFID tag, vendor, supplier, customer, person, team, organization, user, network, system, device, household, store, product, product line, product feature, product specification, product attribute, price, cost, bill of materials, shipping data, tax data, course, educational program, location, map, department, organism, process, rule, law, rate, good, service and service offering.
3. The method of claim 1, wherein the relationship relates to the position of the item in a relationship hierarchy.
4. The method of claim 1, wherein the semantic identifier is a unique identifier of the item.
5. The method of claim 1, wherein the semantic identifier is based on a number of relationships that is fewer than all of the relationships between the item and other items but sufficient to ensure that the identifier is unique.
6. The method of claim 1, wherein the semantic identifier is based on the minimum number of relationships that ensures that the identifier is unique.
7. The method of claim 1, wherein the semantic identifier is a context-dependent identifier of the item.
8. The method of claim 1, wherein the semantic identifier is stored in an atomic format.
9. The method of claim 1, wherein the semantic identifier is stored in a data repository in an atomic format.
10. The method of claim 1, wherein the semantic identifier is dynamic.
11. The method of claim 1, wherein the semantic identifier changes according to context.
12. A method of performing a data integration process, comprising:
associating a model with a data set; and
forming a select command for selecting an item from the data set, wherein the form of the select command is based on a distinguishing characteristic of the item determined from the model.
13. The method of claim 12, wherein the select command/query is formed at run time of a process that uses the select command/query.
14. The method of claim 12, wherein the select command/query is formed at design time of a process that uses the select command/query.
15. A method of performing a data integration process, comprising:
associating a model with a data set; and
forming a query for querying the data set, wherein the form of the query is based on a distinguishing characteristic of an item determined from the model.
16. The method of claim 15, wherein the select command/query is formed at run time of a process that uses the select command/query.
17. The method of claim 15, wherein the select command/query is formed at design time of a process that uses the select command/query.
18. A data integration system, comprising:
a semantic identifier for identifying an item based on the item's relationship to another item;
a mapping of a data model that enables the semantic identifier of an item in the data model to be determined; and
means for associating the mapping with a data integration function, wherein execution of the data integration function is based on at least one of the mapping and the semantic identifier.
19. The system of claim 18, wherein the relationship relates to the position of the item in a relationship hierarchy.
20. The system of claim 18, wherein the semantic identifier is a unique identifier of the item.
21. The system of claim 18, wherein the semantic identifier is based on a number of relationships that is fewer than all of the relationships between the item and other items but sufficient to ensure that the identifier is unique.
22. The system of claim 18, wherein the semantic identifier is based on the minimum number of relationships that ensures that the identifier is unique.
23. The system of claim 18, wherein the semantic identifier is a context-dependent identifier of the item.
24. The system of claim 18, wherein the semantic identifier is stored in an atomic format.
25. The system of claim 18, wherein the semantic identifier is stored in a data repository in an atomic format.
26. The system of claim 18, wherein the semantic identifier is dynamic.
27. The system of claim 18, wherein the semantic identifier changes according to context.
28. The system of claim 18, wherein the semantic identifier recursively captures an indirect relationship with a second item by capturing a direct relationship with a first item that itself has a direct relationship with the second item.
29. The system of claim 18, wherein the semantic identifier is captured in a string, and wherein the string is truncated if not all of its elements are needed for a unique identifier.
30. The system of claim 18, wherein the data integration function is a conversion operation.
31. The system of claim 30, wherein the conversion operation modifies one or more of the format of the semantic identifier, the language of the semantic identifier, and the data model of the semantic identifier.
32. The system of claim 30, wherein the mapping of the conversion operation can track data that is converted back and forth between an initial semantic context and a converted semantic context.
33. The system of claim 30, wherein the conversion operation is provided as a service in a service-oriented architecture.
34. The system of claim 18, further comprising a filter that selectively filters instances of a logical entity according to a distinguishing characteristic of the entity.
35. The system of claim 34, wherein the distinguishing characteristic is obtained from at least one of the mapping and the semantic identifier.
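The following Python sketch is a non-authoritative illustration of how a select command might be formed from a distinguishing characteristic determined from a model, in the spirit of claims 12 and 34. The `MODEL` dictionary, the `form_select` function, and the table and column names are all hypothetical and chosen for this example; the claims do not prescribe a particular representation. The standard-library `sqlite3` module is used only so the sketch can be run.

```python
import sqlite3

# Hypothetical model: maps a logical entity to the physical table that holds
# it and to the characteristic that distinguishes its instances.
MODEL = {
    "preferred_customer": {"table": "customer", "column": "tier", "value": "gold"},
    "regular_customer": {"table": "customer", "column": "tier", "value": "standard"},
}


def form_select(model, entity):
    """Build a parameterized SELECT from the entity's distinguishing characteristic."""
    spec = model[entity]
    table, column = spec["table"], spec["column"]
    # Table and column names come from the trusted model, not from user input;
    # the distinguishing value is passed as a bound parameter.
    sql = f'SELECT * FROM "{table}" WHERE "{column}" = ?'
    return sql, (spec["value"],)


# Illustrative data set associated with the model.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, tier TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("Ann", "gold"), ("Bob", "standard"), ("Cy", "gold")],
)

# The command is formed when the process runs, from whatever the model says.
sql, params = form_select(MODEL, "preferred_customer")
rows = conn.execute(sql, params).fetchall()
print(sql, params, rows)
```

Because the command is built from the model rather than hard-coded, the same process selects a different subset of instances when the model's distinguishing characteristic changes. Forming the command once at design time, instead of on each run, would be an equally valid variant of this sketch.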
CNA2005800290342A 2004-08-31 2005-08-31 Methods and systems for semantic identification in data systems Pending CN101044472A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US60640704P 2004-08-31 2004-08-31
US60/606,407 2004-08-31

Publications (1)

Publication Number Publication Date
CN101044472A true CN101044472A (en) 2007-09-26

Family

ID=36000723

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA2005800290342A Pending CN101044472A (en) 2004-08-31 2005-08-31 Methods and systems for semantic identification in data systems

Country Status (4)

Country Link
EP (1) EP1815349A4 (en)
JP (1) JP2008511936A (en)
CN (1) CN101044472A (en)
WO (1) WO2006026702A2 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7849090B2 (en) * 2005-03-30 2010-12-07 Primal Fusion Inc. System, method and computer program for faceted classification synthesis
CN101226523B (en) 2007-01-17 2012-09-05 国际商业机器公司 Method and system for analyzing data general condition
JP5183150B2 (en) * 2007-10-30 2013-04-17 アズビル株式会社 Information linkage window system and program
EP2112593A1 (en) 2008-04-25 2009-10-28 Facton GmbH Domain model concept for developing computer applications
US8428984B2 (en) * 2009-08-31 2013-04-23 Sap Ag Transforming service oriented architecture models to service oriented infrastructure models
US20140129533A1 (en) * 2012-11-08 2014-05-08 Microsoft Corporation Intermediary model to handle web vocabulary conflicts
US10360201B2 (en) * 2016-07-11 2019-07-23 Investcloud Inc Data exchange common interface configuration
KR102150335B1 (en) * 2019-01-17 2020-09-01 주식회사 쓰리데이즈 Database management system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5692184A (en) * 1995-05-09 1997-11-25 Intergraph Corporation Object relationship management system
US6044374A (en) * 1997-11-14 2000-03-28 Informatica Corporation Method and apparatus for sharing metadata between multiple data marts through object references
WO2002021259A1 (en) * 2000-09-08 2002-03-14 The Regents Of The University Of California Data source integration system and method
US6937983B2 (en) * 2000-12-20 2005-08-30 International Business Machines Corporation Method and system for semantic speech recognition

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102792301A (en) * 2010-03-12 2012-11-21 微软公司 Semantics update and adaptive interfaces in connection with information as a service
CN102792301B (en) * 2010-03-12 2015-05-06 微软公司 Semantics update and adaptive interfaces in connection with information as a service
CN102402507A (en) * 2010-09-07 2012-04-04 重庆邮电大学 Heterogeneous data integration system for service-oriented architecture (SOA) multi-message mechanism
CN102402507B (en) * 2010-09-07 2014-07-09 重庆邮电大学 Heterogeneous data integration system for service-oriented architecture (SOA) multi-message mechanism
CN102419744A (en) * 2010-10-20 2012-04-18 微软公司 Semantic analysis of information
US9076152B2 (en) 2010-10-20 2015-07-07 Microsoft Technology Licensing, Llc Semantic analysis of information
CN102419744B (en) * 2010-10-20 2015-07-22 微软公司 Semantic analysis of information
US11301523B2 (en) 2010-10-20 2022-04-12 Microsoft Technology Licensing, Llc Semantic analysis of information
CN102541861A (en) * 2010-12-14 2012-07-04 金蝶软件(中国)有限公司 Method, device and system for establishing mapping relation in system integration
CN104461494A (en) * 2014-10-29 2015-03-25 中国建设银行股份有限公司 Method and device for generating data packet of data processing tool
CN104461494B (en) * 2014-10-29 2018-10-26 中国建设银行股份有限公司 A kind of method and device for the data packet generating data processing tools
CN111373365A (en) * 2017-10-12 2020-07-03 惠普发展公司,有限责任合伙企业 Pattern syntax

Also Published As

Publication number Publication date
EP1815349A2 (en) 2007-08-08
JP2008511936A (en) 2008-04-17
WO2006026702A2 (en) 2006-03-09
EP1815349A4 (en) 2008-12-10
WO2006026702A3 (en) 2006-04-27

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Open date: 20070926