Unstructured Information Management Architecture

IBM technology that supports the implementation, composition, and deployment of UIMA applications.

Date Posted: December 16, 2004

Update (June 13, 2008): The UIMA SDK has moved to Apache as an open-source Incubator project. Two Apache UIMA components (SemanticSearch 2.1 and the IBM UIMA wrapper) are still available here.

What is Unstructured Information Management Architecture?

Unstructured Information Management (UIM) applications are software systems that analyze large volumes of unstructured information in order to discover knowledge that is relevant to an end user. UIMA is a framework and SDK for developing such applications. An example UIM application might ingest plain text and identify entities, such as persons, places, and organizations, or relations, such as works-for or located-at. UIMA enables such an application to be decomposed into components, such as "language identification" > "language-specific segmentation" > "sentence boundary detection" > "entity detection (person/place names, etc.)". Each component must implement interfaces defined by the framework and must provide self-describing metadata through XML descriptor files.

The framework manages these components and the data flow between them. Components are written in Java™ or C++; the data that flows between components is designed for efficient mapping between these languages. In addition, UIMA provides capabilities for wrapping components as network services, and it can scale to large volumes by replicating processing pipelines over a cluster of networked nodes.
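The decomposition described above can be pictured with a minimal Java sketch. This is an illustration under stated assumptions, not the UIMA API: the names `PipelineSketch`, `Document`, `Annotation`, `Component`, `SentenceDetector`, and `EntityDetector` are hypothetical, the two detectors use deliberately naive regular expressions, and the XML descriptor metadata is left out entirely. It shows only the general idea of components that share one interface and pass a common data object along a fixed sequence.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only: every type below is hypothetical, not the UIMA API.
final class PipelineSketch {

    /** A typed span over the document text, identified by character offsets. */
    static final class Annotation {
        final String type;
        final int begin;
        final int end;

        Annotation(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Shared data that flows from one component to the next. */
    static final class Document {
        final String text;
        final List<Annotation> annotations = new ArrayList<>();

        Document(String text) {
            this.text = text;
        }

        /** Returns the text covered by each annotation of the given type, in insertion order. */
        List<String> covered(String type) {
            List<String> out = new ArrayList<>();
            for (Annotation a : annotations) {
                if (a.type.equals(type)) {
                    out.add(text.substring(a.begin, a.end));
                }
            }
            return out;
        }
    }

    /** The single interface every component implements so the pipeline can drive it. */
    interface Component {
        void process(Document doc);
    }

    /** Naive sentence boundary detection: a sentence ends at '.', '!' or '?'. */
    static final class SentenceDetector implements Component {
        private static final Pattern SENTENCE = Pattern.compile("[^.!?]+[.!?]?");

        @Override
        public void process(Document doc) {
            Matcher m = SENTENCE.matcher(doc.text);
            while (m.find()) {
                int begin = m.start();
                int end = m.end();
                // Trim surrounding whitespace so the offsets cover only the sentence.
                while (begin < end && Character.isWhitespace(doc.text.charAt(begin))) begin++;
                while (end > begin && Character.isWhitespace(doc.text.charAt(end - 1))) end--;
                if (begin < end) doc.annotations.add(new Annotation("Sentence", begin, end));
            }
        }
    }

    /** Naive entity detection: every capitalized word is treated as a candidate name. */
    static final class EntityDetector implements Component {
        private static final Pattern CAPITALIZED = Pattern.compile("\\b[A-Z][a-z]+\\b");

        @Override
        public void process(Document doc) {
            Matcher m = CAPITALIZED.matcher(doc.text);
            while (m.find()) {
                doc.annotations.add(new Annotation("Entity", m.start(), m.end()));
            }
        }
    }

    /** The fixed component order used in this sketch. */
    static List<Component> defaultPipeline() {
        return Arrays.asList(new SentenceDetector(), new EntityDetector());
    }

    /** Plays the role of the framework: runs each component in order over one document. */
    static Document run(String text, List<Component> pipeline) {
        Document doc = new Document(text);
        for (Component c : pipeline) {
            c.process(doc);
        }
        return doc;
    }

    public static void main(String[] args) {
        Document doc = run("Alice works for Acme. Bob lives in Paris.", defaultPipeline());
        System.out.println(doc.covered("Sentence"));
        System.out.println(doc.covered("Entity"));
    }
}
```

Because each component touches only the shared `Document`, a stage can be swapped or reordered without changing the others. Annotations are stored as character offsets rather than copied strings, which keeps the shared data compact and easy to map to another language.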

How does it work?

The UIMA SDK was originally developed by IBM® and made available here at alphaWorks®. In October 2006, IBM donated the UIMA SDK to Apache; ongoing development will be done in the open-source style by the Apache UIMA community. For further details about Apache UIMA and its development process, please refer to the Apache UIMA Web site.

Some IBM products in the field still use older IBM UIMA releases instead of the new Apache UIMA releases. If you need an older IBM UIMA release, please check the IBM product page for UIMA on developerWorks® to get the product-aligned version of IBM UIMA. The Java source code for some of the older IBM UIMA releases is available at SourceForge.

IBM technology related to Apache UIMA

The alphaWorks UIMA pages contain some additional components and technologies that work with Apache UIMA and enrich the functionality of Apache UIMA. Currently, the available components are as follows:

About the technology author(s)

Technology related to UIMA was developed by teams from IBM Research and IBM Software Group. It is a world-wide effort, with significant participation from the following IBM sites:

Apache UIMA is being developed by the Apache open-source community.
