Data Cleansing
Your business intelligence is only as good as your data. To become more agile and effective, your organization must be able to consolidate its disparate data repositories, its differing versions of the same entities and its redundant silos.
Data cleansing is a key component of every successful data integration, data warehouse and business intelligence initiative. Dirty and incomplete data limit the usefulness of data warehouses and business intelligence systems. A typical data warehousing project requires approximately 18 months of development and a substantial IT budget, and more than half of that cost typically goes to data integration and data cleansing.
FAST ESP’s Data Cleansing feature uses linguistics to convert multiple structured data repositories into a clean master index. It resolves conflicts and removes duplicate data, leaving only the actionable information. Higher-quality data makes users and organizations more successful while making more cost-effective use of the IT architecture. It also helps improve decisions, reduce IT costs and effort, and increase productivity.
Data Cleansing is FAST ESP’s framework for data integration and data cleansing. It is a collection of technologies, best-practice methodologies and design patterns that can be reused across a wide range of data integration challenges. Data Cleansing applies extract, transform and load (ETL) technology for data extraction and integration, and it drives FAST ESP’s index processing for linguistic analysis and fuzzy matching. Records are automatically compared and merged, creating a clean master index in as little as a few days.
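As a rough illustration of what automatically comparing and merging records can look like, the sketch below uses Python's standard difflib module to fuzzy-match records by name and fold duplicates into a single master list. It is not FAST ESP's implementation: the function names, field names, sample records and the 0.85 similarity threshold are all assumptions made for this example, and plain string similarity stands in for the linguistic analysis described above.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, replace punctuation with spaces and collapse whitespace."""
    cleaned = "".join(c if c.isalnum() else " " for c in text.lower())
    return " ".join(cleaned.split())


def similarity(a, b):
    """Return a 0..1 similarity score between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def merge(a, b):
    """Merge two records field by field, keeping the longer (non-empty) value."""
    merged = dict(a)
    for field, value in b.items():
        if len(str(value or "")) > len(str(merged.get(field) or "")):
            merged[field] = value
    return merged


def build_master(records, key="name", threshold=0.85):
    """Fold records whose key fields are similar enough into one master entry."""
    master = []
    for record in records:
        for i, existing in enumerate(master):
            if similarity(record[key], existing[key]) >= threshold:
                master[i] = merge(existing, record)  # duplicate: merge fields
                break
        else:
            master.append(dict(record))  # no match found: new master entry
    return master


# Hypothetical sample data: two spellings of one company plus an unrelated one.
records = [
    {"name": "Acme Corporation", "city": "Oslo", "phone": ""},
    {"name": "ACME Corporaton", "city": "", "phone": "+47 555 0100"},
    {"name": "Globex Ltd", "city": "Boston", "phone": ""},
]

master = build_master(records)
for entry in master:
    print(entry)
```

In this sketch the two Acme records collapse into one entry that keeps the city from the first record and the phone number from the second, while the Globex record stays separate. The pairwise comparison is quadratic in the number of records, so a production system would normally narrow the candidates first, for example by looking them up in an index.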
Source: FAST ESP Brochure 2007
