Data Quality Features Comparison Matrix

License Type and Indemnification

Feature | Talend Open Studio for Data Quality | Talend Platform for Data Management
Open Source License | Yes | -
Access to Source Code | Yes | Yes
Subscription License | - | Yes
Indemnification/ Warranty | - | Yes

Support and Documentation

Feature | Talend Open Studio for Data Quality | Talend Platform for Data Management
Community-based: Forums, Bugtracker... | Yes | Yes
Access to Talend Technical Support | Yes | Yes
Documentation | Yes | Yes
Premium Service Levels | - | Yes

Core Data Profiling

Feature | Talend Open Studio for Data Quality | Talend Platform for Data Management
Database Analyses (Database Content, Catalog, Schema) | Yes | Yes
Table Analyses (Column Set, Functional Dependency, DQ Rule) | Yes | Yes
Column Analyses | Yes | Yes
Redundancy Analysis | Yes | Yes
Column Correlation Analyses (numerical, time, nominal) | Yes | Yes
Visual Display of Results (text and charts) | Yes | Yes
Drill Down in Data | Yes | Yes
Custom Analysis | Yes | Yes

Advanced Data Profiling

Feature | Talend Open Studio for Data Quality | Talend Platform for Data Management
SQL Pattern Library | Yes | Yes
Regular Expression Pattern Library | Yes | Yes
Pattern Test View | Yes | Yes
Discovery of Patterns in Data (Pattern Frequency, SoundEx Frequency) | Yes | Yes
Pattern Customization | Yes | Yes
Indicators (Simple, Text, Summary and Advanced Statistics & Metrics) | Yes | Yes
Indicator Customization in SQL or Java (Count, Real Value, Match, Frequency) | Yes | Yes
Indicators & Patterns Shared on Talend Exchange | Yes | Yes
SQL Business Rules on table analyses | Yes | Yes
Data Quality Thresholds | Yes | Yes
Analysis results stored in datamart | - | Yes
Batch Execution of Analyses | - | Yes
History of Analyses | - | Yes
Report Generation | - | Yes
JasperReports XML-based Report | - | Yes

Data Cleansing

Feature | Talend Open Studio for Data Quality | Talend Platform for Data Management
Fully Integrated with Talend Enterprise Data Integration | - | Yes
Pattern Matching | - | Yes
Interval Matching | - | Yes
Name and Address Cleansing | - | Yes
Third-Party Address Validation Services | - | Yes
Fuzzy Matching (Soundex, SoundexFR, Levenshtein, Jaro-Winkler, Q-gram) | - | Yes
Record Matching (match, no match, suspect) | - | Yes
Fuzzy Deduplication | - | Yes
Threshold Verification | - | Yes

Reporting and Portal

Feature | Talend Open Studio for Data Quality | Talend Platform for Data Management
Intuitive Administration Web-based Console | - | Yes
Report on Potential Primary Keys | - | Yes
Report on Orphan Tables | - | Yes
Access to All Reports Generated Through The Studio | - | Yes
Access to OLAP Analysis Structures | - | Yes
Customized Queries | - | Yes
Customized Reports | - | Yes
Report Import/Export | - | Yes
Amazon EC2 Lifecycle Control | - | Yes

Teamwork and Development Consolidation

Feature | Talend Open Studio for Data Quality | Talend Platform for Data Management
Shared Repository with Check In/Out | - | Yes
Access Rights Management | - | Yes
User Management with LDAP Directory | - | Yes
Store Analyses, Metadata & Projects in SVN | - | Yes
Store Analyses, Metadata & Projects in existing SVN | - | Yes

Data Stewardship

Feature | Talend Open Studio for Data Quality | Talend Platform for Data Management
Role-based Task Resolution | - | Yes
Assign Tasks | - | Yes
Resolve Data Integrity Issues | - | Yes
Resolve Data Matching Conflicts | - | Yes
Web User Environment | - | Yes

Implementation

Feature | Talend Open Studio for Data Quality | Talend Platform for Data Management
Job Designer | - | Yes
800+ Connectors | - | Yes
Wizards | - | Yes
Load Balancing, High Availability, Failover | - | Yes
Hadoop (optional) | - | Yes

Open Source License


Talend Open Studio products are free to download and use under an open source license.
Details of the license used can be found on the specific talend.com product download page.

Access to Source Code


All open source projects are accessible from the public http://www.talendforge.org web site. For complete transparency and consistency, Talend also provides the source code of all of the tools available in the commercial edition to clients who request access.

Subscription License


The "enterprise" versions include value-added features and services that enhance the open source products; these versions are distributed under a commercial license.
Talend's pricing model guarantees transparency and predictability: the price is not based on data volumes or on potential additional needs for connectors or CPUs. Rather, it corresponds to the number of developers (Studio), the level of features (edition selected) and the subscription term.

This subscription approach guarantees your return on investment: the number of licenses can be increased or decreased every year to adapt to the evolution of a project's range and its staff.

The Talend solutions are cheaper to deploy, maintain and support; they are 50 to 80% less expensive than the equivalent proprietary solutions.

Indemnification/ Warranty


Because open source software results from collaborative development efforts, the final code combines contributions from diverse sources. If the integration of the various contributions to the code is not carefully managed and controlled, use of the final software might infringe upon the original contributors’ rights. The end user might then be subject to legal and financial prosecution for infringement, even though such infringement was not intentional. Talend offers an indemnification clause to its subscription customers. This guarantees customers that Talend will provide legal and financial protection in the event that the Talend code infringes the rights of a third party.

Community-based: Forums, Bugtracker...


The Talend user community, composed of tens of thousands of professionals, is extremely active. The main contributions of the community include:

  • Testing and the quality of new versions,
  • Requests for new features,
  • Product translation and localization,
  • Support and exchanges via the forums,
  • Development and sharing of new components, connectors, jobs, models and other plug-ins.

Talend Exchange enables community members to publish their own plug-ins in order to share them with other users. Some of these contributions are ultimately integrated into the product, after Talend’s in-house R&D team completes in-depth testing and improvements.


Additionally, Talend contributes to numerous key open source projects and is a member of the Eclipse and Apache Foundations.


For more information, see http://coders.talend.com

Access to Talend Technical Support


By subscribing to Talend Support Services, you benefit from the experience of our in-house technical experts, who are in daily contact with our R&D team. These services were established to ensure effectiveness, security, and peace of mind for our subscription customers.



Documentation


The documentation is available as a free download in PDF format, in English and French.

Database Analyses (Database Content, Catalog, Schema)


Offers an overview of the content of a catalog. It computes the number of tables and the number of rows per table for each catalog and/or schema.
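
As a rough illustration of what such an overview computes, the sketch below counts the tables in a database and the rows in each one. It uses Python's built-in sqlite3 module and an in-memory database as a stand-in; the table names and the rows_per_table helper are assumptions made for the example and are not part of the Talend product.

```python
import sqlite3


def rows_per_table(conn):
    """Return {table_name: row_count} for every user table in a SQLite database."""
    cursor = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    )
    counts = {}
    for (name,) in cursor.fetchall():
        # Identifiers cannot be bound as parameters, so quote them explicitly.
        quoted = '"' + name.replace('"', '""') + '"'
        counts[name] = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
    return counts


# Hypothetical sample data used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(10, 1), (11, 1), (12, 2)])

overview = rows_per_table(conn)
print(len(overview), "tables:", overview)
```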

Table Analyses (Column Set, Functional Dependency, DQ Rule)


Apply your Data Quality rules (DQ Rules) to a selected table.

Column Analyses


Select the indicators to compute on a chosen column, such as number of nulls, frequency table, summary statistics, pattern matching indicators, etc.
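
To give an idea of what a few of these indicators measure, here is a minimal Python sketch that computes a row count, null count, distinct count and frequency table for one column. The profile_column function and the sample values are hypothetical and only illustrate the concepts; they do not reflect how the Studio computes its indicators.

```python
from collections import Counter


def profile_column(values):
    """Compute a few simple indicators for one column (None stands for NULL)."""
    non_null = [v for v in values if v is not None]
    return {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(set(non_null)),
        # Most frequent values first; ties keep first-seen order.
        "frequency": Counter(non_null).most_common(),
    }


# Hypothetical country-code column.
profile = profile_column(["FR", "US", None, "FR", "DE", "FR", None])
print(profile)
```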

Redundancy Analysis


Compares the data of two sets of columns. You can use this analysis to verify foreign key/primary key relationships or to compare the content of two tables.
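
The comparison can be pictured as a set operation. The following Python sketch, with a hypothetical compare_columns helper and made-up key values, lists the values found on only one side, which is how foreign keys without a matching primary key would show up.

```python
def compare_columns(column_a, column_b):
    """Compare the distinct values of two columns."""
    a, b = set(column_a), set(column_b)
    return {
        "matching": sorted(a & b),
        "only_in_a": sorted(a - b),
        "only_in_b": sorted(b - a),
    }


# Hypothetical data: order rows referencing customer ids.
order_customer_ids = [1, 1, 2, 7]   # foreign key column
customer_ids = [1, 2, 3]            # primary key column

result = compare_columns(order_customer_ids, customer_ids)
print(result)
```

Here the value 7 appears only in the foreign key column, so it would point to an order whose customer does not exist.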

Column Correlation Analyses (numerical, time, nominal)


Shows correlations between nominal and numerical data in a bubble chart, so that data quality issues can be found easily by looking at outliers in the graph.

Visual Display of Results (text and charts)


View profiling results in compelling charts and graphs.

Drill Down in Data


Use the Data Explorer to drill down into individual data sources and view specific records.

Custom Analysis


Customize the analysis you run on your data to test that it conforms to any data shape, length or set of discrete values.

SQL Pattern Library


Use out-of-the-box or custom SQL expressions to evaluate and test data.

Regular Expression Pattern Library


Use regular expressions to evaluate data validity, including e-mail, part numbers, postcodes and more.
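
As an illustration of this kind of check, the Python sketch below counts how many values match a regular expression. The two patterns are deliberately simplified assumptions written for this example; they are not the patterns shipped in the Talend library.

```python
import re

# Simplified, illustrative patterns (assumptions, not the shipped library).
PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "us_zip": re.compile(r"\d{5}(-\d{4})?"),
}


def match_counts(values, pattern):
    """Return (matching, not_matching) counts; the whole value must match."""
    matching = sum(1 for v in values if pattern.fullmatch(v))
    return matching, len(values) - matching


emails = match_counts(["a@b.com", "bad", "x@y.org"], PATTERNS["email"])
zips = match_counts(["75001", "7500", "12345-6789"], PATTERNS["us_zip"])
print(emails, zips)
```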

Pattern Test View


Create your own patterns in SQL or using regular expressions and test them against your data.

Discovery of Patterns in Data (Pattern Frequency, SoundEx Frequency)


Identify common data shapes and patterns in your data.
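
One simple way to picture a data shape is to replace every character by a symbol for its class and then count the resulting shapes. The Python sketch below assumes the convention "A" for uppercase letters, "a" for lowercase letters and "9" for digits; both the convention and the function names are assumptions made for the example.

```python
from collections import Counter


def shape(value):
    """Map each character to a class symbol: A, a, 9, or the character itself."""
    symbols = []
    for ch in value:
        if ch.isdigit():
            symbols.append("9")
        elif ch.isalpha():
            symbols.append("A" if ch.isupper() else "a")
        else:
            symbols.append(ch)
    return "".join(symbols)


def pattern_frequency(values):
    """Count how often each shape occurs, most frequent first."""
    return Counter(shape(v) for v in values).most_common()


# Hypothetical part numbers.
frequencies = pattern_frequency(["AB-123", "CD-456", "ef-78"])
print(frequencies)
```

Rare shapes in the result, such as the lowercase one here, are often the values worth inspecting first.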

Pattern Customization


Develop custom analyses to test that your data conforms to any pattern you specify.

Indicators (Simple, Text, Summary and Advanced Statistics & Metrics)


Indicators include row counts, null counts, unique values, duplicate counts, blank counts, min/max lengths, frequencies, patterns, specific phone number statistics and much more. They also include mathematical statistics such as mean, median and range.
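
For readers who want to see the summary statistics spelled out, this short Python sketch computes mean, median and range with the standard statistics module. The summary_indicators function and the sample numbers are illustrative assumptions only.

```python
import statistics


def summary_indicators(numbers):
    """Compute basic summary statistics for a numeric column."""
    return {
        "min": min(numbers),
        "max": max(numbers),
        "range": max(numbers) - min(numbers),
        "mean": statistics.mean(numbers),
        "median": statistics.median(numbers),
    }


# Hypothetical numeric column.
summary = summary_indicators([4, 8, 15, 16, 23, 42])
print(summary)
```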

Indicator Customization in SQL or Java (Count, Real Value, Match, Frequency)


Create your own personalized indicators and manage them in the same way that you manage any system indicators.

Indicators & Patterns Shared on Talend Exchange


Join the community of Talend users and share your custom indicators and patterns on Talend Exchange.

SQL Business Rules on table analyses


Apply your data quality rules to a selected table, allowing you to run a custom set of tests every time the data is updated.
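
A business rule of this kind can be thought of as a SQL condition that every row should satisfy. The Python sketch below, using sqlite3 and an invented orders table, counts the rows that pass and fail one such condition. The rule_results helper, the table and the rule text are assumptions for illustration, and the table name and condition are interpolated into the SQL as trusted input.

```python
import sqlite3


def rule_results(conn, table, where_clause):
    """Count rows that satisfy a SQL condition; NULL results count as failing."""
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    passing = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {where_clause}"
    ).fetchone()[0]
    return {"rows": total, "passing": passing, "failing": total - passing}


# Hypothetical table and rule.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER, amount REAL, ordered_on TEXT, shipped_on TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, 20.0, "2024-01-02", "2024-01-05"),
        (2, -5.0, "2024-01-03", "2024-01-06"),   # negative amount
        (3, 12.5, "2024-01-04", "2024-01-01"),   # shipped before ordered
    ],
)

result = rule_results(conn, "orders", "amount > 0 AND shipped_on >= ordered_on")
print(result)
```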

Data Quality Thresholds


Define expected thresholds for data, including minimum/maximum values or lengths. Then see the outliers within your data.
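
In essence, a threshold check flags every value that falls outside the expected bounds. A minimal Python sketch, with a hypothetical outside_thresholds function and made-up ages, could look like this:

```python
def outside_thresholds(values, minimum=None, maximum=None):
    """Return the values that fall below the minimum or above the maximum."""
    flagged = []
    for value in values:
        too_low = minimum is not None and value < minimum
        too_high = maximum is not None and value > maximum
        if too_low or too_high:
            flagged.append(value)
    return flagged


# Hypothetical age column with an expected range of 0 to 120.
ages = [34, 27, -3, 51, 214]
outliers = outside_thresholds(ages, minimum=0, maximum=120)
print(outliers)
```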

Analysis results stored in datamart


Store the results of your data profiling analysis in a data mart to follow the history of data quality and track improvement or degradation in quality.

Batch Execution of Analyses


Run your analysis as part of a data integration or MDM job, or call an analysis from an outside application.

History of Analyses


Store the history of data quality.

Report Generation


Generate reports to share with your team in PDF, HTML, XLS formats and more.

JasperReports XML-based Report


Generate data quality metrics that are easily consumable by JasperReports.

Fully Integrated with Talend Enterprise Data Integration


Access our full suite of data integration tools and transformations, including access to over 800 data sources.

Pattern Matching


Ensure that data conforms to specific shapes and patterns.

Interval Matching


Look for outliers in the number of times items appear in an attribute.

Name and Address Cleansing


Cleanse common address attributes, such as name, address, state, city and postal code, using included patterns and reference data. Leverage any trusted source to standardize and enrich your data.

Third-Party Address Validation Services


Leverage third-party address validation vendors to check addresses and validate them for postal discounts.

Fuzzy Matching (Soundex, SoundexFR, Levenshtein, Jaro-Winkler, Q-gram)


Includes algorithms for finding relationships in data using fuzzy matching. Use one of the included algorithms or add your own.
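
Of the algorithms named in the heading, Levenshtein distance is the simplest to show: it counts the single-character insertions, deletions and substitutions needed to turn one string into another. The Python sketch below is a textbook implementation written for this illustration, not Talend's code, and the similarity normalization is one common choice among several.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def similarity(a, b):
    """Normalize the distance to a score between 0.0 and 1.0."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


print(levenshtein("kitten", "sitting"), similarity("Smith", "Smyth"))
```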

Record Matching (match, no match, suspect)


Based on your thresholds for matching, identify records that are matches, records that are unique, and those records that need manual inspection to determine match status.
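
The three outcomes can be pictured as two cut-off points on a similarity score. In the Python sketch below, the threshold values 0.9 and 0.7 are arbitrary assumptions, and difflib.SequenceMatcher from the standard library stands in for whichever matching algorithm is actually configured.

```python
from difflib import SequenceMatcher


def classify(score, match_threshold=0.9, suspect_threshold=0.7):
    """Turn a similarity score into match / suspect / no match."""
    if score >= match_threshold:
        return "match"
    if score >= suspect_threshold:
        return "suspect"   # needs manual inspection
    return "no match"


def compare_records(a, b):
    """Score two strings case-insensitively and classify the result."""
    score = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return classify(score)


print(classify(0.95), classify(0.8), classify(0.2))
print(compare_records("Acme Corp", "ACME corp"))
```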

Fuzzy Deduplication


Create a database with only unique records, leveraging fuzzy matching algorithms.
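
As a simplified picture of the idea, the Python sketch below keeps a record only if it is not too similar to one already kept. The greedy strategy, the 0.85 threshold and the use of difflib.SequenceMatcher are assumptions chosen for brevity; a real deduplication job would typically compare several attributes and handle survivorship rules.

```python
from difflib import SequenceMatcher


def fuzzy_dedupe(records, threshold=0.85):
    """Greedy deduplication: drop records too similar to one already kept."""
    kept = []
    for record in records:
        is_duplicate = any(
            SequenceMatcher(None, record.lower(), other.lower()).ratio() >= threshold
            for other in kept
        )
        if not is_duplicate:
            kept.append(record)
    return kept


# Hypothetical customer names.
unique_names = fuzzy_dedupe(["Jon Smith", "John Smith", "Alice Wong"])
print(unique_names)
```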

Threshold Verification


Turn up and down the sensitivity of your matching algorithms by assigning weights.
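
One way to read "assigning weights" is as a weighted average of per-attribute similarity scores. The Python sketch below shows that reading; the attribute names, weights and scores are invented for the example.

```python
def weighted_score(similarities, weights):
    """Weighted average of per-attribute similarity scores (each 0.0 to 1.0)."""
    total_weight = sum(weights.values())
    return sum(similarities[name] * weight for name, weight in weights.items()) / total_weight


# Hypothetical comparison: the name matches exactly, the city only partly.
similarities = {"name": 1.0, "city": 0.5}
score = weighted_score(similarities, {"name": 3, "city": 1})
print(score)
```

Raising the weight of an attribute makes disagreements on that attribute pull the overall score down more strongly.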

Intuitive Administration Web-based Console


Publish data quality metrics to a web-based portal to share the status of data quality with a cross-functional team.

Report on Potential Primary Keys


Understand which attributes are potential primary keys and validate those attributes that should be unique.
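
A column is a plausible primary key when it contains no nulls and no repeated values. The Python sketch below applies that test to a small list of rows; the potential_primary_keys function and the sample rows are hypothetical and only illustrate the criterion.

```python
def potential_primary_keys(rows):
    """Return the columns whose values are all non-null and all distinct."""
    if not rows:
        return []
    candidates = []
    for column in rows[0]:
        values = [row[column] for row in rows]
        if None not in values and len(set(values)) == len(values):
            candidates.append(column)
    return candidates


# Hypothetical rows: None stands for NULL.
rows = [
    {"id": 1, "email": "a@x.com", "country": "FR"},
    {"id": 2, "email": "b@x.com", "country": "FR"},
    {"id": 3, "email": None, "country": "US"},
]
keys = potential_primary_keys(rows)
print(keys)
```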

Report on Orphan Tables


Check the relationships of tables in your relational database and uncover orphan tables.

Role-based Task Resolution


Assign tasks to your cross-functional team that help mitigate match results or address specific records that don’t comply with data quality rules.

Assign Tasks


Administrators can assign tasks to a cross-functional team.

Resolve Data Integrity Issues


Users can fill in missing data and resolve other data quality issues in the data stewardship console.

Resolve Data Matching Conflicts


Users can visually review whether two records match when the matching algorithms are uncertain.

Web User Environment


The data stewardship console supports a cross-functional team with an easy-to-use web-based work environment.

Amazon EC2 Lifecycle Control


Using the Talend Administration Console (TAC), you can automatically set up, deploy and shut down a job running on Amazon EC2.