In today's blog, I am covering a preview of the latest performance benchmark that Talend's R&D team and Talend Labs have run, based on the TPC-H benchmark tests.
As ever, it is Talend’s mission to provide easy-to-use big data integration tools with the industry’s highest-performing, most scalable integration code running natively on Hadoop.
As a part of this mission, we put every product release through a rigorous set of performance and scalability tests, including a performance benchmark developed by the Transaction Processing Performance Council, known as TPC-H.
In the latest release of Talend Big Data, we have implemented some key performance strategies and optimisations in Talend Studio, so that the Java code it generates for MapReduce is already optimised. In previous versions these optimisations were possible, but it was incumbent upon the Talend developer to implement them, or even to know that the patterns and good-practice approaches existed.
Talend has taken the time to embed the following optimisations in the Studio at design time. The generated output, deployed natively onto the Hadoop nodes, delivers a performance uplift of 67 percent compared with version 5.4.1.
The details of this TPC-H benchmark will be published as part of the 5.5.1 release.
In the meantime, ask your other integration vendors how they implement the performance strategies above in a graphical tool without the need for expert knowledge and man-years of experience... and see what answers they can give :-)