Groupon builds on Talend Data Integration

Best deals with Talend


Launched in November 2008 in Chicago, Groupon has since grown to over 1,000 regional offices with more than 10,000 employees in 48 countries, and features thousands of deals every day. Social shoppers from New York to Hong Kong and from Oslo to Cape Town can no longer imagine life without Groupon. Groupon’s business model has revolutionized the way companies reach new customers. Regardless of whether its business partners are looking to win new customers, sell products or promote the best vacation spots, Groupon reaches more people, more directly and more quickly than any other medium. Worldwide, hundreds of millions of newsletter subscribers and smartphone users use Groupon to check out the best stuff to do, see, eat and buy. The key to Groupon’s success lies in unbeatable prices with top-rated local business partners, giving customers every reason to feel comfortable venturing out and trying something new.


The Berlin, Germany office plays a key role within Groupon: it looks after IT, product management and online marketing for more than 35 countries.

Groupon’s IT infrastructure is the backbone for all of the company’s business activities. However, the company’s exceptional growth, from a start-up to a publicly traded Internet giant in just a few years, placed that infrastructure under considerable pressure. The sheer volume of data, which is Groupon’s lifeblood, is a challenge in itself: every day the company has to process more than 1 terabyte of raw data in real time and store it in various database systems.

The Berlin office manages this data for more than 35 countries in Europe, South America, Asia and the Pacific region, which includes Australia and New Zealand. To support corporate decision-making, Groupon uses the MicroStrategy Business Intelligence platform as its reporting frontend on top of a Teradata data warehouse. Groupon relies on Talend’s data integration products to import data from different sources into the data warehouse, transform it and then export it to various target systems.


“As a typical startup, we kicked off with an IT backbone built almost entirely on open source technology,” recalls Rafael Herrera, Head of BI International at Groupon. “The key factor for us – besides the cost gains – was scalability. We needed a framework that could support dynamic growth from the outset. In the meantime, we also have proprietary solutions in place so the licensing model is not necessarily the deal maker for us. Talend’s open source technology is a key building block within our IT landscape to this day.”

Since 2011, Talend has been the central data integration platform at Groupon. In the initial stages the company deployed Talend Open Studio for Data Integration, the community version. Following successful execution of the first jobs, management quickly decided to switch to Talend Data Integration.

Every day, Groupon runs around 1,000 different data integration jobs involving Extract, Transform and Load (ETL) processes. Some of these jobs run once a day, whereas others run every hour or even more frequently. Talend’s integration solution loads data into several databases in parallel. Besides the main Teradata warehouse, Groupon deploys PostgreSQL and Exasol databases and a Customer Relationship Management (CRM) solution. The e-mail marketing and Online Transaction Processing (OLTP) systems are the most important data sources.
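To make the idea of loading several databases in parallel concrete, here is a minimal Python sketch. It is not Groupon’s or Talend’s actual implementation: the function `load_parallel`, the `make_loader` helper and the in-memory lists standing in for Teradata, PostgreSQL and Exasol are all hypothetical, chosen only to illustrate the pattern of sending one batch to every target at the same time.

```python
from concurrent.futures import ThreadPoolExecutor


def load_parallel(rows, targets):
    """Send the same batch of rows to every target loader concurrently.

    `targets` maps a target name to a callable that accepts a list of rows
    and returns the number of rows it wrote.
    """
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        futures = {name: pool.submit(loader, rows) for name, loader in targets.items()}
        # result() re-raises a loader's exception, so a failed target is not silently skipped.
        return {name: future.result() for name, future in futures.items()}


# Hypothetical in-memory stand-ins for the real target databases.
stores = {"teradata": [], "postgresql": [], "exasol": []}


def make_loader(store):
    """Build a loader that appends rows to one in-memory store."""
    def loader(rows):
        store.extend(rows)
        return len(rows)
    return loader


targets = {name: make_loader(store) for name, store in stores.items()}
counts = load_parallel([{"deal_id": 1}, {"deal_id": 2}], targets)
```

In a real deployment each loader would wrap a database connection or bulk-load utility; the structure of one task per target stays the same.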

Most of the data that Groupon manages is generated by its e-mail marketing system, which is one of the company’s core business tools. The aim is to align marketing activities more closely with target group interests and preferences. All of this customer data, sourced from over 30 countries, is loaded into the data warehouse through the Talend integration platform. To handle millions of transactions every day, Groupon uses an open source OLTP solution that runs on a PostgreSQL database. Every 5 minutes the Talend platform replicates data from the OLTP system to the Teradata warehouse. Customer data is also loaded from various sources into the Salesforce CRM solution and then synchronized with the data warehouse on an hourly basis to ensure that Groupon always has a single source of truth for all customer intelligence.
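The article does not say how the five-minute replication works internally. One common approach is incremental replication with a high-water mark: each cycle copies only the rows changed since the previous cycle. The Python sketch below illustrates that technique under stated assumptions. The `orders` table, its `updated_at` column and the function `replicate_once` are hypothetical, and two in-memory SQLite databases stand in for the PostgreSQL source and the Teradata target.

```python
import sqlite3


def replicate_once(source, target, watermark):
    """Copy rows changed since `watermark` and return the new watermark."""
    rows = source.execute(
        "SELECT id, amount, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()
    # INSERT OR REPLACE keeps the copy idempotent if a cycle is re-run.
    target.executemany(
        "INSERT OR REPLACE INTO orders (id, amount, updated_at) VALUES (?, ?, ?)",
        rows,
    )
    target.commit()
    # Rows are ordered by updated_at, so the last one carries the newest timestamp.
    return rows[-1][2] if rows else watermark


# Hypothetical schema; SQLite stands in for both source and target systems.
schema = "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, updated_at INTEGER)"
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
source.execute(schema)
target.execute(schema)

source.executemany("INSERT INTO orders VALUES (?, ?, ?)", [(1, 9.5, 100), (2, 20.0, 160)])
watermark = replicate_once(source, target, 0)  # first cycle copies both rows

source.execute("INSERT INTO orders VALUES (3, 5.0, 400)")
source.execute("UPDATE orders SET amount = 12.0, updated_at = 410 WHERE id = 1")
watermark = replicate_once(source, target, watermark)  # second cycle copies only the changes
copied = target.execute("SELECT id, amount FROM orders ORDER BY id").fetchall()
```

A scheduler would call `replicate_once` every five minutes and persist the returned watermark between runs, so that a restart resumes where the last cycle ended.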


“Talend is cost-effective, easy to use, readily adaptable and extremely versatile,” says Herrera. “With the help of the graphical user interface we can easily and quickly link up a large number of source systems using the standard connectors. To connect the Teradata database we had to develop an interface component ourselves but that was an easy task for our internal developers.”

The main objective of the Talend integration platform is to feed data from the various sources, including the CRM, e-mail marketing and OLTP systems, into the data warehouse as rapidly as possible. This aggregated data is then available to sales consultants on the ground and to sales managers to support strategic decisions.

“The very nature of our business model means we have to analyze vast amounts of data in near real time in order to identify developments and trends as they emerge. Our Talend integration solution means that our data warehouse is always updated with the latest information, giving us as precise an image of the current situation as possible,” continues Herrera. “At the end of the day information-driven decisions can only ever be as good as the underlying information, and our information is always very good – thanks in no small part to Talend.”