
Running AdTech analytics workloads at scale on the cloud – Podcast featuring VP of Tech Uzi Blum

Our VP of Technology, Uzi Blum, was invited to share his knowledge and Applift’s approach to cloud computing on a podcast presented by Rackner.

Uzi Blum

The Cloud Native Show, hosted by Alex Raul, features insights from the tech industry’s top leaders on a range of topics in the data sector, such as cloud computing, IT modernization, and digital transformation.

Topics discussed include cloud native technologies like Kubernetes, DevSecOps, Serverless, cybersecurity, hybrid cloud, private cloud, AWS, Azure, GCP, open source and much more.

Listen to find out how organizations large and small are using cloud technologies to further their mission.

Here’s the written transcript of the podcast:

[00:05] Alex: Welcome to the Cloud Native Show presented by Rackner. My name is Alex Raul, and I’m here with Uzi Blum. How are you doing, Uzi?

[00:13] Uzi: Doing great. Thanks for inviting me to the show.

[00:16] Alex: Happy to have you. So you’re currently the VP of Technology for Applift, correct?

[00:22] Uzi: Yeah, absolutely.

What is Applift and what integrations do you work with?

[00:24] Alex: Why don’t you tell us what Applift does? What are you in the business of doing?

[00:31] Uzi: Sure! We’re in the domain of adtech (advertising technology), and Applift is a mobile performance advertising company. Simply put, when you’re developing an app, what you want to do is put it in the App Store or Google Play. Also, you want to have users (obviously). Just putting it out there means you’re competing with more than 9 million apps on Google Play for example. It would be very difficult for a user to just organically find the app that you developed and that’s where companies like Applift help out.

Our goal is to find the best users for the app owner. Once the user is there, we also help to re-engage users who stop using your app. Ultimately our clients want mobile users that they can monetize, so it’s better if they stick around for a long time. They monetize either by showing ads or by getting the users to purchase something. We have to find the right users who are a good match for a given app/product/service.

[01:53] Alex: What sort of integrations are you working with? Are you working with Google or Facebook Ads? What sorts of ad engines or ad platforms are you working with?

[02:07] Uzi: The mobile ad market is huge. I would say 85% of it is controlled by Facebook and Google, as you mentioned. We work with the other 15%, which is still a huge market opportunity and enough to allow us to build a big business.

There are two ways to build an advertising-supported business – as you mentioned, Google and Facebook are platforms that have a lot of traffic, but there are a lot of other websites or apps that achieve the same. Take Candy Crush, for example: a lot of users obviously know it and they spend a lot of time just playing and playing it, and so forth. That’s potentially a place to show ads.

We work with all kinds of apps and websites. We don’t integrate, at least not actively right now, with Facebook and Google.

For our day-to-day business we work in two ways:

First, we work with supply partners. We talk with those app owners either directly or through aggregation companies – these are our supply partners. We provide them campaigns and we tell them “these are the fees you’re going to get if you deliver the right users”.

The other way is what we call RTB or “real time bidding”. Before a user sees an ad, there’s an automated, lightning-fast process on an ad exchange where companies like Applift get certain data about the user loading a page (or in-app view) – their operating system info, geolocation, what other apps they have been using and more – and have to decide if they want to place a bid to show their particular ad to the user.

Usually there are several companies bidding on that ad view. The highest bid wins and the winner gets to show their ad. This is where we have more control over the process compared to working through supply partners, but it’s also more challenging because of the competition.
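As a rough illustration of the auction step described above, the sketch below picks the highest bid for a single ad view. The `Bid` type, the bidder names, and the prices are all hypothetical, and real ad exchanges add details (price floors, timeouts, different pricing rules) that are not covered in the conversation.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Bid:
    bidder: str       # hypothetical bidder name
    price_cpm: float  # bid price per thousand impressions (illustrative unit)


def run_auction(bids: Sequence[Bid]) -> Optional[Bid]:
    """Return the winning bid for one ad view: the highest price wins."""
    if not bids:
        return None  # nobody bid, so this auction produces no winner
    return max(bids, key=lambda b: b.price_cpm)


# Made-up bids for a single ad view.
bids = [Bid("applift", 2.40), Bid("bidder_b", 1.95), Bid("bidder_c", 2.10)]
winner = run_auction(bids)
print(winner.bidder if winner else "no bids")
```

The point of the sketch is only the selection rule: each bidder decides independently whether to bid, and the exchange shows the ad of whoever offered the most.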

When it comes to retargeting, real-time bidding is the only way to target users. Only if we know that a certain user was interested in buying a t-shirt – maybe they put it in the shopping cart – but never actually completed the purchase, can we retarget them to try to re-engage them.

[04:51] Alex: Okay, so you’ve been at Applift for a few years, correct? How old is the company?

[04:58] Uzi: It’s over six years old by now. Yeah, around six and a half and yeah, been there for quite some time.

What brought you to be in this role and in the AdTech industry?

[05:08] Alex: Have you been in the industry your entire career? How did you get into the industry if not?

[05:15] Uzi: I worked in consulting before. As part of that, I worked with over 40 different companies. You get to see a lot of areas of the industry without necessarily focusing on any one of them. Then when I got the opportunity from Applift, it sounded really, really interesting, because one of the things that is very interesting about this ad tech industry is the volume and the speed! I mean, you’re tracking impressions, and clicks, and all kinds of installs, and in-app purchases. This is a lot of events. Marketing on billboards compared to mobile is very, very different. It’s like data is everywhere. The big volume is one challenge and I’m a big data fan, so this was very, very attractive for me.

The second thing is what’s now called fast data. We have to respond really, really quickly to every action that happens, whether it’s on the bidder side or whether it’s on a click, to see if it’s a valid click or not.

[06:48] Alex: That’s a great segue. As a VP of technology, what is your day-to-day focus? Are you trying to build systems to be able to accomplish these sorts of data speeds? These core use cases that you mentioned before when you were talking about the market case. What is your day-to-day like at Applift? What are you focusing on?

[07:14] Uzi: Obviously there are the technical challenges coming every now and then. We’re constantly doing load testing on our systems, and if we see that a certain database is not fitting our needs, or that it’s not going to hold up for long, then we change those kinds of things.

Recently we changed to Scylla, for example. It’s a new, Cassandra-style database and we really like it. I’m not going to name the other tools, so as not to benchmark here. Those changes make sure we keep up with the pace and volume. On the other hand, there are all the business challenges, and those are huge. Our clients are becoming more educated, and ask for better and smarter delivery on the campaigns. They’re willing to pay the same, but want higher quality results.

For example, mobile ad fraud became a huge, huge topic. A high number of clicks and app installs came from unwanted sources. Whether it’s click farms, bots or other sources, Applift is absolutely against it. In the last two years we’ve invested a lot of time and analytics power to figure out how to track the fraud cases and how to prevent fraud. We also need to report to our clients and partners about the real numbers for transparency.

There’s a lot of industry knowledge about the new tricks of fraudsters and a lot of tech involved – both 3rd party solutions and stuff we’ve built in-house. The volume of low quality and fraudulent traffic requires automating everything; it’s too much for manual campaign management.

[09:31] Alex: Right, so are you running machine learning models for the fraud detection?

[09:37] Uzi: Some of it is machine learning. Some of it is manual rules based on the expertise that we have in place. Some of it is dictated by the clients. It really depends on the client we’re working with and the type of app that we’re promoting.

For example, a basic fraud detection rule is as simple as “if the app is very heavy then the click-to-install-time (CTIT) can’t be very short unless you’re on a super fast wifi”. So we have everything from very basic rules all the way to identifying patterns and spikes. Those are the things we have to react to really fast in order not to damage the campaign results.
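To make the two kinds of checks mentioned above concrete, here is a minimal sketch of a CTIT rule and a simple spike check. It is not Applift’s implementation: the function names, the assumed 100 Mbps speed ceiling, and the “3x the median” spike factor are illustrative assumptions.

```python
from statistics import median
from typing import List, Sequence


def min_plausible_ctit_seconds(app_size_mb: float, max_speed_mbps: float = 100.0) -> float:
    """Lower bound on the download time for an app of the given size.

    max_speed_mbps is an assumed ceiling for a very fast connection
    (megabits per second); it is not a figure from the interview.
    """
    size_megabits = app_size_mb * 8
    return size_megabits / max_speed_mbps


def is_suspicious_install(click_ts: float, install_ts: float, app_size_mb: float) -> bool:
    """Flag an install whose click-to-install time (CTIT) is implausibly short."""
    ctit = install_ts - click_ts
    if ctit < 0:
        return True  # install recorded before the click
    return ctit < min_plausible_ctit_seconds(app_size_mb)


def find_spikes(hourly_clicks: Sequence[int], factor: float = 3.0) -> List[int]:
    """Return the indexes of hours whose click count exceeds factor x the median."""
    baseline = median(hourly_clicks)
    return [i for i, c in enumerate(hourly_clicks) if baseline > 0 and c > factor * baseline]


# A 1,000 MB app "installed" 10 seconds after the click is flagged;
# the same app installed after 5 minutes is not.
print(is_suspicious_install(0.0, 10.0, 1000.0))
print(is_suspicious_install(0.0, 300.0, 1000.0))

# Hour 4 stands out against the other hours.
print(find_spikes([100, 110, 95, 105, 900, 102]))
```

Rules like these are cheap to evaluate per event, which fits the need to react quickly; the harder pattern detection would sit on top of them.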

How is Applift currently using cloud services?

[10:34] Alex: It sounds like you guys are doing a lot in the realm of technology and using a lot of different tools. I want to ask what key use cases you are using cloud services for. Are you fully built on the cloud? Are you somewhat on-prem, somewhat in the data center, somewhat in the cloud? How does that work?

[10:55] Uzi: We’re fully in the cloud and very happy it’s an option. I know that not every company can do that due to security issues. We work both in AWS and in Google Cloud Platform. GCP is mainly serving the business intelligence and analytics/data science teams, and AWS is mostly for engineering, where the tracking and campaign management systems are run. We can talk more about the reason for the split, because obviously it means maintaining two different infrastructures.

[11:50] Alex: I’d be interested in hearing that if you’re willing to talk about it.

[11:55] Uzi: Initially, we were only working with AWS, and then we had a couple of challenges when it comes to data aggregation, data ingestion and processing. We’re talking about relatively large volumes, around 500 million events daily. We also want to store this data for quite a while – the ad clicks, installs and post-install events – to do some analysis over time.

When we started, we actually used a self-hosted Presto. Even if we did an excellent job in terms of number of servers up in the air, we were very, very limited on how fast we could ingest data, process it, and query it. In some cases, we were limited to one day of data at a time. This was not ideal – we really blocked the business that way.

Then we ran a POC with a couple of cloud solutions. As a business intelligence, analytics and data engineering team, we didn’t have a lot of dev ops power. We’re now in a better place – our dev ops team is doing an amazing job. We said we really have to look for a software-as-a-service solution. There are some databases that claim to be software as a service, but you still need to maintain the number of servers, number of nodes, their size and so forth.

Eventually we ended up with BigQuery, which is awesome. It’s basically one of my go-to tools that I would recommend to anyone who is dealing with big volumes. This database just ingests crazy fast and you can query terabytes of data without blinking an eye. It’s really, really fast. It’s really maintaining itself on its own. You pay as you grow. You don’t need to worry about more storage. You don’t need to worry about the fact that a certain query would not process. If you take good care of it and monitor it well, then it’s not even that expensive, plus there’s no dev ops involved in the process.

[15:20] Alex: So you’re split halfway, basically? Or certain business launches on the AWS side and then ingest capabilities are on the GCP side?

[15:31] Uzi: These systems need to communicate with one another, so for most of these files, either we ingest them from S3 or we just put them on Cloud Storage and go from there. The communication is in place, so it’s quite straightforward.

[15:50] Alex: What sorts of workloads are you running in order to perform these analytics capabilities? Are you running compute on both sides of GCP and AWS?

[16:04] Uzi: The analytics is primarily on GCP, so the analytics and BI team is working with Python for ingestion. When it comes to data science projects, our go-to tool is Python. Another very cool feature that GCP has is something called Colab. This is basically Jupyter Notebook on the cloud, which is awesome because you can just share it with any user as if it were a G-doc. You write your code and it has a native connector to BigQuery. You get everything in place and run your Python, your scripts and everything, and just share it with people, so it’s really, really amazing.
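For readers who have not used this workflow, the sketch below shows roughly what querying BigQuery from a Python notebook can look like with the `google-cloud-bigquery` client library. The table name and its columns (`event_time`, `event_type`) are hypothetical, and the sketch assumes the notebook is already authenticated against a Google Cloud project; none of these details come from the interview.

```python
def daily_event_counts_sql(table: str) -> str:
    """Build a query counting events per day and type over a recent window.

    `table` and its columns (event_time, event_type) are hypothetical names.
    The window length is supplied at run time as the @days query parameter.
    """
    return (
        "SELECT DATE(event_time) AS day, event_type, COUNT(*) AS events\n"
        f"FROM `{table}`\n"
        "WHERE event_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL @days DAY)\n"
        "GROUP BY day, event_type\n"
        "ORDER BY day, event_type"
    )


def fetch_daily_event_counts(table: str, days: int = 7):
    """Run the query in BigQuery and return the result as a pandas DataFrame."""
    # Imported here so the SQL helper above works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()  # assumes default credentials and project are set up
    job_config = bigquery.QueryJobConfig(
        query_parameters=[bigquery.ScalarQueryParameter("days", "INT64", days)]
    )
    job = client.query(daily_event_counts_sql(table), job_config=job_config)
    return job.to_dataframe()


# Printing the SQL needs no credentials; the table name is a placeholder.
print(daily_event_counts_sql("my_project.analytics.events"))
```

In a shared notebook, a colleague would call `fetch_daily_event_counts(...)` with a real table and get a DataFrame back to plot or analyze further.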

Data progression at Applift

[17:21] Alex: How have you seen things progress on technology, particularly the cloud side, since you started in 2016? Have you been – has the main change been the shift to GCP to some extent? Or have you been using some other cloud native services that you’d say you probably weren’t using at the beginning, apart from just general compute and storage?

[17:44] Uzi: We started with GCP and one of the things that is really amazing with working with cloud is that those solutions are constantly updated. One very interesting feature – Colab was not there when we started. It started working with some kind of Python integration with BigQuery and, over time, it’s been upgraded and has added more features.

Recently, BigQuery and G Sheets got a native connector, so any user can connect to it from their favorite G Sheet. You also get Data Studio for free, which is kind of the BI tool. It’s in beta, which is why it’s free, but users love it. There are constantly new features being added and we have a very good relationship with the GCP team here in Berlin. They keep updating us on new features – every month we’re like “oh, wow, this is included”, and those are new features we’ve been waiting for.

On AWS, we have started to work with Kubernetes in the last nine months, and it’s been such an awesome experience. Our dev ops guys moved to Kubernetes but we probably need a separate show just to talk about Kubernetes. There’s a dev ops meetup in our office next week with a talk by one of our guys on the topic. It’s very simple, very straightforward, and we’re thinking of slowly moving everything to our Kubernetes cluster.

[20:14] Alex: So are you running on EKS?

[20:16] Uzi: No, we are doing it on our own from scratch. EC2 machines, we put the nodes there, we decide on the distribution, and on the deployments. Everything is running there and it gives us many advantages. For example, when we ran with Mesos, we had to work with particular services that are stateful, and with Kubernetes the ones we put in there are Scylla and Kafka. It’s all in one place and our dev ops team just enjoys working with it.

[21:02] Alex: Great, so are you running mostly stateless workloads on Kubernetes? Are you experimenting with stateful also?

[21:11] Uzi: Yeah, we are working with both. We have trackers and storage services that we developed on our own, and they’re all on Kubernetes. As I’ve said, we also have a database, whether it’s Influx or Scylla, where we store the data – Kubernetes allows you to do those kinds of things.

[21:42] Alex: As you look at the next two to three years of Applift, is that where you see a lot of stuff going? You mentioned trying to move more stuff onto Kubernetes. Is that what you would say is the main movement as far as what you’re trying to do technologically, or are there other things in the pipeline that you’re excited for as far as Applift goes?

[22:02] Uzi: On the cloud side, moving all our applications to microservices would probably be a step forward: easy to scale and highly available. Definitely Kubernetes would be a good place to go for that.

Technology-wise there are so many challenges. If we look at BIA, the business intelligence and analytics team, so far we had the challenges of data ingestion and transformation, and we will have to have more skills around data science. Two years ago we started to empower our analytics team, but also more and more of our developers are required to have data science skills in order to find patterns and correlations, and basically come up with smart optimizations to find the best users for our clients.

[23:28] Alex: You mentioned potential move to microservices, how large is the engineering organization at Applift?

[23:35] Uzi: We’re 20 people and we’re very picky to make sure we have the right culture fit. The team has managed to do so much in a short amount of time. Having the right infrastructure in place has allowed small teams like BIA (just four people) to move super fast.

By the way, we’re hiring!

What excites you most about the future of tech?

[24:20] Alex: I always like to end with a discussion on what you’re most excited about in technology on a broader scale, so not just Applift. Maybe it’s your industry. Maybe it’s just something you’re personally interested in. What would you say you’re most excited about in a two, three, five-year time frame?

[24:46] Uzi: One of the things that I’m really interested in personally is NLP (natural language processing). In Applift, we’ve done a couple of related projects, for example on app similarity – we actually just released a blog post on this topic – but I think text similarity and matching is a very, very interesting domain. I feel like personally, I’ve not explored it enough and I would definitely be happy to get into this area. There’s so much data in articles and blogs. For example, I’m a big Medium fan. Every day instead of checking the news or Facebook, I just go on Medium on the way to work and read a few things there. Yeah, just to look at the relevancy around that, recommendations related to text, and so forth. Those are things I think are going to shape the future.

[26:09] Alex: That’s awesome. Well, thank you so much, Uzi. That was fantastic.

Ready to hear the podcast live? Listen to it here.

For more great data insights, connect with Uzi Blum on LinkedIn.
