
Tarush Aggarwal is our guest this week, and we have a fantastic conversation about democratizing data reporting, and how even non-data-focused businesses need to have deep reporting capabilities to be competitive. Regardless of your role or industry, this is an episode worth checking out!
Watch this episode on YouTube: https://youtu.be/Ehl4heGovas
Save 20% on your first order at the DATAVERSITY Training Center with promo code “AlgminDL” – https://training.dataversity.net/?utm_source=algmindl_res
Connect with the host, Anthony J. Algmin, on LinkedIn: https://www.linkedin.com/in/anthonyjalgmin
Become a Data Leader – https://algmin.com
About Tarush Aggarwal:
Tarush is the Founder and CEO of 5x, which offers data reporting as a service, so users can make data-driven decisions faster, which are necessary to succeed.
Tarush is also one of the leading experts in leveraging data for exponential growth in the world!
Previously, as the Head of Data at WeWork, Tarush scaled the system to support 12k employees, as well as 100 data team members. Tarush was also the first data engineer at Salesforce in 2011.
5x – https://5x.company/
Episode Transcript:
1
00:00:05,905 –> 00:00:10,226
anthony: Welcome to the Data Leadership Lessons podcast. I’m your host, Anthony J. Algmin
2
00:00:10,393 –> 00:00:13,663
anthony: Data is everywhere in our businesses and it takes leadership to make the most of it.
3
00:00:13,747 –> 00:00:17,200
anthony: We bring you the people, stories, and lessons to help you become a data leader.
4
00:00:17,200 –> 00:00:19,419
anthony: Today on Data Leadership Lessons, we welcome
5
00:00:19,652 –> 00:00:24,691
anthony: Tarush Aggarwal. Tarush is the founder and CEO of 5x, which offers data reporting
6
00:00:24,691 –> 00:00:29,095
anthony: as a service, so users can make data-driven decisions faster, which are necessary to
7
00:00:29,179 –> 00:00:29,596
anthony: succeed. Previously, as the Head of Data at WeWork,
9
00:00:31,681 –> 00:00:35,452
anthony: Tarush scaled the system to support twelve thousand employees as well as a hundred
10
00:00:35,685 –> 00:00:39,789
anthony: data team members. Tarush was also the first data engineer at Salesforce in two
12
00:00:39,973 –> 00:00:43,660
anthony: thousand and eleven. Tarush is one of the world’s leading experts in leveraging
13
00:00:43,810 –> 00:00:46,546
anthony: data for exponential growth. Tarush, welcome to the show.
14
00:00:47,213 –> 00:00:49,749
tarush: Thank you so much, Anthony. Really excited to be here. Thanks.
15
00:00:49,916 –> 00:00:55,038
anthony: and you’re coming to us from Costa Rica, and in our pre-show, I saw this amazing place
16
00:00:55,205 –> 00:01:01,044
anthony: that you’re at. I am in dreary Chicagoland right now and it’s just abysmal, so
17
00:01:01,044 –> 00:01:04,881
anthony: you’ve got to bring the good energy of the wonderful weather to us. So, like, um,
18
00:01:05,115 –> 00:01:08,001
anthony: like we do with all our first-time guests, why don’t you just take a couple minutes?
19
00:01:08,084 –> 00:01:10,236
anthony: Give us the story of your background. How
20
00:01:10,553 –> 00:01:13,673
anthony: like what you’ve done in your career has led you up to what you’re doing now at
21
00:01:13,757 –> 00:01:15,358
anthony: 5x, and we’ll kind of take it from there.
22
00:01:15,875 –> 00:01:20,046
tarush: Amazing, yeah. Again, thanks so much for having me on the show. Really excited to chat.
23
00:01:20,680 –> 00:01:24,834
tarush: And, uh, thank you for being patient. I know the pre-onboarding wasn’t as
24
00:01:25,001 –> 00:01:27,887
tarush: easy as being in New York, so I really appreciate that.
25
00:01:29,005 –> 00:01:34,043
tarush: Um, you know, I come from a pretty technical background growing up.
26
00:01:34,127 –> 00:01:39,566
tarush: My mom ran an e-commerce business. I was exposed, um, to software engineering pretty
27
00:01:39,716 –> 00:01:45,488
tarush: early on. She jokes that I was, you know, so technical because she was teaching herself
28
00:01:45,889 –> 00:01:48,842
tarush: half of those things while she was pregnant with me. Um,
29
00:01:49,959 –> 00:01:54,197
tarush: so, you know, I sort of figured out very early on that computer science was my area. Got to
30
00:01:54,197 –> 00:01:55,248
tarush: go to Carnegie Mellon.
31
00:01:56,282 –> 00:02:01,805
tarush: After college, I got a job at Salesforce.com in Silicon Valley, and was super excited,
32
00:02:02,288 –> 00:02:05,959
tarush: and I remember showing up to work on the first day and realizing very very quickly
33
00:02:06,526 –> 00:02:11,481
tarush: that software engineering was not my cup of tea. That focusing on, you know, a small
34
00:02:11,714 –> 00:02:16,035
tarush: feature for an extended period of time is super super important, but just not
35
00:02:16,202 –> 00:02:20,757
tarush: something I was personally interested in. And more importantly, software
36
00:02:20,924 –> 00:02:26,763
tarush: engineering had become an art form where the rules, um, of best practices and
37
00:02:26,846 –> 00:02:28,681
tarush: how to do it were already established,
38
00:02:29,716 –> 00:02:35,088
tarush: so it was very much learning the rules, being able to play inside this well-crafted,
39
00:02:36,039 –> 00:02:42,195
tarush: um, art form, if you may, and, you know, it didn’t personally interest me. At that
40
00:02:42,362 –> 00:02:47,634
tarush: point, this was two thousand eleven, um, no one was really talking about data, and I
41
00:02:47,800 –> 00:02:52,522
tarush: got to, with a product manager, focus on, um, you know, building this framework which
42
00:02:52,522 –> 00:02:58,528
tarush: allowed Salesforce to extract metrics from log files. This was used for, uh, for
43
00:02:58,678 –> 00:03:02,849
tarush: products like benchmarking. We used it to figure out engagement of different customers,
44
00:03:02,916 –> 00:03:06,519
tarush: how they’re using the Salesforce product. This later became the product analytics
45
00:03:06,603 –> 00:03:11,808
tarush: team. They’ve got hundreds of engineers on that team now, and years before that, I moved on.
46
00:03:12,609 –> 00:03:14,527
tarush: Um. Most recently I,
47
00:03:15,562 –> 00:03:20,283
tarush: I got to join WeWork, again pretty early on, when the data team was just a few
48
00:03:20,516 –> 00:03:25,488
tarush: people. I scaled it up to a hundred-plus people, jump-started data engineering,
49
00:03:25,638 –> 00:03:31,077
tarush: data platform, um, got to work on, um, our
50
00:03:32,929 –> 00:03:37,483
tarush: got to lead our China platform efforts, where we were doing a lot of cool stuff on machine
51
00:03:37,634 –> 00:03:41,087
tarush: learning and facial detection, and all of these other cool things which I
52
00:03:41,154 –> 00:03:45,475
tarush: ultimately never got to see the light of day, just because of the, uh, the direction of
53
00:03:45,475 –> 00:03:50,046
tarush: the IPO at that time. But I found myself at the beginning of COVID, interestingly
54
00:03:50,280 –> 00:03:55,268
tarush: enough, in Bali, Indonesia, which is extremely different from Silicon Valley and New
55
00:03:55,268 –> 00:03:56,269
tarush: York,
56
00:03:56,836 –> 00:03:59,889
tarush: And you know taking some time off from the valley
57
00:04:00,857 –> 00:04:01,858
tarush: helped me to
58
00:04:03,159 –> 00:04:07,880
tarush: really understand what’s happening in terms of data. And that’s that in, you know, the
59
00:04:07,880 –> 00:04:08,698
tarush: next ten years
60
00:04:09,716 –> 00:04:14,921
tarush: every company is really going to need to invest more in data capabilities. Things like
61
00:04:15,004 –> 00:04:19,726
tarush: how customers are using your product, your go-to-market strategy, what are your different segments,
62
00:04:19,876 –> 00:04:21,961
tarush: what is your lifetime value. Um,
63
00:04:23,396 –> 00:04:26,916
tarush: these are the type of use cases companies are going to start focusing on in the next
64
00:04:27,083 –> 00:04:31,554
tarush: ten years. What’s happening today is that ten percent of the world or highly
65
00:04:31,804 –> 00:04:36,609
tarush: technical companies can really go do this. So the way Google and Facebook and Apple
66
00:04:36,759 –> 00:04:42,048
tarush: get value from data is very different from your typical Series A company. And what’s
67
00:04:42,115 –> 00:04:45,401
tarush: happening is that the modern data stack is just becoming more and more
68
00:04:45,568 –> 00:04:51,724
tarush: sophisticated, so all of a sudden, you know we require a few data engineers just to
69
00:04:51,808 –> 00:04:55,478
tarush: stitch together this stack, and all of a sudden we care about security and PII
70
00:04:55,561 –> 00:05:00,199
tarush: and GDPR, and all of these other things. So for every company which is just
71
00:05:00,283 –> 00:05:05,405
tarush: getting started, for them to start from scratch is no longer really feasible. So
72
00:05:05,888 –> 00:05:11,878
tarush: I essentially started 5x to help ninety percent of companies get value from data
73
00:05:11,961 –> 00:05:17,567
tarush: in the same way top tech companies can do it, um, with a much lower
74
00:05:17,717 –> 00:05:22,922
tarush: barrier to entry. And that’s really what I got super excited about, and
75
00:05:23,239 –> 00:05:26,843
tarush: coincidentally that happened while being in Bali, which is not a technical place at all.
76
00:05:27,643 –> 00:05:30,847
tarush: But yeah, that’s a little bit about my background and 5x.
77
00:05:31,681 –> 00:05:38,721
anthony: Yeah. Well, it’s interesting because you’ve seen perspectives from fast-growing
78
00:05:38,805 –> 00:05:40,807
anthony: startup companies like WeWork, and then Salesforce,
79
00:05:41,040 –> 00:05:44,877
anthony: which is obviously pretty well established at this point. And you’ve seen
80
00:05:45,194 –> 00:05:51,434
anthony: how Silicon Valley operates and where there’s so much emphasis on that software
81
00:05:51,601 –> 00:05:55,521
anthony: engineering side and a lot of the processes and the tools that people are using are
82
00:05:55,605 –> 00:05:59,759
anthony: so far advanced. And then you go into the data space and it’s like, what is this? This is
83
00:05:59,926 –> 00:06:01,444
anthony: the dark ages, comparatively.
84
00:06:02,562 –> 00:06:08,234
anthony: Do you have a hypothesis on why that is? Like, how did we end up in this spot, as far
85
00:06:08,234 –> 00:06:09,235
anthony: as you can tell?
86
00:06:10,253 –> 00:06:11,254
tarush: Yeah, you know, I think
87
00:06:12,355 –> 00:06:17,160
tarush: that’s a great question. and if you look at the data space, it’s much newer than
88
00:06:17,160 –> 00:06:21,481
tarush: software engineering, right? Pretty much, I think two thousand eleven was the
89
00:06:21,481 –> 00:06:26,602
tarush: first time data engineers were recognized as a profession, and the go-to-market
90
00:06:27,003 –> 00:06:31,157
tarush: strategy data employed very much replicated software engineering, where you build
91
00:06:31,240 –> 00:06:36,279
tarush: a technical product and you sell it to technical teams. And that’s typically, you know,
92
00:06:36,446 –> 00:06:40,516
tarush: what we have done in this space so far. If you look at data versus software
93
00:06:40,683 –> 00:06:42,435
tarush: engineering, data is a lot broader.
94
00:06:42,502 –> 00:06:43,503
anthony: Mhm.
95
00:06:43,503 –> 00:06:49,242
tarush: A lot more. Every business beyond some size should focus on how customers are
96
00:06:49,325 –> 00:06:52,278
tarush: using their product, what to do with their go-to-market strategy. So
97
00:06:53,329 –> 00:06:59,085
tarush: as a whole, data is a lot broader than software engineering. Yet they’ve copied the
98
00:06:59,085 –> 00:07:03,489
tarush: go-to-market strategy of software engineering companies. So it leads to
99
00:07:04,841 –> 00:07:09,245
tarush: this landscape where a lot of companies want to get value from data, but really
100
00:07:09,479 –> 00:07:13,966
tarush: don’t have the technical expertise or bandwidth to go and do it. So we see a lot
101
00:07:14,117 –> 00:07:18,921
tarush: more companies playing with tools like Google Analytics, or maybe even something
102
00:07:19,155 –> 00:07:23,476
tarush: like, sort of, Tableau, which is at a very low level, and they don’t really have the
103
00:07:23,559 –> 00:07:26,045
tarush: right resources or expertise to get to the next level.
104
00:07:27,363 –> 00:07:30,967
anthony: Yeah, you know, and I think about it too, where, like, we’ve had data for a long
105
00:07:31,200 –> 00:07:34,153
anthony: time, and I think it’s a new area
106
00:07:35,521 –> 00:07:39,842
anthony: ’cause data traditionally in the enterprise came from
107
00:07:40,960 –> 00:07:46,082
anthony: operations, and operations, while very important, is kind of boring, but creates a
108
00:07:46,165 –> 00:07:52,638
anthony: lot of data and so a lot of the early data work was because we had to manage the
109
00:07:52,638 –> 00:07:57,193
anthony: data being spun off from operations. And so it was like we’re receiving, we’re being
110
00:07:57,360 –> 00:08:01,848
anthony: reactionary. We’re catching that data. Like, yeah, we create some operational reports
111
00:08:01,914 –> 00:08:06,886
anthony: and we’ll do some stuff. But the notion of analytics to drive improvements or
112
00:08:07,036 –> 00:08:11,524
anthony: drive new innovations and opportunities in a business, that was so secondary,
113
00:08:11,841 –> 00:08:17,113
anthony: to the point where we didn’t even bother with it until, like, we’re pushing, you know,
114
00:08:17,363 –> 00:08:22,001
anthony: well into the two thousands, at least, right? And so outside of niche industries
115
00:08:22,168 –> 00:08:25,838
anthony: that were kind of predicated on that. I grew up in the financial industry, where
116
00:08:26,005 –> 00:08:30,476
anthony: our traders were using advanced analytics, you know, as early as anyone, because
117
00:08:30,643 –> 00:08:34,714
anthony: they could get an edge in the trading place. So it actually directly tied to
118
00:08:34,881 –> 00:08:36,482
anthony: their business model. But many of the
119
00:08:36,482 –> 00:08:38,968
anthony: organizations out there were just like, hey, we’ve got to get our sales information.
120
00:08:39,202 –> 00:08:42,088
anthony: We’ve got to populate accounting stuff. That’s about it. Like, we know what we’re
121
00:08:42,154 –> 00:08:45,608
anthony: doing. We’re going to look at operational metrics as incremental
122
00:08:45,841 –> 00:08:49,996
anthony: improvement as opposed to innovative improvements. But where I’ve seen software
123
00:08:50,479 –> 00:08:55,768
anthony: engineering influence data analytics, now we start to see, hey, there’s really
124
00:08:56,002 –> 00:09:02,074
anthony: this convergence of those two areas in new and novel ways that everybody says
125
00:09:02,475 –> 00:09:07,113
anthony: today, like, hey, these are table stakes. Any business, even the smallest of the
126
00:09:07,196 –> 00:09:12,318
anthony: small businesses, needs to have a data analytics capacity so that they can read the
127
00:09:12,401 –> 00:09:16,155
anthony: market so that they can understand where’s their opportunity? How do they find ways
128
00:09:16,322 –> 00:09:20,793
anthony: to grow their businesses at literally any size? And so it sounds to me like what
129
00:09:21,043 –> 00:09:22,078
anthony: 5x is doing
130
00:09:23,112 –> 00:09:29,035
anthony: is making that capability available to literally any business out there. Am I
131
00:09:29,118 –> 00:09:31,354
anthony: reading that right? Is that what you guys are doing?
132
00:09:31,804 –> 00:09:34,040
tarush: Yeah, yeah, that’s what we’re doing. But thinking of
133
00:09:35,074 –> 00:09:40,680
tarush: just on a tangent, you know, I think you hit the nail on the head that every business
134
00:09:41,080 –> 00:09:45,718
tarush: now needs to do this. And really that happened for a few reasons. But if you look in
135
00:09:45,718 –> 00:09:51,007
tarush: the macro, you know, five, ten years ago we were using far fewer platforms. You were
136
00:09:51,240 –> 00:09:54,927
tarush: using Gmail, and that was calendar, and everything.
137
00:09:56,679 –> 00:10:00,182
tarush: What’s happening today is that the average startup is using ten or so different data
138
00:10:00,182 –> 00:10:01,183
tarush: sources.
139
00:10:01,183 –> 00:10:05,404
tarush: All of a sudden, we’re, like, using Zoom for video and Slack for messaging, something
140
00:10:05,404 –> 00:10:06,405
tarush: else,
141
00:10:06,922 –> 00:10:12,044
tarush: Stripe and Square for payments, and Zendesk, and Salesforce for much smaller uses.
142
00:10:13,329 –> 00:10:18,768
tarush: What this means is that the capabilities of each of these tools are becoming very limited
143
00:10:19,485 –> 00:10:22,038
tarush: because they have a much smaller set
144
00:10:23,155 –> 00:10:25,808
tarush: of data which they have access to. So
145
00:10:26,926 –> 00:10:29,879
tarush: there’s this inherent need to recentralize data
146
00:10:31,163 –> 00:10:37,086
tarush: and answer holistic questions across it. So, you know, this is accelerated by
147
00:10:37,236 –> 00:10:43,242
tarush: these companies and products not being able to answer holistic questions. Five years ago,
148
00:10:44,193 –> 00:10:48,597
tarush: When we were only using Facebook for ads, Facebook had a pretty good idea on your
149
00:10:48,764 –> 00:10:53,085
tarush: marketing spend and could give you pretty sophisticated insights. But now, if you’re using
150
00:10:53,235 –> 00:10:57,723
tarush: five tools, you can’t compare Facebook analytics to Google Analytics apples to
151
00:10:57,807 –> 00:11:01,794
tarush: apples. So you need to ingest it into your own warehouse and make your own sense of
152
00:11:01,794 –> 00:11:07,233
tarush: it. So I think that’s, like, the macro trend of what’s happening, and this is a pretty
153
00:11:07,299 –> 00:11:08,300
tarush: technical trend.
154
00:11:08,300 –> 00:11:12,722
anthony: So before we get into 5x, then, ’cause it just reminded me, when
155
00:11:12,805 –> 00:11:17,193
anthony: I was growing up, like, all the old guys would say, you know, history keeps repeating
156
00:11:17,443 –> 00:11:20,963
anthony: itself. You’re going to see these patterns come up over and over again, and now I’m
157
00:11:21,047 –> 00:11:26,318
anthony: one of the old guys. And so I’m like, duh, this is exactly the same dynamic I’ve been
158
00:11:26,402 –> 00:11:28,888
anthony: talking about in business intelligence forever. And
159
00:11:30,005 –> 00:11:32,641
anthony: if you need data analytics from a particular
160
00:11:33,993 –> 00:11:38,397
anthony: one singular source, that singular source, that system, is probably going to
161
00:11:38,481 –> 00:11:42,318
anthony: be fine at giving you the analytics in that system alone. But the moment you need
162
00:11:42,485 –> 00:11:47,840
anthony: to start seeing the analytics across two, three, five, twelve different systems,
163
00:11:48,557 –> 00:11:52,962
anthony: you’re going to need something that solves for the system, not just integrate each
164
00:11:53,045 –> 00:11:56,082
anthony: of the individual point solutions, because they’re not designed to do that.
165
00:11:56,716 –> 00:12:02,872
anthony: They’re introspective; they’re not overlaying that analytics need. And it’s the
166
00:12:03,122 –> 00:12:06,959
anthony: same problem that we had with basic reporting twenty
167
00:12:07,276 –> 00:12:08,794
anthony: thirty years ago, you know.
168
00:12:10,045 –> 00:12:15,084
tarush: Spot on. And you’re right, Salesforce would love you to push metrics into Salesforce
169
00:12:15,234 –> 00:12:18,838
tarush: and have, you know, reporting on top of it, but it’s not really built for that.
170
00:12:19,155 –> 00:12:22,608
tarush: It’s built more as, you know, a relational data model,
171
00:12:23,409 –> 00:12:28,447
tarush: and, you know, it tries to, you know, force stuff into it. Um, and that’s not, you
172
00:12:28,447 –> 00:12:29,448
tarush: know, you know,
173
00:12:30,116 –> 00:12:33,402
tarush: that’s not quite where the industry is going. In other words, that’s not a
174
00:12:33,486 –> 00:12:36,122
tarush: long-term solution or any of the best practices of how to go do it.
175
00:12:36,555 –> 00:12:41,277
anthony: Yeah, so tell me about 5x. Where did 5x come from, in terms of its
176
00:12:41,444 –> 00:12:45,448
anthony: origin story? And what are you trying to serve and differentiate yourself in the
177
00:12:45,448 –> 00:12:46,449
anthony: marketplace to do?
178
00:12:46,765 –> 00:12:50,603
tarush: Absolutely. So, you know, 5x came with the realization that
179
00:12:51,720 –> 00:12:57,159
tarush: there are a small subset of companies which are getting exponentially more value from
180
00:12:57,326 –> 00:13:02,515
tarush: data than others and these companies are technical in background, But what they
181
00:13:02,681 –> 00:13:08,287
tarush: really have is they’re able to have these large platform teams which spin together
182
00:13:08,604 –> 00:13:11,874
tarush: the modern data stack, which sort of stitch it together. So, you know, you have five
183
00:13:12,041 –> 00:13:16,128
tarush: layers in the modern data stack: data collection, ingestion, storage, modeling,
184
00:13:16,128 –> 00:13:17,129
tarush: reporting
185
00:13:17,713 –> 00:13:21,884
tarush: Each of them today has a billion-dollar player. So if you’re a startup looking to
186
00:13:21,884 –> 00:13:25,488
tarush: get started, you have to go sign five enterprise contracts. Probably spend months
187
00:13:25,638 –> 00:13:30,276
tarush: trying to stitch together the stack. On top of it, now stakeholders want to consume
188
00:13:30,442 –> 00:13:34,196
tarush: it in different ways. You have BI users who want to slice and dice, analysts who want to
189
00:13:34,446 –> 00:13:38,918
tarush: run experimentation, and ML engineers who want to run their models. On top of this,
190
00:13:39,084 –> 00:13:43,322
tarush: we now care about GDPR and PII, and security, and all of these different
191
00:13:43,556 –> 00:13:49,879
tarush: things. So it’s non-trivial for companies to go build this from scratch every time,
192
00:13:50,045 –> 00:13:54,833
tarush: unless it’s really inside their DNA, right? So they end up either hacking it
193
00:13:55,084 –> 00:13:59,238
tarush: together, or you know, hiring an engineer, but spending more than half of their time
194
00:13:59,488 –> 00:14:04,193
tarush: building the stack and maintaining it. And that’s just not competitive anymore. And
195
00:14:04,760 –> 00:14:09,014
tarush: the big companies have the luxury of having, you know, ten, twenty engineers sort
196
00:14:09,014 –> 00:14:10,015
tarush: of doing this.
197
00:14:11,483 –> 00:14:16,355
tarush: So 5x came from the idea that, you know, how do we make it very, very, very
198
00:14:16,605 –> 00:14:21,393
tarush: simple for these ninety per cent of companies to really get the same level of value
199
00:14:21,877 –> 00:14:26,515
tarush: from data as some of the bigger companies. So you know, the first thing we do is
200
00:14:26,765 –> 00:14:31,086
tarush: think of us as that large platform team where you know we are.
201
00:14:32,187 –> 00:14:33,188
tarush: We are
202
00:14:33,239 –> 00:14:37,476
tarush: working with the best-in-class providers across those five layers of the stack. We
203
00:14:37,560 –> 00:14:41,797
tarush: integrate it in. You get security, PII, GDPR compliance, all of those things
204
00:14:42,047 –> 00:14:47,720
tarush: for free. But more interestingly, um, you know, as you have new use cases, like pushing
205
00:14:47,886 –> 00:14:52,675
tarush: data back into application tools, what we call reverse ETL, or how we do data
206
00:14:52,841 –> 00:14:56,996
tarush: lineage or observability, which is becoming a huge area or running machine learning
207
00:14:57,162 –> 00:15:01,634
tarush: models, for all of these different vendors, you don’t need to figure out what’s the
208
00:15:01,717 –> 00:15:06,689
tarush: best tool out there. How do you integrate those tools? We do that for you, we figure
209
00:15:06,922 –> 00:15:10,759
tarush: out the billing. We work with all these vendors and have back-end billing, so you
210
00:15:10,993 –> 00:15:16,198
tarush: get you know one touch, one click integrations with the best in class tools. You get
211
00:15:16,365 –> 00:15:19,168
tarush: direct back-end billing as a single platform,
212
00:15:19,885 –> 00:15:25,557
tarush: so on day one itself you are extremely productive. The platform is ready. It’s
213
00:15:25,724 –> 00:15:29,645
tarush: almost like you had your own large platform team stitching it together. So that’s
214
00:15:29,795 –> 00:15:34,033
tarush: the first thing we do and we also have a service where you know. Currently we’re
215
00:15:34,116 –> 00:15:38,437
tarush: interviewing about a hundred engineers a day, fully automatically. Um, you know, about
216
00:15:38,520 –> 00:15:41,640
tarush: eighty percent automatically in India, less than two per cent of it,
217
00:15:43,409 –> 00:15:46,762
tarush: less than two per cent of them make it through our interview process. And then
218
00:15:46,929 –> 00:15:50,766
tarush: because we control the stack, we can pretrain these engineers on the stack,
219
00:15:51,317 –> 00:15:55,804
tarush: and we give you these engineers to become an extension of your team. And you know
220
00:15:56,205 –> 00:16:00,676
tarush: the infrastructure plus the engineers. What this means is you really get everything
221
00:16:00,926 –> 00:16:06,115
tarush: you need to run a comprehensive data strategy, and 5x companies end up using
222
00:16:06,281 –> 00:16:11,487
tarush: data to make decisions in the first month of starting an engagement. Whereas,
223
00:16:11,637 –> 00:16:15,808
tarush: if you try to go do this yourself, you spend a few months to hire engineers, sort of figure
224
00:16:15,958 –> 00:16:18,677
tarush: out what stack you want to use. Sign those enterprise contracts.
225
00:16:20,129 –> 00:16:22,998
tarush: It’s really at least a year before you make decisions.
226
00:16:25,434 –> 00:16:29,605
anthony: It’s an interesting model, because I think it starts to address some of
227
00:16:29,605 –> 00:16:33,926
anthony: the big challenges that I see in the marketplace right now with if you’re a start
228
00:16:34,076 –> 00:16:37,763
anthony: up, it can be very difficult, even just the hiring process to get data
229
00:16:37,996 –> 00:16:40,632
anthony: engineers. You know you’re going to need them full time. You know you’re going to
230
00:16:40,716 –> 00:16:44,636
anthony: need a certain amount of capacity to be able to handle the information loads that
231
00:16:44,636 –> 00:16:50,075
anthony: you’re going to have, but it’s very difficult to create an attractive job posting,
232
00:16:50,392 –> 00:16:55,447
anthony: and to, you know, attract a data engineer, a capable one, or an architect into
233
00:16:55,514 –> 00:17:00,085
anthony: your small business, unless, you know, you can convince them to chase the dream
234
00:17:00,486 –> 00:17:04,807
anthony: of, you know, the IPO, and the, you know, the big payday and what have you. That
235
00:17:05,040 –> 00:17:09,678
anthony: sometimes works. But I think that people today, especially data engineers tend to
236
00:17:09,845 –> 00:17:15,200
anthony: gravitate towards more stable, um, environments. And I think that if you can solve
237
00:17:15,434 –> 00:17:20,322
anthony: for data engineering consulting, which is effectively what you’re doing by creating
238
00:17:20,639 –> 00:17:25,194
anthony: that ability to recruit and hire talented people that can then be deployed to your
239
00:17:25,277 –> 00:17:30,082
anthony: clients, it expands what your service offering is capable of doing, while
240
00:17:30,315 –> 00:17:34,486
anthony: recognizing that traditional project-based consulting models aren’t really
241
00:17:34,720 –> 00:17:39,441
anthony: sufficient for the kind of constant load that a data engineering component would
242
00:17:39,441 –> 00:17:40,559
anthony: typically have. Would you
243
00:17:40,659 –> 00:17:41,660
anthony: agree with what I said?
244
00:17:42,194 –> 00:17:46,598
tarush: Yeah, totally. You know, we don’t see ourselves as a typical consulting company who
245
00:17:46,682 –> 00:17:48,917
tarush: comes and does, you know, a short-term project.
246
00:17:50,769 –> 00:17:54,757
tarush: Typical consulting companies start from scratch, right? Every new engagement,
247
00:17:54,923 –> 00:17:59,478
tarush: whether an in-house team or consultant, you really start from scratch. And typical
248
00:17:59,645 –> 00:18:02,915
tarush: consultants come in short term. They might set up the warehouse. They might focus
249
00:18:03,315 –> 00:18:08,220
tarush: on a few use cases. We see ourselves as a long-term replacement for data engineering
250
00:18:08,220 –> 00:18:09,221
tarush: needs, right?
251
00:18:09,221 –> 00:18:12,040
tarush: Typical 5x companies are looking at us long term.
252
00:18:14,126 –> 00:18:17,880
tarush: They, you know, they want to get value from data. They don’t know, kind of, how to
253
00:18:17,963 –> 00:18:23,402
tarush: go do this. It’s really time for that core data stack, how you collect it, ingest it,
254
00:18:23,635 –> 00:18:29,007
tarush: model it, structure it, report it, it’s really time for that part of the stack to be
255
00:18:29,074 –> 00:18:33,879
tarush: commoditized. And when companies want to hire data, they hire us. They now want to hire
256
00:18:34,046 –> 00:18:38,367
tarush: data scientists and analysts who can focus on the higher levels of the stack, which is
257
00:18:38,433 –> 00:18:41,887
tarush: a competitive advantage, which is where you can build data products or build
258
00:18:41,954 –> 00:18:47,326
tarush: models and really use that as an advantage. They have no advantage in doing the core
259
00:18:47,559 –> 00:18:52,447
tarush: stuff. They just need to go do it. And what we’re showing now is we can just do it
260
00:18:52,598 –> 00:18:57,085
tarush: faster, cheaper and higher quality than you can do it in house or work with a
261
00:18:57,085 –> 00:19:02,357
tarush: consulting company. And, you know, we aren’t replacing data teams. We’re not. We’re
262
00:19:02,441 –> 00:19:07,329
tarush: not making data teams obsolete. We’re just having them focus on the higher levels of
263
00:19:07,396 –> 00:19:11,166
tarush: the stack, which is really where they want to be focusing, right? Like, I sort
264
00:19:11,233 –> 00:19:15,153
tarush: of read the study that eighty percent of data scientists today spend time ingesting
265
00:19:15,320 –> 00:19:19,474
tarush: data and cleaning it and modeling it, and that’s not what they want to be doing.
266
00:19:20,209 –> 00:19:25,314
tarush: Um, And we can just do it right now. Um, you know, because of this platform we’ve
267
00:19:25,397 –> 00:19:29,401
tarush: built and stitched together, and because we can hire really smart engineers in
268
00:19:29,484 –> 00:19:35,073
tarush: India, and then pretrain them on the modern-day best practices. Um, you
269
00:19:35,157 –> 00:19:39,077
tarush: know, we see ourselves as, you know, fifty percent platform, fifty percent
270
00:19:39,328 –> 00:19:44,917
tarush: services, and together, uh, our goal is really how do we commoditize this
271
00:19:45,083 –> 00:19:50,038
tarush: layer and just do it as a service across any, you know, sort of Series A, early-stage
272
00:19:50,289 –> 00:19:52,274
tarush: company, but also you know about
273
00:19:53,475 –> 00:19:56,929
tarush: about twenty percent of our companies right now are, you know, Fortune five hundred
274
00:19:57,162 –> 00:20:02,034
tarush: companies, public companies. Uh, they’re non-technical. So, think of, you know, the
275
00:20:02,117 –> 00:20:05,888
tarush: McDonald’s, Burger Kings, or like the large liquor companies of the world, where they
276
00:20:05,954 –> 00:20:10,525
tarush: might have used Accenture in the past, but realize that Accenture is not a long-term
277
00:20:10,759 –> 00:20:12,928
tarush: solution towards owning your data stack.
278
00:20:13,078 –> 00:20:16,765
tarush: And now we can just do it as a service far more effectively and far more cost
279
00:20:17,065 –> 00:20:18,066
tarush: effectively. Um.
280
00:20:19,134 –> 00:20:20,135
tarush: So for them,
281
00:20:20,552 –> 00:20:24,006
anthony: Yeah, well, and and I think in in having spent a fair amount of time in large
282
00:20:24,156 –> 00:20:28,877
anthony: organizations, you know, the the traditional consulting model is very expensive and
283
00:20:29,044 –> 00:20:34,316
anthony: isn’t particularly well suited to this set of challenges. And I think about, you
284
00:20:34,399 –> 00:20:38,954
anthony: know, from that recruiting perspective, even if you wanted to, getting people to say
285
00:20:39,037 –> 00:20:43,358
anthony: hey, I’m at McDonald’s and, you know, great global organization, but not
286
00:20:43,525 –> 00:20:47,763
anthony: necessarily the first place the top tech talent is thinking of going to work,
287
00:20:47,996 –> 00:20:52,801
anthony: right? There’s an added challenge when the places that
288
00:20:52,968 –> 00:20:57,356
anthony: people want to be tend to be the technology companies themselves, the Googles or
289
00:20:57,356 –> 00:21:01,193
anthony: Facebooks, the Silicon Valley companies that are building out the technologies.
290
00:21:01,360 –> 00:21:04,313
anthony: Those are the people who are going to get the highest talent. But if I’m
291
00:21:04,479 –> 00:21:08,634
anthony: McDonald’s, I don’t want to settle for second tier. And so I’ve got a gap there
292
00:21:08,717 –> 00:21:10,552
anthony: that’s going to be difficult to fill. The other
293
00:21:10,719 –> 00:21:15,273
anthony: thing that you were talking about that, uh, really, uh, got me thinking is that, you
294
00:21:15,273 –> 00:21:19,044
anthony: know, so my audience knows like I’m all about data leadership. Data leadership is
295
00:21:19,127 –> 00:21:22,481
anthony: my thing. That’s what I do. That’s what my book is on. That’s what the podcast is
296
00:21:22,648 –> 00:21:27,286
anthony: about. And I think about the, the fundamentals of data leadership are really about
297
00:21:27,602 –> 00:21:33,208
anthony: having a business do something different as a result of the insights it gets from
298
00:21:33,275 –> 00:21:37,446
anthony: data. So data analytics technology doesn’t really matter. It’s all about improving
299
00:21:37,679 –> 00:21:41,917
anthony: business outcomes. So how am I driving an increase in revenue, a decrease in cost,
300
00:21:42,167 –> 00:21:46,555
anthony: better risk management. That’s really it. And when I think about what matters in
301
00:21:46,722 –> 00:21:50,158
anthony: that, it’s the decision and the action of the business.
302
00:21:51,443 –> 00:21:56,164
anthony: Everything precursor to that, the analysis of the data, the data warehousing and
303
00:21:56,164 –> 00:21:59,918
anthony: the modeling. I have a question for you around data modeling in a minute, but the
304
00:22:00,802 –> 00:22:04,323
anthony: uh, all of that heavy lifting on the data stuff
305
00:22:05,357 –> 00:22:09,928
anthony: isn’t to your point, like it’s not a competitive advantage. It’s table stakes. It’s
306
00:22:09,995 –> 00:22:14,316
anthony: stuff everybody has to do, and if you can solve for that most efficiently and most
307
00:22:14,483 –> 00:22:19,204
anthony: cost effectively, working with the top talent wherever you can source it, and your
308
00:22:19,438 –> 00:22:24,726
anthony: internal energies as an organization with those people who have to know your
309
00:22:24,960 –> 00:22:29,681
anthony: business from the inside. Because that’s where the business decisions get made. You
310
00:22:29,765 –> 00:22:35,604
anthony: can connect up this massive power of data analytics without trying to recreate it
311
00:22:35,754 –> 00:22:39,274
anthony: in every business that exists out there, and that’s, I think, an important
312
00:22:39,608 –> 00:22:43,995
anthony: lesson for those students of data leadership, to say: what is the competitive
313
00:22:44,079 –> 00:22:48,166
anthony: advantage for me? If I’m a trading firm, maybe the analytics and the stack, because
314
00:22:48,316 –> 00:22:52,237
anthony: of the speed and latency competitive threat, maybe that is part of your core
315
00:22:52,404 –> 00:22:59,277
anthony: business. But if you are a CPG firm, you are not competing on most of that
316
00:22:59,444 –> 00:23:05,434
anthony: stack. You’re getting to that last one or two parts of that life cycle for
317
00:23:05,517 –> 00:23:08,804
anthony: it to actually matter for your business competitively. So what? what’s your
318
00:23:08,887 –> 00:23:11,523
anthony: reaction to that? Would you agree with that, or what else
319
00:23:11,573 –> 00:23:12,574
anthony: would you say to it?
320
00:23:12,924 –> 00:23:16,762
tarush: No, I one hundred percent agree with that. You know, the legwork is just
321
00:23:16,928 –> 00:23:21,633
tarush: something you have to do. It’s the cost of admission towards eventually figuring out
322
00:23:21,800 –> 00:23:25,554
tarush: the higher levels of the stack where you have your competitive advantage, insights
323
00:23:25,637 –> 00:23:29,007
tarush: or strategy pieces. Making all of that stuff and data engineering and
324
00:23:29,074 –> 00:23:33,795
tarush: ingesting data was just the cost of admission. You know, what’s happening now
325
00:23:34,045 –> 00:23:39,568
tarush: is it’s just becoming so much more clear how effective this model is. Like, we just had
326
00:23:39,718 –> 00:23:42,604
tarush: one of our first few case studies, which showed that we were
327
00:23:44,206 –> 00:23:49,244
tarush: twenty to twenty-five times cheaper than a consultant coming in and doing it, and we
328
00:23:49,478 –> 00:23:53,965
tarush: did more with one engineer in three months than they did with five engineers in six
329
00:23:54,199 –> 00:23:58,603
tarush: months. And it fundamentally comes from two different areas, and the main one
330
00:23:58,753 –> 00:24:02,841
tarush: is, we don’t start from scratch, right? On day one, you get the infrastructure ready to
331
00:24:02,924 –> 00:24:08,113
tarush: go, and that takes companies about six months by itself. And that also means
332
00:24:08,196 –> 00:24:12,117
tarush: that companies have to maintain it, whereas in the 5x model, you know, we have
333
00:24:12,200 –> 00:24:16,755
tarush: a large platform team doing this. So you know what’s a big focus for us in the next
334
00:24:16,922 –> 00:24:21,560
tarush: six months is data observability and data lineage. But as soon as we figure out what the right
335
00:24:21,726 –> 00:24:26,364
tarush: vendor is and the direction we want to go, all of our customers get it for free. And
336
00:24:26,515 –> 00:24:30,368
tarush: you know, that is serious engineering manpower if you try to go do it yourself.
337
00:24:31,887 –> 00:24:35,240
tarush: And, you know, I think the more interesting piece, which you sort of
338
00:24:35,323 –> 00:24:40,679
tarush: touched on, is top talent historically has wanted to go work for Google and top-tier
339
00:24:40,896 –> 00:24:41,897
tarush: sort of companies.
340
00:24:43,081 –> 00:24:46,434
tarush: And you know what we’re really seeing is that
341
00:24:47,953 –> 00:24:53,725
tarush: talent is now also a little bit fed up of being one of fifty data engineers in
342
00:24:53,959 –> 00:24:58,847
tarush: Infosys or an Accenture, or, you know, even in a top tech company, because your
343
00:24:58,997 –> 00:25:01,716
tarush: relative marginal impact starts to dwindle
344
00:25:03,084 –> 00:25:08,607
tarush: What really happens in the 5x model is that, you know, because of our platform,
345
00:25:09,007 –> 00:25:13,962
tarush: a relatively mid-level engineer is just capable of doing so much more, and
346
00:25:15,247 –> 00:25:17,315
tarush: we solve for that problem that you
347
00:25:17,399 –> 00:25:21,803
tarush: know, most businesses today, if they hire a data hire, they don’t speak data, and
348
00:25:21,803 –> 00:25:25,640
tarush: the data hire doesn’t speak business, and they don’t really have a good way of
349
00:25:25,807 –> 00:25:31,313
tarush: working well together. And hence the company needs to get to a certain size to really
350
00:25:31,646 –> 00:25:37,168
tarush: hire data leadership. And only at that point are they able to really work with
351
00:25:37,235 –> 00:25:41,640
tarush: your mid-level engineer, where, finally, the data leadership acts like that translator.
352
00:25:42,440 –> 00:25:46,678
tarush: In the 5x model, you know, the process by which we work with the company makes
353
00:25:46,845 –> 00:25:51,483
tarush: it very clear, you know, what we need from the company and what the engineer needs
354
00:25:51,633 –> 00:25:55,887
tarush: to do in order to add value. In some ways, you know, think of us as that
355
00:25:56,037 –> 00:25:57,088
tarush: translator between
356
00:25:58,206 –> 00:26:02,594
tarush: hiring engineering talent and actually effectively communicating with the
357
00:26:02,677 –> 00:26:06,848
tarush: company on helping the company hit their business decisions. Because you know at the
358
00:26:06,848 –> 00:26:11,236
tarush: end of the day no one hires data for the sake of building a data team. It’s always a
359
00:26:11,319 –> 00:26:12,604
tarush: means to optimize the business,
360
00:26:13,805 –> 00:26:19,394
tarush: and what I’m extremely bullish on is this idea of, you know, with the 5x
361
00:26:19,644 –> 00:26:23,565
tarush: platform, with this process we’ve built on helping engineers work with businesses,
362
00:26:24,766 –> 00:26:29,638
tarush: you know, a single data engineer is able to go into a company and really add so
363
00:26:29,638 –> 00:26:30,639
tarush: much more value
364
00:26:31,473 –> 00:26:36,278
tarush: by being in a more diverse environment where they bring data skills to the table
365
00:26:36,528 –> 00:26:40,448
tarush: which the company really needs, and being in the same room as a decision maker in
366
00:26:40,448 –> 00:26:44,436
tarush: the CEO and Chief Product Officer, instead of being one of fifty,
367
00:26:46,204 –> 00:26:52,277
tarush: being in a fifty-person data team in a larger company where they are optimizing one
368
00:26:52,444 –> 00:26:58,199
tarush: query, or, like, running, you know, one small part, one level of one small segment of a
369
00:26:58,199 –> 00:26:59,200
tarush: product, so
370
00:27:00,118 –> 00:27:02,604
tarush: I believe that with our model,
371
00:27:04,039 –> 00:27:08,193
tarush: a lot of data engineers are very excited with what we are doing,
372
00:27:08,526 –> 00:27:14,049
tarush: and the idea being that their skill sets are so much more valued and appreciated in
373
00:27:14,049 –> 00:27:15,050
tarush: companies
374
00:27:15,567 –> 00:27:17,485
tarush: that are just at that right size.
375
00:27:19,204 –> 00:27:22,724
anthony: It’s almost like you’re able to create
376
00:27:23,842 –> 00:27:28,713
anthony: an environment or or an opportunity for those data engineers to have a a massive
377
00:27:28,797 –> 00:27:32,233
anthony: arsenal at their disposal. Like, you’re creating, I don’t want to use
378
00:27:32,233 –> 00:27:35,754
anthony: the term, super soldiers is not quite what I’m going for, but they’re able to
379
00:27:35,837 –> 00:27:41,359
anthony: come in, and I have this vision of like the Iron Man, uh costume, or something like
380
00:27:41,593 –> 00:27:45,997
anthony: the mechanical suit, is that they’re able to do so much more than a single
381
00:27:46,247 –> 00:27:50,552
anthony: individual typically would be, because they have the surroundings of this platform
382
00:27:50,802 –> 00:27:55,040
anthony: that 5x has created so that they can do so much, but you still have to have
383
00:27:55,357 –> 00:27:59,594
anthony: that connection point to what’s unique in that business, and so your model is is to
384
00:27:59,678 –> 00:28:03,998
anthony: put them with that business and to have that. The best part of consulting is
385
00:28:04,165 –> 00:28:08,153
anthony: coming in and and personalizing for that organization what that organization
386
00:28:08,319 –> 00:28:12,640
anthony: particularly needs. When I was doing consulting, um, more frequently, I
387
00:28:12,807 –> 00:28:16,961
anthony: always said the patterns are easy. I’ve been doing this a long time. I can see the
388
00:28:16,961 –> 00:28:22,317
anthony: patterns immediately. What I home in on is what makes this company unique. What?
389
00:28:22,317 –> 00:28:25,920
anthony: What is the problem that this company has that is different than every other
390
00:28:26,087 –> 00:28:30,642
anthony: company I’ve seen? Because if I can get to that, we’ve got it figured out. Like, that is
391
00:28:30,792 –> 00:28:33,278
anthony: the part of the solution. You can solve any problem.
392
00:28:34,079 –> 00:28:39,367
anthony: You have to understand the problem at that level. And the only way you can do that
393
00:28:39,834 –> 00:28:45,607
anthony: is to recognize all of that platform, all of that power that 5x delivers to
394
00:28:45,673 –> 00:28:50,645
anthony: those data engineers is an important part, but not the entire solution. It’s that
395
00:28:50,795 –> 00:28:53,998
anthony: human who can really look at that uniqueness that
396
00:28:54,482 –> 00:28:58,403
anthony: becomes that critical link in 5x really helping that organization achieve
397
00:28:58,553 –> 00:29:00,088
anthony: everything it can possibly do.
398
00:29:00,088 –> 00:29:05,560
tarush: Yeah, you know, I was very much, you know, getting started as a data engineer when
399
00:29:05,560 –> 00:29:08,446
tarush: it was recognized as a profession, and I remember Maxime,
400
00:29:10,999 –> 00:29:15,236
tarush: the guy at Airbnb, wrote this super famous article. It was called The Rise
401
00:29:15,320 –> 00:29:19,808
tarush: of the Data Engineer, and that was such an inspiring, that was such an inspiring
402
00:29:20,041 –> 00:29:24,195
tarush: article, which sort of spoke about this new field, which is data engineering, and how every
403
00:29:24,362 –> 00:29:28,917
tarush: company is going to really need it. And, you know, in some ways, like,
404
00:29:30,769 –> 00:29:34,122
tarush: what we are building, if you look at 5x and the people we’re hiring, right, we
405
00:29:34,289 –> 00:29:39,727
tarush: hire data engineers, and our goal is really to commoditize this and offer it as a
406
00:29:39,878 –> 00:29:44,365
tarush: mass market service to help you know any business at that stage where they need
407
00:29:44,599 –> 00:29:51,005
tarush: this. So it’s you know, I really do resonate with your analogy of
408
00:29:52,674 –> 00:29:54,275
tarush: you know, solving all of those
409
00:29:55,560 –> 00:30:00,999
tarush: all of the challenges in data engineering today and really giving them superpowers so
410
00:30:01,166 –> 00:30:05,403
tarush: that we can start to embed them across all of these businesses globally and give
411
00:30:05,553 –> 00:30:08,356
tarush: them the right tools, and give the companies the right tools to work with these
412
00:30:08,439 –> 00:30:14,529
tarush: engineers, because the skill set they bring in terms of data hygiene, data quality,
413
00:30:14,762 –> 00:30:18,116
tarush: building or building the right foundations for you to
414
00:30:19,167 –> 00:30:20,919
tarush: leverage data for your,
415
00:30:22,353 –> 00:30:24,122
tarush: for your top-of-funnel metrics, and
416
00:30:25,323 –> 00:30:30,361
tarush: top-of-funnel insights, is extremely critical, and something which every business is
417
00:30:30,445 –> 00:30:34,516
tarush: going to need in the next ten years, and the businesses which are unable to do it
418
00:30:34,849 –> 00:30:39,087
tarush: just won’t be able to compete, Because when your competition is able to understand
419
00:30:39,404 –> 00:30:44,359
tarush: its market strategy, how customers are using your product, and you’re unable to do it. We
420
00:30:44,442 –> 00:30:49,247
tarush: sort of saw this in digital marketing ten years ago with the rise of Google, and companies
421
00:30:49,647 –> 00:30:54,285
tarush: that don’t do digital marketing today don’t exist anymore like
422
00:30:55,003 –> 00:30:59,557
tarush: we don’t hear of them, because they’re gone. The same thing is happening in data ten years
423
00:30:59,724 –> 00:31:03,561
tarush: later, and history has shown us what happened with companies that didn’t get into
424
00:31:03,645 –> 00:31:07,966
tarush: digital marketing. The companies that don’t get into data now just won’t exist in
425
00:31:07,966 –> 00:31:08,967
tarush: years from now.
426
00:31:09,517 –> 00:31:15,757
anthony: I completely agree and I think that the barriers to entry for data are so low. Like
427
00:31:16,157 –> 00:31:21,996
anthony: the amount that you can do with very limited investment is tremendous now. And
428
00:31:22,180 –> 00:31:26,634
anthony: and that’s raised that minimum kind of table stakes level of like. If you just want
429
00:31:26,801 –> 00:31:30,805
anthony: to compete. If you want to exist, there’s a certain amount that you that you just
430
00:31:30,955 –> 00:31:35,843
anthony: have to do. And and so I feel like I, I have to just be in full disclosure to the
431
00:31:35,994 –> 00:31:40,481
anthony: audience out there, and to you, is I grew up as a data engineer. Like,
432
00:31:40,632 –> 00:31:45,687
anthony: probably more so than anything, I would consider the core of my career started as
433
00:31:45,920 –> 00:31:49,841
anthony: being a data engineer. And so this is all very close to my heart. and it’s great
434
00:31:50,074 –> 00:31:52,877
anthony: because, like, twenty years ago I could have told you, like, data
435
00:31:53,044 –> 00:31:56,965
anthony: engineers, we need more respect. And so now we’re finally at this point where we
436
00:31:57,048 –> 00:32:03,288
anthony: can see in the marketplace where this value proposition has become understood and
437
00:32:03,288 –> 00:32:07,842
anthony: and realized, and that successful business is focused in this what could be
438
00:32:07,925 –> 00:32:12,313
anthony: considered a relatively narrow band of the entire value proposition of an
439
00:32:12,313 –> 00:32:13,314
anthony: organization.
440
00:32:14,082 –> 00:32:19,354
anthony: It, it is now clear that that’s an incredibly important part of that that value, so
441
00:32:19,604 –> 00:32:23,207
anthony: speaking as one data engineer to another, I promised that I had a question
442
00:32:23,358 –> 00:32:26,561
anthony: for you around data modeling. And this takes us in a little bit of a different
443
00:32:26,644 –> 00:32:29,280
anthony: direction, but I have to ask you anyway, because there’s only so many guests that
444
00:32:29,364 –> 00:32:34,952
anthony: I can ask this question to. Is data modeling dead? Like, is data modeling
445
00:32:35,286 –> 00:32:40,725
anthony: still relevant today? Because I think there are many organizations out there,
446
00:32:41,042 –> 00:32:45,279
anthony: especially when you think of things like Tableau or these other tools that try to blur
447
00:32:45,680 –> 00:32:49,767
anthony: what used to be very distinct lines between different parts of that data life
448
00:32:49,917 –> 00:32:55,123
anthony: cycle. There’s there’s an ability to do so much with the technology
449
00:32:56,474 –> 00:33:03,197
anthony: that some of the fundamentals of modeling data or creating, um, you know, abstractions from
450
00:33:03,448 –> 00:33:07,118
anthony: the data sources in ways that can connect to the business, I think some of that
451
00:33:07,368 –> 00:33:12,640
anthony: has certainly diminished in attention. Despite what we were just talking about with
452
00:33:12,724 –> 00:33:16,878
anthony: data engineering being recognized as super important. I see them moving in a little
453
00:33:16,878 –> 00:33:19,998
anthony: bit of a different direction, and I want to see, what’s your take on the
454
00:33:20,081 –> 00:33:21,833
anthony: situation of data modeling today?
455
00:33:21,983 –> 00:33:22,984
tarush: Yeah, you
456
00:33:24,602 –> 00:33:27,638
tarush: know, I think data modeling is more important than ever. And
457
00:33:28,673 –> 00:33:32,927
tarush: what we’re seeing is with a lot of these tools like Tableau and these BI tools,
458
00:33:33,077 –> 00:33:36,597
tarush: which can directly connect to your source systems and give you a level of insights.
459
00:33:37,248 –> 00:33:41,636
tarush: Those are great starter tools to get started when you don’t have these capabilities.
460
00:33:42,353 –> 00:33:47,642
tarush: But, you know, if you don’t have that clean, distinct data modeling layer
461
00:33:47,809 –> 00:33:52,447
tarush: which is separate from your raw data layer, then what starts happening is, for
462
00:33:52,597 –> 00:33:57,719
tarush: any new analysis or insight in these tools, you are starting to have application logic
463
00:33:57,885 –> 00:34:02,757
tarush: inside of them, which means that Tableau has now got some sort of application logic
464
00:34:02,924 –> 00:34:05,076
tarush: on top of your raw data, which is what you report
465
00:34:05,393 –> 00:34:10,114
tarush: on. If you surface this data back to your customers, you have another copy of it.
466
00:34:10,281 –> 00:34:14,118
tarush: You start having a massive fan out problem, and every time you want to change one
467
00:34:14,202 –> 00:34:19,640
tarush: metric, you now have to change it in ten different places. And that is just software
468
00:34:19,707 –> 00:34:20,708
tarush: engineering. We have
469
00:34:22,760 –> 00:34:23,878
tarush: that this is data
470
00:34:25,079 –> 00:34:30,118
tarush: and that’s, you know, very, very significant. And it’s just like building a skyscraper.
471
00:34:30,368 –> 00:34:34,355
tarush: Right? You have to dig up the earth and build a foundation if
472
00:34:34,355 –> 00:34:38,593
tarush: you want to build a skyscraper. Now, if you’re serious about getting value from
473
00:34:38,760 –> 00:34:44,849
tarush: data, you know, ignoring your data modeling layer and going, you know, going
474
00:34:45,083 –> 00:34:49,954
tarush: back to your raw data directly allows you to build a few stories.
475
00:34:51,322 –> 00:34:54,675
tarush: You’re not really going to be able to go build a skyscraper if you ignore your
476
00:34:54,675 –> 00:35:00,114
tarush: data modeling layer. So, you know, in a world where we now want to go higher and higher
477
00:35:00,281 –> 00:35:05,553
tarush: and more, you know, upstream in terms of this pyramid, you know, using Maslow’s pyramid of needs
478
00:35:05,636 –> 00:35:12,527
tarush: for data: data collection, ingestion, storage, modeling, reporting, um, experimentation, analytics.
479
00:35:12,927 –> 00:35:17,081
tarush: If you choose to ignore your data modeling layer, it becomes really hard to eventually
480
00:35:17,248 –> 00:35:23,087
tarush: focus on data science and all of those areas, because you just have all of this debt
481
00:35:23,888 –> 00:35:27,008
tarush: in the middle of your pyramid, which becomes very expensive to maintain.
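The fan-out problem described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not 5x's implementation: the rows in `RAW_ORDERS`, the `net_revenue` rule, and the two consumer functions are all hypothetical. The point is that the metric's business rule lives in one modeling-layer function, and every downstream consumer calls it instead of keeping its own copy of the logic.

```python
# Hypothetical raw-layer rows; field names are illustrative assumptions.
RAW_ORDERS = [
    {"order_id": 1, "amount": 120.0, "status": "paid"},
    {"order_id": 2, "amount": 80.0, "status": "refunded"},
    {"order_id": 3, "amount": 50.0, "status": "paid"},
]


def net_revenue(orders):
    """Modeling layer: the single place where the metric's rule is defined."""
    return sum(o["amount"] for o in orders if o["status"] == "paid")


def dashboard_report(orders):
    """A BI-style consumer (hypothetical) that reuses the modeled metric."""
    return {"net_revenue": net_revenue(orders)}


def customer_export(orders):
    """A customer-facing consumer (hypothetical) that reuses the same metric."""
    return {"net_revenue": net_revenue(orders)}


if __name__ == "__main__":
    # Both consumers agree because neither re-implements the rule.
    print(dashboard_report(RAW_ORDERS))
    print(customer_export(RAW_ORDERS))
```

If the definition of net revenue changes, only `net_revenue` has to change, rather than ten separate places.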
482
00:35:27,992 –> 00:35:31,279
anthony: I, I completely agree with you, and and I’m I was hoping that that was the
483
00:35:31,362 –> 00:35:35,283
anthony: direction that you were going to go in. Um, and I use the analogy, I
484
00:35:35,366 –> 00:35:39,437
anthony: like the analogy of creating that skyscraper. I think it’s, you know, the
485
00:35:39,687 –> 00:35:43,674
anthony: classic analogy is creating this house, and it’s not about a house anymore.
486
00:35:43,841 –> 00:35:47,762
anthony: It’s about creating the skyscraper. It’s, um, an analogy we’ve been using, um, in
487
00:35:47,845 –> 00:35:51,032
anthony: my circles for a little while, and I think it works really well, so I’m
488
00:35:51,516 –> 00:35:56,954
anthony: thrilled to hear you. Uh, talk about that. Um. I also would argue that I think that
489
00:35:57,121 –> 00:36:02,159
anthony: sometimes you can see symptoms of the lack of certain capabilities in other areas. I
490
00:36:02,243 –> 00:36:08,966
anthony: would argue that the lack of data modeling has led to a lot of challenges when it
491
00:36:09,033 –> 00:36:13,838
anthony: comes to master and reference data in organizations, because the lack of modeling on
492
00:36:13,838 –> 00:36:18,559
anthony: the front end creates what ends up being duplicate copies and variant copies
493
00:36:18,726 –> 00:36:22,880
anthony: of different sets of data that should be managed more collectively. And if you
494
00:36:22,964 –> 00:36:26,484
anthony: don’t have decent data modeling, you’re not going to have decent master data and
495
00:36:26,484 –> 00:36:28,152
anthony: reference data management. And so you
496
00:36:28,402 –> 00:36:32,874
anthony: start to see these things trickle out into big problems. Because if you
497
00:36:32,957 –> 00:36:37,445
anthony: don’t solve for that, you’ve got challenges to reconcile data that needs to
498
00:36:37,528 –> 00:36:38,796
anthony: work across different systems.
499
00:36:39,163 –> 00:36:44,518
tarush: Hundred percent. I think, you know, um, I refer to that as having a single source of
500
00:36:44,602 –> 00:36:49,957
tarush: truth, and, you know, not having a single modeling layer, which
501
00:36:50,041 –> 00:36:54,595
tarush: powers, then, BI, machine learning, or all of these other layers,
502
00:36:54,929 –> 00:36:59,233
tarush: leads to having multiple sources of truth, and very practically multiple sources of
503
00:36:59,317 –> 00:37:03,321
tarush: truth is probably the worst thing for companies, because what happens is that
504
00:37:03,554 –> 00:37:07,008
tarush: finance looks at a number and sales looks at a number, and these numbers don’t
505
00:37:07,241 –> 00:37:12,763
tarush: match. And for a data team, you know, you lose all opportunity at that point to
506
00:37:12,914 –> 00:37:17,568
tarush: explain that we were copying the data to two different data sources,
507
00:37:17,718 –> 00:37:20,288
tarush: and that finance went ahead and changed some of those numbers on an Excel
508
00:37:20,438 –> 00:37:25,009
tarush: spreadsheet and sales did the same. Hence those two numbers don’t match. You never get
509
00:37:25,159 –> 00:37:28,996
tarush: an opportunity to come out of that. You know, at that point it instantly just
510
00:37:29,246 –> 00:37:33,634
tarush: becomes: the numbers don’t match, we don’t trust the data team, and every department
511
00:37:33,884 –> 00:37:38,205
tarush: becomes their own gatekeeper, and those are tricky, tricky problems to really come
512
00:37:38,356 –> 00:37:44,278
tarush: out of, and it all comes back to you know, having a consistent data modeling layer
513
00:37:44,679 –> 00:37:50,284
tarush: which is agreed on, and then the businesses you know, using that single source of
514
00:37:50,284 –> 00:37:54,038
tarush: truth, and if anyone wants to change it, not changing it directly, but actually
515
00:37:54,288 –> 00:37:58,843
tarush: changing it inside the single source of truth, you know. For people in the data
516
00:37:59,026 –> 00:38:00,027
tarush: space, this seems really,
517
00:38:00,995 –> 00:38:04,448
tarush: you know, fundamental, and I love the fact that you sort of brought this up. But
518
00:38:05,633 –> 00:38:10,838
tarush: given that, you know, modern tools allow us to bypass this and get to insights
519
00:38:11,005 –> 00:38:16,594
tarush: directly, um, these symptoms sort of actually surface in these
520
00:38:16,761 –> 00:38:22,600
tarush: these other ways, and at that point you know companies feel extremely helpless in in
521
00:38:22,767 –> 00:38:25,086
tarush: being able to execute their own data strategies.
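The finance-versus-sales mismatch described above is the kind of thing a simple reconciliation check can surface early. The following Python sketch is illustrative only: the `find_mismatches` helper, the department names, and the numbers in `REPORTS` are hypothetical and not taken from the conversation. It compares the same metric as reported by each department and flags any that disagree.

```python
def find_mismatches(reports, tolerance=0.01):
    """Return each metric whose reported values differ across departments.

    `reports` maps a department name to a {metric: value} dict.
    """
    metrics = set()
    for values in reports.values():
        metrics.update(values)

    mismatches = {}
    for metric in sorted(metrics):
        seen = {
            dept: vals[metric]
            for dept, vals in reports.items()
            if metric in vals
        }
        if seen and max(seen.values()) - min(seen.values()) > tolerance:
            mismatches[metric] = seen
    return mismatches


# Hypothetical numbers: each department adjusted its own spreadsheet copy.
REPORTS = {
    "finance": {"revenue": 170.0, "active_customers": 42},
    "sales": {"revenue": 185.0, "active_customers": 42},
}

if __name__ == "__main__":
    print(find_mismatches(REPORTS))
```

A check like this only detects the symptom. The fix discussed in the conversation is to have every department read from, and make changes inside, one agreed modeling layer.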
522
00:38:25,753 –> 00:38:31,192
anthony: You’re right, and that notion of data trust, right? That comes back to when
523
00:38:31,275 –> 00:38:34,879
anthony: we talked about data leadership and how it’s about driving action. Well, if you
524
00:38:34,879 –> 00:38:37,765
anthony: don’t trust the data, if you don’t trust these analytics that you’re getting,
525
00:38:37,999 –> 00:38:42,803
anthony: you’re not going to take new action based on them, and then you’re going to kind of
526
00:38:43,287 –> 00:38:48,559
anthony: wither away the entirety of that argument that that you made earlier around. Hey,
527
00:38:48,793 –> 00:38:53,848
anthony: every organization needs this data analytics capability to drive these actions, to
528
00:38:53,848 –> 00:38:57,518
anthony: drive these innovations. Well, if you don’t trust them if you have them, it’s just
529
00:38:57,601 –> 00:39:00,488
anthony: as bad as not having them at all. So this is literally,
530
00:39:01,605 –> 00:39:07,595
anthony: the lack of data modeling can be traced to businesses that will fail, I think is
531
00:39:07,678 –> 00:39:11,599
anthony: what we’ve just strung together. and and I think it’s that important. Uh, because
532
00:39:11,999 –> 00:39:18,239
anthony: that data trust is so incredibly essential to getting people to take action based
533
00:39:18,322 –> 00:39:23,594
anthony: on that, especially when that action that that the data suggests may run counter to
534
00:39:23,844 –> 00:39:25,846
anthony: their intuition. And that’s
535
00:39:25,846 –> 00:39:26,847
anthony: the whole point.
536
00:39:27,665 –> 00:39:28,666
tarush: you,
537
00:39:29,066 –> 00:39:30,067
tarush: I think
538
00:39:31,569 –> 00:39:35,005
tarush: this, you know, this is a great example of something which comes with experience,
539
00:39:35,406 –> 00:39:40,194
tarush: right? Like, something like how to set up a data modeling stack, and what
540
00:39:40,361 –> 00:39:42,363
tarush: happens when this is not done correctly.
541
00:39:43,564 –> 00:39:48,686
tarush: What we’re seeing is that the data stack like these best practices are no longer
542
00:39:48,836 –> 00:39:52,523
tarush: just in modeling, but they’re also in ingestion. Like, you know, five years ago,
543
00:39:52,673 –> 00:39:55,409
tarush: companies were building their own pipelines and ingesting data themselves,
544
00:39:55,726 –> 00:39:59,480
tarush: and today, if you do that, you’re not going to be able to focus on the other areas.
545
00:39:59,713 –> 00:40:04,602
tarush: Using fully managed data pipelines and ELT makes so much more sense. We see these
546
00:40:04,769 –> 00:40:08,839
tarush: kind of best practices across all of the layers, right, all the way from data
547
00:40:09,006 –> 00:40:13,077
tarush: collection, how you fire events in the front end and how you track that; the
548
00:40:13,160 –> 00:40:16,847
tarush: ingestion layer, where the average startup has a set of ten to twelve different data
549
00:40:16,997 –> 00:40:21,802
tarush: sources, mapping that stuff into your store; how you store it raw in your raw
550
00:40:21,886 –> 00:40:26,524
tarush: layer; how you build this data modeling layer; and then how does the data modeling
551
00:40:26,674 –> 00:40:31,162
tarush: layer feed reporting or machine learning? And what are the best practices there?
552
00:40:31,245 –> 00:40:35,966
tarush: Where do you have application logic? You know, going back to the first thing I said,
553
00:40:36,033 –> 00:40:41,639
tarush: the modern data stack is becoming really complicated and you know these are super
554
00:40:41,889 –> 00:40:47,561
tarush: nuanced things which it no longer makes sense for you to do yourself, right? We saw
555
00:40:47,728 –> 00:40:52,766
tarush: that with marketing, where, you know, five years ago we realized that just posting a
556
00:40:52,766 –> 00:40:57,004
tarush: picture on Instagram is no longer marketing, and that if you really want to do digital
557
00:40:57,154 –> 00:41:01,408
tarush: marketing, you know, bring in experts who already know how to do it and you know
558
00:41:01,725 –> 00:41:03,644
tarush: that same analogy is with
559
00:41:05,563 –> 00:41:09,884
tarush: it. It just doesn’t make sense, in our opinion, for companies to go
560
00:41:10,034 –> 00:41:16,841
tarush: focus on all of these areas. Um, it’s so nuanced, and the ROI is just
561
00:41:17,007 –> 00:41:21,162
tarush: having a modeling layer which you then, anyway, need to build on top of when
562
00:41:21,245 –> 00:41:25,249
tarush: it comes to decision making, or insights, or, sort of, machine learning.
563
00:41:26,333 –> 00:41:27,334
anthony: Yeah,
564
00:41:28,085 –> 00:41:32,406
anthony: I think that’s an excellent point. And so we’re running out of time. I want to ask you
565
00:41:32,473 –> 00:41:33,524
anthony: one more question. So
566
00:41:35,125 –> 00:41:36,160
anthony: to quote a
567
00:41:37,761 –> 00:41:42,716
anthony: statement I first heard at an agile conference, I think many years ago, is that
568
00:41:43,767 –> 00:41:49,039
anthony: as difficult and as complex and as massive as the data volume and complexity and
569
00:41:49,039 –> 00:41:50,474
anthony: everything that we’re dealing with is today,
570
00:41:51,525 –> 00:41:56,313
anthony: it’s never going to get easier than it is now. And so as we look towards the
571
00:41:56,397 –> 00:41:57,915
anthony: future, what do you see
572
00:41:59,200 –> 00:42:04,722
anthony: evolving and coming next in like the next five or ten years in this space? Where do
573
00:42:04,722 –> 00:42:05,723
anthony: you see it going?
574
00:42:06,123 –> 00:42:11,645
tarush: Yeah, sure. So, I think, in the last five years the understanding of what the modern
575
00:42:11,729 –> 00:42:15,633
tarush: data stack is, right, across these five layers which I keep talking about, data collection,
576
00:42:15,799 –> 00:42:21,155
tarush: ingestion, storage, modeling, reporting, has become
577
00:42:21,322 –> 00:42:24,842
tarush: fairly well understood. Right, who are the best players over there? We now have five
578
00:42:25,009 –> 00:42:28,529
tarush: billion dollar players across each of these spaces. Um,
579
00:42:30,047 –> 00:42:35,002
tarush: what, you know, what I see happening over the next few years is layers six to ten
580
00:42:35,569 –> 00:42:39,723
tarush: start to get more predefined, right? And these are areas like reverse ETL, taking
581
00:42:39,957 –> 00:42:43,160
tarush: these insights from the warehouse, pushing it back into your source systems, data
582
00:42:43,394 –> 00:42:47,881
tarush: observability, a sort of lineage, where, uh, being able to track these different
583
00:42:48,198 –> 00:42:53,404
tarush: data sources, um, just to make sure that if a job fails, how do we rerun them, or
584
00:42:53,804 –> 00:42:59,159
tarush: just knowing where they come from, having metadata around them. Areas like machine
585
00:42:59,393 –> 00:43:01,328
tarush: learning obviously is going to be a big one.
586
00:43:01,962 –> 00:43:06,684
tarush: I think you know the best practices of what are the billion dollar players across
587
00:43:06,834 –> 00:43:10,521
tarush: each of these stacks, and really, how do they fit into the modern-day data stack,
588
00:43:10,754 –> 00:43:15,643
tarush: become, you know, extremely relevant. And what we’re seeing, and, uh, Matt Turck’s State of
589
00:43:15,643 –> 00:43:19,713
tarush: Data Twenty Twenty-One, which is something I really enjoy reading, also helps
590
00:43:20,047 –> 00:43:25,402
tarush: validate, is that companies today want the flexibility. They don’t want to be locked
591
00:43:25,636 –> 00:43:30,441
tarush: into one platform. So the Amazon or the Google, you know, ecosystem is not that
592
00:43:30,441 –> 00:43:31,442
tarush: compelling. Companies
593
00:43:31,475 –> 00:43:34,762
tarush: want to use Snowflake for storage, and they want to use dbt, and they want to use
594
00:43:34,928 –> 00:43:40,434
tarush: Fivetran. So, you know, there’s going to be a lot of innovation in these micro layers,
595
00:43:40,601 –> 00:43:45,005
tarush: not micro, but, you know, a single layer, instead of platformizing four or five
596
00:43:45,155 –> 00:43:50,277
tarush: layers at a time. And I think, you know, I think the next few layers and the
597
00:43:50,361 –> 00:43:55,399
tarush: vendors across those layers become more and more standardized.
598
00:43:55,482 –> 00:43:59,887
tarush: And, you know, there starts to emerge, uh, a sort of winner across
599
00:44:00,037 –> 00:44:04,758
tarush: some of these different layers. So that’s layers, you know, six to ten, and, you know, at
600
00:44:04,842 –> 00:44:08,529
tarush: this point it’s no longer hierarchical. You know, after the data modeling layer it
601
00:44:08,595 –> 00:44:12,116
tarush: all kind of becomes in parallel, like you do reporting in tandem with machine
602
00:44:12,282 –> 00:44:18,205
tarush: learning and in parallel with reverse ETL, so not really six to ten in terms of the
603
00:44:18,355 –> 00:44:22,042
tarush: hierarchy structure, but the next five layers
604
00:44:23,077 –> 00:44:28,766
tarush: or use cases sort of start to emerge as trends. And I think one of those layers,
605
00:44:29,483 –> 00:44:33,637
tarush: um, which is what 5x is focused on, is really assembling the data stack.
606
00:44:34,838 –> 00:44:38,208
tarush: This is going to become another layer as the data stack becomes
607
00:44:38,359 –> 00:44:42,679
tarush: so much more complicated and you realize we need ten different vendors,
608
00:44:42,913 –> 00:44:46,517
tarush: you know, we strongly believe there is going to be room for a company
609
00:44:46,917 –> 00:44:51,088
tarush: like us, which standardizes and gives it to you as a single offering. That’s one area
610
00:44:51,238 –> 00:44:55,159
tarush: which we’re obviously extremely bullish on, but there are going to be more and more layers
611
00:44:55,242 –> 00:44:58,679
tarush: over there. And for the layers which have now been established, we’re going to start
612
00:44:58,679 –> 00:44:59,680
tarush: to find.
613
00:45:01,565 –> 00:45:05,803
tarush: We’re starting to find winners and sort of de facto champions of those layers, which
614
00:45:06,036 –> 00:45:10,040
tarush: sort of emerge, and start to sort of [inaudible]
615
00:45:11,241 –> 00:45:12,276
tarush: those in those areas.
616
00:45:12,960 –> 00:45:17,364
anthony: You know, every now and then as a podcast host you thank yourself for asking
617
00:45:17,514 –> 00:45:21,518
anthony: the question that you asked. And that was one of those times. That was awesome. Like,
618
00:45:21,518 –> 00:45:26,407
anthony: I really like the way you characterize some of these additional layers, and I think
619
00:45:26,473 –> 00:45:31,762
anthony: you’re right. I think you’re absolutely spot on with where this likely heads,
620
00:45:31,995 –> 00:45:35,599
anthony: because we’ve seen patterns of maturity and I was thinking like the accordion
621
00:45:35,766 –> 00:45:40,237
anthony: expands and contracts. We see these repeating cycles, but I totally agree. I think
622
00:45:40,320 –> 00:45:42,406
anthony: that you’re going to see clearer
623
00:45:43,340 –> 00:45:44,341
anthony: stages of
624
00:45:45,275 –> 00:45:49,997
anthony: that evolution of data, subsequent to those five stages that exist today. I
625
00:45:49,997 –> 00:45:54,234
anthony: think that’s brilliant. I think those are great insights to close the
626
00:45:54,234 –> 00:45:57,755
anthony: show with, and we are definitely out of time now. So, Tarush, thank you so much for
627
00:45:57,838 –> 00:46:00,808
anthony: being on the show with us, I really appreciate it. This has been amazing.
628
00:46:01,008 –> 00:46:04,178
tarush: Thank you so much. Thank you for having me. It was so fun. It was so fun chatting
629
00:46:04,178 –> 00:46:05,179
tarush: with you [inaudible] morning.
630
00:46:05,679 –> 00:46:08,632
anthony: Absolutely. And thank you all for joining us today. You’ll find more information
631
00:46:08,966 –> 00:46:12,886
anthony: and links in the show notes. Dive deeper with my book at DataLeadershipBook.com
632
00:46:13,036 –> 00:46:17,357
anthony: and use promo code “ALGMINDL” at the DATAVERSITY Online Training Center for twenty
633
00:46:17,524 –> 00:46:20,878
anthony: percent off your first purchase. And if you enjoy our show and would love your own,
634
00:46:21,044 –> 00:46:24,565
anthony: but don’t know where to start, visit Algmin.com to learn how we make having
635
00:46:24,715 –> 00:46:28,635
anthony: your own video podcast as easy as joining a call and sending an email. Stay safe
636
00:46:28,886 –> 00:46:31,355
anthony: during these unusual times, and go make an impact.