Seed to Exit
Welcome to Seed to Exit, the ultimate podcast all about startups, scaling, and venture capital. Your host is Riece Keck: Startup veteran and recruitment entrepreneur.
Join us as we dive into the journeys of startup founders and venture capitalists who share their insights, successes, and lessons learned from seed stage to successful exit.
Each episode, we bring you candid conversations with startup founders, executives, and investors. Whether you're looking for inspiration, actionable advice, or a deeper understanding of the startup ecosystem, Seed to Exit offers invaluable knowledge and real-world experiences to help you on your entrepreneurial journey.
Tune in to Seed to Exit and get ready to be inspired, educated, and connected with the exciting and ever-changing world of startups and venture capital.
Manu Sharma, Founder and CEO of Labelbox | Transforming AI Model Development | The Evolution of AI Data Labeling and Future Insights
Discover how Manu Sharma, the visionary founder and CEO of Labelbox, transformed his childhood inspirations into a pioneering force in AI model development. From his formative years in India where he was surrounded by a family of artists and engineers, Manu nurtured a passion for building and tinkering with limited access to technology. This episode promises to take you on a journey from his learning through encyclopedias to his impactful career in technology, highlighting how his self-taught skills in software and coding laid the foundation for Labelbox, a platform essential for enhancing AI models.
Explore the fascinating evolution of AI data labeling as Manu shares how early experiences with Flash programming led to the creation of Labelbox. As a key player in transforming base AI models into more aligned and human-interactive versions, Labelbox operates as a “data factory” that connects qualified experts with labeling tasks. Manu explains the company's innovative approach to overcoming data scarcity and the potential of synthetic data, offering insights into the critical role of data and compute in training AI models.
Journey with us into the transformative growth of Labelbox from its inception through its adaptation to the rise of models like ChatGPT. Manu discusses the strategic engagement with researchers and engineers and how the company navigated a rapidly changing market to become a leader in the AI industry. He provides a glimpse into the future of AI, focusing on its potential to serve as personal teachers and enhance learning across languages. As Manu reflects on the legacy he hopes to leave, he envisions a world where AI systems significantly impact everyday life, emphasizing Labelbox’s commitment to advancing AI's reasoning abilities.
All Links: linktr.ee/startup_recruiting
LinkedIn: www.linkedin.com/in/riecekeck/
Twitter/X: x.com/tech_headhunter
Recruitment: www.mindhire.ai
Youtube: https://www.youtube.com/@seedtoexitpod
To build these amazing models that we all are using every day, they fundamentally need two things. One is compute, so they need lots of GPUs to train these models on. The second big ingredient that goes into training these models is data, and there are generally two kinds of processes for training these models. One is pre-training, where you may have heard that these models are trained on internet data. That's true, and a lot of these companies download all of the data, they crawl all the websites, all the sources. In fact, they will even buy data from private places, like libraries and private websites like Reddit and Stack Overflow and so forth. So that initial phase is to essentially train a base model.
Speaker 2:Thanks for listening to Seed to Exit. I'm very grateful to have you tune into another episode, and this is one I'm very excited about. Today I'm thrilled to welcome Manu Sharma. Manu is the founder and CEO of Labelbox, which is a data factory platform that's becoming essential for developing AI models. Under his leadership, Labelbox has secured over $188 million in funding from investors like A16Z and SoftBank. We're going to cover a lot of topics on entrepreneurship, AI development and what the future of AI looks like. I hope you enjoy the show.
Speaker 3:You're listening to the Seed to Exit podcast with your host, Riece Keck. Here you'll learn from startup executives, founders, investors and industry experts. You'll learn from the best about building amazing products, scaling companies, raising capital, hiring the right people and more. Subscribe and listen in for new episodes, and enjoy the show.
Speaker 2:Okay, Manu, welcome on. Excited to have you. Excited to be here, thank you for having me. I've found a lot of great entrepreneurs have had unique upbringings or something that really shaped them into what they are. So, just for context, what was your upbringing like? Was there anything formative that gave you the drive to get to where you are today?
Speaker 1:Yeah, so I grew up in India, in the north part of India, a place called Roorkee, and I grew up in a family of artists and engineers. As far back as I can remember, my memories are filled with building or tinkering with things with my father or my grandfather. During middle school and so forth, my mom was pursuing fashion design, so I got to see the art world. I used to often spend time in textile factories, where the cotton comes in from the fields and, on the other side, high quality fabric is exported out of India and across the world. Those are some of my most cherished memories growing up. But at the same time, life was fairly simple, in the sense that I didn't have access to a lot of things like a computer or a phone until later, in my teens. So a lot of the ways that I would learn about things that interested me was in libraries. I would go to libraries and learn from encyclopedias, explore subjects like physics and chemistry, and come back home and try to tinker with things that could get close to some of the experiments I would learn about in those books. Anyway, those formative times led me to dream of pursuing technology, particularly aircraft design and aerospace engineering. I would sometimes watch astronauts on TV and it felt like, okay, this is super cool that humans can actually do these things.
Speaker 1:And, to fast forward, with a lot of hard work and luck I was able to come to America for my studies. I was very singularly focused on a few subjects. I thought at the time that aerospace engineering was really interesting. I loved airplanes, and wanted to explore even potentially being a pilot. Some other subjects like physics and technology were also very interesting, and I was also very good with software and coding growing up. Anyway, so I got to America as part of the education program.
Speaker 1:Got to America as part of the education program and I think that was sort of the kind of the big leap for me to come to America to learn all these amazing ways in our system. Here is the country about entrepreneurship, about building businesses, following your passions and so forth, and so that's kind of like how it all began. Uh, and some of these most formative times, you know, uh, a lot of the uh, a big, of a big challenge of you know, um, that I had to overcome was to, you know, come to America and and be in the right sort of uh, have the right conditions for me to be able to, like, start companies or stay in the country and so forth.
Speaker 2:Well, I think that's super cool for a couple of reasons. One, I mean, I think we all wanted to be astronauts when we were kids, so the fact that you actually built an aerospace company later in life is really fascinating. One question: you mentioned that you were really good with technology and coding, but you also mentioned that you didn't get a phone or computer until your teenage years. So how did you even get started in software engineering? Was it entirely self-taught?
Speaker 1:Yeah, when I first got my computer at home, I got hooked. I would learn all kinds of software programs. I learned tools like AutoCAD and Photoshop, of course programming with BASIC at the time, and at some point I learned Flash. It opened up a whole new world of things I could learn. And of course, when I went to school, the school had a computer lab, and that's where I learned too. So in a span of maybe five or six years I got really good at coding, so much so that I would hack around in our school and pull all these tricks with my buddies. In fact, I even attempted to start a business with my Flash programming skills. I saw that somebody in the US had made the Million Dollar Homepage and was selling pixels for a dollar, and I thought that was super cool. At the time I was like, oh my God, maybe I could go do something like that. So I built a website in Flash. Being out there, getting your idea out and doing things, that was the most important lesson from that story. But that's how it all got started.
Speaker 1:And what's really interesting is that a lot of people ask me how I ended up running a software company, given all of this background. It turns out that nearly every engineering discipline that one goes to school for requires software engineering, requires you to code. All of these equations in physics or math and so forth have to ultimately be coded up to run simulations, to run programs. So coding and software is a very core part of every engineering discipline at this point. However, there is a professional software engineering skill set that you don't really get to learn in school as much, which is how to make production software at scale, and that really comes from experience.
Speaker 2:Absolutely. And you mentioned the software company that you're doing now, and Labelbox occupies a fairly unique space within the realm of AI, in the sense that the layman is probably not familiar with what you do. You generally think of a foundational model like ChatGPT or Claude, or you'll have your point solutions, like AI for sales, but Labelbox is neither of those things. So, for the non-technical or semi-technical person, could you explain what Labelbox is and how it's used to train models?
Speaker 1:Yeah, so to build these amazing models that we all are using every day, they fundamentally need two things. One is compute, so they need lots of GPUs to train these models on. The second big ingredient that goes into training these models is data, and there are generally two kinds of processes for training these models. One is pre-training, where you may have heard that these models are trained on internet data. That's true, and a lot of these companies download all of the data, they crawl all the websites, all the sources. In fact, they will even buy data from private places like libraries and private websites like Reddit and Stack Overflow and so forth. So that initial phase is to essentially train a base model, to build a basic intelligence from all of the trillions of words that are on the internet. However, that base model is not as useful in everyday life. It has base intelligence, but it cannot interact with humans. And so there is another process called post-training, which is an umbrella term for a variety of methods, tools and techniques to align that model, or tune that model, to work with humans, and this also requires a ton of data, but in a very specialized way. And so what Labelbox does is produce data with humans, with expert humans, that essentially drives the improvement of base models into the more aligned models that we all use every day.
Speaker 1:Right now I find it very hard, when I'm using any of these foundation models or chat assistants, because the assistants know so much about everything. I feel like I have very little knowledge to add to these models.
Speaker 1:But it turns out that the creators of these models are able to figure out and see that the model needs to be improved in mathematics or physics or, you know, maybe aircraft design.
Speaker 1:And the only way to actually improve the performance of these models is by finding experts who are great mathematicians, great physicists or great aircraft designers, and producing the data in a manner that will teach the model the new knowledge, or the new way to understand and reason about these things. And so Labelbox essentially does that. We have a very big network where anyone can sign up, join and go through the exams, and if they qualify, if they pass with good scores, they will be matched to a job that is about labeling data and getting paid, and the data is used for these foundation models. So we have this network called Alignerr, with a double R, and then we also have our software product and platform called Labelbox, and in conjunction we operate this as a data factory. Ultimately, the data produced by our factory is used to improve these state-of-the-art models every day.
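The qualify-then-match flow Manu describes, where experts sign up, take exams and, if they pass with good scores, get matched to paid labeling jobs, could be sketched roughly like this. All names and the passing threshold here are hypothetical illustrations, not Labelbox's actual system:

```python
from dataclasses import dataclass

PASSING_SCORE = 0.8  # assumed qualification threshold, purely illustrative

@dataclass
class Expert:
    name: str
    domain: str
    exam_score: float  # normalized exam result, 0.0 to 1.0

@dataclass
class LabelingJob:
    title: str
    domain: str

def match_experts(experts, jobs):
    """Match each job to experts in the same domain who passed the exam."""
    matches = []
    for job in jobs:
        for expert in experts:
            if expert.domain == job.domain and expert.exam_score >= PASSING_SCORE:
                matches.append((expert.name, job.title))
    return matches

experts = [
    Expert("Asha", "physics", 0.92),
    Expert("Ben", "physics", 0.55),   # fails the exam, so never matched
    Expert("Chloe", "math", 0.88),
]
jobs = [
    LabelingJob("Grade physics reasoning traces", "physics"),
    LabelingJob("Write math proofs step by step", "math"),
]
print(match_experts(experts, jobs))
# [('Asha', 'Grade physics reasoning traces'), ('Chloe', 'Write math proofs step by step')]
```

The point of the sketch is simply that qualification (the exam score) and routing (the domain match) are separate gates, which is how a network of part-time experts can be safely pointed at specialist tasks.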
Speaker 2:That's super interesting. So let's say I'm an underpaid physics professor who wants something of a side hustle. I can then go look at the data that you're bringing in and annotate it, using my PhD-level experience, in order to improve the data that's going to, call it, OpenAI or Anthropic?
Speaker 1:That's right, and in fact we have a ton of professors and postdoc students in America, and in many other places like Europe and India, who are part of our network, and they're getting paid very attractive amounts of money to do this job.
Speaker 2:So you mentioned earlier, on the pre-training side, how these companies are ingesting huge amounts of data, practically all of the data in the world, or as much as they can get their hands on, and I know there have been concerns that eventually we're going to, quote unquote, run out of data, and that the answer to that is synthetic data. So I'm curious, what is your take on how to solve that problem? And then, how does the data annotation that your company does work with synthetic data?
Speaker 1:Yeah, I think in many ways we have reached those limits of the data when it comes to pre-training, because everyone has access to these web datasets and everyone has basically done all these partnerships to get private data from different companies, different places or different networks. And arguably, in the next phase of technology development, with multimodal data like videos and images and audio, maybe we'll tap into those sources more effectively very soon. But generally speaking, yes, there is a new kind of data required, and I think there are really promising approaches to produce that data. Some of these techniques involve synthetic generation.
Speaker 1:A very simple intuition about synthetic generation would be: let's say we found a wide range of books that were never published on the internet.
Speaker 1:We can use an LLM to go through those books and that knowledge and come up with interesting derivative datasets, like questions and answers and so forth, and that's a great way to amplify the data by using an LLM.
Speaker 1:That's a very simple example, but there are many more varieties of these ways of producing synthetic data. And then, on top of that, I think there is going to be a need for data produced by humans and AI in conjunction. Think about today: it's almost becoming unimaginable to write software without AI assistance. If you're refactoring a very large code base as part of a task to teach models how to refactor code, the humans, our aligners, are more likely to be successful by using copilot assistance to navigate the code base, re-architect and refactor. That's a very simple, canonical example of this hybrid approach, where humans use AI to produce new forms of data that ultimately helps improve these models. I think we're going to see both of these techniques be very effective. It's already happening, and I think it will drive or fuel the next phase of AI development.
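The books example Manu gives, using an LLM to derive question-and-answer pairs from unpublished text, can be sketched as a tiny pipeline. Here `generate_qa` is only a stand-in stub for an LLM call (a real pipeline would prompt an actual model, and would generate many varied pairs per passage):

```python
# Toy sketch of synthetic data generation: amplify raw source text into
# derivative question/answer records. `generate_qa` is a hypothetical stub,
# not any real model API.

def generate_qa(passage):
    """Stand-in for an LLM call that derives Q&A pairs from a passage."""
    return [{"question": "Summarize the key claim of this passage.",
             "answer": passage}]

def synthesize_dataset(passages):
    """Run every source passage through the generator and pool the records."""
    dataset = []
    for passage in passages:
        dataset.extend(generate_qa(passage))
    return dataset

book_passages = [
    "Lift is generated by pressure differences over a wing.",
    "Stall occurs past the critical angle of attack.",
]
dataset = synthesize_dataset(book_passages)
print(len(dataset))  # 2
```

The structure mirrors the intuition in the conversation: the source knowledge is fixed, but the derivative records (questions, answers, reformulations) can multiply it into a much larger training set.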
Speaker 2:And so then, how do you see that interplay between human labeling and AI-assisted labeling evolving over the next few years? To go back to the example of the physics professor: do you anticipate that a physics professor will then be using an AI to help with the human labeling, or is it something that should just be kept completely separate?
Speaker 1:I think it's going to be a very nice confluence, a kind of synergistic relationship, in producing new kinds of data.
Speaker 1:Not all kinds of data may benefit from such an approach, but my guess is that a vast amount of data will be produced with AI and humans working together.
Speaker 1:AI is much more like an ether, a technology that can be infused into every little aspect of what you do every day.
Speaker 1:And so it provides acceleration in little things, like, hey, I want to search for new knowledge because I want to understand what this new concept is before I write my response to the question. That is an act that can be further accelerated by AI, so you can have a much more purpose-built search and knowledge system that gives you that knowledge. And perhaps the AI can help you figure things out: hey, in the response you have provided, or the task you have done, these are some errors the AI has already spotted that need to be fixed. That's an example, I think, of where AI and humans are going to work together to achieve that superhuman capability. So AI is a tool, and the tool can be used in a manner that produces very big leverage, and I think we are already seeing that in the early phases of data production today.
Speaker 2:And it's likely going to become even more pervasive very soon. How do you handle data privacy and security concerns, particularly for industries that have very sensitive data? For example, Labelbox sells into healthcare, where of course you have HIPAA and all the other data privacy concerns. How do you handle that from an annotation or data perspective?
Speaker 1:Yeah, so as a company, we have been a very privacy-first organization from day one. Not only do we have all the basic practices of HIPAA compliance, SOC 2 compliance, GDPR and things like that, which enable us to operate our business in those markets, but on top of that we take a lot of steps, such as anonymizing customer data when it goes to the humans for labeling. Every aligner, every human AI tutor that we have, has a very strong confidentiality agreement with us. They are subject to the same kind of scrutiny we apply around security and privacy compliance as our full-time employees. So we are able to guarantee our customers very wide, comprehensive coverage of security, compliance and privacy practices, including with the humans working part-time on labeling the data.
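As a minimal illustration of the anonymization step Manu mentions, here is a sketch that redacts a few obvious identifiers with regular expressions before a record could be shown to a labeler. This is only a toy: real HIPAA de-identification covers a much longer list of identifier types and is not a three-regex job.

```python
import re

# Illustrative-only redaction patterns; a production system would use a
# vetted de-identification pipeline, not this sketch.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def anonymize(text: str) -> str:
    """Replace each matched identifier with a bracketed placeholder tag."""
    for tag, pattern in PATTERNS.items():
        text = pattern.sub(f"[{tag}]", text)
    return text

record = "Patient john@example.com, SSN 123-45-6789, phone 555-867-5309."
print(anonymize(record))
# Patient [EMAIL], SSN [SSN], phone [PHONE].
```

The design point is the direction of the data flow: redaction happens before the record ever reaches the human annotator, so confidentiality does not depend on each labeler's handling of raw identifiers.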
Speaker 2:Well, we've talked a lot about Labelbox the product, but I want to talk a little more about Labelbox the company. Just to go back to the beginning, I was listening to a podcast that you did a few years ago, and you said you reached out to Brian, your co-founder, and you agreed that the next phase of your careers was that you were going to build a company together. So where did that come from, and what gave you the inspiration to start Labelbox?
Speaker 1:So I think, for both of us, we've had many shared experiences since college, and one of the most essential aspects of our shared experience was coming up with ideas and finding opportunities in the world where we could contribute directly, in the form of producing objects or solutions, and then pursuing that. And I think, particularly in America, it's just so incredible that there are opportunities like that. You can have an idea, work really hard on it, and all these environmental factors are conducive for you to just go out there and do it. So since college we were into this. However, we did not know a lot about building companies, far from venture-backed companies and so forth, and that was the backdrop. We built small businesses and we pursued interesting ideas in renewable energy. We even had a space company where we built hardware that went to the International Space Station. We were doing this while I was in college at Stanford and Brian was at Boeing, and so we were always pursuing these kinds of things.
Speaker 1:However, in 2017, 2018, AI was taking off, particularly deep learning and computer vision, and I was at a company called Planet Labs, and we saw firsthand that to develop these AI systems you need human supervision. And we had a very simple insight: for a very long time, potentially forever, humans are always going to want AI to be more aligned to themselves. In other words, whatever level of AI you build, whether it's for your company or whether it's an AGI, the creators of those AI systems want to make sure that the AI is behaving the way they want it to. And if that is the case, how are these companies and teams going to ensure AI is aligned? The answer actually comes down to data, because that is the primary form of communicating with an AI. And so we saw an opportunity to build products and services at that intersection of human and AI. We made a bet that human supervision is going to be required, albeit that it will change dramatically as technology gets better and better, which it already has.
Speaker 1:Back in the day, in 2018, data labeling, or annotation, was very meticulous. You would have to teach the AI basically every little detail: hey, this is what a car looks like. Now the labeling happens at a much higher level of sophistication. A professor of, let's say, mathematics is teaching the AI how to reason about a problem at a very high level, in natural language. So just in the span of six years, the interface and the way these models are taught have become more like the way we would teach students and teach each other, through communication and preferences. Sometimes preferences are really the way to make a judgment of quality, because we are sometimes not able to describe why such a thing is so great, but we are able to say that it is certainly better than everything else we've seen before. And so that is how it's being done today, and I think it's probably going to take even simpler forms in the future.
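The preference judgment Manu describes, "this one is certainly better than that one", is commonly captured as a chosen/rejected pair in preference-based post-training. A hypothetical record shape might look like the following; the field names are assumptions for illustration, not any real schema:

```python
# Sketch of a pairwise preference label: the rater does not score responses
# in isolation, they only say which of two responses is better.

def preference_record(prompt, response_a, response_b, choice):
    """Build a chosen/rejected record from a rater's pairwise judgment."""
    assert choice in ("a", "b")
    return {
        "prompt": prompt,
        "chosen": response_a if choice == "a" else response_b,
        "rejected": response_b if choice == "a" else response_a,
    }

rec = preference_record(
    "Explain why the sky is blue.",
    "Rayleigh scattering disperses shorter wavelengths more strongly...",
    "Because it reflects the ocean.",
    choice="a",
)
print(sorted(rec))  # ['chosen', 'prompt', 'rejected']
```

This matches the point in the conversation: a rater who cannot articulate why an answer is good can still supply a reliable relative judgment, and that relative signal is what the model is trained on.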
Speaker 2:How'd you get your first customers?
Speaker 1:We launched our product on Reddit. We had hacked together a version of Labelbox on nights and weekends, and when we launched on Reddit it struck a chord with the community of computer vision researchers, and they started signing up.
Speaker 1:They started asking us, hey, this is kind of cool, what do you think about these extra two or three features? And every day we would wake up, see more signups, and ship more features based on what we had learned the day before. That's how the snowball effect started. In a matter of two or three months, we were charging for the software and our tools. And once we started charging, over the course of the next few months we started adding zeros to our price. We started selling it at $10, $100, $1,000. And when we saw that price elasticity, like, okay, we can keep charging far more money, that's when we realized this was actually becoming very valuable, and perhaps we should really just go at it full time.
Speaker 2:Super interesting. So when you say you launched on Reddit and you were having people sign up, were you trying to sell to people on Reddit, or were you trying to get the annotators that could then come in and add their knowledge, or both?
Speaker 1:So, in the early days, our product was basically a software tool that allowed teams to label data by themselves. A simple example would be: let's say you are a healthcare company, you have access to five or ten radiologists, and you want to label that data. At the time, all of the tools were desktop tools, and we were probably the first tool that was cloud-based, where you simply loaded up the data and could collaboratively label it among five people anywhere in the world. Again, this was in a backdrop where everything was desktop, and you can imagine how well you can coordinate five different people in different places on a desktop tool to label all kinds of data.
Speaker 1:You would have to divide all the data across five people and all that stuff. We just handled all of that, and so that was really our core first product, and we were going out to Reddit and all these different places online to find and appeal to the researchers and engineers who were building these AI systems. That's how we began. Now, I should also say that before building this product, we spent many months researching the space. We talked to a number of people in Silicon Valley who were building AI companies, and we honed in on: okay, this is actually a problem, companies need much better ways to label data and manage the process. And so that led to conviction, which led to building a prototype, which led to launching it online. And that's how we started.
Speaker 2:So when did you switch from that B2C product, selling to the researchers, to more of a B2B product, and how did you make that transition?
Speaker 1:We were B2B from day one. Some of these researchers were certainly hobbyists, but many of them were part of the companies they were working in. A lot of this AI development could really only happen in companies that had a lot of data. You could only build, let's say, a vision system for vehicles if you had cars on the road, and you could only build medical imaging AI if you had medical data. These researchers were employed by those companies, and that's how we had the insight that we really needed to talk to the researchers and engineers who had the problem. More likely than not, they were building their own tools themselves, perhaps using open source tools, or they may have tried some legacy providers, and that was not an ideal experience for them.
Speaker 2:Got it, okay. And then, on the hiring side, how did you hire your first 10 to 15 employees, and how did you convince people to make that early jump?
Speaker 1:Well, I think the first 10 people, most of them, came from our prior networks. We knew these people from our prior jobs.
Speaker 1:We essentially went into the mode of hiring our first few employees and colleagues by going through the people we had admired the most in our previous jobs. Not everybody, of course, wanted to jump into a very wild ride with a super early-stage startup, but many people did. So Brian, myself, Dan, we basically went back through our respective companies and asked, who are the great people we worked with? That helped, and it also runs out very quickly; we were early in our careers, we only knew so many people. A lot of the hiring after that, after a handful of people, was really about spreading the name and going around our communities in San Francisco to find people who were interested in a very early-stage startup experience, and of course then convincing them to join.
Speaker 2:So that was quite a few years ago, and Labelbox has obviously grown tremendously since then. Talk me through what the last few years have been like and how you've gotten the company to where it is today.
Speaker 1:Yeah, I mean, the last few years have been extremely dynamic, to say the least. Nobody expected the breakthroughs that we are now using every day. I think no one really expected how amazing ChatGPT and the transformer would be. We had some early insight, sort of a vibe coming in, like, okay, people are pretty excited about transformers, this and that, but it wasn't really until ChatGPT launched that the interface and everything became, aha, okay, this is how it's going to be.
Speaker 1:And when you're building a company, one of the most important aspects of scaling the organization is that you're operating in a market that is stable. When markets are stable, you can run a team in a much more conventional way, all the ways you learn from other entrepreneurs about how to scale, solve problems and grow the team. But our market is very dynamic. It changed so fast, so quickly, and because of that we had to shift towards a much more early-stage startup mode, with less management, fewer layers of hierarchy, and information and insights from the market disseminated to everyone much more quickly. That's what we did, and it has helped us understand the market and build new products and services better and faster than we otherwise would have.
Speaker 1:And so in the in the last couple of years, in the last couple of years, Labelbox has evolved from providing software tools and software platform to a data factory, and what I mean by data factory is we are a company that provides all of the tools, the system, the infrastructure for a company to operate a data factory themselves to produce this labeled data for their AI models.
Speaker 1:But we also operate the data factory ourselves and we essentially sell the data, provide the data to the customers who want a much more kind of fully managed experience, and we do that together in a same platform. And so this is really really remarkable. It's enabling AI teams to produce higher quality data faster. They're able to have this iterative feedback, so an AI lab that is really trying to innovate on physics or mathematic problems are really able to talk to these human experts in those fields very rapidly and figure out how best to produce data in, whatever their domains are, and so that's the kind of the evolution of Labelbox, I would say, in the last two years, and we have one of the fastest growing network right now. So you know, if any of the listeners here are interested in a side hustle and think they can contribute in whichever domains that they are a part of, please check out alignercom. And, yeah, awesome.
Speaker 2:So I was going to ask you about that. How transformative was that ChatGPT moment for Labelbox when it was released? Did you have a holy shit, this is going to change everything for us moment, or did it kick in a few days later? What was that like?
Speaker 1:It was slower, maybe over the course of months, maybe six months or so, and it was more in the form of how businesses were reacting. We knew that it was something really interesting and potentially very profound. However, we didn't know how businesses would react to it, how AI teams, all our customers in different industries, would react to it. And in many ways, a lot of the interest from most of the enterprises just went into generative AI, because it was sort of instant ROI for them to do things that they otherwise couldn't have done before. So that was a learning experience for us over a period of time, and that also meant their priorities were shifting. They were re-architecting their teams and their focus.
Speaker 1:Some companies were no longer going to build AI systems anymore because they're going to just rent AGI or like the AI from foundation model companies, and some companies had much more clarity where, like, okay, this is their moment and they're going to use these foundation models as a new backbone and then build their custom AI on top using that architecture, and so that thing played out over the course of months and quarters, and that helped us understand, like, okay, there are going to be some customers in the enterprise segment that are going to be building AI systems because they have the data, they have the talent to go develop those things. And then there are going to be a new class of customers, their frontier AI labs or foundation model developers, so generative AI startups that are actually going to need a completely new way of data that has not been produced before, and so we need to also serve them. And so that was all the learnings of, I would say, 2023.
Speaker 2:And where is Labelbox at now, just so people don't have to Google it, from a funding and a headcount perspective?
Speaker 1:from a funding and a headcount perspective, yeah, we are over 150 people company primarily based in San Francisco, but we have few satellite offices around the country and in New York City and Poland, and we are hiring in many, many various roles, and our company is venture-backed. So we've raised about $190 million from a variety of investors like SoftBank, a6&z, kleiner, perkins and Google, and, yeah, so we've intentionally kept our company very focused towards this product and the things that we're building. And one of the learnings of this generative AI also is that you don't need to have as many people as before generative AI. A lot of our teams are more productive, actually by being smaller, but by leveraging a lot of the AI tools and and so that's been kind of a revelation for us to have a huge unlock for us to stay nimble, move fast, have fun and also, you know, have that sort of vibe of like smaller teams and so forth.
Speaker 2:So, in terms of the guiding principles that have gotten you to where you're at today, staying nimble, staying lean, what else have been those guiding principles, either from a product or a people perspective?
Speaker 1:and always kind of figuring out where the puck is going and having the company and the teams aligned towards, always be pointed towards where the things are going. Irrespective of all the past decisions and places we may have been, it's always about the future and I think that's been sort of like a guiding principle. Places we may have been, it's always about the future and I think that's been sort of like a guiding principle in all forms of, in all facets of company building. It comes to people, the teammates, it comes to strategy, it comes to what we're going to build. You know how are we going to communicate about ourselves? So it's like, you know, it's always like where the things are going, where the future is, and I think I think that's probably my most profound guiding principle at the moment that I can think of. But yeah, that's like it's so, it's so true, you know.
Speaker 2:Love that Always be learning. So when you, when you think about the future, love that Always be learning. So when you think about the future, what do you see, both in the field of general AI?
Speaker 1:I think it's been conveyed by so many people like this looks like the most profound one, because it's perhaps because there's no ceiling to it, like AI and AGI, there's no end to it, like it could just keep getting smarter and smarter, and like, what does that mean? And all that, and so I think, um, so, with that as a backdrop, um, I think we are. We are seeing, um, these amazing, amazing capabilities of ai systems and how it's changing everyday job, right, so it for our teams. It's unimaginable to write software without co-pilot AI systems. It is unimaginable for us to write documents or any writing that we do, it's unimaginable to do without AI today, right, and so there's so much of these things that are already intrinsically changing and, by the way, six months from now, nine months from now, I think we might see similar significant shifts that we saw in the last year. So this is really an exponential trend. It's it's very hard to fathom the progress. You know, I speak to my phone all the time now, I never used to do that, uh, even a year ago, and like these are the things that are changing basically, like these subtle things that are changing about, you know, everyday life. Uh, I think is the most exciting part of it. Honestly, um, I still get mesmerized like how can an AI just like produce? Sounds like that are uncanny at this point, you know, and it's pretty remarkable. I'm particularly excited about having a personal teacher. You know, for me, I'm learning, I love asking questions, I love like, just like going deep into a topic and so forth. So I use tools like Perplexity, claude and Gemini and all these AI tools, and it's really cool for me to just always discover new knowledge. I'm very excited about my kids doing the same way. Discovering new knowledge like that. I think it's going to be very profound. So those are the things that I think are particularly exciting.
Speaker 1:When it comes to Labelbox, we are always asking a question what is the next AI capability in terms of reasoning, in terms of the things that it cannot do yet, and how will we help the world by producing the right kind of data that will enable them to achieve that milestone, achieve that breakthrough, right? So there are a few examples of it today. Ai is really really good in English language, but if it comes to like different languages around the world, it's not so engaging yet. Around the world, it's not so engaging yet, and so a lot of the work that is going right now is to make these AI models have stronger performance in other languages like Spanish, portuguese, hindi, basically like following the world's population.
Speaker 1:Then a big focus for everyone is around agents. Like, okay, how are we gonna make um ai go beyond the assistant? Um, assistants are cool, but in most businesses, um, you need something more, like like replacing a, like a full function, like you know, whatever the job that is like, how do we augment that entirely? That means that the ai needs to be able to interact with different pieces of software, understand how to operate those software stack and so forth, and that needs to be taught to the AI systems. It's not, you know, yet as robust as people would like. That means they need new data, and so we are focused on that.
Speaker 1:And then the world is I, we're going to be very much multimodal, and so what? One of the great things I like about google gemini is that you could like upload a video and audio and documents and ask a comprehensive question that does analysis across all those modalities. And you know that is a a pretty convincing um experience to say like the world is, the AI systems are going to be inherently multimodal. They're going to be able to sense audio video. You know documents like all text and process it all together.
Speaker 1:And so what does it mean for the rest of the companies that are trying to build multimodal models, and how do we go produce that data at scale? And so, you know, reasoning is another one Turns out. So, like you know, one of the ways we've the recent OpenAI models that have been released, open models that exhibit really remarkable reasoning skills. A lot of the data that was produced were with, again, people with phd backgrounds, the domains and basically like taking a very abstract problem and decomposing that into steps of like, hey, how would a person make it into smaller problems and the questions one would ask in a more logical, rational way, and that that reasoning trace was then used to improve that system as an example. And so those are the things we think about in Labelbox in that context of this exponential trend and new capabilities and all that.
Speaker 2:Super exciting. On a slightly more personal note, I know you mentioned your kids. I have a two-year-old myself and you know it's funny because when I think about technology, developments in technology, I often think in the window of the next two to three years. But when I think about you know, in the context of my son, I'm thinking in the next 20 and what the world is going to look like when he's a grown-up. And frankly, I have no idea. I don't think anybody does. But you know for yourself, you know your series D, the. The average startup life cycle to exit is, you know, seven to 10 years. You're touching on seven. Obviously, I have no idea what's going on. You know in the internal workings of the company, but we're probably at a point where in the, in the not so distant future, you know the time may or may not come for an exit. So for yourself, in terms of you know life, life potentially post label box, and in terms of building a legacy, what do you want that to be?
Speaker 1:I don't have, I haven't really thought about anything like it, but I think one of the things most likely going to be true is that I will be, you know, I'll be building, I will be, um, um, you know I, um, I'll be building, I'll be building products and uh, solutions, and um, that's just very intrinsic part of who am I and who I am, and so, um, uh, I think, um, you know, um, it's that's the core essence of it, like it's so much fun and, uh, it's just my way of expressing in the world where, like, I love building things, um, in all different facets, um, whether it's products in softer land, whether it's like objects and furniture at home and so forth, and so, um, I think that's going to be likely true.
Speaker 1:Um, you know, um, we have, we have long ways to uh go with even realizing the many aspects of our vision at Labelbox. You know this AI alignment is becoming more and more important now than it has ever been, and we play a very important role in helping companies align their models, and so I think you know, so that's like that's kind of what I think about it we play a very important role in helping companies align their models, and so I think you know. So that's like that's kind of what I think about it.
Speaker 2:Cool, I love that. Well, manu, it's been such a pleasure chatting with you. Thank you so much for coming on the show. I really appreciate it.
Speaker 3:Thank you for having me, thanks for listening to See to Exit. If you enjoyed the episode, don't forget to subscribe and we'll see you next time.