Intro:
[music plays]
Niki: I’m Niki Christoff and welcome to Tech’ed Up.
On today’s episode, Jessica Powell, CEO and co-founder of AudioShake, joins me from the Bay Area. She’s my old boss and she’s a serious music fan. AudioShake has used AI for several years to pull stems from songs. (Don’t worry; this episode explains how that works and what it means.)
With AI dominating tech and all of our headlines right now, we know that the future of music will be impacted, and Jess gives us her predictions.
Niki: Today on the podcast, I am thrilled to welcome Jessica Powell, who's calling in from the Bay Area.
Jess, welcome.
Jessica: Thanks for having me.
Niki: So the reason I'm excited about this episode is we do a lot of policy. We do a lot of crypto. And today, we're talking about something that's sexy and techie, which is music and AI. And I am grateful for you taking the time to chat about what you're doing.
Jessica: Thanks for having me.
Niki: So, just this quick background, Jess and I go back to the, like, [chuckling] Pleese- Pleistocene era? How do we say it?
[crosstalk]
[both laugh]
Niki: the Jurassic- dinosaur era at Google. We were there for a long time before even smartphones and for a period of time, you were my boss, for which, I give you my condolences. [both chuckle]
Thank you for managing me.
And we'll talk a little bit about your international time because it, it leads into what you're doing now for a living, which is you are the co-founder and CEO of AudioShake.
So, let's talk about what you're doing. What's AudioShake?
Jessica: Sure. So we are an AI technology that splits audio into its different parts, which when you hear that, you're like, [chuckling] “What? I don't understand.”
So, if you think of a piece of music, you could split that piece of music into the vocals, the drums, the bass, and others, for example. Think of a conversation in a room. Maybe we want to separate the crowd from the person speaking. Maybe you're editing a podcast and you want to separate you and me talking from any background noise that might appear.
So basically, if you can split up audio, you can all of a sudden do all kinds of really cool, interesting things with it that range from basic editing to actually powering, like, experiences at scale that otherwise couldn't exist today, where you are, for example, making it possible for someone in the U.S. to create their video in 40 different languages and ship it around the world in a matter of seconds. Or making audio interactive, where I move my hand while I'm playing with an app, a VR/AR app, and the music responds to what I'm doing.
There are a ton of different use cases, but essentially what we do is use AI to make audio customizable, interactive, and editable.
Niki: So, I want to dig in a little more to some of the specific examples that you talked about, but as I was preparing for this episode, I heard you talking about stems. And so, even though we don't have a music audience, I think it's kind of nice for people to learn [checking] a piece of vocabulary about tech. So, each of those pieces, like the vocals or the bass, is called a stem, right? And you're basically separating out those stems. Is that correct?
Jessica: Right. Exactly! If someone wanted to separate the dialogue from the music in the back, right? Those would be the dialogue stem or the music stem. So, just think of them as the parts.
Like, if you were taking a hammer to a song, if you could do that, and splitting it apart, you would separate it into its different instruments, for example, and those would be the instrument stems.
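[Editor's note: AudioShake's own models are proprietary, but for anyone curious what stem separation looks like in practice, here is a minimal sketch using Deezer's open-source Spleeter library, which performs the same general kind of AI source separation; the file names are placeholders.]

```python
# pip install spleeter
from spleeter.separator import Separator

# Load Spleeter's pretrained 4-stem model: vocals, drums, bass, and "other".
separator = Separator('spleeter:4stems')

# Split a song into its stems; this writes vocals.wav, drums.wav, bass.wav,
# and other.wav into a folder under output/ named after the input file.
separator.separate_to_file('song.mp3', 'output/')
```

[The karaoke idea Jessica describes is then just a matter of playing back every stem except vocals.wav.]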
Niki: So, okay. Which leads to how you started thinking about this. Like I mentioned, we were both at Google in these sort of big corporate communications jobs. Now, [chuckling] both of us are doing things that are really different, which I, which I love for us, but some of your time overseas, in Japan, influenced this concept, right?
How did you end up starting a startup looking at music tech?
Jessica: Yeah. So, I lived in Japan along with my co-founder, and we both did a ton of karaoke. And [chuckling] karaoke is amazing, particularly in Japan and in countries where they love karaoke, but it's still terrible, right? Because you are largely doing re-records, so you're not singing along to the original songs, and on top of that, the catalogs are very limited.
So, I was really into old punk and old hip hop, and I couldn't find those. I could sing Katy Perry, or I could sing Oasis, but I couldn't get these songs from, like, the '80s that I really wanted to sing. And so, Luke and I kind of had this moment where, like, y'know, what would be really cool is if you could strip all the world's music, like, the original songs, of their vocals and turn any track into a karaoke track.
And that was the seed of an idea. We started; first, it was very much just a hobby and like a fun cocktail party trick, and then it turned into something real.
Niki: So the idea is that you could strip out the vocals and just have the instrumental. With a lot of older music, they may not have been able to pull out the vocal. So you can, after the fact, using AI, do that with any song, in theory, and make an instrumental version.
Jessica: Right. Well, so it's interesting. With karaoke, as I understand it, the reason that they didn't use the originals is that, in the same way that, when you and I were growing up, you didn't really see Hollywood stars on TV shows, right? Like, there was a difference between the Hollywood star and the TV star, whereas you see a lot more back and forth between those two now. Right? And in the same sense, I think when karaoke first appeared, it was seen as a lesser art, and artists did not want to be associated with karaoke.
And so, an entire industry popped up, which was the rerecord business, where companies would pay studio musicians to recreate these songs. They would still pay for what's called the publishing, the songwriting, for those songs, but they would not pay for what's called the master recording. So they would not pay for, say, the original Katy Perry track; they would pay for the songwriting of the Katy Perry track and record their own version of that. And then that industry has just persisted.
But I think it started with YouTube, right, which sort of normalized mass content consumption and creation; that then turned into TikTok, which normalized participation and throwing yourself into music.
Today's artists have really grown up with a whole new experience of how their fans experience their music and how they communicate with their fans. I think the attitude towards something like karaoke and a whole bunch of other uses that we'll probably get into has changed quite a bit. And so, now there is quite a lot of interest in doing karaoke to the original tracks, which makes intuitive sense, right?
Fans want to sing along to the songs that they, that they love, not another person's version of them.
Niki: Yeah, exactly. And not the limited catalog, which [Jessica: yeah] I, like, I'm the worst karaoke invite because I'm, I'm the American Pie Girl, which is just terrible. [Jessica: laughs] And I apologize to anyone I've ever done karaoke with!
Jessica: I'm Patsy Cline. I do Patsy Cline. [Niki: Oh yeah, that's-] Only 'cause, as long as you stay in her low register, like, you can, you can pull it off.
Niki: You just mentioned TikTok and the idea that artists are thinking differently about their music and how their listeners consume it and their fans consume it.
The last concert I actually went to was a Green Day concert. I'm, I'm from a micro-generation part of Gen X, so I thought I was at a Weezer concert, but it was actually, [chuckling] they were actually just the opener. It was actually Green Day. And you have a really good example of how Green Day used AudioShake.
Jessica: Oh, yeah! So what they did was, in their case, this actually falls into kind of the older music category.
They have an album from 1991 that has a song on it, "2000 Light Years Away," which is very well known. They lost all the masters, so they lost the recordings for that album, which is kind of crazy. [Niki: It's unbelievable] It's very sad and not uncommon, right?
Let me come back to Green Day, ‘cause you said something earlier that I, that might be useful for people to think about.
The way music used to be recorded, if we go back to the 1930s, that's essentially, like, a live recording. Everyone's in the room. They're playing. There's no way to split that apart. Right? So, for example, we have separated, y'know, "When You Wish Upon a Star"; we've separated Nina Simone. These are recordings that you'd have no other way to split apart. They were never even necessarily multitracked, or those multitracks, those tapes, might've been lost.
Then you get to the Beatles, and they start multitracking. That means you've got John and Paul and Ringo all recording their parts, and you could then conceivably later create stems from that. So, instead of having, say, 40 tracks, y'know, all the different guitars and so forth, you extract one guitar stem.
Anyway, the difference between tracks and stems really does not matter that much except to audiophiles, but I'm just saying it 'cause someone on this podcast will be like, "She's confusing the two terms!"
I am not, good sir, I promise. [Niki: laughs]
Moving on, we get to the transition to the digital era. And no one was really holding on to stems, necessarily, because they didn't know that there was any value. It really wasn't until about 2015 to 2017 that contracts became much more buttoned up, requiring stems to be passed on.
So the vast majority of the world's recordings don't have stems. And the recordings that do have stems today, which is most contemporary content, most contemporary recordings, those stems are wildly different and variable. So, they're fine if you want to make a remix and you're working in what's called a digital audio workstation, a DAW, but they're not fine if you want to do anything algorithmic with them, because they are a mess, a jumble. They weren't made to be worked with at scale.
AudioShake basically addresses both of those issues, because we can create stems and do these separations for any kind of content, any kind of music or dialogue. We can both open up older music to revenue streams that exist today for artists and make current content usable in coding environments, which it otherwise wouldn't be.
So in Green Day's case, they're in the first bucket, which is this older content that didn't have its stems. And they used AudioShake to split their track apart into the vocals, the drums, the bass. They uploaded that to TikTok, and TikTok has a cool feature called Duet that lets users basically download the audio and play around with it.
This Duet audio basically made it possible for all of the Green Day fans to play guitar along with their favorite band.
So any of us that ever learned to play an instrument when we were younger, like, sitting on our bed, trying to, in my case, pick out basslines, trying to imagine ourselves in the band, like, this is what they basically made possible.
So it's a very cool use.
Niki: [interrupting excitedly] So, you just said, "learning to play the bassline." So, I'm not going to put too fine a point on it, but you were giving an example of learning to play the bass to Fugazi and having to, like, go back on your CD to try to play along. And what you're saying is you can pull that out completely so you can hear it on Duets, and you can hear yourself playing with the actual music.
Jessica: Yeah, exactly. I don't know how Ian MacKaye would feel about AI intersecting with his music, but at least we've got- [interrupts herself excitedly] like, he's from a local DC band!
Niki: Yeah! I know! [chuckling] I love Fugazi. I was, meanwhile, like, learning to play the oboe to the Jurassic Park theme song. [Jessica: laughs] Very, very hip. [Jessica: Very cool!] Super, super cool. [chuckling] Just take off my headgear and play my oboe. I wish that wasn't a real story. It is!
So that is an example of where the masses can use it, but you're mostly focused on artists. You're not making a consumer product. Digital contracts might include, like, oh, "You own these different pieces going forward." But if you're, like, an indie artist, you may not have thought about that ahead of time or have it available.
So, you guys have, like, a platform where, if you're an indie artist, you can just go on and kind of do this yourself to get the pieces of your own music. But you're focused primarily not on, like, a consumer product, although you could be.
And I'm curious about your thoughts because we're in an AI moment, and things are being open to consumers, but you're really focused first and foremost on the musicians.
Jessica: Yeah. So, we are a B2B technology. We essentially provide audio infrastructure, though it can also go directly to musicians and artists. I imagine you're talking about, well, certainly all the debate around generative AI and the arts.
Niki: Exactly!
Jessica: And training data. And then, of course, like, the “Fake Drake” thing, as well.
Niki: The “Fake Drake”, right! So that's what I want to talk about. You have this Medium piece, which we're going to link to, about “Fake Drake” being a deep fake, which I thought was a really interesting analysis, so.
Jessica: Yeah. I don't know if people agree with me on it.
[crosstalk]
Jessica: I have no idea. Well, the thing that's interesting to me, so, just, I guess for your listeners: a song popped up, posted by someone under, I think it was, the name Ghostwriter, who said that the song was a hundred percent AI, that it had been created entirely with AI.
And the song was Drake and The Weeknd singing, singing about Toronto, singing about Selena Gomez, but of course, they did not sing any of these things. It was generated by AI. And the song went viral. And I should also just say, as an aside, that it's highly unlikely the song was actually 100% created by AI.
That point might be academic; the research community cares a lot about it. Eventually, you will be able to do this fully with AI, but today you kind of have to piece together a lot of different AI technologies and then still do some human editing, I think, to get it to sound as good as that track did. [Niki: mm-hmm]
But we'll put that aside for the moment. The song goes viral. The music industry freaks out, issues takedown requests, and then, y'know, that leads to more fake covers being created. And the thing that was interesting to me, watching the whole thing, was that it reminded me a lot of when I was a kid and then a teenager, being someone who was really passionate about music, and watching all the debates that would happen around remixing and sampling.
And when I was younger, that was sort of in the era of, like, DJ Shadow and J Dilla and those kinds of things, but, of course, those guys were coming off of the tradition that was started by, like, The Bomb Squad and Public Enemy and, y'know, all of the hip hop artists in the '80s that were doing these really amazing things with sampling, which was seen as this very subversive or renegade and illegal thing, right?
And what's interesting to me is that if you just look at remix culture, the perception of that has evolved considerably since those days. Today, it's still not authorized, but I think labels and A&R executives and artists recognize the value that remixes provide, right? Doing a calypso take or a reggaeton take or whatever it is on the original work can actually drive entirely new listeners to your music.
And what I thought was interesting, seeing the "Fake Drake" thing, is that you could actually look at a lot of this through a remix lens and think of it as, sort of, just the next iteration of that, right? Isn't it natural, at a time when we're all throwing ourselves into music, speeding up and slowing down songs on TikTok, or trying to karaoke, y'know, inserting ourselves into the music?
What more logical next iteration of fandom and fan engagement than trying to create music with your favorite artist's voice, right?
And most of these things, like remixes, are not going to go viral. They're not going to be that good, and if they do go viral, the label would have mechanisms to do takedowns or to claim them for monetization purposes. So, I can see a whole infrastructure that could easily spring up to support all that.
At the same time, one thing that's really fascinating to me: I'm generally very worried about deep fakes [Niki: I'm also very worried] on the internet, like, [Niki: Same, same!] and I think it's really, sort of, fascinating, the different ways that we treat different industries and what we will tolerate. Right?
For example, we have no problem spending I don't know how many dollars every week on our Starbucks, but, like, paying more than $9 a month to access the entire world's catalog of music is somehow offensive to us, right? And yet somehow we are all so disconcerted by political deep fakes and would be outraged to think of people making deep fakes of ourselves.
Like, think about how you would feel if someone created a deep fake of you, which wouldn't be hard to do. It still wouldn't be perfect, but, like, cast forward a couple of years: a deep fake of you doing a podcast where you're asking terribly offensive things or you're saying offensive things, right? And you don't even have to be a public figure for that, immediately, on a visceral level, to really just kind of feel like, "That, that shouldn't be allowed." So, I think it's an interesting question.
Like, why is that okay to do in certain contexts, and yet in music, somehow we think it's okay for the artist's voice to be reappropriated?
Niki: Yeah, it's true. You're absolutely right! The most disconcerting thing to me is when you have a deep fake, obviously, of a politician where they're saying something they never said, [Jessica: yeah] and that is getting to a space where it looks eerily real and people truly can't tell the difference.
I mean, I guess that's because we think the stakes are obviously so high, right? It's going to change what voters think was said. It's very hard to put the genie back in the bottle. But you're right, it's a huge thing. I would feel, and I hope no one does make a deep fake of me, or if you do, make me sound cool or, like, smarter than I am, but, like, I don't want a deep fake made of me!
I have friends who are reporters who've already had people trolling them, trying to do this. It's kludgy, but, like, it's a way of harassing them. And you're right. When we think of artists, though, these are essentially Drake and The Weeknd's voices being misappropriated in a way that is like a deep fake. That should unnerve us, but it doesn't! And maybe it goes to, I think it is a violation of their privacy or some right. I don't know. I don't know which law it is.
Jessica: Yeah, I think it's the likeness, y'know. So I think, like, on the legal side, [interrupts self humorously] let me do everything that, as comms people, we always told people not to do, which is to freelance on legal stuff! I have no doubt that, on a DC podcast, you have more than a few lawyers listening to this. [chuckling] You're a lawyer!
Niki: I'm a lawyer, but also, I haven't practiced law in 25 years.
Jessica: The right of publicity and likeness rights and everything, I think, are what ultimately give the artists, or the labels acting on behalf of the artists, the ability to request takedowns and everything.
But I think what's interesting, I mean, and I do think, y'know, obviously, context matters a lot. Things that are being created as parody, I mean, all the things that fall into fair use and so forth, like how the nature and the intent and the purpose of the appropriation are relevant. And so, I don't want to draw too much of a false equivalency between political deep fakes and artistic deep fakes.
And I think, again, the majority of these AI covers or AI fakes in music are coming from a place of appreciation, and it's fandom, and it's people, right? I'm not creating an AI cover of an artist I hate. Why would I spend the time on that? Right? [Niki: Right] It's more like, I love Beyonce and I want to hear Beyonce singing this Ed Sheeran song because I think it'd be wild to imagine her voice singing on that song.
So I think all of that matters and it's all relevant, but my point in writing that post was just that it's so easy to think of these things as just another product. And I think it's important to keep in mind that this is someone's voice, which is this incredibly intimate thing. And even though we think that we own these artists, and that they're almost like goods [chuckles], it's, like, it's them, y'know, and I think we just need to keep that in mind.
And so, I think there will be a whole infrastructure that will spring up, and there'll be a ton of artists that will eventually take advantage of this. But I think it's just very easy, particularly in tech, for us to forget about the human side of it, and it's important for us to think of it from that human angle as well. 'Cause some artists aren't going to want that, and you're not going to be able to get rid of piracy and infringement or any of those kinds of things.
Like, once the genie's out of the bottle, it's out of the bottle. But if you can create underlying infrastructure that creates more controls and, like, maybe a legitimate marketplace for the artists who do want this, I think it does a lot to chip away at the illegal or unethical activity.
Niki: Which sort of leads us back exactly to what you're doing, which is providing artists, or their managers, or producers with the ability to expand the reach of their songs in different ways, where fans can use and play with their actual vocals, and remix them or sample them, in a way that the artists, hopefully, would be compensated for. Or even if they just decided, "Well, I wouldn't want to take it down because it's expanding the use of my music."
That's one place to put the energy, rather than, "Oh, let's get the robots to spoof their voice saying something they never said." So, it creates, as you said, infrastructure for musicians to monetize what is already happening, which is that people want to be creative with the music they're hearing.
Jessica: Yeah. I'd really like to see [chuckling ruefully] artists make more money, and I'd really like to see them have more options in what happens to their music. So, my hope is that AudioShake can help expand the opportunities for that music by enabling all this interactivity, but that we can do it in a way where the music industry and the people they represent are partners, rather than just having everything imposed on them.
Like, the classic way everything works in the music industry is that a tech company comes up and they infringe on the back of the industry's content, of the artists; like, they build their user base on that content. Eventually, they get to a size where they're large enough, and then the industry sues them. That then leads to the license, which then normalizes the use.
In that lawsuit, it's very rare that those payouts actually go back to the artists, but then you have a licensed use going forward. But it's crazy, right? Like, that's not how it should work. I think, y'know, I am sympathetic to people that are building things in the music space, because if you are a small company, if you're a startup and you want to do anything with music, like, you can't, right?
You do not have the know-how or the staff to go negotiate with at least the three major labels, the 15 largest publishers, the PROs, and manage all of those conversations. Like, you just would never launch your product.
So, I also think that there's a lot that the music industry could do to make it easier for people to be compliant right from the start. Because, despite what people think about, y'know, people working in the Valley, I don't think most people wake up and think, like, they want to steal content, right? [chuckles]
It's more like they see this super exciting use that they want to build, like a YouTube or a Musical.ly, and there's no way for them to build it at the start. So yeah, I'm sympathetic to both sides, but in terms of us, we really wanted to start off, from the start, with artists.
Niki: And I actually realized I had a personal example of a use case, which is the intro music to this podcast. And by the way, I'm not taking feedback on it, although people have given me feedback on it, [chuckles] but it's an instrumental jazzy version of a song I really like.
I had originally wanted to get a slightly different song, and I reached out to the producer because I wanted to pay for it; I'm not stealing anybody's music. And what he said was, "I actually would love to let you use this and license it to you. We do own it, but there is a sample on it that we don't own. And so, I'm not comfortable doing that."
And what I realized after learning about AudioShake is that you could strip that out, in theory. It's one little thing that he could easily, potentially take out, and then he could sell the music, which the artist owns, and I would pay for it just like I'm paying for the music I am using.
And it never even occurred to me that this is a- I should actually let him know! But it's, like, such a great song, and he couldn't license it, and I did want to pay him for it. So-
Jessica: Yeah, there are a lot of folks on the label side that will use our tech for that. Y'know, albums that were released where perhaps the samples couldn't get cleared and they wanted to remove them, or perhaps samples were made that were never authorized to be made.
Niki: I just want to hit on one more thing before we wrap. So, at the very beginning, you talked about touching a wall and a sound happening. But I think there's a cool application that we didn't cover, which is the video game world.
Jessica: Mm-hmm [Niki: and VR] Yeah! I mean, imagine, so today, imagine you're playing a video game.
There are maybe, I don't know, a handful of songs in that video game that someone prepared for that game. The gaming company would have gone and licensed those songs, would have gone to the label, would have asked for stems. Maybe they would have had the stems, maybe they wouldn't have; it would have taken them forever to get the stems. But by the very nature of what they're doing, they can probably only get a handful of songs.
Gamers listen to more music than your average consumer. They also turn down the music in the games a lot of the time because they want to listen to their own music. [Niki: Aah] They don't want to listen to what the gaming company has, like, imposed on them.
So, imagine a world in the future where they're playing in their living room, they're listening to Drake, and the system detects, this is already very easy to do, that they are listening to Drake. It matches it to the Drake in their corpus, which has already been separated by AudioShake in a really standardized way. And then they're just applying rules to that music, so that when you're going into the caves, if they want to create a creepy sound, they can drop all of the music except for the bass, right? They're calling the bass stem, and they could do that for, like, a million songs, which is not something you could do today.
So, anything where you're thinking about making audio customizable and interactive and responsive to what a person is doing, you need some element of standardization, and that's where we are helpful.
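[Editor's note: a toy sketch of the rule-based remixing Jessica describes, assuming a track's stems already exist as standardized WAV files; the file paths, game states, and gain values here are all hypothetical.]

```python
import numpy as np
import soundfile as sf  # pip install numpy soundfile

# Hypothetical standardized stems for one track (same length and sample rate).
STEMS = {
    "vocals": "stems/vocals.wav",
    "drums": "stems/drums.wav",
    "bass": "stems/bass.wav",
    "other": "stems/other.wav",
}

# Hypothetical rules: a volume multiplier per stem for each game state.
RULES = {
    "overworld": {"vocals": 1.0, "drums": 1.0, "bass": 1.0, "other": 1.0},
    "cave": {"vocals": 0.0, "drums": 0.0, "bass": 1.0, "other": 0.0},
}

def mix_for_state(state):
    """Remix the track for the current game state by scaling and summing stems."""
    gains = RULES[state]
    mix, rate = None, None
    for name, path in STEMS.items():
        audio, rate = sf.read(path)
        scaled = audio * gains.get(name, 1.0)
        mix = scaled if mix is None else mix + scaled
    return mix, rate

# Player walks into a cave: everything drops out except the bass stem.
audio, rate = mix_for_state("cave")
sf.write("now_playing.wav", audio, rate)
```

[Because every track's stems follow the same layout, the same rule table works across an entire catalog, which is the standardization point Jessica is making.]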
Niki: I think that's a perfect anecdote to end on because it kind of sums up everything you're doing, which is you're applying a tech solution, AI, to create more options for consumers in a way that could never be done manually. So, it gives people more creative control over the environment they're working in.
It gives the artists more avenues for fans to listen to their music and use it.
And you're doing it as we're ever more online and have higher expectations that we can have creative control and playfulness, but while thinking through the remuneration part of it and the respect for the artists and their voices, so that it gives people choices.
So, I love it! I'm totally for what you're doing!
Jessica: Excellent. Great!
[both laugh]
Niki: If anybody is interested in using it, you guys are at AudioShake.ai, but more likely, people might be interested in following you. You're on Twitter; you're on LinkedIn; you guys have won a bunch of awards. That's how I started tracking what you were doing.
I was like, what is Jess Powell doing? Like, [Jessica: chuckles] something with music?
So, we should link to those. Any other call to action or anything you'd want to say, last words?
Jessica: Yeah, I'm on Twitter, for as long as that exists, at themoko. The things we come up with when we're young. [chuckles] I don't know.
And then on LinkedIn, Jessica Powell. And if you want to try, if you have independent music and you want to just try the technology, or you have a podcast, or you have dialogue you want to separate, you can go to indie.AudioShake.ai. Which is indie, I-N-D-I-E, dot AudioShake, dot A-I, and you can try it out for free.
Niki: Awesome. Jess, thank you so much for taking the time. Thank you for coming on. I'm really grateful.
Jessica: Yeah, thanks for having me.
Outro:
Niki: As always, thanks for listening. Our next episode shines a light on the deepest and darkest parts of the web: the good, the bad, and the ugly. I kind of thought I knew what the dark web was, but guest Matteo Tomasini really breaks it down.
If you like this show, please take a moment to leave a review on your podcast platform of choice. It really helps people find the content.