Kirstin Burke: We’re excited about this TECH talk. Obviously, everybody out there, everything you read, anyone you talk to is all about AI. Whether it’s AI trip building or AI speech writing, AI is being integrated into just about everything people do. But when it comes to business, and to thinking about how AI can help your business compete and thrive, I think that’s where businesses are really challenged right now. Last year we heard all about these large language models, about ChatGPT, really all of these behemoths and how they were coming to market. And now the dust is settling a little bit and people are starting to think: okay, so how do I do this? Where does it live? How do I make sure I get in on this gold rush?
You’ve got a lot of vendors out there hoping it’s a gold rush. Look at the market numbers right now: AI spending is projected to be $750 billion just by the end of this year, and AI-related technologies around $3.7 trillion within the next nine years. So clearly you’ve got a lot of vendors getting in on this, and a lot of spending. But from a business perspective, what is it that we spend to make sure our business takes in a little bit of that gold rush and it really helps us do business better? So Shahin and I are going to talk a little bit about how to best take advantage of it and, in a lot of cases, where that AI workload should live. And in more and more cases, people are thinking on-prem.
So, Shahin, how do you see this shift evolving just in the last year, and what is pushing this on-prem versus cloud pressure right now?
Shahin Pirooz: It’s really driven, I think, primarily by the fear that GPU cost in the cloud is really high. People are concerned: I’m going to move to the cloud, I’ve got these workloads where I don’t totally understand how they’re going to burst or how they’re going to scale, and I can’t afford to get a million-dollar bill next month when it was only $2,000 this month. That’s really what’s driving it.
I think some of the factors people really have to consider as they’re going through this are the driving components, the gears, that help you determine whether a workload, and specifically an AI workload, is better suited for the cloud or on-prem, rather than just jumping to the conclusion that we’re going to burst, we’re going to have a lot of GPU consumption, and that’s going to kill us. And with so many of these factors, when you think of storage or networking or cloud, not a lot under the sun is new; it’s the way it’s being used and consumed, and the unpredictability of some of this, that I think is adding a lot of pressure on folks. It’s almost where cloud was a while ago: it sounded great, but then you started getting the bills and you’re like, I’m going to be out of business if I keep doing business this way. So you look at some of the large, let’s call them retailers, in the cloud.
You look at, for example, Netflix, one of the first to say, oh my God, this is going to kill us. They moved 50% of their infrastructure on-prem so they had a static, predictable cost model, but could still burst to the cloud when demand required it. That’s similar thinking to the debates we’ve gone back and forth on over the years: do we have a, I’m going to put the quotes in because I always hated the term, “private cloud,” meaning on-prem or in a colocation? Do we use a public cloud? Do we have a hybrid cloud? And now there’s a lot of talk about edge computing, which is effectively a balance of: is it at my edge, is it at the cloud edge, do we consume cloud behind it? All of this is really the same conversation we’ve been having for at least 15 years, if not longer.
Kirstin Burke: So I’m an organization moving forward with this technology strategy. What would you say? When we marketed this talk, we talked about four areas: security, performance, cost, and then customization and control. We thought of those four areas as guideposts to think about as you decide where that workload lives. Maybe we could jump into each one, and if you think one is missing or one is more important than another, you can change those dials as we talk.
Shahin Pirooz: I would say, if I’m building a TCO, the things I would think about are, number one, workload characteristics: even going beyond performance, what is this workload going to do? Then infrastructure costs, which are a component of it; operational costs, which are the things that recur rather than hit in bursts the way infrastructure costs might; and personnel costs, which we often ignore when we’re building out TCOs.
And then, as you mentioned, data security and compliance. And the last component is scalability and flexibility: how much am I going to be able to burst, or do I have the flexibility to burst if I need to? Or are we stuck and going to hit a wall? Those are the areas I would put energy into if I were building a TCO. So let me break those down.
If we think about workload characteristics specific to AI, one of the things to ask is: are we training the AI, or are we using it as an inference engine? Those are very different kinds of workloads. If you’re training, if you’re using the platform to educate and train your models into what you want them to be, the demands are substantially higher. Inference workloads, by contrast, are much simpler: more steady state, more predictable. They don’t have those bursts of education, if you will, of pulling in data and processing it. And then there’s the amount of data, the volume you’re processing, and the transfer costs associated with that.
So when you’re looking at cloud workloads, for example, you’re going to have ingress and egress costs associated with any data you’re transferring out of the cloud to someplace else. That’s a factor to think about when you’re building out the workload characteristics: is it going to be pushing a lot of information to the consumer of your AI?
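To make that transfer-cost point concrete, here is a minimal sketch of the egress math. The $0.09/GB rate is purely illustrative, not any provider’s actual price; real cloud pricing is tiered and varies by provider and region:

```python
def monthly_egress_cost(gb_out_per_month: float, price_per_gb: float = 0.09) -> float:
    """Estimate monthly egress cost for data leaving the cloud.

    price_per_gb is an assumed illustrative rate, not any provider's
    actual price; real pricing is tiered and region-dependent.
    """
    return gb_out_per_month * price_per_gb

# An inference service pushing 10 TB (10,000 GB) per month back to consumers:
cost = monthly_egress_cost(10_000)  # 10,000 GB x $0.09 = $900/month
```

Even at steady state, an inference service that returns large responses quietly accumulates a meaningful egress line item, which is why it belongs in the workload-characteristics analysis.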
And then the last one is consistency. Is this a steady-state workload that keeps operating in a similar way every month, or are there going to be huge spikes you have to design for? Those workload-type factors are very similar to the considerations when you’re weighing cloud versus on-prem to begin with.
But what’s different about these characteristics with AI is that you really have to ask: are we training, or are we using this as an inference engine? So, spoiler alert: if you’re training, you should do it on-prem. If you’re using it in an inference model, then there’s an opportunity to look at the cloud.
So there’s a lot of talk now about running all your training on on-prem infrastructure and then running the actual AI in production out of a cloud environment, so that it’s easily accessible, can scale and burst, and is more of a steady-state approach. But the training is where a significant amount of the GPU processing is taken up.
The second category is infrastructure cost. When you’re building in house, you have to put in a significant upfront hardware expenditure, and then there’s the tech refresh that has to happen every, whatever your life cycle is, three years, five years, ten years. And in the AI context, the market is changing so fast. It used to be, instead of “there’s gold in them hills,” that “there’s Nvidia in them hills.” But now the other players are starting to come to the table: AMD is putting out their GPUs, Intel is putting out their GPUs. So it’s becoming a little more distributed in terms of who you can pick as your GPU platform.
That said, hardware expenditure is a factor in infrastructure cost, versus a cloud computing model where you’re looking at pay-as-you-go. So it really becomes an OPEX-versus-CAPEX decision. There are manufacturers helping with that if you’ve decided you’re not doing CAPEX anymore: HPE with GreenLake is really trying to create this consumption-based approach to hardware, and NetApp has very equivalent Keystone capabilities, again consumption-based hardware. But it’s all for on-prem.
The next category is operational cost. Oftentimes people think of OPEX here, but infrastructure cost is where you think about CAPEX versus OPEX. Operational cost is really things like power and cooling, and it points to a real concern with moving on-prem for AI workloads: they run so much hotter that they need a lot more cooling. Oftentimes people are putting in water-cooled data centers. So depending on the size of your on-premises AI infrastructure, you may want to consider colocation instead; on-prem changes the picture a little, but colos are a little more predictable. And there are plenty of colos out there designed from the ground up specifically as water-cooled facilities. Many were actually built for Bitcoin mining rigs, which also happen to be very GPU intensive, and now that crypto is not as big a thing as it was when they started, they’ve decided they’re going to be GPU data centers. So there are plenty of water-cooled data centers out there with capacity to help with this.
The other operational cost is maintenance and support. Not only are you building this stack, you have to update it, patch it, continuously improve it, and do hardware refreshes. That’s a factor that often falls off the list.
And the last thing is software and licensing. On top of running your infrastructure, you now have to have software for storage as a service, software for your database infrastructure, your large language models, your RAG pipelines, all the software that makes this happen, plus the design of the inference engine, or, if you’re writing it yourself, the in-house developers.
So I’m going to recap as we go, just so we don’t lose track. Number one was workload characteristics, then infrastructure cost, then operational cost, and the next one is personnel cost. This is where you cross into an area where your traditional IT folks may not have the right skills to manage this stack. All of a sudden we’re talking about water cooling, when historically we’ve said no frickin’ water in the data center. So there are factors of re-education and learning this new structure. You need data scientists, you need machine learning people, and you need infrastructure people, so it adds to the skills you’re having to staff. And then, if you are bursting, you need cloud platform expertise. Each of the public cloud providers has built their own database as a service, infrastructure as a service, GPU as a service, and storage as a service that you can tap into, and you create microservices to do the things we’re talking about. Which means you have to have resources who understand what all those services are, how they integrate, how the security plays, and so on and so forth.
Which leads us into data security and compliance. Security measures are pretty critical. If you’re doing an in-house deployment, you now have to worry about not just the security of the software you write and access to the data, but physical security as well. You have all this data inside this on-prem infrastructure: who can access the building? How do they get from the building to the data center? Who can get into the data center? And once they’re in, are there any controls preventing them from logging on to the systems? You take all the burdens of physical security that AWS and Azure and GCP take off of us back onto yourself.
You have the compliance issue regardless of whether you build it on-prem or not; security has always been in the application stack. But the physical aspect of compliance really comes into play in your own environment, because in the cloud you can lean on the SOC 2 and SOC 3 reports the providers maintain for the physical security and compliance requirements. Where things start to change is the human factors: who has access to what, need-to-know access, the type of access, how you get developers into the environment, how you restrict developers to the things they need to see versus don’t. How do you prevent IP from bleeding into your AI models? How do you prevent the injection of false data into the learning models? There are a lot of attacks right now where people try to poison learning models by creating websites the models will scan and pull in. How do you protect against those things if it’s your own AI? So data security and compliance really changes from the traditional information security of firewall, endpoint, and network security; now you have to get deep into application security as well.
And then there’s scalability and flexibility. In the cloud, that’s obviously super easy: you’ve got on-demand scalability to meet your needs. The corollary is that it can get really expensive and out of control. Whereas in house, you’re going to have some innate limitations, and that helps you determine how big a workload you can train before you run out of capacity and have to add more. You’re creating limited scalability by the nature of the fact that your container is only so big, and growing your training requirements would require additional investment. So, as a quick rehash: understand your workloads, what they are and what their characteristics are; what infrastructure requirements you’re going to have; the operational costs associated with running the environment; the personnel costs of hiring and training staff who understand what you’re building, whether internal or contractors; and data security and compliance as it relates to not just physical security but everything embedded in how you protect this. Effectively you’re creating a virtual human that is learning, that you’re teaching and growing. How do you prevent it from being corrupted?
Understand what your scalability requirements are. If your data is constantly bursting up and down, with tons of different types of data and tons of different use cases, then building on-prem is going to be very difficult, because you have to spec it to the largest spikes as opposed to some steady-state level.
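The rules of thumb laid out here, training and strict data requirements lean on-prem, bursty inference leans cloud, can be collapsed into a toy decision sketch. This is a caricature of the reasoning, not a sizing tool; the predicates are assumptions you would replace with your own full TCO analysis:

```python
def suggest_placement(is_training: bool, is_bursty: bool,
                      strict_data_requirements: bool) -> str:
    """Toy heuristic mirroring the talk's rules of thumb.

    Ordering matters: data requirements and training demands dominate,
    and only then does burstiness push toward the cloud.
    """
    if strict_data_requirements:
        return "on-prem"  # data has to stay in house
    if is_training:
        return "on-prem"  # training is where GPU spend explodes
    if is_bursty:
        return "cloud"    # pay-as-you-go absorbs the spikes
    return "either (compare full-lifecycle TCO)"

# Training workload with spiky demand still lands on-prem under these rules:
print(suggest_placement(is_training=True, is_bursty=True,
                        strict_data_requirements=False))  # on-prem
```

The interesting property is the ordering: a steady-state inference workload with no data constraints is the one case where the placement genuinely comes down to cost modeling rather than a rule of thumb.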
Kirstin Burke: It’s interesting when you mention all these things because, to your point, there’s a lot of strategy and thinking that is similar to cloud. We’ve been there before, we’ve thought of this before, and we just apply those factors in a different way. But some of these things are new and time intensive, personnel for example: new training, things they don’t necessarily know yet. And think of the pace this is moving at and the pressure organizations are under to really get their hands around it. Then you have reports like DeepSeek coming out that all of a sudden upend everybody for a while, because they’re thinking, oh my gosh, we’re thinking about this the wrong way.
How do you encourage organizations to move forward confidently with all of this change around them? What technology do we go with? How do we find the right people? There’s all this movement and uncertainty that requires time and training, yet we’ve got to move the ball down the field. How do you advise folks in that area?
Shahin Pirooz: So I would say, enterprise architecture hasn’t changed. What’s changed is the underlying technologies that make up that architecture, and your enterprise architects should really have a solid understanding of what is modern versus what we did 30 years ago. But enterprise architecture takes into account the outcome we’re trying to achieve and how it aligns with business goals and needs, and then finds the best integrated set of technologies to make that outcome happen. I would say a couple of things here.
When you’re trying to make this decision about where you go and what you do, the key thing I would say is: let’s not forget some of the lessons we’ve learned over the last 30 to 40 years, which is that our data has become more and more valuable and everybody wants to get their hands on it. So what’s changed all of a sudden that you feel like you can take your data, dump it into a generative chat architecture in the cloud, and give away all the gold you’ve collected over the years to somebody else, to make it their gold and let them expose it to all of their customers and everybody else?
So when you consider AI, it’s very easy to say, I can just use one of the pre-built platforms and load my data into it, and I don’t have to do the development to build an AI. The step-back question is: what are you trying to develop AI for? What is the real goal? We’ve got to be careful not to say, everybody’s doing it so I’ve got to do it. That’s a factor, for sure. But what is it that everybody’s doing? They’re adding intelligence to the data they’ve collected over the years about their business: exposing customer-facing capabilities so customers can do some self-service; or exposing more intelligence and faster computation over support data so support personnel can answer questions faster; or adding a virtual financial analyst to your financial data so you can make financial and business decisions faster; or analyzing your sales data to better understand the heuristics around the intent to acquire your product versus a competitor’s. All of those are use cases and business cases to evaluate.
When you’re considering what you want to get from an AI, I think that’s the first question to answer before you start asking where to put it. Let’s go back to that workload question: what’s the workload characteristic? A simple example: say I’m a manufacturer and I’ve got a tremendous amount of data around support cases, escalations, bugs, and bug resolutions. That’s valuable information. I can train a model to look at that data, figure out commonalities across it, and put together the equivalent of an interactive FAQ that can actually answer questions, so my customers and my support staff can interact with it and get quick answers to how to fix something somebody else has seen, rather than relying on employees to remember it and to create FAQs. That’s a brilliant use case, and one of the first use cases AI workloads were applied to. Back then there wasn’t a generative chat capability; we didn’t understand generative chat the way we did once OpenAI released ChatGPT. Now our eyes have been opened: oh my God, we can create a virtual person we can almost talk to, and it’s so much faster than Bob, who was the smartest guy in the room.
So start with: what is the workload I need? What is the problem I’m trying to solve? Faster support, or faster financial decisions if I’m an investor, or faster qualification of sales: qualifying or disqualifying a deal quickly because I understand the content, the words, the intent. The prospect on the call keeps saying, every single time, there’s not a thing I’ll buy from you, but the sales guy comes back and says, oh, they’re absolutely buying from us, we’ve got this. AI gives you visibility into how to operate your business better, and I think any company that’s not looking at AI to improve these things is at a disadvantage against companies that are. It’s like when we used to say IT is not a differentiator, but it equalizes the playing field: AI equalizes the playing field today. So start with: what problem or use case am I trying to solve? Then ask: do I have the data to train the models to answer the questions I’m asking? Am I comfortable putting that data into a public AI platform, or do I need to build it myself? If I’m building it myself, should I build it in a public cloud, or do the use cases make much more sense, from a security perspective, from a spiking perspective, to build on-prem? That’s how I’d work at it, backwards. Right now everybody is asking, do I go on-prem or do I go in the cloud? That’s the wrong question to start with; it’s the last question in that series of questions.
Kirstin Burke: So that goes a long way towards answering the next question I was going to ask, which is: where are people getting stuck? It sounds like, in your experience, people are getting stuck by putting the questions in the wrong order. The question you should be asking first, you’re not answering until later, when you’ve already made decisions that in effect affect your ability to be successful.
Shahin Pirooz: 100%. I was recently at a conference in Vegas, and in a gap period I was having a conversation that had nothing to do with our industry, just talking to some random people, and they said, what do you think about this AI stuff? I said, what do you mean? They said, well, we’ve got an AI project right now and we’re going to put it in AWS. I said, okay, good, what made you make that decision? They said, well, we figured we don’t want to spend all the money up front; we want to make it all OPEX. So I said, what AI are you building? We’re not sure yet.
This is an exact example of putting the cart before the horse. How do you know that workload is well suited for AWS if you don’t even know what the workload is? If it’s experimental and you want to build it in a lab and see how it goes, then great: you can do that without making capital expenditures, and you can shut down the project if it fails. And I’ll save you time and money: it will fail if you don’t know what you’re doing, if you don’t know where you’re headed. Start with: what problem am I trying to build the solution to? And if there is no problem, or no solution to that problem, you need to reassess before you start thinking about where it goes.
Kirstin Burke: One hundred percent. So as we wrap up, this is fascinating. I love getting a half hour of your insight into all of this, because with so many clients, so many conversations, and so many projects in flight, you are seeing, hearing, and experiencing these things at a magnitude that not many other people do, just by the nature of the role you’re in. That experience is so interesting to the rest of us listening.
When we offer ourselves, as DataEndure, to folks who are asking these questions, how would you leave it with them? How can DataEndure help if they have a curiosity, if they’re thinking, gosh, I’m here and maybe I’ve gone too far, or I’m not sure how to start? How might DataEndure assist them in this journey to find gold?
Shahin Pirooz: So the easy answer is, we can help with getting that first step unstuck: helping you understand your business, what some of the right problems are to be considering for AI, and then, by extension, what the solutions are. And from what I shared today, there are really three general takeaways. If it ends up we’re looking at a highly variable workload for you, one that requires rapid scaling because there are bursts and volatility associated with it, then you probably want to look at a public cloud deployment. It’s going to be the most cost-effective solution, because you don’t have to build a platform that supports the spikes; you build a platform that goes up or down with the nature of the workload.
If, on the other hand, it’s a highly consistent, high-volume workload, then you really don’t need to go to the cloud, because you can build a physical stack that matches that consistency and volume. And if you’ve got strict data security requirements that demand it be in house, then an in-house deployment is the most suitable solution. Finally, it’s super important for everybody to take a thorough look at the cost analysis from both perspectives. The entire life cycle of the project has to be taken into account, not just the build, and a lot of people do TCOs from a build perspective; they don’t think about the maintenance and ongoing support. If you just look at, using easy numbers, it’s going to cost me a million dollars in infrastructure but only about $300,000 a year in AWS, so my TCO break-even is three years, you’re not taking into account the people, the operations, the power and cooling. Factor those in, and it might actually take five years for AWS to match the on-prem TCO, so maybe the two are equal if you’re on a five-year refresh cycle. Those are the three factors I’d take away from today’s dialogue.
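The break-even arithmetic in that example is easy to sketch. A minimal model using the speaker’s “easy numbers” ($1M of infrastructure versus $300K/year of cloud spend), with on-prem operating costs as the variable people forget; the $100K/year opex figure below is purely illustrative:

```python
def breakeven_years(onprem_capex: float, onprem_annual_opex: float,
                    cloud_annual_cost: float) -> float:
    """Years until cumulative cloud spend equals cumulative on-prem spend.

    Naive linear model: on-prem = capex + opex * years,
    cloud = annual_cost * years. Returns infinity when cloud spend
    never catches up (i.e., cloud is cheaper every year).
    """
    annual_delta = cloud_annual_cost - onprem_annual_opex
    if annual_delta <= 0:
        return float("inf")
    return onprem_capex / annual_delta

# Build-only TCO: $1M infra vs $300K/yr cloud -> break-even in ~3.3 years.
naive = breakeven_years(1_000_000, 0, 300_000)
# Add $100K/yr of people, power, and cooling -> break-even stretches to 5 years.
realistic = breakeven_years(1_000_000, 100_000, 300_000)
```

The point is not the specific numbers but the shape: every dollar of recurring on-prem opex you ignore shortens the apparent break-even and flatters the on-prem case.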
Ultimately, this is not one size fits all, so bring us in to help you determine what size T-shirt you wear, and we’ll be honest with you. We’ve got some great resources: cloud architects, network architects, security architects, and compliance architects who can all be brought in, in those enterprise architecture roles, to help you come to a conclusion on what you need to do and where you ought to do it in the most cost-effective way.
Kirstin Burke: And to fill, perhaps, some of those personnel gaps you may not have covered yet. You mentioned that this calls for an expanded or different type of person, people who think about these things differently. So we can come alongside the talent and expertise you have to make sure you’ve got that well-rounded perspective as you make your decisions.
Shahin Pirooz: Exactly.
Kirstin Burke: Shahin, thank you so much for joining us today. Thanks to all of you for joining us, and we will see you next month.