We've all experienced the frustration that comes from too much or too little service management in a test environment. Lately, the DevOps engineer in me has been thinking about how we end up in one of those states. How can we get just enough service management in non-production environments?
Production environments require more service management than non-prod environments. But we shouldn't throw the baby out with the bathwater when it comes to service management in non-prod. I'm a software developer who practices DevOps, so I do a lot of work involving operations, deployment, and automation. I interface with many groups to achieve a good workflow within the organization.
Operations and development often have contradictory goals. Fortunately, we can all find common ground by working together. Understanding each other's needs and goals through communication is the key to success!
But before we get into that, let's explore the world of IT service management (ITSM) for a bit. In this post, I'll discuss different levels of service management in non-prod environments and borrow some fundamental DevOps principles that can help you get the right amount of ITSM. Let's start with an overview of non-production environments.
What Are Non-Production Environments?
We use non-production environments for development, testing, and demonstrations. It's best to keep them as independent as possible to avoid any crosstalk. We wouldn't want issues in one environment to affect any of the others.
These environments' users are often internal—for the most part, we're talking about developers, testers, and stakeholders. It's safe to assume that anyone in the company is a potential user. It's also safe to assume that anyone providing a service to the company might have access to non-production environments. But there could also be external users accessing these environments, perhaps for testing purposes.
Unless you have the environment in question tightly controlled, you may not know who those users are. That's a big problem. It's important to understand who's using which environments in case someone inadvertently has access to unauthorized information. Or maybe you just need to know who needs to stay informed about changes or outages in a specific environment.
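To make that concrete, here's a minimal sketch of a per-environment access registry. Everything in it is hypothetical: the `User` and `Environment` classes, the `approved` list, and the example addresses are assumptions for illustration, not a real tool or anything your organization necessarily has. It only shows the two questions from above: who needs to hear about changes, and who has access that nobody signed off on.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    name: str
    email: str
    internal: bool  # False for vendors, contractors, external testers


@dataclass
class Environment:
    name: str
    users: set[User] = field(default_factory=set)
    # Hypothetical allow-list: emails that have been cleared for access.
    approved: set[str] = field(default_factory=set)

    def notify_list(self) -> list[str]:
        """Emails of everyone who should hear about changes or outages."""
        return sorted(u.email for u in self.users)

    def unapproved_users(self) -> list[User]:
        """Users who have access but are not on the approved list."""
        return sorted(
            (u for u in self.users if u.email not in self.approved),
            key=lambda u: u.email,
        )


# Example data (made up): one approved internal user, one unreviewed external user.
qa = Environment(name="qa", approved={"dana@example.com"})
qa.users.add(User("Dana", "dana@example.com", internal=True))
qa.users.add(User("Vendor Tester", "tester@vendor.example", internal=False))

print(qa.notify_list())
print([u.name for u in qa.unapproved_users()])
```

In practice, this data would more likely come from your identity provider or access logs than from a hand-maintained list. The point is simply that each environment should be able to answer both questions.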
That's where service management comes in. The next section explains how bad things can be when there is no service management in non-production. This exercise should be fun...or it might make you queasy. Better have a seat and buckle up just in case!
When You Have Zero Service Management in Non-Prod
Let's call this the state of anarchy. Here's what it looks like:
- Servers are running haywire and no one knows it.
- Patches are missing.
- Security holes abound!
- The network is barely serviceable.
Can anyone even use this environment? How did it get like this, anyway? I have a couple of theories...
- Evolutionary Chaos: This model was chaos from the start. Someone set up an environment for testing an app a long time ago. It did its job and was later repurposed. Then, it got repurposed again. And again. Eventually, it started to grow hair. Then an arm sprouted out of its back. Then it grew an extra leg. Suddenly, it began to "self-organize." Now it seems to have a mind of its own. It grew out of chaos!
- Entropic Chaos: Entropy is always at play. It takes work to keep it from causing decay. In this theory, things were great in the past. But over time, service management became less and less of a priority for this environment. Entropy won the day, and the situation degraded into chaos.
However the environment got into its current chaotic state, the outcomes are the same. Issues are resolved slowly (if at all). Time is wasted digging up information or piecing it together. Data becomes lost, corrupted, and insecure. Owning chaos is a burden and a huge risk in many respects. We don't want to end up here!
If you've made it this far and still have your lunch in tow, you're past the worst of it. You can uncover your eyes, but be wary! Next, we're going to look at a fully buckled-down environment and how it can go wrong in other ways.
When You Have Too Much ITSM in Non-Prod
It's better to have too much service management than not enough. But it's still not ideal. For one thing, it's wasteful. For another, it causes morale to suffer. Granted, it's reasonable to default to production-level service management at first. But staying on default is a symptom of a big problem: a communications breakdown. Too much ITSM stems in part from human nature and in part from organizational legacy.
Here are my two theories on how organizations end up here:
- Single-Moded Process: Service delivery, operations, and all other departments focused on service management are hell-bent on making sure the customer is absolutely satisfied with their service. Going the extra mile to make sure the customer is happy is a good thing! Operations folks are trained on production-level service management, so their priority is to keep the trains running. With this in mind, operations management systems are set up for production environments. It's easiest to use that same default everywhere. For better or worse, every environment is treated like a production environment!
- Fractured Organization: Organizations are subdivided into functional groups. When these groups aren't aligned to a shared purpose, they'll align to their own purposes. They even end up competing with each other. They'll focus on their own aims, tossing aside the needs of others.
How You Know When There's a Problem
The fractured organization theory may explain what happened to a friend of mine recently. Let's call him Fabian.
Fabian was the on-call engineer this past June. The overnight support team woke him up several nights in a row for irrelevant issues in the development environment. He brought this up to operations, who were responsible for managing the alert system. Unfortunately, the ops engineer was not sympathetic to his concerns in the slightest. Instead, the ops engineer put it on Fabian to tell him what the alert system should do. That's understandable, but Fabian had no information to work from. The ops engineer wouldn't share anything with Fabian or collaborate with him on putting a plan together.
This story illustrates a misalignment between operations and development. Problems like this crop up all over the place. Usually, we can remedy or even avoid these situations by taking just a bit more time to understand the other side.
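Had the two of them sat down together, the plan could have been as small as a routing rule that both sides agree on. Here's a rough sketch of what that might look like. The story doesn't say how the alert system actually worked, so the `Alert` fields, the business hours, and the rule that only production pages someone overnight are all assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class Alert:
    environment: str  # e.g. "prod", "dev"
    severity: str     # e.g. "critical", "warning"
    raised_at: time   # local time the alert fired


# Hypothetical policy values that ops and dev would agree on together.
PAGE_AFTER_HOURS = {"prod"}  # environments allowed to wake someone up
BUSINESS_START = time(9, 0)
BUSINESS_END = time(17, 0)


def route(alert: Alert) -> str:
    """Return "page" to alert the on-call engineer now, or "ticket" to queue it."""
    if alert.severity != "critical":
        return "ticket"
    in_hours = BUSINESS_START <= alert.raised_at < BUSINESS_END
    if in_hours or alert.environment in PAGE_AFTER_HOURS:
        return "page"
    # Critical, but after hours in a non-prod environment: it can wait.
    return "ticket"


print(route(Alert("dev", "critical", time(2, 30))))
print(route(Alert("prod", "critical", time(2, 30))))
```

The specific rule matters less than the fact that it's written down where both teams can see it and change it. With something like this in place, Fabian's 2:30 a.m. development alert becomes a ticket for the morning.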
The four theories I've presented tell us about extremes. And yes, these extremes push the boundaries and aren't likely to occur in their pure form. Still, an organization sitting somewhere in the middle may not have the right amount of service management in non-production. As we've seen with Fabian's story, this is often an issue of misaligned goals.
So how do we get to just enough service management? Maybe the answers lie in what's working so well for DevOps! Let's see how.
Just Enough Service Management
IT teams have members with specialties suited to their functional area. Operations folks keep the wheels turning. QA makes sure the applications behave as promised. There are several other specialties—networking, security, and development are just a few examples. Ideally, all of these teams interact and work together toward a well-functioning IT department. But it doesn't just happen. It takes some key ingredients.
Working together effectively takes good leadership. Leadership happens at all levels in an organization. Remember, a leader is a person, not a role.
It's also critical to have a shared vision and shared goals. Creating a shared vision is part of being a leader. Here are a few points to remember about vision:
- A shared vision creates alignment.
- The vision should be exciting to everyone.
- You have to do some selling to get everyone aligned with the vision.
Your vision for the test environment could be something like: "Our test environment will be a well-oiled machine." Use metaphors like "Smooth Operators" or "Pit Crew" to convey the right modes of thinking.
Keep communications open and honest. Open, honest communications can be one of the most significant challenges you'll face in implementing the right amount of service management. Many of us have a hard time being honest for fear of looking weak in the eyes of others. That fear is difficult to overcome, especially in an environment where we don't feel safe and secure. Managers have the vital task of creating an environment where employees feel safe and able to communicate openly. Trust is essential to success.
One Last Look
Getting the wrong amount of service management in any environment is a problem. Too little opens up all kinds of risks. Too much ITSM results in wasted time and resources. In this post, I presented four theories for how an organization might end up with the wrong amount of service management in non-prod and discussed what changes you can make to correct that.
ITSM doesn't happen in a bubble. It takes alignment between many stakeholders. There are three main things you can do to get alignment: wear your leader hat, share the vision, and converse honestly. You can accomplish any goal when you're set up to win, even something as challenging as achieving just enough service management.