Chapter 1 - Introduction to Katalon
Chapter 2 - What is Shift-Right Testing?
Chapter 3 - Shift-Right vs Shift-Left Testing
Chapter 4 - Shift-Right Testing Practices – Release Management
Chapter 5 - Shift-Right Testing Practices – Monitoring
Chapter 6 - Course Summary
This course will go through the best practices, including how to monitor your mobile application UX and performance in the production environment with different tools.
Chapter 1 - Introduction to Katalon
Thanks everybody for joining me here today. As I said we're going to talk about improving mobile quality with shift-right testing. Before we jump into that, I'll just give a brief introduction to Katalon as well. You may know Katalon from our Katalon Studio product. It was our first test automation product and it allows you to author tests by recording and playing back, and customizing scripts. We do that across a wide variety of platforms.
It was well received last year: for the second year in a row, we received an award from Gartner Peer Insights as the best test automation software. But these days, Katalon is more than just Studio. We are an all-in-one testing platform that also includes Katalon Recorder and Katalon TestOps. We help you test across a variety of platforms: API testing, mobile testing, web testing, and desktop testing, on Android, iOS, and many browser platforms. And of course, in mobile testing, we integrate with our partners at Kobiton. We have a plugin for the Kobiton system that really simplifies connecting to devices and executing tests on the Kobiton platform.
Chapter 2 - What is Shift-Right Testing?
So as I said, let's start out with what shift right testing is, so we're all on the same page. The "right" in "shift right", I think probably a lot of you know, but we'll just go over it again. It refers to the right side of an imaginary or a real development pipeline. If we imagine that we start on the left with design, we build, we test, we deploy, then we put it into active use in production. Everything after that deployment, that's the right. That's the right-hand side.
So, you know, we work to shift left and move testing earlier in the process. But there are also things to be learned from shifting to the right. That's what we'll talk about today. But, you know, before we finish with the question of what is shift right, I think there's also a question that comes up. Is this just testing in production?
Of course, the answer is both yes and no. It's yes because I just told you the right is about, you know, post-deployment production environments. But it's not testing in production in the sense of this meme, which is frequently deployed pejoratively or cynically by engineers: "I don't always test my code, but when I do, I do it in production." This isn't about ignoring testing prior to production and just waiting to see what happens. This is going to be about how to get the most out of the tests that you can run in production and why you might need to do that.
Chapter 3 - Shift-Right vs Shift-Left Testing
We just discussed, you know, what shifting right is. So why should you shift right? Why can't you get all of your testing done prior to production? Again, that may not be a mysterious question to people who are engaged in it, because there are lots of answers. There's just time constraints. But I think the main reason is because there is no other place like production. Just like there's no place like home.
Let's bring up another meme. There's no place like production for a number of reasons. One is scale and I often reflect on this because I started my career in software development slightly before the advent of the Internet at least as an application deployment platform. Before that, I think it was harder to have applications that operated at the scale that we do today. You know, in the past, before the Internet was an open platform for deployment, it was hard to get thousands, hundreds of thousands, certainly millions of people using a single version of your application.
But today, a relatively small SaaS application could easily have thousands, hundreds of thousands, maybe even millions of customers. That scale produces data and transactions that are very hard to reproduce in any environment prior to production. You just can't afford to build out something that has the capacity to scale like your production environment. You're frankly going to have a hard time producing the data that would simulate it.
Your production environment is also necessarily going to be more complex than any other environment you're likely to build or that is reasonable to build. It's going to be where you have your CDN deployed, your load balancers, all your potential network connections to partner systems. It's where everything comes together, and it's unlikely you're going to see everything that you could prior to that environment. So what we're going to talk about today is how to test there and how to learn from it.
It's also the environment that is the most complete. It's going to have all your latest functionality. Again, it's going to have all your latest connectivity. It's going to be the place where you have tools in place for system monitoring. Your marketing system may have scripts deployed to observe customer behaviors, scripts that you don't have, or that don't really receive much use, in your pre-production environments. Those scripts can affect the way your application runs and that's an important thing to remember.
So, we shift right because production is distinct. We also have, in our modern environment, continuous deployment into production and continuous delivery. In many organizations, you may be releasing new code throughout the day, multiple times a day. Certainly we are far away from when we used to deliver only at the end of sprints or even longer durations. In order to support that rapid delivery of functions, we've got to be looking at it as it hits the most complex and most elaborate parts of our environment.
But one thing I'll pull out that I think is particularly compelling about shift right testing is that it can encourage collaboration among groups that aren't always looking in the same place. Those are your developers, your quality engineers, your quality professionals, and your operations professionals.
Your developers are often looking at their development system. Maybe it's their laptop. Maybe it's another development environment that they have access to. QA is looking into test environments. Maybe as they shift right, they're looking into production. Operations is really always focused on production. If you can start to develop conversations among those groups about how the application behaves in the place where it really matters, I think that's really powerful and something to look forward to and keep in mind.
Another reason that shifting your testing to the right is valuable is that ultimately that is where value is created. You can see that in this set of responses to a survey that was part of Capgemini's continuous testing report from last year, where they asked how you measure the effectiveness of your continuous testing process.
The top two survey results are by (I.) looking at production data and (II.) from user feedback and the adoption of new functionality. Those are the things that deliver value. Further down the list, you'll see things that we often look at as engineers, quality professionals. Things like requirements coverage, automation, code coverage. Those are all important but ultimately they are there to support those value delivery features in production.
Similarly, from that same report, there's another survey question asking what continuous testing practices you are putting in place. The one that is at the top is testing in production. So, I think this would be even bigger if we were defining testing in production in the somewhat pejorative sense of just jumping into production because I didn't test in earlier environments. I think what this is really talking about is what we're talking about in terms of shifting right. That is, taking tests that we've already built and already executed, not just automated tests but also exploratory manual tests, and applying them to production so that we can see how our applications behave in that richest of environments and learn from that.
Now we've talked about what shifting right is and why we want to shift right. The existence of shift right suggests that there's a shift left, and that raises the question: Are they mutually exclusive or complementary?
What if I told you that you could test both in prod and before prod? Ultimately, we're shooting for testing all along our life cycle. Unlike that original pipeline that I drew, it doesn't start in one place and end on the right. It goes on and on.
In particular, it goes from our ideation, development, integration, and testing into deployment, testing in production, learning from those tests, learning from monitoring in production, analyzing those results and then feeding that right back into the cycle. I think that's ultimately what we're looking for.
But sometimes we give short shrift to shifting right and that's why I want to talk about that today. But if we compare and contrast them, shift left testing is about preventing problems early instead of testing only at the end of development. We're trying to push testing earlier and earlier in the process. We want to get core functions and coverage early in the process if we can.
Shift right testing is about detecting issues that may only be revealed in that post-deployment environment and making sure that critical functionality works even there. As I said earlier, that combination of shift right and shift left testing gives us continuous coverage and continuous learning throughout the product lifecycle.
Chapter 4 - Shift-Right Testing Practices – Release Management
Now I've talked about what shift right testing is. I've talked about why we want to do it and compared it to shift left testing. Now I want to talk about some practices that we think are critical to learning from shift right testing. So I'm not going to be talking just about how to run the tests per se. They could be automated tests. They could be manual tests that you execute. But it's about how to prepare to make the most of those, how to get value from them, how to learn from them.
One does not simply test in production. You know, going back to my original premise, shift right testing is not about just arbitrarily testing things in production because you didn't do it earlier. It's about executing tests in production that you can learn from, and so what I'm going to talk about here is how to be prepared to learn from those tests. As I mentioned earlier, as we go through each of these practices, I'll talk about what considerations may need to be taken into account for mobile testing in particular.
We've got four practices that we're going to talk about and they're split into two groups. Two of them are about deployment, how you get your software into production and how that can support shift right testing. The other two are about monitoring, how you watch what happens in your production environment and how that, again, can support your testing.
So the first of our deployment approaches is what's called a "canary release" and this isn't terribly complicated. But I think it's worth talking about. The term comes from canary in a coal mine, having something that you're going to deploy and that is going to warn you about issues. So the canary would warn you if there was an oxygen issue in the coal mine.
Canary releases aren't always about warning you about bad things. Sometimes, they're just about getting new features into an environment where you can test against them. So, the notion in canary releasing is that you've got a release already in your production environment. You're going to take some subset of production, maybe a server, maybe a cluster of servers, and you're going to deploy your next release into that. Then, you're going to use typically some part of your infrastructure to separate your user base. Maybe you're doing it through load balancing. Or, in mobile applications, you may be choosing to connect to different back-end endpoints.
But you're splitting some of the usages so that generally the bulk of your usage stays on your, let's say, 1.0 release and then the rest is cycled over into your next release where you can start to both have real usage and do some of your testing, again, automated testing or manual testing applied against this new release. That's a place where now you can learn if there was anything that was missed in earlier pre-production environments, maybe that didn't have all the complexity.
Because you're not just doing testing, you're also putting some real load on the system, you start to see whether there are usages, users, devices that you haven't seen before. That brings out different behaviors in the system. And you do it in a controlled fashion so that, over time, you can start to deploy more and more of your new software into the environment. Eventually, all your systems are running 1.1.
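To make that traffic split a little more concrete, here is a minimal Python sketch. It is only an illustration under some assumptions, not how any particular load balancer does it: the release labels, the `bucket_for` and `choose_release` helpers, and the idea of hashing a user ID into one of 100 buckets are all hypothetical choices made for the example.

```python
import hashlib

STABLE = "1.0"   # release already serving most users (label is illustrative)
CANARY = "1.1"   # next release, deployed to a subset of production


def bucket_for(user_id: str) -> int:
    """Map a user ID to a stable bucket from 0 to 99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def choose_release(user_id: str, canary_percent: int) -> str:
    """Route roughly canary_percent of users to the canary release."""
    return CANARY if bucket_for(user_id) < canary_percent else STABLE


if __name__ == "__main__":
    users = [f"user-{n}" for n in range(1000)]
    for percent in (5, 25, 100):
        on_canary = sum(choose_release(u, percent) == CANARY for u in users)
        print(f"target {percent}%: {on_canary} of {len(users)} users on {CANARY}")
```

Because the bucket depends only on the user ID, a user who lands on the canary at 5 percent stays on it as the percentage grows, which keeps their experience consistent while more of the environment moves over to 1.1.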
In addition to functionality, another major way that canary releases are used is as a way of testing new infrastructure, because they give you the opportunity to do that. Maybe in your 1.1 version you have switched your queuing from SQS to Kafka, or you've switched your database, which is a wild switch, from one database type to another. Those are things that you can do in a canary release strategy.
Given that description, I want to talk about a couple of things that you need to keep in mind when you're doing this with a mobile application. Unlike your standard web application, a mobile application typically has two parts: your back-end APIs and your front-end mobile application. Those have to be coordinated in your canary release. You'll probably need to release your back end in advance, and then you're going to need to release your new mobile application into, maybe, your TestFlight beta release cycle.
So, I kind of led into this, in my previous statement. Our second deployment strategy is what's called dark launching. Dark launching is where you release a feature. A critical component of dark launching is what we call feature flagging. So you release a feature and you take the functionality for that feature, the code behind that feature. You put it behind a flag that allows you to control whether or not it is visible to a given user or segment of users. It's really not a whole lot more complicated than just that.
There's a boolean if-then clause around the code that you're deploying. Now, there are a lot of solutions for this. I have a graphic here from LaunchDarkly which is a well-known commercial version. But there are open source libraries for feature flagging and they all add degrees of completeness to it. So, they give you better ways to group users together. They optimize the ways that you can turn on and turn off the flags and make sure that those flags cost as little as possible to check and execute. But in general, they give you a way to deploy software and then gradually reveal new features in that software to your customer base.
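As a rough illustration of that boolean check, here is a small Python sketch. It does not use LaunchDarkly or any real feature flagging library; the `FLAGS` table, the `is_enabled` helper, the flag name, and the segment names are all made up for the example.

```python
# Hypothetical in-memory flag table; a real system would load this
# from a flag service or configuration store.
FLAGS = {
    "new_checkout": {
        "enabled": True,
        "allowed_segments": {"internal-testers"},
    },
}


def is_enabled(flag_name: str, user_segment: str, flags: dict = FLAGS) -> bool:
    """Return True only if the flag exists, is on, and covers this segment."""
    flag = flags.get(flag_name)
    if flag is None or not flag["enabled"]:
        return False
    return user_segment in flag["allowed_segments"]


def checkout(user_segment: str) -> str:
    # The deployed code contains both paths; the flag decides which one runs.
    if is_enabled("new_checkout", user_segment):
        return "new checkout flow"
    return "existing checkout flow"


if __name__ == "__main__":
    print(checkout("internal-testers"))
    print(checkout("general"))
```

Treating an unknown flag as off is a deliberate choice in this sketch, so that a missing or mistyped flag name falls back to the existing behavior.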
You may do that in a number of ways. You may do it by features of a given customer set. So, maybe you've grouped people by regions and you want to release it to one region or another, or maybe by device types, or maybe you're going to do it at random because you're doing some sort of experimentation. You're going to gradually add random people into increasing percentages. So, maybe you start by rolling out to one percent, then 5 percent, 10, 20 percent as you go out.
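Those targeting options (by region, by device type, or by a growing random percentage) could be sketched in Python along these lines. This is an assumption-laden illustration, not any vendor's API: the `Rollout` class, its fields, and the flag and region names are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Rollout:
    """Hypothetical targeting rule for one feature flag."""
    flag: str
    percent: int = 0                                  # share of remaining users, 0 to 100
    regions: set = field(default_factory=set)         # regions that always get the feature
    device_types: set = field(default_factory=set)    # device types that always get it

    def _bucket(self, user_id: str) -> int:
        # Salting with the flag name gives each flag its own random-looking split.
        key = f"{self.flag}:{user_id}".encode("utf-8")
        return int(hashlib.sha256(key).hexdigest(), 16) % 100

    def includes(self, user_id: str, region: str, device_type: str) -> bool:
        if region in self.regions or device_type in self.device_types:
            return True
        return self._bucket(user_id) < self.percent


if __name__ == "__main__":
    users = [f"user-{n}" for n in range(1000)]
    for percent in (1, 5, 10, 20):
        rule = Rollout(flag="new-cart", percent=percent)
        count = sum(rule.includes(u, "emea", "android") for u in users)
        print(f"{percent}% stage: {count} of {len(users)} users see the feature")
```

Since each user's bucket is fixed, moving from 1 percent to 5, 10, and 20 percent only ever adds users; nobody who already has the feature loses it as the rollout widens.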
What this does for you, in terms of shift right testing, is it gives you a way to get software into production. The "dark" in "dark launching" is that you don't see that new feature, that new software that has been deployed, until the feature flag is flipped. So, it gives you a way to get your software out into production and gives your testers or your test automation a way to access it in the production environment before it has the opportunity to be exposed to any of your general users. It gives you the opportunity to then gradually bring on real production load, real production usage, and learn from that.
So as you're scaling up your feature flag, maybe you notice a use case that hadn't been seen before, or a particular load phenomenon, or a particular behavior in your infrastructure that you hadn't seen before. You can stop or even roll back your dark launch or your feature flag when you see that. That's a very powerful capability and I don't think it can be overstated how useful that is, particularly in our modern continuous delivery paradigm.
I talked about the fact that any feature flag implementation is going to try to make feature flagging as low cost as it can, in terms of checking and even in terms of implementation. But ultimately, feature flags are a cost. You can imagine, if your codebase had if-then checks throughout, how it eventually becomes more and more difficult to understand. So there's an important thing to keep in mind if your development team is deploying using feature flagging, for all the power it gives you. As your feature flags are being fully used, as you're running them up to 100 percent and you feel confident that things are all working, you need to be retiring those from your codebase over time. Some of the implementations of feature flagging have different support for this. So, you may have an easier or harder time finding out where all the feature flags are.
But ultimately, you need to check within some period to make sure that you've got all of your feature flags taken out. There may be some that you keep in your system for a longer time. There may be some you keep in perpetuity if you believe that they are things you want to toggle on and off for the long term. But for the most part, you need to be taking those out, maybe every 30 days, every 90 days, as they build up.
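One way to picture that periodic cleanup check is the Python sketch below. It assumes you keep some record of each flag's rollout percentage and the date it reached 100 percent; the record layout, the `permanent` marker, and the `flags_to_retire` helper are all hypothetical, and real feature flagging tools expose this kind of information in their own ways.

```python
from datetime import date, timedelta


def flags_to_retire(flags: dict, today: date, max_age_days: int = 30) -> list:
    """List flags that have been fully on for at least max_age_days.

    Flags marked permanent are skipped, since some toggles are kept
    in the system for the long term on purpose.
    """
    cutoff = today - timedelta(days=max_age_days)
    return sorted(
        name
        for name, info in flags.items()
        if info["percent"] == 100
        and not info.get("permanent", False)
        and info["at_100_since"] <= cutoff
    )


if __name__ == "__main__":
    # Illustrative data only.
    flags = {
        "new-checkout": {"percent": 100, "at_100_since": date(2024, 1, 5)},
        "dark-mode": {"percent": 20, "at_100_since": None},
        "maintenance-banner": {
            "percent": 100,
            "at_100_since": date(2023, 6, 1),
            "permanent": True,
        },
    }
    print(flags_to_retire(flags, today=date(2024, 3, 1)))
```

Running a report like this every 30 or 90 days gives the team a concrete list of if-then checks that are ready to be removed from the codebase.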
If you think about mobile versus just pure web deployment, there are some things that you want to keep in mind if you're using feature flagging. One of those is that you're going to need to synchronize your mobile front end and your back end. You may do that implicitly because you have the same user context on mobile and in the back end. You may need or want to do that explicitly. So maybe you let your mobile device pass feature flags back through to your back end to let it know what functionality it expects.
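The explicit version of that synchronization might look something like the Python sketch below, where the mobile client sends its active flags with each request and the back end reads them. The header name `X-Feature-Flags`, the comma-separated format, and the handler are assumptions made up for this illustration, not an established convention.

```python
FLAG_HEADER = "X-Feature-Flags"  # hypothetical header name


def encode_flags(enabled_flags: set) -> str:
    """Client side: serialize the flags the app currently has switched on."""
    return ",".join(sorted(enabled_flags))


def parse_flags(header_value: str) -> set:
    """Server side: recover the flag names, ignoring blanks and stray spaces."""
    return {name.strip() for name in header_value.split(",") if name.strip()}


def handle_cart_request(headers: dict) -> dict:
    """Back-end handler that shapes its response to what the client expects."""
    client_flags = parse_flags(headers.get(FLAG_HEADER, ""))
    if "new-cart" in client_flags:
        return {"layout": "new-cart", "items": []}
    return {"items": []}


if __name__ == "__main__":
    headers = {FLAG_HEADER: encode_flags({"new-cart", "dark-mode"})}
    print(handle_cart_request(headers))
    print(handle_cart_request({}))  # older app version that sends no flags
```

A request with no header at all gets the existing response, which matters for mobile because older app versions stay installed long after a new back end ships.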
So having talked about those two deployment strategies to support shift right testing, I just want to take a little bit of time to sort of compare and contrast them because they both deal with releasing new features to subsets of users and they both are about decoupling deployment to put software into production from release, making it visible to users.
So just you know, to go over it but not to belabor it. Dark launches are most often used for new application features, in contrast to canary releases which can certainly support new application features but are most useful and usually the only way you can support new infrastructure changes. Dark launches are typically looking at how a user responds to this feature and canary releases are often looking at how this affects the performance of the system because they're often involving back-end or infrastructure features.
The "dark" in "dark launch" comes from the fact that often users don't know that they're involved in this process. They are just going to see a new feature at some point. They don't know whether they have it or not at any given time. With canary releases, you may be more explicit about that: you may have people opt into a beta release that cycles them into the canary environment. But sometimes it operates the same way a dark launch does too.
Chapter 5 - Shift-Right Testing Practices – Monitoring
Having talked about those two deployment strategies, I'm going to talk about some monitoring strategies, how those are necessary, and how they can help you get the most out of your shift right testing. So, the first of those is user experience monitoring, and I don't know that there's any settled terminology for this. But what I mean by user experience monitoring is putting in place facilities so that you can monitor how people are using the system and effectively reconstruct behaviors and states from that.
Because if you're going to learn from users in production, you need to be paying attention to them. And again, there are lots of ways to do this. I have some screen grabs from a fellow Atlanta company's system. But in general, this is about instrumenting your system so that you keep track of your use cases and how users are interacting with your application, so that you can then replay or reconstruct what they were doing. That allows you to learn from production what people are doing live, and it also helps you in your own tests in production to know what was happening if something did go wrong.
For example, this type of monitoring can give you insights from production into debugging that may need to happen to resolve something that you've discovered. So, if you discover a failure in production through your automated testing or through your manual testing or exploratory testing, how are you going to then like quickly get to where that is in the application so that you can remediate it? Well, having this type of instrumentation and monitoring in your application will give you that.
And again, there are lots of ways to get this. If you're not using something built for it, often this can be accomplished through various event monitoring systems that you might have in place. If you introduce those into your mobile system, maybe you emit an event whenever a person goes into a screen or interacts with a feature in the system. And that brings us to what mobile considerations there are for this particular practice.
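As a simple picture of that kind of event instrumentation, here is a Python sketch that records an event per screen or interaction and then reconstructs what a given session did. The `Event` and `EventLog` names, the event naming scheme, and the in-memory storage are all assumptions for illustration; a real setup would send events to whatever monitoring system you have in place.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    session_id: str
    timestamp: float   # seconds since the session started
    name: str          # e.g. "screen:cart" or "tap:checkout" (illustrative scheme)


class EventLog:
    """Hypothetical in-memory stand-in for an event monitoring back end."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def track(self, session_id: str, timestamp: float, name: str) -> None:
        """Record one screen view or interaction."""
        self._events.append(Event(session_id, timestamp, name))

    def replay(self, session_id: str) -> List[str]:
        """Reconstruct, in time order, what one session did."""
        own = [e for e in self._events if e.session_id == session_id]
        return [e.name for e in sorted(own, key=lambda e: e.timestamp)]


if __name__ == "__main__":
    log = EventLog()
    log.track("s1", 0.0, "screen:home")
    log.track("s2", 0.5, "screen:home")
    log.track("s1", 4.2, "screen:cart")
    log.track("s1", 9.8, "tap:checkout")
    print(log.replay("s1"))
```

With a trail like this, a failure found by a production test can be traced back to the sequence of screens and interactions that led up to it.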
And the most important one is the need to instrument your application, because mobile applications, at least compared with traditional web applications, aren't necessarily going to keep track of what a user is doing natively. In contrast, a web application often tracks much of its user behavior through navigation through URLs, at least in more traditional web architectures, maybe less so in single-page apps today. But you're going to want to do that so you can keep track of what's going on. You're also going to need to consider all the many mobile devices, networks, and geographies that are going to interact with your system, and how you're going to get insights from those. In production, you'll get a lot of that from your actual users, but if you want to simulate it, you're going to need to think about how you get access to that. Maybe it's through Kobiton or one of the many other device providers out in the world.
So having talked about monitoring user experience, another element of monitoring that is important to have in place, if you're going to learn from your production environment or from shifting right, is application performance monitoring. And again here, I'm showing New Relic, which is a commercial solution. There are other commercial solutions and there are free open source solutions. I just have this here as an example of the sort of thing you want to have in place.
And when I say performance monitoring, I'm not just talking about performance in terms of speed, I'm also talking about performance in terms of system health. Are there errors occurring? What does transaction throughput look like? The reason you want to have this in place if you're going to be testing in production, if you're going to be shifting right, is that as you're applying either one of those deployment strategies, you're going to gradually see real users and your test scripts interacting with new code, and you want to be able to identify whether there's anything anomalous happening.
So for example, in this screen here, you can see, an error rate is spiking toward the end and maybe that correlates with a feature flag expanding. Maybe I've just bumped from 10% to 20% and maybe I just ran a test script against that environment to see how it's performing in production. If I see that error spike, then I may want to go back and halt my rollout and maybe investigate that in some way. Now, of course, there's always the possibility of some spurious correlation but it's a good thing to know and get out in front of if you can.
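That decision (keep expanding the rollout, or step back when the error rate spikes) can be sketched in Python as below. The stage list, the `decide` helper, and the thresholds are assumptions picked for the example; where the error rates come from depends on your monitoring tool, and as noted, a spike may still be a spurious correlation that needs a human look.

```python
STAGES = [1, 5, 10, 20, 50, 100]  # illustrative rollout percentages


def decide(current_percent: int, error_rate: float, baseline_rate: float,
           max_ratio: float = 2.0, min_rate: float = 0.01):
    """Return (action, percent) for the next step of a gradual rollout.

    The rollout steps back when the observed error rate exceeds both
    max_ratio times the baseline and an absolute floor of min_rate.
    """
    position = STAGES.index(current_percent)  # ValueError if not a known stage
    threshold = max(baseline_rate * max_ratio, min_rate)
    if error_rate > threshold:
        previous = STAGES[position - 1] if position > 0 else 0
        return ("rollback", previous)
    if position == len(STAGES) - 1:
        return ("hold", current_percent)
    return ("advance", STAGES[position + 1])


if __name__ == "__main__":
    # Error rate jumped after bumping from 10% to 20%.
    print(decide(20, error_rate=0.08, baseline_rate=0.01))
    # Error rate is close to baseline, so keep going.
    print(decide(10, error_rate=0.012, baseline_rate=0.01))
```

The absolute floor keeps a tiny baseline from triggering a rollback on noise alone; both numbers are tuning knobs rather than recommendations.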
Another thing that I'll note about some of these performance monitors is that they will let you create synthetic tests. These are tests that are not dissimilar from the automation that you're creating in your pre-production environments. But they can be deployed throughout, if it's a commercial tool, throughout their network and you can use those as ways to test the overall stability and performance of your production environment. You can go beyond just testing to see if it's alive because, I don't know about everyone, but I have been burned by applications that, in production, show themselves to be available and functional but a particular transaction is not working the way it should.
So these synthetic test scripts allow you to record and play back more elaborate transactions. As a slight nod to some of our products, I'll note that our Katalon Recorder will let you record these and then export them to a number of commercial products. New Relic is what's shown here, but we also do it for AppDynamics and Dynatrace. You can take those scripts, export them, and then see again in production what is happening around the world. So in this case, I'm seeing my response times from Singapore and Seoul and Sydney. And that gives me confidence as I'm running that transaction. Maybe it's a transaction that's testing some new functionality that I rolled out during my deployment. I can start to get some confidence about that in production by running that test.
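To show the difference between an "is it alive" check and a synthetic transaction, here is a small Python sketch that runs a multi-step transaction and times each step. It is not how New Relic, AppDynamics, Dynatrace, or Katalon Recorder implement synthetic tests; the `run_transaction` helper and the step functions are hypothetical, and in practice each step would drive a real request or UI action.

```python
import time
from typing import Callable, List, NamedTuple, Tuple


class StepResult(NamedTuple):
    name: str
    ok: bool
    seconds: float


def run_transaction(steps: List[Tuple[str, Callable[[], bool]]]) -> List[StepResult]:
    """Run steps in order, timing each one, and stop at the first failure."""
    results = []
    for name, step in steps:
        start = time.perf_counter()
        try:
            ok = bool(step())
        except Exception:
            ok = False  # an exception counts as a failed step
        results.append(StepResult(name, ok, time.perf_counter() - start))
        if not ok:
            break
    return results


def passed(results: List[StepResult], expected_steps: int) -> bool:
    """The transaction passes only if every expected step ran and succeeded."""
    return len(results) == expected_steps and all(r.ok for r in results)


if __name__ == "__main__":
    # Placeholder steps; the site is "up", but checkout is broken.
    steps = [
        ("load home page", lambda: True),
        ("add item to cart", lambda: True),
        ("check out", lambda: False),
        ("view receipt", lambda: True),
    ]
    results = run_transaction(steps)
    for r in results:
        print(f"{r.name}: {'ok' if r.ok else 'FAILED'} in {r.seconds:.4f}s")
    print("transaction passed:", passed(results, len(steps)))
```

In this example the first two steps succeed, so a simple availability probe would report the application as healthy, while the transaction as a whole fails at checkout.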
It's also important, particularly in the case of mobile applications, to be evaluating the performance of your APIs and not just your web application, because those are so critical to mobile performance. Again, in terms of mobile applications and the considerations to make in performance monitoring, it's the same ones that we talked about in user experience. You need to think about how you're going to instrument your mobile applications to collect data about their performance.
Again, lots of opportunities, lots of options from commercial and open source, but you want to be able to collect things like: how is the mobile system itself performing? When there are failures or crashes on the mobile system, are you able to collect those and learn from those? And again, are you able to get tests to mobile devices around the world on different networks, in different geographies? That, of course, brings us right back to Kobiton and what they provide for everybody. So we talked a little bit about shift right testing, what it is, why we do it, and we've talked about some considerations for how to collect as much information about that through monitoring as you can so you can learn.
I'm going to wrap up just by talking really briefly about a new product from my company Katalon that helps you monitor your test executions and that is Katalon TestOps. So, I'm not going to go into everything about TestOps. If you want to learn more about TestOps, I hope you can come by our booth and you can contact us. But I am going to talk about how it's relevant to what we've been talking about today and that is in terms of a couple of things.
One, we let you monitor and help you record the performance of your tests over time, and that performance is not only in terms of test passing, test failing, test having errors, but also in terms of the actual time taken to run your tests. So if you see something in your shift right testing, you can take a look back in history to see if there was anything that would have predicted this in the pre-production environments that I maybe just missed. And that's a really powerful capability. You can see here, we've got the test run times, going back in time for a number of tests.
We also do some analysis on tests in any of your environments such as: Are they flaky? Do they run consistently red? Do they run consistently green? Or do they go back and forth? Which tests are slow? And flakiness particularly, I think, is something that is helpful to be able to go back and take a look at if you maybe attribute some errors in pre-production environments to some environmental instability and let something go, only to find out that, in production, it exhibits similar flakiness that suggests you may need to come back and take a look at that test.
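The red, green, and back-and-forth distinction can be illustrated with a few lines of Python. To be clear, this is not how Katalon TestOps computes flakiness; the `classify` helper, the labels, and the rule of counting pass/fail flips are assumptions made up for this sketch.

```python
from typing import List


def classify(history: List[bool]) -> str:
    """Label a test from its run history, oldest first (True means passed)."""
    if not history:
        return "no data"
    if all(history):
        return "consistently green"
    if not any(history):
        return "consistently red"
    # Count how often the outcome changed between consecutive runs.
    flips = sum(1 for a, b in zip(history, history[1:]) if a != b)
    return "flaky" if flips >= 2 else "changed once"


if __name__ == "__main__":
    runs = {
        "login test": [True, True, True, True],
        "search test": [False, False, False],
        "checkout test": [True, False, True, True, False],
        "profile test": [True, True, False, False],
    }
    for name, history in runs.items():
        print(f"{name}: {classify(history)}")
```

A test that changed once (green then red, or the reverse) looks more like a real regression or a fix, while repeated flips in either pre-production or production are the pattern worth going back to investigate.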
And another thing that TestOps focuses on is integrating all of your tools, all of your testing strategies, and environments together. I would be remiss if I didn't point out that includes Kobiton. So we let you connect to your Kobiton account and then use that integration to simplify the automated execution of your tests against Kobiton devices. Again, that's just a cursory look at both how we help you with mobile testing and how we help you with researching issues that you may discover in shifting right. But please come talk to us. We'd love to tell you more about any of our products: TestOps, Katalon Recorder, Katalon Studio, and its integration with Kobiton as well. We'd love to just talk to you about testing in general and how we might be able to help.
Chapter 6 - Course Summary
Well done. You've finished 'Improving Mobile Quality with Shift Right Testing'. In this course, we have covered the fundamentals of shift right testing. Let's recap on the most important things to remember:
- What is shift right testing and why it matters
- How shift right practices differ from shift left testing
- What are the best practices for applying shift right testing with mobile testing
- How to effectively monitor your system performance and quality with New Relic and Katalon TestOps
So, what's next? Start applying shift right practices in your projects and tailor your testing strategies based on production data. Katalon Academy is working on bringing you even more useful test automation learning resources in the future. In the meantime, keep your automation game strong by exploring more relevant courses: click the courses button on the top menu bar. Thank you and happy learning.