My name is Sterling Selover, and I’m the owner of Stingbot Games, an independent game development studio based in the United States. We developed a game called The Forbidden Arts, which is scheduled to be released on Nintendo Switch, Xbox One and Steam on August 7, 2019. I’m going to discuss my 4+ month experience of porting The Forbidden Arts to the Switch, and what it took to achieve 60FPS on the console.
I can't count the times I've read the phrase “Lazy Developers” spammed throughout the comments section of a Nintendo Switch game review. It seems as though anything not created with pixel graphics or developed by Nintendo is begging for this comment. With such high expectations these days, it’s no wonder anything but stellar, high-budget games receive such criticism from the outspoken critics that frequent the comments section on all sorts of internet gaming outlets. But are all these game developers really lazy, or is the Switch just a tough platform to optimize for? Speaking from experience, I think it’s a bit more of the latter.
Sure, Nintendo has some great-looking first-party titles like Breath of the Wild and Super Mario Odyssey, but one must also consider the many factors that contribute to this, including the size of the teams developing these games, their understanding of the platform, and the target platform the game is being designed for. The most important factor is that Nintendo's games are designed specifically for the Switch, from the ground up.
Most of the people who worked on The Forbidden Arts are subcontractors or outsourcing studios, and I’m the only programmer at Stingbot. I’m one of those do-it-all indie developers. I’m a musician turned programmer who has been developing games and applications since 2011, and I’ve been doing this full time since around 2014. I have another company that develops all sorts of mobile games and apps, and I’ve personally worked on 20+ different applications for not only my companies, but others as well. I have a very good understanding of how to optimize for mobile hardware and I’ve done a lot of consulting work in the past, optimizing code for various companies and developers.
It’s public knowledge that the Switch is built on a mobile chip, NVIDIA’s Tegra X1, which was released to the mobile market quite a few years ago. There are many articles about the hardware that makes up the Switch, so I’m not going to dive into that here. For what it is, the Switch is impressive. Comparing the Switch to a PS4 or Xbox One is not fair, as the console is a hybrid system running on a mobile GPU, and its graphical capabilities are limited in comparison.
The Forbidden Arts targets 720p in handheld mode and 1080p in docked mode, which are the maximum resolutions the Switch supports in each mode. The game uses dynamic resolution and will go as low as 540p in handheld mode and 900p in docked mode.
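To make the dynamic resolution idea concrete, here is a minimal Python sketch of how a scaler might pick a render height between the floors and targets listed above. This is not the game's actual code (the game is built in Unity). The function name, the step size and the frame-time thresholds are all assumptions made for illustration; only the resolution bounds come from this article.

```python
# Hypothetical dynamic-resolution controller; not the game's actual implementation.
# The resolution bounds come from the article; thresholds and step size are assumed.

BOUNDS = {
    "handheld": (540, 720),   # (minimum height, target height)
    "docked": (900, 1080),
}

TARGET_FRAME_MS = 1000.0 / 60.0  # 60 FPS budget, about 16.7 ms per frame


def next_render_height(mode, current_height, gpu_frame_ms, step=36):
    """Return the render height to use for the next frame."""
    low, high = BOUNDS[mode]
    if gpu_frame_ms > TARGET_FRAME_MS:
        # Over budget: drop resolution to recover frame rate.
        current_height -= step
    elif gpu_frame_ms < 0.85 * TARGET_FRAME_MS:
        # Comfortable headroom: raise resolution back toward the target.
        current_height += step
    # Clamp to the range allowed for this mode.
    return max(low, min(high, current_height))
```

The gap between the "lower" and "raise" thresholds is there so the resolution doesn't flip back and forth every frame when the GPU time sits right at the budget.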
I used the Xbox One version of the game as a starting point for the Switch port. For context, on Xbox One X, The Forbidden Arts runs at 60 FPS at 4K dynamic resolution. When I begin a port, I like to see how powerful a device is and how well the game will run on the desired hardware at max settings, and then I decide what to downgrade to improve performance. Once I switched platforms in the game engine (Unity) and made some basic control adjustments, I built the first version of the game for Switch, installed it on the dev kit and fired it up. The game was running at an average of 13 FPS at 720p in handheld mode, going as low as 8 FPS in more taxing areas of the game. This actually didn’t surprise me. I had hoped I would be closer to the 30 FPS mark so I could maintain graphical consistency, but some sacrifices were going to be needed to achieve good results.
My code is heavily optimized, so the first thing I looked at was CPU usage. The game barely ever used more than 50% of the CPU, which was great! I’m not going to go into how to optimize code, as it was already optimized to begin with, and that could be a topic for another blog post. With the CPU that idle and the frame rate that low, I knew I was GPU bound, which basically means I was asking more of the Switch’s GPU than it could deliver. I did not expect the game to run at 60 FPS under the settings I had targeted for Xbox One X. I knew it wasn’t realistic. The game was utilizing Anti-Aliasing, high resolution shadows, dynamic lighting, lots of alpha, post processing and tons of custom materials and shaders that are not optimized for the Switch’s hardware.
The first thing I did was remove the Anti-Aliasing from the game, which resulted in an average increase of 10 FPS. The game went from 13 to 23 FPS on average. Custom materials and shaders were the next thing to go, which is very common with Switch games. I removed all normal, specular, emission and detail maps from the game, and replaced them with simpler diffuse shaders. Luckily the post processing effects kept the game looking relatively similar to its PC/Xbox One counterpart, so the ball was still in my court. These changes brought the game to an average of 32 FPS. I say average because some scenes dropped down to around 20 FPS, particularly the Overworld, with its animated grass, and any scenes with lots of alpha. Sadly, as much as I love the animated grass in the Overworld, it had to be removed for the Switch version. It simply killed the frame rate no matter what I tried, and if I wanted the game to play smoothly there just wasn’t going to be any other compromise.
Anything with a lot of transparency would drop the frame rate significantly. I knew I would have to come back to this. But since I was hitting 30+ FPS in most areas of the game, I decided to give the game some playtime and see how it felt.
I capped the FPS to 30, modified the controls to improve latency, and began playing through some of the game. It felt terrible. I was trying to convince myself that other games run at 30 FPS on the Switch, so it would be OK if mine did too. Well, that’s fine if other games do, but The Forbidden Arts is relatively fast-paced, and it requires very responsive controls, something that just can’t be achieved at 30 FPS. The game runs at 60 FPS on all other platforms, so I knew I had to make it run at 60 FPS on the Switch too. Further optimizations needed to be made.
Shadows were the next big change. The game was running real-time soft shadows at high resolution. I dropped the resolution down to medium and saw a huge performance boost. Then I dropped the resolution down to low and saw an even bigger performance boost. Keep in mind that, at this point, all testing had been conducted in handheld mode. If I could get the game to run well at 720p in handheld mode, then I assumed it should run well when the system is docked at 1080p with full GPU utilization (I was right). If you are not CPU bound, going docked gives you a massive performance boost. Lowering shadows from high to low resolution increased the average FPS to about 41-43. I was close! Less than 20 FPS to go!
Some Post Processing was the next to go. I modified the Bloom effect to be more efficient than the Standard Unity post processing Bloom, and I removed Ambient Occlusion. But now, with the post processing and material changes, the color grading looked off, so I had to redo it to achieve results that still looked nice. Here are a few screenshots showing the differences between the PC version of the game at 4K resolution and the Nintendo Switch version in Docked Mode at 900p resolution:
PC - 4K
Switch - 900p Docked
This type of modification really helped to maintain a relatively consistent look with the other versions of the game. Sacrificing Post Processing gave a huge boost in FPS. The game was now averaging 50-55 FPS. Ok so, what else could I do to get to 60?
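It helps to look at the numbers so far as frame times rather than frame rates, because the GPU budget is measured in milliseconds. The short Python sketch below converts the averages reported above into milliseconds per frame. The FPS figures come from this article; where I gave a range, the midpoint is used, which is an assumption, and the helper names are made up for this example.

```python
# Average FPS after each optimization pass, as reported in the article.
# Where the article gives a range, the midpoint is used (an assumption).
steps = [
    ("initial port", 13),
    ("anti-aliasing removed", 23),
    ("simplified shaders", 32),
    ("low-resolution shadows", 42),      # midpoint of 41-43
    ("post processing trimmed", 52.5),   # midpoint of 50-55
]


def frame_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


BUDGET_60 = frame_ms(60)  # about 16.7 ms per frame


def report(steps):
    """Return (label, ms per frame, ms saved vs previous step, ms over the 60 FPS budget)."""
    rows = []
    previous = None
    for label, fps in steps:
        ms = frame_ms(fps)
        saved = 0.0 if previous is None else previous - ms
        rows.append((label, round(ms, 1), round(saved, 1), round(ms - BUDGET_60, 1)))
        previous = ms
    return rows


for row in report(steps):
    print(row)
```

Seen this way, removing Anti-Aliasing saved roughly 33 ms per frame, while the post processing changes saved under 5 ms. Each step toward 60 FPS is a smaller slice of time, and at this point only about 2 to 3 ms of work per frame still had to be removed.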
I started running performance tests on the game in various areas that took a performance nosedive, and I came across a very common culprit: alpha and cutout shaders. Let’s discuss the first level of the game: Korrath Woods. There were 2 major culprits in this scene: grass and trees. The grass uses a custom shader for waving grass, and the trees are animated and use a cutout shader. This is not good design for the Switch, but as the game had already been in development for 2 years before the Switch was even publicly announced, I had not planned for these types of hardware challenges. I highly recommend creating new objects to replace objects that use cutout shaders or lots of alpha, but because I was already over budget and overextended on the game, I couldn’t go back to the drawing board and create new assets. Take note, though: diffuse shaders and opaque materials really are the way to go, especially if you are targeting 60 FPS.
Here’s a shot of one of the trees as well as a wireframe version so you can see all the faces that the cutout shader is using:
It’s definitely not a mobile-optimized model. If I had removed all of the faces, and instead created a tree that uses one continuous mesh and an opaque texture, there would have been a huge performance boost. Instead, I came up with a solution that a lot of games have used in the past: I created billboard trees to replace a lot of the 3D trees in the game. A billboard tree is just a quad with a texture applied to it. From a distance, it resembles a 3D tree model. Here is how I created the trees:
1. I isolated a tree in the scene and took a screenshot of it with a plain background behind it. I did this with the lighting, color grading and bloom effects applied, so that I could maintain color consistency on the unlit billboard texture.
2. I imported the screenshot into Photoshop.
3. I cropped out the tree and exported it with a transparent background as a PNG.
4. I imported the PNG back into Unity and created a material for the tree.
5. I placed quads throughout the scene at the proper aspect ratio and applied the billboard tree material to them.
From a distance, you can’t fully tell that the tree isn’t a 3D model. Obviously, it doesn’t animate, but any trees close to the player (within 30 units of the Camera) remain 3D and animated, and anything further out was replaced by billboard trees. Voila! 60 FPS! It wasn’t just trees that caused this issue, but anything with a lot of alpha, so I went through each scene of the game and performed these steps with every asset that contributed to what I like to call “the alpha problem.” It was quite a painstaking process, but I made it through.
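The two rules in that process, the 30-unit cutoff and the "proper aspect ratio" for each quad, can be written down in a few lines. The Python sketch below is only an illustration of the logic, not code from the game: I placed the quads by hand in the Unity editor, and the function names and the example texture size are hypothetical.

```python
import math

BILLBOARD_DISTANCE = 30.0  # units from the camera, per the article


def use_billboard(tree_pos, camera_pos, threshold=BILLBOARD_DISTANCE):
    """True if the tree is far enough from the camera to be drawn as a flat quad."""
    return math.dist(tree_pos, camera_pos) > threshold


def quad_size(texture_width_px, texture_height_px, world_height):
    """Size a quad to a chosen world height while keeping the texture's aspect ratio."""
    aspect = texture_width_px / texture_height_px
    return (world_height * aspect, world_height)


# Example with made-up values: a 512x1024 cropped tree image, 8 units tall in the scene.
width, height = quad_size(512, 1024, 8.0)
```

Keeping the quad at the same aspect ratio as the cropped image matters because a stretched billboard is much easier to spot than a flat one.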
There was a lot more I focused on beyond these changes, such as replacing water shaders with more optimized ones, redesigning entire sections of scenes that had too much geometry for the camera to draw, and modifying tons of particle effects to be more performance friendly, but I won’t go into all of that here. I spent hundreds of hours performing lots of various optimizations.
After all of these optimizations, most of the game was running at 60 FPS with about 90% GPU utilization on average. Fewer optimizations were done on the Overworld, as it’s not as crucial to maintain 60 FPS there, so lower frame rates are encountered in some parts of the Overworld, but the game never dips below 30 FPS, even in the most taxing areas. The Overworld runs at 648p in handheld mode and 900p while docked. In crucial gameplay moments, I worked very hard to maintain 60 FPS, and in all the platforming levels/dungeons the game strives to maintain a smooth 60 FPS at the targeted resolutions. When all is said and done, I’m happy with the outcome.
There is one other major culprit that needed to be addressed: loading times. Initially, the Overworld took 36 seconds to load, which was unacceptable for this type of game. It was a huge scene, but this wasn’t The Division 2 (that game takes 2 minutes to initially load on my PS4). I first went through all graphical assets and compressed large textures that really didn’t need to be as large as they were, from some particle sprite sheets to skyboxes, and this did shave off a little bit of loading time, but not enough. From experience, I knew it was most likely the audio causing most of the issues. First off, I had been loading audio from Unity’s Resources upon scene start, which was kind of redundant and stupid to begin with, but I’m only human. I rewrote the audio code to store audio directly in the scene, and I utilized some of the extra available CPU power to stream the audio, which allowed for much faster load times. The time it took to load the Overworld went from 36 seconds down to 11 seconds, which is the longest load time in the game. All other scenes load in under 10 seconds, and most of them in considerably less.
All in all, I spent 4 months working on the Switch port, while simultaneously continuing to improve and finish the game on other platforms. The Switch’s hardware is unique, and unless a game is designed specifically for the Switch, it’s probably going to be very hard to maintain consistency with versions running on more powerful consoles.
To conclude, the Nintendo Switch can be a challenging platform to port to with all of the graphical features available in modern game development engines, but it can be done with some sacrifices and a bit of hard work.