Dev Diary: The Journey Of A Bug
No project is bug-free. While we do our best to prevent bugs from occurring in the first place, Travian has had and, unfortunately, will continue to have lovely bugs to fight with. When you encounter that dysfunctional button or a battle report which seems a little “off”, the journey of the bug begins! But what does this journey look like exactly? Join me, a former Travian QA tester, and let me tell you how each of our bugs reaches its goal of being squashed!
The road to squashing is different for every type of bug, but before the journey can begin, we must first meet our soon-to-be-squashed friend. If only we had some sort of crystal ball that revealed to our development team all those bugs hiding in our 14-year-old jungle of code, we could squash them before they bite! Unfortunately, this crystal ball doesn’t exist. Like the battle of the Heodenings in Norse mythology, this battle is never-ending, except we’re fighting nasty bugs, not Nordic warriors (something to be thankful for). While we’re always looking for new methods and weapons to use in this war, let’s have a look at what we have in our current toolbox.
The newest addition to our repertoire, automated testing, is exactly what the name suggests: tests performed by the computer, headless and automated. Once thought to be unachievable, automated testing was implemented in 2015 and now covers a whopping 80% of the game’s functions. For each and every build that we create, our suite of test plans is executed. If any errors are found, the QA team investigates. Automated tests have prevented many critical bugs from entering the game and allow our QA team to focus their attention on more worthwhile tasks.
Which brings us to our team of 5 dedicated QA testers. These guys are the vanguard in the battle, right on the front line. They are responsible for verifying new bugs, managing the existing bug database, testing the bug fixes, creating new automated test scripts, testing edge cases and much, much more. Our QA team rigorously performs the manual tests on each build and, if they don’t find any major issues, that build is then installed on the PTR.
The PTR is the final layer for finding bugs that slipped through the first two nets. Any bug found by players on the PTR server, COM80, is reported to the support team, which then forwards it to our QA team for verification. If there are no problems, the build is green-lighted for the live servers!
Which finally brings us to you guys, our players. In a perfect world we would catch every bug with our initial nets, but, of course, we can never catch them all. With our limited manpower and technical capabilities, alongside the complex nature of the game itself, some bugs are hidden so deeply in the jungle that only the combined activity of our community is enough to eventually discover them.
Once an issue has been discovered, it’s up to the QA team to verify it and confirm that it is indeed a bug. During this verification process, reproduction steps are also defined. These steps are important as they help our developers develop a fix more rapidly. Some bugs, however, just can’t be reproduced, despite our team’s best efforts. This is usually due to technical limitations, but because the game is complex and built upon years of unsophisticated code (Travian started as a hobby project), the reproduction steps sometimes remain a mystery.
So, the QA team has confirmed that this newly reported issue is indeed a bug. Now what? Before they can pass it on to the developers for fixing, it needs to be entered into the bug database. Several details are required before the QA database gatekeepers grant any bug entry, such as:
- The title of the bug
- Browser version
- A description with supporting images
- Reproduction steps
- Severity
- Priority
Let’s talk a bit more about those last two: severity and priority.
Severity is a classification based on the degree of impact the bug has on the game. How heavily does the bug interfere with game-play? Is a certain feature not working because of the issue? How important is this feature? What other departments are affected by this issue and how does it affect them?
Priority is pretty self-explanatory: it is how soon the bug needs to be resolved, based on the importance and urgency of the fix. Is it causing the game to be unplayable? Do we need to stop everything and get it fixed right away? Is it blocking a new feature from being released or affecting another department’s work? Is it only a minor issue with no major impact, such as a graphic problem?
These details need to be included in the bug task so that the team can approach bug fixing in an organized fashion. They allow our very limited developer capacity to be used in the most efficient and effective way, focusing on the most important bugs. If you find a bug one day and it still hasn’t been fixed weeks later, we probably know about it, but there are more important bugs that require our attention.
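To make the idea concrete, here is a minimal Python sketch of a bug record and a work queue ordered by classification. The class names, the level names on both scales and the sample bugs are all assumptions made up for illustration; they do not describe the actual bug database.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Impact on the game; higher is more disruptive (hypothetical scale)."""
    MINOR = 1
    MAJOR = 2
    CRITICAL = 3


class Priority(IntEnum):
    """Urgency of the fix; higher is sooner (hypothetical scale)."""
    LOW = 1
    NORMAL = 2
    HIGH = 3
    IMMEDIATE = 4


@dataclass
class BugReport:
    title: str
    browser_version: str
    description: str
    reproduction_steps: list[str]
    severity: Severity
    priority: Priority


def work_queue(bugs):
    """Order bugs by priority first, then severity, most important first."""
    return sorted(bugs, key=lambda b: (b.priority, b.severity), reverse=True)


# Made-up sample entries, only to show the ordering.
bugs = [
    BugReport("Icon slightly misaligned", "example browser 1.0",
              "Cosmetic glitch", ["Open the page"],
              Severity.MINOR, Priority.LOW),
    BugReport("Game cannot be loaded", "example browser 1.0",
              "Nobody can play", ["Log in"],
              Severity.CRITICAL, Priority.IMMEDIATE),
    BugReport("Report shows wrong numbers", "example browser 1.0",
              "Values look off", ["Open a report"],
              Severity.MAJOR, Priority.HIGH),
]

ordered_titles = [bug.title for bug in work_queue(bugs)]
print(ordered_titles)
```

Sorting on priority before severity is one possible choice; it reflects the idea that urgency decides what gets picked up first, with impact as the tie-breaker.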
These classifications, while mostly defined by QA, are not set in stone and can change over time. If any change of classification is requested, that bug is usually marked for triage.
Bug triage meeting
Bug triage meetings are meetings in which specific bugs are discussed by the attending stakeholders (technical support, customer service, game director etc.). Any stakeholder can mark a bug for triage, which gives the different departments a forum to discuss its current classification.
Usually, bugs are marked because a stakeholder would like to increase the priority. For example, say there is a minor graphical bug that was given low priority by the QA team. The graphic issue doesn’t impact game-play at all and a fix isn’t required urgently (or so we thought), so low priority seemed justifiable. Meanwhile, Frank from the customer service team receives hundreds of player tickets about the bug every day, asking for it to be fixed. This bug, while seemingly a minor issue, is costing the customer service team a lot of time answering these tickets, and the community’s desire for a fix is clearly significant. Frank would then mark the bug for triage so that he can argue his case for an increase in the bug’s priority during the meeting.
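Frank’s case can be sketched in a few lines of Python. This is only an illustration of the marking-and-deciding flow described above; the `Bug` class, the function names and the priority labels are hypothetical and not taken from the real tooling.

```python
from dataclasses import dataclass, field


@dataclass
class Bug:
    title: str
    priority: str = "low"
    # (stakeholder, reason) pairs; a non-empty list means "marked for triage".
    triage_marks: list = field(default_factory=list)

    def mark_for_triage(self, stakeholder, reason):
        self.triage_marks.append((stakeholder, reason))


def triage_agenda(bugs):
    """Return only the bugs that a stakeholder has marked for discussion."""
    return [bug for bug in bugs if bug.triage_marks]


def apply_decision(bug, new_priority):
    """Record the outcome of the meeting and clear the marks."""
    bug.priority = new_priority
    bug.triage_marks.clear()


graphic_bug = Bug("Minor graphical issue")
other_bug = Bug("Some unrelated bug")

# Frank argues that the ticket volume justifies a higher priority.
graphic_bug.mark_for_triage("Frank (customer service)",
                            "hundreds of player tickets every day")

agenda = triage_agenda([graphic_bug, other_bug])
apply_decision(graphic_bug, "high")
print(graphic_bug.priority)
```

Only the marked bug lands on the agenda, so the meeting stays focused on classifications that someone actually wants to change.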
With the bug database organized, our developers can focus their attention on the highest priority and most severe bugs. Of course, the time a bug takes to be fixed varies: it depends on the complexity of the problem and, sometimes, on something else, such as another bug or an upcoming tech update.
Once the bug is fixed, it’s merged into a new build, a.k.a. a game version. The QA team then performs what’s called regression testing on this build, which not only involves testing the bug again and running the automated tests but also manually testing any areas of the game that could have been affected by the code change. Think of the butterfly effect example of chaos theory – “A butterfly flapping its wings in Brazil causes a tropical storm in Australia”. Small changes in the code can build up and have very large effects elsewhere or even on the system as a whole.
Usually, bug fixes are applied in our regular weekly bug patch updates or included in the fortnightly feature updates. When a bug has immediate priority, though, it’s all hands on deck to get it fixed and the fix installed on the servers ASAP; such fixes are called hotfixes.
The final stage of our bug’s journey. This is where the fix for the bug reaches our players – the most important stage. A bug fixed in our developer environment may as well not be fixed at all if the fix isn’t on the live servers yet. Before the bug fix can be installed on all live servers, there are a few more steps that need to be taken.
- Regression tests
- Update of the PTR server
- Writing the change-log as well as translating it to all supported languages
- Partial roll-out on live servers (the build is applied to only a few servers first, to minimize unforeseen issues caused by the installation)
- Full roll-out on remaining servers
If any issues are found during any of these steps, the update of the game may be delayed further. This is why, when you find a bug, it isn’t fixed the next day. We wish it could be, though.
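The release steps above behave like a chain of gates: each one has to pass before the next begins. Here is a rough Python sketch of that idea. The step names mirror the list above, but the `release` function, the check callables and the build name are assumptions for illustration, not the real deployment tooling.

```python
from typing import Callable, Optional

# Each check takes a build identifier and returns True when the step went fine.
Step = Callable[[str], bool]

RELEASE_STEPS = [
    "regression tests",
    "PTR update",
    "change-log and translations",
    "partial roll-out",
    "full roll-out",
]


def release(build: str, checks: dict[str, Step]) -> tuple[list[str], Optional[str]]:
    """Run the release steps in order and stop at the first one that fails.

    Returns the steps that completed and the name of the blocking step
    (None when the build made it through every step).
    """
    completed: list[str] = []
    for name in RELEASE_STEPS:
        if not checks[name](build):
            return completed, name
        completed.append(name)
    return completed, None


# Placeholder checks: everything passes except the partial roll-out.
checks = {name: (lambda build: True) for name in RELEASE_STEPS}
checks["partial roll-out"] = lambda build: False

done, blocked_at = release("example-build", checks)
print(done)        # ['regression tests', 'PTR update', 'change-log and translations']
print(blocked_at)  # partial roll-out
```

Because the partial roll-out reports a problem, the build never reaches the remaining servers, which is exactly the point of applying it to only a few servers first.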
That concludes the journey of a bug from discovery to being fixed on the servers. We’re always looking for ways to improve our process, with the goal of minimizing bugs as well as fixing bugs as quickly as possible. Hopefully, you’ve enjoyed this journey and it has cleared up some of the questions you may have had about the bug fixing process. Thank you for reading! If you have any questions, please let us know.