Travian technology and the cloud
“By now we’ve moved all the game worlds to the cloud!” We received this statement towards the end of last year. But what actually is the cloud? Why has Travian been moved to the cloud? And most importantly: What benefits do our players have from this move?
Today Jörg Strathaus, the Chief Technical Officer at Travian Games, will answer these questions and more on the topic, as we’d like to offer you a look behind the scenes of Travian.
Hi Jörg, please introduce yourself briefly to our players.
Hello, my name is Jörg, I’m 53 years old and I studied computer science what feels like a gazillion years ago. I started working at Travian Games in 2012, first as an external advisor and project manager in “IT solutions”. With the reorganization of the IT department, I was asked if I was interested in taking on the role of CIO/CTO. Of course I was. So my position since January 2013 has been CIO/CTO.
What exactly is your job at Travian Games?
In simple, bold terms, my job is to ensure that our players are provided with a platform that allows our games to run smoothly. On the one hand, this entails all aspects of the infrastructure, meaning server operation and office operation; basically everything that is required for us to work properly as a gaming company and to operate our games. On the other hand, customer support and community management are also part of my responsibilities. That doesn’t sound very logical at first, but since I’ve got lots of experience in that field, too, it was an easy decision to connect these two separate fields.
A short while ago we had some problems with the Travian game worlds. How do you and your colleagues meet such challenges?
Firstly, it’s important to say that every single issue is one too many for us. Despite that, I’m quite pleased with the relatively low number of overall issues we’ve had.
In the very recent past, we’ve had some issues that repeatedly caused us a headache. Following a full-scale analysis, which also involved the manufacturers of some important components of our systems, we were able to solve those problems. How do we usually approach these kinds of issues? The team generally works incredibly hard on fixing any issues as quickly as possible and for the long term. Restoring the services is always the initial aim, yet we also focus on solving problems once and for all. The good thing about the move to the cloud is that we have so far been able to permanently solve all issues that have occurred. That is an amazing achievement by our team and I would like to say thanks once again for that.
Please could you explain “the cloud” to our players in non-techy speak?
Sure, that’s easy. Simply put, the cloud is no different from a large, physical server platform. Imagine a large hall full of servers. That is the platform, the basis of the cloud. However, we don’t use this platform as individual physical servers; instead, we set up virtual servers on it as required. That’s the idea behind the cloud. The bottom line is that the cloud is just computing power made available to us, where we can set up what’s necessary at any given time to keep our games running stably. That’s pretty much it.
Why was Travian moved to the cloud in the first place?
In the past couple of years, we have achieved a very stable environment. So your question as to why that was changed is a very valid one. As a matter of fact, our technological infrastructure had to be renewed to a great extent. Our servers simply got old. This gave us two alternatives: One option was to simply replace the old servers with new physical hardware. The other option was to fully revise our former concepts and to see if we could find an even more stable and reliable solution with greater flexibility and speed. That is exactly the idea of the cloud itself, and so we decided on a move to the cloud. The cloud’s advantages in flexibility, speed and scalability also make for improved performance. Stability in particular matters here, because we know it is of great importance to our players. All of this eventually led us to decide to go with the cloud.
How would you describe the move, or rather, what were/are the biggest challenges in your view?
That’s a very interesting question and one that only a few people will actually be aware of. “You’ve moved to the cloud!” always sounds nice, but it hasn’t always been an easy ride. Essentially, what we’ve done is set up a completely new and modern data center. So everything we had until then was duplicated on a new and more modern infrastructure. Our cloud was then built on that infrastructure, and everything formerly stored on our physical servers was moved across into this cloud. So there were three big challenges in total that we faced and solved. Initially we had to take care of some central database components. Overall, we can say that the implementation and game migration worked almost completely smoothly and unnoticed by our players. A few small impacts unfortunately cannot be prevented when faced with such a major task. I do still think that all in all it worked really well, and our players did not explicitly notice which game worlds were moved at any particular time. That was the biggest challenge. Of course there have also been systems that, due to their long history, were quite difficult to move across, posing another challenge just by themselves. But it all worked out nicely and I’m very happy about that.
There used to be many DDoS attacks on our servers. Has the move to the cloud rendered those less “damaging” now?
DDoS is, and always has been, an issue we have to fight. There’s no point in denying it: DDoS attacks are always damaging. But the real question is how to tackle such attacks. If we look at how we’ve dealt with them over the last few years, we have already reached a very good level. Only rarely have we had a DDoS attack that took us off the net completely. During the move to the cloud, we took further steps to protect our systems from DDoS attacks even better in the future. I don’t want to go into more detail here, since this is a sensitive issue. DDoS attacks are a factor and we are constantly working on improving our already well-positioned systems.
Despite a few hiccups, the move has been smooth and positive for our players. To what extent do you agree with me here and what exactly are the positive aspects of the move?
I can only repeat: Every interruption is one too many and we are fully aware of that. However, when you look at the complexity of the task and the way we have handled it, and then compare that with the actual hiccups, the result is very satisfying. We continue to work on this issue in order to make sure there will be even fewer problems in the future. Of course we also gain more experience with the cloud every day.
It’s worth keeping in mind that this whole project ran from April last year until the end of 2014. The busiest days were around Christmas time and New Year’s Day. Everyone in the company joined in and we completed the project successfully. I’m very proud of this amazing team effort.
Is there anything else you wish to tell our players?
Thank you so much for your loyalty and patience, despite a few hiccups. I’m fully aware that each of them is one too many and we are working very hard on avoiding such problems. I’m grateful for the feedback and constructive criticism that we have received from you. It’s really great and helps us to improve continually. Any further feedback is always welcome. Thank you!
Thank you for taking the time to chat.