Game Backend Chronicles. Part #1: Player Request Handling

Introduction

Greetings, adventurers! I'm delighted that you've chosen to delve into this article. My name is Andrey, and I have over 10 years of experience in game development, participating in various aspects including frontend, backend, game design, and even art. I've noticed a lack of well-crafted articles discussing backend infrastructure for multiplayer games, which motivated me to embark on this writing journey.

This article is the first part of a series where we'll explore backend game development. Today, we'll delve into the initial step of developing a backend for your game: processing players' requests.

How do we communicate with the server?

A crucial preliminary step in the backend development of a multiplayer game involves processing player-generated requests. These can range from elementary operations, such as logging into the world or creating a character, to more intricate maneuvers like navigating the game environment and using abilities. Collectively, these tasks fall within the domain of "processing user input."

Effective processing of user requests necessitates the integration of a suitable network solution into your game. This can be either a proprietary or open-source system and serves to facilitate the receipt and handling of user requests. Broadly, there are three primary communication flows to consider:

Uni-Directional Communication (API)

The simplest way to implement multiplayer for a game is to provide a specific request-response API that the client can use to reflect the changes made by the player in the game. In this model, the client initiates requests to the server, and the server responds. If an update is needed, the client queries the server, which returns any available updates since the last time updates were requested.
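
To make the request-response flow concrete, here is a minimal in-memory sketch. The `GameApi` class, its method names, and the sequence-number scheme are assumptions made for illustration; a real backend would expose these as HTTP endpoints and persist the event log.

```python
from dataclasses import dataclass, field


@dataclass
class GameApi:
    """In-memory stand-in for a request-response game API (hypothetical)."""
    events: list = field(default_factory=list)  # append-only log of changes

    def post_action(self, player_id: str, action: str) -> dict:
        """Handle a client request and record the resulting change."""
        seq = len(self.events) + 1
        self.events.append({"seq": seq, "player": player_id, "action": action})
        return {"ok": True, "seq": seq}

    def get_updates(self, since_seq: int) -> list:
        """Return every event recorded after the sequence number the client last saw."""
        return [event for event in self.events if event["seq"] > since_seq]
```

The client remembers the highest `seq` it has processed and passes it as `since_seq` on the next request, so the server only returns what the client has not seen yet.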

Many mobile games implement multiplayer asynchronously: players don't play with each other directly but can interact with other players' possessions, such as visiting other players' farms in Hay Day or attacking other players' bases in Clash of Clans. The rest of the time, players focus on improving their defenses or their own "farm," which only the owner can change and which the server doesn't modify on its own (with some exceptions, of course).

Pros:

  1. Easy to implement, document, and debug.

Cons:

  1. Lacks a straightforward method to receive immediate server updates.

  2. Generates more traffic than a socket connection: every HTTP request and response carries textual headers in addition to the actual payload, which increases request and response sizes.

Extending Uni-Directional Communication to enable Server-to-Client Updates

Most multiplayer backends built around a request-response API still need a way to send updates to the client. For example, Hay Day (a farming game) has a market feature where players can visit other players’ farms and buy goods from their markets. When somebody buys an item from your market, you get money and a notification that the item has sold. Updating the player’s balance on the client as soon as possible is crucial because it directly affects gameplay. Therefore, we need a mechanism for letting the player know that something in the game changed without their input.

There are a few mechanisms to achieve that.

Polling

Polling is a technique used in client-server communication where the client repeatedly sends requests to the server at regular intervals to check for updates or changes. It involves the client actively checking with the server to see if there is any new information available. This can result in increased network traffic and server load compared to other communication methods.
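
A polling client can be sketched in a few lines. The `poll` function, its parameters, and the `seq` field on each update are hypothetical names chosen for this example; `fetch_updates` stands in for whatever HTTP call your client makes.

```python
import time
from typing import Callable


def poll(fetch_updates: Callable[[int], list],
         handle: Callable[[dict], None],
         interval_s: float,
         max_polls: int,
         sleep: Callable[[float], None] = time.sleep) -> int:
    """Ask the server for updates `max_polls` times, pausing between requests.

    Returns the highest sequence number seen so far.
    """
    last_seq = 0
    for attempt in range(max_polls):
        # The request is sent even when the server has nothing new.
        for update in fetch_updates(last_seq):
            handle(update)
            last_seq = max(last_seq, update["seq"])
        if attempt < max_polls - 1:
            sleep(interval_s)
    return last_seq
```

The `interval_s` argument is the knob discussed in this section: a shorter interval means fresher data but more empty requests.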

Pros:

  1. Simple implementation: Polling is relatively straightforward to implement as it involves making periodic requests from the client.

  2. Compatibility: Polling can be used in various environments, including web applications and IoT devices.

  3. Control over the frequency: The client can control the frequency of polling based on its needs, adjusting the interval between requests.

Cons:

  1. Increased network traffic: Polling generates additional network traffic since the client sends requests even when there may be no updates available.

  2. Delayed updates: With polling, there is a delay between the time an update occurs on the server and when the client receives it, depending on the polling interval.

  3. Resource consumption: Frequent polling can put a strain on server resources, especially in scenarios with a large number of clients or when updates are infrequent.

In summary, polling offers a simple approach for checking for updates in client-server communication but comes with drawbacks such as increased network traffic, delayed updates, and potential resource consumption issues. For real-time applications or scenarios with a need for instant updates, alternative techniques like long polling or full-duplex communication are often preferred.

Long Polling

Long polling works by having the server hold the client's request open. The client sends a request to the server, and instead of responding immediately, the server waits until it has new information or an update to provide. Once there is an update, the server responds to the client with the new data. If no updates occur within a specified time period, the server sends a response indicating that there are no changes. In either case, the client immediately sends a new request, so there is almost always one request waiting on the server.

In short, with long polling the server holds each request open until it has new information to deliver, which allows near-real-time updates without frequent polling from the client.
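
The server side of long polling can be sketched with a condition variable. `UpdateChannel` and its methods are hypothetical names; in a real server, `wait_for_updates` would run inside the handler of the held HTTP request, one channel per player.

```python
import threading


class UpdateChannel:
    """Holds a client's request open until an update arrives or a timeout expires."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._updates: list = []

    def publish(self, update: dict) -> None:
        """Called by game logic when something changes for this player."""
        with self._cond:
            self._updates.append(update)
            self._cond.notify_all()

    def wait_for_updates(self, timeout_s: float) -> list:
        """Block until at least one update is queued; return [] on timeout."""
        with self._cond:
            self._cond.wait_for(lambda: bool(self._updates), timeout=timeout_s)
            pending, self._updates = self._updates, []
            return pending
```

An empty list is the "no changes" response; the client is expected to send the next request right away in both cases.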

Pros:

  1. Real-time updates: Long polling enables near-real-time communication, allowing the server to push updates to the client as soon as they are available.

  2. Reduced network traffic: Compared to traditional polling, long polling reduces unnecessary requests, resulting in lower network traffic.

  3. Efficient resource utilization: Long polling minimizes server load by holding requests open until updates are available, reducing the need for continuous processing.

Cons:

  1. Increased server complexity: Implementing long polling requires additional server-side logic and handling to manage open connections and deliver updates.

  2. Scalability challenges: Handling a large number of concurrent long-polling connections can pose scalability challenges for the server, requiring careful resource management.

In summary, long polling allows for real-time updates with reduced network traffic and efficient resource utilization. However, it introduces complexity and potential delays in the initial response, which need to be considered when implementing this technique. Scalability considerations should also be taken into account for applications that expect a high volume of long-polling connections.

Third-Party Messaging Service

Another good option to facilitate bi-directional communication without altering the request-response approach is by incorporating a third-party messaging service. This service would be responsible for delivering updates to the player.

For instance, one possible solution is to establish a connection between the client and an MQTT server upon launching the game. Once the connection is established, the MQTT server can relay all events to the player whenever an event is sent by someone (in this case, the game server). Consequently, the client will communicate with the game server through the existing API, while the game server can use the MQTT server to notify the client of any updates or events by sending them through the established connection.

Implementing this approach is relatively simple, as there are numerous messaging service options available. One well-known solution is RabbitMQ, an open-source, lightweight, and easily deployable message broker that supports MQTT through a plugin.
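
The shape of this setup can be illustrated without a real broker. The `InMemoryBroker` class below is a toy stand-in written for this example, not an MQTT client, and the `players/<id>/events` topic layout is an assumption.

```python
from collections import defaultdict
from typing import Callable


class InMemoryBroker:
    """Toy publish/subscribe broker; a real deployment would use an external broker."""

    def __init__(self) -> None:
        self._subscribers: dict = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[dict], None]) -> None:
        """Client side: register interest in a topic when the game launches."""
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver the message to every subscriber; return how many received it."""
        callbacks = self._subscribers[topic]
        for callback in callbacks:
            callback(message)
        return len(callbacks)


def notify_item_sold(broker: InMemoryBroker, seller_id: str, coins: int) -> int:
    """Game-server side: push an event to one player's private topic."""
    topic = f"players/{seller_id}/events"
    return broker.publish(topic, {"type": "item_sold", "coins": coins})
```

The client keeps using the regular API for its own requests; the broker is only the return path from the game server to the player.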

Push Notifications

Push notifications can also be used for internal packets in mobile games to provide real-time updates and notifications to players. While push notifications are commonly associated with displaying alerts and messages to users, they can also be utilized to deliver internal game-related information.

For example, in a multiplayer mobile game, push notifications can be sent to players to notify them of in-game events, such as the start of a new match, the completion of a task, or the availability of rewards. These notifications can serve as a means to engage players and keep them informed about important game-related updates even when the game is not actively running on their devices.

When a push notification is received while the game is active, the game can intercept and process the notification data within its code. The game can then determine how to present the information to the player or incorporate it into the ongoing gameplay experience.

Bi-Directional Communication (Socket)

This type of communication channel supports simultaneous data transmission and reception on both ends. Often referred to as a "socket", full-duplex mode allows two communicating parties to send and receive data concurrently, enabling seamless, real-time interaction.

Most real-time games like MMORPGs, FPS games, and RTS games require extensive communication between the client and server, heavily relying on information from the server. Players need to see other players, creatures, and NPCs, while the server also needs to constantly notify all players about changes in the game environment, such as other players' movements and ability usages.

Therefore, in contrast to the Request-Response approach, these types of games require the ability for both sides to send continuous updates to each other, making the full-duplex or bi-directional approach most suitable for them.
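
A tiny demonstration of full-duplex behavior, using a connected socket pair from Python's standard library in place of a real network connection. The message contents and the newline-delimited framing are made up for the example.

```python
import socket


def read_line(sock: socket.socket) -> bytes:
    """Read from a stream socket until a newline delimiter arrives."""
    buffer = b""
    while not buffer.endswith(b"\n"):
        chunk = sock.recv(1)
        if not chunk:  # peer closed the connection
            break
        buffer += chunk
    return buffer


def full_duplex_demo() -> tuple:
    """Both ends write before either one reads."""
    client, server = socket.socketpair()
    try:
        client.sendall(b"MOVE 10 20\n")       # client -> server
        server.sendall(b"NPC_SPAWNED 42\n")   # server -> client, unprompted
        return read_line(server), read_line(client)
    finally:
        client.close()
        server.close()
```

The server's message is not a response to anything the client asked for, which is exactly what a plain request-response API cannot express.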

Pros:

  1. Efficient at handling frequent data packet transfers.

  2. Enables instant data transmission from both ends.

Cons:

  1. Implementation and maintenance can be challenging due to the depth of networking knowledge required.

  2. Loss of network connection necessitates a robust recovery mechanism to restore the game's state.

  3. Depending on the chosen network solution, you might have to oversee TCP/UDP implementation, data compression, and failure handling.

Examples:

  • Manually-Operated Socket Connection

  • Any Socket Library/Framework

  • WebSocket

  • WebRTC (Web Real-Time Communication Protocol)

  • Message queuing systems such as Apache Kafka, RabbitMQ, or ActiveMQ, and messaging libraries such as ZeroMQ

  • RPC frameworks such as gRPC or Apache Thrift

Packets and their characteristics

Once the networking solution has been chosen and implemented, the next crucial step is determining the type of data to be transmitted between the player and the server. When a game like an MMORPG is built, there's a lot of information or 'packets' being sent from the player to the game server. Some examples are:

  • Logging into the game (Authentication Packet)

  • Moving your character (Movement Packet)

  • Casting a spell (Use Spell Packet)

  • Buying an item from an in-game shop (Buy Item From NPC Packet)

Each of these packets, transmitted from the player to the server, is essential for the gameplay. However, notable differences exist among them in terms of frequency, complexity, importance, and priority.
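
As an illustration of what such packets might look like on the wire, here is a sketch of a small binary format. The numeric type IDs, the 3-byte header layout, and the two-float movement payload are all assumptions for this example, not a format the article prescribes.

```python
import struct
from enum import IntEnum


class PacketType(IntEnum):
    AUTHENTICATION = 1
    MOVEMENT = 2
    USE_SPELL = 3
    BUY_ITEM_FROM_NPC = 4


# 1-byte packet type followed by a 2-byte payload length, network byte order.
HEADER = struct.Struct("!BH")


def encode(packet_type: PacketType, payload: bytes) -> bytes:
    """Prefix the payload with its type and length."""
    return HEADER.pack(packet_type, len(payload)) + payload


def decode(data: bytes) -> tuple:
    """Return (packet_type, payload); raise ValueError if the payload is truncated."""
    type_id, length = HEADER.unpack_from(data)
    payload = data[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated packet")
    return PacketType(type_id), payload


def encode_movement(x: float, y: float) -> bytes:
    """A movement packet carrying a 2D position as two 32-bit floats."""
    return encode(PacketType.MOVEMENT, struct.pack("!ff", x, y))
```

The type byte lets the server route each packet to the right handler before it looks at the payload.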

Frequency

The first difference is frequency - how often these requests are sent. The frequency of packets varies significantly. For instance, Movement packets may be transmitted several times per second as a player navigates the world. In contrast, the Authentication packet is sent only once, when the player opens the game and initiates a gaming session.

Frequency is usually the most significant factor in your game server's performance; even if a request is fairly simple, your server might struggle to handle millions of them.

Complexity

The next difference is complexity; it involves how much work the server has to do when it receives a packet. The concept of complexity can be seen as a performance multiplier to the frequency of your requests. The more processing each packet requires, the greater its impact on performance.

Each packet sent over the network necessitates a certain amount of processing on both the client and server sides. This processing could involve a variety of operations, such as checking data integrity, converting data formats, querying a database, or performing complex computations. The more complex these operations, the longer it takes to process each packet.

For example, both the Authentication and "Buy Item From NPC" packets are sent infrequently, but they aren't equally complex.

  • The Authentication packet involves searching for the player's login information in the database, checking the username and password, and loading ALL the player's data into the server's memory. It also needs to sync the player's data with the client, which means sending most of this data over the internet to the client.

  • The "Buy Item From NPC" packet, on the other hand, works with data already in memory and might not even need a response from the server. This makes it much less complex.

So, when designing network communication for a game server, it's crucial to consider both the frequency and the complexity of each packet type. Balancing these factors effectively will help to ensure efficient and smooth operation.
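
A back-of-the-envelope estimate shows how frequency and complexity multiply. Every number below is an illustrative assumption: the player count, the per-player rates (picked to match "a few times a second / minute / hour"), and the relative cost units.

```python
# All numbers are illustrative assumptions, not measurements.
PLAYERS_ONLINE = 10_000

# name -> (packets per player per second, relative processing cost per packet)
PROFILE = {
    "authentication": (1 / 3600, 4.0),     # roughly once per hour-long session
    "movement": (3.0, 2.0),                # a few times a second
    "use_spell": (3 / 60, 8.0),            # a few times a minute
    "buy_item_from_npc": (3 / 3600, 1.0),  # a few times an hour
}


def relative_load(players: int, rate_hz: float, cost: float) -> float:
    """Server load in arbitrary units: how often, times how expensive."""
    return players * rate_hz * cost


def load_by_packet(players: int = PLAYERS_ONLINE) -> dict:
    """Estimated relative load for every packet type in PROFILE."""
    return {
        name: relative_load(players, rate_hz, cost)
        for name, (rate_hz, cost) in PROFILE.items()
    }
```

Under these assumptions Movement dominates the total even though each Movement packet is cheaper than a Use Spell packet, which is why frequency deserves attention first.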

Importance

Another difference is how crucial the packets are. Imagine if we can't guarantee that all packets will reach the server. Some might get lost due to a bad internet connection. Are there any packets we can afford to lose? Yes - if a Movement packet doesn't get through, it might make the player's movement a bit choppy, but it won't ruin the game. Losing any of the other packets, though, would be seen as a bug and would likely frustrate the player.

There are two different network protocols that one can use when building a game server: Transmission Control Protocol (TCP) and User Datagram Protocol (UDP).

TCP:

  1. Reliability: TCP guarantees that data is delivered. It uses a system of acknowledgments and retransmissions to ensure no data is lost or corrupted during transmission. This is ideal for data that must arrive intact, such as game assets or text communication.

  2. Ordering: TCP guarantees that packets will be delivered to the recipient in the same order they were sent. This is crucial for some types of data but less so for others.

  3. Connection-oriented: TCP requires a connection to be established before data can be sent. This adds to the initial latency, but once the connection is established, data transmission can proceed.

  4. Congestion Control: TCP has built-in mechanisms to avoid network congestion, slowing down the transfer rate if the network is congested, which could introduce latency and impact real-time gameplay.

UDP:

  1. Speed: UDP is faster than TCP because it doesn't require an established connection before data is sent, and it doesn't wait for acknowledgments. This can result in faster, more real-time communication, which is crucial for real-time multiplayer games.

  2. No Guaranteed Delivery: UDP doesn't guarantee that packets will be delivered or that they will arrive in the correct order. This might seem like a downside, but in a real-time game environment, it's often better to miss a packet or two than to experience lag from waiting for a resend. For example, in a fast-paced game, if a packet representing a player's position gets lost, it's less important than getting the most recent position data quickly.

  3. Connectionless: UDP is connectionless, meaning it doesn't establish or maintain a connection between the sender and receiver. This can reduce overhead and latency, making it more suitable for real-time applications.

  4. Customizable: Although UDP itself does not provide any reliability mechanisms, it is possible to implement custom reliability features on top of UDP. For example, you can choose to resend important packets if acknowledgments are not received.
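
The "resend if no acknowledgment arrives" idea from the last point can be sketched as follows. `ReliableSender`, its method names, and the 0.25-second resend interval are assumptions for illustration; the actual sending over a UDP socket is left out, and time is passed in explicitly so the logic is easy to test.

```python
from dataclasses import dataclass, field


@dataclass
class ReliableSender:
    """Tracks unacknowledged packets and reports which ones are due for a resend."""
    resend_after_s: float = 0.25
    _next_seq: int = 1
    _pending: dict = field(default_factory=dict)  # seq -> (payload, last_sent_at)

    def send(self, payload: bytes, now: float) -> int:
        """Register an important packet and return its sequence number."""
        seq = self._next_seq
        self._next_seq += 1
        self._pending[seq] = (payload, now)
        return seq

    def ack(self, seq: int) -> None:
        """The receiver confirmed this packet; stop tracking it."""
        self._pending.pop(seq, None)

    def due_for_resend(self, now: float) -> list:
        """Return (seq, payload) pairs that went unacknowledged for too long."""
        due = []
        for seq, (payload, sent_at) in sorted(self._pending.items()):
            if now - sent_at >= self.resend_after_s:
                self._pending[seq] = (payload, now)  # restart the timer
                due.append((seq, payload))
        return due
```

Only packets that matter need to go through this path; movement updates can skip it entirely and stay fire-and-forget.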

Having those protocols in mind, one can accommodate both types of data in game development: data that can afford to be lost (like a movement packet), and data that must be guaranteed to reach the server (like a "use spell" packet). This is achieved by using a mix of both TCP and UDP protocols in the same game:

  1. UDP for time-sensitive but non-critical data: As discussed before, UDP is ideal for transmitting data that is highly time-sensitive and needs to be delivered with minimum latency, such as the position of a player (movement packet) in an MMORPG. If a movement packet is lost or arrives late, the impact is usually negligible because subsequent packets will correct any discrepancy.

  2. TCP for critical data that must be delivered reliably: TCP, on the other hand, provides guaranteed delivery of data, making it suitable for important actions that can't afford to be missed, such as a player casting a spell (use spell packet). If such a packet were lost, the player's action would be entirely missed, significantly affecting the gameplay experience.

By using TCP and UDP together, you can ensure that all the crucial information is reliably delivered, while still keeping the game responsive and real-time for players.
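
One way to sketch the hybrid setup is a routing table plus a receiver that ignores stale movement packets. The table contents, the fallback to TCP for unknown types, and the per-packet sequence number are assumptions made for this example.

```python
RELIABLE = "tcp"
UNRELIABLE = "udp"

# Which channel each packet type travels on (hypothetical assignment).
TRANSPORT_BY_PACKET = {
    "authentication": RELIABLE,
    "use_spell": RELIABLE,
    "buy_item_from_npc": RELIABLE,
    "movement": UNRELIABLE,
}


def transport_for(packet_type: str) -> str:
    """Unknown packet types fall back to the reliable channel."""
    return TRANSPORT_BY_PACKET.get(packet_type, RELIABLE)


class LatestMovement:
    """Keeps only the newest position; late or duplicated packets are dropped."""

    def __init__(self) -> None:
        self.last_seq = 0
        self.position = None

    def on_movement(self, seq: int, x: float, y: float) -> bool:
        """Apply the packet if it is newer than the last one; return whether it was applied."""
        if seq <= self.last_seq:
            return False
        self.last_seq = seq
        self.position = (x, y)
        return True
```

Because every accepted movement packet overwrites the previous position, a lost packet is corrected by the next one that arrives.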

However, whether you need to differentiate your requests really depends on your game's specific requirements. A hybrid solution is typically utilized in First-Person Shooter games like Battlefield, whereas most MMORPG games are perfectly content using TCP to guarantee no packet loss, even if it results in occasional player-side lag. If your game demands extremely frequent updates, you might want to consider UDP; otherwise, sticking with TCP could be a simpler and more reliable option.

Priority

The final difference is the priority - this isn't the same as "importance". In all the network systems I've worked with, packets aren't processed right away in the same thread. Instead, they're put in a packet queue and dealt with later by a worker thread pool. Sometimes, it makes sense to use a priority queue instead of a normal one. This is a queue that takes into account the priority of each item and processes the most important ones first.

Looking at our list of packets, I'd say that the "Use Spell" packet should have a higher priority than the "Movement" and "Buy Item From NPC" packets. In real-time MMORPG games, timing is crucial, especially for spells. So, the "Use Spell" packet should be processed as soon as possible for a smooth and responsive fight. On the other hand, a small delay when buying items from an NPC might be a bit annoying, but it won't ruin the game.

Therefore, games such as World of Warcraft should prioritize packets to keep combat smooth, even if other parts of the gameplay lag from time to time.
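
A priority queue for packets can be sketched with Python's `heapq`. The numeric priorities are assumptions that mirror the ordering argued for above, and Authentication is left out on purpose because its priority is discussed separately.

```python
import heapq
import itertools

# Lower number = processed sooner (hypothetical values).
PRIORITY = {"use_spell": 0, "movement": 1, "buy_item_from_npc": 2}


class PacketQueue:
    """Priority queue that stays first-in, first-out within the same priority."""

    def __init__(self) -> None:
        self._heap: list = []
        self._counter = itertools.count()  # tie-breaker that preserves arrival order

    def push(self, packet_type: str, payload: str) -> None:
        entry = (PRIORITY[packet_type], next(self._counter), packet_type, payload)
        heapq.heappush(self._heap, entry)

    def pop(self) -> tuple:
        """Remove and return the most urgent (packet_type, payload)."""
        _, _, packet_type, payload = heapq.heappop(self._heap)
        return packet_type, payload

    def __len__(self) -> int:
        return len(self._heap)
```

The arrival counter matters: without it, two Movement packets from the same player could be processed out of order.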

Prioritizing With Care

Let's think more deeply about priorities, using the Authentication packet as an example.

It's important to get the player into the game quickly. But is that more important than making sure the players who are already in the game have a smooth experience? That's debatable. Consider this:

  • Typically, the login request is one of the most demanding operations (in terms of data traffic) and can be a time-consuming process. Would it make a big difference if players have to wait a bit longer when they log in, if it means they have a smoother experience later? Probably not. So, we might choose to let them wait a little longer at this point to give them a better experience later.

  • Now, consider a scenario where we make the Authentication packet the highest-priority packet in the game. Doing this would impact the gameplay of those already in the game, and if there are numerous concurrent Authentication requests, existing players might experience significant lag until these requests are handled. This can trigger an unfortunate chain reaction that's challenging to mitigate:

    • Players logging into the game cause lag for existing players.

    • Existing players, experiencing this lag, suspect an issue with their internet connection and decide to restart the game, hoping it resolves the problem.

    • This causes even more players to log into the game simultaneously, worsening the lag for existing players.

    • The cycle repeats 😄

Thus, always handle packet priorities with due consideration and foresight, keeping in mind the potential ripple effects your choices may have.

Summary

Just for the record, let's summarize all the points we collected:

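| Packet            | Frequency                               | Complexity | Importance    | Priority |
|-------------------|-----------------------------------------|------------|---------------|----------|
| Authentication    | Very low (once / player)                | High       | Important     | -        |
| Movement          | Very High (few times a second / player) | Medium     | Not Important | Medium   |
| Use Spell         | Medium (few times a minute)             | Very High  | Important     | High     |
| Buy Item From NPC | Low (few times an hour)                 | Low        | Important     | Low      |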

Handling Desynchronization

Once you've set up your network solution and included a few requests for your backend, you'll soon encounter the issue of error handling.

For instance, let's say a player tries to purchase an item from an NPC, but doesn't have enough money for the transaction. Or perhaps a player is moving around the world, and the client, for some reason, assumes the player has teleported to the city without using any abilities or items to do so.

Clearly, the server should treat this as either a hack attempt or an error, and shouldn't fulfill the player's request (either to acquire the item or to teleport to the city). However, this needs to be communicated to the player, as the client might think everything went smoothly and that the player has already arrived in the city. This is known as desynchronization, where the player's or world's state differs between the client and server sides.

To rectify or prevent desynchronization, you can consider different strategies:

Restart

The easiest way to synchronize the states is to simply reload the game. When logging into the game, the server sends the current state of the player and the surrounding world. So, if the server reports an error, the client should simply restart the game session (reload the game world), starting with a fresh state loaded from the server.

Pros:

  • Straightforward to implement

Cons:

  • This can result in a frustrating experience for the player. Depending on the game, the time it takes to load the world from scratch can range from one second to one minute. Imagine if errors occur at least once per hour. Your players would be quite annoyed by these restarts. Furthermore, in real-time games like World of Warcraft where many important battles occur and a player's quick reaction is key, a restart that lasts 30 seconds could lead to the player's defeat, causing them to become quite frustrated with the game.

Server Validation

Another possible approach involves allowing the server to conduct all validations upfront. For example, each time a player attempts to purchase something from an NPC, the client sends a request to the server and waits for a "Success" or "Failure" response. By doing this, you eliminate the chance of desynchronization, as the client won't alter any state until the server approves the change.
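
A server-side handler for the purchase example might look like the sketch below. The `Player` fields, the item IDs, the prices, and the response shape are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Player:
    gold: int
    inventory: list = field(default_factory=list)


# Hypothetical NPC shop prices.
SHOP_PRICES = {"health_potion": 50, "iron_sword": 300}


def handle_buy_item(player: Player, item_id: str) -> dict:
    """Validate the purchase on the server and change state only on success."""
    price = SHOP_PRICES.get(item_id)
    if price is None:
        return {"status": "Failure", "reason": "unknown_item"}
    if player.gold < price:
        return {"status": "Failure", "reason": "not_enough_gold"}
    player.gold -= price
    player.inventory.append(item_id)
    return {"status": "Success", "gold": player.gold}
```

The client applies the purchase locally only after it receives "Success", so the two sides cannot disagree about the player's gold or inventory.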

Pros:

  • Easy to implement

  • Validation logic is entirely server-side

Cons:

  • Latency. Now, every state change requires approval from the server, meaning all in-game actions depend on the latency between the client and the server. While this isn't critical for actions like "Buy an item from an NPC," it can severely impact movement, where precision is key for smooth gameplay.

While the latency issue is significant, it doesn't mean that this approach is unsuitable for all scenarios. On the contrary, server validation is usually the best solution for most situations, while more sensitive cases, such as movement, might need a different implementation.

Partial State Updates

Another solution that could address the movement problem discussed in the previous section involves updating the state of specific game elements where desynchronization has just occurred. For example, if the server determines that the client moved the player to an impossible location, the server can reject such a move and send the client an update with the current player's position. This update forces the client to teleport the player to the specified location and rectifies the desynchronization issue.
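
The movement correction described here can be sketched as a simple speed check. The maximum speed, the 2D coordinates, and the shape of the correction packet are assumptions for this example; a real game would also account for terrain, teleport abilities, and latency tolerance.

```python
import math

MAX_SPEED = 7.0  # world units per second (hypothetical)


def validate_move(last_pos: tuple, new_pos: tuple, dt_s: float,
                  max_speed: float = MAX_SPEED):
    """Return None if the move is plausible, otherwise a correction packet.

    The correction carries the last position the server accepted, so the
    client can snap the player back to it.
    """
    distance = math.dist(last_pos, new_pos)
    if distance <= max_speed * dt_s:
        return None
    return {"type": "position_correction", "x": last_pos[0], "y": last_pos[1]}
```

When the function returns `None`, the server accepts `new_pos` and sends nothing back, which is what keeps movement from depending on round-trip latency.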

Pros:

  • Doesn't require the client to wait for the server's response

  • Automatically corrects desynchronization if the server detects that the client is out of sync

Cons:

  • Both server and client need to implement partial state updates

  • The server must manually verify if the client is out of sync and notify the client about the state correction

Implementing this kind of solution for an entire game is a massive undertaking and likely impossible for a game of considerable size. However, a more generalized approach could be considered. For instance, instead of correcting only the player's location when their position is out of sync, the server could send an update packet with the player's complete information, which would include their position too. This same update packet could then be used to address all kinds of player-related synchronization issues.

Determinism

This is the most sophisticated method, and it typically isn't feasible for multiplayer games other than Real-Time Strategy games, but it's still worth understanding.

Determinism in games refers to the characteristic where a game's behavior and outcomes are entirely predictable and reproducible, given the same inputs and initial conditions. Simply put, a deterministic game will produce the exact same results every time it is played under identical circumstances.

In terms of processing the player's input, it implies that identical input should yield identical results on both the client and server sides.
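
A toy example of the idea: if both sides start from the same seed and apply the same inputs in the same order, they end up in the same state, which can be checked by comparing checksums. The `simulate` function, its two commands, and the state fields are invented for this sketch.

```python
import hashlib
import random


def simulate(seed: int, inputs: list) -> dict:
    """Run the same game logic the client and the server would both run."""
    rng = random.Random(seed)  # shared seed makes the "random" damage reproducible
    state = {"x": 0, "hp": 100}
    for command in inputs:
        if command == "step":
            state["x"] += 1
        elif command == "attack":
            state["hp"] -= rng.randint(1, 6)
    return state


def checksum(state: dict) -> str:
    """Compact fingerprint that the two sides can compare instead of full state."""
    canonical = repr(sorted(state.items())).encode()
    return hashlib.sha256(canonical).hexdigest()
```

The sketch sticks to integer math on purpose; once floating-point physics is involved, keeping results identical across machines becomes much harder.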

Pros:

  • Ensures complete consistency between the client and server.

  • Reduces the need for continuous validation checks or error correction from the server.

Cons:

  • Challenging to implement, especially in large and complex games.

  • Requires the client and server to have identical game logic.

While determinism can be an effective solution for smaller, less complex games or games with less reliance on real-time user input, it is not generally feasible for larger and more complex games like MMORPGs. These types of games usually have complex mechanics and behaviors which can't be replicated perfectly on both client and server due to various factors such as hardware differences (for example, floating-point results that vary between platforms), network latency, or unpredictable player behavior.

Conclusion

Remember, handling synchronization issues effectively is vital in providing a smooth and enjoyable gaming experience. Understanding the strengths and limitations of each approach will help you to choose the right solutions for your game, and optimize performance and player satisfaction. No matter which method you choose, ensuring that you log and fix synchronization issues as quickly as possible will help to maintain the stability and performance of your game.

Packets Reproduction

This chapter is primarily relevant for games that possess reproducible mechanics and offer offline gameplay functionality. If you are developing an MMORPG or a similar game that does not align with these characteristics, it is safe to skip this chapter without hesitation.

The next important topic to discuss is the ability to reproduce players' actions in the game on the server, even if those actions occurred on the client in the past when there was no internet connection. Let me explain further:

  • While playing the game on a mobile device, the player performs certain activities that do not affect other players (e.g., checking the farm, harvesting plants, etc.).

  • The player enters an area with no internet connectivity, such as a parking garage, and remains disconnected for a minute. However, the game continues to run and attempts to restore the connection.

  • Once the player leaves the garage, the internet connection is re-established.

The question is: What should be done with the actions performed by the player while in the garage without an internet connection? There are two approaches to consider:

  1. When the game detects a connectivity issue, it can pause the game and inform the player that it cannot continue until the connection is restored.

  2. Alternatively, allow the player to continue playing without a connection and store all the packets containing their actions to be sent to the server once the connection is restored.

The first approach is simple and often the preferred option. However, if you aim to provide a seamless and enjoyable gaming experience, let's explore how the second approach can be implemented. Here's the recipe:

  • Maintain the order of all player actions. Create a queue and store all the packets that need to be sent to the server in the correct order.

  • Each packet should include a timestamp indicating when the action actually occurred. For example:

    • If a player initiates a process at 10:00:53 that takes one minute to complete, it should finish at 10:01:53.

    • If the server receives a packet with a delay of 20 seconds at 10:01:13, it should account for the fact that the action was initiated at 10:00:53 and consider it finished at 10:01:53, not 10:02:13.

  • Configure an "allowance window" to define the time range during which the player can continue playing the game without an internet connection. This helps prevent cheating.

    • Since the client sends the action timestamp, the server needs to trust this data without the ability to validate it. Savvy individuals with development knowledge may attempt to manipulate the API by specifying action timestamps in the past to expedite long-running processes.

    • To mitigate this issue, introduce an allowed time range during which the server trusts the client. For example, this range could be 2-3 minutes, which is typically the maximum duration that users spend in elevators, garages, or other areas with no internet connection.

    • When the allowed time range is exceeded and the game displays a "no internet connection" popup, ensure that the player's actions during that period are not lost. You can send those actions to the server to reproduce them once the internet connection is available again.

  • Handle packets based on the client's timestamp rather than the current server time, as long as that timestamp falls within the allowance window.
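
The recipe above can be sketched as a client-side queue plus a server-side timestamp check. The names, the 3-minute allowance, and especially the policy of clamping out-of-window timestamps to the edge of the window are assumptions for this example; rejecting such packets outright is an equally valid choice.

```python
from collections import deque
from datetime import datetime, timedelta

ALLOWANCE = timedelta(minutes=3)  # how far back the server trusts client timestamps


class OfflineQueue:
    """Client side: keeps actions in order until the connection is back."""

    def __init__(self) -> None:
        self._queue: deque = deque()

    def record(self, action: str, occurred_at: datetime) -> None:
        self._queue.append((occurred_at, action))

    def flush(self, send) -> int:
        """Send every stored action, oldest first; return how many were sent."""
        sent = 0
        while self._queue:
            occurred_at, action = self._queue.popleft()
            send(action, occurred_at)
            sent += 1
        return sent


def trusted_start(client_time: datetime, server_now: datetime,
                  allowance: timedelta = ALLOWANCE) -> datetime:
    """Server side: accept the client's timestamp only inside the allowance window."""
    earliest = server_now - allowance
    return min(max(client_time, earliest), server_now)


def completion_time(client_time: datetime, server_now: datetime,
                    duration: timedelta) -> datetime:
    """When a timed process finishes, counted from the trusted start time."""
    return trusted_start(client_time, server_now) + duration
```

With the example from the list, a one-minute process started at 10:00:53 and received at 10:01:13 completes at 10:01:53, while a timestamp backdated by an hour is only honored up to three minutes before the server's current time.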

It's important to note that there are exceptions to this solution. For example, certain requests may require immediate responses from the server, such as "buying an item from a market." These actions can be restricted while the internet connection is lost, or the game can display a "lost server connection" popup to inform the player.

Wrapping Up

That wraps up our discussion for today. I've aimed to address any queries you might have when putting together a request-handling system. If there's something you think I missed, don't hesitate to get in touch and I'll make sure to add that to the article!
