“Why do we need machine learning to improve streaming quality?” is a question we are often asked. It is an important one, particularly given the recent hype around machine learning and AI, which can lead to a "solution in search of a problem." In this blog post, we describe some of the technical challenges we face for video streaming at Netflix and how mathematical models and machine learning techniques can help us overcome them.
Netflix has over 117 million members worldwide. About half of them live outside the United States, which represents a significant opportunity to grow and bring Netflix to a wider audience. Delivering a high-quality streaming experience to this global audience is an immense technical challenge. A large part of it is the engineering effort required to build and maintain servers around the world, as well as the algorithms for streaming content from those servers to our subscribers' devices. As we expand to audiences with diverse viewing habits, on networks and devices with widely differing capabilities, a "one size fits all" approach to streaming becomes increasingly suboptimal. For example:
Viewing and browsing behavior on mobile devices differs from that on Smart TVs.
Cellular networks can be more volatile and unreliable than fixed broadband networks.
Some markets can have more congested networks than others.
Due to hardware variations, different device classes have different internet connection capabilities and reliability.
To provide a high-quality experience for existing members and to expand into new markets, we need to adapt our methods to these varied and often fluctuating conditions. At Netflix, we observe network and system conditions, as well as aspects of the user experience (e.g., video quality), for each session, which allows us to apply statistical modeling and machine learning in this area. In a previous post, we described how data science is used to distribute content across our servers worldwide. In this post, we describe some of the technical challenges we face on the system side.
Characterization and prediction of network quality
Network quality is difficult to characterize and predict. Although average bandwidth and round-trip time are well-known indicators of network quality, other characteristics such as reliability and predictability make a big difference when it comes to video streaming. A richer characterization of network quality would be useful for analyzing networks (to target and evaluate product improvements), determining initial video quality, and adapting video quality during playback (more on that below).
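To make the point about characteristics beyond the average concrete, here is a minimal Python sketch that summarizes a throughput trace by its mean, its coefficient of variation (a rough proxy for how unreliable the connection is), and its low and high percentiles. The function name, the choice of metrics, and the kbps unit are illustrative assumptions, not a description of what Netflix actually computes.

```python
import statistics


def characterize_throughput(samples_kbps):
    """Summarize a throughput trace beyond its average (illustrative only).

    `samples_kbps` is a hypothetical list of throughput measurements in kbps.
    """
    if len(samples_kbps) < 2:
        raise ValueError("need at least two samples")
    mean = statistics.fmean(samples_kbps)
    stdev = statistics.stdev(samples_kbps)
    # quantiles(n=20) returns 19 cut points: the first is the 5th percentile,
    # the last is the 95th percentile.
    cuts = statistics.quantiles(samples_kbps, n=20, method="inclusive")
    return {
        "mean_kbps": mean,
        # Coefficient of variation: spread relative to the mean.
        "cv": stdev / mean if mean > 0 else float("inf"),
        "p5_kbps": cuts[0],
        "p95_kbps": cuts[-1],
    }


# Two made-up traces with the same average but very different stability.
steady = characterize_throughput([3000] * 10)
bursty = characterize_throughput([1000, 5000] * 5)
```

Both traces average 3000 kbps, yet the second has a much higher coefficient of variation and a far lower 5th percentile, which is the kind of difference an average alone would hide.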
The graphs below show examples of network throughput measured during real viewing sessions. As you can see, they are very noisy and fluctuate over a wide range. Can we predict throughput for the next 15 minutes given the previous 15 minutes of data? How can we incorporate longer-term historical data about the network and system? What kind of data can we provide from the server to allow the system to adapt optimally? Even if we cannot predict exactly when a network drop will occur (it could be caused by anything from a microwave turning on to streaming from a vehicle passing through a tunnel), can we at least characterize the distribution of throughput we expect to see given historical data?
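As a point of reference for these questions, the sketch below shows a very simple baseline in Python: a point forecast from the harmonic mean of the most recent window, plus an empirical percentile band from historical samples. It assumes one throughput sample per minute in kbps; the function names, the window size, and the choice of harmonic mean are illustrative assumptions and not the approach used at Netflix.

```python
from statistics import harmonic_mean, quantiles


def forecast_next_window(history_kbps, window=15):
    """Baseline point forecast: harmonic mean of the last `window` samples.

    Assumes non-negative samples, one per minute (a hypothetical setup).
    The harmonic mean is pulled toward low values, so it is more
    conservative than the arithmetic mean on a noisy trace.
    """
    recent = history_kbps[-window:]
    if not recent:
        raise ValueError("history is empty")
    return harmonic_mean(recent)


def expected_range(history_kbps, lower_pct=10, upper_pct=90):
    """Empirical percentile band describing the throughput seen historically."""
    # quantiles(n=100) returns 99 cut points; cut k (1-based) is the k-th percentile.
    cuts = quantiles(history_kbps, n=100, method="inclusive")
    return cuts[lower_pct - 1], cuts[upper_pct - 1]


# Made-up trace: an early fast period followed by 15 slower minutes.
trace = [9000] * 5 + [2000] * 15
point_forecast = forecast_next_window(trace, window=15)
low, high = expected_range(list(range(1, 102)))
```

A baseline like this ignores time-of-day effects, device context, and anything the server knows, but it gives a yardstick against which a richer model can be compared.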
Because we observe these traces at scale, we can use more complex models that combine temporal pattern recognition with various contextual indicators to make more accurate predictions of network quality.