For the past few days I have been setting up CloudFront for my website. Everything works as expected, except that the real user agent is not passed from CloudFront to my origin server. Instead, the User-Agent is replaced with “Amazon CloudFront” when I check my Nginx log.
After some Googling, I found out that this is the default behavior: CloudFront does not forward the viewer's User-Agent header to the origin unless you tell it to. The reason is caching, explained below, but there is still a way to work around it.
CloudFront caches responses against the request headers it forwards to the origin. A cached response that was obtained by forwarding a request with
User-Agent: Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36
will not be considered usable by CloudFront for serving a future request with
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.85 Safari/537.36
even though, for all practical purposes, it’s the same browser.
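To make the effect above concrete, here is a small Python sketch of a cache whose key includes the values of whitelisted headers. It is only a toy model, not CloudFront's actual implementation: the names cache_key and count_origin_fetches are made up for this illustration, and real cache keys involve more than path and headers.

```python
import hashlib

# The two User-Agent strings from the example above.
UA_CHROME_39 = ("Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 "
                "(KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36")
UA_CHROME_40 = ("Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 "
                "(KHTML, like Gecko) Chrome/40.0.2214.85 Safari/537.36")


def cache_key(path, headers, whitelisted):
    """Build a simplified cache key: the path plus each whitelisted header value.

    Hypothetical helper for illustration only.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    parts = [path]
    for name in sorted(h.lower() for h in whitelisted):
        parts.append(f"{name}={lowered.get(name, '')}")
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


def count_origin_fetches(requests, whitelisted):
    """Count how many requests miss the toy cache and would go to the origin."""
    cache = set()
    fetches = 0
    for path, headers in requests:
        key = cache_key(path, headers, whitelisted)
        if key not in cache:
            cache.add(key)  # first time we see this key: fetch from origin
            fetches += 1
    return fetches


# Same object requested by two nearly identical browsers.
REQUESTS = [
    ("/index.html", {"User-Agent": UA_CHROME_39}),
    ("/index.html", {"User-Agent": UA_CHROME_40}),
]

if __name__ == "__main__":
    print(count_origin_fetches(REQUESTS, whitelisted=[]))
    print(count_origin_fetches(REQUESTS, whitelisted=["User-Agent"]))
```

With no headers whitelisted, both requests share one cache entry. With User-Agent whitelisted, each distinct string gets its own entry, so the second request also has to go to the origin.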
When you configure CloudFront to cache based on one or more headers and the headers have more than one possible value, CloudFront forwards more requests to your origin server for the same object. This slows performance and increases the load on your origin server. If your origin server returns the same object regardless of the value of a given header, we recommend that you don’t configure CloudFront to cache based on that header.
Source: https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/header-caching.html
The phrase “configure CloudFront to cache based on one or more headers” is synonymous with “whitelisting headers” to be forwarded to the origin.
This is the reason to forward only as much as you need. Doing otherwise hurts your cache hit ratio, in this case because of the variation in User-Agent strings: you don’t get the full benefit of the edge caches, more requests are processed by the origin server, and more bandwidth is used between the origin and CloudFront. But if your origin needs the real User-Agent, there isn’t really an alternative. CloudFront doesn’t charge anything for storage in the edge caches, so the only cost difference will be whatever is found in those other factors.
To whitelist the User-Agent header, go to your CloudFront console, select the distribution you want to change, then go to the Behaviors tab and select the path pattern you want to add the whitelisted header to.
By default, User-Agent is not listed among the whitelist headers you can select, but you can still type it into the filter text box and then click Add Custom. After you add it, CloudFront shows this warning message:
Some headers have a lot of possible values, and caching based on the values in these headers would cause CloudFront to forward more requests to your origin. We recommend that you do not cache based on the following headers: User-Agent
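If you prefer scripting over clicking through the console, the same change can be sketched in Python. This is a rough sketch under some assumptions: the distribution uses the legacy ForwardedValues settings (not a cache policy), only the default cache behavior is changed, and the dict has the shape that boto3's get_distribution_config returns under "DistributionConfig". The helper name whitelist_header and the SAMPLE_CONFIG fragment are made up for this example.

```python
import copy


def whitelist_header(distribution_config, header="User-Agent"):
    """Return a copy of the config with `header` added to the default
    behavior's forwarded headers.

    Hypothetical helper. Raises KeyError if the behavior has no
    ForwardedValues block (for example, when it uses a cache policy).
    """
    config = copy.deepcopy(distribution_config)  # leave the input untouched
    forwarded = config["DefaultCacheBehavior"]["ForwardedValues"]
    headers = forwarded.setdefault("Headers", {"Quantity": 0, "Items": []})
    items = list(headers.get("Items", []))
    # Header names are case-insensitive, so avoid adding a duplicate.
    if header.lower() not in (h.lower() for h in items):
        items.append(header)
    headers["Items"] = items
    headers["Quantity"] = len(items)  # CloudFront expects Quantity to match Items
    return config


# A trimmed-down, made-up config fragment, just enough for the helper.
SAMPLE_CONFIG = {
    "DefaultCacheBehavior": {
        "ForwardedValues": {
            "QueryString": False,
            "Cookies": {"Forward": "none"},
            "Headers": {"Quantity": 1, "Items": ["Origin"]},
        }
    }
}

if __name__ == "__main__":
    updated = whitelist_header(SAMPLE_CONFIG)
    print(updated["DefaultCacheBehavior"]["ForwardedValues"]["Headers"])

# Applying it to a real distribution with boto3 would look roughly like this
# (untested here; "EXAMPLE_ID" is a placeholder distribution ID):
#
#   import boto3
#   client = boto3.client("cloudfront")
#   resp = client.get_distribution_config(Id="EXAMPLE_ID")
#   new_config = whitelist_header(resp["DistributionConfig"])
#   client.update_distribution(
#       Id="EXAMPLE_ID", IfMatch=resp["ETag"], DistributionConfig=new_config
#   )
```

The helper returns a modified copy so you can inspect the result before sending anything to AWS.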
Remember to whitelist headers only when you really need them, because doing so reduces your hit ratio: more requests will bypass the CloudFront cache and be forwarded to your origin for processing.