Transparent Webproxy & CPU Load

Is anyone running the transparent web proxy with a large number of users? I have a Mikrotik 2.9.x box which has 3, soon to be 4, Cyclades T1 cards for Internet connectivity and also runs as a transparent web proxy cache. It does p2p control and firewalling too. CPU load is massive. It’s a 2.6 GHz P4 with a gig of RAM.

The moment I disable the proxy cache DST-NAT rule, the CPU load drops to next to nothing. I shrank the cache size way down and it did very little good. I actually set it to 0 and the load was still very high. So I assume redirecting all the HTTP requests is what’s causing the high CPU load?

Has anyone here used wpad.dat to force use of a cache? How do you get all your users to change the setting in their web browser to use it?

Looking for input.

Matthew

Haven’t experienced it yet, but I can very much understand it.

MT runs Squid for the proxy. With that amount of traffic and that many users, Squid needs a highly optimised configuration, both in regard to its cache stores (which should be MASSIVE in your case), its access lists, and various other things.

MT, however, does not expose such extensive configuration options, as that would make the configuration painstakingly complicated for novice users.

Your solution is thus to set up a dedicated proxy server running Squid, configure and optimise it for your bandwidth and user requirements, and then alter your dst-nat rule to transparently forward web traffic to that dedicated proxy server for caching.


C

I don’t think that would do any good. Right now I have the cache set to 0 for max disk, RAM and object size, and the CPU load is still high. A few minutes after I disable the DST-NAT redirect rule, the CPU load drops to next to nothing. So right now the cache is empty and it’s not caching anything at all. I don’t see how adding a parent cache will do any good. Will it?

Matthew

Whether your cache size is 0 or 10 trillion won’t make a difference. The request still goes through Squid (which is why the abnormality goes away when you DISABLE the dst-nat rule, since the request no longer goes through Squid then).

A parent proxy is also not what I am referring to here… Don’t use the proxy package in MT at all, as its configuration is not optimised.

add chain=dstnat in-interface="My Network" protocol=tcp dst-port=80 \
    src-address-list="All Clients" dst-address-list=!noHTTPProxy \
    action=dst-nat to-addresses=x.x.x.x to-ports=3128 comment="" \
    disabled=yes

As you can see from the above dst-nat rule, I’m forwarding the traffic directly away from the MT, to a dedicated proxy server on a separate machine…
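For reference, the two address lists that rule matches on could be populated along these lines. This is only a sketch: the list names come from the rule above, but the addresses and comments are placeholders for illustration, so substitute your own ranges.

```
/ip firewall address-list
# clients whose port-80 traffic should be redirected to the proxy (placeholder range)
add list="All Clients" address=192.168.0.0/24 comment="LAN clients (example)"
# destinations that must bypass the proxy (placeholder host)
add list=noHTTPProxy address=203.0.113.10 comment="site that breaks behind the proxy (example)"
```

Any request whose destination is in noHTTPProxy fails the dst-address-list=!noHTTPProxy match, so it skips the dst-nat rule and goes out directly.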

I’m 200% confident your problem is, plain and simple, that the Squid configuration in the proxy package simply cannot cope with the amount of traffic it needs to process. Whether it caches a request or not, it still has to process it.


C

I see what you’re saying. I am going to have to try that. Clever how you whitelist the IPs you do not want to cache.

How many clients do you have running through your server, and how much bandwidth? I still think DST-NAT’ing that much traffic could cause quite a load. But perhaps that’s not it at all.

Matthew

The natting shouldn’t be a problem. Remember, the Linux kernel and its firewall (iptables, which I presume MT is based on) are already optimised, so no configuration is required for that. I’m sure any half-decent MT box will be able to shift a couple of million packets per second with ease…

Just google around a bit and spend some time on squid-cache.org. There are quite a lot of papers and formulas for calculating the size, hardware requirements, and performance of caches as you build them. Off the top of my head: SCSI disks (<9GB), more than one cache dir, diskd processes to read/write the cache store, etc etc etc…

Properly configured, I can almost guarantee you will see a major improvement in performance. I only have a couple of users, but I use an external proxy simply on principle. MT is a router, not a cache engine.

Anyways, I have a mere 3GB cache, and I get anything from about 40% to 60% cache hits with a fully populated cache…


C

Hello, thanks for this post. Please can you help me with a how-to for the proxy server and its config? Thanks in advance.