Re: Squid Overkill? (Speed Up Web Surfing)



On Mon, 29 Dec 2008 22:15:04 -0500, Timmy wrote:

A friend sent this article to me. It's written for Ubuntu users but the
same could apply for any *nix system. Doesn't it seem like overkill to
store the browser's cache on the HD for one box?

Yes, it is overkill. Each user's browser already maintains a cache of web
sites visited. When a user returns to a web site already visited, much of
the data is retrieved from the browser's own cache.

Installing Squid on a user's machine would only duplicate the contents of
the cache that the browser already stores -- you end up having one copy
of the data in the browser cache, and the second duplicate copy in the
Squid cache. With both caches on the same machine, what do you think
you gain?

The Squid cache only gives you an advantage in a multi-user environment
when the following conditions are true:

a) Squid is installed on a local server (not on a user machine)
b) You have _many_ users accessing the local Squid server
c) _Many_ users access the _same_ web page(s)

The following is a brief summary of how it all works:

The first local user to access a web site on a distant web server will
cause the local Squid server to fetch a copy of the data over the wire
(slow) and to pass it on to the user's browser. If this same user later
returns to the same web site, the data is retrieved from the browser's
cache on the user's machine. For the first user, there is no advantage
to having Squid in the pipeline between the user's browser and a distant
web server.

The second, and subsequent users, who, for the _first time_, visit the
_same_ web site as the _first_ user, get the data served to their
browsers from the local Squid cache. This is where the speed advantage is
gained. Since the data, fetched for the first user, is already in the
Squid cache, it is served locally from the Squid cache (fast) for
subsequent users, rather than being retrieved over the wire (slow) for
each individual user. If any of these subsequent users later return to
the same web site, the data is retrieved from the browser's cache located
on each individual user's machine.

For example, if you have 10 users accessing the _same_ web site, you will
fetch the data _once_ for the _first_ user to access that particular web
site, and serve the data from the local Squid cache for the other 9 users
accessing _that same web site_. If all 10 users access 10 different web
sites, then there is no advantage to having Squid because different data
will have to be fetched for each user and there will be 10 fetches over the
wire, one for each user. The more users you have accessing the _same_
web site, the greater the benefits of using Squid will be. Much depends
on how many users you have and their usage patterns, whether they all go
to the same web site, or each goes to a different web site.
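To put rough numbers on the above, here is a toy model in Python. It is only a sketch under simplifying assumptions of my own: every page is cacheable, nothing ever expires, and each browser keeps its own cache. The function name and the example URLs are made up for illustration and have nothing to do with Squid itself.

```python
def count_wire_fetches(requests, shared_proxy):
    """Count how many requests have to go over the wire to the distant server.

    requests is a sequence of (user, url) pairs in the order they happen.
    shared_proxy says whether a shared caching proxy sits in front of all users.
    Assumption: everything is cacheable and nothing ever expires.
    """
    browser_caches = {}   # user -> set of URLs already in that user's browser cache
    proxy_cache = set()   # URLs already held by the shared proxy
    wire_fetches = 0
    for user, url in requests:
        seen = browser_caches.setdefault(user, set())
        if url in seen:
            continue                  # served from the user's own browser cache
        seen.add(url)                 # the browser caches whatever it receives
        if shared_proxy and url in proxy_cache:
            continue                  # served locally from the proxy cache (fast)
        wire_fetches += 1             # fetched from the distant server (slow)
        if shared_proxy:
            proxy_cache.add(url)
    return wire_fetches


users = ["user%d" % i for i in range(10)]
same_site = [(u, "http://example.com/") for u in users]
different_sites = [(u, "http://site%d.example/" % i) for i, u in enumerate(users)]
one_user_returning = [("user0", "http://example.com/")] * 5

print(count_wire_fetches(same_site, shared_proxy=True))            # 1
print(count_wire_fetches(same_site, shared_proxy=False))           # 10
print(count_wire_fetches(different_sites, shared_proxy=True))      # 10
print(count_wire_fetches(one_user_returning, shared_proxy=True))   # 1
print(count_wire_fetches(one_user_returning, shared_proxy=False))  # 1
```

The last two lines are the single-user case: with or without the proxy, the page crosses the wire once, and every return visit is answered by the browser's own cache.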

Running Squid on a single user machine will give you no advantage
whatsoever and will only consume disk space on a user's machine for
a redundant copy of what is stored in the browser's cache.


The fastest way I've found to speed up web surfing is by blocking all of
the banners etc. by pointing them at 127.0.0.1 in /etc/hosts. Here is a
good list of stuff to drop into /etc/hosts: http://someonewhocares.org/hosts/

Blocking unwanted web sites by redirecting them to 127.0.0.1 in /etc/hosts
is an obsolete technique from an era before DSL high speed Internet, where
speed could be gained by not downloading ad banners over slow dial-up
connections. For a while it was effective.

To discourage this ad blocking technique, many modern web pages now set
delay timers of up to 30 seconds per ad banner, to give the ad banner a
chance to load, before instructing the browser to ignore the URL. The
web browser pauses and waits for one of two things to happen -- either
the ad banner begins to load, or it times out. If the ad banner begins
to load, the web browser begins to concurrently load and display the
web page. If the ad banner URL is redirected via /etc/hosts to somewhere
else, the web browser pauses, waiting for an ad banner that never
arrives, until the delay timeout period elapses.

In practice, what this means is that, for example, if you block, say 6 ad
banners using this obsolete technique on a modern web page, you could be
staring at a blank screen displayed by your browser for 6 x 30
seconds/banner = 180 seconds (3 minutes) before any content from the
web page appears. How's that for encouraging you not to use this
technique for blocking ad banners? From a web site operator's point of
view, if you don't want to see the ad banners, you don't get to see the
web page content, unless you are very patient.

Furthermore, if you are running a web server on a user machine for the
purpose of running web based applications, for example, a personal
address book application written in PHP, redirecting unwanted web sites
to IP address 127.0.0.1 in /etc/hosts is, in effect, redirecting all the
crud to your own web server. In this case, the logs for your own web
server will fill up with endless Error 404 messages as you surf the web.
If you do this, make sure you have lots of disk space in /var for your
log files.

For personal workstation use, the current approach to deal with this
problem is to use a non-caching proxy server and to set up the URL
filters in the proxy server, rather than using /etc/hosts for a purpose
it was not intended for. The non-caching proxy server not only filters
out the blacklisted URLs, but also nullifies the associated delay logic
that makes the browser hang, waiting for the programmed-in timeout delay.
The proxy server accomplishes this by doing an edit on the incoming HTML
before the web browser gets to see it, resulting in snappy performance
from the web browser.

tinyproxy is the non-caching proxy server that I use. The contents of
my /etc/tinyproxy/filter look something like this:

\.bat
\.cmd
\.srs
\.exe
\.vbs
\.sis
\.pif
ads.space.com
phpads.cnpapes.com
cdn.valueclick.net
as.casalemedia.com
view.atdmt.com
clk.atdmt.com
www.bustnet.com
media.fastclick.net
ad.ca.doubleclick.net
ad.uk.doubleclick.net
ad.au.doubleclick.net
ad.doubleclick.net
..
..
other.domains.that.I.block

The first seven lines, above, filter out all the MS Windows executables
that can be injected for execution via web browser (to protect those on
my network still running Windows), followed by a list of domain names
that I block. The list of domain names came from the list I had at one
time in my /etc/hosts. You can use any list published for use in
/etc/hosts and, with some sed magic, convert it for use in tinyproxy.
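If sed is not your thing, the same conversion can be sketched in Python. This is a rough sketch, not the exact procedure I used: the function name and the sample input are made up, and it assumes the published list uses the usual "address hostname" layout with 127.0.0.1 or 0.0.0.0 as the address. It also escapes the dots in each domain name, which my own list above does not do, on the assumption that the filter lines are read as regular expressions (as the \.bat style entries suggest).

```python
def hosts_to_filter(hosts_text):
    """Convert /etc/hosts-style blocklist text into tinyproxy filter lines.

    Assumption: blocked entries point at 127.0.0.1 or 0.0.0.0.
    Comments, blank lines and the machine's own localhost entries are skipped.
    """
    skip = {"localhost", "localhost.localdomain", "broadcasthost"}
    seen = set()
    patterns = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop trailing comments
        fields = line.split()
        if len(fields) < 2:
            continue                           # blank or malformed line
        address, names = fields[0], fields[1:]
        if address not in ("127.0.0.1", "0.0.0.0"):
            continue                           # not a blocking entry
        for name in names:
            if name in skip or name in seen:
                continue
            seen.add(name)
            patterns.append(name.replace(".", r"\."))  # literal dots
    return patterns


sample = """\
# sample blocklist in /etc/hosts format
127.0.0.1   localhost
127.0.0.1   ad.doubleclick.net   # ad server
0.0.0.0     view.atdmt.com
127.0.0.1   ad.doubleclick.net
"""

for pattern in hosts_to_filter(sample):
    print(pattern)
```

Redirect the output to a file and look it over before appending it to /etc/tinyproxy/filter. A published list can contain entries you do not actually want blocked.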


Although, if you do use squid and add squidguard or dansguardian, I
guess that could make a difference... Anyway, check out the article below
and tell me what you think...

http://www.squidguard.org
- squidguard is a URL filter for use with Squid

http://dansguardian.org
- dansguardian is a content censorship tool
- works with tinyproxy also


In a continuing series of articles highlighting that GNU/Linux is a
viable replacement operating system, today we're exploring one quick way
to speed up your web browsing experience in the popular Ubuntu
distribution.

Everyone wants to have a faster web browsing experience! In this short
How-To I'm going to cover an easy way of doing this using the recently
released Ubuntu 8.10 Intrepid distribution.

My recommendation in this article is to install a proxy server on your
local computer. This stores local copies (caching) of web sites on your
computer's hard drive.

Why?


When you surf to a site it checks the cache first and if it finds the
page or image there, it loads directly from the local hard drive copy.
This is much faster than downloading again from the Internet, especially
if you don't have a fast connection speed and it has the added benefit
of reducing downloads from the Internet.

Uh huh, this is what your web browser does with the cache it keeps.
This does not explain why you would want to use Squid to create a
duplicate of the cache that your web browser already keeps for this
purpose.


To install and use this in Ubuntu 8.10 is incredibly simple. Go to the
System menu, select Administration and then Synaptic Package Manager.
Click on search and type in squid.

In the main part of the window, you'll need to go about two-thirds of
the way down the page until you find squid. Right-click on it and select
Mark for Installation. Click Apply. This will then install squid and any
dependencies it has (squid-common from memory). As this is a relatively
tiny application this really only takes a few seconds.

When this is done, go to Firefox and click on Edit, Preferences,
Advanced, Network, Settings. Click on Manual proxy configuration, and
type into the box marked "Http proxy" the word localhost. In the port
number, type in 3128. You can also tick the "Use this proxy for all
protocols" box.

Now go to any website and see if it works. If it does then you have
successfully installed squid! If you can't get websites to load, then
please read on to Page 2 for two very simple troubleshooting tips and
tricks.



Article Link at http://www.itwire.com/content/view/21727/1162/
