FOSS Network Infrastructure and Security/Major Networking Functions with FOSS

Domain Name System

DNS is the glue that keeps the technical side of the Internet connected to the users of the network. Simply stated, DNS provides name-to-IP-address translation and vice versa. The Internet is based on the IP protocol, which means that computers on the Internet reach each other only through IP addresses. However, remembering the numerical IP address of each computer on the Internet is not possible for most people. Here is where DNS comes in handy.

The main purpose of DNS is to map names to objects on the Internet. The object may be an IP address or the identification of mail servers, name servers, and even personal telephone numbers. Names, after all, are easier to remember than numbers.

The precursor of DNS was a file named hosts.txt, which was manually maintained by the Stanford Research Institute-Network Information Center (SRI-NIC) in the early days of the ARPANET[1] in the 1970s. This hosts.txt file was updated on a single computer and pulled by computers all over the world. While this method worked for some time, name collisions became inevitable as more hosts were added to the network. The flat single-file mapping was simply not scalable. Thus DNS was created; it is defined in RFCs 1034 and 1035.

DNS consists of three components:

  • DNS name space – the domain names;
  • DNS server – the server that hosts the name space; and
  • Resolver – the client which uses the server.

The three components work in tandem to create a viable solution to name resolution. DNS is designed in such a way that all the data are maintained locally but retrievable globally. The data is distributed among different name servers and no single computer has all of the data. The data is internally always consistent, thereby providing a stable system. Moreover, DNS is designed in such a way that any device can send DNS queries to a server.
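Because any device can send DNS queries, the wire format matters: queries and responses share a compact binary message layout defined in RFC 1035. As a rough illustration (a sketch, not production code; www.example.com is a placeholder name), a minimal query for an A record can be assembled like this:

```python
import struct

def build_query(qname, qtype=1, qid=0x1234):
    """Assemble a minimal DNS query message (RFC 1035).

    qtype 1 = A record; the flags word 0x0100 sets only the
    'recursion desired' bit."""
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack(">HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    # Question: each label is length-prefixed; a zero byte marks the root
    question = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in qname.rstrip(".").split(".")
    ) + b"\x00"
    question += struct.pack(">HH", qtype, 1)  # QTYPE, QCLASS=IN
    return header + question

query = build_query("www.example.com")
```

Sending such a datagram to UDP port 53 of a name server would elicit a response carrying the same ID with the answer records appended.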

DNS Name Space

DNS name space is a concept. Names are references to addresses, objects and physical presences; they constitute a human-understandable reference identifying an endpoint. DNS name space is the name space defined for DNS. On the Internet, domain names provide a hierarchy for the DNS name space and, thus, an order to how Internet addresses are identified.

Let’s take a comparative example. If someone were to send me a letter, it would be addressed to

Gaurab Raj Upadhaya
205/8 Sahayogi Marg, Kathmandu, Nepal

In this example, the address provides a way in which the letter can be sent to me from anywhere in the world. Similarly, if someone were to send me an e-mail, they could use either of two e-mail addresses, for example gaurab@example.org or gaurab@mail.example.org (placeholder addresses):

In both cases, while the e-mail ends up in the same mailbox, it travels differently to that address. What needs to be understood is the domain hierarchy that makes the e-mail work; it is part of the DNS name space, and domain names are its implementation. The domain name system uses an inverted tree-shaped structure. The topmost level of the DNS tree is called the ‘root’.[2] The ‘root’ is referenced with a ‘ . ’ (dot). Immediately below the ‘root’ are the Country Code Top Level Domains (ccTLDs) and the Generic Top Level Domains (gTLDs). These two top levels are predefined and fixed on a global scale. The ccTLDs are assigned as per the ISO 3166 standard, while the gTLDs are decided by the Internet Corporation for Assigned Names and Numbers (ICANN). Examples of ccTLDs are .np, .in, .my, .uk and .se; ccTLDs are always two-letter codes. Examples of gTLDs are .com, .org, .net, .gov, .edu, .mil, .info, .name, and .aero.

Domains and Sub Domains

Below the TLDs are the user-level spaces, commonly referred to as domain names. For example, anything under .net is in the net domain, and anything under .uk is under the uk domain. By extension, a domain such as example.net (a placeholder name) is under the net domain.

Every domain created under an upper-level domain is referred to as a sub domain. Thus, example.net is a sub domain under the net TLD, and mail.example.net would in turn be a sub domain under example.net.
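The hierarchy reads from right to left: each label to the left names a sub domain of the zone to its right. A small sketch (example.net is a placeholder name):

```python
def hierarchy(name):
    """List the zones from the TLD down for a fully qualified name."""
    labels = name.rstrip(".").split(".")
    # walk outward from the rightmost label
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]

print(hierarchy("www.example.net"))
# ['net', 'example.net', 'www.example.net']
```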

Zones and Delegation

In computer terms, each DNS name space is reflected by its zone file. It is also referred to as the ‘administrative name space’. Each domain or sub domain on a name server has its own zone file, which is the main file that provides the mapping.

What makes DNS so scalable is its ability to delegate sub domains to other servers and other zone files. Thus, the root zone file delegates ccTLD and gTLD functions to their respective servers, and each ccTLD or gTLD server further delegates specific domain information to its registered owners. The actual name-to-object mapping is therefore provided only by the zone that is authoritative for that domain. This procedure can be compared to parents delegating authority to their children.
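The delegation chain can be modelled as a walk from the root downwards, each zone handing the query off to the next. The sketch below simulates this with a hypothetical in-memory table (all names and addresses are placeholders, not real servers):

```python
# Hypothetical delegation data: each zone knows only the next hop.
ZONES = {
    ".": {"net": "tld-server"},
    "net": {"example.net": "ns.example.net"},
    "example.net": {"www.example.net": "192.0.2.10"},
}

def resolve(name, zones=ZONES):
    """Follow delegations from the root until the final answer is found."""
    labels = name.split(".")
    zone, answer = ".", None
    for i in range(len(labels) - 1, -1, -1):
        sub = ".".join(labels[i:])          # next, more specific zone
        answer = zones.get(zone, {}).get(sub)
        if answer is None:
            return None                     # no delegation: name unknown
        zone = sub
    return answer
```

Here `resolve("www.example.net")` walks root → net → example.net, just as a real resolver follows referrals.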

Name Servers

Name servers host the DNS zone files. They answer queries directed to them. Name servers are of two types.

  1. Authoritative name servers
    • master
    • slave
  2. Non-authoritative name servers
    • caching name servers
    • caching forwarders

Most implementations are a combination of two or more types.

Authoritative Name Servers

Authoritative name servers host the main zone files for the designated domain. The authority of the name server is based on delegation from the upper-level domain. Thus, for any server to be authoritative for a domain such as example.net, it has to be delegated in the net zone file.

The master server is where the main zone file is hosted. The slave mirrors the file from the master, and there can be multiple slave servers to one master server. A single master server can support more than 20 million names, though it might not be a good idea to actually do this. Different DNS server software can handle large numbers of DNS queries; a commonly cited figure is 300,000 queries per second. Changes in the master copy of the database are replicated to the slaves immediately or according to the timing set by the administrator.
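A slave decides whether to pull a new copy by comparing its SOA serial number with the master's, using the wrap-around serial arithmetic of RFC 1982 (serials are 32-bit and may roll over). A minimal sketch of that comparison:

```python
def serial_newer(master_serial, slave_serial):
    """True if the master's SOA serial is newer than the slave's copy,
    under RFC 1982 serial-number arithmetic (values wrap at 2**32)."""
    if master_serial == slave_serial:
        return False
    return ((master_serial - slave_serial) % 2**32) < 2**31

# Date-based serials in the common YYYYMMDDnn style compare naturally:
print(serial_newer(2004050802, 2004050801))   # True: transfer needed
```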

Recursive Name Server

A recursive name server is not necessarily authoritative for the domains for which it serves data. It acts on behalf of other clients and caches the results in its memory. If the same query is sent within a predefined time period, then instead of searching the entire DNS structure, it serves the data from the cache. In the case of caching forwarders, the server uses another DNS server to get the result. When the data are forwarded to the client, they are marked as non-authoritative.
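The caching behaviour can be sketched as a small wrapper that remembers answers for a fixed time-to-live; a toy model (not real DNS code), with placeholder data:

```python
import time

class CachingResolver:
    """Toy model of a caching name server's answer cache."""

    def __init__(self, lookup):
        self.lookup = lookup   # the full (slow) resolution function
        self.cache = {}        # name -> (answer, expiry timestamp)

    def resolve(self, name, ttl=300, now=None):
        now = time.time() if now is None else now
        hit = self.cache.get(name)
        if hit is not None and hit[1] > now:
            return hit[0], "cached"       # served non-authoritatively
        answer = self.lookup(name)        # ask the wider DNS
        self.cache[name] = (answer, now + ttl)
        return answer, "fetched"
```

Within the TTL window, repeated queries for the same name never leave the cache; after expiry the full lookup is repeated.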

Mixed Implementation

In smaller organizations, a single name server can be used for multiple purposes. A server can be authoritative for a select few domains but it may also serve as a non-authoritative caching server for other domains. With recent cases of DNS cache poisoning, it is strongly recommended that the same server not be used for both authoritative and caching functions.

Resolver

The resolver is the client that asks the server for the DNS data. The resolver is normally implemented at the operating system level in the form of a library, so that multiple applications can use it.

DNS Security

A big strength of the Internet is the user’s ability to use names to reach the right servers and systems. A mis-configured DNS or a malformed DNS query can keep users from reaching those systems; hence the need for a secure DNS setup. It is very important to follow these simple points:

  • Allow only authorized systems to do zone transfers from the master server.
  • Have a minimum of two DNS servers, and remember not to put them in the same location.
  • Make sure that your forward and reverse DNS information is consistent.
  • Follow current best practices for DNS implementation.

Using BIND for DNS

BIND is an implementation of the DNS protocols. It provides an openly re-distributable reference implementation of the major components of the domain name system, including:

  • A DNS server (named);
  • A DNS resolver library; and
  • Tools for verifying the proper operation of the DNS server.

The BIND DNS server is used on the vast majority of name serving machines on the Internet, as it provides a robust and stable architecture on top of which an organization’s naming architecture can be built. The resolver library included in the BIND distribution provides the standard interface for translation between domain names and Internet addresses and is intended for linking with applications requiring name service.

Getting and Installing BIND

BIND is normally installed by default by most GNU/Linux distributions. Otherwise, you can always get a copy from the BIND home page at the Internet Systems Consortium (www.isc.org). The installation process of BIND differs across distributions of GNU/Linux, so it is best to follow the distribution guide for installation.

Configuration of BIND

The BIND configuration is undertaken in three stages: first the client side (the resolver library) is configured, followed by the server itself and, finally, the tools.

Resolver configuration

The resolver is the client side of the DNS system. Even if you do not run a DNS server on the computer, you will need to have the resolver installed. Naturally, in order to configure BIND, you first need to configure the resolver library. This is done by editing the following files:

/etc/host.conf

This file specifies how host name resolution is performed. It has been made obsolete, but older installations may still use it.

/etc/nsswitch.conf

This file has replaced host.conf. It specifies the order in which name resolution takes place, i.e., it tells the computer the order in which it should try to convert names into IP addresses.

# /etc/nsswitch.conf
# Any line starting with a # sign is a comment.

# In this example, hosts are resolved through DNS first, and then
# from files.
hosts: dns files

# Only files are used for network name resolution.
networks: files

The default file created during installation is usually sufficient.

/etc/resolv.conf

resolv.conf is the basic DNS configuration file, which specifies the DNS servers and the domain name. The three keywords here are ‘domain’, ‘search’, and ‘nameserver’.

# /etc/resolv.conf (example.net and 192.0.2.1 are placeholder values)
# Our domain
domain example.net
# Default search domains in order of priority
search example.net example.com
# We use the local server as the first name server.
nameserver 127.0.0.1
# We have a second name server at the upstream provider.
nameserver 192.0.2.1

This file is also created during the installation, and if your network configurations have not changed, you can leave it unchanged.

Server configuration

‘named’ is the best known FOSS DNS daemon. A daemon is a software program that runs continuously on the server as a service. The DNS server is thus commonly referred to as the ‘name daemon’ or just ‘named’. The file name of the DNS server binary is also ‘named’.

For BIND versions 4.x, named used the configuration file /etc/named.boot. In later versions of BIND (8.x and later), the configuration file is /etc/named.conf.

For our purposes, we use /etc/named.conf. In the following example, a master DNS configuration for the placeholder domains example.com and example.org is specified. A slave configuration for the domain example.net is also shown.

// /etc/named.conf
// (example.com, example.org and example.net below are placeholder
// domain names)
// In this file, ‘//’ marks a comment.

// The options block specifies the default data directory for DNS.
// All DNS-related files should go into /var/named or any other
// directory specified here.

options {
directory "/var/named";
};

// First you need to add the DNS root zone. It is there by default
// (the root hints file is commonly named named.ca).

zone "." {
type hint;
file "named.ca";
};

// Now we are specifying a master domain called example.com
// whose information is stored in the file ‘db.example.com’.

zone "example.com" {
type master;
file "db.example.com";
};

// The whole thing can also be done in a single line.

zone "example.org" { type master; file "db.example.org"; };

// Now this server is also a slave for another domain, example.net.

zone "example.net" { type slave; masters { 192.0.2.1; };
file "slave/db.example.net"; };

// Reverse zone for the loopback network.

zone "0.0.127.in-addr.arpa" { type master; file "named.local"; };

This file sets the stage for adding the real data about host name and IP addresses. The /etc/named.conf file can take a lot of additional configuration directives, but these will not be discussed here.

After relevant entries have been made in the named.conf file, it is necessary to create the host name records for the corresponding domains. All of the files should be placed in the directory specified by the directory directive in the named.conf file.

The named.local file provides the reverse zone lookup for the loopback interface, i.e., the network used by the loopback addresses. The default file should be left unchanged. The root hints file (commonly named.ca) provides the root server information to the DNS server; the default should never be edited. Now let’s look at a sample DNS zone file for the placeholder domain example.com (db.example.com):

; file /var/named/db.example.com
; (example.com and the 192.0.2.x addresses are placeholders)
$TTL 86400
@ IN SOA ns.example.com. hostmaster.example.com. (
2004050801 ; serial number
86400 ; refresh: once per day (1D)
3600 ; retry: one hour (1H)
3600000 ; expire: 42 days (6W)
604800 ) ; minimum: 1 week (1W)
; we are specifying three name servers
 IN NS ns.example.com.
 IN NS ns1.example.com.
 IN NS ns2.example.net.
; local mail is distributed on another server
 IN MX 10 mail.example.com.
 IN MX 20 mail2.example.com.
; loopback address
localhost. IN A 127.0.0.1
; the glue records so that the NS records can resolve
ns IN A 192.0.2.2
ns1 IN A 192.0.2.3
; main DNS entries
www IN A 192.0.2.10
mail IN A 192.0.2.25
; aliases for the www machine
tftp IN CNAME www

The above file is the main zone file for the placeholder domain ‘example.com’. If you want to add additional names under ‘example.com’, such as ftp.example.com or ns2.example.com, you should add them in the above file.

The master and slave roles of the name servers are configured in each server’s named.conf file, and the SOA record names the primary (master) server. Each time the master server is updated, it can automatically send a notification to the other NS servers listed in the zone file. This behaviour can be configured in the named.conf file.

A note about Reverse DNS

The majority of DNS-related problems are usually due to mis-configured reverse DNS. Reverse DNS is the mapping of numbers into names, the opposite of forward name resolution. Many applications use this facility to verify that the network source IP address is valid. A common example is software for preventing spam, or unsolicited commercial e-mail (junk e-mail), which may refuse to accept mail from any host that does not have reverse DNS configured.

The reverse DNS works through delegation of the particular group of IP addresses from one of the Regional Internet Registries (RIRs), which is the Asia Pacific Network Information Centre (APNIC, www.apnic.net) in the Asia-Pacific region. Since Internet Service Providers (ISPs) are normally APNIC members, they are responsible for configuring the appropriate reverse DNS for the IP addresses used by them and their clients.

Since each computer has its own loopback interface and an IP address associated with it, BIND comes with a default installation of the named.local file, which is the reverse DNS zone for the 127.0.0.0 loopback network. This file looks like the following:

; /var/named/named.local
$TTL 86400
@ IN SOA localhost. root.localhost. (
1997022700 ; Serial
28800 ; Refresh
14400 ; Retry
3600000 ; Expire
86400 ) ; Minimum
IN NS localhost.
1 IN PTR localhost.
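The PTR owner names in reverse zones are derived mechanically from the address: the octets are reversed and the in-addr.arpa suffix is appended. A sketch of that derivation (192.0.2.25 is a placeholder address):

```python
def reverse_name(ipv4):
    """Build the PTR query name for an IPv4 address: octets reversed
    under the in-addr.arpa domain."""
    return ".".join(reversed(ipv4.split("."))) + ".in-addr.arpa"

print(reverse_name("192.0.2.25"))
# 25.2.0.192.in-addr.arpa
```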

Administrating BIND DNS

BIND includes a utility called rndc that allows you to administer the named daemon, locally or remotely, with command line statements. The rndc program uses the /etc/rndc.conf file for its configuration options, which can be overridden with command line options.

Before you can use the rndc, you need to add the following to your named.conf file:

controls {
inet 127.0.0.1 allow { localhost; } keys { <key-name>; };
};

key "<key-name>" {
algorithm hmac-md5;
secret "<key-value>";
};
In this case, the <key-value> is an HMAC-MD5[3] key. You can generate your own HMAC-MD5 keys with the following command:

dnssec-keygen -a hmac-md5 -b <bit-length> -n HOST <key-file-name>

A key with at least a 256-bit length is a good idea. The actual key that should be placed in the <key-value> area can be found in the <key-file-name> file.
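The secret is simply random bytes encoded in base64, so if dnssec-keygen is not at hand, an equivalent 256-bit secret can be produced with a few lines of Python (a stand-in sketch, not ISC's tool):

```python
import base64
import secrets

def make_hmac_secret(bits=256):
    """Return a random base64-encoded secret of the given bit length,
    suitable for pasting into the secret "..." clause."""
    return base64.b64encode(secrets.token_bytes(bits // 8)).decode("ascii")

secret = make_hmac_secret()
```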

Configuration file /etc/rndc.conf

options {
default-server localhost;
default-key "<key-name>";
};

server localhost {
key "<key-name>";
};

key "<key-name>" {
algorithm hmac-md5;
secret "<key-value>";
};

The <key-name> and <key-value> should be exactly the same as their settings in /etc/named.conf.

To test all of the settings, try the rndc reload command. You should see a response similar to this:

rndc: reload command successful

You can also use the rndc reload to reload any changes made to your DNS files.

DNS Tools

There are two common tools to test DNS: nslookup and dig. Nslookup is the older of the two, and is less preferred. You can use the dig utility to test the DNS service. Use the command ‘man dig’ on most Unix and Unix-like systems to access the relevant manual pages.

The Mail Server

Internet and e-mail were considered synonymous in the early days of the Internet. Even today, more than a quarter of the total Internet traffic is still e-mail, and it is not surprising that FOSS rules the world of e-mail. On the Internet, e-mail messages work on the basis of Simple Mail Transfer Protocol (SMTP) which is defined in RFC 2821. SMTP is a really simple protocol designed to make the transfer of e-mail messages between mail servers as easy as possible.

SMTP works in plain text, and communicates between the mail servers, which are also referred to as Mail Transfer Agents or MTAs. The most popular mail server software is ‘sendmail’. Other examples are exim, qmail, and postfix. The closed source alternatives are Lotus Notes and Microsoft Exchange.

The beauty of FOSS is the range of choice available. While exim and postfix have a smaller footprint and consume less memory on servers, sendmail is a complex beast that runs some of the busiest mail servers in the world.

Another important benefit of FOSS mail servers is the modularity of the software. Sendmail itself provides for a large number of extensions and provision for including modules. This makes it easier for developers to extend the software for their in-house needs. If you need to develop an extension to your e-mail server to automatically handle different types of e-mail, FOSS is a better option.

Other Mail-related Protocols

The two main mail-related protocols are Post Office Protocol (POP) and Internet Mail Access Protocol (IMAP). These provide the end-user functionality for users. POP and IMAP are used by e-mail software for accessing e-mails stored on a server. So if you use an e-mail client like Eudora or Thunderbird, then it will use either POP or IMAP to pull e-mail from your mail server to the local machine.

Handling Spam

Unsolicited Commercial E-mail (UCE) or spam is increasingly a big problem for all service providers. Most mail server software now has at least minimal anti-spam features.

Incoming spam

SpamAssassin is a popular program used to filter incoming spam. It can be invoked for either a single user or the whole system, and provides the ability to configure a complex set of rules to detect and delete incoming spam.

Stopping outgoing spam

It is also the duty of the provider not to let spammers use their network for sending mail. Mis-configured mail servers that allow open relay are among the biggest sources of spam. Increasingly, mail servers are configured not to be open relays by default. Virus-infected computers are another source of spam.

Anti-spam Features

One of the biggest advantages of FOSS is its extensibility. Nothing highlights this more than the anti-spam features available in mail servers. Today, almost 80 percent of all e-mail messages are thought to be UCE, commonly referred to as junk e-mail or simply spam. UCE not only consumes a lot of bandwidth and network resources, it is also a nuisance to users and decreases productivity for organizations.

The best anti-spam tools available today are all FOSS. In order to stop junk e-mail, it is necessary to identify its origin, and what better way of doing this than thousands of users collectively identifying spammers. The collaborative FOSS model helps ensure that junk e-mail is widely reported, so that its origin can be identified easily.

A common anti-spam technique is the use of Real-time Block Lists (RBLs). Different RBLs list the IP addresses of networks that are known to be the origin of huge amounts of spam. Again, the open nature of these lists, as well as of software like SpamAssassin, makes it easy to tune the software to a user’s own needs.

For example, suppose users in a corporate environment need to send e-mail in capital letters due to the nature of their work. If an entire e-mail is in capital letters, most anti-spam tools would identify it as junk e-mail. With a FOSS solution, however, we can modify the code and remove this criterion for mail originating from within the network.
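As a toy illustration of such a site-specific tweak, here is a hypothetical 'all capitals' scoring rule with an exemption for internal senders (illustrative only; this is not SpamAssassin's actual code):

```python
def shouting_score(body, internal_sender=False):
    """Hypothetical spam rule: score a message whose text is almost
    entirely upper-case, but skip the check for internal senders."""
    if internal_sender:
        return 0.0                       # the site-specific exemption
    letters = [c for c in body if c.isalpha()]
    if not letters:
        return 0.0
    upper = sum(1 for c in letters if c.isupper())
    return 2.5 if upper / len(letters) > 0.9 else 0.0
```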

Using Sendmail for SMTP

Sendmail is one of the SMTP servers available under Linux, and one of the oldest open source programs still in wide use. Many people consider sendmail too complicated and difficult to use. Sendmail has its advantages and disadvantages: because it has many features, it is a complex piece of software, but at the same time, the basic operations of sendmail can be managed easily.

Sendmail configuration is handled either by directly editing the sendmail configuration file (not recommended) or through the use of the M4 macro language in creating a new configuration file from a set of variables.

Now we will deal with sendmail. Given below are the minimum necessary changes to the default sendmail installation.

Enabling Network-ability in sendmail

The default installation of the software in many distributions limits the mail server to listening only on the loopback address,[4] i.e., the server is not reachable over the network. For sendmail to be reachable from the network, you will need to edit the appropriate line in the /etc/mail/sendmail.mc file, removing the Addr=127.0.0.1 restriction from the following line:

DAEMON_OPTIONS(`Port=smtp,Addr=127.0.0.1, Name=MTA')dnl

After that, you will have to run the m4 macro processor to create a new sendmail configuration file:

[root@mail /etc/mail]# m4 sendmail.mc > sendmail.cf
[root@mail /etc/mail]# service sendmail restart

This should enable network reachability for the sendmail daemon. There are also many other options in the sendmail.mc file that you can experiment with.

Local Domain Names

Edit /etc/mail/local-host-names and add all domain and domain aliases that your site uses.

# local-host-names - include all aliases for your machine here.
# some examples (placeholder domain names)
example.com
mail.example.com
example.org
These are necessary so that sendmail accepts mail for these domains.

Virtual Domain Users

However, the above configuration does not entirely solve the problem of virtual domain users. For that, use the virtusertable feature and edit /etc/mail/virtusertable:

# /etc/mail/virtusertable (example.com and example.org are placeholders)
# virtual e-mail address	real username
info@example.com	user1
# domain pop, i.e., all e-mail for a domain into a single account
@example.org	user2

Be sure to restart the sendmail daemon after making the changes.

[root@mail /etc/mail]# service sendmail restart

Access Control

Sendmail provides an access control feature through the /etc/mail/access file.

# /etc/mail/access (placeholder values)
192.0.2	RELAY
spammer.example	REJECT

  • OK – accept the mail message.
  • RELAY – accept messages from this host or user even if they are not destined for our host; that is, accept messages for relaying to other hosts from this host.
  • REJECT – reject the mail with a generic message.

You must allow RELAY from your own network; otherwise, computers on the network using the server as their outgoing SMTP server will not be able to send e-mail.
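Conceptually, the access database maps the most specific matching key to an action: leading octets for IP addresses, trailing labels for host names. The sketch below models that lookup (illustrative only, not sendmail's implementation; all table entries are placeholders):

```python
ACCESS = {
    "192.0.2": "RELAY",            # our own (placeholder) network
    "spammer.example": "REJECT",   # a known bad (placeholder) domain
}

def access_action(client, table=ACCESS, default="OK"):
    """Try the most specific key first, then progressively shorter
    prefixes: leading octets for IPs, trailing labels for hostnames."""
    parts = client.split(".")
    if client[0].isdigit():   # IP address: drop octets from the right
        candidates = [".".join(parts[:i]) for i in range(len(parts), 0, -1)]
    else:                     # hostname: drop labels from the left
        candidates = [".".join(parts[i:]) for i in range(len(parts))]
    for key in candidates:
        if key in table:
            return table[key]
    return default
```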

Running sendmail as System Daemon

The startup script is located at /etc/rc.d/init.d/sendmail and is run automatically when the computer boots. You can also start sendmail manually using either of the following commands:

[root@mail /etc/mail]# /etc/init.d/sendmail start
[root@mail /etc/mail]# service sendmail start

Running sendmail from xinetd

It is a good idea (from a security standpoint) to have sendmail run from xinetd rather than as a standalone daemon. For that, we need to add a service file to the /etc/xinetd.d directory, disable the init script in /etc/rc.d/init.d, and add sendmail queue processing to cron. Here is what you have to do:

1. When using xinetd, create a file named sendmail in /etc/xinetd.d/ similar to:

# default: on
service smtp
{
socket_type = stream
wait = no
user = root
server = /usr/sbin/sendmail
server_args = -bs
disable = no
}

2. Edit /etc/rc.d/init.d/sendmail to have exit 0 somewhere in the very beginning (this might not be the best way, so be sure to document the changes you do to these files) so that this file does nothing other than start sendmail.

3. Edit your (root’s) crontab (to edit, use crontab -e) and add a line like this:

*/20 * * * * /usr/sbin/sendmail -q

This processes the sendmail queue (if one exists) every 20 minutes.

Other Mail Servers

The other popular mail servers are postfix, exim, and qmail. While postfix is shipped as a default on a few Linux distributions, many small service providers have adopted exim because of its simplicity and robustness. Exim also has strong anti-spam features built into it.

The Web Server – Apache

The dominance of FOSS in the web server market is well known. The Apache web server is the undisputed leader in web server surveys conducted on the Internet. It has many strengths: it led the way in introducing name-based virtual hosts, and it was the first truly modular web server to work seamlessly with database servers. It also has fully built-in authentication, authorization and access control functions, as well as scripting support.

April 2006 Web Server Survey

The latest survey statistics are available from the Netcraft web server survey (news.netcraft.com).

Apache also fully integrates with OpenSSL, which provides a secure sockets layer,[5] thereby enabling use of the Apache web server for e-commerce and secure transactions. And, best of all, it can be fully configured through a text-based configuration file – httpd.conf.


Configuring Apache

Configuring Apache is also fairly easy, unless you want to run complex software on the web server. The configuration files are usually located by default in ‘/etc/httpd/conf’. Additional configuration files are located in ‘/etc/httpd’. It is also common for Apache to be installed in ‘/usr/local/apache’.

The main configuration file for Apache is the httpd.conf. Apache usually works out of the box.

The httpd.conf File

Directives are the settings that define how Apache should actually run, where files are located on your server, how much of the machine’s resources Apache may use, which content visitors are allowed to see, how many concurrent visitors the server can handle, and other parameters.

Let’s look at the main directives:

/etc/httpd/conf/httpd.conf – the main configuration file

Server identification

  • ServerName – used to construct self-referential URLs
  • ServerAdmin – e-mail address of the server administrator, displayed in error messages

File locations

  • DocumentRoot – the location where the static content of your website lives
  • ServerRoot – the base for relative locations of files that do not begin with a slash “/”
  • ErrorLog – where server-wide error messages are logged
  • PidFile – contains the process ID of the httpd process
  • Alias – to serve files outside the DocumentRoot
  • ScriptAlias – the location for CGI scripts, the dynamic content generators
  • DirectoryIndex – the file displayed by default for a directory request
  • UserDir – to serve files from the public_html directory in a user’s home directory as http://server/~user/

Process creation

  • MaxClients – the number of simultaneous connections allowed from clients
  • Server-pool regulation – Apache under UNIX is multi-process; it balances the overhead required to spawn child processes against system resources. You can change settings such as MinSpareServers, MaxSpareServers, StartServers and MaxClients to fine-tune the performance of the server.
  • User, Group – set the privileges of the Apache child processes

Network configuration

  • BindAddress – restricts the server to listening on a single IP address
  • Listen – specifies multiple IP addresses and/or ports
  • KeepAlive – an extension to HTTP that provides persistent connections
  • Port – the TCP port number the web server runs on; can be changed to any unused port

URL redirection

  • Redirect – to redirect requests to another URL, for example: Redirect permanent /foo http://www.example.com/bar (example.com is a placeholder)

Virtual Hosts

Virtual hosting is the practice of maintaining more than one web server name on one physical machine. For example, the same physical machine can host both www.example.com and www.example.org.

Parameters specific to a virtual host override some of the main server configuration defaults. There are two types of virtual hosts – IP-based and name-based.

In an IP-based virtual host, the IP address of the connection is used to determine the correct virtual host to serve; this approach requires a separate IP address for each virtual host. In name-based virtual hosts, the host name is sent as part of the HTTP headers, which means that many different hosts can share the same IP address; however, you will need to map each host name to the IP address in DNS. This eases the demand for scarce IP addresses. Name-based virtual hosts cannot be used with SSL secure servers, and some older software may not be compatible.

Virtual host directives

  • NameVirtualHost – designate the IP address and port number to listen on (optional)
  • <VirtualHost> – same argument as NameVirtualHost
  • ServerName – designate which host is served
  • DocumentRoot – where in the file system the content for that host lives
  • ServerAlias – make the host accessible by more than one name

Here is an example.

NameVirtualHost *

<VirtualHost *>
ServerName www.domain.tld
DocumentRoot /www/domain
</VirtualHost>

<VirtualHost *>
ServerName www.otherdomain.tld
DocumentRoot /www/otherdomain
ServerAlias otherdomain.tld *.otherdomain.tld
</VirtualHost>

If no matching virtual host is found, then the first listed virtual host that matches the IP address will be used. All other standard Apache directives can be used inside the virtual host directive.
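Conceptually, name-based virtual hosting is a lookup of the Host header in a table, with the first listed virtual host as the fallback. A simplified sketch using the hosts from the example above:

```python
# DocumentRoot per ServerName/ServerAlias, as in the example above
VHOSTS = {
    "www.domain.tld": "/www/domain",
    "www.otherdomain.tld": "/www/otherdomain",
    "otherdomain.tld": "/www/otherdomain",   # ServerAlias
}
DEFAULT_ROOT = "/www/domain"   # first listed virtual host

def document_root(host_header):
    """Pick the DocumentRoot for a request from its Host header,
    ignoring any port suffix and letter case."""
    host = host_header.split(":")[0].lower()
    return VHOSTS.get(host, DEFAULT_ROOT)
```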

Access Control per Directory using .htaccess file

The .htaccess file is a text file containing Apache directives. Its name is set by the AccessFileName directive in httpd.conf:

AccessFileName .htaccess

Sample .htaccess file contents:

AuthName "restricted stuff"
AuthType Basic
AuthUserFile /usr/local/etc/httpd/htusers
AuthGroupFile /usr/local/httpd/htgroup
require valid-user
require group staff
require user lahai gaurab

A second example, restricting only POST requests:

AuthName "restrict posting"
AuthType Basic
AuthUserFile /usr/local/etc/httpd/htusers
AuthGroupFile /usr/local/httpd/htgroup
<Limit POST>
require group admin
</Limit>

htpasswd – to manage users for access control

htpasswd -c /usr/local/etc/httpd/htusers martin

htpasswd /usr/local/etc/httpd/htusers ritesh

/usr/local/etc/httpd/htusers holds the user names and password hashes created by htpasswd.

/usr/local/httpd/htgroup contents:

staff:martin jane
admin:lahai gaurab

References

Proxy and Web Caching with Squid

As with mail servers and web servers, FOSS also sets the standard in the area of proxy and cache servers. Squid is synonymous with proxy services in the networking world. It is a very modular, high-performance proxy and web caching server; the Squid website is www.squid-cache.org. Squid proxy caches can be clustered to provide much better speed and access, and Squid was also one of the first caches to implement a hierarchical cache system.

Some advantages of Squid:

  • High-performance proxy caching server for web clients
  • A full-feature web proxy cache
  • Designed to run on UNIX systems
  • Free, open source software
  • Handles all requests in a single, non-blocking I/O-driven process
  • Keeps meta data and especially hot objects cached in RAM
  • Caches DNS lookups
  • Implements negative caching of failed requests
  • Supports SSL, extensive access controls, and full request logging
  • Using ICP, caches can be arranged in a hierarchy or mesh for additional bandwidth savings

Squid consists of:

  • Main server program Squid
  • DNS lookup program dnsserver for faster DNS lookups
  • Optional programs for rewriting requests and performing authentication
  • Some management and client tools

squid.conf – the Main Configuration File

  • The default configuration file denies all client requests.
  • Configure Squid to allow access only to trusted hosts and/or users.
  • Carefully design your access control scheme, and check it from time to time to make sure that it works as you expect.
  • If the proxy allows access from untrusted hosts or users, people will abuse it: it makes their browsing anonymous, it can be intentionally used for transactions that may be illegal, and websites exist that list open-access HTTP proxies.

Here is an example of the Squid configuration: to run a basic Squid, the only thing that needs to be configured is the proxy port. The default Squid proxy port is 3128, but you can always change it.

Network options:
http_port [hostname:]port
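As a minimal sketch, changing the listening port in squid.conf takes a single line (8080 here is an arbitrary example):

```
# squid.conf: listen on port 8080 instead of the default 3128
http_port 8080
```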

Squid Access Control

Squid is well known for its powerful access control system. You can allow and restrict access based not only on IP addresses but also on domain names, and regular expressions let you create complex rules for access through the proxy server. Access control in Squid uses a sophisticated system, similar to the one used in routers. It is basically a two-step process:

  1. Defining the access list through use of the acl command; and
  2. Allowing or denying access based on the access list created earlier.

i. acl – used for defining an access list. 'acl' literally stands for access control list. The default ACLs are:

acl all src 0.0.0.0/0.0.0.0
acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255
acl SSL_ports port 443 563
acl Safe_ports port 80 21 443 563 70 210 1025-65535
acl Safe_ports port 280 # http-mgmt
acl Safe_ports port 488 # gss-http
acl Safe_ports port 591 # filemaker
acl Safe_ports port 777 # multiling http

ii. http_access – used to control HTTP access by clients. If there are no "access" lines present, the default is to allow the request.

http_access allow manager localhost
http_access deny manager
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access deny all

The “deny all” line is very important.


Restrict access to work hours (9 am-5 pm, Monday to Friday) from the 192.168.2.0/24 network:
acl ip_acl src 192.168.2.0/24
acl time_acl time M T W H F 9:00-17:00
http_access allow ip_acl time_acl
http_access deny all
Rules are read from top to bottom:
acl xyz src
acl morning time 06:00-11:00
acl lunch time 14:00-14:30
http_access allow xyz morning
http_access deny xyz
http_access allow xyz lunch
Here the last line never takes effect: a request from xyz during lunch is already matched by the preceding "deny xyz" line.
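The first-match behaviour of http_access can be sketched in a few lines of Python. This is a simplified model for illustration, not Squid's actual implementation:

```python
# Simplified model of Squid's http_access evaluation:
# rules are checked top to bottom, and the first rule whose
# ACLs all match the request decides the outcome.

def http_access(rules, request_acls):
    """rules: list of (action, [acl names]);
    request_acls: set of ACL names matching the current request."""
    for action, acls in rules:
        if all(a in request_acls for a in acls):
            return action
    # If no line matches, the default is the opposite of the last line.
    return "deny" if rules and rules[-1][0] == "allow" else "allow"

rules = [
    ("allow", ["xyz", "morning"]),
    ("deny",  ["xyz"]),
    ("allow", ["xyz", "lunch"]),   # never reached for xyz requests
]

print(http_access(rules, {"xyz", "morning"}))  # allow
print(http_access(rules, {"xyz", "lunch"}))    # deny: second rule matches first
```

Running the model shows why the lunch rule is dead: the bare "deny xyz" above it matches every xyz request first.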
Be careful with the order of rules when allowing subnets. If servernet is a subnet of mynetwork, the deny line must come before the allow line:
acl mynetwork src
acl servernet src
http_access deny servernet
http_access allow mynetwork
Always_direct and never_direct tags
# always go direct to local machines
always_direct allow my-iplist-1
always_direct allow my-iplist-2
# never go direct to other hosts
never_direct allow all

After all of the http_access rules, if access is not denied, then it is allowed.

If none of the “http_access” lines cause a match, then the default is the opposite of the last line in the list.

It is a good idea to have a “deny all” or “allow all” entry at the end of your access lists.

iii. cache_dir – the directory where cached data is stored

cache_dir /usr/local/squid/cache/ 100 16 256

The numbers give the cache size in megabytes (100) and the number of first-level (16) and second-level (256) subdirectories.

More than one disk can be used by giving one cache_dir line per mount point:

cache_dir /usr/local/squid/cache1/ 10000 16 256
cache_dir /usr/local/squid/cache2/ 20000 16 256

iv. cache_mgr – the e-mail address of the cache administrator

It is appended to the end of error pages returned to users.

cache_effective_user squid
cache_effective_group squid

These change the user and group IDs that Squid runs under once it has bound to the incoming network port.

ftp_user – sets the e-mail address that is used as the password for anonymous FTP logins when proxying FTP.

client – connects to a cache, requests a page, and prints out useful timing information.

v. Squid logs

Transparent Caching/Transparent Proxy

Transparent caching picks up the appropriate packets, caches the requests, and solves the biggest problem with caching: getting users to use the cache server, without them having to configure anything. Four factors need to be considered:

  • Correct network layout - all network traffic needs to pass through a filter device
  • Filtering: filtering out the appropriate packets
  • Kernel transparency: redirecting port 80 connections to Squid
  • Squid settings: Squid needs to know that it is supposed to act in transparent mode
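On GNU/Linux, the filtering and redirection steps are commonly handled with iptables. A minimal sketch, assuming the LAN-facing interface is eth0 and Squid listens on its default port 3128 (both assumptions to check against your own setup):

```
# Redirect HTTP (port 80) connections arriving on eth0
# to the local Squid process on port 3128
iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -j REDIRECT --to-port 3128
```

Squid itself must still be configured for transparent operation, as noted in the last bullet above.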

A detailed explanation of how to achieve transparent proxy is available at

Footnotes

  1. The Advanced Research Projects Agency Network is considered the precursor to the current Internet.
  2. Not to be confused with the ‘root’ user on GNU/Linux and Unix systems.
  3. A popular way to encrypt; it uses a one-way hash algorithm for encryption.
  4. The IP address of the loopback interface is 127.0.0.1.
  5. Secure Sockets Layer (SSL) encrypts data that is travelling over the public network.