Introduction to Information Technology

The current, editable version of this book is available in Wikibooks, the open-content textbooks collection, at
https://en.wikibooks.org/wiki/Introduction_to_Information_Technology

Permission is granted to copy, distribute, and/or modify this document under the terms of the Creative Commons Attribution-ShareAlike 3.0 License.

About

Target Audience

This book is designed for first-year undergraduate students majoring in Information Technology or a related field. The material is also useful for non-IT majors who wish to learn more about the information technologies used in today's business environments. For many students, this course is their first formal introduction to Information Technology as an academic discipline. Therefore, the book touches on a cross-section of the major topics in IT.

Authors

This book is being authored by students in the Introduction to Information Technology course at Georgia Southern University. However, we welcome contributions and edits from everyone with a passion for IT.


Introduction

What is Information Technology? Information Technology (IT) encompasses the study and application of computers and any form of telecommunications that store, retrieve, and send information. IT includes a combination of hardware and software used together to perform the essential functions people need and use every day. Most IT professionals work with an organization to meet its technological needs: understanding what it needs, presenting the current technology available for those tasks, and then either integrating that technology into the existing setup or creating a whole new setup.


Computing Hardware

Introduction

Computer hardware consists of the physical components within a computing device that work together to form a system, enabling the computer to process data. Computer hardware can be internal or external. Internal components are the parts not visible without opening the device, while external components are on or attached to the outside of the device. Examples of internal components are storage drives, processors, and fans. Examples of external components are keyboards, monitors, mice, and anything connected directly to the device (like a USB drive or printer).

Major Components of a Computing Device

Central Processing Unit (CPU)

The central processing unit (CPU) is the brain of any computer system. It manages the basic arithmetic, logical, control, and input/output (I/O) operations of a computer system, where I/O is the exchange of data between the computer and the outside world, including the user. The CPU fetches instructions from a program, decodes them, and then executes them. This process is called an instruction cycle. A modern microprocessor completes billions of these instruction cycles every second.
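
The instruction cycle described above can be sketched with a toy simulator. The three-instruction set (LOAD, ADD, HALT), the single accumulator register, and the function name run are all invented for illustration and do not correspond to any real CPU.

```python
def run(program):
    """Run a toy program and return the final accumulator value."""
    accumulator = 0       # single hypothetical register
    program_counter = 0   # index of the next instruction to fetch
    while True:
        instruction = program[program_counter]   # fetch
        program_counter += 1
        opcode, operand = instruction             # decode
        if opcode == "LOAD":                      # execute
            accumulator = operand
        elif opcode == "ADD":
            accumulator += operand
        elif opcode == "HALT":
            return accumulator
        else:
            raise ValueError(f"unknown opcode: {opcode}")


# Load 2, add 3, then stop.
result = run([("LOAD", 2), ("ADD", 3), ("HALT", None)])
print(result)  # 5
```

Each pass through the loop is one instruction cycle; a real processor performs the same three steps in hardware rather than in software.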

The CPU comes equipped with many specific components. The first component is known as the arithmetic logic unit (ALU), which performs simple arithmetic and logical operations. The second component is the control unit (CU), which directs all of the processor's operations for the various components of the computer. The CU reads and interprets instructions from memory and converts them into multiple signals to activate other parts of the computer, following those instructions. The CU calls upon the ALU at times to perform the necessary calculations.

The CPU also generates large amounts of heat. This requires a heatsink and fan attached to the CPU with thermal grease (also called thermal paste) to maintain safe temperatures. The thermal grease creates a uniform contact surface for excess heat to transfer across. Some computer builders prefer to use a larger aftermarket cooler or liquid cooler to keep the CPU protected from high temperatures.

Power Supply Unit

The power supply unit (PSU) is a vital hardware component of a computer; without a power supply, a computer would not work. The power supply converts alternating current (AC) from an electrical outlet into the lower-voltage direct current (DC) needed to run a computer, and supplies that power to the computer's various electrical components. There are two kinds of power supplies: linear power supplies and switched-mode power supplies.

Motherboard

Asus motherboard with various expansion slots

The motherboard is the main circuit board of the computer. It contains a socket for the CPU, memory slots for RAM, the BIOS, the controller ports for peripheral devices (keyboard, mouse, speakers, etc.), and even expansion slots for a video card or sound card. The motherboard also facilitates communication between all the computer's devices and systems. It is usually the main component of the computer and is often referred to as a system board. Some motherboards contain extra connectors that allow expansions to upgrade the computer and add additional chips.

Motherboards usually have one or more expansion slots. These are most commonly PCI (Peripheral Component Interconnect) slots or PCI Express (PCIe) x16 slots, and they are used to connect devices such as a video card, sound card, or WLAN adapter. They are normally aligned with an expansion card cutout on the computer case so that cables can be plugged into the cards. Most ATX motherboards have two or three PCIe x16 slots, depending on the brand and how high-end the board is.

System Memory

System memory, also referred to as random access memory (RAM), is a form of temporary storage that computers use to quickly access data. RAM is necessary because hard disk drive (HDD) read and write speeds are too slow for a computer's normal operation. RAM is called random access because the computer can quickly and directly access values anywhere in memory. To stay active and available, memory must be continuously supplied with electricity from the power supply, which is why data stored in memory is lost when you turn off a computer. Data in memory that you wish to save must be written to a hard disk drive or other storage device, where it will stay until deleted.

Hard Disk Drive

A hard disk drive (HDD) is a long-term, persistent storage device that can store large amounts of data. It serves as the main long-term storage where computers keep files such as pictures, music, videos, spreadsheets, documents, and databases. Most HDDs hold hundreds of gigabytes (GB) or more. Hard disk drives are slower than system memory for data access, but they are also significantly cheaper per gigabyte. Any information saved to a hard disk drive is persistent and will not be lost even if the computer is turned off. The same applies to flash drives, which serve as portable data storage accessible from any computer the drive is plugged into.

Solid State Drive

Solid state drives (SSDs) are the next technological evolution of the HDD. Unlike HDDs, SSDs do not contain any moving parts, making them cooler, faster, more durable, and silent. SSDs use NAND-based flash memory, which is non-volatile: unlike the volatile memory used for RAM, it retains data when it loses power. There are a few downsides to SSDs, though. SSDs typically come in smaller capacities and cost more than HDDs of the same size. As SSD technology matures, this is expected to change, and consumer SSDs have already dropped significantly in price.

Video Card

A video card is a type of expansion card, or circuit board, that can be inserted into a computer's motherboard to add functionality. Video cards are used to produce a supply of output signals which are fed into a display to produce images. The video card's subcomponents include a processing unit, memory, connections to a display device, and a heat sink. A video card typically consists of these components mounted on a printed circuit board. Video cards for desktop computers come in one of two size profiles, regular and low profile. These profiles are based on width only, while the length and thickness of the video card may vary. Most video cards allow access to advanced graphics like 2D and 3D images, gaming, and other video output.

Many modern computers come with integrated graphics, built into the motherboard or the CPU, eliminating the need to buy a separate component. However, a separate video card typically provides a much higher level of graphics performance.

Input Devices

An input device is any piece of computing hardware used to provide data and control signals to an information system such as a computer. Examples of input devices include keyboards, mice, and joysticks.

Output Devices

Output Devices are devices that receive data from a computer and convert the data into human-accessible physical forms such as text, audio, or video. Common output devices include monitors, which convert data from a computer into a readable display, printers, which convert data into physical reproductions, and speakers, which convert computer data into audio.

Casings

A computer case is the box that contains most of a computer's components. It is also known as a tower, base unit, or housing. It protects a computer's inner workings and makes the computer much easier to transport. Cases come in many different shapes and sizes, usually determined by the size of the motherboard. In fact, the motherboard, power supply, and case must all be compatible in order to function together. Cases are usually made from aluminum or steel, but other materials include glass, wood, and plastic.

Mainframe Computer

A mainframe computer is far larger and more powerful than a typical personal computer. Mainframes are usually large in size and can fill an entire room. They are also far more costly than personal computers, potentially hundreds of times the price. Their main purpose is to handle enormous volumes of calculations for big companies and governments. The Census Bureau uses such computers to process the vast amount of information gathered in the census of the general population. Federal and private banks also use these computers for processing transactions and other number-related work.

Units of Information

There are different ways to measure stored information. The smallest unit of information is known as a bit. A bit can have a value of either 1 or 0. A group of 4 bits is called a nibble, which is half a byte. Historically, a byte was the number of bits used to encode a single character of text on a computer, and its size varied between machines. Today a byte is 8 bits, also called an octet. Kilobytes, megabytes, and gigabytes are larger, more familiar units of information.
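
As a rough illustration of how these units relate, the sketch below converts between bits and bytes and formats a byte count in larger units. The function names are made up for this example, and it assumes decimal units (1 kilobyte = 1000 bytes); some contexts use 1024 instead.

```python
BITS_PER_NIBBLE = 4
BITS_PER_BYTE = 8


def bits_to_bytes(bits):
    """Convert a number of bits to bytes (8 bits per byte)."""
    return bits / BITS_PER_BYTE


def describe(size_in_bytes):
    """Format a byte count using decimal (power-of-1000) units."""
    units = ["bytes", "kilobytes", "megabytes", "gigabytes"]
    value = float(size_in_bytes)
    for unit in units:
        # Stop at the first unit that keeps the number below 1000,
        # or at the largest unit we know about.
        if value < 1000 or unit == units[-1]:
            return f"{value:g} {unit}"
        value /= 1000


print(bits_to_bytes(16))        # 2.0
print(describe(1500))           # 1.5 kilobytes
print(describe(2_000_000_000))  # 2 gigabytes
```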


Networking

Introduction

Networking is a form of telecommunication in which computers exchange data over data links. One computer network everyone is familiar with is the Internet. Computer nodes, or hosts, can access, create, delete, and alter data on the network. If a device can transmit information to another device, the two are considered to be networked. Networking uses devices such as switches, modems, routers, and gateways.

Network Topology

Network topology is the layout of a network, either physical or logical, described by the pattern in which its computers, devices, nodes, and links are connected. Different topologies show how a network is built and how each device is linked to the others. Common topologies are bus, ring, mesh, fully connected (or complete), star, and hierarchical (tree). Computers connect to a network, whatever its topology, in order to share information and communicate. Without a network, users cannot share files, send emails, print to shared printers, or create and share databases. An example is a Local Area Network (LAN): every node in the LAN has one or more links to other devices within the network, and mapping these links produces a geometric shape.

Classification

Network topology has eight classifications: Bus, Ring, Mesh, Star, Point-to-Point, Hybrid, Tree, and Daisy Chain.

 
Example of Bus Network

Bus

In a bus topology, each node connects to one single cable, which is essentially the spine of the network. Data travels along the cable in both directions and reaches every machine, where it is either accepted or ignored. This topology is inexpensive because there is only one cable, but that cable is also a single point of failure: if it fails, every computing device on it loses its connection.

 
Example of a Ring Network

Ring

A ring topology is essentially a bus topology in a closed loop. The difference is that data travels around the loop in one direction, passing through each node until a machine accepts it. Each node regenerates the signal to keep it strong. If one of the nodes fails, the loop is broken at that point, which can cut off communication between the other nodes.


Mesh

Example of Fully Connected Mesh

In a mesh network, every machine helps distribute data across the network. Data hops from node to node until it reaches the machine where it needs to be. There are two types of mesh topology: fully connected and partially connected.

In a fully connected mesh, each machine connects to every other machine in the network. This is usually only practical for a small number of machines, because the number of connections, and the upkeep they require, grows rapidly as nodes are added. If one node fails, the network keeps working, because data can take another path to reach the right machine.
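
To see why a fully connected mesh only suits small networks, it helps to count the links. With n nodes, each node links to the other n - 1 nodes, and each link is shared by two nodes, so there are n(n - 1)/2 links in total. The function name below is made up for this sketch.

```python
def full_mesh_links(nodes):
    """Number of links needed to connect every node to every other node."""
    return nodes * (nodes - 1) // 2


for n in (4, 10, 50):
    print(n, "nodes need", full_mesh_links(n), "links")
# 4 nodes need 6 links
# 10 nodes need 45 links
# 50 nodes need 1225 links
```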

 
Example of a Partial Mesh

In a partially connected mesh, each machine connects only to some of the other machines, often just one or two. This avoids the need for all the connections that a fully connected mesh network requires.


Star

This is one of the most popular network topologies. A star network consists of a central component, such as a hub, switch, or computer, that connects to all systems and transmits messages. These systems, also known as nodes, receive the messages or data and act as clients, whereas the central component acts as a server. One of the biggest advantages of this topology is that if a cable breaks or a computer on the network fails, the rest of the network continues to work. Other advantages include easy installation, easy detection of errors, and ease of sharing. The disadvantages are expense and dependence on the central component. Because this topology requires a lot of cabling, it is more expensive. If the central component stops working, the entire network, and anything connected to it, also stops working.

Point-to-Point

A point-to-point connection is a communications connection between two endpoints or nodes. An example is a telephone call, in which one telephone is connected with one other, and what is said by one caller can only be heard by the other. This is contrasted with a point-to-multipoint or broadcast connection, in which many nodes can receive information transmitted by one node. Other examples of point-to-point communications links are leased lines, microwave radio relay, and two-way radio.

Hybrid

Tree

Daisy Chain

Apart from star-based networks, daisy chaining is one of the easiest ways to add computers to a network: each computer is connected in series to the next. It works like the game of telephone: if a message is intended for a specific computer, it is passed down the line from one computer to the next until it reaches the one it was intended for.

Networking Hardware

Network Interface Card (NIC)

The network interface card (NIC) is the component of a computer responsible for sending and receiving data on a network. The NIC connects a PC to the local network and, through it, to the Internet. To avoid conflicts inside a local network, every NIC is assigned a Media Access Control (MAC) address, which is usually stored in the NIC's permanent memory. To keep MAC addresses unique, the Institute of Electrical and Electronics Engineers (IEEE) administers the address blocks allocated to manufacturers, so that no two devices should end up with the same address.
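
A MAC address is 48 bits long and is usually written as six two-digit hexadecimal groups. The first three groups identify the block assigned to the manufacturer, and the last three identify the individual device. The sketch below splits an address into those two halves; the function names and the example address are made up for illustration.

```python
def parse_mac(text):
    """Return the 6 bytes of a MAC address written like '00:1A:2B:3C:4D:5E'."""
    parts = text.replace("-", ":").split(":")
    if len(parts) != 6:
        raise ValueError("a MAC address has six groups")
    return bytes(int(part, 16) for part in parts)


def format_bytes(data):
    """Format bytes as lowercase hex groups separated by colons."""
    return ":".join(f"{byte:02x}" for byte in data)


mac = parse_mac("00:1A:2B:3C:4D:5E")   # made-up example address
manufacturer_part = mac[:3]             # block assigned to the manufacturer
device_part = mac[3:]                   # chosen by the manufacturer per device
print(format_bytes(manufacturer_part))  # 00:1a:2b
print(format_bytes(device_part))        # 3c:4d:5e
```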

Wired Technologies

There are several wired technologies used to connect to a local area network. Coaxial cables contain copper or aluminum wire surrounded by two insulating layers and are used for cable systems, office buildings, and other work sites. Coaxial cable transmission speeds range from 200 million to 500 million bits per second. Twisted pair wire, in which the wires are twisted into pairs, is common for all telecommunication. Ordinary telephone wires consist of two pairs, while wired Ethernet cables consist of four pairs. These cables have transmission speeds ranging from 2 million to 10 billion bits per second. Optical fiber carries data at high rates that can reach trillions of bits per second.

Wireless Technologies

TP-Link Archer C9 router used to create a wireless home network

A wireless network is any type of computer network that connects network nodes without using wires. It is popular because it offers an easier and faster way to link devices. For example, in a traditional workplace, using wireless devices eliminates the possibility of having the wrong cables unplugged. The base of a wireless network is the access point, which sends out radio-frequency signals that computers can detect and join. All wireless devices also have a wireless LAN adapter built in that sends and receives data over those radio signals.

Network switch

A network switch is a multi-port device that connects multiple computers together to create a network. It can be used for sharing data between computers and can also act as a network bridge. The switch receives network packets from each connected device and forwards them to their destination on the network. Unlike a less advanced network hub, a network switch forwards data only to the device or devices that need it, rather than broadcasting it to all of its ports. Other names for a network switch are switching hub, bridging hub, and MAC bridge.
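
The difference between a switch and a hub can be sketched in a few lines. The toy class below assumes the switch decides where to forward by remembering which port each sender's MAC address was last seen on; the class name, method name, and shortened addresses are all made up for this example.

```python
class LearningSwitch:
    """Toy switch that learns which MAC address lives on which port."""

    def __init__(self, port_count):
        self.ports = list(range(port_count))
        self.mac_table = {}   # MAC address -> port number

    def receive(self, in_port, source_mac, destination_mac):
        """Return the list of ports the frame is forwarded to."""
        self.mac_table[source_mac] = in_port      # learn the sender's port
        if destination_mac in self.mac_table:
            out_port = self.mac_table[destination_mac]
            # Nothing to forward if the destination is on the arrival port.
            return [] if out_port == in_port else [out_port]
        # Unknown destination: send to every other port, as a hub would.
        return [port for port in self.ports if port != in_port]


switch = LearningSwitch(4)
first = switch.receive(0, "aa", "bb")    # "bb" not yet known
second = switch.receive(1, "bb", "aa")   # "aa" was learned on port 0
third = switch.receive(0, "aa", "bb")    # "bb" was learned on port 1
print(first, second, third)  # [1, 2, 3] [0] [1]
```

A hub corresponds to the last line of receive alone: every frame goes to every other port, regardless of destination.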

Ethernet Hub

An Ethernet hub (multi-port repeater) is a small, rectangular network hardware device that connects many computers and other network devices at a single central point. Once connected through the hub, all computers and network devices can communicate with each other. The number of ports on an Ethernet hub varies, commonly four, five, eight, or sixteen. The original Ethernet hubs offered only 10 Mbps speeds; newer hubs usually support both 10 Mbps and 100 Mbps.

Modems

A modem connects network nodes over media that were not originally designed for digital network traffic, whether wired or wireless. A common example is a Digital Subscriber Line (DSL) modem, which carries data over telephone lines.

Firewall

The computing term "firewall" came into existence during the 1980s, around the time the Internet emerged as a new, globally used technology. A firewall is a hardware or software network device that is responsible for controlling network access and security. Firewalls track all incoming and outgoing traffic and block or allow it based on preset parameters. With the increased prevalence of cyber attacks, firewalls are essential for any network to remain secure.

There are many different types of firewalls used to fulfill different purposes. These include but are not limited to:

Network Layer or Packet Filters

    This firewall works at the lowest level. Everything communicated between the network and the computer is sent as packets of information. This type of firewall examines each of these packets and filters them according to rules set by either the system or the user.

Proxies

    Proxy servers serve as a kind of gateway between networks and can pass and filter packets between them. Their job is to make it difficult for outsiders to access an internal system. Network intruders may use public systems as proxies to perform an action known as IP spoofing. IP spoofing is when IP packets are created with a false source IP address to disguise the identity of the sender. This method can also be used to impersonate another network or system.

Application Layer

    These firewalls work at the application level and filter packets going to and coming from a specific application.
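
The packet-filter idea from the list above can be sketched as a rule table that each packet is checked against. The Packet fields, the rule format, the example rules, and the block-by-default policy are all assumptions made for this illustration; real firewalls have their own rule languages.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    source_ip: str
    destination_port: int
    protocol: str


# Hypothetical preset rules: (protocol, destination port, action).
RULES = [
    ("tcp", 443, "allow"),   # secure web traffic
    ("tcp", 22, "block"),    # remote logins
]
DEFAULT_ACTION = "block"     # assumed policy for packets no rule matches


def filter_packet(packet, rules=RULES):
    """Return 'allow' or 'block' for a packet, using the first matching rule."""
    for protocol, port, action in rules:
        if packet.protocol == protocol and packet.destination_port == port:
            return action
    return DEFAULT_ACTION


print(filter_packet(Packet("203.0.113.5", 443, "tcp")))  # allow
print(filter_packet(Packet("203.0.113.5", 22, "tcp")))   # block
print(filter_packet(Packet("203.0.113.5", 53, "udp")))   # block
```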

Cables

A wired network must contain some type of medium to transfer data over. The types of media can include:

  • Coaxial cable - A cable with an insulated copper or aluminum conductor. This type is commonly used for cable television and CCTV networks.
  • Power line communication - This refers to the transfer of data over electrical wires.
  • Ethernet cables - Also known as twisted pair cables because the individual wires are twisted into pairs. This is the most common type for home networks.
  • Fiber optic - A strand of glass fiber that carries pulses of light to transmit data. Fiber optic cables can carry multiple streams of data on different wavelengths of light, which increases the data transfer rate. These cables have low signal loss and thus are used for long-distance lines, such as undersea cables.

These cable types are listed roughly from slowest transfer speed to fastest.

Ethernet cables require a repeater (a device that cleans the signal and retransmits it at a higher strength) about every 100 meters. Fiber optic cables, on the other hand, only require a repeater every 10 to 100 kilometers. This makes them well suited for undersea cables such as the transatlantic cables.
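
A quick calculation shows how much these distances matter. The sketch below estimates how many repeaters a cable run needs, assuming one repeater at every point where the maximum segment length is reached; the function name is made up, and the distances are the approximate figures given above.

```python
import math


def repeaters_needed(distance_m, max_segment_m):
    """Repeaters needed when no segment may exceed max_segment_m."""
    segments = math.ceil(distance_m / max_segment_m)
    return max(segments - 1, 0)


# A 250 m run of Ethernet cable, with a repeater about every 100 m.
print(repeaters_needed(250, 100))            # 2
# A 250 km run of fiber, with a repeater every 10 km or every 100 km.
print(repeaters_needed(250_000, 10_000))     # 24
print(repeaters_needed(250_000, 100_000))    # 2
```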

Computer Network Diagram Symbolization

A computer network diagram is an illustration portraying the nodes, and the connections among nodes, in any telecommunications network. Basic symbols and pictures are used to portray common network appliances. In a typical diagram, lines connect several computers to a single switch, which may also be connected to a printer or fax machine and a router. The clouds seen in many diagrams represent external networks, showing the connections between external and internal devices without the details of the outside network. In some cases, representative hypothetical devices are pictured instead of every existing node. For example, if a network appliance is intended to be connected through the Internet to many mobile devices, only a single mobile device may be shown.

 
twisted pair cable from side
 
twisted pair cable from top

IP Addressing

IP Address

An IP address is a number that uniquely identifies a computing device connected to the Internet; in the common IPv4 format it is written as a series of numbers separated by periods. The Internet Protocol (IP) is the set of rules that governs how data is addressed and delivered across the connected parts of the Internet. An IP address allows data sent over the Internet to reach its intended destination, making two-way communication possible. An IP address can be static or dynamic. A dynamic IP address is assigned automatically to a computing device each time it connects to the network and may change from one connection to the next. A static IP address never changes, giving remote computers a convenient and reliable way to reach the device.
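
As a small illustration, Python's standard ipaddress module can parse an address in the dotted format and show the single number behind it. The address below comes from a range reserved for documentation and is only an example.

```python
import ipaddress

address = ipaddress.ip_address("192.0.2.10")   # example address only

# The four numbers between the periods, each in the range 0-255.
octets = [int(part) for part in str(address).split(".")]

# The same address as one 32-bit number.
as_integer = int(address)

print(address.version)   # 4
print(octets)            # [192, 0, 2, 10]
print(as_integer == (192 << 24) + (0 << 16) + (2 << 8) + 10)  # True
```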

The Internet protocol suite is the set of communication protocols used on the Internet and similar computer networks. It is commonly known as TCP/IP, after the Transmission Control Protocol (TCP) and the Internet Protocol (IP), which were its first networking protocols. It is also known as the Department of Defense (DoD) model because its development was funded by DARPA. TCP/IP specifies how data should be organized: for example, how it should be addressed, transmitted, routed, and received. Its protocols are sorted into four layers. The lowest, the link layer, handles communication for data that remains within a single network. Second, the internet layer connects independent networks, providing internetworking. Third, the transport layer handles host-to-host communication. Finally, the application layer provides process-to-process data exchange for applications.

Private IP

A private IP address is an IP address that cannot be reached directly from the Internet; private addresses are usually handed out by routers or other network devices. They are used because they provide a completely separate set of IP addresses that still allows access on a local network without taking up any of the public IP address space.

Public IP

A public IP address is any IP address that is directly reachable on the Internet. Public addresses are usually used by web sites, DNS servers, or network gateways. A public IP address is completely unique and can only be assigned to one computing device at a time.
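
The private/public distinction can be checked with Python's standard ipaddress module, which knows the reserved address ranges. The helper name describe_address and its four labels are made up for this sketch.

```python
import ipaddress


def describe_address(text):
    """Label an IP address as loopback, private, public, or other reserved."""
    address = ipaddress.ip_address(text)
    if address.is_loopback:      # checked first: loopback also counts as private
        return "loopback"
    if address.is_private:
        return "private"
    if address.is_global:
        return "public"
    return "other reserved"


print(describe_address("192.168.1.20"))  # private
print(describe_address("10.0.0.5"))      # private
print(describe_address("8.8.8.8"))       # public
print(describe_address("127.0.0.1"))     # loopback
```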

Class Id

The Internet is the biggest and most complex TCP/IP network to date. One of its biggest challenges is making sure that no two devices end up with the same IP address. An institution called the Internet Assigned Numbers Authority (IANA) was formed to track and administer IP addresses for those who need them. IP addresses were divided into five classes, distinguished by the first number of the address, to match the needs of different people and companies. Class A covers 1-126, Class B covers 128-191, Class C covers 192-223, Class D covers 224-239, and Class E covers 240-255. The number 127 does not appear because it is reserved for the loopback address.
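
The class of an address can be read from its first number alone. The sketch below applies the first-number ranges for the five classes, with Class C starting at 192 and 127 set aside for loopback; the function name is made up for illustration.

```python
def address_class(first_octet):
    """Return the class of an IPv4 address from its first number."""
    if not 0 <= first_octet <= 255:
        raise ValueError("the first number must be between 0 and 255")
    if first_octet == 0:
        return "reserved"
    if first_octet == 127:
        return "loopback"
    if first_octet <= 126:
        return "A"
    if first_octet <= 191:
        return "B"
    if first_octet <= 223:
        return "C"
    if first_octet <= 239:
        return "D"
    return "E"


for first in (10, 127, 172, 192, 224, 240):
    print(first, address_class(first))
# 10 A
# 127 loopback
# 172 B
# 192 C
# 224 D
# 240 E
```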

DHCP

Dynamic Host Configuration Protocol (DHCP) is a client/server protocol that provides an Internet Protocol (IP) host with its IP address and other related configuration information, such as the subnet mask and default gateway. Addresses are handed out by a DHCP server. So if you move your computer or get a new one, the DHCP server gives it an IP address automatically, instead of you having to configure it manually. The DHCP server assigns addresses from the same subnet to the other devices on the network.
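
The bookkeeping a DHCP server does can be sketched as a pool of addresses handed out per client. This toy class is not the real DHCP message exchange: the class name, the use of the client's MAC address as its identifier, and the example subnet and gateway are all assumptions made for illustration.

```python
import ipaddress


class DhcpPool:
    """Toy address pool that hands out one address per client."""

    def __init__(self, network, gateway):
        self.network = ipaddress.ip_network(network)
        self.gateway = ipaddress.ip_address(gateway)
        self.leases = {}   # client MAC address -> leased IP address

    def request(self, client_mac):
        """Return the configuration for a client, leasing an address if needed."""
        if client_mac not in self.leases:
            in_use = set(self.leases.values()) | {self.gateway}
            for host in self.network.hosts():
                if host not in in_use:
                    self.leases[client_mac] = host
                    break
            else:
                raise RuntimeError("address pool exhausted")
        return {
            "ip_address": str(self.leases[client_mac]),
            "subnet_mask": str(self.network.netmask),
            "default_gateway": str(self.gateway),
        }


pool = DhcpPool("192.168.1.0/24", "192.168.1.1")
first = pool.request("aa:aa:aa:aa:aa:01")
second = pool.request("aa:aa:aa:aa:aa:02")
print(first["ip_address"], first["subnet_mask"])   # 192.168.1.2 255.255.255.0
print(second["ip_address"])                         # 192.168.1.3
```

Both clients receive different addresses from the same subnet, along with the same subnet mask and default gateway, without any manual configuration.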


Virtualization

Introduction

Virtualization is the creation of a simulated version of something else. Computers can create virtual networks, storage, disk drives, operating systems, and other forms of hardware. Physical machines are referred to as "host" machines, while simulated machines are referred to as virtual machines. In addition to the services of the physical hardware, virtual hardware gives the user extended services by providing more functionality than was first installed on the device. Virtual machines usually have more configuration options than physical machines, which is one reason many people create virtual devices to extend the range of their current computer.

Virtualization

The word virtual describes something that does not physically exist or cannot be physically touched. In IT, the same is true of virtualization: it is the simulated part of a computing device that cannot be touched. Examples are virtual operating systems, storage, and disk drives.

Reasons for Virtualization

In networking, virtualization is highly useful. We work with tools like routers, switches, and servers. In the real world, having access to all these tools could be expensive, but with the aid of virtualization we can work with them virtually. In other words, the major reason for virtualization is that it is cost-effective. It also brings benefits like expansion. For example, when configuring multiple routers connected to a server on a network, we can set up another topology virtually and apply whatever configuration we want.

Hardware Virtualization

This type of virtualization creates a simulation of a real computer with a working operating system (OS). The host machine is the actual computer that runs the virtual machine, and the virtual machine is the guest machine. The guest machine can run software independently of the operating system installed on the host; for example, the guest can run Linux or Unix even if the host runs Windows. There are different types of hardware virtualization: full virtualization, partial virtualization, and paravirtualization.

Full Virtualization

Full virtualization simulates almost all of the hardware that the desired software requires, so that the software, usually including a complete guest operating system, can run unmodified.

Partial Virtualization

Partial virtualization simulates only some of the hardware that software requires. Any software that needs hardware that is not simulated must be modified before it can be used in the simulated environment.

Para-virtualization

Paravirtualization is another virtualization technique that presents virtual machines with a software interface that is similar, but not identical, to that of the underlying hardware. It is used to reduce the time the guest spends on operations that are substantially more difficult to run in a virtual environment than in a non-virtualized one. Paravirtualization provides "hooks" that allow the guest and host to request and acknowledge these tasks, so that they run on the host instead of in the virtual domain, where performance is significantly worse.

Server Virtualization

Server virtualization refers to partitioning a physical server into one or more virtual server machines. Most servers tend to use only a small part of their processing power. Server virtualization makes fuller use of that capacity, so a single physical server can do the work of many. In a data center, one server could be running several virtual operating systems simultaneously, which in turn reduces operational cost and the space needed for physical servers.

Network Virtualization

Network virtualization consists of recreating a physical network in software. Software and services are installed to create channels that are securely separated from each other and can be assigned to particular devices. This in turn allows the sharing of applications, storage, and computing cycles to be managed. Therefore, no matter how the physical components are set up, the servers or assigned devices on the network become a pool of resources for anyone to access. Like most forms of virtualization, this method serves the main goal of increasing hardware utilization.

Snapshots

A snapshot is a saved state of a virtual device that allows one to undo any changes made after the exact point at which the snapshot was taken. It is a good backup tool when one is making changes to the virtual device and makes an error of some sort.
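
The idea can be sketched by modeling a virtual machine's state as ordinary data and a snapshot as a full copy of it. The VirtualMachine class, its methods, and the example state are invented for illustration; real virtualization software stores snapshots of disks and memory in its own formats.

```python
import copy


class VirtualMachine:
    """Toy virtual machine whose whole state is a dictionary."""

    def __init__(self):
        self.state = {"files": [], "settings": {}}
        self._snapshots = {}   # snapshot name -> saved copy of the state

    def take_snapshot(self, name):
        # Deep copy, so later changes do not alter the saved state.
        self._snapshots[name] = copy.deepcopy(self.state)

    def restore_snapshot(self, name):
        # Copy again, so the snapshot itself can be restored more than once.
        self.state = copy.deepcopy(self._snapshots[name])


vm = VirtualMachine()
vm.state["files"].append("report.txt")
vm.take_snapshot("before-upgrade")

vm.state["files"].clear()             # a mistake made after the snapshot
vm.restore_snapshot("before-upgrade")
print(vm.state["files"])              # ['report.txt']
```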

Reasons for Virtualization

One of the main reasons for virtualization is to cut costs by reducing the number of servers on a network. Cutting hardware costs does not mean losing anything else: your operating system still runs, just on a virtual machine instead of on its own physical server. This reduces costs further, as it takes less energy to run a virtual machine than a dedicated physical server. A virtual machine can also be moved from one machine to another, which saves costs as well.

Benefits

Expansion

Virtualization’s biggest advantage is its ability to expand. This means that if an application takes up too much space on a server, another virtual environment can be created on another server. Server virtualization also decreases the number of servers needed to run all your applications. Previously, each app was assigned its own server, which took up more physical space and energy; virtualization allows more than one application to run on a single server instead of on several different ones.

Transportation

Another advantage of virtualization is that a virtual machine can be transported from one location to another. A virtual machine can be copied in its exact current state; the copy is called a snapshot. Snapshots preserve every component of a virtual machine, so they can be used as a perfect backup or transferred to a different location. This process, called migration, is performed by stopping a VM, creating a snapshot (a perfect copy), moving the snapshot to another machine, and then resuming normal operations. Most often, it is used to move the operations of one server to another while the physical server undergoes maintenance or replacement.

Redundancy


Security

Because virtualization allows full control of the environment, it can be made much more secure. If an error occurs inside a virtual machine that could harm the computer, the damage is contained within that virtual machine. This isolation is useful for testing programs on many kinds of virtual machines without putting the host system at risk of failure or damage.

Risks/Drawbacks

While virtualization is an efficient and cost-saving technique for businesses, it does have risks and drawbacks. Security is one of the largest concerns, because it is so easy for someone to create a virtual machine of their own. There are ways to guard against this, but every system has its flaws. Multiple virtual machines can also be much more difficult to patch and keep track of than the hardware they replace. Limited control over what is happening on these machines, and over where customer data is stored and handled, presents further risks. However, with a well-developed and well-maintained system, virtualization can still save money and increase efficiency.

Operating Systems

An operating system is system software that manages a computer's hardware and software and performs basic tasks on behalf of its programs. All programs except firmware need an operating system in order to run. Operating systems are found on almost every computing device, including computers, mobile phones, game consoles, and tablets. Four names account for most of the market today: Microsoft, Google, Apple, and Linux (an open-source project rather than a company). Microsoft dominates the desktop, Google dominates smartphones, and Linux dominates supercomputers. Apple follows behind Microsoft on desktops and behind Google on smartphones.

Host Operating System

A host operating system (OS) is the primary operating system installed on the physical computer. It is used in virtualization: the host OS runs and controls an installed Type 2 hypervisor, also called a Virtual Machine Monitor (VMM), which in turn manages multiple virtual machines (VMs). Each VM runs its own operating system, referred to as a guest operating system. The guest operating systems share the hardware and resources of the host. They are not aware of the host OS; they are managed through their virtual machines.

Guest Operating System

A guest operating system is the operating system installed on a virtual machine or on a partitioned disk, alongside the host operating system. It can provide an alternative OS for a device: for example, the host operating system could run Windows while the guest OS runs Linux. A guest OS must be present in order for a virtual machine to exist. The guest operating system on a virtual machine may be different from the host OS, but a guest OS on a partitioned disk has to be the same as the host operating system.


Operating Systems

Introduction

An operating system (OS) is an important program that runs on a computer. It is a type of system software that controls computer hardware and software and provides common services for computer programs. All computers must have an operating system to run programs and applications. The operating system also provides an interface through which users interact with the hardware and software.

Major Operating Systems

Microsoft Windows

Apple OS

History

The Apple operating system currently known as OS X originated at NeXT, a company Steve Jobs founded after leaving Apple. The original system, NeXTSTEP, was launched in 1989 and was later followed by OPENSTEP. When Apple purchased NeXT and brought Steve Jobs back in 1996, it used OPENSTEP as the basis for its new operating system. Jobs began to lead the transformation of the programmer-oriented OPENSTEP into a system friendlier to home and business users, under a project called Rhapsody that was later renamed Mac OS X. Apple released the first consumer version of OS X in 2001. Since then, OS X has gone through several versions, most of them named after big cats, such as Lion and Snow Leopard. Since the release of the iPhone, Apple has made OS X more compatible with iOS, the operating system of the iPhone and iPad, and OS X has taken on many design elements from iOS. Today OS X users make up over 10% of the computer-using population.

Linux

Linux is the best-known and most-used open source operating system. As an operating system, Linux is software that sits underneath all of the other software on a computer, receiving requests from those programs and relaying these requests to the computer's hardware.


Distributions

Ubuntu

Known by many as the most user-friendly distribution of Linux, Ubuntu is a Debian-based operating system for personal, smartphone, and corporate use. Ubuntu owes much of its popularity to its user-friendly UI, known as Unity. Ubuntu is developed and distributed for free by Canonical Ltd.

Fedora
RedHat

Red Hat, Inc. is an American multinational software company providing open-source software products to the enterprise community. ... Red Hat has become associated to a large extent with its enterprise operating system Red Hat Enterprise Linux and with the acquisition of open-source enterprise middleware vendor JBoss.

File Systems

File Allocation Table (FAT):

A file allocation table (FAT) is a type of file system that the operating system maintains on a storage device. The FAT provides a map of the clusters that contain each file. It works by dividing the disk into clusters. There are two types of clusters on a FAT volume:

  • Data Clusters: hold the file contents
  • Directory Clusters: contain metadata for all files (file names, timestamps and starting cluster for the file)

After each cluster is given a unique ID number, the file system uses a table to track which part of a file is stored in each cluster.
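The way a file is rebuilt from its clusters can be sketched as follows. This is a simplified model rather than the on-disk FAT format: the cluster numbers, the `END_OF_FILE` marker, and the file name are made up for illustration. The table maps each cluster to the next cluster of the same file, and the directory entry supplies the starting cluster.

```python
END_OF_FILE = -1  # hypothetical marker for the last cluster of a file

# Data clusters: cluster ID -> the chunk of file content stored there.
data_clusters = {2: "Hel", 5: "lo, ", 3: "FAT", 7: "!"}

# The allocation table: cluster ID -> ID of the next cluster in the same file.
fat = {2: 5, 5: 3, 3: 7, 7: END_OF_FILE}

# Directory information: file name -> starting cluster.
directory = {"greeting.txt": 2}


def read_file(name):
    """Follow the chain of clusters for a file and join their contents."""
    cluster = directory[name]
    parts = []
    while cluster != END_OF_FILE:
        parts.append(data_clusters[cluster])
        cluster = fat[cluster]
    return "".join(parts)


content = read_file("greeting.txt")
```

Note that the clusters of one file do not have to be next to each other on the disk; the table is what ties them together in the right order.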

New Technology File System

New Technology File System (NTFS) is a journaling file system commonly used by Microsoft Windows. Because it keeps a journal of changes, it can recover from some disk errors. Windows relies on NTFS for improved reliability, support for large amounts of storage, and encryption of sensitive files. NTFS can manage disk space larger than 16 terabytes (TB). It supports the Unicode character set and allows long file names.

Giving Commands to an Operating System

The ls command illustrates the common format that Linux commands follow: a command name, optionally followed by options. "ls" stands for "list": typing ls at the Linux command line asks the computer to list the contents of a directory. A plain ls lists the ordinary files and directories. Adding a hyphen and a letter changes what is listed, depending on the letter. For instance, "ls -a" lists all the files in the directory, including hidden ones.
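The difference between a plain listing and a listing with the -a option can be imitated in Python. This is a rough sketch, not a reimplementation of ls: the `list_dir` function and its `show_all` parameter are hypothetical names, and the real `ls -a` also shows the special entries `.` and `..`.

```python
import os
import tempfile


def list_dir(path, show_all=False):
    """Return sorted entry names, hiding dotfiles unless show_all is True."""
    names = sorted(os.listdir(path))
    if show_all:
        return names
    return [n for n in names if not n.startswith(".")]


# Build a throwaway directory with one ordinary file and one hidden file.
with tempfile.TemporaryDirectory() as demo:
    for name in ("notes.txt", ".hidden"):
        open(os.path.join(demo, name), "w").close()
    plain = list_dir(demo)                      # like "ls"
    everything = list_dir(demo, show_all=True)  # like "ls -a"
```

On Linux, a file is treated as hidden simply because its name starts with a dot, which is why the filter only looks at the first character.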


Web Technologies

Introduction

In order to make websites look and function a certain way, web developers utilize different languages. The three core languages that make up the World Wide Web are HTML, CSS, and JavaScript.

In the IT world, the internet is an essential platform, whether for development or for consumer use. When developing a website, these three languages typically come into play together. HTML is the backbone of most web pages: it creates the structure of a page, including its headings, paragraphs, body, links, and images.

Markup Languages

Markup languages are the languages in which web documents are written. The most common markup language is HTML, which uses tags to annotate text so that a computer can process it. Most markup languages are human-readable and use annotations that are distinguishable from the annotated text. There are many different markup languages, but all share this approach of annotating documents.

Hypertext

Hypertext is an arrangement of information in a database that lets the user navigate from one document to another by clicking on highlighted words or pictures inside the current document. Hypertext is the basis of the World Wide Web, because it enables users to follow links to get more information. The term covers all links, whether they appear as text or as graphical elements.

Hypertext Markup Language (HTML)

HTML is the standard markup language used to create web pages and web applications. It defines the basic structure of a page. An HTML document is made up of elements, most of which are delimited by an opening tag, <tag>, and a closing tag, </tag>. Everything between <html> and </html> is the content of the web page. The section between <head> and </head> holds information about the page, including its title, which is written between the <title> and </title> tags. The section between <body> and </body> is the main content of the page, which can include links, paragraphs, headings, and various other elements.
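To make the structure concrete, the sketch below feeds a small HTML page to Python's standard `html.parser` module and records which opening tags it meets and what the title is. The page text and the `OutlineParser` class are invented for illustration; a browser does far more than this when it reads a page.

```python
from html.parser import HTMLParser

PAGE = """<!DOCTYPE html>
<html>
  <head><title>My First Page</title></head>
  <body>
    <h1>Welcome</h1>
    <p>This is a <a href="https://en.wikibooks.org">link</a>.</p>
  </body>
</html>"""


class OutlineParser(HTMLParser):
    """Collects opening tags in document order and the text of <title>."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text that appears between <title> and </title> is kept.
        if self._in_title:
            self.title += data


parser = OutlineParser()
parser.feed(PAGE)
```

The recorded tags show the nesting described above: `html` contains `head` (with `title`) and `body` (with the heading, paragraph, and link).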

Here are the most commonly used HTML tags:

Tag            Description
<h1> - <h6>    Gives a web page a heading. 1 is the largest heading you can have and 6 is the smallest.
<p>            Starts a paragraph in your web page.
<i>            Italic font style.
<b>            Bold font style.
<a>            Inserts hyperlinks onto a web page.
<ul> & <li>    <ul> starts an unordered (bulleted) list, and <li> marks each item in the list. Ordered (numbered) lists use <ol> instead of <ul>.
<!DOCTYPE>     Defines the document type of the web page.
<!-- -->       Allows you to insert comments into your HTML code. Comments aren't displayed on the web page, but are helpful for organization.
<img>          Inserts an image onto a web page.
<br>           Inserts a line break between bodies of text.

HTML Major Versions

HTML 2.0

Published in 1995 as an RFC (RFC 1866), HTML 2.0 gave HTML its first formal specification, describing in detail how the language works.

HTML 3.2

Published in 1997, HTML 3.2 performed major housecleaning on the structure of HTML. It removed mathematical formulas, reconciled overlapping code, and adopted Netscape's visual markup tags.

HTML 4.0

Published at the end of 1997, HTML 4.0 introduced three variants (Strict, Transitional, and Frameset) and browser-specific plugins. It allowed custom experiences tailored to specific browsers.

XHTML

Released in 2000, XHTML fused HTML and XML into a language that was very precise, arguably too precise. XHTML is widely considered tedious and difficult to write.

HTML 5.0

Released in 2014, HTML 5.0 is the version of HTML currently in use. It removed some of the tedium and strictness of XHTML while remaining precise and detailed.

Hypertext Transfer Protocol (HTTP)

HTTP is the protocol used by the World Wide Web. It determines how messages are formatted and transmitted, and what actions web servers and browsers should take in response to various commands. When you open your web browser and enter a URL, you are using HTTP: the browser sends an HTTP command to the web server, directing it to fetch and transmit the requested web page.

HTTP Protocol

HTTP is an application-layer protocol that forms the foundation of communication on the web. "http" is the first part of a web address. HTTP is a request-response protocol: the client sends a request, and the server's response gives the client access to the requested information. For example, when we updated our virtual machines, an HTTP request went out asking for software updates, and the updates came back in the response. On a bank website or the Wikibooks site, the address begins with https; the 's' stands for secure. This means that the connection between the computer and the website is encrypted.
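The request-response exchange is plain text underneath, which the following sketch shows without touching the network. The helper names `build_get_request` and `parse_status_line` are hypothetical, and the sample response is hand-written for illustration rather than captured from a real server.

```python
def build_get_request(host, path="/"):
    """Return the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, path, protocol version
        f"Host: {host}",         # header naming the site being asked
        "Connection: close",
        "",                      # a blank line ends the headers
        "",
    ]
    return "\r\n".join(lines)


def parse_status_line(response_text):
    """Split the first line of a response into (version, code, reason)."""
    status_line = response_text.split("\r\n", 1)[0]
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


request = build_get_request("en.wikibooks.org", "/wiki/Main_Page")

# A hand-written example of what a server's reply might start with.
sample_response = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>...</html>"
status = parse_status_line(sample_response)
```

The status code in the first line of the response tells the client whether the request succeeded; 200 means the page was found and is included in the response.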

Cascading Stylesheets (CSS)

CSS is a style sheet language, standardized by the W3C (World Wide Web Consortium), used to define the visual presentation of web pages. CSS allows web developers to separate a web page's content and its visual styles into different documents, and it gives better control over page layout. An external style sheet is generally linked to HTML or XHTML documents, but it can also be used with XML, SVG, and XUL. Together with HTML and JavaScript, CSS is one of the core technologies behind the interfaces of most websites. It is also used in interfaces for mobile devices, making websites more engaging.


Here are the most commonly used CSS properties:

Property        Description
background      A shorthand property for setting all the background properties in one declaration.
color           Sets the color of text.
opacity         Sets the opacity level for an element.
border          Sets all the border properties in one declaration.
border-color    Sets the color of the four borders.
float           Specifies whether or not a box should float.
padding         Sets all the padding properties in one declaration.
/*...*/         Allows you to insert comments into your CSS code. Comments aren't displayed on the web page, but are helpful for organization.
width           Sets the width of an element.
clear           Specifies on which sides of an element other floating elements are not allowed.

Types of CSS

CSS can be incorporated with HTML in three different ways: inline, internal, and external.

  1. Inline styles add style to a single element on the page by placing a 'style' attribute inside the opening tag of the element you wish to style.
     Ex: <h2 style="color: blue;">

  2. Internal styles create a style for a single document because the CSS is stored in the head of the HTML document. Internal styles are placed inside a <style> element that wraps all the style rules.
     Ex: <style>
    body {background-color: white;}
    /*This is a comment!
    'body' is the selector,
    'background-color: white' is the declaration*/
    h2 {color: blue;}
    </style>

  3. External style sheets exist in separate documents from HTML documents, allowing for better organization of style and structure. An external style sheet can be linked to all HTML documents making up a web site, allowing a web developer to style the entire site (all pages) using one document.

Web Design Programs

Web design programs help the creator of a website build and manage its content. Many have built-in tools that ease the process of creating a website; examples are Dreamweaver and Sublime. There are also publishing programs, such as WordPress and Ghost, that give the user a more GUI-based interface for blogging and managing a website.

Sublime

Sublime is a text editor that lets web developers, programmers, software engineers, and others work with code. It is not only for HTML and CSS; it can be set up for many different programming languages and extended with productivity tools. One feature that sets Sublime apart from many editors is the "Package Control" tool, which gives access to a library of add-on packages. For example, the Emmet package helps with typing large amounts of HTML: if you type "html:5" and press Tab, Emmet expands it into the skeleton of an HTML5 document.

SASS

SASS is somewhat like Emmet in purpose, but it is a language rather than an editor package. Originally implemented in Ruby, it extends CSS with capabilities such as variables and nesting. Like Emmet, it makes writing CSS faster and more efficient, saving the programmer a lot of time.

Dynamic Web Content

Client-Side Scripting

Client-side scripting generally refers to programs on the web that are executed by the user's web browser instead of on a web server, enabling web pages to be scripted. Client-side scripts do not require additional software on the server; instead, they rely on the user's web browser to understand the scripting language in which they are written.

Server-Side Scripting

Server-side scripting is a technique used in web development that involves using scripts on a web server which produce a unique response for each user's request to the website.
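As a rough illustration of producing a different response for each request, the sketch below builds a small page from the parameters of a request. The `handle_request` function and the `name` parameter are hypothetical; a real site would run code like this inside a web server or framework.

```python
from html import escape


def handle_request(query):
    """Build a page for one request; `query` holds the request's parameters."""
    name = query.get("name", "guest")
    # escape() keeps user-supplied text from being interpreted as HTML.
    return f"<html><body><h1>Hello, {escape(name)}!</h1></body></html>"


page_for_ana = handle_request({"name": "Ana"})
page_for_anonymous = handle_request({})
```

Two users asking for the same address receive different pages, because the script runs again on the server for each request.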

Combination technologies

When client-side and server-side scripting work together to build a web page, the result is known as a web application. A web application can manage user interaction and security, and can help improve performance between the client and server. Web applications range from online stores to instant messaging services; what they have in common is that scripts on both the server and the client work in unison to achieve a common goal.

JavaScript

JavaScript is a scripting language that, along with HTML and CSS, is one of the three core technologies of the World Wide Web. It has first-class functions and is used on most websites. The core language has no input/output facilities of its own, so it relies on the host environment in which it is embedded, such as a web browser, to provide them. JavaScript is also used in PDF documents, game development, and desktop and mobile applications. It is most commonly used to create DHTML by adding client-side behavior to HTML pages.

World Wide Web Consortium

The World Wide Web Consortium (W3C) is an international community that develops standards for the Web. It was founded in 1994 by Tim Berners-Lee, the inventor of the Web. The W3C aims to lead the Web to its full potential and to make it accessible to all users around the world. Another aim is to set standards that keep the Web growing in a single direction rather than splitting into competing groups. Here are some of the areas covered by W3C standards:

  • Accessibility
  • Web Authoring
  • Web Performance
  • Cascading Style Sheets
  • HTML5
  • Web Fonts
  • Widgets
  • Media Access
  • Mobile Web Applications
  • Internationalization of Web Design and Applications
  • Mobile Web Authoring
  • XML
  • Graphics
  • RDF
  • HTTP

And many more


Spreadsheets

Introduction

Spreadsheets are computer applications used to store, analyze, organize, and manipulate data in the rows and columns of a grid. The program takes data, which can be numbers or text, into the cells of a table. If the data is numeric, the program can perform calculations on it using whichever function you choose. Microsoft Excel is currently the industry standard for spreadsheets. It is the most used spreadsheet program and is available for Windows, macOS, Android, and iOS. Other programs include Google Sheets, a cloud-based web application, LibreOffice, and several more. Calculations and record keeping that accountants once did by hand are now handled by spreadsheet programs for the sake of efficiency and organization. Spreadsheets and other programs for working with data have changed business and data analysis.

 
Figure: An Excel table used to create a graph from a formula

Features and Terminology

Spreadsheets have many features that help users visualize and manipulate data, allowing information to be processed faster and more efficiently. Users can enter simple or complex formulas to perform automatic calculations on data in multiple cells. Spreadsheets also update dynamically: a value in one cell can be generated from the values of others and is recalculated when they change. Spreadsheet software can also generate graphs and charts from the data entered.

One of the most common terms used with spreadsheets is "cell"; without cells, there cannot be a spreadsheet. In other words, a spreadsheet is an arrangement of cells in rows and columns. A cell is a box into which data is entered. A cell is identified by the intersection of its column and row, known as the "cell address", which is written as the column followed by the row. Examples of cell addresses are A1 (column A, row 1), B2 (column B, row 2), and C3 (column C, row 3). The data entered into a cell is usually text, a numeric value, or a formula.
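How a cell address breaks down into a column and a row can be sketched in Python. The `parse_cell_address` function is a hypothetical helper written for this example; it assumes the common convention that column letters continue as AA, AB, and so on after Z.

```python
import re


def parse_cell_address(address):
    """Turn an address such as 'B2' into (column_number, row_number)."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", address)
    if match is None:
        raise ValueError(f"not a cell address: {address!r}")
    letters, digits = match.groups()
    column = 0
    for ch in letters.upper():
        # Treat the letters as a base-26 number where A=1 ... Z=26.
        column = column * 26 + (ord(ch) - ord("A") + 1)
    return column, int(digits)


a1 = parse_cell_address("A1")
b2 = parse_cell_address("B2")
aa10 = parse_cell_address("AA10")
```

Here column A is numbered 1, B is 2, and AA, the column after Z, is 27.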

Spreadsheet Software

Microsoft Excel

Microsoft Excel is the most used spreadsheet software in the world. Its spreadsheets present tables arranged in rows and columns and are used to calculate basic and complex mathematical operations and functions. Excel also features graphing tools, pivot tables, and a macro programming language called Visual Basic for Applications. The first version of Excel was released on September 30, 1985, for the Macintosh. The first Windows version was not released until November 1987.

Gnumeric

Gnumeric is an open-source spreadsheet program that is part of the GNOME Free Software Desktop Project. Gnumeric version 1.0 was released on December 31, 2001. It was originally intended as a replacement for other spreadsheet programs such as Microsoft Excel.

Formulas

Formulas in spreadsheets automatically process data as the user sees fit. A formula takes data from certain areas of the spreadsheet, processes it, and places the output in the cell where the formula is written. A formula can be as simple as "=SUM(A10,A11)" (which takes the values in the 10th and 11th cells of column A and outputs their sum), or as complex as the user wishes to make it. The functions used in a formula (such as SUM) are predefined by the spreadsheet software.
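The way a formula such as =SUM(A10,A11) pulls values from named cells can be mimicked in Python. This is a toy model, not how a spreadsheet engine is built: the `cells` dictionary, its values, and this `SUM` function are assumptions made for the example.

```python
# A tiny "sheet": cell address -> value entered in that cell.
cells = {"A10": 4, "A11": 6, "B1": "label"}


def SUM(*addresses):
    """Add the numeric values found at the given cell addresses."""
    return sum(
        cells[a] for a in addresses
        if isinstance(cells.get(a), (int, float))
    )


total = SUM("A10", "A11")
```

If the value in A10 or A11 changes, calling the function again gives the new total, which is what a spreadsheet does automatically whenever a cell is edited.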

Functions

A function applies a specific built-in calculation to an input to produce an output. Because functions are built into the software, they make it possible to solve complicated math problems in a spreadsheet without knowing the underlying formula. For example, to find the total of a column, all you need to do is select the cells you want to add and use the SUM function. Functions are also useful when working with large amounts of data. With functions, a question such as "How much money does the average customer spend in my store?" can be answered by summing the specified cells and dividing by the number of customers. Functions are not always the right tool for working with a spreadsheet, but they are often used when compiling large amounts of data, where writing a function takes less time than doing estimates by hand.

Common Functions

SUM

This function gives the sum of all numeric data in a specified range of cells.

Example: =SUM(A1:A5) adds the numbers in cells A1 through A5.

COUNT

This function counts the cells in a specified range that contain numbers.

Example: =COUNT(A1:A10) returns how many of the cells A1 through A10 contain numbers.

COUNTA

This function counts the cells in a specified range that contain a value of any kind: numbers, text, or errors. It does not count empty cells.

Example: =COUNTA(A1:A10) returns how many of the cells A1 through A10 are not empty.
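The difference between COUNT and COUNTA can be seen by applying both ideas to the same column of values. The sketch below is an approximation in Python: the `count` and `counta` functions are hypothetical, and an empty cell is modeled as `None` or an empty string.

```python
# One column of a sheet: numbers, text, two empty cells, and an error value.
column = [12, "apple", None, 7.5, "", "#DIV/0!", 3]


def count(values):
    """Like COUNT: only cells holding numbers are counted."""
    return sum(
        1 for v in values
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    )


def counta(values):
    """Like COUNTA: every non-empty cell is counted."""
    return sum(1 for v in values if v is not None and v != "")


numbers_only = count(column)
non_empty = counta(column)
```

For this column, three cells hold numbers, while five cells hold a value of some kind.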

 

VLOOKUP

This function finds a value in the first column of a table and returns the value from a specified column of the same row. A final argument specifies whether an approximate match is acceptable (TRUE) or an exact match is required (FALSE).

Example: =VLOOKUP(102, A2:C10, 2, FALSE) finds the row of A2:C10 whose first column is exactly 102 and returns the value from the second column of that row.
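An exact-match lookup of the kind VLOOKUP performs can be sketched in Python. The `vlookup` function and the sample table are invented for illustration, and only the exact-match (FALSE) behavior is modeled; where a spreadsheet would show an error for a missing value, this sketch returns `None`.

```python
# A small table: item number, item name, price.
table = [
    (101, "Keyboard", 25.00),
    (102, "Mouse", 12.50),
    (103, "Monitor", 140.00),
]


def vlookup(lookup_value, rows, col_index):
    """Exact-match lookup; col_index is 1-based, as in a spreadsheet."""
    for row in rows:
        if row[0] == lookup_value:  # compare against the first column only
            return row[col_index - 1]
    return None  # no matching row was found


name = vlookup(102, table, 2)
price = vlookup(103, table, 3)
missing = vlookup(999, table, 2)
```

The search always runs down the first column; the column number only decides which value of the matching row is returned.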

 
MIN

This function finds the lowest value in a specified range of numbers.

Example: =MIN(A1:A10)

MAX

This function finds the highest value in a specified range of numbers.

Example: =MAX(A1:A10)

AVERAGE

This function gives the arithmetic mean of the numeric data in a specified range.

Example: =AVERAGE(A1:A10)

CONCATENATE

This function joins text from several cells, from left to right, into one cell. It is useful when, for example, first names and last names are stored in two separate columns: CONCATENATE can take a first name from one cell and a last name from another and combine them in a single cell.

Example: =CONCATENATE(A2, " ", B2) joins the text in A2 and B2 with a space between them.

PROPER, UPPER, and LOWER

These three functions change the capitalization of text. The UPPER function converts every letter to uppercase. The PROPER function capitalizes the first letter of each word. The LOWER function converts every letter to lowercase.

Example: =PROPER(A2)
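Python's built-in string methods give a close analogy to these three functions. This is only a comparison: `str.title()` behaves like PROPER on plain words such as these, but the two may differ on text containing apostrophes or digits.

```python
text = "introduction to information technology"

proper = text.title()  # comparable to PROPER
upper = text.upper()   # comparable to UPPER
lower = upper.lower()  # comparable to LOWER
```

Applying the lowercase conversion to the uppercase result gives back the original text, since the original was already all lowercase.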

NOW

This function gives the current date and time, taken from your system's clock. Note that if the date and time on your system are incorrect, the date and time inserted will also be incorrect.

Example: =NOW()

TODAY

This function gives only the current date.

Example: =TODAY()

Spreadsheet Risk

Spreadsheet risk is the risk that errors made in spreadsheets will carry over into the numerical decisions based on them. These errors include data input errors, calculation errors, and formatting errors. A single mistake in the input to a spreadsheet can change the end result of its calculations and so lead to inaccurate business decisions. Other risks include loss of data, lack of documentation standards, and lack of skilled users.

Analyzing Data

Pivot Tables

A pivot table is an interactive table that lets a person summarize large amounts of data in a concise, tabular format for easier reporting and analysis. Pivot tables are used to sort, tally, and compile data, which can be a simpler way to organize a spreadsheet. They make it possible to take many data items and summarize them, or to select a certain set of data and see only that data. In Microsoft Excel, you create a pivot table from the data in the original table. A list of the column headers lets the user choose which fields to include, which makes the data more accessible.
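The core of what a pivot table does, grouping rows by one column and totaling another, can be sketched in Python. The sales records and the `pivot_sum` function are made up for this example; a real pivot table offers many more ways to summarize.

```python
from collections import defaultdict

# The original table: one record per sale.
sales = [
    {"region": "East", "product": "Pens", "amount": 120},
    {"region": "West", "product": "Pens", "amount": 80},
    {"region": "East", "product": "Paper", "amount": 200},
    {"region": "East", "product": "Pens", "amount": 30},
]


def pivot_sum(rows, row_field, value_field):
    """Group rows by row_field and total value_field within each group."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[row_field]] += row[value_field]
    return dict(totals)


by_region = pivot_sum(sales, "region", "amount")
by_product = pivot_sum(sales, "product", "amount")
```

The same four records produce two different summaries depending on which column header is chosen for grouping, which is the sense in which the table can be "pivoted".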

Multi-Dimensional Spreadsheets

A multi-dimensional spreadsheet is a spreadsheet with a third dimension, allowing for more advanced data management. This third dimension acts like the pages of a book, holding multiple spreadsheets of a similar format or topic. One of the most common examples is a single file containing one spreadsheet for each month of the year, each with different information. This also allows information from the multiple spreadsheets to be calculated together while staying in a single organized file.

Charts

Charts are visual features of a spreadsheet. They are useful when working with formulas and multiple cells. Charts can be linked to formulas and functions within the same spreadsheet or in a completely separate spreadsheet.

Creating Charts

Creating a chart in most spreadsheet programs is usually as simple as defining a chart's size, then assigning properties to the chart that allow it to interact with other elements. With modern spreadsheets, this process is almost completely automated, with the user only having to add input values.


Relational Databases

Introduction

A relational database is a database whose structure recognizes the relations between items of data. This means that data within the database can be analyzed in many ways without changing the database tables. The basic structure of such a database is a set of tables that organize data into predefined categories. It is easy to add new data, whether that is more information or a new category of data. The standard way of interacting with a relational database is the Structured Query Language (SQL), which is used to retrieve data and run queries against the database.

Terminology

A database is a collection of data organized so that it can be managed and updated. It consists of schemas, tables, queries, reports, views, and other objects.

 
Relational database terminology.

The relational database was first defined in June 1970 by Edgar Codd, of IBM's San Jose Research Laboratory. Codd's view of what qualifies as an RDBMS is summarized in Codd's 12 rules. The relational database has become the predominant type of database. Other models besides the relational model include the hierarchical database model and the network model.

The table below summarizes some of the most important relational database terms and the corresponding SQL term:

SQL term              Relational database term    Description
Row                   Tuple or record             A data set representing a single item
Column                Attribute or field          A labeled element of a tuple, e.g. "Address" or "Date of birth"
Table                 Relation or base relvar     A set of tuples sharing the same attributes; a set of columns and rows
View or result set    Derived relvar              The set of tuples returned in response to a query

DBMS

DBMS stands for Database Management System. This type of software is used to interact with a database and access what is stored in it. A DBMS lets many users in multiple locations work with the same data while limiting which data each can see. Advantages of using a DBMS include protection of your data, data that is easier to find and harder to lose because it is organized in one place, and logging of activity, so you can see who accessed what and when.

Update

Modifying data within the database by inserting new data, deleting data, or modifying existing data.

Retrieval

Retrieving Data from the database and providing it to the user.

Administration

Performing the services needed to keep the database running and secure and to recover lost or corrupted data. These services include registering and maintaining users and enforcing data security. The people who handle these tasks are database administrators (DBAs). There are at least three types of database administrator.

1. Systems DBAs (Database Administrator) - These admins focus on the physical aspects of administrating databases such as the DBMS installation, patching and upgrading the database, and general maintenance.

2. Application DBAs - These admins are in charge of managing the application components that access the database and configure the database management systems to work for users. They handle application patches and upgrades to this application software.

3. Development DBAs - They focus on development aspects of database administration. This can include data model maintenance and design, SQL writing, and generation of the DDL or data definition language.

What is a Key?

A key is a field, or combination of fields, used to identify records and to link tables to one another. A primary key uniquely identifies each row in its table: its values must not repeat and cannot be NULL (blank or empty). A foreign key is a field in one table whose values refer to the primary key of another table, or of the same table. This provides a link between data in two tables.
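The rules for primary and foreign keys can be tried out with Python's built-in `sqlite3` module. The `customers` and `orders` tables and the `try_insert` helper are hypothetical examples; note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON` has been issued, and other database systems may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id),"  # foreign key
    " item TEXT NOT NULL)"
)
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders (id, customer_id, item) VALUES (10, 1, 'Laptop')")


def try_insert(sql):
    """Return True if the statement succeeds, False if a key rule rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A second customer with id 1 would repeat the primary key.
duplicate_ok = try_insert("INSERT INTO customers (id, name) VALUES (1, 'Bob')")

# An order for customer 99 would point at a customer that does not exist.
orphan_ok = try_insert("INSERT INTO orders (id, customer_id, item) VALUES (11, 99, 'Phone')")
```

Both inserts are rejected: the first because the primary key value is already in use, the second because the foreign key has nothing to refer to.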

Other types of Keys

Keys are an important part of a relational database. They are used to establish and identify relations between tables. They also ensure that each record within a table can be identified by a combination of one or more fields. Two further kinds of key are the super key and the candidate key. A super key is a set of attributes that uniquely identifies each record within a table; it is a superset of a candidate key. A candidate key is an attribute, or minimal set of attributes, that could act as the primary key for a table; the primary key is selected from among the candidate keys. Candidate keys that are not selected as the primary key are known as secondary keys or alternate keys.

Relationships

Types Of Relationships

one-to-one

A value in one table corresponds to only one value in the related table. For example, a list of Social Security numbers corresponds to a list of the people who hold those numbers.

one-to-many

A value in one table may correspond to many different values in the related table. For example, a list of parents and a list of their kids.

many-to-many

Many values in one table may correspond to many values in another table. For example, a list of siblings and a list of their siblings.

Entity-Relationship Diagrams

An entity-relationship (ER) diagram is a graphical representation of an information system that shows the relationships between the people, objects, places, concepts, or events within that system. An ER model is typically used to analyze, define, and describe what is important to the processes in an area of a business. The diagram is made up of entities and relationships. An entity is something that can be identified and that can exist independently of anything else. A relationship shows how one entity is related to another.

Junction Tables

Junction tables (also known as bridge tables) are tables designed to handle many-to-many relationships between two other tables. Each row of a junction table pairs one record from the first table with one record from the second, creating a "junction" between the two data sets. For instance, if a database had a table of students and a table of classes, a junction table would record which student is enrolled in which class. This is a many-to-many relationship, since one student can take several classes and one class can have several students.
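A minimal sketch of the student and class example follows, again using Python's sqlite3 module with an in-memory database. The table names (student, class, enrollment) and the sample rows are assumptions made for illustration.

```python
import sqlite3


def classes_by_student():
    """Return (student name, class title) pairs read through a junction table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE class   (class_id   INTEGER PRIMARY KEY, title TEXT);
        -- Junction table: one row per (student, class) pairing.
        CREATE TABLE enrollment (
            student_id INTEGER REFERENCES student(student_id),
            class_id   INTEGER REFERENCES class(class_id),
            PRIMARY KEY (student_id, class_id)
        );
        INSERT INTO student VALUES (1, 'Ana'), (2, 'Ben');
        INSERT INTO class   VALUES (10, 'Databases'), (20, 'Networks');
        INSERT INTO enrollment VALUES (1, 10), (1, 20), (2, 10);
    """)
    # Joining through the junction table resolves the many-to-many relationship.
    rows = conn.execute("""
        SELECT s.name, c.title
        FROM enrollment e
        JOIN student s ON s.student_id = e.student_id
        JOIN class   c ON c.class_id   = e.class_id
        ORDER BY s.name, c.title
    """).fetchall()
    conn.close()
    return rows


print(classes_by_student())
```

Ana appears in two classes and the Databases class has two students, which neither the student table nor the class table could record on its own.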

Constraints

Constraints restrict the data that can be stored in relations, and they provide one method of implementing business rules in the database. They make it possible to further restrict the domain of an attribute. For instance, a constraint can restrict a given integer attribute to values between 1 and 10. Constraints are usually defined using expressions that result in a boolean value, indicating whether the data satisfies the constraint, and SQL implements this functionality in the form of check constraints. Constraints can apply to single attributes, to a tuple (restricting combinations of attributes), or to an entire relation. Since every attribute has an associated domain, every attribute is subject to at least a domain constraint. The two principal rules for the relational model are entity integrity (every row has a unique, non-null primary key) and referential integrity (every foreign key value matches an existing primary key value).
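The "values between 1 and 10" example can be sketched as a check constraint. The review table and rating column are hypothetical, and SQLite (through Python's sqlite3 module) stands in for whatever DBMS is actually in use.

```python
import sqlite3


def try_ratings(values):
    """Insert each value in turn and return only those the constraint accepted."""
    conn = sqlite3.connect(":memory:")
    # The CHECK expression must evaluate to true for a row to be stored.
    conn.execute(
        "CREATE TABLE review ("
        " rating INTEGER NOT NULL CHECK (rating BETWEEN 1 AND 10))"
    )
    accepted = []
    for value in values:
        try:
            conn.execute("INSERT INTO review (rating) VALUES (?)", (value,))
            accepted.append(value)
        except sqlite3.IntegrityError:
            # The database refused the row because it violates the constraint.
            pass
    conn.close()
    return accepted


print(try_ratings([0, 1, 5, 10, 11]))  # -> [1, 5, 10]
```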

Normalization

Database normalization is a systematic process of organizing data into tables, columns, and rows. Its purposes are to cut data redundancy, which avoids insertion, update, and deletion anomalies, and to make sure the data is stored logically, with each fact kept in one place. Reducing or eliminating redundancy matters because redundant copies of the same data can drift out of step with one another and make the database harder for application developers to work with. A direct benefit of normalization is that the data stays accurate and consistent and that updates touch fewer rows, although queries that must join many normalized tables can take longer to read.

Cardinality

Cardinality pertains to the uniqueness of data values within a column. Low cardinality means that a column contains many repeated values, and high cardinality means that a column contains mostly unique values. The term also refers to the relationships between tables, which can be one-to-one, one-to-many, or many-to-many. Cardinality is important because it describes precisely how the rows of linked tables correspond to one another.
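Column cardinality can be measured by counting distinct values. The sketch below assumes a hypothetical person table in an in-memory SQLite database, where person_id is unique per row and state repeats.

```python
import sqlite3


def column_cardinality():
    """Return (row count, distinct person_id values, distinct state values)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (person_id TEXT, state TEXT)")
    conn.executemany(
        "INSERT INTO person VALUES (?, ?)",
        [("P-001", "GA"), ("P-002", "GA"), ("P-003", "GA"), ("P-004", "FL")],
    )
    # COUNT(DISTINCT col) gives the number of different values in a column.
    row = conn.execute(
        "SELECT COUNT(*), COUNT(DISTINCT person_id), COUNT(DISTINCT state) FROM person"
    ).fetchone()
    conn.close()
    return row


print(column_cardinality())  # -> (4, 4, 2)
```

Here person_id has high cardinality (4 distinct values in 4 rows) while state has low cardinality (2 distinct values in 4 rows).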

Index

An index is a data structure that reduces the time needed for retrieval operations on a database table. With an index, the database can locate the rows a query asks for without having to examine every row in the table. Indices are created on the attributes that queries commonly filter on, so that matching rows can be found by looking up those values in the index. Although an index is separate from the table data it describes, indices are essential to working with databases of any size. The trade-off is that each index takes extra storage and must be updated whenever the table's data changes.
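One way to see an index at work is to ask the database how it plans to run a query. The sketch below uses SQLite's EXPLAIN QUERY PLAN statement through Python's sqlite3 module; the person table and the index name idx_person_last_name are hypothetical, and the exact wording of the plan text differs between SQLite versions.

```python
import sqlite3


def query_plan(with_index):
    """Return SQLite's plan description for a lookup by last_name."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE person (person_id INTEGER PRIMARY KEY, last_name TEXT, city TEXT)"
    )
    if with_index:
        # Build an index on the column the query filters on.
        conn.execute("CREATE INDEX idx_person_last_name ON person (last_name)")
    plan = conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM person WHERE last_name = ?",
        ("Alcott",),
    ).fetchall()
    conn.close()
    # The last column of each plan row holds the human-readable detail text.
    return " ".join(str(row[-1]) for row in plan)


print("without index:", query_plan(False))
print("with index:   ", query_plan(True))
```

Without the index the plan reports a scan of the whole table; with it, the plan names the index that is searched instead.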


SQL

Introduction

Structured Query Language (SQL) is a programming language designed for managing data held in a relational database management system (RDBMS). SQL is divided into six language elements:

  • Clauses - They are components of the statements and queries.
  • Expressions - They produce either scalar values or tables that consist of columns and rows of data.
  • Predicates - They specify conditions, used to limit the effects of the statements and queries, or to change the program flow.
  • Queries - Based on given criteria, they retrieve data.
  • Statements - They control transactions, program flow, connections, sessions, or diagnostics.
  • Insignificant whitespace - This is usually disregarded in SQL statements and queries.

SQL was originally based on relational algebra and tuple relational calculus, and consists of a data definition language, data manipulation language, and data control language.

SQL became a standard of the American National Standards Institute (ANSI) in 1986 and of the International Organization for Standardization (ISO) in 1987. Since then, the standard has been revised to include a larger set of features. Even with the existence of the standards, most SQL code is not fully portable among database systems without adjustments.

Major SQL Statements

Operators

Operator | Description | Example
= | Equal to | Author = 'Alcott'
<> | Not equal to (many DBMSs accept != in addition to <>) | Dept <> 'Sales'
> | Greater than | Hire_Date > '2012-01-31'
< | Less than | Bonus < 50000.00
>= | Greater than or equal | Dependents >= 2
<= | Less than or equal | Rate <= 0.05
BETWEEN | Between an inclusive range | Cost BETWEEN 100.00 AND 500.00
LIKE | Match a character pattern | First_Name LIKE 'Will%'
IN | Equal to one of multiple possible values | DeptCode IN (101, 103, 209)
IS or IS NOT | Compare to null (missing data) | Address IS NOT NULL
IS NOT DISTINCT FROM | Is equal to value or both are nulls (missing data) | Debt IS NOT DISTINCT FROM - Receivables
AS | Used to change a field name when viewing results | SELECT employee AS 'department1'

Other operators have at times been suggested and/or implemented, such as the skyline operator (for finding only those records that are not 'worse' than any others).

SQL has the case/when/then/else/end expression, introduced in SQL-92. In its most general form, called a "searched case" in the SQL standard, it works like else if in other programming languages:

CASE WHEN n > 0
          THEN 'positive'
     WHEN n < 0
          THEN 'negative'
     ELSE 'zero'
END

SQL tests WHEN conditions in the order they appear in the source. If the source does not specify an ELSE expression, SQL defaults to ELSE NULL. An abbreviated syntax—called "simple case" in the SQL standard—mirrors switch statements:

CASE n WHEN 1
            THEN 'one'
       WHEN 2
            THEN 'two'
       ELSE 'I cannot count that high'
END

This syntax uses implicit equality comparisons, with the usual caveats for comparing with NULL.

For the Oracle-SQL dialect, the latter can be shortened to an equivalent DECODE construct:

SELECT DECODE(n, 1, 'one',
                 2, 'two',
                    'i cannot count that high')
FROM   some_table;

The last value is the default; if none is specified, it also defaults to NULL. However, unlike the standard's "simple case", Oracle's DECODE considers two NULLs equal to each other.
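The searched and simple CASE forms shown earlier can be run side by side to see how each treats NULL. This is a sketch using SQLite through Python's sqlite3 module (not Oracle, so DECODE is not covered); the function name classify and the named parameter :n are chosen for illustration, and the simple case deliberately omits ELSE.

```python
import sqlite3

SQL = """
SELECT
    -- searched case: conditions are tested in order
    CASE WHEN :n > 0 THEN 'positive'
         WHEN :n < 0 THEN 'negative'
         ELSE 'zero'
    END,
    -- simple case with no ELSE: defaults to NULL when nothing matches
    CASE :n WHEN 1 THEN 'one'
            WHEN 2 THEN 'two'
    END
"""


def classify(n):
    """Return (searched-case result, simple-case result) for the value n."""
    conn = sqlite3.connect(":memory:")
    row = conn.execute(SQL, {"n": n}).fetchone()
    conn.close()
    return row


print(classify(2))     # -> ('positive', 'two')
print(classify(-3))    # -> ('negative', None)
print(classify(None))  # -> ('zero', None)
```

The last call shows the NULL caveat: comparisons with NULL are never true, so a NULL input falls through to ELSE 'zero' even though it is not zero.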

Data Definition Language (DDL)

Data Definition Language (DDL) statements define the structure of the objects in a database. They create, change, or delete database objects. The commands for DDL are:

•CREATE - creates an object in the database, such as an index or a table

•ALTER - changes the structure of an object already in the database, such as adding a column to a table

•DROP - removes an object from the database

•RENAME - renames an object in the database

•TRUNCATE - removes all data from a table without deleting the table itself
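The DDL commands above can be sketched in sequence. The example assumes SQLite through Python's sqlite3 module, with hypothetical table names book and publication. SQLite spells the rename as ALTER TABLE ... RENAME TO and has no TRUNCATE statement, so those details differ from other database systems.

```python
import sqlite3


def ddl_walkthrough():
    """Create, alter, rename, and drop a table; return the column lists seen."""
    conn = sqlite3.connect(":memory:")

    def columns(table):
        # PRAGMA table_info returns one row per column; index 1 is the name.
        return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

    steps = []
    conn.execute("CREATE TABLE book (book_id INTEGER PRIMARY KEY, title TEXT)")
    steps.append(columns("book"))

    # ALTER changes structure: here it adds a column, not a row.
    conn.execute("ALTER TABLE book ADD COLUMN author TEXT")
    steps.append(columns("book"))

    # SQLite's form of RENAME.
    conn.execute("ALTER TABLE book RENAME TO publication")
    steps.append(columns("publication"))

    # DROP removes the table and everything in it.
    conn.execute("DROP TABLE publication")
    remaining = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    conn.close()
    return steps, remaining


print(ddl_walkthrough())
```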

Data Manipulation Language (DML)

Data Manipulation Language (DML) statements let database users work with the data stored in database objects. The commands for DML are:

•SELECT - retrieves data from a table

•INSERT - adds rows to an existing table

•UPDATE - updates a set of existing table rows

•DELETE - removes existing rows from a table


The Data Manipulation Language (DML) is the subset of SQL used to add, update and delete data:

  • INSERT adds rows (formally tuples) to an existing table, e.g.:
INSERT INTO example
 (field1, field2, field3)
 VALUES
 ('test', 'N', NULL);
  • UPDATE modifies a set of existing table rows, e.g.:
UPDATE example
 SET field1 = 'updated value'
 WHERE field2 = 'N';
  • DELETE removes existing rows from a table, e.g.:
DELETE FROM example
 WHERE field2 = 'N';
  • MERGE is used to combine the data of multiple tables. It updates the target rows that match a condition and inserts the rows that do not, so it combines the INSERT and UPDATE elements. It is defined in the SQL:2003 standard; prior to that, some databases provided similar functionality via different syntax, sometimes called "upsert".
 MERGE INTO table_name USING table_reference ON (condition)
 WHEN MATCHED THEN
 UPDATE SET column1 = value1 [, column2 = value2 ...]
 WHEN NOT MATCHED THEN
 INSERT (column1 [, column2 ...]) VALUES (value1 [, value2 ...])

Transaction Control Language (TCL)

Transaction Control Language (TCL) statements manage the changes made to a database. They let database users accept, undo, or mark their changes. The commands for TCL are:

•COMMIT - makes data changes permanent

•ROLLBACK - discards any data changes made since the last COMMIT or ROLLBACK statement

•SAVEPOINT - marks a point within a transaction to which later changes can be rolled back
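The three TCL commands can be seen together in a short sketch. It assumes SQLite through Python's sqlite3 module, opened with isolation_level=None so that the module issues no transaction statements of its own; the account table, the savepoint name after_debit, and the balances are hypothetical.

```python
import sqlite3


def transaction_demo():
    """Run two transactions and return the balances that were kept."""
    # isolation_level=None: autocommit mode, so the explicit BEGIN, COMMIT,
    # and ROLLBACK statements below are the only transaction control used.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO account VALUES ('ana', 100), ('ben', 50)")

    conn.execute("BEGIN")
    conn.execute("UPDATE account SET balance = balance - 30 WHERE name = 'ana'")
    conn.execute("SAVEPOINT after_debit")            # mark this point
    conn.execute("UPDATE account SET balance = balance + 999 WHERE name = 'ben'")
    conn.execute("ROLLBACK TO SAVEPOINT after_debit")  # undo only the mistake
    conn.execute("UPDATE account SET balance = balance + 30 WHERE name = 'ben'")
    conn.execute("COMMIT")                           # make the transfer permanent

    conn.execute("BEGIN")
    conn.execute("DELETE FROM account")
    conn.execute("ROLLBACK")                         # discard everything since BEGIN

    rows = conn.execute("SELECT name, balance FROM account ORDER BY name").fetchall()
    conn.close()
    return rows


print(transaction_demo())  # -> [('ana', 70), ('ben', 80)]
```

The committed transfer survives, the mistaken update is undone by rolling back to the savepoint, and the rolled-back DELETE leaves both rows in place.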

Data Control Statement (DCS)

Data control statements, more commonly known together as Data Control Language (DCL), manage the privileges that allow users to access and manipulate data in the database. Database administrators use them to configure security and control access to database objects. The commands for DCS are:

•GRANT - gives authorization to users to be able to perform operations on objects

•REVOKE - takes away authorization

Defining a Database

To build a new table in Access by using Access SQL, you must name the table, name the fields, and define the type of data that each field will contain. The CREATE TABLE statement is used to define the table in SQL.

Adding Data

There are two methods for adding data to a relation: one adds a single record at a time, and the other adds many records at once. Both begin with the INSERT INTO clause. To add one record, use a field list to define which fields to put the data in, and then supply the data itself in a value list. To add many records to a table at one time, use the INSERT INTO statement along with a SELECT statement.
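Both ways of adding data can be sketched as follows. The example runs on SQLite through Python's sqlite3 module rather than Access, and the staging and employee tables are hypothetical.

```python
import sqlite3


def insert_demo():
    """Add one record with a value list, then many with INSERT INTO ... SELECT."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staging (first_name TEXT, dept TEXT)")
    conn.execute("CREATE TABLE employee (first_name TEXT, dept TEXT)")
    conn.executemany(
        "INSERT INTO staging VALUES (?, ?)",
        [("Ana", "Sales"), ("Ben", "Sales"), ("Cy", "IT")],
    )

    # One record: a field list followed by a value list.
    conn.execute("INSERT INTO employee (first_name, dept) VALUES ('Will', 'IT')")

    # Many records: the rows come from a SELECT instead of a value list.
    conn.execute(
        "INSERT INTO employee (first_name, dept) "
        "SELECT first_name, dept FROM staging WHERE dept = 'Sales'"
    )

    rows = conn.execute(
        "SELECT first_name, dept FROM employee ORDER BY first_name"
    ).fetchall()
    conn.close()
    return rows


print(insert_demo())  # -> [('Ana', 'Sales'), ('Ben', 'Sales'), ('Will', 'IT')]
```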

Viewing Data

To view data in a table using SQL, use the SELECT statement to retrieve data from the database tables, together with the FROM clause to designate which table or tables to select from. The results are returned as a set of rows made up of any number of columns. To view the column headings of a table without any actual rows of data, some database systems, such as MySQL, provide a SHOW statement. The ORDER BY clause sorts the results, and aggregate functions such as SUM, COUNT, AVG, MAX, and MIN compute totals, counts, averages, and the largest or smallest values over the rows of a table.
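A short sketch of SELECT with ORDER BY and the aggregate functions follows, using SQLite through Python's sqlite3 module. The employee table and its bonus figures are made up for illustration.

```python
import sqlite3


def view_demo():
    """Return rows sorted by bonus and a one-row summary of the bonus column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (first_name TEXT, dept TEXT, bonus REAL)")
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?, ?)",
        [("Ana", "Sales", 500.0), ("Ben", "Sales", 300.0), ("Cy", "IT", 400.0)],
    )
    # ORDER BY sorts the rows that SELECT returns.
    ordered = conn.execute(
        "SELECT first_name, bonus FROM employee ORDER BY bonus DESC"
    ).fetchall()
    # Aggregate functions collapse all rows into a single summary row.
    summary = conn.execute(
        "SELECT COUNT(*), SUM(bonus), AVG(bonus), MIN(bonus), MAX(bonus) FROM employee"
    ).fetchone()
    conn.close()
    return ordered, summary


ordered, summary = view_demo()
print(ordered)  # -> [('Ana', 500.0), ('Cy', 400.0), ('Ben', 300.0)]
print(summary)  # -> (3, 1200.0, 400.0, 300.0, 500.0)
```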

Modifying Data

There are multiple ways to modify a table using SQL. The ALTER TABLE statement changes a table's structure by adding, deleting, or modifying columns. The UPDATE statement changes the data in existing records. The INSERT statement puts new records into a table. Together, these statements allow the user to input data into a table and to change the data already in it.

Deleting Data

To delete data that is already inside a table, use the DELETE statement. The DELETE statement does not remove the table itself; it only deletes the data currently held by the table structure.
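The difference between deleting data and removing a table can be checked with a small sketch. It reuses the example table and field names from the DML section and runs on SQLite through Python's sqlite3 module; the sample rows are made up.

```python
import sqlite3


def delete_demo():
    """Return (rows removed by the filtered DELETE, rows left, table still exists)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE example (field1 TEXT, field2 TEXT)")
    conn.executemany(
        "INSERT INTO example VALUES (?, ?)",
        [("a", "N"), ("b", "Y"), ("c", "N")],
    )
    # DELETE with a WHERE clause removes only the matching rows.
    removed = conn.execute("DELETE FROM example WHERE field2 = 'N'").rowcount
    # DELETE without WHERE removes every remaining row.
    conn.execute("DELETE FROM example")
    remaining = conn.execute("SELECT COUNT(*) FROM example").fetchone()[0]
    # The empty table is still listed in the database catalog.
    table_exists = conn.execute(
        "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table' AND name = 'example'"
    ).fetchone()[0] == 1
    conn.close()
    return removed, remaining, table_exists


print(delete_demo())  # -> (2, 0, True)
```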

SQL Injection Attacks

SQL injection is an attack in which malicious SQL is inserted into the statements an application sends to its database, usually through user input that the application places into a query without checking it. A successful injection can give the attacker personal information and unauthorized access to other sensitive material. It can also be used to bypass authentication and authorization mechanisms, to gather all the information in a given database, to damage the integrity of the database by changing or deleting data, and in some cases to gain control of the server.
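The sketch below contrasts a query built by string concatenation with one that passes the input as a bound parameter, which is one widely used defense. It assumes SQLite through Python's sqlite3 module; the users table, the function names, and the stored values are hypothetical.

```python
import sqlite3


def make_db():
    """Build a small hypothetical users table in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("ana", "a-secret"), ("ben", "b-secret")],
    )
    return conn


def lookup_unsafe(conn, username):
    # Vulnerable: the input becomes part of the SQL text itself.
    sql = "SELECT secret FROM users WHERE username = '" + username + "'"
    return conn.execute(sql).fetchall()


def lookup_safe(conn, username):
    # Safer: the input is bound as data and is never parsed as SQL.
    return conn.execute(
        "SELECT secret FROM users WHERE username = ?", (username,)
    ).fetchall()


conn = make_db()
payload = "nobody' OR '1'='1"
print(len(lookup_unsafe(conn, payload)))  # -> 2 (every row leaks)
print(lookup_safe(conn, payload))         # -> []
```

With concatenation, the quote characters in the payload rewrite the WHERE clause so that it is always true. With a bound parameter, the same payload is simply a username that matches nobody.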


Cybersecurity

Cybersecurity, or information technology security, is a field within information technology concerned with protecting computer systems and preventing the unauthorized use, modification, or access of electronic data. It deals with the protection of software, hardware, networks, and the information they hold, and it also protects computer systems from theft or damage. Because modern industry relies heavily on computers that store and transmit an abundance of confidential information about people, cybersecurity is a critical function and a necessary safeguard for many businesses.

Common Vulnerabilities

Vulnerabilities in a system can come from many different factors. Most of these center on the inherent faults within the system itself, how easily an attacker can break through whatever security the system has in place, and how easily the attacker can use a fault in the system to their advantage. One of the most common faults that attackers abuse is excessive complexity. The more complex a system becomes, the harder it is to cover all of its flaws, which creates more opportunities for attacks to succeed. User input is another common way into a system, because it is difficult for a programmer to predict and account for every possible input from a user. An attacker who supplies unexpected input may be able to change how the system behaves and then exploit it further.

Denial of service attacks

A denial of service (DoS) attack is a type of cyber attack that floods a network or server with requests in order to shut down or disrupt the services of a host connected to the Internet. It can also prevent legitimate users from reaching any service that runs through the targeted server or network.

Direct-access attacks

A direct-access attack occurs when an unauthorized user gains physical access to a system. This allows the user to make modifications or attach backdoor hardware or software in order to access the system remotely later. The unauthorized user can also make complex changes to the system because they have direct access to the hardware.

Pharming

Pharming is a form of online fraud that redirects traffic intended for a legitimate website to a fake site. Victims arrive at the fake website without noticing that it is fake, and attackers use it to steal the personal data that users enter. One common technique is DNS poisoning, in which the attacker corrupts the records of a DNS server so that the legitimate site's name resolves to the address of the fake site.

Phishing

Phishing is the use of email that claims to come from a genuine business in an attempt to swindle the user into surrendering sensitive information. The personal information obtained is then used to steal the victim's identity and can result in financial loss.

Social Engineering

Social engineering involves human interaction and the manipulation of people into giving up confidential information. The purposes of this technique include fraud, system access, and information gathering. It is often easier for an attacker to fool someone into giving up a password or bank information than to obtain the same information by hacking.

Other Vulnerabilities

There are other vulnerabilities and ways that hackers can gain access to a system. A backdoor is an alternative method of accessing a computer or network that bypasses authentication and security. Spoofing tricks a receiver by pretending to be a source the receiver already knows. Privilege escalation raises an attacker's access level, which can give them access to every file on a computer, just as a root user has. A more complicated attack is clickjacking, in which an attacker hijacks the user's clicks so that they land on hidden buttons or links, for example ones that take the user to another website.

Famous Cyber Attacks

Stuxnet

Stuxnet is believed to be a joint American-Israeli cyberweapon. Developed in secret, Stuxnet targets the programmable logic controllers (PLCs) that run much industrial machinery, including nuclear centrifuges. It was specifically designed to attack Iranian nuclear centrifuges and management equipment, physically damaging them by altering core operating processes while reporting an "all clear" signal to the monitoring and control systems. Stuxnet reportedly destroyed about one fifth of Iran's working nuclear centrifuges. Some say Stuxnet was too effective, because it spread beyond its intended target and now exists on the wider Internet, capable of silently infecting a device and destabilizing it to the point of physical damage.

The Love Letter Virus

Also known as the "ILOVEYOU" virus, the Love Letter virus was a computer worm that spread through email in May 2000. The email carried the subject line "ILOVEYOU" and an attachment, "LOVE-LETTER-FOR-YOU.TXT.vbs", that looked like a text file but was actually a script that deployed the worm's payload when opened. Once inside a PC, Love Letter would overwrite files, change file names and locations, hide files, and then send itself to every contact in the victim's Outlook address book. By some estimates, Love Letter infected tens of millions of devices, resulting in approximately $8.9 billion in damages.

Zeus

Zeus is a destructive Trojan horse that enters a user's PC by piggybacking on other software. Once activated, Zeus carries out several criminal activities against the user. It is known for keylogging, data mining, and form grabbing. It is also used as a backdoor to install other destructive malware, including ransomware and bot programs. Zeus is still actively spreading today and is very difficult to detect, even with up-to-date antivirus software installed. It is unknown how many PCs are infected with Zeus, but it has been described as one of the largest and most powerful botnets in the world.

In October 2010, the FBI announced that hackers in Eastern Europe had managed to infect computers around the world using Zeus. Zeus was distributed in an email that targeted individuals at businesses; once the email was opened, the Trojan would install itself on the victim's computer. Once installed, it would secretly capture passwords, account numbers, and other data needed to log into online banking accounts. The hackers would then use the captured information to take over the victim's bank accounts and make unauthorized transfers of thousands of dollars at a time, routing the funds to other accounts controlled by a network of money mules. Many of the money mules were recruited from overseas. They would create bank accounts using fake documents and false names. Once the money was in the accounts, the mules would either wire it to their bosses in Eastern Europe or withdraw it in cash and smuggle it out of the country.

The Morris Worm

The Morris Worm, released in 1988, was created with the innocent aim of seeing how big cyberspace was. A critical error in its design caused it to infect the same machines over and over, so that it behaved like a destructive virus: it spread to over 6,000 computers and, by some estimates, caused almost 100 million dollars in damages. The Morris Worm contributed greatly to the measures used today to prevent DDoS attacks.

The Ashley Madison Attack

In 2015, a group called the Impact Team gained access to the user information database of Ashley Madison, a dating website for affairs. They attempted to blackmail the site's parent company, Avid Life Media, into taking the site down. When Avid Life Media did not take down the site, the hackers released all of the users' information. The group then released corporate emails from Avid Life Media, resulting in the resignation of the CEO, Noel Biderman. Many politicians were shamed after their email addresses turned up in the dump, and some people even committed suicide after being exposed.

Mirai Botnet Attack of 2016

In October 2016, a group of hackers used a botnet to launch a DDoS attack against many major DNS servers in the US. This attack took down many high-profile sites, such as Twitter and Netflix.

Church of Scientology Attack

In 2008, the group Anonymous launched a DDoS attack on the Church of Scientology as a protest against the church's policies. The attack resulted in the website being shut down for several minutes.

AAA Triad

The AAA Triad names the three concepts that form the basis of any security discipline and on which the development of security systems is based. The components of AAA are access control, authentication, and accounting. Access control, often called authorization, is the management of how users can interact with the system and what resources they can access; it is typically configured through administrator settings. Authentication is most often seen as a password but is any way of verifying the identity of a user before allowing them to access the system. Accounting is the record keeping of what users do while connected to the system. Together, these protect the system from access by unwanted users, limit how permitted users can access the system, and make it possible to track what happens on the system. Though these concepts do not eliminate every security threat, they serve as a basic protection. The degree to which these methods are applied is up to the organization, and there are countless different resources and kinds of protections for cybersecurity systems.
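The three components can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a production design: the in-memory USERS and AUDIT_LOG stores, the function names, and the permission strings are all hypothetical, and the password hashing uses PBKDF2 from the standard library with an iteration count chosen only for the example.

```python
import hashlib
import hmac
import os

USERS = {}       # hypothetical store: username -> (salt, password hash, permissions)
AUDIT_LOG = []   # accounting: one entry per request, allowed or not
ITERATIONS = 100_000  # illustrative work factor for PBKDF2


def register(username, password, permissions):
    """Store a salted hash of the password, never the password itself."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    USERS[username] = (salt, digest, set(permissions))


def authenticate(username, password):
    """Authentication: verify that the user is who they claim to be."""
    record = USERS.get(username)
    if record is None:
        return False
    salt, digest, _ = record
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)  # constant-time comparison


def request(username, password, action):
    """Check identity, then access control, and record the outcome."""
    if not authenticate(username, password):
        outcome = "denied: authentication failed"
    elif action not in USERS[username][2]:
        outcome = "denied: not permitted"      # access control
    else:
        outcome = "allowed"
    AUDIT_LOG.append((username, action, outcome))  # accounting
    return outcome
```

For example, after `register("ana", "correct horse", {"read"})`, a call to `request("ana", "correct horse", "read")` returns "allowed", the same call with the action "delete" returns "denied: not permitted", and every attempt is appended to AUDIT_LOG.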

Authentication, Authorization and Auditing

Authentication

Authentication, authorization, and auditing are three central terms in cybersecurity, also known as computer security. Authentication is a process used by a server when it needs to know exactly who is trying to access the information or website hosted on that server. Authentication can be done in several ways, but the most common is entering a username and password. Another means of authentication is a PIN. For example, when a customer calls technical support to troubleshoot a problem, the operator verifies the caller's identity by asking for the PIN that was set up on the client's device. Authorization then determines what an authenticated user has been granted access to. In the technical support example, once the operator has entered the PIN into the system, the operator is given access and can help the customer with troubleshooting.

Authorization

Authorization is a process that a server uses to determine whether or not a client has permission to use a resource or access a file within that server. It compares the identity established during authentication with the permissions on file in the server database. Authorization usually goes hand-in-hand with authentication because the server needs to know which client is requesting permission. Sometimes there is no authorization, which means that any user can use a resource or access a file just by asking for it. For example, most of the web pages on the Internet require no authentication or authorization. User names and passwords are a form of authentication, and knowledge of both is taken as evidence of the user's identity. Password authentication can be a problem, because some passwords are easy to guess and can be compromised with little effort. This is what led to two-factor authentication. It combines something you know, a username and password, with something you have, a possession factor that usually provides a code that is unique to you and valid only for a short time. Many websites are upgrading their security by implementing these factors.

Auditing

A security audit is an evaluation of the security of an information system. Security audits are usually performed to ensure that there is no misuse or error in a company's information system. An audit evaluates the security of the system's physical configuration and environment, software, information handling processes, and user practices. Security audits help deter and detect cyber-crime by providing a persistent record of which files were accessed, by whom, and when. Security audits are commonly performed by federal or state regulators, corporate internal auditors, consultants, and external auditors, who are all either specialized accountants or technology auditors.

Famous Cyber Attacks

Sony Pictures Hack

On November 24, 2014, a hacker group which identified itself by the name "Guardians of Peace" (GOP) leaked confidential data from the film studio Sony Pictures. The data included personal information about Sony Pictures employees and their families, e-mails between employees, information about executive salaries at the company, copies of then-unreleased Sony films, and other information. In December 2014, the GOP group demanded that Sony pull its film The Interview, a comedy about a plot to assassinate North Korean leader Kim Jong-un, and threatened terrorist attacks at cinemas screening the film. After major U.S. cinema chains opted not to screen the film in response to these threats, Sony elected to cancel the film's formal premiere and mainstream release, opting to skip directly to a digital release followed by a limited theatrical release the next day.

Great Hacker War

The Great Hacker War was a purported 1990–1991 conflict between the Masters of Deception (MOD) and an unsanctioned splinter faction of the older guard hacker group Legion of Doom (LOD), amongst several smaller subsidiary groups. Both of the primary groups involved made attempts to hack into the opposing group's networks, across Internet, X.25, and telephone networks. In a panel debate at The Next HOPE conference in 2010, Phiber Optik reiterated that the rumoured "gang war in cyberspace" between LOD and MOD never happened, and that it was "a complete fabrication" by the U.S. attorney's office and some sensationalist media. Furthermore, two other high-ranking members of the LOD confirmed that the "Great Hacker War" never occurred, reinforcing the idea that this was just a competition of one-upmanship. However, there was indeed a conflict between the "New-LOD" led by Erik Bloodaxe and the MOD hackers, who were primarily from NYC. The one-upmanship was not matched evenly on both sides; if this was a "war", it was not much of a fight.

LulzRaft

LulzRaft is the name of a computer hacker group or individual that gained international attention in 2011 due to a series of high-profile attacks on Canadian websites. Their targets have included the Conservative Party of Canada and Husky Energy. On June 7, 2011, LulzRaft claimed responsibility for hacking into the Conservative Party of Canada website and posting a false story about Canadian Prime Minister Stephen Harper. The hackers posted an alert on the site claiming that Harper had choked on a hash brown while eating breakfast and was airlifted to Toronto General Hospital. The story fooled many, including Canadian MP Christopher Alexander, who spread the story on Twitter. A spokesman for the Prime Minister soon denied the story. LulzRaft again targeted the Conservative Party on June 8, taking responsibility for a successful breach of a database containing information about the party's donors. The information accessed by the group included the names of donors as well as their home and e-mail addresses. LulzRaft later stated that the party had "terrible security" and that it had used very basic methods for the intrusion. LulzRaft also apparently hacked into the website of Husky Energy on the same day. They inserted a notice promising free gas to users who used the coupon code "hash-browns", claiming that it was a gesture of goodwill intended to placate conservatives who were offended by their previous attacks.

2008 CyberAttack on US

It started when a USB flash drive infected by a foreign intelligence agency was left in the parking lot of a Department of Defense facility at a base in the Middle East. The drive contained malicious code and was inserted into a USB port of a laptop computer attached to United States Central Command. The Pentagon spent nearly 14 months cleaning the worm, named agent.btz, from military networks. Agent.btz, a variant of the SillyFDC worm, has the ability "to scan computers for data, open backdoors, and send through those backdoors to a remote command and control server." Russian hackers were suspected of being behind it because they had used the same code that made up agent.btz in previous attacks. To try to stop the spread of the worm, the Pentagon banned USB drives and disabled the Windows autorun feature.

