Session #2 (Part 1) Internet 101

Do I really need to know how it works?

You don't need to know how the telephone system works in order to use it, but the Internet isn't quite as transparent as the phone system.

An understanding of how it works will help you make better decisions about content, better decisions about what services to provide, etc.

The Internet has four histories:

The Internet wasn't planned or designed per se; it evolved through the work of numerous organizations.

The "founders" nevertheless had a lot of insight, and emphasized expandability and scalability.

I'll be explaining Internet technology with a historical perspective…explaining how various aspects of the Internet work as they were developed. But keep in mind that a historical view of the Internet would have four distinct streams:

Technological

How the hardware, software, and communications evolved.

Organizational

Cooperation on the national and international levels, between government and academia, the formation of numerous oversight and standards bodies.

Commercial

How the system adapted to accommodate private industry, and how businesses and publishers embraced it.

Social

Acceptance, Internet mania, Internet phobia, Internet addiction, etc.


  1. Definition of the Internet (A Network of Networks)
    1. "An amorphous collection of unique networks sharing common addressing scheme and using compatible communications technology."
    2. Food for thought: Why does it have a "the" in front of it? Why is it capitalized? Compare it with network, intranets, extranets.
    3. Internet vs. Private networks (AOL, CIS, Prodigy -- until recently, these were separate networks, perhaps with gateways to provide internet access. Now, users are "on" the internet when they log on… more later)
  2. Origins of Internet technology
    1. The telephone system

      (You can use a telephone line to connect, but the Internet is not the same network.)

      1. A circuit-switched network: when you called someone, a discrete circuit was opened up between you and the party you called. When you talked, 'information' was passed, but when you were silent, the line sat empty.
      2. OK for voice traffic until the 60s, when the phone companies researched other technologies, such as Time Division Multiplexing, and packet switching.
      3. Transition to digital: Once the whole phone system was analog…now, it is only analog from your handset to the telephone company, where it becomes digital. This is why we still need modems.
    2. The military
      1. At the same time (late 60s) the military searched for ways to use networks, and created ARPANET (Advanced Research Projects Agency Network).
      2. ARPANET tried to find a way to connect all the various military networks, which included satellite and radio. (It operated numerous networks throughout the 70's)
      3. Survivability: Another requirement the military had was that the Internet survive in case of nuclear war: if one node was wiped out, the network would survive.
      4. In a stroke of genius, they invited numerous government and academic institutions, and insisted the research be applied and used in real-world situations.
      5. Openness: Perhaps the most important outcome of the government-academic alliance: the Internet was devised on open standards. This is very important because otherwise it would have little commercial value.

      Hobbes' Internet Timeline v4.0

      http://www.isoc.org/guest/zakon/Internet/History/HIT.html

  3. Internet Structure (What does the Internet Look Like?)
  4. TCP/IP
    1. IP = Internet Protocol
      1. The basic communications protocol, defines the rules of how two computers communicate with each other.
      2. Specifically, it dictates how packets are formed and delivered.
    2. TCP = Transmission Control Protocol
      1. Augments IP with a layer of reliability: TCP does the error checking, re-sends lost packets, and puts packets back in order. (Routing itself is handled at the IP level.)
    3. IP Address
      1. example: 128.180.48.64 (four numbers, each between 0 and 255)
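
To make the "four numbers" idea concrete, here is a minimal sketch of how a dotted address maps onto a single 32-bit number. The helper name `dotted_to_int` is made up for illustration; the example address is one used elsewhere in these notes.

```python
import ipaddress


def dotted_to_int(address: str) -> int:
    """Convert dotted-decimal IPv4 text to its 32-bit integer value."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError("an IPv4 address has exactly four parts")
    value = 0
    for part in parts:
        octet = int(part)
        if not 0 <= octet <= 255:
            raise ValueError(f"each part must fit in one byte (0-255): {part}")
        # Shift the bytes seen so far left by 8 bits and append this one.
        value = (value << 8) | octet
    return value


example = "128.180.48.64"
as_int = dotted_to_int(example)

# The standard library's ipaddress module agrees with the hand-rolled version.
assert as_int == int(ipaddress.IPv4Address(example))

print(example, "is the 32-bit number", as_int)
print("four bytes allow", 2 ** 32, "different addresses")
```

Because each part must fit in one byte, a number above 255 in any position is rejected as an invalid address.
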
  5. Packet switching

    How does data actually MOVE through the Internet, and how is it different from the phone system?

    1. The technology the phone companies came up with to overcome the limitations of the circuit-switched network.
    2. What is a packet?

      (Diagram: a packet carries a source address, the data itself, e.g. 0101010100101010, a check value, and a destination address.)

    3. Three things packet switching solves:
      1. Filling up the Space: Data can be chopped up into smaller pieces in order to make the "pipes" completely full, using all available bandwidth. Multiple computers can "take turns" sending data.
      2. Checking accuracy: Each packet carries a checksum, so a damaged packet can be detected and sent again, ensuring data is transmitted without errors.
      3. Efficient routing: Smaller units are more easily routed and rerouted.
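
The three points above can be sketched in a few lines. This is a toy model, not how real IP packets are laid out: the `Packet` fields, the 8-byte packet size, the use of CRC-32 as the check value, and both addresses are assumptions chosen for illustration.

```python
import random
import zlib
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Packet:
    source: str        # who sent it
    destination: str   # where it is going
    sequence: int      # position of this piece in the original data
    payload: bytes     # the piece of data itself
    check: int         # checksum computed over the payload


def make_packets(data: bytes, source: str, destination: str, size: int = 8) -> List[Packet]:
    """Chop data into fixed-size pieces, each labelled and checksummed."""
    packets = []
    for sequence, start in enumerate(range(0, len(data), size)):
        chunk = data[start:start + size]
        packets.append(Packet(source, destination, sequence, chunk, zlib.crc32(chunk)))
    return packets


def reassemble(packets: List[Packet]) -> bytes:
    """Verify every packet, then put the pieces back in their original order."""
    for packet in packets:
        if zlib.crc32(packet.payload) != packet.check:
            raise ValueError(f"packet {packet.sequence} is damaged and must be re-sent")
    ordered = sorted(packets, key=lambda packet: packet.sequence)
    return b"".join(packet.payload for packet in ordered)


message = b"Data can be chopped up into smaller pieces."
packets = make_packets(message, "128.180.48.64", "192.0.2.10")

# Packets may take different routes and arrive out of order.
random.shuffle(packets)
assert reassemble(packets) == message
```

The sequence number is what lets the receiver rebuild the message even when the pieces arrive in a different order than they were sent.
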
  6. Commercialization
    1. In the beginning, only government agencies, the military, and a few non-profits were allowed to join. The Government's original "acceptable use policy" forbade any commercial or private activity from taking place on the NSFNET backbone.
    2. This actually helped foster the growth of the infrastructure:
      1. Private industry was forced to create their own national and regional networks instead of using governmental ones.
      2. Hardware and software vendors were forced to adopt TCP/IP, because the government & university were major customers.
      3. By the time the private and public networks were allowed to join, the combined Internet was already huge.
    3. Commercialization was a slow process:
      1981: Very limited commercial traffic; only sharing agreements
      1988: NSF initiated privately-funded enhancements to the backbone
      1989: email links, dial-up access began
      1990: ARPANET was decommissioned
      1995: NSFNET backbone de-funded
    4. Entrepreneurs bought networks below the mid-level, and sell access at retail.
    5. End-users subscribe to a service giving them access at the retail level.
  7. Internet Growth
    1. Statistics: (see chart) How can it grow so fast?
    2. Because it is so open -- any kind of computer can connect
    3. The Internet itself has no capital costs -- regional / local nodes bring their own equipment
      1. Is the Internet free?
      2. No, someone has always paid
    4. Uses existing lines & networks
    5. This nearly exponential *rate* of growth can't continue:
      1. The number of new nodes will eventually outpace the possible production of computers.
      2. The current TCP/IP addressing scheme will only support 4,294,967,296 unique addresses (about 15 times as large as the # of computers). A successor to the current Internet Protocol, IPv6, is in the works.
    Date       Hosts
    12/69	         4
    12/70	        13
    04/71	        23
    10/72	        31
    01/73	        35
    06/74	        62
    03/77	       111
    12/79	       188
    08/81	       213
    05/82	       235
    08/83	       562
    10/84	     1,024
    10/85	     1,961
    11/86	     5,089
    12/87	    28,174
    07/88	    33,000
    01/89	    80,000
    10/89	   159,000
    10/90	   313,000
    01/91	   376,000
    07/91	   535,000
    10/91	   617,000
    01/92	   727,000
    07/92	   992,000
    10/92	 1,136,000
    10/93	 2,056,000
    01/94	 2,217,000
    07/94	 3,212,000
    10/94	 3,864,000
    01/95	 4,852,000
    07/95	 6,642,000
    01/95	 5,846,000
    07/95	 8,200,000
    01/96	14,352,000
    07/96	16,729,000
    01/97	21,819,000
    07/97	26,053,000
    01/98	29,670,000
    07/98	36,739,000
    01/99	43,230,000
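
As a rough back-of-the-envelope check on the addressing limit, the sketch below takes the last two January host counts from the table and asks how long that rate of growth could last before hosts outnumbered addresses. It assumes one address per host and a constant growth rate, neither of which holds in practice, so treat the result as an illustration rather than a forecast.

```python
import math

ADDRESS_SPACE = 2 ** 32  # four bytes of eight bits each

# Host counts taken from the table above.
hosts_jan_1998 = 29_670_000
hosts_jan_1999 = 43_230_000

# Growth factor over the most recent twelve months in the table.
annual_growth = hosts_jan_1999 / hosts_jan_1998


def years_until(limit: float, current: float, growth_per_year: float) -> float:
    """Years of compound growth needed for `current` to reach `limit`."""
    return math.log(limit / current) / math.log(growth_per_year)


print("possible addresses:", ADDRESS_SPACE)
print("growth factor per year:", round(annual_growth, 2))
print("years until hosts reach the address limit:",
      round(years_until(ADDRESS_SPACE, hosts_jan_1999, annual_growth), 1))
```

Under these assumptions the address space lasts on the order of a dozen years, which is why a replacement addressing scheme is already being worked on.
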
    
  8. Access speed / Bandwidth
  9. Distributed addressing: The Domain Name System
    1. IP Addressing
      1. Every computer on the Internet requires one, in order for packets to find their way.
      2. IP Address composed of four bytes (example: 128.180.48.64).
        (maximum of 4,294,967,296 possible IP addresses with current system)
      3. Dynamic vs. Static
        1. Servers, routers, and PCs on LANs usually get a static IP address. This never changes, so the machine can always be found easily.
        2. PCs that dial in, and some on LANs, get dynamic IP addresses. AOL or an ISP may have 100K customers, but only 50K dial-in modems. Rather than assign each user an IP address, each modem has one.
        3. Note implications of dynamic IP addressing for publishing.
          - identifying individuals for interactive services, detecting for demographics
    2. Domain Name System
      1. The Domain Name System was introduced in 1984, when the number of host computers reached about 1,000. It allowed users to find computers without needing to know the numeric address.
      2. Basically a lookup system: when you make a send/retrieve request, your computer first checks the nearest name server (DNS) to translate the name into the number.
      3. A hierarchical, distributed system
        1. Sample: www.nynex.com, or members.aclu.org
          1. Top level domains: COM EDU NET ORG MIL GOV
          2. New top level domains are being added
          3. Mid-level domains = name of your network
          4. Computer name
            1. Computers can have multiple entries in the DNS server.
            2. The convention of using www.companyname.com is totally arbitrary.
              It began as a way for universities to distinguish their web servers from their other servers (math.yale.edu, physics.yale.edu, www.yale.edu)
        2. How it works
          1. You look up members.aclu.org; the query goes to your DNS
          2. Your DNS finds the name server for aclu.org, and transfers the request to it
          3. The aclu.org DNS knows the address for members, and resolves the name
          4. Analogy: US Postal System: the mail sorter in NY doesn't know cities in Wisconsin, the sorter in Milwaukee doesn't know the streets of Oshkosh, etc.
    3. Assigning Names
      1. Originally performed by InterNIC,
        1. a private company commissioned to be the DN registry in 1993
        2. costs $70 per 2 year period, to offset costs
      2. New plan (just recently implemented due to politics)
        1. Single Registry
          1. maintained by a central body (http://www.internic.net)
          2. administers the database, nothing else
          3. gets a small fee
        2. Multiple registrars
          1. for profit, charge what they want - some companies (dotster.com) charge as little as $15 to register a domain name
          2. take care of the processing of orders
        3. Domain names are like 800#s -- not trademarks

      As Web publishers, the domain name authority is likely the only agency you'll interact with.

      We'll talk more about getting domain names when we discuss site hosting options.
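
The "How it works" steps for a lookup can be sketched as a toy resolver. Nothing here touches the real DNS: the server names, the two tables, and every address (taken from the 192.0.2.x range reserved for documentation) are made up for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical data: which name server is responsible for each domain.
DOMAIN_SERVERS: Dict[str, str] = {
    "aclu.org": "ns.aclu.org",
    "nynex.com": "ns.nynex.com",
}

# Hypothetical data: each domain's own server knows only its own computers.
HOST_TABLES: Dict[str, Dict[str, str]] = {
    "ns.aclu.org": {"members": "192.0.2.10", "www": "192.0.2.11"},
    "ns.nynex.com": {"www": "192.0.2.20"},
}


def resolve(name: str) -> Tuple[str, List[str]]:
    """Return the address for a name plus the steps taken to find it."""
    name = name.lower()  # domain names are not case sensitive
    host, _, domain = name.partition(".")
    steps = [f"ask local name server about {name}"]

    server = DOMAIN_SERVERS.get(domain)
    if server is None:
        raise LookupError(f"no name server known for {domain}")
    steps.append(f"request transferred to {server}")

    address = HOST_TABLES[server].get(host)
    if address is None:
        raise LookupError(f"{server} has no entry for {host}")
    steps.append(f"{server} answers {address}")
    return address, steps


address, steps = resolve("members.aclu.org")
for step in steps:
    print(step)
```

As with the postal analogy, no single table holds every name; each server only needs to know the level below it.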

  10. Everything but the Web
    1. Email
      1. The first "killer application" for the Internet, and still the main reason people cite for going online.
      2. Email is instantaneous, convenient, useful for delivering text and images.
      3. It uses TCP/IP to transfer messages
      4. An extension of the DNS & mail directories for addressing
    2. FTP - File Transfer Protocol
      1. A variant communication protocol devoted to downloading and uploading binary files.
      2. This protocol works in tandem with HTTP, so you can download binary files while surfing the Web seamlessly
      3. Computers can be set up for anonymous FTP, or accounts.
    3. Newsgroups
      1. Newsgroups are carried by the Network News Transfer Protocol (NNTP); together they form a bulletin board system that can be either public or private.
      2. NNTP uses a topical hierarchy to organize groups…it looks similar to the DNS syntax
      3. Groups can be moderated, or free-for-all
      4. The news database replicates across computers automatically, provided an administrator decides to carry a given "newsgroup."
    4. telnet
      1. "Remote Control"
      2. A way of running programs on remote computers, sometimes called terminal emulation. You actually get the operating system prompt, and are executing commands on the other computer.
      3. Still widely used, although most programs are getting a Web interface.
    5. Gopher / Veronica / Archie / WAIS
      1. Before the Web made them obsolete, there were numerous schemes for indexing and accessing documents and other data files on the Internet. None are really in use today.
      2. These systems used menus to point to documents, and were precursors to the browser.

    MBONE (for multicast)

    Streaming media

    Net telephony

  11. The Web
    1. The Difference
      1. Instead of logging-in, downloading, or searching, you are now browsing. Information is displayed during the process of obtaining it.
      2. Graphics, sound, text, and other media are integrated.
      3. Each document has its own address, making it possible to point to documents on other computers.
    2. The Effect
      1. Because the whole Web is interconnected, all presented in the same Browser interface, it becomes seamless to end users.
    3. The [original] Goal
      1. Intended to help scientists collaborate faster: preliminary results could be published quickly, and get immediate feedback.
      2. Intended to help scientists collaborate more easily over large distances. When a paper was presented on the Web, you could actually follow links in the bibliography without a trip to the library.
      3. Because of the openness of the Internet, and the variety of computing platforms connected to it, the Web was intended to normalize access to documents created on different platforms. (later, talk about HTML)
    4. The Origins
      1. Really early history: Vannevar Bush, science advisor to FDR during WWII, wrote a paper called "As We May Think," in which he described the Memex, an electromechanical device that would store the whole library of knowledge. He even described how pressing buttons accompanying the text would allow you to call up related works.
      2. Fast-forward to 1991, when Tim Berners-Lee of CERN released the World-Wide Web.
      3. In 1993 Mosaic caught on, and WWW traffic jumped, with an annual growth rate of over 340,000%. (The same year, the US National Information Infrastructure Act was signed.)
    5. The Web Protocol
      1. http stands for HyperText Transfer Protocol, and it is the lingo of the World-Wide Web
      2. Browsers will also accommodate other protocols, usually by launching an ancillary program (ftp, nntp, gopher, mail)
    6. Uniform Resource Locators
      1. A URL is the address for a document.
        http://www.nycmta.com/Subway/Maps/Electrical.html
      2. http:// -- the protocol, information the browser requires.
        1. Usually, browsers will try http:// as the default if you leave it off.
      3. www.nycmta.com -- domain name,
        1. tells the browser what computer to look on for the document
        2. NOT case sensitive
        3. Most organizations name their web servers "www" but it isn't required or necessary.
      4. /Subway/Maps/Electrical.html -- file path/name
        1. The file name and where it lives on the host computer
        2. Just like c:\subway\maps\electrical on your local PC
        3. This part IS case sensitive
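
The three parts of a URL can be pulled apart with Python's standard `urllib.parse` module, as in the sketch below. The `same_document` helper is made up for illustration, and it simply follows the rule stated above that the domain name ignores case while the path does not; whether a given server really treats paths as case sensitive depends on how that server is set up.

```python
from urllib.parse import urlsplit

url = "http://www.nycmta.com/Subway/Maps/Electrical.html"
parts = urlsplit(url)

print("protocol:   ", parts.scheme)
print("domain name:", parts.hostname)
print("file path:  ", parts.path)


def same_document(first: str, second: str) -> bool:
    """Compare two URLs, ignoring case in the domain name but not in the path."""
    a, b = urlsplit(first), urlsplit(second)
    # urlsplit lower-cases the scheme, and .hostname lower-cases the domain,
    # so only differences in the path's capitalization remain.
    return (a.scheme, a.hostname, a.path) == (b.scheme, b.hostname, b.path)


# Capitalizing the domain name makes no difference...
assert same_document("http://WWW.NYCMTA.COM/Subway/Maps/Electrical.html", url)
# ...but changing the case of the path names a different document.
assert not same_document("http://www.nycmta.com/subway/maps/electrical.html", url)
```
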
  12. The Internet is Not Done Changing!
    1. Bandwidth and speed must increase
    2. Will need to evolve as computers evolve
    3. Will need to accommodate new kinds of devices
    4. Will need to adapt to new services, multimedia, etc.
    5. Internet phone and Internet TV