CS 642 Term Paper
A survey of distributed languages
Copyright © 1996 Alex Nicolaou. All rights reserved.
|
|
|
|
Abstract
The huge popularity of the World-Wide Web has provided a new demand for languages that support an elegant model for distributed programming so that the Internet can be fully utilized by programmers. Several new languages have appeared that attempt to provide solutions for the Web.
In this paper these languages are looked at as languages in their own right,
rather than as special purpose Internet languages. By considering them as
serious entries into the language market more interesting observations are
possible, perhaps including an intuition as to which of the new languages
will be best accepted and used in the future.
The languages considered are Java, Phantom, and Python.
|
|
|
|
|
1. Introduction |
|
| This paper's layout may be freely copied, but its content may not. |
|
All of the languages provide some degree of network support, and
this is an important factor in considering what languages are
well suited to programming "for the internet". However, it is
equally important that the languages support general
programming tasks as well, since the general features are those
that will make it easy or difficult to do the bulk of the
programming. All of the languages have some features in common,
but each also provides some unique features.
As would perhaps be expected in a world buzzing about "objects",
all of the languages are object-oriented. Since that
particular term has seen so much use as to be ambiguous, it seems
it now needs a (re-)definition for any author who cares to use
it.
For the purposes of this article, a language shall be deemed
object-oriented if it provides sub-typing, polymorphism,
inheritance and dynamic binding.
|
|
| Research in compiler technology for languages such as MIT Scheme and ML may ultimately discover new and more effective techniques for implementing closures and continuations, but these features are still amongst the most expensive aspects of languages which have them. |
|
All the languages are also interpreted, which makes sense
considering their intended application. Being interpreted means
that they can be largely platform independent, and more easily
migrate programs and objects from location to location as
desired. Only
Java, whose
roots
indicate a
need for a compiled language, has really left the door open to
efficient compilation in terms of current compiler technology.
A more surprising choice that all the languages have made is the
support of "programming-in-the-large" features, in the form of
reasonably well developed module systems that allow the
programmer to create a larger project with a coherent
organization. Considering the immediate applications anticipated,
namely web programming and scripting (in the case of
Python) this choice seems a
strangely wise one. Perhaps "C" has taught the language community
all too well that it is wise to plan ahead for a popular
language, lest a poorly planned language be stretched far beyond
the original intentions.
In terms of differences, the languages each have unique features
that set them apart from each other and from their predecessors.
Java is the least novel of the
languages.
Java's designers were aiming to
produce a language that was similar to C++ so that the language would
be popular, and their strategy is proving very successful.
The main feature that sets
Java apart from other languages
is the investment in secure transmission of code so that the
users of the object programs can execute
any program provided on a web page without
worrying about trojan horses.
The irony is that the very decisions that have limited
Java are the same ones
that will make the language successful.
Phantom
provides the distributed semantics of
Obliq,
but is a substantially different language. It implements a mutation of
Modula-3, which provides it with a well defined base upon
which to build some more exotic features. The main advantage seen
in
Phantom
is the handling of distributed objects, but unfortunately the
work on the network support will not be completed, as the author feels
that
Phantom
does not have the momentum to compete with
Java.
Python seems to have been designed originally
as a scripting language, but it has all the features that
would enable it to compete with
Java, including
a web browser
that can download and execute
Python applets. The
implementation focuses on the language itself as opposed to
issues specific to distributed programming, and provides a rich
set of functionality built into the language with the intention
of making programming with
Python easy.
Unfortunately, the plethora of built-in operations on objects
also gives the language's semantics a "fat" feeling, and places a
large burden on even the simplest objects.
|
|
|
|
|
2. Classes and Objects |
|
| Classes are to functions what control-flow constructs were to goto's: a solution applied universally, even when not suited to the task. |
|
Every one of the languages considered provides classes, inheritance
and polymorphism.
Java
is modeled directly after C++, which helps to make its class
system easy to learn.
Phantom
is also quite conventional, borrowing from and restricting
Modula-3's class system.
Python is the most peculiar of
the three, treating the classes themselves as first
class objects that can be modified on the fly.
Classes are supposed to bring three things to the language party:
a clear way to define abstract data types, a convenient
method to re-use implementations, and subtype polymorphism.
In most object oriented languages abstract data types are
just particular classes which hide implementation details that
the user of the type should not need to see or know about.
Inheritance is used to re-use implementation details,
and subtype relations are usually also represented by the
inheritance relation, allowing the programmer to use a derived
class wherever a base class is expected. Each of the languages
considered provides a portion of these features.
|
|
|
|
|
2.1. Java's class and interface model |
|
| Interfaces help to resolve multiple inheritance problems, but class structure also defines subtype relations so there isn't a clean break between sub-typing and implementation inheritance. |
|
Java
provides two ways to define a type. The first is by inheritance
from an existing class: in this case the new class inherits all
the type and implementation details of the base class, and can
extend or modify it as appropriate. The second method is by the
definition of an interface which defines all of the
signatures of methods that must be provided by a class, but gives
no implementation details.
Class inheritance is specified in the code by defining the new
class as extending the old class. Interface
implementation is defined by saying that a class implements
a particular interface. A class may only extend one other
class, but may implement as many interfaces as it desires. The
multiple implementation of interfaces is analogous to multiple
inheritance in other object oriented languages, but does not
create the confusion of common base classes.
When looking for which method to use
there is no confusion, since there is only one possible
implementation.
This separation simplifies inheritance issues greatly, but it is
perhaps unfortunate that the language doesn't go all the way and
prohibit classes from introducing new methods inside class
bodies. To have done so would have truly separated sub-typing from
implementation inheritance. As it currently stands, if one class
author doesn't anticipate a need to multiply inherit from two
classes, and thus doesn't define interfaces for them, the class
user can never use the classes in the way that is desired.
Unfortunately, even the class libraries provided with the
distribution don't make good use of the interface feature, using
interfaces rarely, and preferring to use class inheritance to
create both subtypes and implementations, in the same way that is
commonly seen in other object oriented languages.
|
|
| The terminology police might object to the use of the word "method" to refer to any data or method member of a class. |
|
As in C++,
Java
provides different levels of data hiding on a method-by-method
basis. Private methods are the most restricted, and are
visible only in the class in which they are defined.
Protected methods have slightly more complicated
semantics, of which for the moment the most interesting is that
they are visible in the base class and in all derived classes.
Finally, public methods provide the interface that is available
to outside users of the class. Typically a class will declare
everything as protected if it wishes to be easily modified or
extended, reserving private for items that are likely to be
changed in a later revision of the class.
|
|
|
|
|
2.2. Phantom's class model |
|
| Phantom's origins as a research project are never more clear than when considering the structure of the class system. |
|
Phantom's
class system is somewhat disappointing.
The main feature lacking is some form of multiple inheritance,
which has been entirely omitted, presumably to keep the language
simple. However, no functionality is provided to replace multiple
inheritance.
Data abstraction is provided in a way not seen in
other widely known languages (it most closely resembles
MOO, an
object oriented language designed for developing multi-user
servers). Read and write permission bits are
used to control access to data members of a class, and
execute permission is used to control visibility of a
member function. The language specification does not make it
clear what access derived classes are given to their parent
classes, so it seems reasonable to assume that there is either no
equivalent of
Java's protected, or
that everything is protected and there is no equivalent of
private. Either way is unsatisfactory for larger
projects.
Since no multiple inheritance is possible sub-typing is somewhat
more limited than in other object oriented languages, but for
simple programming tasks single inheritance would suffice. A
slightly less object-centric programming paradigm would be used
for larger projects, so that one could make good use of
Phantom's
module features for organization rather than using interfaces and
subclassing as one would in
Java.
|
|
|
|
|
2.3. Python goes first class |
|
| First class classes seem to be a potential nightmare for the would-be compiler writer. |
|
Python is the most dynamic of the
three class systems considered here. Absolutely no data
abstraction is provided by the language, which is sure to be a
problem for developing larger pieces of software. In
addition, no types are used in function declarations,
giving a very polymorphic style to programs, since any type with
the correct members is permitted as a parameter.
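The following is a minimal sketch of that polymorphic style, written in present-day Python 3 syntax (an assumption, since the syntax postdates this paper). The class names Duck and Robot and the function announce are invented for illustration.

```python
class Duck:
    def speak(self):
        return "Quack"


class Robot:
    def speak(self):
        return "Beep"


def announce(thing):
    # No parameter type is declared: any object that has a
    # speak() member is accepted, whatever its class.
    return "It says: " + thing.speak()


greetings = [announce(Duck()), announce(Robot())]
```

Duck and Robot share no base class; the call succeeds purely because both provide the member that announce uses.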
What must be re-iterated to be absolutely clear is that
classes are first class objects. This means that new
functions and member variables can be added to a class at any
time, and then new instances of the class created. Existing
instances of the class are affected by changes to the class.
There are even hooks to allow the class programmer to redefine
what happens when a new member is added to a class if desired!
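A short sketch may make the point concrete. It uses present-day Python 3 syntax (an assumption), and the names Point, norm_squared and units are hypothetical.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(3, 4)  # instance created before the class is modified


def norm_squared(self):
    return self.x * self.x + self.y * self.y


# Attach a new method and a new class-level variable to the
# already existing class object.
Point.norm_squared = norm_squared
Point.units = "cm"

q = Point(1, 2)  # instance created after the modification

old_instance_result = p.norm_squared()  # the older instance sees the new method
new_instance_result = q.norm_squared()
```

Both the instance created before the change and the one created after it pick up the new members, because lookup goes through the class object at call time.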
|
|
| Some would say "dynamically typed". |
|
In a typeless system, subtyping becomes irrelevant, but
implementation inheritance is still useful and allowed.
Multiple inheritance is included, with specific rules so that the
programmer can depend
on how the interpreter resolves particular cases.
Although a Smalltalk programmer might feel right at home in the
Python environment, programmers used to strongly typed languages
should watch their step at first. A great number of pitfalls
exist that are not immediately obvious. One example is the behavior of the
namespace: a data member hides a method member of the
same name. The description of the
mix of static and dynamic name resolution in the documentation is
confusing, although the upshot is that name resolution
basically behaves as one would expect it to.
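The hiding of a method by a data member can be sketched as follows, in present-day Python 3 syntax (an assumption). The class Counter and its member count are hypothetical names.

```python
class Counter:
    def count(self):
        return 42


c = Counter()
before = c.count()       # calls the method defined in the class

c.count = 7              # a data member with the same name, stored on the instance
after = c.count          # now just the integer; the method is hidden

try:
    c.count()
    call_failed = False
except TypeError:
    call_failed = True   # an integer cannot be called

del c.count              # removing the data member uncovers the method again
restored = c.count()
```

The method itself is never destroyed; it is only shadowed for that one instance while the data member exists.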
The bottom line is that some would consider
Python to be the most object
oriented language recently developed, but working without the
crutches of type-safety and static class objects can be harder at
first.
|
|
|
|
|
3. Other Abstractions |
|
| Convenience features are a double edged sword, as often they are only convenient for the original implementor of a language. |
|
Java
provides the user with objects,
and uses them as the universal solution to all problems. This
helps keep the core language
small, and allows much of the "standard functionality" to be
provided as libraries written in Java.
This approach can really
simplify the job of creating new Java
implementations. Unfortunately, it doesn't make the programmer's
job any easier, since the class libraries provided with the
language contain sparse documentation that frequently leaves one
looking at the code to understand the features.
Phantom
takes the middle road, providing some useful built-in features
for handling lists and threads.
The list features include working with a slice of the
list which is some subrange of the list's valid indexes, and
an operator for appending lists. Although Scheme-style functions
such as map and reduce are not provided in the language they
would be trivial to implement since closures are provided.
Threads get an above-average treatment in
Phantom
with synchronization built-in as a keyword for class definitions
and a mutex type for writing critical sections supported directly
by the language specification.
Python goes the furthest of all,
providing genuinely useful abstractions built into the language.
The first encountered and most commonly seen abstraction is the
sequence. Lists, strings, and tuples are all examples of
the sequence type. All of these objects have convenient indexing,
slicing, and concatenation facilities. The values a variable in a
for loop takes on are the successive values of any sequence data
type, leading to an intuitive and powerful looping construct.
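As an illustration of the sequence facilities just described, here is a small sketch in present-day Python 3 syntax (an assumption); all variable names are invented for the example.

```python
letters = "distributed"          # a string
primes = [2, 3, 5, 7, 11]        # a list
pair = ("Java", "Python")        # a tuple

first_letter = letters[0]        # indexing
middle_primes = primes[1:4]      # slicing: the elements at indexes 1, 2 and 3
longer = pair + ("Phantom",)     # concatenation

# The loop variable takes on each element of the sequence in turn.
total = 0
for p in primes:
    total = total + p

initials = ""
for name in longer:
    initials = initials + name[0]
```

The same indexing, slicing and looping syntax applies unchanged to all three sequence types.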
|
|
| Dictionaries with keys which are not strings are a relatively recent addition to Python. |
|
Dictionaries are another much-used and convenient
workhorse of Python programs. Similar to associative arrays in
Awk, dictionaries are used extensively both within the language
implementation and by the programmer whenever a lookup table is
needed.
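A lookup table of this kind might look like the sketch below, written in present-day Python 3 syntax (an assumption). The table contents and variable names are chosen only for illustration.

```python
# A dictionary used as a lookup table from protocol name to default port.
default_ports = {"http": 80, "ftp": 21, "smtp": 25}

default_ports["telnet"] = 23            # add an entry
http_port = default_ports["http"]       # look up by key
has_gopher = "gopher" in default_ports  # membership test on the keys

# Keys need not be strings; tuples work as well.
grid = {(0, 0): "origin", (1, 0): "east"}
label = grid[(1, 0)]
```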
A rarely seen extension present in
Python is a more "mathematical"
set of conditional operators. For example,
a < b == c is taken to mean a is less than b,
and b is equal to c. This seems strange only to the
individual who has worked with computers for so long that it
seems more natural to consider c to be a boolean value
which is being compared to the result of the comparison
a < b. In addition, the comparison operators are
well defined for sequence types! The definition is set up so that
comparison of strings works exactly as you would expect; that is,
a sequence s1 is less than a sequence s2 if i is the
first index at which they differ and
s1[i] < s2[i].
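Both behaviors can be checked with a few lines. The sketch below uses present-day Python 3 syntax (an assumption), and the variable names are illustrative only.

```python
a, b, c = 1, 2, 2

chained = a < b == c        # read as (a < b) and (b == c)
c_style = (a < b) == c      # the C reading: compares the boolean result with c

# Sequences compare element by element, from left to right.
strings_ordered = "apple" < "apply"      # first difference at index 4: 'e' < 'y'
lists_ordered = [1, 2, 3] < [1, 2, 4]    # first difference at index 2: 3 < 4
tuples_ordered = (1, 9) < (2, 0)         # first difference at index 0: 1 < 2
```

With these values the chained form is true while the parenthesized C-style form is false, which is exactly the difference in interpretation described above.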
Python
also provides closures, and (more recently) the expected
functions such as map, reduce, etc. In addition, programs written
in Python can create new definitions of functions, classes, or
instances on the fly by passing to the interpreter a string which
is compiled into the environment, so self-modifying code is easy
to write. All in all,
Python
provides the most complete language support for the mundane tasks,
which makes it extremely well suited to getting a prototype
working quickly.
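The features in this paragraph can be sketched together as follows. The code assumes present-day Python 3, where reduce lives in the functools module and exec is a function; the names make_adder, triple and namespace are hypothetical.

```python
from functools import reduce

squares = list(map(lambda n: n * n, [1, 2, 3, 4]))
total = reduce(lambda acc, n: acc + n, squares, 0)


def make_adder(k):
    # The returned function closes over k.
    return lambda n: n + k


add_ten = make_adder(10)

# Source text assembled at run time and compiled into a namespace.
source = "def triple(n):\n    return 3 * n\n"
namespace = {}
exec(source, namespace)
triple = namespace["triple"]
```

Passing an explicit dictionary to exec keeps the newly defined names in one place, which makes it easier to see what the generated code added.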
|
|
|
|
|
4. Modules |
|
| It seems ironic that so many little-used languages provide a strong module system while "professional" developers waste hours stretching C and C++ well beyond their limits in large software projects. |
|
The basic goal of a module system is to package functionality in
a way that can be easily bought and sold by the various consumers
and producers of code. Each lower level library should sell a set
of features to a higher level library, which in turn provides a
set of features to still higher level libraries, and ultimately
applications are little more than a layer of user-interface
between the end user and the various engines that make up the
product.
At first glance, it would seem that
Phantom
should have the strongest module system since it borrows from
Modula-3, whose very name implies a strong module system. Although
Phantom's
system is reasonable, it is a watered down version of the
original and doesn't provide any significant power not present in
Java or
Python.
Fundamentally, all three module systems provide the same type of
functionality, packaged slightly differently. A module exports an
interface (only Java allows a
single module to export more than one interface) which is
imported by a client who wishes to use the interface.
Once imported all the public aspects of the interface are visible
to the client. There are some syntactic differences in
how modules are accessed, but these are simply the syntactic
sugar (or salt) of each individual language. Perhaps the only
difference of note is that in
Java
global functions and data have been eliminated, which means that
packages of classes containing static functions are used instead.
There
is no particular advantage to this other than helping to keep the
language small; ultimately this object-centric view could be a
problem when global generic functions are desired.
|
|
|
|
|
5. Distributed Programming Features |
|
| In reality, neither Java nor Python supports distributed programming. What they've really done is distribute the interpreter. |
|
There can be no doubt that the only reason
Java
is popular is that it is targeted at and suited for developing programs that will
run in your web browser.
Python has been used to build a
browser with very similar features, and so could be considered a
competitor to
Java
except for the lack of marketing.
Phantom,
on the other hand, provides real distributed programming features
but has no application to show them off.
The main area where these languages are trying to shine is in the
security department. Fear is the one thing that might
keep people from using the features: fear that some virus will
come into their computer via the internet and wreak havoc on
their system. For this reason, the emphasis has been on providing
a secure and restricted environment for these web applets to play
in so that users can be amused and not be afraid.
|
|
|
|
|
5.1. Java's distribution model |
|
|
|
|
In the security category, there can be no question that
Java
has the head start. When the interpreter loads
Java bytecodes over the network
they are checked by a verification process to help ensure that
they don't perform any illegal actions, and run-time checks such
as array-bounds checking are performed as well. The only fly in
the ointment that is clear now is the use of the
finalize() method. This method is similar to a C++
destructor in that it is called when an object is being
destroyed. However, there are two things to worry about in terms
of security and finalize(). The first is that
this method can resurrect the object by creating a new
reference to it, thus perhaps breaking the promise that when you
leave a web page the Java objects are killed for you. The second
is that there are no guarantees about which thread will invoke
the finalize() call, which means potentially an
important thread (such as the one running the garbage collection)
could invoke and be killed by the code in a
finalize() member. It is not clear whether these are
real security holes or not as the author has not attempted to
verify whether these attacks are possible, but the documentation
certainly suggests that they are.
In terms of real distributed programming, the support in Java is
disappointing. There is no way to access an object over the
network, and no way to send an object across the network. Only
code can be easily transmitted under the Java model which means
that the data must be communicated by a more conventional
socket-style communication.
|
|
|
|
|
5.2. Phantom's distributed objects model |
|
|
|
|
Phantom
provides the best support for distributed programming of
all three languages. The closest equivalent to a pointer is
really an interpreter location (specified as an IP address and
port) and a 128-bit key which specifies the memory location. The
intent is that these global locators are unforgeable,
since the 128-bit address space is large and sparsely populated,
and any given interpreter transparently translates access to a
remote resource by communicating with the remote interpreter on
the programmer's behalf. Although this system is supposed to be
secure, it is not clear that it is in fact protected from
malicious interpreters, which could create references to real
objects with different types than their actual types in order to
access data that should be inaccessible in the remote
interpreter. However, provided that all of the interpreters are
certified, the mechanism is safe and does not allow the
application programmer to do inherently insecure operations in
the language itself.
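The idea of a global locator can be illustrated with a rough sketch. It is written in Python rather than Phantom, and it is not Phantom's implementation: the names GlobalLocator, Interpreter, export and resolve are all hypothetical, and the address shown is a reserved documentation address.

```python
import secrets
from collections import namedtuple

# host and port identify the interpreter; key is a 128-bit value
# naming one object inside that interpreter.
GlobalLocator = namedtuple("GlobalLocator", ["host", "port", "key"])


class Interpreter:
    """Hypothetical stand-in for one interpreter's table of exported objects."""

    def __init__(self, host, port):
        self.host = host
        self.port = port
        self._objects = {}

    def export(self, obj):
        key = secrets.token_bytes(16)  # 16 random bytes = 128 bits
        self._objects[key] = obj
        return GlobalLocator(self.host, self.port, key)

    def resolve(self, locator):
        # A guessed key will almost certainly miss in the sparse key space.
        return self._objects.get(locator.key)


node = Interpreter("192.0.2.1", 4000)
loc = node.export(["some", "data"])
forged = GlobalLocator("192.0.2.1", 4000, bytes(16))  # an all-zero guess
```

The sketch only captures the sparsity argument. It does nothing about the concern raised above, namely a malicious interpreter that holds a genuine key and lies about the type of the object it refers to.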
Since the interpreter creates closures for all the functions it
encounters, logical and safe static scope rules govern what each
function has access to. The semantics of distributed objects are
borrowed from
Obliq.
Objects are always passed as references (global locators) to the
actual instance, so the programmer can't easily send objects from place
to place, although closures are transmitted so that object-like
entities can be moved around. Useful as this feature is,
one is led to wonder how slow transmitting closures must
be, especially if the programmer isn't taking special care to
keep the free variables few and far between.
|
|
|
|
|
5.3. Python's distributed support |
|
|
|
|
Python is in much the same state as Java, except that it has absolutely no security provisions yet. It too can transmit code across the network, and it would be easy to transmit objects across the network as well using the pickling facility. Naturally, security is being worked on, but it could be a rough game of catch-up to get to where Java already is.
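A minimal sketch of moving an object with the pickling facility follows, assuming the present-day Python 3 pickle module. The record contents are invented, and the network transfer itself is left out: the bytes are simply handed from one variable to another.

```python
import pickle

record = {"owner": "alice", "history": [100, -20, 35]}

payload = pickle.dumps(record)   # a byte string that could be written to a socket
received = pickle.loads(payload) # rebuilt on the receiving side
```

The received value is an equal but separate copy of the original. The security concern applies here as well: the pickle documentation warns that unpickling data from an untrusted source can execute arbitrary code.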
|
|
|
|
6. Miscellaneous Factors |
|
| Although syntactic sugar might be dismissed as unimportant, I've yet to meet a programmer who eats broccoli as his late night snack. |
|
A variety of miscellaneous factors are in the end amongst the
most compelling reasons to choose one language over another.
Phantom
is clearly out of the running, since work on the interpreter has
stopped, and both
Java and
Python are so far ahead in terms
of library support and general developer interest.
Sun's choice of a language to resemble C/C++ will certainly help
Java's popularity. It feels
comfortable - a little strange, perhaps, but basically
comfortable. Any proficient C++ programmer is likely to find
Java liberating. In particular,
freedom from header files, manual memory management,
and the C preprocessor means
that the language is cleaner and faster to develop in than C++.
Sun is also backing their language with
interesting programming contests
with
substantial prizes.
The
media attention
Java is getting will help
ensure a high level of demand for Java programmers.
Java's biggest weakness is the
lack of well documented and professionally finished libraries.
The marketing push is so web-page oriented that the windowing
toolkit is terribly hard to use for real application development,
and more often than not the would-be Java programmer finds that
reading the source is the quickest road to enlightenment.
|
|
| Perhaps using spaces as a meaningful part of program input didn't go out with FORTRAN. |
|
Python can clearly give
Java some stiff competition.
Unfortunately the syntax is just foreign enough to be badly
disconcerting at first, especially the grouping of statements by indentation
level. With some use, however, it wears well as an intuitive and
powerful language.
The complete lack of static type checking means a more
Smalltalk-like feel, but
it also bodes ill for compilation to static code and for larger
projects. Although
Java's abstract window toolkit
is weak, Python's documentation recommends against using its
stdwin module, since it has only been ported to X11 and
Macintosh systems and lacks functionality that would be wanted by
a serious application effort.
|
|
|
|
|
7. Conclusions |
|
|
|
|
Phantom, although an interesting and worthwhile attempt, doesn't
give us much to measure it by since there is no complete
implementation and there never will be.
Although it offers the richest set of distributed programming features
it isn't clear how viable they are without an implementation to
play with.
Java and Python, however, are an evenly matched pair. On the one
hand, Java provides things that Python cannot: a bytecode that
has great potential to be compiled and optimized, a broad and
rapidly growing user base, and a good deal of marketing hype. On
the other, Python has a great array of features for scripting-type
tasks (which is after all the original goal of Python).
Python's dynamic typing is either a blessing or a curse depending
on the programmer's personal point of view, and similar things
can be said about closures and first class classes. Overall it
seems that Java is more suited to development of complicated
systems, whereas Python's features may speed prototype
development but stand in the way of complicated systems.
|
|