Ever wondered how you could make your postings in forums, news and
similar places more interesting?
HTML is the answer and you can learn the basics in just a few
minutes. Read on if you would like to enhance the clarity and
usefulness of your postings and make your voice heard.
Line and Paragraph Breaks
The most important thing to notice about HTML is that a line
doesn’t break unless you tell it to do so. You break a line by typing
<br>, and you separate paragraphs with <p>.
| You type in this: | And it’ll turn out like this: |
| This might be a poem <br> | This might be a poem |
| And the next stanza is here. | And the next stanza is here. |
| This is one paragraph. | This is one paragraph. |
| <p> | |
| This is the next. | This is the next. |
| This is one paragraph. | This is one paragraph. This is the next. |
| This is the next. | |
This one didn’t have a line break, because you didn’t tell it to.
Emphasize your text
You might want to emphasize portions of your text to get your point
across, either by putting them in italics or by making them
bold.
Basically, you just tell the browser when to start to emphasize and
when to stop.
| You type in this: | And it’ll turn out like this: |
| Emphasizing is <em>really</em> useful. | Emphasizing is really useful. (“really” in italics) |
| Emphasizing is <strong>really</strong> useful. | Emphasizing is really useful. (“really” in bold) |
Lists
Lists can help you organize a bunch of information in a clear and
readable way. Most widely used are the ordered and
unordered lists.
Just as with emphasizing, lists require you to tell when to start and
when to stop a list. On top of that you have to mark up each
individual part of the list.
| You type in this: | And it’ll turn out like this: |
| <ul> | |
| <li> It's easy | - It’s easy |
| <li> It enhances readability | - It enhances readability |
| <li> It looks good | - It looks good |
| </ul> | |
| <ol> | |
| <li> It's easy | 1. It’s easy |
| <li> It enhances readability | 2. It enhances readability |
| <li> It looks good | 3. It looks good |
| </ol> | |
Note that the only difference between the two lists is how you start
and stop the list: <ul></ul> makes an unordered list, and
<ol></ol> gives you a list with numbered items.
Links
Inserting a link to another web page is often very useful. The tag
looks like this:
| You type in this: | And it’ll turn out like this: |
| <a href="http://www.pinds.com">My web site</a> | My web site |
The page you want to link to is in quotes
("http://www.pinds.com"). The text you want the user to be able to
click on is between the <a ...> and </a> tags.
Summary
You’ve learned the four most important types of tags:
- Line breaks:
  <p> for paragraph breaks.
  <br> for line breaks.
- Emphasis:
  <em>Italic text</em>
  <strong>Bold text</strong>
- Lists:
  <ul>
  <li>item one
  <li>item two
  </ul>
- Links:
  <a href="http://www.pinds.com">the text to display</a>
This is all you need to make your postings turn out a lot nicer, but
there’s a lot more you can do. You’ll find plenty of resources on the
web for learning HTML. Some of the more important are:
- <a
href="http://photo.net/wtr/thebook/html.html">The
HTML chapter of Philip and Alex’s Guide to Web Publishing</a>
- <a
href="http://hotwired.lycos.com/webmonkey/96/53/index0a.html?tw=authoring">The
Webmonkey guide</a>
- Claus’ longer tutorial
Good luck!
Humans work best when they can concentrate on one task at a time. But
very often, in order to accomplish your task, there are a number of
subtasks that need to be done first. These necessary subtasks are
often seen as annoying and distracting, because they’re not your main
focus of attention. Consequently, they often also end up being badly
performed, because you just want to get them over with, so you can get
back to your real goal.
When you’re hungry, for example, you want to eat. But in order to eat,
you have to cook. And before you can cook, you’ll have to go shopping
and do the dishes from yesterday. But you just wanted to eat, not to
shop or do dishes.
What do we do? We outsource all of the annoying tasks to
McDonald’s. The trick is that people have different interests. The
thing that you find boring and tedious will be seen as interesting
and challenging by someone else.
Enter The Task Board
The same thing happens all the time in companies. I’ll be happily
hacking on my program, when I realize that I could really use this
utility procedure that just isn’t there yet. I can decide to spend a
few days to write it myself, the right way. But most often I’ll just
end up hacking something up quickly that works for me, but isn’t
pretty or reusable, and get on with what I’m doing.
What if I could instead post a request for the utility proc on a task
board? All the other programmers in my company would get instant email
notification, and perhaps there’d be one of them out there who would see
that utility proc as an interesting task, and devote a few days to doing
it right. Or someone might tell me that he’s already made something
almost like what I’m looking for. If nobody responds, I’d end up
in the original situation, having lost nothing for trying.
This need not be used only for stuff that you need badly. The idea is
also to have a place to put all the tasks that would be nice to get
done but aren’t critical. This way, you can put them in, thus
sharing them with everybody. The software will remember them for you,
and probably one day someone will find one of them an interesting thing to
spend a day taking care of.
A nice side effect of this is the transparency in being able to see
what needs people have. Maybe you’ll act on what you see and hire a
few people dedicated to filling some of the gaps.
A Little Software Design
Users can post a proposal and categorize it. By using categories,
users can sign up for alerts only in categories that they’re
interested in. Users can say they’re interested in doing the task. The
system should have a mechanism of coordinating who actually ends up
doing it, so we don’t duplicate efforts. Users should also be able to
register interest in the outcome, in case there are others that could
use the work. Every posting should of course be commentable, so the
interested users can discuss the details of how the task is
accomplished.
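The design above could be modelled roughly like this. It is only a sketch: the class names, the field names and the "first volunteer gets the claim" rule are assumptions made for the example, not a description of any existing system.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    """One posting on the task board."""
    title: str
    category: str
    poster: str
    claimed_by: Optional[str] = None                        # who ends up doing it
    volunteers: list[str] = field(default_factory=list)     # want to do the task
    watchers: set[str] = field(default_factory=set)         # want the outcome
    comments: list[tuple[str, str]] = field(default_factory=list)  # (user, text)


class TaskBoard:
    def __init__(self) -> None:
        self.tasks: list[Task] = []
        self.subscriptions: dict[str, set[str]] = {}         # category -> users

    def subscribe(self, user: str, category: str) -> None:
        """Sign a user up for alerts in one category."""
        self.subscriptions.setdefault(category, set()).add(user)

    def post(self, task: Task) -> set[str]:
        """Add a task and return the users who should be alerted."""
        self.tasks.append(task)
        return self.subscriptions.get(task.category, set()) - {task.poster}

    def volunteer(self, task: Task, user: str) -> bool:
        """Register interest in doing the task.

        Assumed coordination rule: the first volunteer gets the claim, so
        work is not duplicated. Returns True if this user holds the claim.
        """
        if user not in task.volunteers:
            task.volunteers.append(user)
        if task.claimed_by is None:
            task.claimed_by = user
        return task.claimed_by == user
```

A real system would probably let the poster choose among the volunteers instead of handing the task to whoever comes first; the point here is only that "interested in doing it" and "interested in the outcome" are kept as separate lists.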
The Wider Perspective
Although I’m from a programming background myself, this need by no
means be limited to programming tasks. All kinds of tasks are
candidates for outsourcing.
In fact, a Danish company, Oticon,
introduced a so-called “spaghetti” company structure, where people
didn’t have fixed job descriptions. Rather, all sorts of tasks were
performed by putting them up on a board and letting people bid on what
tasks they liked. So a guy who used to do book-keeping but wanted a
feel for graphic design could apply for a task in graphic design, and
try it out. That way, you encourage employees to grow by enabling them
to learn and experiment with different tasks. (Note: The above
description may not be accurate. I read about it some years ago. It
doesn’t matter what exactly Oticon did or did not do. It’s the ideas
that are important.)
By the way: any open source community should have one of these.
When I want to write something, writing on a computer is much more
efficient than writing it with pen and paper. But then I want to throw
in a quick diagram or a drawing to illustrate my point, and the
situation is reversed. There’s nothing inherent in the technology that
says it has to be that way. It’s simply because our user interfaces
aren’t sophisticated enough … yet. Here are some quick
ideas for how this situation could be remedied.
Drawing Diagrams
All the diagramming tools I’ve seen still live by the old MacPaint
paradigm. They’ll have me choose the tool first (“I want to draw a
box”), then let me use it. There’s no reason that the software
couldn’t try to guess from my drawing what I want to do, then adapt to
that.
Here’s how I envision it: I grab my pen and start drawing: boxes,
lines, point to where I want to write an annotation and start typing. The
software should be smart enough to figure out that I’m trying to draw
a box and make it nice and rectangular. It should figure out when I’m
trying to connect boxes with a line and make that line look nice and
smooth. It should also let me quickly and easily move around my box
and keep the lines connected right.
Of course the software’s guess will be wrong at times, so it should be
transparent what it’s doing, and easy for me to correct it without
breaking my rhythm too much. I don’t have any good suggestions for this
yet.
The Mouse Sucks for Drawing
The mouse is a really lame input device, especially for drawing. Humans have
used pens for centuries, and we’re really good at controlling them
with our fingertips, as opposed to mice, where we have to use the
whole arm. Pens will hopefully become more ubiquitous soon.
Knowledge management is one of those very ill-defined buzzwords that
everybody claims to be doing. Also, it’s an area that exposes some
quite fundamental ideological differences. Which is why it’s an
interesting area. Here’s my take on it.
The Problem We’re Trying To Solve
Knowledge Management is fundamentally about creating a forum
where people can teach each other. The same person will
contribute his knowledge in some areas and learn from other
people in other areas. People want to learn and they want to teach. I
like to think of it as cooperatively maintaining a knowledge base.
It’s vital that managers don’t take the “management” part of
“knowledge management” too seriously. It’s not really about
managing knowledge at all, at least not in any
top-down sort of way. It’s all about providing the best tools
possible to help users to share what’s on their mind, not
yours. That is usually what produces the most interesting,
useful and worthwhile information. The internet and the web together are
the world’s largest knowledge base, and they’ve worked pretty well so
far.
There are, obviously, two uses of a knowledge base: You can either put
stuff into it, or you can get stuff out of it. It’s the same group of
people, but as a person, you’ll usually be doing one at a time.
Putting Stuff Into It
Putting stuff into the knowledge base can generally take two forms:
Say you have just finished some project and you’ve learned from
it. You should be able to just put in whatever you feel that
you’ve learned and have it be available to other users. It’s
important that this be completely free-style, so users can post
anything from longish articles to a short book review or simply a link
to some interesting web page.
The other form of putting stuff in there is when someone (your
employer?) decides that they want to collect information on
certain types of objects of interest to the
community. Someone (a moderator or your boss?) would define a
set of object types and, for each object type, a set of
questions. Your answers to those questions become part of the
knowledge base. The good thing about this form is that it
doesn’t require the same writing and teaching skills on the
part of the author. The downside is that it might obscure the
important lessons that the author has to share.
Getting Stuff Out Of It
When a user turns to the knowledge base, it will most often be because
he has a problem at hand that he’s trying to solve,
or a theme he’s interested in learning more about because he needs
the knowledge. Users are directed, which is good, because
only directed attention can generate valuable knowledge. (Remember,
that the knowledge base can only give information; it’s up to the
individual to turn that into knowledge.)
As a starting point, we should do everything we can to <a
href="/software/scoring-content">help users find what they
want</a>. But sometimes the user will not find what he wants, either
because our search tools are not good enough or because the
information is not in there. So he should be able to pose a
question that will be read by hundreds or thousands of real
people. This is a traditional <a
href="http://photo.net/bboard/q-and-a.tcl?topic_id=21&topic=web%2fdb">Q&A
forum</a>, but we should make sure there’s a tie-in with the knowledge
base.
We should of course record all the Q&A threads and make sure they show
up in future searches. More than that, if a question is simply
answered by pointing to an item already in the knowledge base, we
should make sure that if a user in the future goes looking for an
answer to this question, he’ll find that item
directly. I don’t know how that could be done, other than by
having moderators that go over the Q&A’s and re-categorize or add some
keywords to the item. Also, if there isn’t an item in the knowledge
base, a moderator might find the question worthy of one, and ask
someone to write about it. So we should have a good moderator
interface that facilitates this work.
Another way to get things out is to subscribe to
alerts on the content. The thing about alerts is that they
don’t make sense if they never fire, but they also don’t make sense if
you get hundreds a day. That’s an artifact of the human mind. Users
should be able to register alerts on categories or keywords or
anything, either instantly or as a daily/weekly summary.
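One way to picture the instant versus summary choice is a small dispatcher that either notifies right away or queues the item for the next summary. This is a sketch only; `Alert`, `AlertCenter` and the mode names are invented for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    user: str
    keyword: str          # category or keyword to match
    mode: str             # "instant", "daily" or "weekly"


class AlertCenter:
    def __init__(self, alerts):
        self.alerts = list(alerts)
        self.pending = defaultdict(list)   # (user, mode) -> queued item titles

    def publish(self, title, keywords):
        """Return users to notify right now; queue the rest for a summary."""
        notify_now = []
        for alert in self.alerts:
            if alert.keyword not in keywords:
                continue
            if alert.mode == "instant":
                notify_now.append(alert.user)
            else:
                self.pending[(alert.user, alert.mode)].append(title)
        return notify_now

    def summary(self, user, mode):
        """Hand out and clear the queued titles for one user's summary."""
        return self.pending.pop((user, mode), [])
```

A scheduled job would call `summary(user, "daily")` once a day and mail whatever it returns, so a busy category costs the reader one message instead of hundreds.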
In order to share your thoughts, ideas and knowledge, you often have
to write them down. This will often take the form of a short document,
a memo if you will. But a memo is really static and boring. What’s
interesting is the evolution of thoughts and ideas through
collaboration. This paper outlines a software system to
support this (on the web, of course).
The Scenario
Say Wendy Wise has a great idea about how to make politicians tell the
truth. She writes up some initial thoughts about it into our software
system, then publishes it. A bunch of people will get alerted about it
and read it, and she also sends an email with a link to some of her
friends. People read it. If they have something to add, they can post
a comment.
Wendy can tell from Sid Smart’s comments that he has already given this
subject a lot of thought, so Wendy and Sid talk about it and agree
that Wendy should make Sid a co-author. They keep thinking
about the subject (how to make politicians tell the truth). The many
enlightened comments make them see new perspectives on it, so they
revise the document several times as they grow smarter.
Eventually, they might realize that their fundamental assumption
(politicians are not stupid) was wrong. So they decide to
rewrite the whole argument, which makes all the
existing comments irrelevant. So they mark this “version 2.0” and
comments start afresh. Version 1.0 and all intermediary versions are
still available as a link, with all the respective comments, but their
thoughts have evolved to a new level, which is reflected in version
2.0.
The Software Details
The software is fundamentally a tool for groups of authors to share
ideas. The authors are in the driver’s seat.
The authors can edit the document through a web form, or use a desktop
editor and upload versions. The documents will be in HTML, because
that’s what’s used on the web. The software will keep track of
all versions so users can go back to see how it
evolved. The URL of the document will never
change. There’ll be one URL that will always point to the
latest version and another that will always point to a specific
version, and thus is guaranteed not to change. This is important
because we want to encourage links to documents.
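The two kinds of URL could work along these lines. The `/docs/...` paths and the `Document` class are assumptions made up for this sketch; the only thing taken from the text is that one address follows the latest version while another is pinned to a specific one.

```python
class Document:
    """Keeps every saved version; numbering starts at 1."""

    def __init__(self, slug: str) -> None:
        self.slug = slug
        self.versions: list[str] = []

    def save(self, html: str) -> int:
        """Store a new version and return its number."""
        self.versions.append(html)
        return len(self.versions)

    def latest_url(self) -> str:
        # Stable address whose content follows the newest version.
        return f"/docs/{self.slug}"

    def permanent_url(self, version: int) -> str:
        # Address of one fixed version; its content never changes.
        if not 1 <= version <= len(self.versions):
            raise KeyError(version)
        return f"/docs/{self.slug}/v{version}"

    def resolve(self, url: str) -> str:
        """Return the HTML a URL points at."""
        if url == self.latest_url() and self.versions:
            return self.versions[-1]
        prefix = f"/docs/{self.slug}/v"
        if url.startswith(prefix) and url[len(prefix):].isdecimal():
            number = int(url[len(prefix):])
            if 1 <= number <= len(self.versions):
                return self.versions[number - 1]
        raise KeyError(url)
```

Someone who links to `permanent_url(1)` keeps pointing at the text they actually read, while a link to `latest_url()` follows the authors as their thinking evolves.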
Everybody, readers and authors alike, can post their
comments right there on the page, just like on this page. The
comment-on-the-page facility doesn’t accommodate discussions very well,
so if comments turn into a discussion, it can be redirected to a
discussion forum attached to the document.
The author can decide who should have access to read or write the
document. Often, you want to work a little on a document before you
publish it. Thus, the author decides when to
publish. Also, the author can declare an entirely new
version of the document. Unlike revisions, a new version will
not inherit the comments of old versions. The author can work on the
new version without publishing it, just as with the first
version. (The same should probably be the case with revisions.)
We want to encourage and support multiple authors. The obvious problem
is with concurrent editing. <a
href="http://www.loria.fr/~molli/cvs-index.html">CVS</a> has solved
these problems pretty well. It involves being able to show your intent
to edit, perhaps even obtain a lock, but more importantly, the ability
to automatically merge changes as long as they’re not overlapping, and
alerting the user to conflicts that must be manually resolved. Our
software should include these features.
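To make the merging idea concrete, here is a minimal line-based sketch. It is not how CVS itself is implemented; it only illustrates the principle of accepting changes that don't overlap and flagging the ones that do. The function names are made up, and regions that merely touch are conservatively treated as conflicts.

```python
from difflib import SequenceMatcher


class MergeConflict(Exception):
    """Both authors changed the same region of the base text."""


def _changes(base, edited):
    """List (start, end, new_lines) edits that turn base into edited."""
    matcher = SequenceMatcher(None, base, edited, autojunk=False)
    return [(i1, i2, edited[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


def merge(base, mine, theirs):
    """Merge two edited copies of base, all given as lists of lines."""
    my_changes = _changes(base, mine)
    their_changes = _changes(base, theirs)
    for m in my_changes:
        for t in their_changes:
            if m == t:
                continue  # both authors made the identical change
            # Regions that overlap or touch must be resolved by hand.
            if m[0] <= t[1] and t[0] <= m[1]:
                low, high = min(m[0], t[0]), max(m[1], t[1])
                raise MergeConflict(f"base lines {low}-{high}")
    merged = list(base)
    # Drop duplicates, then apply from the end so earlier indexes stay valid.
    unique = {(s, e, tuple(new)) for s, e, new in my_changes + their_changes}
    for start, end, new in sorted(unique, reverse=True):
        merged[start:end] = new
    return merged
```

If Wendy retitles the document while Sid appends a postscript, `merge` returns a text containing both changes; if they both rewrite the same paragraph, it raises `MergeConflict` and a human has to sort it out.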
WAP will fail, because every single assumption it is built on is
wrong. The instant someone delivers true, always-on internet to a
handheld device, nobody will ever care about WAP again. And it will
inevitably happen, because that’s what people want.
Freedom
WAP is built on the old-fashioned broadcast model as we know it
from TV, a model that could be described as “TV with a buy
button”. That model has nothing to do with the internet.
Telcos want tight control over the distribution pipeline, so they’ve
made sure the individual user is tied in with his telco. You can only
access the internet through their gateway, and you only get access to
what they’ll allow you access to. And of course, that’s all about
revenue. And sure, revenue is fine. But the telcos could get revenue
simply by creating a portal, like Netscape and Microsoft have done,
that will be the start page of the cell phone browser. Look at an
announcement like the one between <a
href="http://www.ecompare.com/press.html">E-compare and Sprint PCS</a>. That shows so clearly that this is about telcos
getting money for delivering customers (consumers) to companies, not
the other way around.
What the telcos fail to see is that what generates value on the
internet is exactly its openness. They’re trying once again to believe
in the ideas that made CompuServe and the old AOL fail in favor of
the internet. Companies and ecommerce sites didn’t create the internet
itself, nor the internet takeup that we’ve been experiencing. When
the network is open, as with the internet, users can get what they
really want, because if nobody else delivers, they can do it
themselves. But telcos want to control. After all, that’s what they’ve
been used to doing for so long; it’s all they know.
People don’t primarily want to buy stuff. They want to connect to each
other. They want to talk to each other, meet each other as real
persons. That’s where the real explosion sets in, as it did for the
internet. Think about it: The major attraction of <a
href="http://www.noamazon.com">Amazon</a> is not that you can buy
books without leaving home, it is that you can learn what other normal
people think about the book in question. The internet explosion is due
to letting people meet around common interests and learn. There’s no
reason that shouldn’t go wireless, and it will. But WAP is not the
answer. Or rather, it’s the answer to the wrong question: “How do
we get money out of this?”
Here are some little facts about WAP to get you worried:
- You have to agree to a licence agreement before
being allowed to download the specs from <a
href="http://www.wapforum.org/what/technical.htm">The WAP Forum</a>.
The internet never worked like that. Standards are free and want to be shared.
- Those specifications are in PDF and not HTML. PDF is only good for
printing, and you completely lose all the benefits of the internet,
like being able to link to or from them, easy searching, etc.
- Four of the 29 protocol docs are about push, an idea that failed
miserably on the internet as a whole, because it’s completely contrary
to everything the internet is all about. Push is marketing’s dream,
not users’.
Constraints: Bandwidth, Screen Size, Memory, etc.
WAP is also built on assumptions about technical constraints. To quote
from the spec (gosh, I hope they won’t sue me for doing that (I’d link
if I could)):
- Less powerful CPUs,
- Less memory (ROM and RAM),
- Restricted power consumption,
- Smaller displays, and
- Different input devices (eg, a phone keypad).
and then about the network itself:
- Less bandwidth,
- More latency,
- Less connection stability, and
- Less predictable availability.
These are all constraints that will inevitably go away! CPUs
are getting more and more powerful, smaller and cheaper, memory is
getting cheaper and smaller, batteries are lasting longer, getting
better and cheaper, displays are getting better and cheaper, and so are
wireless network connections. Do you really want to base a whole new
suite of protocols entirely on factors that will go away in a few
years?
The network problems will also go away. First of all,
The only constraints that will not change are the sizes of
our hands, heads and pockets (and I don’t mean metaphorically). Of
course there are limitations to the size of the widget that we’re
willing to carry around. And there’s a limit to how small the keyboard
can be before we start hitting more than one key at a time. I don’t
see what the input device has to do with the design of the protocol. I do
see that screen size matters, but, surprisingly, HTML was designed
with exactly such differing display devices in mind, so that shouldn’t
really be a problem.
The fact remains that existing protocols can do everything you
need. There’s nothing in HTTP or HTML that prevents you from designing
web sites that consume little bandwidth and will work with a small
(and even text-only) screen. If you need to conserve even more
bandwidth, you could probably tunnel them over some compression
protocol without changing HTTP and HTML themselves. And if the cell phone
(or whatever we choose to call the device) would simply stay connected
all the time, instead of having to establish a new connection each
time the user wants to connect, that would help a lot with the initial
latency. The cause of the problem is that the telcos still use a
circuit-switched network technology, in contrast to the
packet-switched nature of the internet.
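As a rough illustration of the compression point, the snippet below gzips a small text-only page and compares sizes. The page content is invented for the example, and gzip merely stands in for "some compression protocol"; HTTP/1.1 can already negotiate it through the Content-Encoding header, so neither HTTP nor HTML has to change.

```python
import gzip

# A deliberately plain, text-only page; the markup is invented for the example.
page = ("<html><body><h1>Train times</h1><ul>"
        + "".join(f"<li>Departure {hour:02d}:15 from platform 2</li>"
                  for hour in range(6, 24))
        + "</ul></body></html>").encode("utf-8")

compressed = gzip.compress(page)

print(len(page), "bytes uncompressed")
print(len(compressed), "bytes after compression")

# The receiving end recovers the page byte for byte.
assert gzip.decompress(compressed) == page
```

Repetitive markup like this compresses very well, which is the typical case for HTML.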
The browser is free to not support Java and JavaScript and plug-ins
and all that crud. HTTP already contains a User-Agent header field
whereby you let the server know that you’re on a limited, small-screen
browser. If the website wants your business, they’d have to make sure
their page will render properly on a small screen and without the
flashy stuff. Web publishers should already be doing that, so what’s
the fuss?
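On the server side, acting on that header can be as simple as the sketch below. The marker strings and the two page variants are made up for illustration; a real site would keep its own list of small-screen browsers.

```python
def wants_light_page(headers: dict[str, str]) -> bool:
    """Guess from the User-Agent header whether to serve the lean page."""
    small_screen_markers = ("handheld", "smallscreen")   # hypothetical tokens
    # Header names are case-insensitive in HTTP, so normalise them first.
    normalised = {name.lower(): value for name, value in headers.items()}
    agent = normalised.get("user-agent", "").lower()
    return any(marker in agent for marker in small_screen_markers)


def render(headers: dict[str, str]) -> str:
    """Pick the page variant for this request."""
    if wants_light_page(headers):
        return "<h1>News</h1><p>Text only.</p>"
    return "<h1>News</h1><img src='banner.gif'><script src='fancy.js'></script>"
```

A phone that announces itself as, say, `ExamplePhone/1.0 (Handheld)` would get the text-only variant, and everything else gets the full page.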
The Japanese Way
Nobody doubts that having the internet available on your cell phone (as
well as on your Palm Pilot) is an excellent idea; in fact, it’s
inevitable. But WAP is, as I’ve made clear above, not the
internet. It’s a proprietary set of protocols designed explicitly to
take all the characteristics of the internet out before delivering it
to the cell phone. The Japanese know what they’re doing.
NTT DoCoMo has launched the <a
href="http://www.businessweek.com/2000/00_03/b3664013.htm">i-mode
phone</a>, which
- Is open by nature; DoCoMo does not play the gatekeeper role.
- Builds on a packet-switched network like the internet.
- Has considerably larger screens than most cell phones; the biggest
one opens like a clamshell, thus allowing the screen to be twice the
size of the gadget itself. It also renders graphics and colors.
The speed is still slow, but it’s quickly getting faster and
faster.
Input devices will always be a problem, but this is not unique to cell
phones. There are several attempts at solving this, e.g. overloading
keys like in cell phones; writing with a pen like on the Palm Pilot;
voice commands. And there are several others. Keep in mind also that
soft, foldable
keyboards and <a
href="http://panopticon.ices.cmu.edu/design/FoldableDisplay.html">displays</a>
are starting to appear. The internet device doesn’t have to look like
a cell phone as we know it. In fact, why would it? After all,
attaching a microphone and speaker with a distance of a few inches is
all it takes to turn a hand-held internet device into a phone (oh yes,
and of course some stuff inside as well, but that shouldn’t be a
problem).
So let WAP die in peace. Nobody really wants it anyway. What we want
is the internet delivered wireless to a handheld device. And we will get
that, sooner or later. When that happens, nobody will care about WAP.
Here are a few things that you want to avoid when you’re building your
website. Don’t worry, you’ll save time by avoiding them. And the world
will benefit.
Don’t Make Links go Bold on Mouseover
Some sites think it’s cool to make links turn bold when the user moves
the mouse over them. Please don’t do this. Your users won’t be able
to click on the link.
Chances are that your link will fit neatly on one line under normal
circumstances, but when the user moves his mouse over the link, and it
goes bold, it takes up more horizontal space, and the browser has to
wrap the link onto the next line. Of course, this means the link moves
out of the spot where the mouse pointer is, so now you can’t click
it. The browser hasn’t really noticed yet that the mouse isn’t over
the link anymore; it doesn’t see that until you move your mouse ever
so slightly, at which point the link goes back to normal and snaps
back into place.
Here’s what such a link looks like: <a href="#"
class="bold">This is a link, try moving your mouse over it.</a>
Don’t Use Frames: Just Don’t
When you publish something on the web, you want other people to link
to your pages. Links, whether on a web page or in an email, are
word-of-mouth, the only form of advertising that’s ever worked (and
everybody knows that, although your average marketing guy might not
admit to it). You actively want people to link to your site.
When using frames, people can’t link to the pages that interest
them; the interesting part is almost never the opening page. So
people won’t link to your site. Too bad, because that means lost
visitors.
Besides, frames are simply bad user interfaces. When you’re using
frames, they’ll usually not have the right size for my font size,
screen size or whatever, and I’ll have to scroll around them. But
before I can scroll, I’ll usually have to reach for the mouse and
click in the frame I want to scroll. Ugh! Frames stink. Such a
disgrace to see that one of
the Gods of interaction design is using frames on his
website. What does that tell you? Be sceptical when he talks about
designing for the web.
Leave The Link Colors Alone
Links are the key user interface on the web. The user’s manual
for a web browser reads: “If it’s blue and underlined, click on
it!”. Don’t mess with link colors. They should be blue and
underlined. A visited link should show in a slightly darker color. The
visited link color is actually very important when navigating the
web. It’s a subtle, mostly subconscious reminder, that you’ve been
there, it sort of marks off your territory as you go along, like a dog
spraying telephone poles. Some webmasters think it’s better to
force the visited link color to be the same as the fresh link
color. They’re dead wrong. It’s not. (Note! I used to have a link to
AOLserver here, but they’ve changed
it.)
And don’t fall for the temptations of CSS to <a
href="http://synkron.com/default.asp?id=33">make links only underlined
when the mouse is over them</a>, or other stupid tricks like that. The
user interface of that sucks. If you do that, your readers will have
to let their mouse wander over all of the page like a combine
harvester before they realize what’s clickable and what’s not. It
might look good, but it doesn’t work well in practice.
Why Links Are So Important
Why are the links so important? Because the links are what bring
value to a web page. Who wrote it or what web page it’s on is not as
important as helping your reader find the information she needs.
Think about it! The number one most visited page on the web is <a
href="http://yahoo.com">Yahoo!</a>, and that page almost exclusively
contains links. This even goes one step further: Today, it’s not as
important to know the answer to a lot of things, as it is to know
where to find answers, and to be able to connect the people
in need of answers with the people that have the answers.
Don’t Set Explicit Font Sizes
Most browsers are equipped with the ability to scale fonts on-screen
at my discretion. If you’re running MSIE on Windows and you’re
equipped with a wheel mouse, you can even hold down the Ctrl key and
roll your wheel to scale fonts up and down. But it doesn’t work, if
the publisher has decided to put explicit font sizes in his HTML or
CSS code. I can tell my browser to adjust the font size as much as I
like, but nothing changes, except perhaps the size of the bullets in a
bulleted list! I know that I can go to my browser config and tell it
to use my settings over the publisher’s, but not everybody will figure
that out. And why should I have to do it? Most of the time I’m quite
happy with what the publisher has decided, but sometimes it’s simply
too small or too big, depending on what screen resolution the designer
happened to be using. And that’s when I hit Ctrl and the wheel and
nothing happens.
First, here’s an example of what you shouldn’t do (absolute size
specified in points), followed by an example of what you should
do. Try for yourself to scale the font size. Note in particular how
the bullet in the first line is resized, but the text stays the same.
This line has a fixed size of 12 pt
<p style="font-size: 12pt;"> ... </p>
This line has a size setting saying large
<p style="font-size: large;"> ... </p>
Be Careful When Using Flash
Flash is sort-of the right idea: Using vector graphics for the
web. It’s right, because vector graphics will display well at all
screen sizes and resolutions, plus it’s relatively small and therefore
fast to download. It’s even more right, because we do need ways to
enrich the user interfaces that a browser is able to present.
It’s just that Flash is so poorly engineered. Flash doesn’t support
dynamic websites very well. True, the <a
href="http://openswf.org">file format used by Flash</a> is open. But
it’s an annoying binary format which is hard to debug. And Macromedia
has never been very good at software
engineering. In other words, Flash lacks a plain-text
representation reflecting the fact that Flash is simply yet another
programming language.
Moreover, Flash encourages the author to completely circumvent the
existing UI of the browser, including scroll lists, links, back
buttons and much more. Instead, authors will typically completely
reinvent the user interface from scratch each time, and while the
author thinks it looks very k00l, the fact is that users don’t have any
interest whatsoever in your user interface, they want things done. So
having non-standard interface idioms is simply bad business.
There’s also the basic problem of forcing your users to download a
plug-in; a plug-in that doesn’t even exist for several flavors of
Unix.
For an example of a truly horrible Flash site, go to <a
href="http://www.siemens.com/en2/flash/">the Siemens homepage</a>. Several
things suck about this page. First and foremost, links aren’t blue and
underlined anymore, as they should be. Second, their reinvention of
the scrolling list (“by category”) happens to not work the same way
as the user is used to (clicking in the darker area between the arrow
and the slider only scrolls one line and not a screenful as usual;
clicking and holding the mouse button down on the arrows only scrolls
one line, not continuously as it would normally). You also have to wait
while all the things scroll into place, which is a very good way of
making sure nobody will ever want to use this site as part of their
daily work. The mouseover effect on the top navbar might look k00l to
some, but it’s so slow it doesn’t work as a UI feedback effect.
I’ll be fair and give Flash a chance to defend itself
References
Don’t forget to visit the <a
href="http://stgtech.com/staff/gcallah/BadHTML/bad1.html">How to
Create Really Cool Web Sites</a> page.
The Big Picture
LDAP is basically a specialized database. Some of the characteristics are:
- It consists of entries organized in a hierarchy.
- It favors reading over writing.
- Every entry has a primary key called the Distinguished Name (DN).
- Its notion of schema is much more flexible than that of an RDBMS.
You typically use an LDAP directory to store information about
entities like people, offices, machines and the like. But you could
equally well store most other relatively static information in there,
e.g. information about books or movies or cars. Anything you can
describe by a set of attributes.
For a more thorough and hand-holding explanation, read Michael
Donnelly’s article (http://ldapman.org/articles/intro_to_ldap.html).
It’s pretty good. The guy knows what he’s
talking about.
What Is LDAP
LDAP is a protocol. Actually, it’s the Lightweight Directory
Access Protocol. Lightweight, as opposed to its ancestor, the X.500 DAP
protocol.
Since it’s a protocol, you can implement it any which way you like, as long
as the protocol is adhered to. There are a number of implementations to
choose from, including the free
OpenLDAP or Oracle’s LDAP server
(http://oradoc.photo.net/ora816/network.816/a77230/toc.htm),
which runs on top of an Oracle database.
As for the client implementation, the OpenLDAP distribution includes
client libraries that you can link into your C programs to LDAP-enable
your application. Java 1.2 comes with JNDI
(http://java.sun.com/products/jndi/docs.html), which
includes client support for LDAP.
A word of warning: there are two versions currently out there, LDAPv2
and LDAPv3. LDAPv3 solves some important problems, including more
secure authentication and transport security like SSL. The problem is
that we’re still waiting for the open-source implementation of this
from OpenLDAP. You can’t just assume
that you can use a v3 client against a v2 server. They’re quite
different.
Don’t forget where to find the standards. Michael Donnelly
(http://ldapman.org) has collected links to all the relevant
RFCs (http://ldapman.org/ldap_rfcs.html).
What’s In A Directory
A directory contains objects, or entries. Each entry has a
Distinguished Name (DN), that acts as the entry’s primary key. The
entry could be me:
uid=lars, ou=persons, dc=arsdigita, dc=com
Or it could be a machine:
cn=ls.arsdigita.com, ou=machines, dc=arsdigita, dc=com
Or just about anything else. The entry itself will have a bunch of
attributes holding the information. They could look like this:
dn: uid=lars, ou=persons, dc=arsdigita, dc=com
objectclass: person
objectclass: organizationalPerson
objectclass: inetOrgPerson
objectclass: arsdigitaPerson
uid: lars
cn: Lars Pind
cn: Lars Holger Pind
sn: Pind
givenName: Lars
title: Developer
telephoneNumber: 718-387-0115
jpegPhoto: (my portrait)
mail: lars@pinds.com
loginshell: /bin/bash
userPassword: {crypt}X5/DBrWPOQQaI
homeDirectory: /home/lars
uidNumber: 10
gidNumber: 10
The objectclass values say which other
attributes MUST or MAY be present. The other attributes contain
the various information we want to store about the user. The same
attribute can have multiple values, as with the cn attribute
above.
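To make the multi-valued nature of attributes concrete, here is a minimal Python sketch that reads an entry written like the one above into a dictionary mapping each attribute name to a list of values. The helper name parse_entry is made up for illustration, and the parser deliberately ignores real-world details such as continuation lines and encoded binary values.

```python
from collections import defaultdict

def parse_entry(text):
    """Parse 'name: value' lines into {name: [values...]} (hypothetical helper)."""
    entry = defaultdict(list)
    for line in text.strip().splitlines():
        # Split on the first colon only; values may contain colons themselves.
        name, _, value = line.partition(":")
        # Attribute names are treated as case-insensitive, so normalize them.
        entry[name.strip().lower()].append(value.strip())
    return dict(entry)

sample = """
dn: uid=lars, ou=persons, dc=arsdigita, dc=com
objectclass: person
objectclass: organizationalPerson
uid: lars
cn: Lars Pind
cn: Lars Holger Pind
sn: Pind
"""

entry = parse_entry(sample)
print(entry["cn"])           # both common names
print(entry["objectclass"])  # every class the entry belongs to
```

Storing every attribute as a list, even the single-valued ones, keeps the code that reads entries uniform.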
The Structure of the Directory
The directory structure is hierarchical. The tree structure will be
familiar to every programmer (take DNS or the Unix file systems as
other examples). Look at the DN again:
uid=lars, ou=persons, dc=arsdigita, dc=com
It shows the hierarchical structure: The top-level node here is
dc=com. Then comes dc=arsdigita, dc=com hanging off
of that. Then, under that is ou=persons, dc=arsdigita,
dc=com, and under that is where you’ll find me.
As is customary with trees, you can also use the abbreviated form of a
DN, called a Relative Distinguished Name (RDN). This is the leftmost
part of the DN, and is relative to its position within the
hierarchy.
Every LDAP server also has a base. For ArsDigita, the base of our
directory would probably be dc=arsdigita, dc=com. Everything
under this node is managed by our directory server; everything else in
the world is (potentially) managed by some other directory
server. This divides the LDAP name space between LDAP servers, as
is done with DNS.
So in other words, the DN is composed of:
- The base DN: dc=arsdigita, dc=com
- The position relative to the base: ou=persons
- The RDN of the entry: uid=lars
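The decomposition above can be sketched in a few lines of Python. The function names are invented for this example, and the splitting is naive: it assumes no escaped or quoted commas inside values and compares components case-insensitively, which a real directory would decide per attribute syntax.

```python
def split_dn(dn):
    """Split a DN into its components, leftmost (most specific) first.

    Naive: assumes no escaped or quoted commas inside values.
    """
    return [part.strip() for part in dn.split(",")]

def rdn(dn):
    """The leftmost component, i.e. the entry's name relative to its parent."""
    return split_dn(dn)[0]

def parent(dn):
    """The DN of the node this entry hangs off of."""
    return ", ".join(split_dn(dn)[1:])

def is_under(dn, base):
    """True if dn lies at or below base in the tree."""
    parts = split_dn(dn.lower())
    base_parts = split_dn(base.lower())
    return len(parts) >= len(base_parts) and parts[-len(base_parts):] == base_parts

dn = "uid=lars, ou=persons, dc=arsdigita, dc=com"
print(rdn(dn))
print(parent(dn))
print(is_under(dn, "dc=arsdigita, dc=com"))
```

A server could use a check like is_under to decide whether a request falls inside its own part of the name space or belongs to some other server.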
How you structure this, your Directory Information Tree (DIT), and what
you choose for your base DN, is up to you. You can get a feel for the
concerns by reading
about what Netscape does (or thought they’d do as of October
1998) at
http://developer.netscape.com/docs/manuals/isdirdeploy/chaptera.htm#1012406.
As in all such cases, you should think about what needs you
have and will have in the future.
One thing most people seem to agree on by now is that using your DNS
domain as your base DN as in dc=arsdigita, dc=com is a very
good idea.
What’s the Schema
The schema definition of LDAP is a little messy, albeit very
flexible. The LDAP schema, as with all database schemas, is the
definition of what can be stored in the directory. The basic thing in
an entry is an attribute, like
givenName. Each attribute is associated with a
syntax that determines what can be stored in that
attribute (plain text, binary data, encoded data of some sort), and
how searches against them work (case sensitivity, for example). An
objectclass is a three-tuple, consisting of (name,
required attributes, allowed attributes), saying what other attributes
must or may be present.
There is a standard core of schema definitions (object classes,
attributes and syntaxes), and you can define your own to suit your
particular needs. Most every organization will want to do that.
Object Classes
Let’s take an example object class definition:
objectclass person requires sn cn allows userPassword
telephoneNumber seeAlso Description
To create an object (entry) of a specific class, you add an attribute
to your entry that says objectclass: person, or whatever
object class you want to use. When you do that, you must also add the
other required attributes. You can have an object be an instance of
multiple classes, simply by adding more objectclass
attributes.
The object classes are furthermore arranged into an inheritance
hierarchy, such that, e.g. objectclass: organizationalPerson
is a subclass of objectclass: person. The caveat here is that
LDAP doesn’t really deal with the inheritance (this is one of the
things pulled out from X.500 to make it lightweight). What it does
instead is that the definition of organizationalPerson
requires that objects of this class also include
person. This is the reason my example entry given above
contains four different values for the objectclass attribute.
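Here is a rough Python sketch of what checking an entry against its object classes amounts to: take the union of what all listed classes require and allow, then compare with the attributes actually present. The person definition follows the example above; the organizationalperson definition is a made-up stand-in, not the standard one, and the function name is invented for illustration.

```python
# Object class definitions as (required, allowed) attribute sets.
# "person" follows the definition quoted above; "organizationalperson"
# is a hypothetical stand-in, not the standard definition.
SCHEMA = {
    "person": ({"sn", "cn"},
               {"userpassword", "telephonenumber", "seealso", "description"}),
    "organizationalperson": (set(), {"title", "ou"}),
}

def schema_errors(entry):
    """Return schema violations for an entry given as {lowercase attr: [values]}."""
    attrs = set(entry) - {"objectclass"}
    required, allowed = set(), set()
    errors = []
    for cls in entry.get("objectclass", []):
        if cls.lower() not in SCHEMA:
            errors.append(f"unknown object class: {cls}")
            continue
        req, may = SCHEMA[cls.lower()]
        required |= req
        allowed |= req | may
    errors += [f"missing required attribute: {a}" for a in sorted(required - attrs)]
    errors += [f"attribute not allowed: {a}" for a in sorted(attrs - allowed)]
    return errors

good = {"objectclass": ["person", "organizationalPerson"],
        "cn": ["Lars Pind"], "sn": ["Pind"], "title": ["Developer"]}
bad = {"objectclass": ["person"], "cn": ["Lars Pind"], "title": ["Developer"]}

print(schema_errors(good))
print(schema_errors(bad))
```

The second entry fails twice: it lacks sn, which person requires, and it carries title, which only becomes legal once the entry also lists the class that allows it.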
Attributes
An attribute definition just contains a name, possibly an abbreviated
name, and the syntax the attribute uses. It can also contain a
description. It can say something like:
attribute commonName cn cis
This defines the attribute commonName, abbreviated as cn, syntax cis
(case-insensitive string).
Syntaxes
Some common syntaxes include:
- bin: binary
- ces: case-sensitive (case exact) string
- cis: case-insensitive (case ignore) string
- tel: telephone number string
- dn: distinguished name string
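To illustrate how a syntax affects searches, the following Python sketch compares values under three of the syntaxes listed above. The function names are invented, and the normalization rules (collapsing whitespace, dropping spaces and hyphens from telephone numbers) are simplifying assumptions rather than a faithful copy of any server's matching rules; bin and dn are left out.

```python
def normalize(value, syntax):
    """Reduce a value to the form used for equality comparison (simplified)."""
    if syntax == "ces":            # case exact string
        return " ".join(value.split())
    if syntax == "cis":            # case ignore string
        return " ".join(value.split()).lower()
    if syntax == "tel":            # telephone number: ignore spaces and hyphens
        return value.replace(" ", "").replace("-", "").lower()
    raise ValueError(f"unsupported syntax: {syntax}")

def matches(a, b, syntax):
    """True if the two values are considered equal under the given syntax."""
    return normalize(a, syntax) == normalize(b, syntax)

print(matches("Lars Pind", "lars  pind", "cis"))
print(matches("Lars Pind", "lars pind", "ces"))
print(matches("718-387-0115", "718 387 0115", "tel"))
```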
The Standard Schema
Exactly what object classes are defined and how they’re defined is a
bit unclear to me. Apparently, the RFCs
(http://ldapman.org/ldap_rfcs.html) collectively define
a bunch of them, although some are outdated. The best resource for
information about this is the LDAP schema repository
(http://www.hklc.com/ldapschema/),
where you can browse object classes, attributes, syntaxes and matching rules.
Defining Your Own
It’s pretty straightforward to define your own attributes and object
classes with OpenLDAP. Here’s an
excerpt from the man
pages for slapd.conf
(http://www.openldap.org/software/man.cgi?query=slapd.conf&sektion=5&apropos=0&manpath=OpenLDAP+1.2-Release):
attribute <name> [<name2>] { bin | ces | cis | tel | dn }
    Associate a syntax with an attribute name. By
    default, an attribute is assumed to have syntax
    cis. An optional alternate name can be given for
    an attribute. The possible syntaxes and their
    meanings are:
        bin   binary
        ces   case exact string
        cis   case ignore string
        tel   telephone number string
        dn    distinguished name
objectclass <name> requires <attrs> allows <attrs>
    Define the schema rules for the object class named
    <name>. These are used in conjunction with the
    schemacheck option.
The schemacheck option? I hear you ask. With OpenLDAP you can turn off
the checks for whether the attributes required by the object class are
in fact present. Huh.
References
- ldapman.org has some great
introductory articles.
- The LDAP Schema
Repository is indispensable for figuring out what to stuff in
there and how.
- A System
Administrator’s View of LDAP by Bruce Markey from Netscape is
a very clear introduction to our use of it (note how his layout style
resembles ours :-P).
- Jeff Hodge’s
LDAP roadmap and faq which seems to be the authoritative guide to
links. Unfortunately, it’s so badly organized that it’s almost not
worth it. Beware that this guy is way confused about “versioning” his
web site, so you may very well find yourself reading something
out-of-date by more than a year! Check the “Last updated” on top of
the page and try the other versions.
- The Yahoo! category
(http://dir.yahoo.com/Computers_and_Internet/Communications_and_Networking/Protocols/LDAP__Lightweight_Directory_Access_Protocol_/)
has fine links.
- Here’s
something about the Abstract Syntax Notation used in specifying the
protocol.
- Here’s
something about the Basic Encoding Rules defining what the
protocol looks like on the wire
(http://renoir.vill.edu/~cassel/netbook/ber/node1.html).
- More about BER,
this time LDAP-specific
This page contains several ideas on how to help rank and categorize
content on a web site, to help users find the information they are
looking for. The basic premise is that by watching what users actually
do and counting that as implicit votes, we can generate much more
precise information about how valuable content actually is to users,
than by asking users to explicitly score items.
Categorization
As is well known in the human sciences, especially information science,
categorization is in and of itself very problematic and not
necessarily helpful to users. Coming up with proper categories and
mapping the real world onto those categories is basically an
unsolvable problem, and doing it so that it generates value for the
user is hard. The categories chosen must depend on a deep knowledge of
the actual content you want to categorize. The best way to deal with
that is probably to come up with a limited set of categories, and let
the set of available categories evolve as content develops and you
learn about the content and your users.
We need ways to let not only the set of available categories, but also
the categorization of individual items evolve over time. To do the
first, it’s reasonable to let your users suggest new categories. To do
the latter, you should at least give each mapping between content and
a category a weight. A weight is a number, either positive or
negative that says how strong the relation between the content
and the category is.
But the act of categorizing content shouldn’t be solely in the hands
of authors and moderators. Empower your readers, by letting readers
also suggest their idea of how a particular content item
should be categorized. The readers’ categorizations should be used to
adjust the weights of content-to-category-mappings in some balanced
way, so that the mappings gradually get better for the readers.
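One possible reading of “some balanced way” is to nudge the weight a small step toward each reader’s opinion, so that no single reader can swing a mapping. The Python sketch below assumes weights in the range -1 to 1 and a reader vote of +1 (“belongs here”) or -1 (“does not belong here”); the function name, the rate and the sample keys are all invented for illustration.

```python
def adjust_weight(weight, reader_vote, rate=0.1, low=-1.0, high=1.0):
    """Nudge a content-to-category weight toward a reader's vote.

    reader_vote is +1 ("belongs here") or -1 ("does not belong here").
    A small rate keeps any single reader from swinging the mapping.
    """
    new_weight = weight + rate * (reader_vote - weight)
    return max(low, min(high, new_weight))

# The author's initial categorization of a hypothetical item.
weights = {("ldap-intro", "directories"): 0.5}
key = ("ldap-intro", "directories")

# Four readers weigh in: three agree with the category, one does not.
for vote in (+1, +1, -1, +1):
    weights[key] = adjust_weight(weights[key], vote)

print(round(weights[key], 3))
```

Because each step moves only a fraction of the distance toward the vote, the weight settles wherever the readers’ opinions balance out instead of jumping around with every click.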
Scoring Content
We should always strive to score content based on all the information
we can get. Higher scoring content should bubble to the top whenever
the user is searching or browsing for content.
Scoring can happen along several dimensions, depending on what the
user is looking for. Sometimes we will want to favor new
content. Sometimes we want to favor content that has generated a lot
of comments. When browsing a specific category, we’ll want the weight
of the mapping of a content to that category to influence the
score. And we’ll try to associate a notion of “quality” with a piece
of content, and consequently score high-quality content higher. This
is what we use the notion of votes for.
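As a sketch of how these dimensions might be combined, the Python function below adds up a freshness term, a comment term, the category weight and the quality value, each with its own adjustable factor. The field names, the 30-day decay and the use of a logarithm for comments are assumptions made for this example only.

```python
import math

def score(item, category=None, now_days=0.0,
          w_fresh=1.0, w_comments=1.0, w_category=1.0, w_quality=1.0):
    """Combine the scoring dimensions into one number; higher sorts first."""
    age = max(0.0, now_days - item["published_days"])
    fresh = math.exp(-age / 30.0)                  # fades over roughly a month
    comments = math.log1p(item["comment_count"])   # diminishing returns
    cat = item["category_weights"].get(category, 0.0) if category else 0.0
    return (w_fresh * fresh + w_comments * comments
            + w_category * cat + w_quality * item["quality"])

items = [
    {"id": "old-popular", "published_days": 0, "comment_count": 40,
     "category_weights": {"ldap": 0.2}, "quality": 3.0},
    {"id": "new-quiet", "published_days": 95, "comment_count": 1,
     "category_weights": {"ldap": 0.9}, "quality": 1.0},
]

# Default ranking when browsing the "ldap" category on day 100.
ranked = sorted(items, key=lambda i: score(i, "ldap", now_days=100), reverse=True)

# Ranking for a user who only cares about freshness and category fit.
fresh_first = sorted(
    items,
    key=lambda i: score(i, "ldap", now_days=100, w_comments=0.0, w_quality=0.0),
    reverse=True,
)
print([i["id"] for i in ranked], [i["id"] for i in fresh_first])
```

Keeping the factors as parameters lets the same function serve “what’s new”, “most discussed” and category browsing just by changing the weights.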
Explicit Votes
Explicit votes (“rank this item from 0 to 5”) are not very
interesting. When given the option of giving from zero to five stars
to some item, most readers go for either zero or five. If their
feeling about the content is somewhere in-between they probably won’t
rank it. The deeper problem with simply giving a zero-to-five vote is
that it doesn’t say anything about why. So an explicit rank should
always be associated with some comment. It goes without saying that
all content should be commentable.
Implicit Votes
Implicit voting is much more interesting. Implicit voting is what
happens when you email a link to a friend. You’re telling your friend
to look at this because you think he’d find it interesting. We want to
capture those implicit votes as best we can, and record that as a
testimony to the item’s “quality”. Thus, we want to offer a “send this
page to a friend” option on every content page. Whenever a page is
recommended, we record it, and increase the “quality value” of that
item. Since most people have a limited set of people that they
regularly recommend pages to, we can increase usability by showing the
last five-or-so people they’ve emailed recommendations to and letting them
recommend with one click (hope Amazon won’t sue us for this; see
http://www.noamazon.com/).
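A minimal in-memory Python sketch of this bookkeeping could look as follows. The names are invented, a real site would keep this in its database, and counting every recommendation as exactly one quality point is just an assumption.

```python
from collections import defaultdict, deque

quality = defaultdict(float)                    # item id -> quality value
recent = defaultdict(lambda: deque(maxlen=5))   # sender -> last recipients

def recommend(sender, recipient, item_id):
    """Record an emailed recommendation as an implicit vote."""
    quality[item_id] += 1.0
    recipients = recent[sender]
    if recipient in recipients:
        recipients.remove(recipient)   # move to the front instead of duplicating
    recipients.appendleft(recipient)   # the oldest entry drops off when full

def one_click_choices(sender):
    """The people to offer as one-click recipients, most recent first."""
    return list(recent[sender])

for i in range(7):
    recommend("lars", f"friend{i}@example.com", "page-1")
recommend("lars", "friend3@example.com", "page-2")

print(one_click_choices("lars"))
print(quality["page-1"], quality["page-2"])
```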
Another form of implicit vote is links. If someone chooses to link to
our content, it’s probably because it’s valuable to them, i.e. high
quality. So we should catch Referer headers and record each as a vote.
Bookmarks are another form of link. We should have a link on every
content page that will allow users to add a “virtual bookmark” to that
content piece, and we record the act as a vote. The bookmarks can be
stored on the same site, so the user has a bookmarks page (/bookmarks)
there, or we can offer to
integrate with Yahoo!
Bookmarks or other sites that the users might prefer.
By allowing authors of content to include links to other items, we
might even be able to do something à la Google
(http://www.google.com/why_use.html) to build up a web
of votes from content items to each other. A link is a vote. A link
from a highly scoring author weighs more.
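A single-pass Python sketch of “a link is a vote, weighted by the linking author” follows. It is far simpler than what Google does (no iteration, no normalization), and the rule that authors cannot vote for their own items, the default score of 1.0 and all names are assumptions made for illustration.

```python
def link_votes(links, author_of, author_score):
    """Sum incoming links per item, each weighted by the linking author's score.

    links: iterable of (from_item, to_item) pairs.
    """
    votes = {}
    for src, dst in links:
        linker = author_of.get(src)
        if linker is not None and linker == author_of.get(dst):
            continue   # assumption: authors cannot vote for their own items
        # Authors without a known score count as an ordinary vote of 1.0.
        votes[dst] = votes.get(dst, 0.0) + author_score.get(linker, 1.0)
    return votes

author_of = {"a": "alice", "b": "bob", "c": "carol", "d": "alice"}
author_score = {"alice": 3.0, "bob": 1.0}
links = [("a", "c"), ("b", "c"), ("d", "a"), ("c", "b")]

votes = link_votes(links, author_of, author_score)
print(votes)
```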
Scoring the Voters
The next level of indirection is when we start ranking
rankers (http://www.epinions.com/help/index.html?show=web_of_trust).
There are two things we want to do here. First, some
people just say ridiculous things whenever they open their mouth. We
want to avoid having their votes count by downplaying them. Second,
not all people agree in their opinions. We want to establish
links between users that say “user x generally agrees with user
y”.
We do this by letting users say whether a comment was useful to them
or not, on a per-comment basis. Such a vote influences both the score
of the comment in question, and the score of the author of that
comment. This will let us bubble interesting comments to the top, and give
more weight to votes from highly valued people.
We also let users explicitly say that they trust some other
person. That indicates an agreement in preferences and will directly
influence the scoring of content for that particular user. It will
also give the person being trusted a higher score.
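The Python sketch below puts these pieces together in the simplest way I can think of. The amounts (half a point to the author per comment vote, one point for being trusted, double weight for trusted voters) are arbitrary placeholders, and all names are hypothetical.

```python
from collections import defaultdict

comment_score = defaultdict(float)
author_score = defaultdict(float)
trusts = defaultdict(set)            # user -> users they trust

def rate_comment(comment_id, author, useful):
    """A 'useful / not useful' click moves both the comment and its author."""
    delta = 1.0 if useful else -1.0
    comment_score[comment_id] += delta
    author_score[author] += 0.5 * delta    # arbitrary damping factor

def trust(user, other):
    """Record explicit trust; the trusted person's score goes up too."""
    trusts[user].add(other)
    author_score[other] += 1.0

def vote_weight(voter, viewer):
    """How much voter's vote counts when ranking content for viewer."""
    base = max(0.0, 1.0 + author_score[voter])   # never below zero
    return base * (2.0 if voter in trusts[viewer] else 1.0)

rate_comment("c1", "bob", True)
rate_comment("c1", "bob", True)
for _ in range(3):
    rate_comment("c2", "eve", False)
trust("lars", "bob")

print(vote_weight("bob", "lars"), vote_weight("bob", "anna"), vote_weight("eve", "lars"))
```

Note that the weight depends on who is looking: bob’s votes count double for lars, who trusts him, but not for anyone else.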
Recording User Interest
As the user searches and browses, we note which search keywords and
categories the user likes. That way, we can even further score content
to a specific user’s liking. With some clever clickstream
analysis (http://click.arsdigita.com/doc/clickstream),
we should even be able to estimate how much time the user
spends on individual pages, and use that to figure out whether the
user liked what he found or not. That’s even more implicit voting.
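A rough way to estimate time spent per page is to take the gap between one request and the next one from the same user. The Python sketch below does that, assuming the clickstream is available as (timestamp, URL) pairs; the 30-minute cut-off and the function name are made up for the example.

```python
def dwell_times(clicks, max_gap=1800):
    """Estimate seconds spent per page from one user's clickstream.

    clicks: list of (timestamp_seconds, url), sorted by time.
    The time on a page is taken as the gap until the next request; gaps
    longer than max_gap are treated as the user having left, and the last
    page gets no estimate because nothing follows it.
    """
    totals = {}
    for (t, url), (t_next, _) in zip(clicks, clicks[1:]):
        gap = t_next - t
        if gap <= max_gap:
            totals[url] = totals.get(url, 0) + gap
    return totals

clicks = [(0, "/ldap"), (240, "/flash"), (250, "/ldap"), (4000, "/bookmarks")]
print(dwell_times(clicks))
```

Four minutes on a page suggests it was actually read, while ten seconds suggests a quick bounce; how to turn those numbers into votes is left open here.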