Lars Pind

internet software, coaching, and entrepreneurship


HTML for ACS Use

May 22, 2000 · 1 comment

Ever wondered how you could make your postings in forums, news and similar places more interesting?

HTML is the answer and you can learn the basics in just a few minutes. Read on if you would like to enhance the clarity and usefulness of your postings and make your voice heard.

Line and Paragraph Breaks

The most important thing to notice about HTML is that a line doesn’t break unless you tell it to do so. You break a line by typing <br>, and you separate paragraphs with <p>.

You type in this:

    This might be a poem <br>
    And the next stanza is here.

And it’ll turn out like this:

    This might be a poem
    And the next stanza is here.

You type in this:

    This is one paragraph.
    <p>
    This is the next.

And it’ll turn out like this:

    This is one paragraph.

    This is the next.

You type in this:

    This is one paragraph.

    This is the next.

And it’ll turn out like this:

    This is one paragraph. This is the next.

This one didn’t have a line break, because you didn’t tell it to.

Emphasize your text

You might want to emphasize portions of your text to get your point across, either by putting them in italics or by making them bold.

Basically, you just tell the browser when to start to emphasize and when to stop.

You type in this:

    Emphasizing is <em>really</em> useful.

And it’ll turn out like this (with “really” in italics):

    Emphasizing is really useful.

You type in this:

    Emphasizing is <strong>really</strong> useful.

And it’ll turn out like this (with “really” in bold):

    Emphasizing is really useful.

Lists

Lists can help you organize a bunch of information in a clear and readable way. Most widely used are the ordered and unordered lists.

Just as with emphasizing, lists require you to tell when to start and when to stop a list. On top of that you have to mark up each individual part of the list.

You type in this:

    <ul>

    <li> It's easy
    <li> It enhances readability
    <li> It looks good

    </ul>

And it’ll turn out like this:

      • It’s easy
      • It enhances readability
      • It looks good

You type in this:

    <ol>

    <li> It's easy
    <li> It enhances readability
    <li> It looks good

    </ol>

And it’ll turn out like this:

      1. It’s easy
      2. It enhances readability
      3. It looks good

Note that the only difference between the two lists is how you start and stop the list: <ul></ul> makes an unordered list, and <ol></ol> gives you a list with numbered items.

Links

Inserting a link to another web page is often very useful. The tag looks like this:

You type in this:

    <a href="http://www.pinds.com">My web site</a>

And it’ll turn out like this (as a clickable link):

    My web site

The page you want to link to is in quotes ("http://www.pinds.com"). The text you want the user to be able to click on is between the <a ….. > and </a> tags.

Summary

You’ve learned the four most important types of tags:

Line breaks:
<p> for paragraph breaks.
<br> for line breaks.

Emphasis:
<em>Italic text</em>
<strong>Bold text</strong>

Lists:
<ul>
<li>item one
<li>item two
</ul>

Links:
<a href="http://www.pinds.com">the text to display</a>

This is all you need to make your postings turn out a lot nicer, but there’s a lot more you can do. You’ll find plenty of resources on the web for learning HTML. Some of the more important are:

  • The HTML chapter of Philip and Alex’ Guide to Web Publishing (http://photo.net/wtr/thebook/html.html)
  • The Webmonkey guide (http://hotwired.lycos.com/webmonkey/96/53/index0a.html?tw=authoring)
  • Claus’ longer tutorial

Good luck!


Task Board

May 20, 2000 · 0 comments

Humans work best when they can concentrate on one task at a time. But very often, in order to accomplish your task, there are a number of subtasks that need to be done first. These necessary subtasks are often seen as annoying and distracting, because they’re not your main focus of attention. Consequently, they often also end up being badly performed, because you just want to get them over with, so you can get back to your real goal.

When you’re hungry, for example, you want to eat. But in order to eat, you have to cook. And before you can cook, you’ll have to go shopping and do the dishes from yesterday. But you just wanted to eat, not to shop or do dishes.

What do we do? We outsource all of the annoying tasks to McDonald’s. The trick is that people have different interests. The thing that you find boring and tedious will be seen as interesting and challenging by someone else.

Enter The Task Board

The same thing happens all the time in companies. I’ll be happily hacking on my program, when I realize that I could really use this utility procedure that just isn’t there yet. I can decide to spend a few days to write it myself, the right way. But most often I’ll just end up hacking something up quickly that works for me, but isn’t pretty or reusable, and get on with what I’m doing.

What if I could instead post a request for the utility proc on a task board? All the other programmers in my company would get instant email notification, and perhaps there’d be one of them out there who would see that utility proc as an interesting task, and devote a few days to doing it right. Or someone might tell me that he’s already made something almost like what I’m looking for. If nobody responds, I’d end up in the original situation, having lost nothing for trying.

This need not be used only for stuff that you need badly. The idea is also to have a place to put all the tasks that would be nice to get done but aren’t critical. This way, you can put them in, thus sharing them with everybody. The software will remember them for you, and probably one day, someone will find one of them an interesting thing to spend a day taking care of.

A nice side effect of this is the transparency in being able to see what needs people have. Maybe you’ll take the consequence and hire a few people dedicated to filling some gaps.

A Little Software Design

Users can post a proposal and categorize it. By using categories, users can sign up for alerts only in categories that they’re interested in. Users can say they’re interested in doing the task. The system should have a mechanism of coordinating who actually ends up doing it, so we don’t duplicate efforts. Users should also be able to register interest in the outcome, in case there are others that could use the work. Every posting should of course be commentable, so the interested users can discuss the details of how the task is accomplished.
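The design above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names Task, TaskBoard, subscribe, post, volunteer and assign are hypothetical, nothing is stored persistently, and “alerting” is reduced to returning the list of users who should get an email.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    title: str
    category: str
    poster: str
    volunteers: list = field(default_factory=list)  # users willing to do the task
    interested: set = field(default_factory=set)    # users who want the outcome
    assignee: Optional[str] = None                  # the one who actually does it
    comments: list = field(default_factory=list)    # (user, text) pairs


class TaskBoard:
    def __init__(self):
        self.tasks = []
        self.subscriptions = {}  # category -> set of subscribed users

    def subscribe(self, user, category):
        self.subscriptions.setdefault(category, set()).add(user)

    def post(self, title, category, poster):
        """Add a task and return it together with the users to alert."""
        task = Task(title, category, poster)
        self.tasks.append(task)
        to_alert = self.subscriptions.get(category, set()) - {poster}
        return task, sorted(to_alert)

    def volunteer(self, task, user):
        if user not in task.volunteers:
            task.volunteers.append(user)

    def assign(self, task, user):
        """Pick exactly one volunteer so that effort isn't duplicated."""
        if user not in task.volunteers:
            raise ValueError(f"{user} has not volunteered")
        if task.assignee is not None:
            raise ValueError(f"already assigned to {task.assignee}")
        task.assignee = user
```

Registering interest in the outcome and commenting are just `task.interested.add(user)` and `task.comments.append((user, text))` in this sketch; a real system would want proper methods, permissions and a database behind them.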

The Wider Perspective

Although I’m from a programming background myself, this need by no means be limited to programming tasks. All kinds of tasks are candidates for outsourcing.

In fact, a Danish company, Oticon, introduced a so-called “spaghetti” company structure, where people didn’t have fixed job descriptions. Rather, all sorts of tasks were performed by putting them up on a board and letting people bid on what tasks they liked. So a guy who used to do book-keeping but wanted a feel for graphic design could apply for a task in graphic design, and try it out. That way, you encourage employees to grow by enabling them to learn and experiment with different tasks. (Note: The above description may not be accurate. I read about it some years ago. It doesn’t matter what exactly Oticon did or did not do. It’s the ideas that are important.)

By the way: any open source community should have one of these.


UI For Diagramming Software

May 17, 2000 · 0 comments

When I want to write something, writing on a computer is much more efficient than writing it with pen and paper. But then I want to throw in a quick diagram or a drawing to illustrate my point, and the situation is reversed. There’s nothing inherent in the technology that says it has to be that way. It’s simply because our user interfaces aren’t sophisticated enough … yet. These are some quickly fetched ideas for how this situation could be remedied.

Drawing Diagrams

All the diagramming tools I’ve seen still live by the old MacPaint paradigm. They’ll have me choose the tool first (“I want to draw a box”), then let me use it. There’s no reason that the software couldn’t try to guess from my drawing what I want to do, then adapt to that.

Here’s how I envision it: I grab my pen and start drawing: boxes, lines, point to where I want to write annotation and start typing. The software should be smart enough to figure out that I’m trying to draw a box and make it nice and rectangular. It should figure out when I’m trying to connect boxes with a line and make that line look nice and smooth. It should also let me quickly and easily move around my box and keep the lines connected right.
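To make the box-guessing idea a bit more concrete, here is a rough Python sketch of one possible heuristic. It is an assumption for illustration, not a tested recognizer: the function name snap_to_box and the tolerance value are made up, and the rule is simply “the stroke is closed and every sampled point lies near an edge of its bounding box”. It expects a densely sampled stroke, as a pen would deliver.

```python
def snap_to_box(points, tolerance=0.1):
    """Return (left, top, right, bottom) if the stroke looks like a box, else None.

    points: list of (x, y) tuples sampled along a single pen stroke.
    tolerance: allowed distance from the bounding box edges, as a fraction
               of the box's shorter side.
    """
    if len(points) < 4:
        return None
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    left, right, top, bottom = min(xs), max(xs), min(ys), max(ys)
    width, height = right - left, bottom - top
    if width == 0 or height == 0:
        return None  # a straight horizontal or vertical line, not a box
    slack = tolerance * min(width, height)

    # The stroke must end close to where it started (a closed shape).
    (x0, y0), (x1, y1) = points[0], points[-1]
    if abs(x0 - x1) > slack or abs(y0 - y1) > slack:
        return None

    # Every sampled point must lie near one of the four bounding box edges.
    for x, y in points:
        distance_to_nearest_edge = min(x - left, right - x, y - top, bottom - y)
        if distance_to_nearest_edge > slack:
            return None
    return (left, top, right, bottom)
```

A wobbly hand-drawn square passes and comes back as a clean rectangle, while a diamond or a circle is rejected because its points stray too far from the bounding box edges. Lines, arrows and text would each need a heuristic of their own, and the software would still have to make it easy to override a wrong guess.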

Of course the software’s guess will be wrong at times, so it should be transparent what it’s doing, and easy for me to correct it without breaking my rhythm too much. I don’t have any good suggestions for this yet.

The Mouse Sucks for Drawing

The mouse is a really lame input device, especially for drawing. Humans have used pens for centuries, and we’re really good at controlling them with our fingertips, as opposed to mice, where we have to use the whole arm. Pens will hopefully become more ubiquitous soon.


Knowledge Management

May 16, 2000 · 0 comments

Knowledge management is one of those very ill-defined buzzwords that everybody claims to be doing. Also, it’s an area that exposes some quite fundamental ideological differences. Which is why it’s an interesting area. Here’s my take on it.

The Problem We’re Trying To Solve

Knowledge Management is fundamentally about creating a forum where people can teach each other. The same person will contribute with his knowledge in some areas and learn from other people in other areas. People want to learn and they want to teach. I like to think of it as cooperatively maintaining a knowledge base.

It’s vital that managers don’t take the “management” part of “knowledge management” too seriously. It’s not really about managing knowledge at all, at least not in any top-down sort of way. It’s all about providing the best tools possible to help users share what’s on their mind, not yours. That is usually what produces the most interesting, useful and worthwhile information: the internet and the web are the world’s largest knowledge base, and they’ve worked pretty well so far.

There are, obviously, two uses of a knowledge base: You can either put stuff into it, or you can get stuff out of it. It’s the same group of people, but as a person, you’ll usually be doing one at a time.

Putting Stuff Into It

Putting stuff into the knowledge base can generally take two forms: Say you have just finished some project and you’ve learned from it. You should be able to just put in whatever you feel that you’ve learned and have it be available to other users. It’s important that this be completely free-style, so users can post anything from longish articles to a short book review or simply a link to some interesting web page.

The other form of putting stuff in there is when someone (your employer?) decides that they want to collect information on certain types of objects of interest to the community. Someone (a moderator or your boss?) would define a set of object types and, for each object type, a set of questions. Your answers to those questions become part of the knowledge base. The good thing about this form is that it doesn’t require the same writing and teaching skills on the part of the author. The downside is that it might obscure the important lessons that the author has to share.

Getting Stuff Out Of It

When a user turns to the knowledge base, it will most often be because he has a problem at hand that he’s trying to solve, or a theme he’s interested in learning more about because he needs the knowledge. Users are directed, which is good, because only directed attention can generate valuable knowledge. (Remember that the knowledge base can only give information; it’s up to the individual to turn that into knowledge.)

As a starting point, we should do everything we can to help users find what they want (/software/scoring-content). But sometimes the user will not find what he wants, either because our search tools are not good enough or because the information is not in there. So he should be able to pose a question that will be read by hundreds or thousands of real people. This is a traditional Q&A forum (http://photo.net/bboard/q-and-a.tcl?topic_id=21&topic=web%2fdb), but we should make sure there’s a tie-in with the knowledge base.

We should of course record all the Q&A threads and make sure they show up in future searches. More than that, if a question is simply answered by pointing to an item already in the knowledge base, we should make sure that if a user in the future goes looking for an answer to this question, he’ll find that item directly. I don’t know how that could be done, other than by having moderators who go over the Q&As and re-categorize or add some keywords to the item. Also, if there isn’t an item in the knowledge base, a moderator might find the question worthy of one, and ask someone to write about it. So we should have a good moderator interface that facilitates this work.

Another way to get things out is to subscribe to alerts on the content. The thing about alerts is that they don’t make sense if they never fire, but they also don’t make sense if you get hundreds a day. That’s an artifact of the human mind. Users should be able to register alerts on categories or keywords or anything, either instantly or as a daily/weekly summary.
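The instant-versus-summary choice could be sketched like this in Python. The class and method names (AlertService, register, publish, send_digest) are hypothetical, matching is plain keyword equality, and the actual sending of email is left out; it only shows who would hear about what, and when.

```python
from collections import defaultdict


class AlertService:
    def __init__(self):
        self.alerts = []                  # (user, keyword, mode) triples
        self.digests = defaultdict(list)  # user -> titles waiting for the summary

    def register(self, user, keyword, mode="instant"):
        if mode not in ("instant", "digest"):
            raise ValueError("mode must be 'instant' or 'digest'")
        self.alerts.append((user, keyword.lower(), mode))

    def publish(self, title, keywords):
        """Return users to notify right away; queue the rest for their digest."""
        tags = {k.lower() for k in keywords}
        notify_now = set()
        for user, keyword, mode in self.alerts:
            if keyword not in tags:
                continue
            if mode == "instant":
                notify_now.add(user)
            elif title not in self.digests[user]:
                self.digests[user].append(title)
        return sorted(notify_now)

    def send_digest(self, user):
        """Return and clear everything queued for this user's summary."""
        return self.digests.pop(user, [])
```

A daily or weekly job would call `send_digest` for each user. Guarding against alerts that fire hundreds of times a day would take something more, for instance switching a noisy alert to digest mode automatically, which this sketch does not attempt.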


Sharing Ideas

May 16, 2000 · 0 comments

In order to share your thoughts, ideas and knowledge, you often have to write them down. This will often take the form of a short document, a memo if you will. But a memo is really static and boring. What’s interesting is the evolution of thoughts and ideas through collaboration. This paper outlines a software system to support this (on the web, of course).

The Scenario

Say Wendy Wise has a great idea about how to make politicians tell the truth. She writes up some initial thoughts about it into our software system, then publishes it. A bunch of people will get alerted about it and read it, and she also sends an email with a link to some of her friends. People read it. If they have something to add, they can post a comment.

Wendy can tell from Sid Smart’s comments that he has already given this subject a lot of thought, so Wendy and Sid talk about it and agree that Wendy will make Sid a co-author. They keep thinking about the subject (how to make politicians tell the truth). The many enlightened comments make them see new perspectives on it, so they revise the document several times as they grow smarter.

Eventually, they might realize that their fundamental assumption (politicians are not stupid) was wrong. So they decide to rewrite the whole argument, which makes all the existing comments irrelevant. So they mark this “version 2.0” and comments start afresh. Version 1.0 and all intermediary versions are still available as a link, with all the respective comments, but their thoughts have evolved to a new level, which is reflected in version 2.0.

The Software Details

The software is fundamentally a tool for groups of authors to share ideas. The authors are in the driver’s seat.

The authors can edit the document through a web form, or use a desktop editor and upload versions. The documents will be in HTML, because that’s what’s used on the web. The software will keep track of all versions so users can go back to see how it evolved. The URL of the document will never change. There’ll be one URL that will always point to the latest version and another that will always point to a specific version, and thus is guaranteed not to change. This is important because we want to encourage links to documents.
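The two kinds of URL can be sketched as follows. This is a minimal Python illustration under assumptions: the /docs/<slug>/ and /docs/<slug>/v<number>/ layout, the Document class and the resolve function are all made up for the example, and every saved state is simply called a version here.

```python
class Document:
    def __init__(self, slug, html):
        self.slug = slug
        self.versions = [html]  # index 0 holds version 1

    def save(self, html):
        """Store a new version and return its number; old versions are kept."""
        self.versions.append(html)
        return len(self.versions)

    def latest_url(self):
        # Always serves whatever the newest version is.
        return f"/docs/{self.slug}/"

    def version_url(self, number):
        # Serves one specific version, so its content never changes.
        return f"/docs/{self.slug}/v{number}/"


def resolve(doc, path):
    """Map a URL path to the HTML it should serve, or None if unknown."""
    if path == doc.latest_url():
        return doc.versions[-1]
    for number, html in enumerate(doc.versions, start=1):
        if path == doc.version_url(number):
            return html
    return None
```

Because versions are only ever appended, a link to a specific version keeps showing the same text however many times the authors revise, while the latest URL follows the document as it evolves.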

Everybody, readers and authors alike, can post their comments right there on the page, just like on this page. The comment-on-the-page facility doesn’t accommodate discussions very well, so if comments turn into a discussion, it can be redirected to a discussion forum attached to the document.

The author can decide who should have access to read or write the document. Often, you want to work a little on a document before you publish it. Thus, the author decides when to publish. Also, the author can name something an entirely new version of the document. Unlike revisions, a new version will not inherit the comments of old versions. The author can work on the new version without publishing it, just as with the first version. (The same should probably be the case with revisions.)

We want to encourage and support multiple authors. The obvious problem is with concurrent editing. CVS (http://www.loria.fr/~molli/cvs-index.html) has solved these problems pretty well. It involves being able to show your intent to edit, perhaps even obtain a lock, but more importantly, the ability to automatically merge changes as long as they’re not overlapping, and alerting the user to conflicts that must be manually resolved. Our software should include these features.
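The merge-unless-overlapping behaviour can be illustrated with a small line-based three-way merge. This is a sketch built on Python’s standard difflib, not a description of how CVS does it internally; the names changes, merge and MergeConflict are hypothetical, and it is deliberately conservative: edits that overlap or even touch the same spot in the common base are reported as a conflict.

```python
from difflib import SequenceMatcher


class MergeConflict(Exception):
    pass


def changes(base, edited):
    """List the edits that turn base into edited as (start, end, new_lines)."""
    matcher = SequenceMatcher(None, base, edited, autojunk=False)
    return [(i1, i2, edited[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


def merge(base, mine, theirs):
    """Three-way merge of lists of lines; raise MergeConflict on overlap."""
    my_changes = changes(base, mine)
    # An identical edit made by both authors is not a conflict; keep it once.
    their_changes = [c for c in changes(base, theirs) if c not in my_changes]
    for a1, a2, _ in my_changes:
        for b1, b2, _ in their_changes:
            if a1 <= b2 and b1 <= a2:  # ranges in base overlap or touch
                raise MergeConflict(
                    f"edits collide in base line range {min(a1, b1)}-{max(a2, b2)}")
    merged = list(base)
    # Apply from the bottom up so that earlier line positions stay valid.
    for start, end, new_lines in sorted(my_changes + their_changes, reverse=True):
        merged[start:end] = new_lines
    return merged
```

If Wendy rewrites the second paragraph while Sid fixes a typo in the last one, both edits land in the merged document. If they both rewrite the same paragraph differently, `MergeConflict` is raised and a human has to sort it out.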


WAP Will Fail

May 10, 2000 · 0 comments

WAP will fail, because every single assumption it is built on is wrong. The instant someone delivers true, always-on internet to a handheld device, nobody will ever care about WAP again. And it will inevitably happen, because that’s what people want.

Freedom

WAP is built on the old-fashioned broadcast model as we know it from TV, a model that could be described as “TV with a buy button”. That model has nothing to do with the internet.

Telcos want tight control over the distribution pipeline, so they’ve made sure the individual user is tied in with his telco. You can only access the internet through their gateway, and you only get access to what they’ll allow you access to. And of course, that’s all about revenue. And sure, revenue is fine. But the telcos could get revenue simply by creating a portal, like Netscape and Microsoft have done, that will be the start page of the cell phone browser. Look at an announcement like the one between E-compare and Sprint PCS (http://www.ecompare.com/press.html). That shows so clearly that this is about telcos getting money for delivering customers (consumers) to companies, not the other way around.

What the telcos fail to see is that what generates value on the internet is exactly its openness. They’re trying once again to believe in the ideas that made CompuServe and the old AOL fail in favor of the internet. Companies and ecommerce sites didn’t create the internet itself, nor the internet takeup that we’ve been experiencing. When the network is open, as with the internet, users can get what they really want, because if nobody else delivers, they can do it themselves. But telcos want to control. After all, that’s what they’ve been used to doing for so long; it’s all they know.

People don’t primarily want to buy stuff. They want to connect to each other. They want to talk to each other, meet each other as real persons. That’s where the real explosion sets in, as it did for the internet. Think about it: The major attraction of Amazon (http://www.noamazon.com) is not that you can buy books without leaving home, it is that you can learn what other normal people think about the book in question. The internet explosion is due to letting people meet around common interests and learn. There’s no reason that shouldn’t go wireless, and it will. But WAP is not the answer. Or rather, it’s the answer to the wrong question: “How do we get money out of this?”

Here are some little facts about WAP to get you worried:

  • You have to agree to a licence agreement before being allowed to download the specs from The WAP Forum (http://www.wapforum.org/what/technical.htm). The internet never worked like that. Standards are free and want to be shared.

  • Those specifications are in PDF and not HTML. PDF is only good for printing, and you completely lose all the benefits of the internet, like being able to link to or from them, easy searching, etc.

  • Four of the 29 protocol docs are about push, an idea that failed miserably on the internet as a whole, because it’s completely contrary to everything the internet is all about. Push is marketing’s dream, not users’.

Constraints: Bandwidth, Screen Size, Memory, etc.

WAP is also built on assumptions about technical constraints. To quote from the spec (gosh, I hope they won’t sue me for doing that (I’d link if I could)):

  • Less powerful CPUs,
  • Less memory (ROM and RAM),
  • Restricted power consumption,
  • Smaller displays, and
  • Different input devices (eg, a phone keypad).

and then about the network itself:

  • Less bandwidth,
  • More latency,
  • Less connection stability, and
  • Less predictable availability.

These are all constraints that will inevitably go away! CPUs are getting more and more powerful, smaller and cheaper; memory is getting cheaper and smaller; batteries are getting better and cheaper and lasting longer; displays are getting better and cheaper, and so will wireless network connections. Do you really want to base a whole new suite of protocols entirely on factors that will go away in a few years?

The network problems will also go away. First of all,

The only constraints that will not change are the sizes of our hands, heads and pockets (and I don’t mean metaphorically). Of course there are limitations to the size of the widget that we’re willing to carry around. And there’s a limit to how small the keyboard can be before we start hitting more than one key at a time. I don’t see what the input device has to do with the design of the protocol. I do see that screen size matters, but surprisingly, HTML was designed with exactly such different display devices in mind, so that shouldn’t really be a problem.

The fact remains that existing protocols can do everything you need. There’s nothing in HTTP or HTML that prevents you from designing web sites that consume little bandwidth and will work with a small (and even text-only) screen. If you need to conserve even more bandwidth, you could probably tunnel them over some compression protocol without changing HTTP and HTML themselves. And if the cell phone (or whatever we choose to call the device) would simply stay connected all the time, instead of having to establish a new connection each time the user wants to connect, that would help a lot on the initial latency. The cause of the problem is that the telcos still use a circuit-switched network technology, in contrast to the packet-switched nature of the internet.
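As a rough illustration of the compression point, here is what gzip does to a small, repetitive HTML page, using Python’s standard gzip module. The page content is invented for the example, and the savings on real pages will vary. HTTP/1.1 already has a Content-Encoding header that can carry gzip-compressed bodies, so no new protocol is needed for this.

```python
import gzip

# A made-up page: plain HTML, no images, lots of repeated markup.
page = (
    "<html><head><title>Train times</title></head><body>\n"
    + "".join(f"<p>Departure {hour:02d}:15 from platform 2</p>\n"
              for hour in range(6, 24))
    + "</body></html>\n"
).encode("utf-8")

compressed = gzip.compress(page)

# The receiving end gets the original bytes back unchanged.
assert gzip.decompress(compressed) == page
print(len(page), "bytes of HTML,", len(compressed), "bytes on the wire")
```

The HTML itself is untouched; only its representation on the wire changes, which is the whole point.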

The browser is free to not support Java and Javascript and plug-ins and all that crud. HTTP already contains a user-agent header field whereby you let the server know that you’re on a limited, small-screen browser. If the website wants your business, they’d have to make sure their page will render properly on a small screen and without the flashy stuff. Web publishers should already be doing that, so what’s the fuss?
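On the server side, using the User-Agent header could look something like the Python sketch below. The header itself is standard HTTP, but everything else here is an assumption for illustration: the substrings in SMALL_SCREEN_HINTS, the function names and the banner image are made up, and real devices would have to be recognized by whatever strings they actually send.

```python
from html import escape

SMALL_SCREEN_HINTS = ("handheld", "phone", "palm")  # hypothetical substrings


def is_small_screen(headers):
    """Guess from the User-Agent header whether the client has a small screen."""
    agent = headers.get("User-Agent", "").lower()
    return any(hint in agent for hint in SMALL_SCREEN_HINTS)


def render(headers, title, paragraphs):
    """Return bare HTML for small screens, and add a banner image otherwise."""
    body = "".join(f"<p>{escape(text)}</p>" for text in paragraphs)
    heading = f"<h1>{escape(title)}</h1>"
    if is_small_screen(headers):
        return f"<html><body>{heading}{body}</body></html>"
    banner = '<img src="/banner.gif" alt="">'  # hypothetical decoration
    return f"<html><body>{banner}{heading}{body}</body></html>"
```

Better still is a page that needs no such switch, because plain, well-structured HTML already renders acceptably on both kinds of device.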

The Japanese Way

Nobody doubts that having the internet available on your cell phone (as well as on your palm pilot) is an excellent idea; in fact, it’s inevitable. But WAP is, as I’ve made clear above, not the internet. It’s a proprietary set of protocols designed explicitly to take all the characteristics of the internet out before delivering it to the cell phone. The Japanese know what they’re doing.

NTT DoCoMo has launched the i-mode phone (http://www.businessweek.com/2000/00_03/b3664013.htm), which

  • Is open by nature; DoCoMo does not play the gatekeeper role.
  • Builds on a packet-switched network like the internet.
  • Has considerably larger screens than most cell phones; the biggest one opens like a clamshell, thus allowing the screen to be twice the size of the gadget itself. It also renders graphics and colors.

The speed is still slow, but it’s quickly getting faster and faster.

Input devices will always be a problem, but this is not unique to cell phones. There are several attempts at solving this, e.g. overloading keys like in cell phones; writing with a pen like on the palm pilot; voice commands. And there are several others. Keep in mind also that soft, foldable keyboards and displays (http://panopticon.ices.cmu.edu/design/FoldableDisplay.html) are starting to appear. The internet device doesn’t have to look like a cell phone as we know it. In fact, why would it? After all, attaching a microphone and speaker with a distance of a few inches is all it takes to turn a hand-held internet device into a phone (oh yes, and of course some stuff inside as well, but that shouldn’t be a problem).

So let WAP die in peace. Nobody really wants it anyway. What we want is the internet delivered wireless to a handheld device. And we will get that, sooner or later. When that happens, nobody will care about WAP.


Don't Let The World Suffer From Your Website Too

May 09, 2000 · 0 comments

Here are a few things that you want to avoid when you’re building your website. Don’t worry, you’ll save time by avoiding them. And the world will benefit.

Don’t Make Links go Bold on Mouseover

Some sites think it’s cool to make links turn bold when the user moves the mouse over them. Please don’t do this. Your users won’t be able to click on the link.

Chances are that your link will fit neatly on one line under normal circumstances, but when the user moves his mouse over the link and it goes bold, it takes up more horizontal space, and the browser has to wrap the link onto the next line. Of course, this means the link moves out of the spot where the mouse pointer is, so now you can’t click it. The browser hasn’t really noticed yet that the mouse isn’t over the link anymore; it doesn’t see that until you move your mouse ever so slightly, in which case the link goes back to normal and snaps back into place.

Here’s what such a link looks like: <a href="#" class="bold">This is a link, try moving your mouse over it.</a>

Don’t Use Frames—Just Don’t

When you publish something on the web, you want other people to link to your pages. Links, whether on a web page or in an email, are word-of-mouth, the only form of advertising that’s ever worked (and everybody knows that, although your average marketing guy might not admit to it). You actively want people to link to your site.

When using frames, people can’t link to the pages that interest them—the interesting part is almost never the opening page. So people won’t link to your site. Too bad, because that means lost visitors.

Besides, frames are simply bad user interfaces. When you’re using frames, they’ll usually not have the right size for my font size, screen size or whatever, and I’ll have to scroll around them. But before I can scroll, I’ll usually have to reach for the mouse and click in the frame I want to scroll. Ugh! Frames stink. Such a disgrace to see that one of the Gods of interaction design is using frames on his website. What does that tell you? Be sceptical when he talks about designing for the web.

Leave The Link Colors Alone

Links are the key user interface on the web. The user’s manual for a web browser reads: “If it’s blue and underlined, click on it!”. Don’t mess with link colors. They should be blue and underlined. A visited link should show in a slightly darker color. The visited link color is actually very important when navigating the web. It’s a subtle, mostly subconscious reminder that you’ve been there; it sort of marks off your territory as you go along, like a dog spraying telephone poles. Some webmasters think it’s better to force the visited link color to be the same as the fresh link color. They’re dead wrong. It’s not. (Note! I used to have a link to AOLserver here, but they’ve changed it.)

And don’t fall for the temptations of CSS to make links only underlined when the mouse is over them (http://synkron.com/default.asp?id=33), or other stupid tricks like that. The user interface of that sucks. If you do that, your readers will have to let their mouse wander over all of the page like a combine harvester before they realize what’s clickable and what’s not. It might look good, but it doesn’t work well in practice.

Why Links Are So Important

Why are the links so important? Because the links are what brings value to a web page. Who wrote it or what web page it’s on is not as important as helping your reader find the information she needs. Think about it! The number one most visited page on the web is Yahoo! (http://yahoo.com), and that page almost exclusively contains links. This even goes one step further: Today, it’s not as important to know the answer to a lot of things as it is to know where to find answers, and to be able to connect the people in need of answers with the people who have the answers.

Don’t Set Explicit Font Sizes

Most browsers are equipped with the ability to scale fonts on-screen at my discretion. If you’re running MSIE on Windows and you’re equipped with a wheel mouse, you can even hold down the Ctrl key and roll your wheel to scale fonts up and down. But it doesn’t work if the publisher has decided to put explicit font sizes in his HTML or CSS code. I can tell my browser to adjust the font size as much as I like, but nothing changes, except perhaps the size of the bullets in a bulleted list! I know that I can go to my browser config and tell it to use my settings over the publisher’s, but not everybody will figure that out. And why should I have to do it? Most of the time I’m quite happy with what the publisher has decided, but sometimes it’s simply too small or too big, depending on what screen resolution the designer happened to be using. And that’s when I hit Ctrl and the wheel and ... nothing happens.

Here’s first an example of what you shouldn’t do (absolute size specified in points), and an example of what you should do. Try for yourself to scale the font size. Note in particular how the bullet in the first line is resized, but the text’s the same.

  • This line has a fixed size of 12 pt

    <p style="font-size: 12pt;"> ... </p>
  • This line has a size setting saying large

    <p style="font-size: large;"> ... </p>

Be Careful When Using Flash

Flash is sort-of the right idea: Using vector graphics for the web. It’s right, because vector graphics will display well at all screen sizes and resolutions, plus it’s relatively small and therefore fast to download. It’s even more right, because we do need ways to enrich the user interfaces that a browser is able to present.

It’s just that Flash is so poorly engineered. Flash doesn’t support dynamic websites very well. True, the file format used by Flash is open (http://openswf.org). But it’s an annoying binary format which is hard to debug. And Macromedia has never been very good at software engineering. In other words, Flash lacks a plain-text representation reflecting the fact that Flash is simply yet another programming language.

Moreover, Flash encourages the author to completely circumvent the existing UI of the browser, including scroll lists, links, back buttons and much more. Instead, authors will typically completely reinvent the user interface from scratch each time, and while the author thinks it looks very k00l, the fact is that users don’t have any interest whatsoever in your user interface; they want things done. So having non-standard interface idioms is simply bad business.

There’s also the basic problem of forcing your users to download a plug-in; a plug-in that doesn’t even exist for several flavors of unixes.

For an example of a truly horrible Flash site, go to the Siemens homepage (http://www.siemens.com/en2/flash/). Several things suck about this page. First and foremost, links aren’t blue and underlined anymore, as they should be. Second, their reinvention of the scrolling list (“by category”) happens to not work the way the user is used to (clicking in the darker area between the arrow and the slider only scrolls one line and not a screenful as usual; clicking and holding the mouse button down on the arrows only scrolls one line, not continuously as it would normally). You also have to wait while all the things scroll into place, which is a very good way of making sure nobody will ever want to use this site as part of their daily work. The mouseover effect on the top navbar might look k00l to some, but it’s so slow it doesn’t work as a UI feedback effect.

I’ll be fair and give Flash a chance to defend itself.

References

Don’t forget to visit the How to Create Really Cool Web Sites page (http://stgtech.com/staff/gcallah/BadHTML/bad1.html).

LDAP Introduction

May 08, 2000 · 5 comments

The Big Picture

LDAP is basically a specialized database. Some of the characteristics are:

  • It consists of entries organized in a hierarchy.
  • It favors reading over writing.
  • Every entry has a primary key called the Distinguished Name (DN).
  • Its notion of schema is much more flexible than that of an RDBMS.

You typically use an LDAP directory to store information about entities like people, offices, machines and the like. But you could equally well store most other relatively static information in there, e.g. information about books or movies or cars. Anything you can describe by a set of attributes.

For a more thorough and hand-holding explanation, read Michael Donnelly’s article (http://ldapman.org/articles/intro_to_ldap.html). It’s pretty good. The guy knows what he’s talking about.

What Is LDAP

LDAP is a protocol. Actually, it’s the Lightweight Directory Access Protocol. Lightweight, as opposed to its ancestor, the X.500 DAP protocol.

Since LDAP is a protocol, you can implement it any which way you like, as long as you adhere to the protocol. There are a number of implementations to choose from, including the free OpenLDAP and Oracle’s LDAP server (http://oradoc.photo.net/ora816/network.816/a77230/toc.htm), which runs on top of an Oracle database.

As for the client implementation, the OpenLDAP distribution includes client libraries that you can link into your C programs to LDAP-enable your application. Java 1.2 comes with JNDI (http://java.sun.com/products/jndi/docs.html), which includes client support for LDAP.

A word of warning: there are two versions currently out there, LDAPv2 and LDAPv3. LDAPv3 solves some important problems, including more secure authentication and transport security like SSL. The problem is that we’re still waiting for the open-source implementation of this from OpenLDAP. You can’t just assume that you can use a v3 client against a v2 server. They’re quite different.

Don’t forget where to find the standards. Michael Donnelly (http://ldapman.org) has collected links to all the relevant RFCs (http://ldapman.org/ldap_rfcs.html).

What’s In A Directory

A directory contains objects, or entries. Each entry has a Distinguished Name (DN), that acts as the entry’s primary key. The entry could be me:

uid=lars, ou=persons, dc=arsdigita, dc=com

Or it could be a machine:

cn=ls.arsdigita.com, ou=machines, dc=arsdigita, dc=com

Or just about anything else. The entry itself will have a bunch of attributes holding the information. They could look like this:

dn: uid=lars, ou=persons, dc=arsdigita, dc=com
objectclass: person
objectclass: organizationalPerson
objectclass: inetOrgPerson
objectclass: arsdigitaPerson
uid: lars
cn: Lars Pind
cn: Lars Holger Pind
sn: Pind
givenName: Lars
title: Developer
telephoneNumber: 718-387-0115
jpegPhoto:  (my portrait)
mail: lars@pinds.com
loginshell: /bin/bash
userPassword: {crypt}X5/DBrWPOQQaI
homeDirectory: /home/lars
uidNumber: 10
gidNumber: 10

The objectclass stuff is used to say something about what other attributes MUST or MAY be present. The other attributes contain various information we want to store about the user. The same attribute can have multiple values, as with the cn attribute above.
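To make the multi-valued nature of attributes concrete, here is a minimal Python sketch that reads an entry written in the name: value style shown above into a dictionary of lists. The function name parse_entry is made up for illustration, and this is not a full LDIF parser: it assumes one attribute per line, with no continuation lines and no base64-encoded values.

```python
from collections import defaultdict


def parse_entry(text):
    """Parse 'name: value' lines into {attribute name: [values]}.

    Hypothetical helper for illustration only. Attribute names are
    lowercased, since they are not case sensitive.
    """
    entry = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the first colon only, so values may contain colons.
        name, _, value = line.partition(":")
        entry[name.strip().lower()].append(value.strip())
    return dict(entry)


SAMPLE = """dn: uid=lars, ou=persons, dc=arsdigita, dc=com
objectclass: person
objectclass: organizationalPerson
cn: Lars Pind
cn: Lars Holger Pind
sn: Pind
"""

entry = parse_entry(SAMPLE)
print(entry["cn"])  # both values of the cn attribute, in order
```

Storing every attribute as a list, even the single-valued ones, keeps the code that reads entries uniform.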

The Structure of the Directory

The directory structure is hierarchical. The tree structure will be familiar to every programmer (take DNS or the Unix file systems as other examples). Look at the DN again:

uid=lars, ou=persons, dc=arsdigita, dc=com

It shows the hierarchical structure: The top-level node here is dc=com. Then comes dc=arsdigita, dc=com hanging off of that. Then, under that is ou=persons, dc=arsdigita, dc=com, and under that is where you’ll find me.

As is customary with trees, you can also use the abbreviated form of a DN, called a Relative Distinguished Name (RDN). This is the leftmost part of the DN, and is relative to its position within the hierarchy.

Every LDAP server also has a base. For ArsDigita, the base of our directory would probably be dc=arsdigita, dc=com. Everything under this node is managed by our directory server; everything else in the world is (potentially) managed by some other directory server. This divides the LDAP name space between LDAP servers, as is done with DNS.

So in other words, the DN is composed of:

  • The base DN: dc=arsdigita, dc=com
  • The position relative to the base: ou=persons
  • The RDN of the entry: uid=lars
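The decomposition above can be sketched in a few lines of Python. The function names (split_dn, decompose, ancestors) are hypothetical, and the sketch assumes DNs without escaped commas inside values; lowercasing whole components for the comparison is also a simplification, since whether a value is case sensitive really depends on the attribute’s syntax.

```python
def split_dn(dn):
    """Split a DN into its components, leftmost (most specific) first.

    Simplification: assumes no escaped commas inside attribute values.
    """
    return [part.strip() for part in dn.split(",")]


def decompose(dn, base_dn):
    """Return (rdn, position relative to base, base) for a DN under base_dn."""
    parts = split_dn(dn)
    base = split_dn(base_dn)
    # Simplified comparison: ignore case for the whole component.
    if [p.lower() for p in parts[-len(base):]] != [b.lower() for b in base]:
        raise ValueError("DN is not under the given base")
    rest = parts[:-len(base)]
    if not rest:
        raise ValueError("DN is the base itself")
    return rest[0], rest[1:], base


def ancestors(dn):
    """List the DN of every node from the top of the tree down to dn."""
    parts = split_dn(dn)
    return [", ".join(parts[i:]) for i in range(len(parts) - 1, -1, -1)]


rdn, position, base = decompose(
    "uid=lars, ou=persons, dc=arsdigita, dc=com", "dc=arsdigita, dc=com"
)
print(rdn, position, base)
for node in ancestors("uid=lars, ou=persons, dc=arsdigita, dc=com"):
    print(node)
```

The ancestors function walks the same path as the prose: dc=com first, then dc=arsdigita, dc=com, and so on down to the entry itself.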

How you structure this, your Directory Information Tree (DIT), and what you choose for your base DN, is up to you. You can get a feel for the concerns by reading about what Netscape does, or thought they’d do as of October 1998 (http://developer.netscape.com/docs/manuals/isdirdeploy/chaptera.htm#1012406). As with all such cases, you should think about what needs you have and will have in the future.

One thing most people seem to agree on by now is that using your DNS domain as your base DN, as in dc=arsdigita, dc=com, is a very good idea.

What’s the Schema

The schema definition of LDAP is a little messy, albeit very flexible. The LDAP schema, as with all database schemas, is the definition of what can be stored in the directory. The basic thing in an entry is an attribute, like givenName. Each attribute is associated with a syntax that determines what can be stored in that attribute (plain text, binary data, encoded data of some sort), and how searches against it work (case sensitivity, for example). An objectclass is a three-tuple, consisting of (name, required attributes, allowed attributes), saying what other attributes must or may be present.

There is a standard core of schema definitions (object classes, attributes and syntaxes), and you can define your own to suit your particular needs. Most every organization will want to do that.

Object Classes

Let’s take an example object class definition:

objectclass person requires sn cn allows userPassword
telephoneNumber seeAlso Description

To create an object (entry) of a specific class, you add an attribute to your entry that says objectclass: person, or whatever object class you want to use. When you do that, you must also add the other required attributes. You can have an object be an instance of multiple classes, simply by adding more objectclass attributes.

The object classes are furthermore arranged into an inheritance hierarchy, such that, e.g., objectclass: organizationalPerson is a subclass of objectclass: person. The caveat here is that LDAP doesn’t really deal with the inheritance (this is one of the things pulled out from X.500 to make it lightweight). What it does instead is that the definition of organizationalPerson requires that objects of this class also include person. This is the reason my example entry given above contains four different values for the objectclass attribute.
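Here is a rough Python sketch of what checking an entry against its object classes amounts to. It is not how any particular server implements it; the OBJECT_CLASSES table and the check_entry function are made up for illustration, and only the person class from the example definition above is filled in.

```python
# Hypothetical in-memory schema, mirroring the person definition above.
# Attribute and class names are stored lowercased.
OBJECT_CLASSES = {
    "person": {
        "requires": {"sn", "cn"},
        "allows": {"userpassword", "telephonenumber", "seealso", "description"},
    },
}


def check_entry(entry, object_classes=OBJECT_CLASSES):
    """Return a list of schema problems for an entry.

    entry maps lowercased attribute names to lists of values.
    An empty list means the entry passes this simplified check.
    """
    problems = []
    required = set()
    allowed = {"objectclass"}
    # An entry may belong to several classes; their rules are combined.
    for oc in entry.get("objectclass", []):
        definition = object_classes.get(oc.lower())
        if definition is None:
            problems.append(f"unknown object class: {oc}")
            continue
        required |= definition["requires"]
        allowed |= definition["requires"] | definition["allows"]
    for attr in sorted(required):
        if not entry.get(attr):
            problems.append(f"missing required attribute: {attr}")
    for attr in sorted(entry):
        if attr not in allowed:
            problems.append(f"attribute not allowed: {attr}")
    return problems


print(check_entry({"objectclass": ["person"], "cn": ["Lars Pind"]}))
```

Because the rules of all listed classes are simply unioned, adding another objectclass value to an entry widens what it may contain and adds to what it must contain.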

Attributes

An attribute definition just contains a name, possibly an abbreviated name, and the syntax the attribute uses. It can also contain a description. It can say something like:

attribute commonName cn cis

This defines the attribute commonName, abbreviated as cn, syntax cis (case-insensitive string).

Syntaxes

Some common syntaxes include:

  • bin: binary
  • ces: case-sensitive (case exact) string
  • cis: case-insensitive (case ignore) string
  • tel: telephone number string
  • dn: distinguished name string
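The practical effect of a syntax is mostly seen when values are compared during a search. Below is a small Python sketch of that idea for three of the syntaxes. The function names are hypothetical, and the rule used for tel (ignore spaces and hyphens) is my simplified reading of how telephone numbers are matched, so treat it as an assumption. The bin and dn syntaxes are left out.

```python
def normalize(value, syntax):
    """Reduce a value to the form used for comparison under a syntax."""
    if syntax == "ces":
        return value  # case exact: compare as-is
    if syntax == "cis":
        return value.lower()  # case ignore
    if syntax == "tel":
        # Assumption: spaces and hyphens are insignificant in phone numbers.
        return value.replace(" ", "").replace("-", "").lower()
    raise ValueError(f"syntax not covered by this sketch: {syntax}")


def matches(a, b, syntax="cis"):
    """Compare two values; cis is the default, as in slapd.conf."""
    return normalize(a, syntax) == normalize(b, syntax)


print(matches("Lars Pind", "lars pind"))
print(matches("Lars Pind", "lars pind", "ces"))
print(matches("718-387-0115", "718 387 0115", "tel"))
```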

The Standard Schema

Exactly what object classes are defined and how they’re defined is a bit unclear to me. Apparently, the RFCs (http://ldapman.org/ldap_rfcs.html) collectively define a bunch of them, although some are outdated. The best resource for information about this is the LDAP schema repository (http://www.hklc.com/ldapschema/), where you can browse object classes, attributes, syntaxes and matching rules.

Defining Your Own

It’s pretty straightforward to define your own attributes and object classes with OpenLDAP. Here’s an excerpt from the man pages for slapd.conf (http://www.openldap.org/software/man.cgi?query=slapd.conf&sektion=5&apropos=0&manpath=OpenLDAP+1.2-Release):

       attribute <name> [<name2>] { bin | ces | cis | tel | dn }
              Associate a syntax with an attribute name. By
              default, an attribute is assumed to have syntax
              cis. An optional alternate name can be given for
              an attribute. The possible syntaxes and their
              meanings are:
                      bin    binary
                      ces    case exact string
                      cis    case ignore string
                      tel    telephone number string
                      dn     distinguished name

       objectclass <name> requires <attrs> allows <attrs>
              Define the schema rules for the object class named
              <name>. These are used in conjunction with the
              schemacheck option.

The schemacheck option? I hear you ask. With OpenLDAP you can turn off the checks for whether the attributes required by the object class are in fact present. Huh.

References

  • ldapman.org has some great introductory articles.
  • The LDAP Schema Repository is indispensable for figuring out what to stuff in there and how.
  • A System Administrator’s View of LDAP by Bruce Markey from Netscape is a very clear introduction to our use of it (note how his layout style resembles ours :-P).
  • Jeff Hodges’s LDAP roadmap and FAQ, which seems to be the authoritative guide to links. Unfortunately, it’s so badly organized that it’s almost not worth it. Beware that this guy is way confused about “versioning” his web site, so you may very well find yourself reading something out-of-date by more than a year! Check the “Last updated” on top of the page and try the other versions.
  • The Yahoo! category has fine links (http://dir.yahoo.com/Computers_and_Internet/Communications_and_Networking/Protocols/LDAP__Lightweight_Directory_Access_Protocol_/).
  • Here’s something about the Abstract Syntax Notation used in specifying the protocol.
  • Here’s something about the Basic Encoding Rules defining what the protocol looks like on the wire (http://renoir.vill.edu/~cassel/netbook/ber/node1.html).
  • More about BER, this time LDAP-specific

Make Your Users Get What They Want

May 05, 2000 · 0 comments

This page contains several ideas on how to help rank and categorize content on a web site, to help users find the information they are looking for. The basic premise is that by watching what users actually do and counting that as implicit votes, we can generate much more precise information about how valuable content actually is to users, than by asking users to explicitly score items.

Categorization

As is well known in the Human Sciences, especially Information Science, Categorization is in and of itself very problematic and not necessarily helpful to users. Coming up with proper categories and mapping the real world onto those categories is basically an unsolvable problem, and doing it so that it generates value for the user is hard. The categories chosen must depend on a deep knowledge of the actual content you want to categorize. The best way to deal with that is probably to come up with a limited set of categories, and let the set of available categories evolve as content develops and you learn about the content and your users.

We need ways to let not only the set of available categories, but also the categorization of individual items evolve over time. To do the first, it’s reasonable to let your users suggest new categories. To do the latter, you should at least give each mapping between content and a category a weight. A weight is a number, either positive or negative, that says how strong the relation between the content and the category is.

But the act of categorizing content shouldn’t be solely in the hands of authors and moderators. Empower your readers by letting them suggest how a particular content item should be categorized. The readers’ categorizations should be used to adjust the weights of content-to-category mappings in some balanced way, so that the mappings gradually get better for the readers.
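One possible reading of “in some balanced way” is a small, bounded nudge per reader suggestion. The Python sketch below is only that, a sketch: the function name adjust_weight, the step size, the limit, and the idea of scaling by a per-reader trust value are all my assumptions, not something this page prescribes.

```python
def adjust_weight(weight, suggestion, reader_trust=1.0, rate=0.1, limit=10.0):
    """Nudge a content-to-category weight after one reader suggestion.

    suggestion is +1 ("this belongs in the category") or -1 ("it doesn't").
    reader_trust scales how much this reader's opinion counts.
    The result is clamped to [-limit, limit] so that no single mapping
    can run away, however many readers pile on.
    """
    if suggestion not in (1, -1):
        raise ValueError("suggestion must be +1 or -1")
    new_weight = weight + rate * reader_trust * suggestion
    return max(-limit, min(limit, new_weight))


w = 0.0
for vote in (1, 1, -1, 1):
    w = adjust_weight(w, vote)
print(w)
```

Keeping the step small means an author’s or moderator’s initial weight dominates at first, and readers only shift it gradually.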

Scoring Content

We should always strive to score content based on all the information we can get. Higher scoring content should bubble to the top whenever the user is searching or browsing for content.

Scoring can happen along several dimensions, depending on what the user is looking for. Sometimes we will want to favor new content. Sometimes we want to favor content that has generated a lot of comments. When browsing a specific category, we’ll want the weight of the mapping of a content item to that category to influence the score. And we’ll try to associate a notion of “quality” with a piece of content, and consequently score high-quality content higher. This is what we use the notion of votes for.
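As a rough illustration of scoring along several dimensions, here is a Python sketch that combines freshness, discussion, category weight and quality into one number. The function name, the 30-day decay scale, the logarithm on the comment count and the default weights are all arbitrary assumptions, chosen only to show the shape of the idea.

```python
import math


def score(age_days, comment_count, category_weight, quality, weights=None):
    """Combine several scoring dimensions into a single number.

    weights lets the caller stress one dimension over another, e.g.
    freshness when showing "what's new", or category_weight when the
    user is browsing a specific category.
    """
    w = {"freshness": 1.0, "discussion": 1.0, "category": 1.0, "quality": 1.0}
    if weights:
        w.update(weights)
    freshness = math.exp(-age_days / 30.0)  # newer content scores higher
    discussion = math.log1p(comment_count)  # diminishing returns on comments
    return (
        w["freshness"] * freshness
        + w["discussion"] * discussion
        + w["category"] * category_weight
        + w["quality"] * quality
    )


new_item = score(age_days=1, comment_count=2, category_weight=0.5, quality=3.0)
old_item = score(age_days=90, comment_count=2, category_weight=0.5, quality=3.0)
print(new_item > old_item)
```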

Explicit Votes

Explicit votes (“rank this item from 0 to 5”) are not very interesting. When given the option of giving from zero to five stars to some item, most readers go for either zero or five. If their feeling about the content is somewhere in-between they probably won’t rank it. The deeper problem with simply giving a zero-to-five vote is that it doesn’t say anything about why. So an explicit rank should always be associated with some comment. It goes without saying that all content should be commentable.

Implicit Votes

Implicit voting is much more interesting. Implicit voting is what happens when you email a link to a friend. You’re telling your friend to look at this because you think he’d find it interesting. We want to capture those implicit votes as best we can, and record that as a testimony to the item’s “quality”. Thus, we want to offer a “send this page to a friend” option on every content page. Whenever a page is recommended, we record it, and increase the “quality value” of that item. Since most people have a limited set of people that they regularly recommend pages to, we can increase usability by showing the last five-or-so people they’ve emailed recommendations to and let them recommend by one click (hope Amazon won’t sue us for this; see http://www.noamazon.com/).

Another form of implicit vote is links. If someone chooses to link to our content, it’s probably because it’s valuable to them, i.e. high quality. So we should catch referer headers and record them as votes.

Bookmarks are another form of link. We should have a link on every content page that will allow users to add a “virtual bookmark” to that content piece, and we record the act as a vote. The bookmarks can be stored on the same site, so the user has a bookmarks page there, or we can offer to integrate with Yahoo! Bookmarks or other sites that the users might prefer.

By allowing authors of content to include links to other items, we might even be able to do something à la Google (http://www.google.com/why_use.html) to build up a web of votes from content items to each other. A link is a vote. A link from a highly scoring author weighs more.
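The kinds of implicit votes described in this section could be recorded along these lines. This is a minimal Python sketch under my own assumptions: the class name VoteLog, the four event kinds and their equal base weights are invented for illustration, and a real site would store the events in a database rather than in memory.

```python
from collections import defaultdict


class VoteLog:
    """Record implicit votes and keep a running quality value per item."""

    # Assumed base weight per kind of implicit vote; all equal here.
    BASE_WEIGHTS = {"email": 1.0, "referer": 1.0, "bookmark": 1.0, "link": 1.0}

    def __init__(self):
        self.quality = defaultdict(float)  # item id -> accumulated quality
        self.events = []                   # raw log, kept for later analysis

    def record(self, item_id, kind, voter_score=1.0):
        """Count one implicit vote and return the item's new quality value.

        voter_score lets a vote from a highly scoring author weigh more.
        """
        if kind not in self.BASE_WEIGHTS:
            raise ValueError(f"unknown kind of vote: {kind}")
        weight = self.BASE_WEIGHTS[kind] * voter_score
        self.events.append((item_id, kind, weight))
        self.quality[item_id] += weight
        return self.quality[item_id]


log = VoteLog()
log.record("article-42", "email")                 # sent to a friend
log.record("article-42", "link", voter_score=2.5)  # linked by a strong author
print(log.quality["article-42"])
```

Keeping the raw events around, and not just the totals, leaves room to change the weights later and recompute.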

Scoring the Voters

The next level of indirection is when we start ranking rankers (http://www.epinions.com/help/index.html?show=web_of_trust). There are two things we want to do here. First, some people just say ridiculous things whenever they open their mouth. We want to avoid having their votes count by downplaying them. Second, not all people agree in their opinions. We want to establish links between users that say “user x generally agrees with user y”.

We do this by letting users say whether a comment was useful to them or not, on a per-comment basis. Such a vote influences both the score of the comment in question, and the score of the author of that comment. This will let us bubble interesting comments to the top, and give more weight to votes from highly valued people.

We also let users explicitly say that they trust some other person. That indicates an agreement in preferences and will directly influence the scoring of content for that particular user. It will also give the person being trusted a higher score.
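Here is one way the two mechanisms, per-comment usefulness votes and explicit trust, might fit together. It is a Python sketch under stated assumptions: the class name Reputation, the step sizes, and the doubling of a trusted person’s vote are arbitrary choices made for illustration.

```python
from collections import defaultdict


class Reputation:
    """Track comment scores, author scores and who trusts whom."""

    def __init__(self):
        self.comment_score = defaultdict(float)
        self.author_score = defaultdict(float)
        self.trusts = defaultdict(set)  # user -> set of users they trust

    def rate_comment(self, comment_id, author, useful):
        """A usefulness vote moves both the comment and its author."""
        delta = 1.0 if useful else -1.0
        self.comment_score[comment_id] += delta
        self.author_score[author] += delta

    def trust(self, user, other):
        """Record that user trusts other; the trusted person gains score once."""
        if other not in self.trusts[user]:
            self.trusts[user].add(other)
            self.author_score[other] += 1.0

    def vote_weight(self, viewer, voter):
        """How much voter's vote counts when scoring content for viewer."""
        # People with a poor score are downplayed, down to zero.
        base = max(0.0, 1.0 + 0.1 * self.author_score.get(voter, 0.0))
        # Assumption: a vote from someone the viewer trusts counts double.
        trusted = voter in self.trusts.get(viewer, ())
        return base * (2.0 if trusted else 1.0)


rep = Reputation()
rep.rate_comment("c1", "alice", useful=True)
rep.trust("bob", "alice")
print(rep.vote_weight("bob", "alice"), rep.vote_weight("carol", "alice"))
```

Note that vote_weight depends on the viewer, which is what makes the scoring personal: the same vote counts differently for bob than for carol.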

Recording User Interest

As the user searches and browses, we note what search keywords and categories the user likes. That way, we can even further score content to a specific user’s liking. With some clever clickstream analysis (http://click.arsdigita.com/doc/clickstream), we should even be able to estimate how much time the user spends on individual pages, and use that to figure out whether the user liked what he found or not. That’s even more implicit voting.
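Estimating time spent per page from a clickstream can be as simple as looking at the gap between consecutive requests from the same user. The Python sketch below assumes the hits are already grouped per user and sorted by time. The function name and the 30-minute timeout are my assumptions, and the last page of a visit gets no estimate, since nothing follows it.

```python
def dwell_times(hits, session_timeout=1800):
    """Estimate seconds spent on each page from one user's hits.

    hits is a list of (timestamp_in_seconds, url) tuples in time order.
    A page's dwell time is the gap until the user's next request. Gaps
    longer than session_timeout are taken to mean the user left, so
    they produce no estimate.
    """
    result = []
    for (t0, url), (t1, _next_url) in zip(hits, hits[1:]):
        gap = t1 - t0
        if 0 <= gap <= session_timeout:
            result.append((url, gap))
    return result


hits = [(0, "/ldap-intro"), (40, "/flash"), (5000, "/html-for-acs")]
print(dwell_times(hits))
```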
