[personal profile] jack
Recently my "books lent and borrowed" list ballooned by 100% or more, and it occurred to me how useful it would be if it were stored in a simple database, so that an entry like "Jack, friend, lent/borrowed, book" would automatically contribute to the friend's list as well. (On a server large enough that not everyone automatically trusts each other, you might make a few tweaks, such as recording whether both people have approved the entry.) And once the book is returned, either of us can check it off.
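As a rough illustration of the single-server version, here is a minimal sketch using Python's built-in sqlite3 module. The table layout and the function names (open_db, record_loan, mark_returned, outstanding) are made up for this example; the two approval columns are one guess at how "both people have approved the entry" might be recorded.

```python
import sqlite3

def open_db(path=":memory:"):
    """Open (or create) the loans database. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS loans (
            id INTEGER PRIMARY KEY,
            lender TEXT NOT NULL,
            borrower TEXT NOT NULL,
            book TEXT NOT NULL,
            approved_by_lender INTEGER NOT NULL DEFAULT 0,
            approved_by_borrower INTEGER NOT NULL DEFAULT 0,
            returned INTEGER NOT NULL DEFAULT 0
        )""")
    return conn

def record_loan(conn, entered_by, lender, borrower, book):
    """Record a loan; whoever enters it counts as having approved it."""
    cur = conn.execute(
        "INSERT INTO loans (lender, borrower, book,"
        " approved_by_lender, approved_by_borrower) VALUES (?, ?, ?, ?, ?)",
        (lender, borrower, book,
         int(entered_by == lender), int(entered_by == borrower)))
    conn.commit()
    return cur.lastrowid

def mark_returned(conn, loan_id, user):
    """Either party may check the loan off. Returns True if a row changed."""
    cur = conn.execute(
        "UPDATE loans SET returned = 1"
        " WHERE id = ? AND ? IN (lender, borrower)",
        (loan_id, user))
    conn.commit()
    return cur.rowcount == 1

def outstanding(conn, user):
    """Every unreturned loan involving `user`, as lender or borrower."""
    rows = conn.execute(
        "SELECT lender, borrower, book FROM loans"
        " WHERE returned = 0 AND ? IN (lender, borrower) ORDER BY id",
        (user,))
    return rows.fetchall()
```

One row serves both people's lists, which is the "automatically contribute to the friend's list" part: outstanding(conn, "jack") and outstanding(conn, "friend") read the same entry.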

Obviously someone used to web programming could knock up something that works in a simple way in an evening if they put their mind to it (though I'm rather rusty myself).

However, this is obviously limited to a single server. Someone would write it and host it on chiark, or some other server, and invite people to use it, but if someone elsewhere did the same thing, you'd need an account on both if you had friends on both. (Solving that isn't a big deal for _this_, but I'm curious, and it would be great for many applications, maybe even social networking ones.)

What is the most obvious way of letting the two (or many) servers cooperate? Tim described this as "usenet for databases". Mobbsy said the nearest thing that sprang to mind was DNS, where servers do trade information back and forth. You might find the bandwidth wasn't even that high -- e.g. many web apps send email alerts anyway, so the extra hit of sending the data to another server mightn't be much.

(I'm more interested in the question about the technology of distributed databases, but I'm interested in the book thing too -- I haven't searched yet, so it probably exists, or can be integrated into LibraryThing or similar, and if you know of it please do tell me, but that wasn't the real reason for the post.)

(Another wrinkle is my envisaged service/applet/facebook-app I may have mentioned before called "not a bank" which does the same thing as book loans but for money and other fungible amounts, where it can be automatically rebalanced so all your debts cancel out and you owe/are owed by just one person you know. It would be emphasised this was an aide memoire for small loans to someone you trust to pay, or don't care if they don't, but you might not remember.)
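The rebalancing step can be sketched as plain netting: work out each person's net position, then pair debtors with creditors. This is only an illustration with made-up names (net_balances, settle). Amounts are integers (say, pence), and it ignores the "someone you know" constraint entirely, so it may pair up strangers and may leave a person with more than one counterparty.

```python
from collections import defaultdict

def net_balances(debts):
    """debts: iterable of (debtor, creditor, amount).
    Returns each person's non-zero net position; positive means they are owed."""
    balance = defaultdict(int)
    for debtor, creditor, amount in debts:
        balance[debtor] -= amount
        balance[creditor] += amount
    return {person: b for person, b in balance.items() if b != 0}

def settle(debts):
    """Replace the original debts with an equivalent, shorter list of transfers."""
    balance = net_balances(debts)
    debtors = sorted((p, -b) for p, b in balance.items() if b < 0)
    creditors = sorted((p, b) for p, b in balance.items() if b > 0)
    transfers = []
    i = j = 0
    while i < len(debtors) and j < len(creditors):
        debtor, owes = debtors[i]
        creditor, owed = creditors[j]
        pay = min(owes, owed)
        transfers.append((debtor, creditor, pay))
        debtors[i] = (debtor, owes - pay)
        creditors[j] = (creditor, owed - pay)
        if debtors[i][1] == 0:
            i += 1
        if creditors[j][1] == 0:
            j += 1
    return transfers
```

For example, if Jack owes Tim 10 and Tim owes Mobbsy 10, settle returns a single transfer from Jack to Mobbsy and Tim drops out.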

(Distributed databases have lots of problems with being transaction safe and not getting data desynchronised between servers, etc. These applications seem superficially better suited: each person can have a home server that might be considered authoritative; at worst one server has to push a change to another one; and it should be low risk if something does go wrong. A big headache (or showstopper) is security: do servers need to trust each other? Can you arrange it so that if some idiot trusts a friend on a bad server, the worst that happens is that they individually get spammed, but their data isn't actually corrupted?)

Date: 2008-04-09 09:39 pm (UTC)
From: [identity profile] oedipamaas49.livejournal.com
Theoretically, you could maybe build something on top of a distributed version control system (git/bazaar/mercurial/darcs)?

Date: 2008-04-10 12:56 am (UTC)
From: [identity profile] mobbsy.livejournal.com
It's not just the servers you need to co-operate; that's (relatively) easily achieved by having a single backing data store which all servers reference. The problem is when you don't have any meaningful synchronization point for the servers. Life is easier if there's a single point of truth for a given datum; the distributed system is just a set of slaved copies of that, which can always be referred back to the known true point.

It gets much trickier if there's no definitive truth. You can use distributed transaction protocols (like two-phase commit), but that gets expensive if you use it to spread the data over all servers. You could instead use it in a more light-weight way, just to ensure correct versioning, but that can hurt availability.

DNS doesn't even try to solve any of those problems; it relies on a hierarchical distribution of data and responsibility, and timeouts to refresh data. Like a lot of early internet protocols, it's a fairly pragmatic approach to the problem that works pretty well most of the time.

Date: 2008-04-10 08:16 am (UTC)
From: [identity profile] ewx.livejournal.com

I think in this case you don't want a single consistent view of who has what cobbled together from lots of sources, but a list of everyone's claims, and let the reader make their own judgement as to the current state of the world from those. The input data is inherently unreliable to start with (people forget even if they're not malicious), so no matter how clever your software you're not going to get a 100% accurate and consistent picture.



For example if we were both conscientious we'd have "Jack claims Richard borrowed LOTR from him" and "Richard claims he borrowed LOTR from Jack", and any search for "Jack's LOTR" would find both of these; if one of us was lazy only one of these might show up; if communications were disrupted you might get exactly one of them depending where you asked from for a while.
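A minimal sketch of that "list of claims" view, assuming a hypothetical Claim record with a claimant field. Nothing is merged or resolved; a search returns every matching claim, and the reader can see which loans are corroborated by both sides.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Claim:
    claimant: str   # who is making the statement
    lender: str
    borrower: str
    book: str

def search(claims, lender, book):
    """All claims about `lender`'s copy of `book`, whoever made them."""
    return [c for c in claims if c.lender == lender and c.book == book]

def claimants_by_loan(claims):
    """Group claims by (lender, borrower, book) and list who asserts each one."""
    seen = defaultdict(set)
    for c in claims:
        seen[(c.lender, c.borrower, c.book)].add(c.claimant)
    return dict(seen)

claims = [
    Claim("Jack", "Jack", "Richard", "LOTR"),     # Jack claims Richard borrowed it
    Claim("Richard", "Jack", "Richard", "LOTR"),  # Richard claims the same
]
```

With both claims present, claimants_by_loan(claims) shows the loan asserted by both Jack and Richard; if one of them was lazy, or a message has not arrived yet, the set simply has one name in it.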

Date: 2008-04-10 09:40 am (UTC)
From: [identity profile] hmmm-tea.livejournal.com
Reading this immediately made me think of LibraryThing, although I've never really explored it.

Looking at the site, it doesn't appear to do what you want. However, to me, the obvious feature to add would be the ability to show books lent from one member to another.

Date: 2008-04-10 06:20 pm (UTC)
From: [identity profile] theinquisitor.livejournal.com
Were it me, I'd implement communication via ACI or similar, and make each server contain a list of all claims made by its own users, and all claims pertaining to its users that it has received notification of.

Also, of course, make it easy to reciprocate claims you agree with.

A@B claims he lent book C to D@E. (@ being used to designate home servers, obviously). So B sends a message to this effect, to E. E will later give D the chance to acknowledge this, and if he does, E will send a message to B to this effect.

That way A can see that he claimed to have lent C to D, and whether or not D agrees.

A malicious server can claim you owe it books, but never force you to accept this fact. It can also lose or deny your claim to have lent someone on it a book, but never force you to drop your claim. I don't see any potential problems that can't be solved with a UI arranged to allow you to display your version of reality, rather than that asserted by others.
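The exchange above can be sketched as a toy in-memory simulation. The Server class, its method names, and the shared `network` dict (standing in for whatever real transport is used) are all assumptions made for illustration; users are written as "user@server".

```python
def home_of(address):
    """'D@E' -> 'E' (the user's home server)."""
    return address.rsplit("@", 1)[1]

class Server:
    def __init__(self, name, network):
        self.name = name
        self.network = network      # name -> Server; stands in for real messaging
        network[name] = self
        self.own_claims = {}        # (lender, book, borrower) -> acknowledged?
        self.incoming = {}          # same key -> accepted? (None = undecided)

    def claim_lent(self, lender, book, borrower):
        """A local user claims to have lent `book` to `borrower`."""
        if home_of(lender) != self.name:
            raise ValueError("can only make claims for our own users")
        key = (lender, book, borrower)
        self.own_claims[key] = False
        remote = self.network.get(home_of(borrower))
        if remote is not None:      # if unreachable, our claim still stands
            remote.receive_claim(key)
        return key

    def receive_claim(self, key):
        """Another server says one of our users borrowed something."""
        if home_of(key[2]) == self.name:
            self.incoming.setdefault(key, None)   # recorded, not accepted

    def acknowledge(self, key):
        """Our user agrees with an incoming claim; tell the lender's server."""
        if key not in self.incoming:
            raise KeyError(key)
        self.incoming[key] = True
        self.network[home_of(key[0])].receive_ack(key)

    def receive_ack(self, key):
        # Only claims our own users made can be marked as agreed, so a
        # remote server cannot invent or delete entries on this side.
        if key in self.own_claims:
            self.own_claims[key] = True

network = {}
b, e = Server("B", network), Server("E", network)
key = b.claim_lent("A@B", "C", "D@E")
e.acknowledge(key)
```

After this runs, b.own_claims[key] is True, so A can see both his claim and D's agreement. A claim arriving from a hostile server only ever lands in `incoming` as undecided, which matches the point that it can assert things but never force you to accept them.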

Date: 2008-04-10 08:29 pm (UTC)
From: [identity profile] d37373.livejournal.com
"Obviously someone used to web programming"
*Waves*

Databases: Sounds like you want some sort of multimaster system.

The system we use at work is based on knowledge-dates. We timestamp every change to the database, enact it locally, then pump out a message to our other servers telling them what happened and when. We also store history for each table.

When server A gets a message from server B saying that change C happened at time T, server A applies the change to the row as it was at time T then reapplies all changes to the row since then (if any). [1]

This all works given a messaging service that guarantees a message is processed exactly once (possible, even over dodgy IP) and the servers have synchronised clocks (also possible). It copes with high latency, even to the extent of a server dropping off the net for days before coming back online.

None of the data is guaranteed up-to-date, but there is a guarantee that at some point in the future everyone will agree about what happened in the past. I consider that a pretty big win.

[1]: The implementation we use is a little more complicated and a lot more efficient than this sounds.
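A naive sketch of the knowledge-date idea, not the implementation described above: each row keeps its full history sorted by timestamp, a late-arriving change is slotted into place, and the current state is rebuilt by replaying the history in order. The Row class, the (timestamp, origin, field, value) tuple layout, and the tie-break on origin name are assumptions for illustration. It relies on the same preconditions stated above: synchronised clocks and exactly-once delivery.

```python
import bisect

class Row:
    def __init__(self):
        # Sorted list of (timestamp, origin, field, value) tuples.
        # Sorting on origin as well gives every server the same order
        # when two changes carry the same timestamp.
        self.history = []

    def apply(self, timestamp, origin, field, value):
        """Slot a change into the history at the time it actually happened."""
        bisect.insort(self.history, (timestamp, origin, field, value))

    def state(self, as_of=None):
        """Replay the history in time order, optionally only up to `as_of`."""
        current = {}
        for timestamp, _origin, field, value in self.history:
            if as_of is not None and timestamp > as_of:
                break
            current[field] = value
        return current

changes = [
    (1, "A", "holder", "Jack"),
    (2, "B", "holder", "Richard"),
    (3, "A", "returned", "yes"),
]

on_a, on_b = Row(), Row()
for change in changes:                                  # arrives in order
    on_a.apply(*change)
for change in (changes[2], changes[0], changes[1]):     # arrives out of order
    on_b.apply(*change)
```

Both rows end up with identical history, so on_a.state() and on_b.state() agree even though the messages arrived in different orders. Replaying from scratch on every read is the simple and slow version; a real system would presumably replay only from time T onward.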

Date: 2008-04-13 10:04 pm (UTC)
From: [identity profile] robhu.livejournal.com
Slight aside - I believe someone made a tool that let you store stuff in DNS, i.e. distribute blocks of it around the world by getting random DNS servers to cache them. Then they made a tool that let you query DNS servers to get the original file back.

I believe the proof of concept of this was to store a CD ISO in the DNS tree, and then get the advantages of caching and so on when downloading it.