jack: (Default)
Bitwise copy

I'd read this but not really thought about it before. Rust prioritises data structures which can be moved or copied with memcpy. That eases various things. But to achieve it you need to keep a very tight rein on many things which are used all over the place in most languages.

Notably, you can only have a pointer to a struct from one place, unless you specifically arrange to "borrow" it; and a value with outstanding shared borrows can't be changed or moved (through either the original or any of the borrows). Rust blogs describe this as similar to the discipline needed when dealing with data from multiple threads, except that here it's to avoid mistakes like "I am in the middle of a computation using this value, then call another function which changes this value, having forgotten that this created an implicit dependency between those bits of code".

This shows up in lots of confusing ways: function parameters need to be borrowed or copied, else they are moved by default; and once moved, the original is gone and can't be accessed.[1]

I'm not sure if this will turn out to be really useful or not. I see the logic, and agree that it can prevent mistakes, but I don't know if it's possible to write code that avoids those problems, or if in practice everyone ends up using one of the ways to work round this restriction, and then tracks any unfortunate implicit dependencies in their heads just like they used to.

The specific example I'm going to mention below is having owned structs which contain a pointer back to the owner, which doesn't usually work because someone else needs to have a pointer to the owner as well.

Interior mutability

This is only slowly making sense to me, I'm not sure how much sense the version I'm writing does.

A common pattern in some programs is "having a struct containing a bunch of settings which can be accessed and changed from multiple parts of the program".

This is particularly difficult in Rust because having different pointers to an object usually forbids you from changing it.

The way round this is "interior mutability", that is, a value that can be changed even if it's part of a struct which is supposed to be immutable. It's a bit like "mutable" in C++, which is used for weird edge cases like caching calculated values in an apparently const class, or allowing const functions to lock mutexes, etc. Except that you're apparently supposed to use it for any "normal" variables which you read and write from multiple places. IIRC you can use "Cell", which works like a normal value variable in other languages, or "RefCell", which works like a pointer and takes a lock at runtime, panicking if you get the borrowing wrong.
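As a sketch of how this looks in practice (the struct and field names here are invented, not from any real program), both Cell and RefCell allow mutation through shared references:

```rust
use std::cell::{Cell, RefCell};

// Invented example: a settings struct reachable from several places via
// shared references, but still mutable inside.
struct Settings {
    volume: Cell<u32>,          // plain Copy value: get/set by copy
    greeting: RefCell<String>,  // non-Copy value: borrow-checked at runtime
}

fn main() {
    let settings = Settings {
        volume: Cell::new(5),
        greeting: RefCell::new(String::from("hello")),
    };

    // Two shared references to the same struct...
    let a = &settings;
    let b = &settings;

    // ...and both can still mutate the interior.
    a.volume.set(7);
    b.greeting.borrow_mut().push_str(", world");

    assert_eq!(a.volume.get(), 7);
    assert_eq!(&*b.greeting.borrow(), "hello, world");
    // Holding a borrow() while calling borrow_mut() would panic at runtime
    // instead of being caught at compile time.
}
```

The point is that the mutation checks move from compile time to runtime, which is what lets the "settings accessed from everywhere" pattern compile at all.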

This brings us back to the topic I was thinking about before, originally inspired by these features of Rust: a common pattern is needing a pointer to the owning class from an owned class. But you might not need that if you had the feature I discussed before, where whenever you access the owned object through a reference to the owning object, it carries along a secret pointer to the owner, like a second "this" pointer.

That could work for the case of "access some central object from several different parts of the program". If the various parts are owned objects which can access the parent object, but only when they're entered by code from that object, then the parent object (including the settings object) is only accessed from one of the components at a time, each of which borrows exclusive access to the parent while its member function is called.

However, the feature I wasn't sure of but would need to be added is, if you have a pointer to that owned object from anywhere *else* (notably, a callback of some sort), it needs a special pointer that wraps up a pointer to it and a pointer to its parent together. That does sound really hairy, although if you have a 64 bit pointer there should be lots of room to implement the sordid details somehow. Assuming you never need to nest this too deep. Although of course, at that point, you've given up any pretence these could be moved around in memory anyway, so maybe there's no benefit to this flexibility?

Footnotes

[1] I think explanations of this explain it really badly, in that most people encounter these errors before understanding why "move by default" is a thing at all, so don't find that a satisfying answer.
jack: (Default)
Removing code is good! But everywhere I've worked has had a "pile of makefiles" build system, which have invariably had problems when you remove a file, because the .d files are still hanging around, and make chokes on a source file because it can't find the headers that file needed last time, even though they're no longer necessary to build it.

And it's a matter of culture whether it's "when you check out code, you often need to make clean or make undepend somewhere to get it to compile" or "when you check in code, you need to find a workaround to make it build cleanly even if you've removed files".

Do people with more recent build tools than "make" avoid this problem?

However, after thinking it through carefully I eventually decided on one of the ways to make makefiles cope with this correctly.

The trick

You still do "-include $(OBJ_FILES:%.o=%.d)" or equivalent.

But when you produce a .d file with gcc (usually as a side effect of producing a .o file via -MMD), add an extra line at the end of the recipe: a perl script which edits the .d file in-place and replaces each "filename.o: header1.h header2.h..." with "filename.o: $(wildcard header1.h header2.h...)"

That way, if any dependency has *changed* a rebuild is forced as normal. But only dependencies that actually exist become dependencies within the makefile. (Deleting a header file doesn't trigger a rebuild, but it doesn't with the old system either since the .o file already exists.)
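A minimal sketch of that post-processing step, using GNU sed instead of perl, and assuming the simplest case where the dependency line hasn't been split with backslash continuations (the real script would join continued lines first):

```shell
# Sketch of the .d post-processing step (GNU sed; assumes the dependency
# line is on one line, i.e. no trailing backslash continuations).
echo 'foo.o: foo.c header1.h header2.h' > example.d

# Wrap the prerequisite list in $(wildcard ...): make then ignores files
# which no longer exist, but still rebuilds when an existing one changes.
sed -i -e 's/^\(.*\): \(.*\)$/\1: $(wildcard \2)/' example.d

cat example.d   # foo.o: $(wildcard foo.c header1.h header2.h)
```

The $(wildcard ...) is expanded by make itself when the .d file is -included, which is the whole trick.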

I can share the exact script if anyone wants to see.
jack: (Default)
OK, so. We want to allocate a large block of memory that is contiguous in physical memory. That means allocating physical memory in the kernel (as with kmalloc), and then later providing it to userspace software. Presumably then mapping it into virtual memory for use in userspace, with mmap on /dev/mem at the physical address, although we may be doing something different for reasons which aren't relevant here.

We happen to have a kernel driver already for other experiments with our specific hardware, so we have somewhere convenient to put this kernel code as needed.

This is running on a hardware board dedicated to a single task, so we have a few advantages. We would prefer to allocate a large chunk on start-up, and we have complete control over which programs we expect to use it; we don't need to dynamically manage unknown drivers competing for this memory; we never intend to free it; and the board will only be used for this, so we don't need to make sure other programs run ok. And there's no restriction on addresses: DMA and the other relevant peripherals can access the entire memory map, so unlike on x86 we don't need to specifically reserve *low* memory.

There are several different related approaches, and I went through a few rabbit holes figuring out what worked.

Option 1: __memblock_alloc_base()

From research and helpful friends, I found some relevant instructions online. One was from "Linux Device Drivers, 3rd edition", the section entitled "Obtaining Large Buffers", about using alloc_bootmem_low to grab kernel pages during boot. I'm not sure, but I think this was correct at the time, except that the kernel has since started using memblock instead of bootmem as its start-up allocator.

From the code in the contiguous memory allocator (search the kernel source for "cma"), I learned that possibly I should be using memblock functions as well. I didn't understand the different options, but I used the same one as the contiguous memory allocator code, __memblock_alloc_base, and it seemed to work: I tried large powers of 2 and could allocate half of physical memory in one go, though I haven't fully tested it.

There are several related functions, and I don't know for sure what is correct, except that what the cma code did worked.

This code is currently in a kernel driver init function. The driver must be compiled statically into the kernel, you can't load it as a module later. You could put the code in architecture specific boot-up code instead.

Option 2: cma=

fanf found a link to some kernel patches which tried to make a systematic way of doing this, based on some early inconsistently-maintained patch, which later turned into code which was taken up by the kernel. Google for "contiguous memory allocator". There's an article about it from the time and some comments on the kernel commit.

It's a driver which can be configured to grab a large swath of contiguous memory at startup, and then hand that out to any other driver which needs it.

You specify the memory with "cma=64M" or whatever size on the kernel command line. (Or possibly in the .config file via "make menuconfig"?) You need to do this because it allocates on start-up, when it doesn't otherwise know whether it should grab this memory or not.

It then returns this memory through normal calls to "dma_alloc_coherent", which is designed to allocate memory which is physically contiguous, but doesn't normally allocate such big blocks. I hadn't tested this approach because I didn't need any specific part of memory so I'd been looking at kmalloc not "dma_alloc_coherent", but a colleague working on a related problem said it worked on their kernel.

It may also do clever things involving exposing the memory to normal allocation as movable pages, migrating whatever is there elsewhere to free it up when needed, I'm not sure (?)

I was looking at the source code for this and borrowed the technique to allocate memory just for our driver. We may either go with that (since we don't need any further dynamic allocation, one chunk of memory is fine), or revert to using the cma later since it's already in the kernel.

I went down a blind alley because it looked like it wasn't enabled on my architecture. But I think that was because I screwed up "make menuconfig" not specifying the architecture, and actually it is. Look for instructions on cross-compiling it if you don't already have that incorporated in your build process.

Option 3: CONFIG_FORCE_MAX_ZONEORDER

This kernel parameter in .config apparently increases the amount of memory you can allocate with kmalloc (or dma_alloc_coherent?). We haven't explored this further because the other option seemed to work, and I had some difficulties with building .config, so I don't know quite how it works.

I found the name hard to remember at first. For the record, it sets the maximum order of the buddy allocator: the largest block which can be allocated in one go is 2^(MAX_ORDER-1) pages, i.e. the limit is one order lower than the configured value. Double check the documentation if you're not sure.

Further options

There are several further approaches that are not really appropriate here, but may be useful under related circumstances.

* On many architectures, DMA does scatter-gather specifically to read or write non-contiguous memory, so you shouldn't need this in the first place.

* Ensure the hardware can write to several non-contiguous addresses.

* Allocate several blocks of the largest size kmalloc can allocate, and check whether they do in fact turn out to be contiguous, since kernel boot-up probably hasn't fragmented the majority of memory.

* Ditto, but just allocate one or several large blocks of virtual memory with malloc, and check that most of it turns out to be allocated from contiguous physical memory because that's what was available. This is a weird approach, but if you have to do it in userspace entirely, it's the only option you could take.
jack: (Default)
The android game I wrote last month is available for download (see bottom of this post).

Gameplay

It's a variant on an augmented reality match three game. Physically walk around to change which square is highlighted with a light grey background. Click that square to place the next tile there. The next tile is shown at the bottom of the screen. Match three of the same type in a row, and they vanish forming a new type. Then try to match three of *those*. When you reach hearts, match three hearts of any colour and they vanish entirely (but give lots of score).

For instance, three fish next to each other in a line make an octopus, three octopuses make a whale, three whales make a blue heart, three hearts of any colour vanish entirely. And similarly for the three other starter animals.

Only vertical and horizontal. But if you make a line of four, or two crossing lines of three, they all vanish. They only give one new tile, but you get more points.

It would be trivial to play if you could just click on a square, but it's surprisingly addictive when you play it walking about.

Be careful not to walk into the middle of roads! It's surprisingly easy to make that mistake when you're concentrating on your location in the game.

The screen wraps round, so you can always keep walking in one direction rather than walk in the opposite direction. It's best to start by figuring out which compass direction corresponds to which direction on the grid :)

Tips: When you complete an octopus, think about where you're going to put the fish to make the next octopus next to the first one.

Details

If you open the .apk file on an android device, it should ask if you want to install it. You can only do so if you agree to install apps which come from me not the play store. I think that should work but I don't know for sure.

It is very early stages. It seems to work on one or two devices, but I haven't tested it more extensively than that. It will hopefully be ok, but I don't know for sure.

It still has some UI from the open source OpenSudoku game I based the code on. Don't pay any attention to the menus or help.

File:

https://www.dropbox.com/s/md5sjt25xe3eean/emojilution-debug.apk

(Let me know if the link doesn't work. You should *not* need a dropbox account to use it, but you may have to scroll to the bottom of the screen to continue to download without one.)

Feedback

I would appreciate knowing everyone who tried it, just whether it installed ok or not, and if the game itself seemed to work.

Lots of things are known to be unfinished, so don't waste energy enumerating what's missing in menus etc. Do let me know anything that seems to prevent you playing the game. Do ask if it doesn't run or it's not obvious what to do. Comments on what's fun and what isn't are very much appreciated!

Thank you!
jack: (Default)
I previously talked about accessing the scope of an owning class from a class declared and instantiated within it. Latest post here: https://jack.dreamwidth.org/1017241.html

The possible approaches seem to be: the owned class has a pointer back to the owning class; or functions dealing with the owned class get an extra parameter to the owning class. Whether that's implemented manually, or automatically by the language, or somewhere between.

Thinking some more about this, various things occurred to me:

Java

I hadn't realised, but I've learned that Java and C# (and maybe other recent/managed languages) do this automatically, presumably implemented by the owned (inner) class automatically holding a pointer to the owning class, with a check where it's instantiated that it's only instantiated via an instance of the owning class.
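Java's version of this is the non-static inner class, whose hidden pointer to the owner is spelled Outer.this. A small sketch (the class and field names here are mine):

```java
// A non-static inner class holds a hidden reference to its owning
// instance, written Outer.this. Names are invented for illustration.
public class Outer {
    private int setting = 42;

    class Inner {
        int readOwnerSetting() {
            // The compiler passes the owning Outer along automatically.
            return Outer.this.setting;
        }
    }

    public static void main(String[] args) {
        Outer owner = new Outer();
        // An Inner can only be created via an instance of Outer:
        Outer.Inner inner = owner.new Inner();
        System.out.println(inner.readOwnerSetting()); // prints 42
    }
}
```

Note the `owner.new Inner()` syntax: the language enforces that an Inner is always created from some specific Outer instance.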



I was naturally drawn to an "owning pointer is passed in alongside or as part of a this pointer" implementation as it seemed more conceptually correct. However, the actual benefit of this is a lot smaller in most languages other than Rust. I first started thinking about these options in a Rust example, where having a pointer to the owning class needed some fancy dancing, because Rust prefers to keep tight limits on how many pointers to an object exist at once (ideally only one).

This hopefully makes memory management safer, and means you can usually move classes around in memory using a raw memcpy, because they don't usually have internal pointers to different parts of them. But most other languages don't even try to do that, just assume that a non-trivial class is fixed in place in memory (or moved only by a garbage collector that knows where all the pointers are).

Implementation

If you try to avoid having a permanent pointer back to the owning class, and if you ever need a pointer to the owned class (this is common if you use it for a callback), you need to accept that your pointers would actually be a pair (or more) of pointers: one to the owning class, and one to the owned class. The owned pointer might be an offset rather than a complete pointer. That's clunky, but wouldn't necessarily take up that much space if the language supports it. You could do a similar thing for iterators, like pointers to members of a collection, rather than having a bare pointer that only makes sense if you already know what the collection is.

That seems a useful concept, but I'm not sure how useful it would be in practice.
jack: (Default)
I was reflecting further on my previous comments on meta-history in source control.

One use case I imagined was that you can rebase freely, and people who've pulled will have everything just work assuming they always pull rebase. But I may have been too pessimistic. A normal pull rebase may usually just cope with the sort of rebasing people are likely to have done upstream anyway.

The other question is, are you losing old history by rebasing older commits? Well, I don't suggest doing it for very old commits, but I guess, you're not losing any history for commits that were in releases.

Although that itself raises a reason to have a connection between the new branch and the old: you shouldn't be rebasing history prior to a release much (usually not at all; maybe to squash a commit to make git bisect work?). But if you do, you don't want two parallel branches with the same commit: you want to be able to see where the release was on the "good" history (assuming there's a commit which is file-identical to the original release commit), and fall back to the "original" history only if there's some problem.

And as I've said before, another thing I like is the idea that if you're rebasing, you don't have a command that says "do this whole magic thing in one step"; you have a thing that says "construct a new branch one commit at a time from this existing branch, stopping when there's a problem", and there is no state needed to continue after resolving a problem, you just re-run the command on the partially-constructed new branch. And then you can choose to throw away the old branch to tidy up, but that's not an inherent part of the command.
jack: (Default)
I like the principle of duck typing.

Roast it if it looks sufficiently duck-like. Don't worry about whether it's officially a duck, just if it has the relevant features for roasting.

However, I don't understand the attachment to getting 3/4 of the way through the basting, stuffing and roasting project before suddenly discovering that you're trying to crisp a small piece of vaguely duck-shaped ornamental stonemasonry.

I agree with (often) only testing for the *relevant* features of duck-ness. But it seems like the best time to test for those relevant features is "as soon as possible", not "shut your eyes, and charge ahead until you fail". Is there a good reason for "fail fast, except for syntax errors, those we should wait to crash until we're actually trying to execute them"?

I've been working on my non-vegetarian metaphors, how did I do? :)
jack: (Default)
Thanks to everyone who commented on the previous post, or posted earlier articles on a similar idea. I stole some of the terminology from one of Gerald-Duck's posts Simon pointed me to. And have tried to iterate my ideas one step closer to something specific.

Further examples of use cases

There are several related cases here, many adapted from Simon's description of PuTTY code.

One is: several different parts of the program have a class which "owns" an instance of a socket class. Many of the functions in the socket also need to refer to the owning class. There are two main ways to do that. One way is that every call to a socket function passes a pointer to the parent; but that clutters up the interface. Or the socket stores a pointer to the parent, initialised on construction; but there is no really appropriate smart pointer for that, because both classes have pointers to each other.

A socket must have an owner. And classes which deal with sockets will usually have exactly one that they own, but will also often have none (and later open one), or more than one.

And you "know" the pointer to the parent will never be invalid as long as the socket is owned by the parent, because you never intend to pass that pointer out of that class; but there is no decent language construct for "this pointer is a member of this class, and I will never copy it, honest" which would allow the child to have a provably-safe pointer to the parent. This is moot in C if you don't have smart pointers anyway, but it would still be useful to exactly identify the use case so a common code construction could be used, so programmers can see at a glance the intended functionality. It would be useful to resolve in C++. And there are further problems in Rust, where using non-provably-safe pointers is deliberately discouraged, and there's a greater expectation that a class can be moved in memory (and so shouldn't contain pointers to parts of itself).

The same problem is described two different ways. One is, "a common pattern is allocating an owned class as a member of another class, where the owned class has an equal or shorter lifetime than the owner, and a pointer back to the owner which is known to always be valid, with no pointer loops" -- a special sort of two-way pointer, where one is an owning pointer and the other is a provably-valid non-owning pointer. Another is "classes often want to refer to the class that owns them, or the context they were called from, and there is no consistent/standard way of doing that."

Proposal

Using C++ terminology, in addition to deriving from a class, a class can be declared "within" another class, often an abstract base class aka interface aka trait of the actual parent(s).

class Plug
{
public:
virtual void please_call_me_from_socket(int arg1)=0;
};

class Socket : within Plug
{
// Please instantiate me only from classes inheriting from Plug
public:
void do_something();
private:
int foo;
};


The non-static member functions of Socket, in addition to the hidden pointer parameter identifying the instance of Socket which is accessed by "this->", take a second hidden parameter identifying the instance of Plug from which it was called, accessed by "parent->please_call_me_from_socket(foo)" (or parent<Plug>->please_call_me_from_socket(foo) or something, to disambiguate if there are multiple within declarations. Syntax pending).

Where does that pointer come from? If it's called from a member function of a class which is itself within Plug, then it gets that value. That's not so useful for Plug itself, but is useful for classes which you want to be accessible almost everywhere in your program, such as a logging class.

In that case, you may want a different syntax, say a "within" block which says all classes in the block are within an "app" class, and then naturally all pass around a pointer to the top-level app class without any visual clutter. And it only matters when you want to use it, and when you admit the logger can't "just be global".

For Socket, we require that member functions of Socket are only called from member functions of Plug (which is what we expected in the first place, but hadn't previously had a way of guaranteeing). And then the "parent" pointer comes from the "this" pointer of the calling function.

There is probably also some niche syntax for specifying the parent pointer explicitly, if the calling code has a pointer to a class derived from Plug, but isn't a member function, or wants to use a different pointer to this. The section on pointers may cover this.

Pointers, Callbacks, Alternatives, and Next Steps )
jack: (Default)
I am still mulling this over after reading some articles on it (thanks, fanf, Kaela).

Background

Imagine you have a fairly simple function.

RetType func1(arg1, arg2)
{
   return func3(func2(arg1),func2(arg2)).func4();
}


But those other functions may encounter errors. Eg. they involve opening files, which may not be there.

Assume the error return can't usually be passed to a follow-up function.[1] The obvious approach is then: for each function call, test the return value; if it's an error, return an error from this function; else continue with the calculation. But this usually involves several lines of code for each of these function calls, which obscures the desired control flow.
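Sketched with Rust's Result for concreteness (the function names are mine; the same shape appears with error codes in C), the boilerplate looks like this:

```rust
// Each call needs its own "check and bail out" block, obscuring the
// one-line computation we actually wanted to write.
fn double_both(a: &str, b: &str) -> Result<i32, std::num::ParseIntError> {
    let x: i32 = match a.parse() {
        Ok(v) => v,
        Err(e) => return Err(e),
    };
    let y: i32 = match b.parse() {
        Ok(v) => v,
        Err(e) => return Err(e),
    };
    Ok(2 * (x + y))
}

fn main() {
    assert_eq!(double_both("2", "3"), Ok(10));
    assert!(double_both("2", "three").is_err());
}
```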

If you are willing to accept exceptions, you can just write the code above and allow any exceptions to propagate. But that represents a lot of hidden complexity from not knowing what might be thrown. And often runtime overhead.

And in fact, this may obscure a common pattern, that for some function (eg. "parse this"), you SOMETIMES want to treat the failure as an error, and sometimes to interrogate it. As in, choose in the calling code whether failure is an error-value or exception.

Also remember, in C-like languages, many values unavoidably have a possible error case which can't be passed to other functions, null pointer. Ideally it would be clear which pointers might be null and which have already been assumed not to be.

In Rust

In Rust (if I understand correctly), these possibilities are often wrapped up in a try macro.

There is a conventional "Result" return type for most functions which may succeed or fail, which has one of two values: either 'Ok' (usually, though not necessarily, wrapping a return value), or 'Err', wrapping a specific error (just a string, or an error object).

The try macro combines the "test return value, if it's an error, return that error from this function, else, evaluate to the successful value" into a brief expression:

try!(func2(arg))

Which seems like often what you want. Obviously if you want to handle the error in some way (say, you're interested in whether it succeeds, not just the successful result), you can interrogate the result value for ok or err.

And there's also "unwrap" for "assume success, take the result value, panic if it's not there", just like you can access a pointer without checking for null if you want. But functions which can't return an error shouldn't return "Result", so if you do that, it's clear you *might* fail. Which is exactly what you want for throw-away code. And it does mean you can search for "unwrap" if you want to find all the points where you did that and fix them.

Rust recent innovation: ?

I mention try! for historical reasons, but just recently Rust has promoted it into a language feature, reducing the overhead further from four to six characters to one: '?' after a value means the same thing as the try macro.

fn func1(arg1: Arg, arg2: Arg) -> Result<RetType, ErrType>
{
   func3(func2(arg1)?, func2(arg2)?)?.func4() // Sketch, not exact Rust syntax
}
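For a compilable example of the same shape (the function name is mine, and it uses file opening since that's the post's running example of a fallible operation):

```rust
use std::fs::File;
use std::io::{self, Read};

// Each `?` either unwraps the success value or returns the io::Error
// from read_config early.
fn read_config(path: &str) -> Result<String, io::Error> {
    let mut s = String::new();
    File::open(path)?.read_to_string(&mut s)?;
    Ok(s)
}

fn main() {
    // A path that shouldn't exist: the error propagates up via `?`.
    assert!(read_config("/no/such/file/hopefully").is_err());
}
```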


Rust recent innovation: chaining

This is also really new and not standard yet, but I like the idea: error chaining. The method .chain_err(|| "New error") (from the error-chain crate) is applied to the result of a function call. If it was a success, that's fine. If not, this new error is wrapped around the previous error. It is typically then followed by the try macro or ?. (I think?)

That means that your function can return a more useful error, eg. "could not open log file" or "could not calculate proportion". Which carries along the additional information of WHY it couldn't, eg. "could not open file XXXX in read mode" or "div by zero".

And then a higher level function can decide which of those it cares about handling -- usually not the lowest level one.

In some ways like exceptions, but (hopefully, because Rust) with no runtime overhead.

Footnotes

[1] I often think of it as, an error-value is one that, under any future operation of any sort, stays the same error value, but that's usually not how it's actually implemented.
jack: (Default)
A little while ago, someone told me about a really simple algorithm brainteaser. Suppose you want to find both the minimum and maximum of an array. Instead of writing something like:
   for (int i=0;i+1<size;i+=2)
   {
      if (arr[i]<min) min = arr[i];
      if (arr[i+1]<min) min = arr[i+1];
      if (arr[i+1]>max) max = arr[i+1];
      if (arr[i]>max) max = arr[i];
   }

You can reduce the number of comparisons per two elements from 4 to 3 by doing something like:
   for (int i=0;i+1<size;i+=2)
   {
      if (arr[i]<arr[i+1])
      {
         if (arr[i]<min) min = arr[i];
         if (arr[i+1]>max) max = arr[i+1];
      }
      else
      {
         if (arr[i+1]<min) min = arr[i+1];
         if (arr[i]>max) max = arr[i];
      }
   }

I asked, does it make a difference if that pipelines less efficiently, and I didn't really get an answer, but I got the impression that wasn't a sensible question to ask.

But when I actually tried it, with some simple instrumentation code (using "clock()" from "time.h"), the second took about twice as long. On a windows PC, compiled with cl, using O2.

When I looked at the disassembly, each comparison looked to be something like:
   if (arr[i]<min) min = arr[i];
0040118B  mov         ecx,dword ptr [i] 
0040118E  mov         edx,dword ptr [arr] 
00401191  mov         eax,dword ptr [min] 
00401194  mov         ecx,dword ptr [edx+ecx*4] 
00401197  cmp         ecx,dword ptr [eax] 
00401199  jge         min_max_2+59h (4011A9h) 
0040119B  mov         edx,dword ptr [min] 
0040119E  mov         eax,dword ptr [i] 
004011A1  mov         ecx,dword ptr [arr] 
004011A4  mov         eax,dword ptr [ecx+eax*4] 
004011A7  mov         dword ptr [edx],eax

Which didn't seem great, but did seem like the number of instructions was proportional to the number of lines expected to be executed.

What have I missed?
jack: (Default)
In C and C++, you should avoid using an uninitialised variable for several reasons, not least of which, it's undefined behaviour (?) But in practice, what are the relative likelihoods of the (I think?) permitted outcomes:

(a) it being treated as some unknown value
(b) the following code being deleted by the compiler
(c) something even weirder happening?
jack: (Default)
I talked about this several times before, but the idea was still settling down in my head and I don't think it made a lot of sense.

Imagine each commit had one parent that was the "branch" parent or "this is a logical atomic change" parent or the "the tests still pass" parent, or the "space not time" parent. All the same guarantees you'd expect from code on the same branch hold (eg. compiles, tests pass). This represents history the same way a non-branching source control history does, a list of changes that sum up the current state of the software. Or, for that matter, the same way a heavily rebased clunk-free history does. It shows the code being built up.

And each commit may have none or one (or more) "content" parent or "rebase" parent or a "chronological" parent or a "meta" parent, that represents "the change made in this commit, is based on this other commit".

If you already merge from branch into trunk, you may find the parents are quite like this already.

Why might you want to do this? Well, to me, the good reason is that it does away with all the "oh no I want to rebase but I already pushed". The pre-rebase history is just always there by default, though you could choose to purge those commits if you wanted. So any software working off your remote, when it pull-rebases, can just automatically move anything that was built on top of your pre-rebase branch onto your post-rebase branch, just as if you'd committed a few extra commits without rebasing anything. And the new code isn't just suddenly in limbo, anyone can check that the branch tip is the same as the pre-rebase branch tip.

It also may provide useful hints for merging, avoiding duplicated commits, when there's extra information about which commits are "the same" beyond their content. It doesn't solve that problem, but it may help in some of the cases your source control program can't fix automatically.

It also removes all the "oh fuck, I screwed up, reflog, reflog, flog harder!" if you accidentally screw up a branch you're working with. Instead of the previous tip of the branch floating in limbo hoping you'll retrieve it, the previous branch history, all of it, is retained until you explicitly delete it. You don't even need to be on the same computer: someone else can ask you to push the changes and sort out your mess, whereas you can't (I think?) push a reflog.
jack: (Default)
https://www.dropbox.com/s/re1eu39rpaphtx1/pride_and_prejudice_e2f.html?dl=0

You may have seen the hash I made of Pride and Prejudice last week. Here's the point.

The idea is, take a simple word-by-word translation and an English text, and apply it to one word in the first sentence, and also anywhere that word appears thereafter, and another word in the second sentence, and anywhere that word appears thereafter, and so on. And each replacement has a hovertip telling you the English word (that should work in the link, at least in Chrome, let me know if not?).

The idea is that, like people learning bits of Lapine by reading Watership Down without ever consciously trying to remember it, or watching TV with subtitles, you can always click to get the translation, but by being repeatedly reinforced you start to remember the words as you go on, without having to make a big effort to do so.

Or at any rate, that was the idea. I've no idea if it would work!

I apologise for the quality of the translation: I used a dodgy wordlist to get something up and running, and it doesn't do an actual context-sensitive translation. You could also do the reverse: take a French text and translate all the words in your wordlist into English to start with, and successively fewer as you go on, so the grammar would be correct.
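For anyone curious, the core replacement scheme can be sketched in python something like this. This is a minimal sketch of the idea, not my actual script (which also generates the hovertips, skipped here), and the tiny wordlist is made up:

```python
import re

# Introduce one new translated word per sentence, and also translate every
# previously-introduced word wherever it reappears.
wordlist = {"truth": "vérité", "man": "homme", "wife": "épouse"}

def translate_progressively(sentences, wordlist):
    introduced = {}
    out = []
    for sentence in sentences:
        words = re.findall(r"[a-zA-Z']+", sentence.lower())
        # pick the first not-yet-introduced word we have a translation for
        for w in words:
            if w in wordlist and w not in introduced:
                introduced[w] = wordlist[w]
                break
        # replace all introduced words, case-insensitively, whole words only
        for eng, fr in introduced.items():
            sentence = re.sub(r"\b%s\b" % eng, fr, sentence,
                              flags=re.IGNORECASE)
        out.append(sentence)
    return out

text = ["It is a truth universally acknowledged.",
        "A single man must want a wife.",
        "The truth about the man."]
for s in translate_progressively(text, wordlist):
    print(s)
```

Note that once "truth" is introduced in sentence one, it gets replaced everywhere it recurs, which is the repeated-reinforcement part.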

Any suggestions?
jack: (Default)
State of progress: "Qu'il N'est-ce UNE vérité universally acknowledged, Qu'on UNE single Homme In possession DU UNE Bonne fortune, must s'agir In voulez DU UNE épouse."

No, it's not supposed to be a good translation, it's supposed to be an approximate word-by-word thing for reasons which will become apparent later. I'm pleased it worked AT ALL, as badly-capitalised franglish as it is :)
jack: (Default)
I got my robot-programming game to run on Android! I knew it shouldn't be *that* difficult, but it's really magical seeing something I wrote running on a platform I never expected to run on.

I haven't done any more to the game since last year, so it's not really *playable*, but it runs and you can interact with it.

I used kivy as the UI framework for a python game, because it advertised being able to compile to android, and I wanted to learn more python more than I wanted to learn java. So I developed the program on the PC, using kivy for graphics and mouse events (which later become touch/drag events on a touchscreen).

And then, after several false starts, I downloaded a VM set up to do the "build to android" step with buildozer, updated buildozer to the latest version, copied my source to the VM (I'd already generated a buildozer.spec file), and it all just worked -- I got an apk, opened it from dropbox on android, and there was my game running.

Gotchas: I don't expect anyone to try this from my instructions, but in case you do, things I didn't find obvious: to share a folder with the VM, you need to add your user account to a "can see shared folders" usergroup; buildozer can fail to work on a shared folder, so copy the files to a local directory on the VM; and you should be able to install the android dev kit etc with or without buildozer, but I couldn't get it to work.
jack: (Default)
My latest thought is, I expect to be able to get a close-to-optimal matching for N=2n or 2n+1 people in n rounds of N/3 groups of 3, when N is a multiple of 3.

But if we have exactly enough groups that each person meets each other, that relies on each group of 3 in each round, all 3 pairs not having met before. So maybe the right algorithm is for each round, "randomly generate groups of three where all 3 pairs haven't met before, made of people not already placed this round". And as soon as you can't, back up a round or more and try again. That guarantees permuting when we need to, as soon as we need to. And I sort of hope that when things work in the middle, they'll "just work" for the last few rounds but I don't know if they will.
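A sketch of what I mean, in python. This is guesswork at the parameters, and the "back up a round or more" step here is simplified to "throw the whole attempt away and restart", which is cruder than proper backtracking:

```python
import random
from itertools import combinations

def make_rounds(n_people, n_rounds, attempts=5000, seed=0):
    """Each round, greedily build triples whose 3 pairs are all new;
    if a round gets stuck, discard the whole attempt and retry."""
    rng = random.Random(seed)
    for _ in range(attempts):
        met = set()       # pairs who have already shared a group
        rounds = []
        ok = True
        for _ in range(n_rounds):
            people = list(range(n_people))
            rng.shuffle(people)
            round_groups = []
            while people:
                # find a triple of remaining people with all-new pairs
                for trio in combinations(people, 3):
                    pairs = [tuple(sorted(p)) for p in combinations(trio, 2)]
                    if not any(p in met for p in pairs):
                        met.update(pairs)
                        round_groups.append(trio)
                        for x in trio:
                            people.remove(x)
                        break
                else:
                    break  # stuck: no valid triple among remaining people
            if people:
                ok = False
                break
            rounds.append(round_groups)
        if ok:
            return rounds
    return None

# 9 people in 4 rounds of 3 triples: if it succeeds, all 36 pairs are covered
rounds = make_rounds(9, 4)
```

For N=9 this should find a full schedule within the restart budget; whether restarting scales to larger N is exactly the open question above.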

However, I'm away until Sunday so I probably won't have a chance to try it. Anyone else interested enough to have a go?
jack: (Default)
Having failed to use the biggest hammer ("ask the internet if anyone can think of a general mathematical solution"), I tried the next-biggest: brute force.

I wrote a program which divides people into random matchings, and then switches matches which are over- or under-represented. I'm not quite sure what sort of "random switching" is best; I hoped to just get lucky.

The first effort found a version for 9 people easily, but didn't find one any larger than that. The second was about the same. That's where I am now. I can describe the shuffling in detail if anyone is interested or thinks they might have any suggestions.
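If it helps, the shape of the shuffling is roughly this. It's a simplified sketch, not the actual program; in particular the acceptance rule (keep any swap that doesn't make things worse) is my guess at the crude part:

```python
import random
from itertools import combinations

def score(rounds):
    """Number of excess meetings: 0 means no pair meets twice."""
    counts = {}
    for rnd in rounds:
        for group in rnd:
            for p in combinations(sorted(group), 2):
                counts[p] = counts.get(p, 0) + 1
    return sum(c - 1 for c in counts.values() if c > 1)

def shuffle_search(n_people, n_rounds, steps=20000, seed=1):
    """Start from random rounds of triples, then randomly swap two people
    between groups in the same round, keeping non-worsening swaps."""
    rng = random.Random(seed)
    rounds = []
    for _ in range(n_rounds):
        people = list(range(n_people))
        rng.shuffle(people)
        rounds.append([people[i:i + 3] for i in range(0, n_people, 3)])
    best = score(rounds)
    for _ in range(steps):
        if best == 0:
            break
        r = rng.randrange(n_rounds)
        g1, g2 = rng.sample(range(len(rounds[r])), 2)
        i, j = rng.randrange(3), rng.randrange(3)
        rounds[r][g1][i], rounds[r][g2][j] = rounds[r][g2][j], rounds[r][g1][i]
        s = score(rounds)
        if s <= best:
            best = s
        else:  # undo a worsening swap
            rounds[r][g1][i], rounds[r][g2][j] = rounds[r][g2][j], rounds[r][g1][i]
    return rounds, best

rounds, repeats = shuffle_search(9, 4)
```

The suspicion is that this gets stuck in local minima for larger N, which would match the "works for 9, fails above that" result.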
jack: (Default)
logbook

fanf linked to an article about this, I can't remember where now, but I wanted to talk about keeping a logbook. I think most people do something similar in some way, but I don't know if everyone thinks about it the same way. About three years ago I noticed that I didn't always remember things I needed, and it was a problem, so I started with the simplest possible solution, which seems to have done fine, so I've stuck with it.

I use a text file (backed up, and in a separate source control system just in case I ever want to fish out anything I've deleted, tho' that doesn't seem likely) as a diary or logbook.

Every code change goes in source control, but almost everything ELSE goes in the diary. Notes on where such-and-such server is, and who to ask for log-in details. Results from a series of experiments. Dead-ends investigated and abandoned. Steps to install something that I'm likely to need again in 18 months, but I don't know if anyone else will need.

Brief summaries of conversations with people, suggestions, etc. Informal deadlines, comments on other projects I don't officially need to know but it's likely useful to be aware of. A summary, entirely for myself, of what I've done today, at whatever level of detail seems useful (sometimes just, "closed off several dead ends, nothing else useful", often "did X, Y and Z. need to finish A and B. Decide if C." Occasionally a more detailed checklist pending being copied somewhere else).

And if anything seems like it WILL be important to other people, it's easy to copy it into a bug report or anywhere else.

what DOESN'T go in there

There's several things that do NOT go in the logbook. Anything that's already in source control commit comments doesn't need to (though sometimes the comments are a cleaned-up version of a longer work-in-progress in the diary).

Personal thoughts, frustrations, introspection, etc, etc, I find it valuable to write down in a SEPARATE file. Thinking through them is sometimes useful, but I never need to refer back to them in the same way.

Immediate TODOs go into another file, although I'm considering changing that. Long-term suggestions go into another file, or into the bug database.

Brainstorming is typically on paper, but once I'm thinking of things which are fairly definitely relevant, they usually go on TODO list or logbook.

There's an art to judging what's necessary. I used to write too much. If I've gone down a bunch of blind alleys fixing transient problems, I don't usually need to record all the details, just "after a bunch of faff, realised the problem is X".

Format

You could do something different. Keep a physical logbook. Have a procedure for everything. Organise by topic instead of date. Have a better memory.

But what I do is simple. One long text file. I find everything important by just searching (occasionally I'm slightly redundant to make sure any likely keywords are included in what I'm writing). I have a line ">>>> [date]" in the correct format, a blank line, and then whatever. Usually a short description of what I did. Sometimes a list of things, which are copied to a todo-list or lightly edited throughout the day. Sometimes results of something. Sometimes a useful command line I might need to refer to later.
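For concreteness, a made-up entry (the marker line, a blank line, then whatever):

>>>> 2015-02-26

Moved the test build to the new server. Ask [whoever admins it] for login details.
Tried approach X, dead end. Y works; need to finish Z on Monday.
Useful command: the one-liner that regenerates the test data, kept here so I can search for it.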

I never need to skim, I rarely need to say "what things did I do that week", so as long as I can clearly separate days, I don't worry about taking up too much space. If I needed to, I might annotate each day with a split of which tasks I spent time on, but I generally work on projects measured in weeks or longer.

At the top are a few general notes, such as links to servers, internal webpages, etc.

If I worked on significantly different work, I might keep one per project. Eg. for each home project I would have something similar, but at work I just have one for everything related to the company codebase.

Benefits

Anything I previously knew is generally easily searchable; I almost never have to say "wait, why isn't this working? I had this problem before, why can't I remember what I did?"

If I need to check what state things were in at some time in the past, it's fairly easy to check what I was working on at the time.

It's always obvious what I was in the middle of, I never have to ask "wait, what was I doing on Friday, why did I stop?"

I almost never have to think "I would have written that down. But where?"

You

What do you do?
jack: (Default)
Aha! My Ubuntu VM had an old version of git. After updating, my previous attempts at cvs-fast-export now Just Work, and work's entire CVS repository as of a month ago is now in git! Now I need to decide if that's useful by itself, if it's worth tidying up with reposurgeon and otherwise, and if I can persuade anyone else it would be useful to use... :)

But I feel chuffed it worked, even though ESR did all the work and all I did was run "cvs-fast-export | git fast-import" :)
jack: (Default)
I could have found an answer that fitted both this question and yesterday's, but I decided they were interesting in different ways.

Technological innovations I think we're groping towards, which I'm impatient to have already:

A programming language with syntax as straightforward as python's, but which works like C++14 is trying to: compiling to blazing fast code by default, even for embedded systems, while letting you use static type checking MOST of the time, and be as dynamic as you need when you actually need it.

Widespread 3D printing of replacement parts, etc. We're nearly there, but we're waiting for a slightly wider variety of materials, and a wider database of possible things. Where you can say "I want this £10 widget holder from the supermarket, but can I get one 30% longer if I pay extra? OK? Thank you!"

Private cars replaced by mega-fleets of robot taxis and universal good public transport throughout/between all population dense areas.

Everyone uses git, or another dvcs, and the interface is actually consistent and friendly for everybody.

Decent, standardised, change-tracking and formatting for non-plain-text documents that allows sensible merging. (OK, this seems to be two steps forward and three steps back, so maybe there's no point waiting for it, but I'd still like it! :))